Segmentation Fault (Core Dumped) when converting Whisper with int8 quantization #61695

Open
SantiagoMoreno-UdeA opened this issue Aug 25, 2023 · 12 comments
Assignees
Labels
comp:lite TF Lite related issues ModelOptimizationToolkit TF Model Optimization Toolkit stat:awaiting tensorflower Status - Awaiting response from tensorflower TF 2.12 For issues related to Tensorflow 2.12 TFLiteConverter For issues related to TFLite converter type:bug Bug

Comments

@SantiagoMoreno-UdeA
SantiagoMoreno-UdeA commented Aug 25, 2023

System information
Ubuntu 20.04
TensorFlow 2.12.0 (installed via pip)
transformers WhisperForConditionalGeneration

I'm trying to convert Whisper from TF to TFLite and quantize it to int8, using the Whisper model from transformers (WhisperForConditionalGeneration). At some point the conversion crashes.
Here is the Colab for more details: https://colab.research.google.com/drive/1oAVoUxRFZLkS1uqqFN8HdgRVk0IWAlsN?usp=sharing

I'm also attaching the error traces from my server, running on CPU and on GPU (TITAN RTX, 24 GB).
CPU: TraceTflite.txt

GPU: TraceTflite_GPU.txt
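
For readers without access to the Colab, here is a minimal sketch of the kind of conversion path that triggers this crash. The checkpoint name, input shape, generate wrapper, and saved-model path below are illustrative assumptions, not copied from the notebook:

```python
import numpy as np
import tensorflow as tf
from transformers import TFWhisperForConditionalGeneration

# Assumed checkpoint; the Colab may use a different Whisper size.
model = TFWhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny")

class GenerateModel(tf.Module):
    """Wraps generate() so the converter sees a single concrete function."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    @tf.function(
        input_signature=[tf.TensorSpec((1, 80, 3000), tf.float32, name="input_features")]
    )
    def serving(self, input_features):
        # Whisper expects (batch, 80 mel bins, 3000 frames) log-mel features.
        return {"sequences": self.model.generate(input_features, max_new_tokens=128)}

wrapper = GenerateModel(model)
tf.saved_model.save(wrapper, "whisper_saved_model",
                    signatures={"serving_default": wrapper.serving})

def representative_dataset():
    # Random tensors stand in for real log-mel features during calibration.
    for _ in range(10):
        yield [np.random.randn(1, 80, 3000).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("whisper_saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Forcing full integer quantization of all ops is where the segfault occurs.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
tflite_model = converter.convert()
```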

@SantiagoMoreno-UdeA SantiagoMoreno-UdeA added the TFLiteConverter For issues related to TFLite converter label Aug 25, 2023
@tilakrayal tilakrayal added comp:lite TF Lite related issues type:bug Bug TF 2.12 For issues related to Tensorflow 2.12 labels Aug 28, 2023
@tilakrayal tilakrayal assigned pjpratik and unassigned tilakrayal Aug 28, 2023
@SantiagoMoreno-UdeA SantiagoMoreno-UdeA changed the title from "Core Dumped when convert whisper with int8 quantization" to "Segmentation Fault (Core Dumped) when converting Whisper with int8 quantization" Aug 28, 2023
@pjpratik
Contributor

Hi @SantiagoMoreno-UdeA

I was able to reproduce this issue in TF Nightly as well. Please find the gist here.

A similar issue is being tracked in #59716

Does dynamic range quantization work for your case?

Thanks.
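
For reference, dynamic range quantization converts only the weights to int8 and keeps activations in float, so it needs neither a representative dataset nor the int8-only op restriction. A minimal sketch, reusing the assumed saved-model path from the repro sketch above:

```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("whisper_saved_model")
# Weights are stored as int8; activations are computed in float at runtime.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
```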

@pjpratik pjpratik added the stat:awaiting response Status - Awaiting response from author label Aug 28, 2023
@SantiagoMoreno-UdeA
Author

Hi @pjpratik!

Thanks for answering.

I need the whole model in int8 because I'm attempting to run Whisper inference on an NPU, and it only supports the int8 data type.
So dynamic range quantization is not an option for me :/.

Looking forward to your answer.

Cheers!
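
For context, "the whole model in int8" in converter terms usually means restricting ops to TFLITE_BUILTINS_INT8 and, for accelerators that cannot dequantize at the graph boundaries, also forcing int8 input/output tensors. A hedged sketch of those settings (whether this particular NPU also needs the I/O flags is an assumption):

```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("whisper_saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset  # calibration data as in the repro sketch
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# Many int8-only accelerators also expect quantized I/O tensors.
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()
```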

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Aug 28, 2023
@pjpratik
Contributor

@SantiagoMoreno-UdeA Thanks for the information.

@pkgoogle Could you please look into this issue?

Thanks.

@pjpratik pjpratik assigned pkgoogle and unassigned pjpratik Aug 28, 2023
@pkgoogle pkgoogle added the ModelOptimizationToolkit TF Model Optimization Toolkit label Aug 28, 2023
@pkgoogle

I was able to reproduce from @pjpratik's gist.

@abattery, can you please take a look?

@pkgoogle pkgoogle added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Aug 28, 2023
@SantiagoMoreno-UdeA
Author

Hello @abattery, have you had time to take a look at this?

@nyadla-sys
Member
nyadla-sys commented Sep 14, 2023

@SantiagoMoreno-UdeA I suspect the MUL op used in this model requires 16-bit activations in order to preserve its accuracy. I am still not sure what is going on with the TFLite converter.
Here is a similar issue I raised a long time ago that no one has addressed:
#58451
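
If the problem really is that MUL needs wider activations, the converter does expose a 16x8 mode (int16 activations with int8 weights) that covers MUL. A sketch of that mode follows, though note it would not satisfy a strictly int8-only NPU:

```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("whisper_saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset  # calibration data as in the repro sketch
# 16-bit activations with 8-bit weights; MUL is supported in this mode.
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.EXPERIMENTAL_TFLITE_BUILTINS_ACTIVATIONS_INT16_WEIGHTS_INT8
]
tflite_model = converter.convert()
```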

@nyadla-sys
Member
nyadla-sys commented Sep 14, 2023

When I analyzed it, I observed the seg fault here:
#0 0x00007f623573d7d3 in mlir::quant::QuantizedType::getExpressedType() const () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#1 0x00007f623573e1ac in mlir::quant::QuantizedType::castFromExpressedType(mlir::Type) () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so

@SantiagoMoreno-UdeA
Author

@nyadla-sys It seems that quantizing Whisper is very tricky so far. Thank you for the information, I'll take a look.

@emirkin
emirkin commented Oct 22, 2023

Related to #29829

@James-Shared-Studios

Hi there, I am facing the same issue when trying to convert Whisper to int8 to run it on a TPU. Is there any update, please? Thank you.

@SantiagoMoreno-UdeA
Author

Hi @James-Shared-Studios, no, the error remains. It seems to be a very low-level error.

@6nl
6nl commented Mar 8, 2024

I found that tflite versions of Whisper generate NaN values when processing the -float("inf") values that are used in one part of the transformers codebase (specifically, the logits processor that kicks in when you call generate with forced tokens). Perhaps those NaNs make the int8 quantization crash too. I made a crude patch here, which has worked for me to stop the NaNs from happening: nyadla-sys/whisper.tflite#15
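
The linked patch is the working reference; purely as an illustration of the idea, replacing infinite mask values with a large finite penalty before export looks roughly like this (the helper name and penalty value are made up):

```python
import tensorflow as tf

def replace_inf_logits(logits, penalty=-1e9):
    # Hypothetical helper, not the actual patch in whisper.tflite#15.
    # Masked positions keep a huge negative score, so softmax still
    # suppresses them, but no inf arithmetic can produce NaNs.
    finite = tf.cast(penalty, logits.dtype)
    return tf.where(tf.math.is_inf(logits), finite, logits)
```

Hooking something like this into the forced-token logits processing before conversion is one way to keep the exported graph NaN-free; where exactly to hook it depends on the transformers version.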


9 participants