
Allow int16 input/output even when not using 16x8 quantization mode #56615

Closed
hguihot opened this issue Jun 29, 2022 · 6 comments
Labels: awaiting PR merge · comp:lite (TF Lite related issues) · TFLiteConverter (For issues related to TFLite converter) · type:feature (Feature requests)

Comments

hguihot commented Jun 29, 2022

Currently the valid types can be either int16 or int8/uint8, but not a combination of both. Some models could, for example, contain a custom op returning an int16 tensor as a model output, and converting such a model to TFLite fails. It looks like always adding _dtypes.int16 to the list of supported types when quant_mode.is_integer_quantization() is true would be enough to make it work.
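
For reference, here is a rough sketch of the kind of change being proposed, in the spirit of the converter's I/O type validation. The helper name and surrounding structure are illustrative assumptions, not the actual lite.py code; only _dtypes.int16 and quant_mode.is_integer_quantization() come from the report above.

# Illustrative sketch only: the function name and structure are assumptions.
from tensorflow.python.framework import dtypes as _dtypes

def _supported_io_types(quant_mode):
    # float32 is always a valid inference input/output type.
    all_types = [_dtypes.float32]
    if quant_mode.is_integer_quantization():
        # Proposed behavior: always allow int16 alongside int8/uint8 instead
        # of gating it behind the 16x8 quantization mode.
        all_types += [_dtypes.int8, _dtypes.uint8, _dtypes.int16]
    return all_types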

hguihot added the TFLiteConverter label on Jun 29, 2022
mohantym added the comp:lite and type:feature labels on Jun 29, 2022
mohantym (Contributor) commented Jun 29, 2022

Hi @hguihot! Could you also share minimal standalone code? That will help expedite the issue. Thank you!

mohantym added the stat:awaiting response label on Jun 29, 2022
hguihot (Author) commented Jun 29, 2022

Here is an example with two convolutions that share the same 8-bit input; one has an 8-bit output and the other a 16-bit output.

import tensorflow as tf

def quant(x, num_bits=8):
    # Fake-quantize x to num_bits in the range [-1, 1] (narrow_range=False).
    return tf.quantization.fake_quant_with_min_max_args(x, -1, 1, num_bits, False)

class QConv(tf.keras.layers.Conv2D):
    # Conv2D with fake-quantized weights and activations.
    def __init__(self, filters, kernel_size, weight_quantizer, activation_quantizer):
        self.weight_quantizer = weight_quantizer
        self.activation_quantizer = activation_quantizer
        super().__init__(filters=filters, kernel_size=kernel_size)

    def call(self, bottom):
        return self.activation_quantizer(self.convolution_op(bottom, self.weight_quantizer(self.kernel)))

tf.keras.backend.set_image_data_format("channels_last")
input_tensor = tf.keras.Input(shape=(64, 64, 3), batch_size=1)
quantized_input_tensor = quant(input_tensor, num_bits=8)

# 8->8bit convolution
layer = QConv(filters=32, kernel_size=3, weight_quantizer=quant, activation_quantizer=lambda x: quant(x, num_bits=8))
output8 = layer(quantized_input_tensor)

# 8->16bit convolution
layer = QConv(filters=32, kernel_size=3, weight_quantizer=quant, activation_quantizer=lambda x: quant(x, num_bits=16))
output16 = layer(quantized_input_tensor)

model = tf.keras.Model(inputs=[input_tensor], outputs=[output8, output16])

train_save_path = "/tmp/debug_model"
convert_model_path = "/tmp/converted.tflite"

model.save(train_save_path)
converter = tf.lite.TFLiteConverter.from_saved_model(train_save_path)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int16
tflite_model = converter.convert()

with open(convert_model_path, "wb") as f:
    f.write(tflite_model)

Two changes were actually required to make the conversion succeed.
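
For completeness, once the conversion goes through, the converted model's input/output dtypes can be checked with the standard tf.lite.Interpreter API. This is a minimal sketch; it assumes the file written above at /tmp/converted.tflite.

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="/tmp/converted.tflite")
interpreter.allocate_tensors()

# Expect an int8 input and one int8 plus one int16 output for the model above.
for detail in interpreter.get_input_details():
    print("input: ", detail["name"], detail["dtype"])
for detail in interpreter.get_output_details():
    print("output:", detail["name"], detail["dtype"])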

google-ml-butler bot removed the stat:awaiting response label on Jun 29, 2022
mohantym (Contributor) commented

Hi @sachinprasadhs! Could you look at this feature request? Gist attached for reference. Thank you!

mohantym assigned sachinprasadhs and unassigned themselves on Jun 30, 2022
sachinprasadhs added the stat:awaiting tensorflower label on Jul 1, 2022
hguihot (Author) commented Aug 31, 2022

Any update?

pjpratik added a commit that referenced this issue on Feb 2, 2023
Per feature request #56615, allow _dtypes.int16 even when 16x8 quantization is not used, so that custom ops returning 16-bit outputs can benefit.
pjpratik (Contributor) commented Feb 3, 2023

We created PR #59526 to enable support for dtypes.int16. The issue will be closed once it is merged. Thanks!

pjpratik added the awaiting PR merge label and removed the stat:awaiting tensorflower label on Feb 3, 2023
pjpratik self-assigned this on Feb 3, 2023
sachinprasadhs removed their assignment on Feb 16, 2023
pjpratik (Contributor) commented

Hi @hguihot

Support for int16 has been added in commit 33d76ac.

Closing this issue as resolved. Please reopen if you'd like to work on this further.
