

Conversion failure: tfl.batch_matmul "expected 3 but got 2" (regression since 2.14, worked in 2.13) #65769

Closed
gustavla opened this issue Apr 15, 2024 · 7 comments
Assignees
Labels
comp:lite TF Lite related issues stale This label marks the issue/pr stale - to be closed automatically if no activity stat:awaiting response Status - Awaiting response from author TFLiteConverter For issues related to TFLite converter

Comments

@gustavla
Contributor

1. System information

  • macOS 14.1.2 / Python 3.10
  • pip install tensorflow-macos==2.16.1 tf-keras==2.16.0 keras==3.2.1

2. Code

import tensorflow as tf
import keras

input0_shape = [1, 5]
input1_shape = [1, 5, 7]
output_shape = [1, 1, 7]

tf_input0 = keras.Input(input0_shape[1:], batch_size=1)
tf_input1 = keras.Input(input1_shape[1:], batch_size=1)


class MyMatMul(keras.layers.Layer):
    def call(self, tf_input0, tf_input1):
        # -> [1, 1, 5]
        tf_input0_rank3 = tf.expand_dims(tf_input0, [1])

        # [1, 1, 5] x [1, 5, 7] -> [1, 1, 7]
        tf_output_rank3 = tf.linalg.matmul(tf_input0_rank3, tf_input1)

        # -> [1, 7]
        tf_output = tf.squeeze(tf_output_rank3, [1])

        return tf_output

tf_output = MyMatMul()(tf_input0, tf_input1)

model = keras.Model(inputs=[tf_input0, tf_input1], outputs=[tf_output])

# Convert the model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the model.
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)

3. Failure after conversion

WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
W0000 00:00:1713222146.555150 18460241 tf_tfl_flatbuffer_helpers.cc:390] Ignored output_format.
W0000 00:00:1713222146.555506 18460241 tf_tfl_flatbuffer_helpers.cc:393] Ignored drop_control_dependency.
2024-04-15 16:02:26.556328: I tensorflow/cc/saved_model/reader.cc:83] Reading SavedModel from: /var/folders/d5/6vzc45z14_79pfg8d_mbpmsr049hy4/T/tmpw802bhj6
2024-04-15 16:02:26.556581: I tensorflow/cc/saved_model/reader.cc:51] Reading meta graph with tags { serve }
2024-04-15 16:02:26.556589: I tensorflow/cc/saved_model/reader.cc:146] Reading SavedModel debug info (if present) from: /var/folders/d5/6vzc45z14_79pfg8d_mbpmsr049hy4/T/tmpw802bhj6
2024-04-15 16:02:26.558343: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:388] MLIR V1 optimization pass is not enabled
2024-04-15 16:02:26.558600: I tensorflow/cc/saved_model/loader.cc:234] Restoring SavedModel bundle.
2024-04-15 16:02:26.582278: I tensorflow/cc/saved_model/loader.cc:218] Running initialization op on SavedModel bundle at path: /var/folders/d5/6vzc45z14_79pfg8d_mbpmsr049hy4/T/tmpw802bhj6
2024-04-15 16:02:26.583903: I tensorflow/cc/saved_model/loader.cc:317] SavedModel load for tags { serve }; Status: success: OK. Took 27577 microseconds.
2024-04-15 16:02:26.602113: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:268] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
loc(fused[callsite(callsite(fused["Squeeze:", "functional_1_1/my_mat_mul_1/Squeeze@__inference_serving_default_16"] at fused["PartitionedCall:", "PartitionedCall@__inference_signature_wrapper_serving_default_23"]) at fused["PartitionedCall:", "PartitionedCall"]), callsite(callsite(fused["ExpandDims:", "functional_1_1/my_mat_mul_1/ExpandDims@__inference_serving_default_16"] at fused["PartitionedCall:", "PartitionedCall@__inference_signature_wrapper_serving_default_23"]) at fused["PartitionedCall:", "PartitionedCall"]), callsite(callsite(fused["BatchMatMulV2:", "functional_1_1/my_mat_mul_1/MatMul@__inference_serving_default_16"] at fused["PartitionedCall:", "PartitionedCall@__inference_signature_wrapper_serving_default_23"]) at fused["PartitionedCall:", "PartitionedCall"])]): error: 'tfl.batch_matmul' op found invalid output rank, expected 3 but got 2

The key is the final line: error: 'tfl.batch_matmul' op found invalid output rank, expected 3 but got 2.

Note that both the expand and the squeeze are required to reproduce the failure.
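For reference, the shape arithmetic the repro relies on can be checked outside of TensorFlow with plain NumPy (NumPy here is only a stand-in for the TF ops in section 2; it is not part of the original report):

```python
import numpy as np

x0 = np.zeros((1, 5), dtype=np.float32)     # rank-2 input, like tf_input0
x1 = np.zeros((1, 5, 7), dtype=np.float32)  # rank-3 input, like tf_input1

x0_rank3 = np.expand_dims(x0, 1)      # (1, 5) -> (1, 1, 5)
out_rank3 = np.matmul(x0_rank3, x1)   # (1, 1, 5) @ (1, 5, 7) -> (1, 1, 7)
out = np.squeeze(out_rank3, 1)        # (1, 1, 7) -> (1, 7)

print(out.shape)  # (1, 7)
```

The matmul itself is rank-3 in and rank-3 out; only the final squeeze brings the result back to rank 2, which is where the converter's `tfl.batch_matmul` rank check appears to trip.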

4. Regression analysis

Replace the tensorflow version in the pip installation above:

  • 2.13: Works (no conversion error, model.tflite produced successfully)
  • 2.14: Failure
  • 2.15: Failure
  • 2.16: Failure

The above error happens across 2.14-2.16.
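The matrix above can be reproduced with a small environment-setup loop, one virtualenv per TensorFlow release. This is a sketch: `repro.py` is a hypothetical file holding the conversion script from section 2, and the exact patch versions are assumptions.

```shell
#!/bin/sh
# Bisect the regression across TensorFlow releases.
# Assumes repro.py contains the conversion script from section 2.
for v in 2.13.1 2.14.1 2.15.1 2.16.1; do
    python3 -m venv "env-$v"
    "env-$v/bin/pip" install -q "tensorflow==$v"
    if "env-$v/bin/python" repro.py; then
        echo "tensorflow==$v: conversion OK"
    else
        echo "tensorflow==$v: conversion FAILED"
    fi
done
```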

@gustavla gustavla added the TFLiteConverter For issues related to TFLite converter label Apr 15, 2024
@Venkat6871 Venkat6871 added the comp:lite TF Lite related issues label Apr 17, 2024
@sawantkumar

Hi @gustavla,

I have replicated the issue and got similar results. I am looking into it and will get back to you.

@sawantkumar sawantkumar assigned pkgoogle and unassigned sawantkumar May 6, 2024
@pkgoogle
pkgoogle commented May 6, 2024

Hi @gustavla, can you let us know what chip your Mac is using? M series? Intel? Thanks for your help.

@pkgoogle pkgoogle added stat:awaiting response Status - Awaiting response from author and removed WIP labels May 6, 2024
@gustavla
Contributor Author
gustavla commented May 8, 2024

@pkgoogle Apple silicon. Also repros on x86_64 Ubuntu.

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label May 8, 2024
@pkgoogle
pkgoogle commented May 8, 2024

I was able to reproduce this on x86_64 Debian with tf-nightly as well, using the exact program above. One difference: I got this error/warning:

tensorflow.lite.python.convert_phase.ConverterError: Variable constant folding is failed. Please consider using enabling `experimental_enable_resource_variables` flag in the TFLite converter object. For example, converter.experimental_enable_resource_variables = True test.py:32:1: error: 'tfl.batch_matmul' op found invalid output rank, expected 3 but got 2

I attempted adding this line to see if it helps:

converter.experimental_enable_resource_variables = True

It did not help. @zichuan-wei, can you please take a look? Thanks.

@pkgoogle pkgoogle added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label May 8, 2024
@pkgoogle

Hi @gustavla, if you have access to a Linux system, you may be able to resolve your issue by using AI-Edge-Torch; you can find more information here: googleblog.

I have created a simple script for converting your model:

import torch
import torch.nn as nn
import ai_edge_torch


class MyMatMul(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x0, x1):
        x0_rank3 = torch.unsqueeze(x0, 1)
        out = x0_rank3 @ x1
        out = torch.squeeze(out, 1)
        return out


input0_shape = (1, 5)
input1_shape = (1, 5, 7)
model = MyMatMul()
sample_inputs = (torch.randn(*input0_shape), torch.randn(*input1_shape))

edge_model = ai_edge_torch.convert(model.eval(), sample_inputs)
edge_model.export("my_mat_mul.tflite")

If you want, you can also try visualizing the result in model-explorer.

Please try them out and let us know if this resolves your issue. If you still need further help, feel free to open a new issue in the respective repo.

@pkgoogle pkgoogle added stat:awaiting response Status - Awaiting response from author and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Jun 10, 2024

This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label Jun 18, 2024

This issue was closed because it has been inactive for 7 days since being marked as stale. Please reopen if you'd like to work on this further.
