
Having non-converted operations, even for simplest models #62855

Open · Black3rror opened this issue Jan 28, 2024 · 16 comments

Labels: comp:lite (TF Lite related issues), stat:awaiting tensorflower (Status - Awaiting response from tensorflower), TF 2.15 (For issues related to 2.15.x), TFLiteConverter (For issues related to TFLite converter)

Comments

@Black3rror

System information

  • Platform: Tried on Google Colab
  • TensorFlow version: 2.15.0

Steps to reproduce

  • Creating a Python file with the following content in Google Colab (let's call it test.py):
import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.InputLayer(input_shape=(1,)),
    tf.keras.layers.Dense(1)
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

with open('model.tflite', 'wb') as f:
  f.write(tflite_model)
  • Calling it in a Jupyter notebook with !python test.py
  • The output will be:
2024-01-28 11:17:11.939381: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-01-28 11:17:11.939450: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-01-28 11:17:11.941203: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-01-28 11:17:13.646897: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-01-28 11:17:16.713671: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:378] Ignored output_format.
2024-01-28 11:17:16.713736: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:381] Ignored drop_control_dependency.
Summary on the non-converted ops:
---------------------------------
 * Accepted dialects: tfl, builtin, func
 * Non-Converted Ops: 1, Total Ops 6, % non-converted = 16.67 %
 * 1 ARITH ops

- arith.constant:    1 occurrences  (f32: 1)



  (f32: 1)

Note: running the code directly in a Jupyter notebook cell won't print any of this output

Problem

  • Why can the converter not handle all the operations in such a simple model?
  • Is this the expected behavior?
  • Does having non-converted operations affect the performance of the model when deployed on a microcontroller using TFLM? (A quick sanity check that the converted model still runs is sketched after this list.)
  • Is there a way to solve it?
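
For reference, a minimal sanity check that the converted model still produces output (a sketch using the standard tf.lite.Interpreter API; the input value is arbitrary):

import numpy as np
import tensorflow as tf

# Load the converted model and run a single inference to confirm it works.
interpreter = tf.lite.Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Arbitrary test input matching the model's (1,) input shape.
x = np.array([[1.0]], dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], x)
interpreter.invoke()

print(interpreter.get_tensor(output_details[0]['index']))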
@Black3rror Black3rror added the TFLiteConverter For issues related to TFLite converter label Jan 28, 2024
@LakshmiKalaKadali LakshmiKalaKadali added comp:lite TF Lite related issues TF 2.15 For issues related to 2.15.x labels Jan 29, 2024
@zhuochenKIDD

perhaps "TF-TRT Warning: Could not find TensorRT" ?

@yunhao-qian

I just hit the same issue with TensorFlow 2.15.0 on my M3 Pro MacBook. TF Lite apparently failed to convert 20 arith.constant operations in my model.

@LakshmiKalaKadali
Contributor

Hi @Black3rror,

Good observation. Indeed, when running !python test.py on Colab, the non-converted ops are shown, but the model converts to TFLite successfully and runs fine. So, could you please try it on TFLM and let us know if you encounter any blockers? From a performance point of view, I tested with another sample and the accuracy was unchanged after converting to TFLite. Please refer to the following sample code (sample_train.py):

import tensorflow as tf
from tensorflow import keras
import numpy as np
import pathlib

# Load MNIST dataset
mnist = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Normalize the input image so that each pixel value is between 0 to 1.
train_images = train_images / 255.0
test_images = test_images / 255.0

# Define the model architecture
model = keras.Sequential([
  keras.layers.InputLayer(input_shape=(28, 28)),
  keras.layers.Reshape(target_shape=(28, 28, 1)),
  keras.layers.Conv2D(filters=12, kernel_size=(3, 3), activation=tf.nn.relu),
  keras.layers.MaxPooling2D(pool_size=(2, 2)),
  keras.layers.Flatten(),
  keras.layers.Dense(10)
])

# Train the digit classification model
model.compile(optimizer='adam',
              loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.fit(
  train_images,
  train_labels,
  epochs=1,
  validation_data=(test_images, test_labels)
)
model.save('tf_model')

# model.evaluate returns [loss, accuracy] for this compile configuration.
tf_loss, tf_accuracy = model.evaluate(test_images, test_labels, verbose=0)
print('TF accuracy: {:.2f}%'.format(tf_accuracy * 100))

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
# Save the model.
with open('model.tflite', 'wb') as f:
  f.write(tflite_model)

# Convert again with default optimizations (dynamic-range quantization).
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant_model = converter.convert()
with open('model_quant.tflite', 'wb') as f:
  f.write(tflite_quant_model)

Then run an accuracy check on the converted model.
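
A minimal sketch of such a check (assuming the model.tflite and test set produced by the script above; images are fed one at a time for simplicity):

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()
input_index = interpreter.get_input_details()[0]['index']
output_index = interpreter.get_output_details()[0]['index']

# Run the test set through the TFLite model and count correct predictions.
correct = 0
for image, label in zip(test_images, test_labels):
    interpreter.set_tensor(input_index, image[np.newaxis, :].astype(np.float32))
    interpreter.invoke()
    logits = interpreter.get_tensor(output_index)
    correct += int(np.argmax(logits) == label)

print('TFLite accuracy: {:.2f}%'.format(100.0 * correct / len(test_labels)))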

Thank You

@LakshmiKalaKadali LakshmiKalaKadali added the stat:awaiting response Status - Awaiting response from author label Jan 31, 2024
github-actions bot commented Feb 8, 2024

This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label Feb 8, 2024
@Black3rror
Author
Black3rror commented Feb 11, 2024

@zhuochenKIDD Are you suggesting that if I fix the TensorRT warning, the problem with non-converted ops will be resolved automatically?

@LakshmiKalaKadali Hi, and thanks for the response.
I've converted the models to TFLM and executed them on a microcontroller, and they produced output successfully. So, if the model converts successfully and runs on microcontrollers without any problem, should we just ignore the non-converted ops message? (I hope to see this message removed in a future release, if it's meant to be ignored.)

@google-ml-butler google-ml-butler bot removed stale This label marks the issue/pr stale - to be closed automatically if no activity stat:awaiting response Status - Awaiting response from author labels Feb 11, 2024
@pkgoogle

Hi @Black3rror, you can still run non-converted ops successfully, and because they keep higher precision, accuracy is usually as good or better; however, you lose the efficiency and latency benefits. Not all ops are convertible, but I can't imagine that arith.constant is one of them.

@zichuan-wei, can you please take a look? Thanks.

@pkgoogle pkgoogle added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Feb 20, 2024
@Black3rror
Author

@pkgoogle
Thanks for the response. I'm a bit confused: what was supposed to happen in the course of the conversion? The conversion is basic, i.e., no quantization, just producing a tflite from the TF model. So, since you said "they have higher precision", should I expect any loss of precision in this process?

Also, it might be worth mentioning that I've tested different networks with different quantization techniques (including full_int, full_int_only (inputs and outputs in int as well), 16x8, ...), and they all worked. I mean, I still get non-converted messages similar to the ones I asked about in the first place, but looking at interpreter.get_tensor_details()[...]['dtype'], it seems everything gets converted successfully (unless I'm wrong :)), and putting these models on a microcontroller using TFLM goes well.
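
For reference, the dtype check described above can be done roughly like this (a sketch; which tensors matter depends on the model):

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()

# Print the dtype of every tensor the converter produced.
for detail in interpreter.get_tensor_details():
    print(detail['name'], detail['dtype'])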

Can you please clarify your answer?

Yes, I also believe something is wrong since I'm not able to convert a minimum network without getting such a message, and as you said, at least arith.constant shouldn't be the problem.

@pkgoogle

Hi @Black3rror, if that is the configuration of your conversion, then you are correct: you should not expect any loss of precision. However, with any quantization technique included, there is of course a potential loss. Non-converted ops are by definition not going to go through any quantization, so they will maintain precision and still "work" in the sense that you can still run inference through the model. Hope that clarifies my answer?

@Black3rror
Author

@pkgoogle Yes, now things make more sense to me. Still, it leaves me wondering why I get the "non-converted operations" message even in this situation where I'm not using any quantization (based on your reply, this message should be related to quantization).
Still, this question is secondary; the main question remains: if the message is valid and some operations really are not converted, why can TFLite not convert them in such a simple model?
Looking forward to an answer.

@KarenMars

I have the same issue of non-converted operations, and my model is quite simple as well: a multilayer perceptron with fully connected layers, ReLU activations, and a softmax output. I wonder how I can suppress the 'Summary on the non-converted ops ...' log message that the converter.convert() call emits.

Summary on the non-converted ops:
---------------------------------
 * Accepted dialects: tfl, builtin, func
 * Non-Converted Ops: 10, Total Ops 19, % non-converted = 52.63 %
 * 10 ARITH ops

- arith.constant:   10 occurrences  (f32: 10)



  (f32: 5)
  (f32: 1)
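
(No suppression flag is confirmed in this thread. One possible workaround, assuming the summary is printed by native code directly to the process's stderr rather than through Python logging, is to redirect the stderr file descriptor around the call. A sketch:)

import os

def convert_quietly(converter):
    # Silence C-level stderr (fd 2) for the duration of convert().
    # Redirecting sys.stderr alone would not help, because the summary
    # is written by native code, not by Python.
    saved_fd = os.dup(2)
    devnull_fd = os.open(os.devnull, os.O_WRONLY)
    try:
        os.dup2(devnull_fd, 2)
        return converter.convert()
    finally:
        os.dup2(saved_fd, 2)
        os.close(saved_fd)
        os.close(devnull_fd)

Usage: tflite_model = convert_quietly(converter). Note that this hides all stderr output during the call, including real errors.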

@xushangnjlh

Seems like a TensorFlow version problem. I just changed from TF 2.15 to TF 2.10, and the non-converted ops are gone...

@hkayann
hkayann commented Mar 31, 2024

I am having the same issue, though the model runs fine. The inference results are as expected.

@Black3rror
Author

Running the same code with TensorFlow version 2.16.1 outputs:

2024-05-10 08:16:11.827293: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2.16.1
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
W0000 00:00:1715328974.281010    3144 tf_tfl_flatbuffer_helpers.cc:390] Ignored output_format.
W0000 00:00:1715328974.281077    3144 tf_tfl_flatbuffer_helpers.cc:393] Ignored drop_control_dependency.
loc(fused["ReadVariableOp:", callsite("sequential_1/dense_1/Add/ReadVariableOp@__inference_serving_default_29"("/content/test.py":11:1) at callsite("/usr/local/lib/python3.10/dist-packages/tensorflow/lite/python/lite.py":1175:1 at callsite("/usr/local/lib/python3.10/dist-packages/tensorflow/lite/python/lite.py":1129:1 at callsite("/usr/local/lib/python3.10/dist-packages/tensorflow/lite/python/lite.py":1636:1 at callsite("/usr/local/lib/python3.10/dist-packages/tensorflow/lite/python/lite.py":1614:1 at callsite("/usr/local/lib/python3.10/dist-packages/tensorflow/lite/python/convert_phase.py":205:1 at callsite("/usr/local/lib/python3.10/dist-packages/tensorflow/lite/python/lite.py":1537:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/backend/tensorflow/layer.py":58:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/backend/tensorflow/layer.py":120:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/utils/traceback_utils.py":117:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/layers/layer.py":846:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/utils/traceback_utils.py":117:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/ops/operation.py":48:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/utils/traceback_utils.py":156:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/models/sequential.py":209:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/models/functional.py":202:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/ops/function.py":155:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/models/functional.py":592:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/utils/traceback_utils.py":117:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/layers/layer.py":846:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/utils/traceback_utils.py":117:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/ops/operation.py":48:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/utils/traceback_utils.py":156:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/layers/core/dense.py":152:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/ops/numpy.py":168:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/backend/tensorflow/sparse.py":493:1 at callsite("/usr/local/lib/python3.10/dist-packages/keras/src/backend/tensorflow/numpy.py":36:1 at "/usr/local/lib/python3.10/dist-packages/keras/src/backend/tensorflow/core.py":65:1)))))))))))))))))))))))))))]): error: missing attribute 'value'
LLVM ERROR: Failed to infer result type(s).

This is now a hard error, and no tflite file is generated at all.

@pkgoogle

Hi @Black3rror, if I use AI-Edge-Torch, it appears to work well:

convert.py

import torch
import torch.nn as nn
import ai_edge_torch


class LinearRegression(nn.Module):
    def __init__(self):
        super().__init__()
        self.dense = nn.Linear(1, 1)

    def forward(self, x):
        return self.dense(x)


model = LinearRegression()
sample_inputs = (torch.randn(1),)

edge_model = ai_edge_torch.convert(model.eval(), sample_inputs)
edge_model.export("lr.tflite")

my output:

python convert.py
2024-06-11 21:49:20.614022: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-06-11 21:49:20.617583: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-06-11 21:49:20.656618: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-11 21:49:21.702560: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
WARNING:root:PJRT is now the default runtime. For more information, see https://github.com/pytorch/xla/blob/master/docs/pjrt.md
WARNING:root:Defaulting to PJRT_DEVICE=CPU
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1718142563.953505  816506 cpu_client.cc:424] TfrtCpuClient created.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
W0000 00:00:1718142565.742263  816506 tf_tfl_flatbuffer_helpers.cc:392] Ignored output_format.
W0000 00:00:1718142565.742290  816506 tf_tfl_flatbuffer_helpers.cc:395] Ignored drop_control_dependency.
2024-06-11 21:49:25.743230: I tensorflow/cc/saved_model/reader.cc:83] Reading SavedModel from: /tmp/tmp1n3s0fho
2024-06-11 21:49:25.743512: I tensorflow/cc/saved_model/reader.cc:52] Reading meta graph with tags { serve }
2024-06-11 21:49:25.743532: I tensorflow/cc/saved_model/reader.cc:147] Reading SavedModel debug info (if present) from: /tmp/tmp1n3s0fho
2024-06-11 21:49:25.748880: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:388] MLIR V1 optimization pass is not enabled
2024-06-11 21:49:25.749202: I tensorflow/cc/saved_model/loader.cc:236] Restoring SavedModel bundle.
2024-06-11 21:49:25.760743: I tensorflow/cc/saved_model/loader.cc:220] Running initialization op on SavedModel bundle at path: /tmp/tmp1n3s0fho
2024-06-11 21:49:25.764008: I tensorflow/cc/saved_model/loader.cc:462] SavedModel load for tags { serve }; Status: success: OK. Took 20779 microseconds.
2024-06-11 21:49:25.770494: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:268] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-06-11 21:49:25.803013: I tensorflow/compiler/mlir/lite/flatbuffer_export.cc:3531] Estimated count of arithmetic ops: 3  ops, equivalently 1  MACs
I0000 00:00:1718142566.683158  816506 cpu_client.cc:427] TfrtCpuClient destroyed.
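
(As a side note, the exported model can be sanity-checked in Python before deployment. This follows the validation pattern from the ai-edge-torch README; it is an assumption, not output from this run:)

# Compare the converted edge model against the original PyTorch model.
torch_output = model(*sample_inputs)
edge_output = edge_model(*sample_inputs)
print(torch_output.detach().numpy(), edge_output)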

Does this resolve your issue?

@pkgoogle pkgoogle removed the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Jun 11, 2024
@pkgoogle pkgoogle added the stat:awaiting response Status - Awaiting response from author label Jun 11, 2024
@Black3rror
Author

Hi @pkgoogle, and thanks a lot for your answer.
While your workaround might help some people reach their goals, my project is much larger than the simple example I've used in this issue, and it's based on TensorFlow. Therefore, moving to PyTorch is not a viable option for me. Additionally, I'm sure many other issues would arise if I went in that direction. Essentially, it makes sense for TFLite to work best with TensorFlow.

I've opened this issue to help identify a problem with TFLite in the hope that it will be improved. Even though a workaround might be very helpful, I hope the problem gets resolved entirely.

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Jun 18, 2024
@pkgoogle

Hi @Black3rror, thanks for the feedback, that is helpful.

@pkgoogle pkgoogle added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Jun 18, 2024