Failure in convert Gemma 2B models to TfLite #63025
Comments
Hi @RageshAntonyHM, I am trying to reproduce the issue, but I ran into another error. Thank you.
It is Keras 3.0.5, and I installed keras-nlp (0.8.1) via pip install git+https://github.com/keras-team/keras-nlp
Install it first.
Then also install tensorflow-datasets. @LakshmiKalaKadali
Also crashing in Colab with or without quantization. |
This conversion pipeline needs a lot of VRAM, at least 24 GB. @LakshmiKalaKadali, any updates on this, please?
@RageshAntonyHM
Yeah. Actually, it is still crashing for me on a 48 GB RTX 6000. (What I told @farmaker47 was that it will crash prematurely if VRAM is low, but it also crashes in the final step even if you have enough VRAM.)
I saw that training works OK after first installing the TensorFlow nightly version (2.17.0-dev20240223). @RageshAntonyHM, can you try the nightly version and check the conversion again?
How do I install the TensorFlow nightly version? I tried pip install tf-nightly, but I am getting an error: File "/usr/local/lib/python3.10/dist-packages/keras/src/backend/tensorflow/core.py", line 5, in ... (pip reports Name: tf-nightly)
I work with Colab. !pip install tf-nightly |
Now, again I am getting that first-mentioned error. Could you please share your notebook link?
The Colab is from this example: https://ai.google.dev/gemma/docs/lora_tuning. I have changed nothing. So the idea is that if you install tf-nightly, the conversion error disappears? I don't understand from your previous answer whether the error occurs during the tf-nightly installation or during the conversion.
I suspect some package conflicts, like some packages reinstalling the 'stable' version of TensorFlow. Let me check.
I am able to run inference already. My problem is that I need to create a TFLite model for Gemma 2B. I think there is still some problem in the conversion. I am very new to AI and even Python.
Then we have to wait a little until the TF team solves this and provides a tf-nightly version we can use for the conversion.
But when using all-nightly versions, I got a "GraphDef" issue.
A minimal script to reproduce the issue:

import os

# KERAS_BACKEND must be set before keras is imported to take effect.
os.environ["KAGGLE_USERNAME"] = '....'
os.environ["KAGGLE_KEY"] = '...'
os.environ["KERAS_BACKEND"] = "tensorflow"

import keras
import keras_nlp
import tensorflow as tf

gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")
tf.saved_model.save(gemma_lm.backbone, '/tmp/gemma_saved_model/')
tflite_model = tf.lite.TFLiteConverter.from_saved_model('/tmp/gemma_saved_model/').convert()

I tested with tf-2.15, 2.16, and 2.17 nightly and their corresponding packages. None of them works.
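To separate a broken converter toolchain from a Gemma-specific failure, it can help to run the same SavedModel-to-TFLite path on a trivial graph first. This is only a diagnostic sketch on a stand-in function, not the Gemma model itself; if this succeeds while the Gemma conversion aborts, the problem is specific to the Gemma saved model.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in graph: if this converts, the converter toolchain itself
# is healthy and the failure is specific to the Gemma SavedModel.
@tf.function(input_signature=[tf.TensorSpec([1, 4], tf.float32)])
def double(x):
    return x * 2.0

converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [double.get_concrete_function()]
)
tflite_bytes = converter.convert()

# Sanity-check the resulting flatbuffer with the TFLite interpreter.
interp = tf.lite.Interpreter(model_content=tflite_bytes)
interp.allocate_tensors()
inp = interp.get_input_details()[0]
out = interp.get_output_details()[0]
interp.set_tensor(inp["index"], np.ones((1, 4), np.float32))
interp.invoke()
result = interp.get_tensor(out["index"])
print(result)  # expect [[2. 2. 2. 2.]]
```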
Hi @pkgoogle, I have reproduced the issue in Colab with TF 2.15; the session crashed at that step. Thank you.
Adding @advaitjain and @paulinesho for visibility. |
Use the below code snippet to generate the Gemma 2 quantized model; it will be around 2.33 GB.
Will post the inference results soon.
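The referenced snippet is not visible in this copy of the thread. As a stand-in, here is a minimal sketch of the usual post-training dynamic-range quantization flow with the TFLite converter; the module, variable names, and /tmp paths are placeholders, and for Gemma you would point from_saved_model at the saved backbone instead.

```python
import tensorflow as tf

# Stand-in module; a real run would save and convert the Gemma backbone.
class Demo(tf.Module):
    def __init__(self):
        self.w = tf.Variable(tf.random.normal([4, 8]))

    @tf.function(input_signature=[tf.TensorSpec([1, 4], tf.float32)])
    def __call__(self, x):
        return tf.matmul(x, self.w)

demo = Demo()
tf.saved_model.save(
    demo, "/tmp/demo_sm",
    signatures=demo.__call__.get_concrete_function())

converter = tf.lite.TFLiteConverter.from_saved_model("/tmp/demo_sm")
# Dynamic-range quantization: weights stored as int8, compute in float.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_bytes = converter.convert()
with open("/tmp/model_quant.tflite", "wb") as f:
    f.write(tflite_bytes)
```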
@nyadla-sys
You said that you were able to run it on Android. I generated the quantized model as per @nyadla-sys's code and tried running it on Android.
But I get this error:
How do I fix this?
@farmaker47 @nyadla-sys @freedomtan How do I create the Gemma tokenizer for inputs and outputs on Android? I think the input format is: 0 serving_default_inputs_1:0 FLOAT32 [1, 3] (shape signature [-1, 3]); output:
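The tensor formats quoted above can be inspected from Python before wiring anything up on Android. Note that a plain .tflite flatbuffer does not bundle the tokenizer; tokenization happens outside the model. The sketch below uses a tiny stand-in graph (the [1, 3] / [-1, 3] names in the comment belong to Gemma, not this demo) to show what shape, shape_signature, and dtype look like in the interpreter.

```python
import tensorflow as tf

# Stand-in graph with a dynamic batch dimension, mirroring the
# [1, 3] / [-1, 3] shapes reported for the Gemma model above.
@tf.function(input_signature=[tf.TensorSpec([None, 3], tf.float32)])
def passthrough(x):
    return x * 1.0

tflite_bytes = tf.lite.TFLiteConverter.from_concrete_functions(
    [passthrough.get_concrete_function()]
).convert()

interp = tf.lite.Interpreter(model_content=tflite_bytes)
inp = interp.get_input_details()[0]
# "shape" is the default static shape; "shape_signature" keeps -1 for
# dimensions that can be resized before allocate_tensors().
print(inp["name"], inp["shape"], inp["shape_signature"], inp["dtype"])
interp.resize_tensor_input(inp["index"], [2, 3])
interp.allocate_tensors()
```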
After the Gemma release they also presented this for converting Gemma models. You can also dig in there for the tokenizer, I suppose.
@farmaker47 @nyadla-sys @freedomtan
Hi all, @RageshAntonyHM, if the MediaPipe workflow is not ideal for your use case, we have another option: AI-Edge-Torch, our PyTorch conversion library. You can find more information here: googleblog. We actually have examples of converting and quantizing decoder-only LLMs here: https://github.com/google-ai-edge/ai-edge-torch/tree/main/ai_edge_torch/generative/examples. If the conversion is successful, you can also try visualizing the result in Model Explorer. Please try them out and let us know if this resolves your issue. If you still need further help, feel free to open a new issue in the respective repo. Thanks for your help.
This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.
Hello everyone, I followed all the steps in this issue to create an LLM on mobile using gemma-2b-it, but I have a problem when I try to run the LLM with MediaPipe.
The steps that I followed were:
Does anybody know something about that error?
Hi @jigmam, if you have your converted model and Android Studio project, please share them. We need the context around the call that produces the error in order to debug it. Thanks for your help!
Thanks for your response. If you need to know how I created the .task file, you can see:
If you have any questions or suggestions, please let me know.
Hi @jigmam, while there is probably a way to make MediaPipe work with React Native, I suspect we will keep running into different issues if we go that route (as it's not a well-traveled road). Would you be open to switching to an Android Studio project? If you are targeting Android: https://ai.google.dev/edge/mediapipe/solutions/setup_android
It's okay for me; I just need an MVP to present a prototype. Following your recommendation, I created a new project using the example at that link, but the .task file is not working. Could it be an issue in the converter from Keras to TFLite?
Hi @jigmam, currently Keras 3 does not work well with TFLite, primarily due to SavedModel format issues. What I'm hearing from you is that you were able to follow https://github.com/google-ai-edge/ai-edge-torch/tree/main/ai_edge_torch/generative/examples successfully? If so, great! If not, let us know if you are stuck.
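One relevant detail behind the Keras 3 / SavedModel friction: Keras 3's native .keras format is not readable by the TFLite converter, but model.export() still writes a TF SavedModel inference graph that is. A minimal sketch on a tiny stand-in model (paths hypothetical); whether the same path succeeds for a fine-tuned Gemma backbone is exactly what this thread is probing.

```python
import tensorflow as tf
from tensorflow import keras

# Keras 3's native .keras file is not a TF SavedModel; model.export()
# produces the SavedModel layout that TFLiteConverter expects.
model = keras.Sequential([keras.layers.Dense(2)])
model.build((None, 4))
model.export("/tmp/keras_export_sm")

tflite_bytes = tf.lite.TFLiteConverter.from_saved_model(
    "/tmp/keras_export_sm"
).convert()
```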
Not yet. I am reading the example. Several questions: do I need to do the fine-tuning again? Is that tutorial just used to convert from PyTorch to TFLite? Remember, I did the fine-tuning in Keras.
I tried converting Google Gemma 2B models to TFLite and it ended in failure.
1. System information
2. Code
3. Failure after conversion
I am getting this error:
tensorflow/core.py":65:1))))))))))))))))))))))))))]): error: missing attribute 'value'
LLVM ERROR: Failed to infer result type(s).
Aborted (core dumped)
5. (optional) Any other info / logs