Add class for CLIPVisionModel #816
I suppose we have to make a class very similar to this one: models.js, line 3164 (commit 3072008).
@xenova please help!
Hey there! 👋 That's right, you need to do two things:
If you have example Python code for running the models, feel free to post it here and I can show you the equivalent JS code 👍
Thanks for the help @xenova! I think the ONNX export class for this model would look like:
And here's the Python code we need to run the models:
Hi @xenova, please let me know if the following looks good; I could open a PR if the code is alright. Code for transformersjs/scripts/extra/clip.py:

And for transformersjs/src/models.js:
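For illustration, here is a minimal sketch of what the models.js side could look like, patterned on the existing CLIPVisionModelWithProjection class; the CLIPPreTrainedModel base class and the vision_model file name are assumptions carried over from that class, not the exact snippet under review:

```js
// Hypothetical sketch for src/models.js, mirroring CLIPVisionModelWithProjection.
// Assumes the vision tower is exported to its own ONNX file named vision_model.onnx.
export class CLIPVisionModel extends CLIPPreTrainedModel {
    /** @type {typeof PreTrainedModel.from_pretrained} */
    static async from_pretrained(pretrained_model_name_or_path, options = {}) {
        // Load the separately exported vision encoder unless the caller overrides it.
        options.model_file_name ??= 'vision_model';
        return super.from_pretrained(pretrained_model_name_or_path, options);
    }
}
```

Presumably it would also need to be registered alongside the other CLIP classes in the model mappings so the Auto classes can resolve it.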
Hi @xenova, thanks a lot for building such a fantastic repository; I truly appreciate all the hard work you've put in! I understand it must be difficult to keep up with such an active repository, but I'd humbly like to ask whether I'm good to go with the code above. I know it isn't great to ask for a quick response on top of everything you've already built, but I need this feature for a project whose deadline is just around the corner, so it would be very kind of you to help me get there quickly. Thanks again.
Hi again 👋 Sure, that looks like it will work. Have you tried exporting a model with that config? The usage will be similar to #799 (comment). To use it from JavaScript, you can literally just import the new class.
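A minimal usage sketch, assuming the proposed CLIPVisionModel class has been added and the exported model lives in a hypothetical repo such as your-username/clip-vit-vision with an onnx/vision_model.onnx inside; the last_hidden_state and pooler_output names depend on how the ONNX export was configured:

```js
// Hypothetical usage sketch; the model id and output names are placeholders.
import { AutoProcessor, RawImage, CLIPVisionModel } from '@xenova/transformers';

const model_id = 'your-username/clip-vit-vision';
const processor = await AutoProcessor.from_pretrained(model_id);

// quantized: false loads vision_model.onnx instead of vision_model_quantized.onnx
const vision_model = await CLIPVisionModel.from_pretrained(model_id, { quantized: false });

const image = await RawImage.read('https://example.com/cat.jpg');
const inputs = await processor(image);
const { last_hidden_state, pooler_output } = await vision_model(inputs);
console.log(last_hidden_state.dims, pooler_output.dims);
```

Passing quantized: false matters here, since the default quantized weights can noticeably change the outputs.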
Hi again, thanks for your response! I was able to run the model by extending the class. However, I'm observing a large deviation in the output values when I run the model in JavaScript. After porting the model to ONNX, I tested the model outputs: the fp32 and fp16 models give an average deviation of ~10^-5, which is very good, while the quantized model has a very large deviation, which is expected. However, when I run the model in JavaScript it keeps giving the same deviated output even when I explicitly add quantized: false.

Edit: I've verified that the preprocessed inputs to the Python and JS models are almost identical, which can't justify this difference at all. In fact, I'm using the same preprocessor_config.json file to preprocess the input for the transformersjs model, the onnxruntime session, and the Python transformers model. While the outputs of Python transformers and onnxruntime agree, the outputs from transformersjs don't. I tried it with transformersjs#3 and with 2.17.1, pointing the model file name at the fp32 model (vision_model.onnx), setting quantized: false, and using device: 'cpu', and it still keeps giving the same result. I'm using Node to run this file.
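One way to narrow this down further is to feed the exact same pixel_values tensor to both transformersjs and a raw onnxruntime-node session, so preprocessing is ruled out entirely. A rough sketch, where the local clip-vision folder, the file layout, and the last_hidden_state output name are all assumptions for illustration:

```js
// Debugging sketch: run identical pixel_values through transformersjs and raw
// onnxruntime-node, so any remaining deviation cannot come from preprocessing.
import * as ort from 'onnxruntime-node';
import { env, AutoProcessor, RawImage, CLIPVisionModel } from '@xenova/transformers';

env.allowRemoteModels = false;   // force the locally exported files
env.localModelPath = './models';

const processor = await AutoProcessor.from_pretrained('clip-vision');
const image = await RawImage.read('./cat.jpg');
const { pixel_values } = await processor(image);

// 1) transformersjs (proposed CLIPVisionModel class), full-precision weights
const model = await CLIPVisionModel.from_pretrained('clip-vision', { quantized: false });
const jsOut = (await model({ pixel_values })).last_hidden_state.data;

// 2) raw onnxruntime-node on the identical input tensor
const session = await ort.InferenceSession.create('./models/clip-vision/onnx/vision_model.onnx');
const feeds = { pixel_values: new ort.Tensor('float32', pixel_values.data, pixel_values.dims) };
const ortOut = (await session.run(feeds)).last_hidden_state.data;

// Maximum absolute deviation between the two runs
let maxDiff = 0;
for (let i = 0; i < jsOut.length; ++i) {
  maxDiff = Math.max(maxDiff, Math.abs(jsOut[i] - ortOut[i]));
}
console.log('max |diff| =', maxDiff);
```

If the two runs agree, the deviation is coming from the inputs or options actually reaching the session in transformersjs; if they disagree on identical inputs, it points at the runtime or the file being loaded.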
Model description
The transformersjs equivalent of https://huggingface.co/docs/transformers/v4.41.3/en/model_doc/clip#transformers.CLIPVisionModel
Prerequisites
Additional information
We could use optimum to export the entire CLIP model to ONNX and then load the CLIPVisionModel part from the exported model in transformersjs. Since CLIPVisionModelWithProjection is already in place, I believe its class could serve as the basis for CLIPVisionModel as well.
Your contribution
Given access to relevant resources, I'd be more than happy to contribute to this repo by creating a class for this model type!