
Using unsloth mode gradient checkpointing without LoRA #644

Open
Robinysh opened this issue Jun 14, 2024 · 2 comments

@Robinysh

Currently Unsloth offers a customized version of gradient checkpointing that is claimed to be better. The only way I'm aware of to use it is with the code below.

from unsloth import FastLanguageModel

model = FastLanguageModel.get_peft_model(
    model,
    use_gradient_checkpointing = "unsloth", # <<<<<<<
)

But using FastLanguageModel.get_peft_model will patch the model with LoRA. Is there any way to use Unsloth's customized gradient checkpointing without LoRA? Or does it even make sense to use it without LoRA? Are the customized tricks specific to PEFT?

@danielhanchen
Contributor

We'll be adding support for all models in a future release, which will enable Unsloth GC for other models! I'm unsure about normal full fine-tuning or pretraining - for that I would suggest using DeepSpeed to offload things, rather than Unsloth.
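
For reference, a minimal sketch of what the suggested DeepSpeed offloading could look like with the Hugging Face Trainer; the ZeRO stage, offload targets, and argument values below are illustrative assumptions, not settings recommended in this thread.

# Hypothetical DeepSpeed ZeRO-3 config with CPU offload, passed to the
# Hugging Face Trainer as a dict (a path to a JSON file also works).
from transformers import TrainingArguments

ds_config = {
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "cpu", "pin_memory": True},
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
    },
    "bf16": {"enabled": True},
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}

training_args = TrainingArguments(
    output_dir="outputs",
    per_device_train_batch_size=1,
    gradient_checkpointing=True,  # standard (non-Unsloth) checkpointing
    deepspeed=ds_config,
)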

@Robinysh
Author
Robinysh commented Jun 15, 2024

Great to know it's on the to-do list. I'm not looking for offloading techniques, since the performance drop is quite significant; rather, I'm trying to do gradient checkpointing during pretraining. The PyTorch implementation should be good enough for the time being.
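
For comparison, a minimal sketch of the stock Hugging Face/PyTorch gradient checkpointing path referred to here, without any LoRA adapters; the model name and use_reentrant setting are illustrative assumptions.

# Enable standard (non-Unsloth) gradient checkpointing for full pretraining.
# The model name is only an example.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.bfloat16,
)
model.gradient_checkpointing_enable(
    gradient_checkpointing_kwargs={"use_reentrant": False}
)
model.config.use_cache = False  # KV cache is incompatible with checkpointing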
