Currently Unsloth offers a customized version of gradient checkpointing that it claims is more memory-efficient. The only way I'm aware of enabling it is with the code below.
from unsloth import FastLanguageModel

model = FastLanguageModel.get_peft_model(
    model,
    use_gradient_checkpointing = "unsloth",  # enables Unsloth's custom gradient checkpointing
)
But using FastLanguageModel.get_peft_model will patch the model with LoRA adapters. Is there any way to use Unsloth's customized gradient checkpointing without LoRA? Or does it even make sense to use it without LoRA? Are the customized tricks specific to PEFT?
We'll be adding support for all models in a future release, which will enable Unsloth GC for other models! I'm unsure about normal full fine-tuning or pretraining - for that I would suggest using DeepSpeed to offload states, and not Unsloth.
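For reference, a minimal sketch of a DeepSpeed setup along those lines, passed through the Hugging Face Trainer - the ZeRO stage, precision, and output directory here are placeholder assumptions, not a tuned configuration:

from transformers import TrainingArguments

# Hypothetical minimal DeepSpeed config: ZeRO stage 2 with CPU optimizer
# offload. All values are placeholders, not a tested setup.
ds_config = {
    "train_micro_batch_size_per_gpu": "auto",
    "zero_optimization": {
        "stage": 2,                               # shard optimizer states and gradients
        "offload_optimizer": {"device": "cpu"},   # keep optimizer states in CPU RAM
    },
    "bf16": {"enabled": True},
}

# TrainingArguments accepts either a dict or a path to a JSON file for `deepspeed`.
args = TrainingArguments(output_dir="out", deepspeed=ds_config)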
Great to know it's on the todo list. I'm not looking for offloading techniques, as the performance drop is quite significant; I'm rather trying to do gradient checkpointing during pretraining. The PyTorch implementation should be good enough for the time being.
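For anyone landing here, a minimal sketch of that stock gradient checkpointing path in Transformers, with no PEFT/LoRA patching involved (the model name is only an example):

from transformers import AutoModelForCausalLM

# Standard Transformers/PyTorch gradient checkpointing, no LoRA required.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model.gradient_checkpointing_enable()  # recompute activations during backward
model.config.use_cache = False         # the KV cache is incompatible with checkpointing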