When I run python examples/causal_language_modeling/peft_lora_clm_accelerate_ds_zero3_offload.py, model.print_trainable_parameters() (line 219) prints the following message, as expected:
trainable params: 3932160 || all params: 7072948224 || trainable%: 0.055594355783029126
However, when I follow the instructions in the README and set up Accelerate with DeepSpeed CPU offloading, the same line outputs the following instead:
trainable params: 3932160 || all params: 3932160 || trainable%: 100.0
Digging a little deeper, it looks like model.named_parameters() returns tensors of size 0 (except for the LoRA A and B matrices) when running with Accelerate and DeepSpeed CPU offloading.
This is the original output of model.named_parameters(), showing only the first few parameters
This is the output when running with Accelerate and DeepSpeed CPU offloading
It doesn't look like this impacts fine-tuning, but I was curious why this is happening!
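One plausible explanation, which is an assumption and not something confirmed in this report: with ZeRO stage 3, DeepSpeed partitions parameters across processes and leaves an empty placeholder tensor in the module, keeping the full element count in the parameter's `ds_numel` attribute. A counter that relies only on `numel()` would then see 0 for every partitioned weight. Below is a minimal sketch of a counter that falls back to `ds_numel`. The helper `count_parameters` and the stand-in class `FakeParam` are hypothetical names used for illustration only, and the base-weight size is simply the difference between the two totals reported above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeParam:
    """Hypothetical stand-in for a torch parameter, for illustration only."""
    local_numel: int                 # elements visible in the local tensor
    requires_grad: bool
    ds_numel: Optional[int] = None   # full size, set only when partitioned

    def numel(self) -> int:
        return self.local_numel


def count_parameters(named_params):
    """Return (trainable, total), using ds_numel for empty placeholders."""
    trainable = 0
    total = 0
    for _name, param in named_params:
        n = param.numel()
        # Assumption: a size-0 tensor carrying a ds_numel value is a
        # ZeRO-3 partitioned parameter, so ds_numel is its real size.
        if n == 0 and getattr(param, "ds_numel", None) is not None:
            n = param.ds_numel
        total += n
        if param.requires_grad:
            trainable += n
    return trainable, total


# Illustrative sizes derived from the numbers in this report:
# 7072948224 total minus 3932160 LoRA parameters.
params = [
    ("base.weight", FakeParam(0, False, ds_numel=7069016064)),
    ("lora_A.weight", FakeParam(3932160, True)),
]
trainable, total = count_parameters(params)
print(f"trainable params: {trainable} || all params: {total}")
```

The same loop should work on a real model by passing `model.named_parameters()`, since it only reads `numel()`, `requires_grad`, and (when present) `ds_numel`.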