Hi! Thanks a lot for this fantastic package!

I was running the examples/causal_language_modeling/peft_lora_clm_accelerate_ds_zero3_offload.py script with the bloomz-7b1 model. As per the README, I was expecting ~18.1GB of GPU memory and ~35GB of CPU memory; however, judging from the generated logs (see below; logs for the 15th epoch), GPU memory consumption is much higher, close to 32GB, while CPU memory usage is much lower.
Edit: I think I missed some setup steps required for DeepSpeed offloading, since the is_ds_zero_3 variable at line 238 is always False. Please let me know! Thank you.
Note: I'm running this on an Ubuntu 18.04 x86_64 machine with a single 40GB A100 GPU.
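For context, here is a minimal sketch of the kind of check a script like this performs (the function name and config dicts below are illustrative, not the script's exact code): the run only counts as ZeRO-3 when the DeepSpeed config actually requests stage 3, so launching without a DeepSpeed-enabled accelerate config leaves the flag False.

```python
# Illustrative sketch (not the script's exact code): the ZeRO-3 flag stays
# False unless the launcher supplies a DeepSpeed config with ZeRO stage 3.
def is_zero3(ds_config):
    """Return True when the DeepSpeed config requests ZeRO stage 3."""
    if ds_config is None:  # e.g. launched without a DeepSpeed accelerate config
        return False
    return ds_config.get("zero_optimization", {}).get("stage") == 3

# With stage-3 CPU offload configured, the flag should be True:
offload_cfg = {
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu"},
        "offload_param": {"device": "cpu"},
    }
}
print(is_zero3(offload_cfg))  # True
print(is_zero3(None))         # False: parameters stay fully on GPU
```

If that flag is False, parameters and optimizer state are never partitioned or offloaded to CPU, which could explain GPU usage near the model's full footprint instead of the README's ~18.1GB.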
```
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:06<00:00, 1.11it/s]
GPU Memory before entering the train : 27026
GPU Memory consumed at the end of the train (end-begin): 242
GPU Peak Memory consumed during the train (max-begin): 5011
GPU Total Peak Memory consumed during the train (max): 32037
CPU Memory before entering the train : 4080
CPU Memory consumed at the end of the train (end-begin): 0
CPU Peak Memory consumed during the train (max-begin): 0
CPU Total Peak Memory consumed during the train (max): 4080
epoch=15: train_ppl=tensor(2.0908, device='cuda:0') train_epoch_loss=tensor(0.7375, device='cuda:0')
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:08<00:00, 1.26s/it]
GPU Memory before entering the eval : 27268
GPU Memory consumed at the end of the eval (end-begin): -242
GPU Peak Memory consumed during the eval (max-begin): 1465
GPU Total Peak Memory consumed during the eval (max): 28733
CPU Memory before entering the eval : 4080
CPU Memory consumed at the end of the eval (end-begin): 0
CPU Peak Memory consumed during the eval (max-begin): 0
CPU Total Peak Memory consumed during the eval (max): 4080
accuracy=84.0
eval_preds[:10]=['no complaint', 'no complaint', 'complaint', 'complaint', 'no complaint', 'no complaint', 'no complaint', 'no complaint', 'complaint', 'no complaint']
dataset['train'][label_column][:10]=['no complaint', 'no complaint', 'complaint', 'complaint', 'no complaint', 'no complaint', 'no complaint', 'complaint', 'complaint', 'no complaint']
```