
glm4 deepspeed lora sft : Cannot copy out of meta tensor; no data! #4689

Open
1 task done
ldknight opened this issue Jul 5, 2024 · 3 comments
Labels
pending This problem is yet to be addressed

Comments

ldknight commented Jul 5, 2024

Reminder

  • I have read the README and searched the existing issues.

System Info

  • llamafactory version: 0.8.3.dev0
  • Platform: Linux-5.15.0-112-generic-x86_64-with-glibc2.31
  • Python version: 3.10.13
  • PyTorch version: 2.1.2+cu121 (GPU)
  • Transformers version: 4.41.2
  • Datasets version: 2.18.0
  • Accelerate version: 0.30.1
  • PEFT version: 0.11.2.dev0
  • TRL version: 0.9.4
  • GPU type: NVIDIA GeForce RTX 4090 D
  • DeepSpeed version: 0.14.0
  • Bitsandbytes version: 0.43.1
  • vLLM version: 0.4.0.post1

Reproduction

!NCCL_P2P_DISABLE=1 NCCL_IB_DISABLE=1 deepspeed --include="localhost:0" src/train.py \
    --deepspeed /home/LLaMA-Factory-latest/examples/deepspeed/ds_z3_offload_config.json \
    --stage sft \
    --do_train True \
    --model_name_or_path /home/models_dir/glm-4-9b \
    --preprocessing_num_workers 4 \
    --finetuning_type lora \
    --template glm4 \
    --flash_attn auto \
    --dataset identity \
    --dataset_dir data \
    --cutoff_len 1024 \
    --learning_rate 5e-05 \
    --num_train_epochs 1 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 1 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 10 \
    --save_steps 1 \
    --eval_steps 1 \
    --val_size 0.1 \
    --evaluation_strategy epoch \
    --save_strategy epoch \
    --save_total_limit 1 \
    --optim adamw_torch \
    --packing True \
    --upcast_layernorm False \
    --output_dir /home/LLaMA-Factory-latest/save/lora/sft/train_2407051411 \
    --ddp_timeout 180000000 \
    --overwrite_cache \
    --overwrite_output_dir \
    --quantization_bit 4 \
    --low_cpu_mem_usage False
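
Note that the two settings that interact here are --quantization_bit 4 (on-the-fly bitsandbytes quantization) and the ZeRO-3 offload config, which initializes parameters on the meta device. A pre-flight check like the following (a sketch, not LLaMA-Factory code; model stands for the already-loaded HF model) surfaces the problem before DeepSpeed is initialized:

from torch import nn

# Sketch: list parameters that have no real storage (meta device) before
# handing the model to deepspeed.initialize(). Under ZeRO-3, bitsandbytes
# weights that were never materialized show up here.
def find_meta_params(model: nn.Module) -> list[str]:
    return [name for name, p in model.named_parameters() if p.is_meta]

# e.g. raise early instead of crashing deep inside DeepSpeed:
# if find_meta_params(model):
#     raise RuntimeError("model still has meta parameters; see traceback below")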

Expected behavior

I have tried #3062 and #1130, and I would like to know how to solve this problem. Thanks!

Others

[2024-07-05 06:53:04,664] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-07-05 06:53:05,218] [WARNING] [runner.py:202:fetch_hostfile] Unable to find hostfile, will proceed with training with local resources only.
[2024-07-05 06:53:05,229] [INFO] [runner.py:568:main] cmd = /opt/conda/bin/python -u -m deepspeed.launcher.launch --world_info=eyJsb2NhbGhvc3QiOiBbMF19 --master_addr=127.0.0.1 --master_port=29500 --enable_each_rank_log=None src/train.py --deepspeed /home/LLaMA-Factory-latest/examples/deepspeed/ds_z3_offload_config.json --stage sft --do_train True --model_name_or_path /home/models_dir/glm-4-9b --preprocessing_num_workers 4 --finetuning_type lora --template glm4 --flash_attn auto --dataset identity --dataset_dir data --cutoff_len 1024 --learning_rate 5e-05 --num_train_epochs 1 --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --gradient_accumulation_steps 1 --lr_scheduler_type cosine --max_grad_norm 1.0 --logging_steps 10 --save_steps 1 --eval_steps 1 --val_size 0.1 --evaluation_strategy epoch --save_strategy epoch --save_total_limit 1 --optim adamw_torch --packing True --upcast_layernorm False --output_dir /home/LLaMA-Factory-latest/save/lora/sft/train_2407051411 --ddp_timeout 180000000 --overwrite_cache --overwrite_output_dir --quantization_bit 4 --low_cpu_mem_usage False
[2024-07-05 06:53:08,836] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-07-05 06:53:09,417] [INFO] [launch.py:138:main] 0 NCCL_IB_DISABLE=1
[2024-07-05 06:53:09,417] [INFO] [launch.py:138:main] 0 NCCL_P2P_DISABLE=1
[2024-07-05 06:53:09,417] [INFO] [launch.py:138:main] 0 NV_LIBNCCL_DEV_PACKAGE=libnccl-dev=2.17.1-1+cuda12.1
[2024-07-05 06:53:09,417] [INFO] [launch.py:138:main] 0 NV_LIBNCCL_DEV_PACKAGE_VERSION=2.17.1-1
[2024-07-05 06:53:09,417] [INFO] [launch.py:138:main] 0 NCCL_VERSION=2.17.1-1
[2024-07-05 06:53:09,417] [INFO] [launch.py:138:main] 0 NV_LIBNCCL_DEV_PACKAGE_NAME=libnccl-dev
[2024-07-05 06:53:09,417] [INFO] [launch.py:138:main] 0 NV_LIBNCCL_PACKAGE=libnccl2=2.17.1-1+cuda12.1
[2024-07-05 06:53:09,417] [INFO] [launch.py:138:main] 0 NV_LIBNCCL_PACKAGE_NAME=libnccl2
[2024-07-05 06:53:09,417] [INFO] [launch.py:138:main] 0 NV_LIBNCCL_PACKAGE_VERSION=2.17.1-1
[2024-07-05 06:53:09,417] [INFO] [launch.py:145:main] WORLD INFO DICT: {'localhost': [0]}
[2024-07-05 06:53:09,417] [INFO] [launch.py:151:main] nnodes=1, num_local_procs=1, node_rank=0
[2024-07-05 06:53:09,417] [INFO] [launch.py:162:main] global_rank_mapping=defaultdict(<class 'list'>, {'localhost': [0]})
[2024-07-05 06:53:09,417] [INFO] [launch.py:163:main] dist_world_size=1
[2024-07-05 06:53:09,417] [INFO] [launch.py:165:main] Setting CUDA_VISIBLE_DEVICES=0
[2024-07-05 06:53:09,418] [INFO] [launch.py:253:main] process 758460 spawned with command: ['/opt/conda/bin/python', '-u', 'src/train.py', '--local_rank=0', '--deepspeed', '/home/LLaMA-Factory-latest/examples/deepspeed/ds_z3_offload_config.json', '--stage', 'sft', '--do_train', 'True', '--model_name_or_path', '/home/models_dir/glm-4-9b', '--preprocessing_num_workers', '4', '--finetuning_type', 'lora', '--template', 'glm4', '--flash_attn', 'auto', '--dataset', 'identity', '--dataset_dir', 'data', '--cutoff_len', '1024', '--learning_rate', '5e-05', '--num_train_epochs', '1', '--per_device_train_batch_size', '1', '--per_device_eval_batch_size', '1', '--gradient_accumulation_steps', '1', '--lr_scheduler_type', 'cosine', '--max_grad_norm', '1.0', '--logging_steps', '10', '--save_steps', '1', '--eval_steps', '1', '--val_size', '0.1', '--evaluation_strategy', 'epoch', '--save_strategy', 'epoch', '--save_total_limit', '1', '--optim', 'adamw_torch', '--packing', 'True', '--upcast_layernorm', 'False', '--output_dir', '/home/LLaMA-Factory-latest/save/lora/sft/train_2407051411', '--ddp_timeout', '180000000', '--overwrite_cache', '--overwrite_output_dir', '--quantization_bit', '4', '--low_cpu_mem_usage', 'False']
[2024-07-05 06:53:15,370] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/opt/conda/lib/python3.10/site-packages/transformers/training_args.py:1474: FutureWarning: evaluation_strategy is deprecated and will be removed in version 4.46 of 🤗 Transformers. Use eval_strategy instead
warnings.warn(
[2024-07-05 06:53:19,402] [INFO] [comm.py:637:init_distributed] cdb=None
[2024-07-05 06:53:19,403] [INFO] [comm.py:668:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
07/05/2024 06:53:19 - WARNING - llamafactory.hparams.parser - We recommend enable upcast_layernorm in quantized training.
07/05/2024 06:53:19 - WARNING - llamafactory.hparams.parser - We recommend enable mixed precision training.
07/05/2024 06:53:19 - WARNING - llamafactory.hparams.parser - ddp_find_unused_parameters needs to be set as False for LoRA in DDP training.
07/05/2024 06:53:19 - INFO - llamafactory.hparams.parser - Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: True, compute dtype: None
[INFO|tokenization_utils_base.py:2106] 2024-07-05 06:53:19,507 >> loading file tokenizer.model
[INFO|tokenization_utils_base.py:2106] 2024-07-05 06:53:19,507 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2106] 2024-07-05 06:53:19,507 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2106] 2024-07-05 06:53:19,507 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2106] 2024-07-05 06:53:19,507 >> loading file tokenizer.json
[WARNING|logging.py:314] 2024-07-05 06:53:20,366 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
07/05/2024 06:53:20 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words.
07/05/2024 06:53:20 - INFO - llamafactory.data.loader - Loading dataset identity.json...
Converting format of dataset (num_proc=4): 100%|█| 91/91 [00:00<00:00, 487.86 examples/s]
Running tokenizer on dataset (num_proc=4): 100%|█| 91/91 [00:10<00:00, 8.34 examples/s]
[INFO|configuration_utils.py:731] 2024-07-05 06:53:35,706 >> loading configuration file /home/models_dir/glm-4-9b/config.json
[INFO|configuration_utils.py:731] 2024-07-05 06:53:35,709 >> loading configuration file /home/models_dir/glm-4-9b/config.json
[INFO|configuration_utils.py:796] 2024-07-05 06:53:35,711 >> Model config ChatGLMConfig {
  "_name_or_path": "/home/models_dir/glm-4-9b",
  "add_bias_linear": false,
  "add_qkv_bias": true,
  "apply_query_key_layer_scaling": true,
  "apply_residual_connection_post_layernorm": false,
  "architectures": [
    "ChatGLMModel"
  ],
  "attention_dropout": 0.0,
  "attention_softmax_in_fp32": true,
  "auto_map": {
    "AutoConfig": "configuration_chatglm.ChatGLMConfig",
    "AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForCausalLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForSequenceClassification": "modeling_chatglm.ChatGLMForSequenceClassification"
  },
  "bias_dropout_fusion": true,
  "classifier_dropout": null,
  "eos_token_id": [
    151329,
    151336,
    151338
  ],
  "ffn_hidden_size": 13696,
  "fp32_residual_connection": false,
  "hidden_dropout": 0.0,
  "hidden_size": 4096,
  "kv_channels": 128,
  "layernorm_epsilon": 1.5625e-07,
  "model_type": "chatglm",
  "multi_query_attention": true,
  "multi_query_group_num": 2,
  "num_attention_heads": 32,
  "num_layers": 40,
  "original_rope": true,
  "pad_token_id": 151329,
  "padded_vocab_size": 151552,
  "post_layer_norm": true,
  "rmsnorm": true,
  "rope_ratio": 1,
  "seq_length": 8192,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.41.2",
  "use_cache": true,
  "vocab_size": 151552
}

07/05/2024 06:53:35 - INFO - llamafactory.model.model_utils.quantization - Quantizing model to 4 bit with bitsandbytes.
[INFO|quantizer_bnb_4bit.py:244] 2024-07-05 06:53:35,781 >> The device_map was not initialized. Setting device_map to {'':torch.cuda.current_device()}. If you want to use the model for inference, please set device_map ='auto'
[INFO|modeling_utils.py:3471] 2024-07-05 06:53:35,782 >> loading weights file /home/models_dir/glm-4-9b/model.safetensors.index.json
[INFO|modeling_utils.py:1519] 2024-07-05 06:53:35,782 >> Instantiating ChatGLMForConditionalGeneration model under default dtype torch.bfloat16.
[INFO|configuration_utils.py:962] 2024-07-05 06:53:35,784 >> Generate config GenerationConfig {
  "eos_token_id": [
    151329,
    151336,
    151338
  ],
  "pad_token_id": 151329
}

Loading checkpoint shards: 0%| | 0/10 [00:00<?, ?it/s]/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:2025: UserWarning: for transformer.encoder.layers.0.self_attention.query_key_value.weight: copying from a non-meta parameter in the checkpoint to a meta parameter in the current model, which is a no-op. (Did you mean to pass assign=True to assign items in the state dictionary to their corresponding key in the module instead of copying them in place?)
warnings.warn(f'for {key}: copying from a non-meta parameter in the checkpoint to a meta '
/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:2025: UserWarning: for transformer.encoder.layers.0.self_attention.query_key_value.bias: copying from a non-meta parameter in the checkpoint to a meta parameter in the current model, which is a no-op. (Did you mean to pass assign=True to assign items in the state dictionary to their corresponding key in the module instead of copying them in place?)
warnings.warn(f'for {key}: copying from a non-meta parameter in the checkpoint to a meta '
/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:2025: UserWarning: for transformer.encoder.layers.0.self_attention.dense.weight: copying from a non-meta parameter in the checkpoint to a meta parameter in the current model, which is a no-op. (Did you mean to pass assign=True to assign items in the state dictionary to their corresponding key in the module instead of copying them in place?)
warnings.warn(f'for {key}: copying from a non-meta parameter in the checkpoint to a meta '
/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:2025: UserWarning: for transformer.encoder.layers.0.mlp.dense_h_to_4h.weight: copying from a non-meta parameter in the checkpoint to a meta parameter in the current model, which is a no-op. (Did you mean to pass assign=True to assign items in the state dictionary to their corresponding key in the module instead of copying them in place?)
warnings.warn(f'for {key}: copying from a non-meta parameter in the checkpoint to a meta '
/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:2025: UserWarning: for transformer.encoder.layers.0.mlp.dense_4h_to_h.weight: copying from a non-meta parameter in the checkpoint to a meta parameter in the current model, which is a no-op. (Did you mean to pass assign=True to assign items in the state dictionary to their corresponding key in the module instead of copying them in place?)
warnings.warn(f'for {key}: copying from a non-meta parameter in the checkpoint to a meta '
[... the same UserWarning is emitted for query_key_value.weight/bias, dense.weight, dense_h_to_4h.weight, and dense_4h_to_h.weight of every remaining layer (1-39) ...]
Loading checkpoint shards: 100%|████████████████| 10/10 [00:02<00:00, 3.63it/s]
[INFO|modeling_utils.py:4280] 2024-07-05 06:53:42,079 >> All model checkpoint weights were used when initializing ChatGLMForConditionalGeneration.

[INFO|modeling_utils.py:4288] 2024-07-05 06:53:42,080 >> All the weights of ChatGLMForConditionalGeneration were initialized from the model checkpoint at /home/models_dir/glm-4-9b.
If your task is similar to the task the model of the checkpoint was trained on, you can already use ChatGLMForConditionalGeneration for predictions without further training.
[INFO|modeling_utils.py:3797] 2024-07-05 06:53:42,084 >> Generation config file not found, using a generation config created from the model config.
07/05/2024 06:53:42 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
07/05/2024 06:53:42 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
07/05/2024 06:53:42 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
07/05/2024 06:53:42 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
07/05/2024 06:53:42 - INFO - llamafactory.model.model_utils.misc - Found linear modules: dense_h_to_4h,dense,query_key_value,dense_4h_to_h
07/05/2024 06:53:42 - INFO - llamafactory.model.loader - trainable params: 21,176,320 || all params: 33,894,891,520 || trainable%: 0.0625
[INFO|deepspeed.py:328] 2024-07-05 06:53:42,880 >> Detected ZeRO Offload and non-DeepSpeed optimizers: This combination should work as long as the custom optimizer has both CPU and GPU implementation (except LAMB)
Using /root/.cache/torch_extensions/py310_cu121 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /root/.cache/torch_extensions/py310_cu121/cpu_adam/build.ninja...
Building extension module cpu_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module cpu_adam...
Time to load cpu_adam op: 2.567744731903076 seconds
Adam Optimizer #0 is created with AVX2 arithmetic capability.
Config: alpha=0.000050, betas=(0.900000, 0.999000), weight_decay=0.010000, adam_w=1
[2024-07-05 06:53:47,936] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.14.0, git-hash=unknown, git-branch=unknown
Traceback (most recent call last):
  File "/home/LLaMA-Factory-latest/src/train.py", line 28, in <module>
    main()
  File "/home/LLaMA-Factory-latest/src/train.py", line 19, in main
    run_exp()
  File "/home/LLaMA-Factory-latest/src/llamafactory/train/tuner.py", line 50, in run_exp
    run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/home/LLaMA-Factory-latest/src/llamafactory/train/sft/workflow.py", line 90, in run_sft
    train_result = trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 1885, in train
    return inner_training_loop(
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 2042, in _inner_training_loop
    model, self.optimizer = self.accelerator.prepare(self.model, self.optimizer)
  File "/opt/conda/lib/python3.10/site-packages/accelerate/accelerator.py", line 1284, in prepare
    result = self._prepare_deepspeed(*args)
  File "/opt/conda/lib/python3.10/site-packages/accelerate/accelerator.py", line 1751, in _prepare_deepspeed
    engine, optimizer, _, lr_scheduler = deepspeed.initialize(**kwargs)
  File "/opt/conda/lib/python3.10/site-packages/deepspeed/__init__.py", line 176, in initialize
    engine = DeepSpeedEngine(args=args,
  File "/opt/conda/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 262, in __init__
    self._configure_distributed_model(model)
  File "/opt/conda/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1112, in _configure_distributed_model
    self.module.to(self.device)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1160, in to
    return self._apply(convert)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 810, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 810, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 810, in _apply
    module._apply(fn)
  [Previous line repeated 6 more times]
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 833, in _apply
    param_applied = fn(param)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1158, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
  File "/opt/conda/lib/python3.10/site-packages/bitsandbytes/nn/modules.py", line 324, in to
    return self._quantize(device)
  File "/opt/conda/lib/python3.10/site-packages/bitsandbytes/nn/modules.py", line 288, in _quantize
    w = self.data.contiguous().cuda(device)
NotImplementedError: Cannot copy out of meta tensor; no data!
[2024-07-05 06:53:51,461] [INFO] [launch.py:316:sigkill_handler] Killing subprocess 758460
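
The traceback shows the mechanism: DeepSpeed's _configure_distributed_model() calls self.module.to(self.device), which walks every parameter; for a 4-bit layer this lands in bitsandbytes' Params4bit._quantize(), which needs the real weight data (self.data.contiguous()), but under ZeRO-3's meta-device initialization that data was never materialized. The failure reproduces in isolation (a minimal sketch, independent of DeepSpeed and bitsandbytes):

import torch

# Copying out of a meta tensor always fails this way, whether the target is
# CPU or CUDA; it is what bitsandbytes' _quantize() ultimately attempts.
w = torch.empty(4, 4, device="meta")
try:
    w.contiguous().to("cpu")
except NotImplementedError as e:
    print(e)  # Cannot copy out of meta tensor; no data!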

github-actions bot added the pending label on Jul 5, 2024
codemayq (Collaborator) commented Jul 5, 2024

After modifying the source code as described in #1130, you need to reinstall this project for the changes to take effect.

ldknight (Author) commented Jul 8, 2024

> After modifying the source code as described in #1130, you need to reinstall this project for the changes to take effect.

@codemayq By "reinstall", do you mean re-running the command (pip install -e ".[torch,metrics]")?

ldknight (Author) commented Jul 8, 2024

> > After modifying the source code as described in #1130, you need to reinstall this project for the changes to take effect.
>
> @codemayq By "reinstall", do you mean re-running the command (pip install -e ".[torch,metrics]")?

I have tried reinstalling the project, but it did not work.
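
For anyone hitting the same wall: in the transformers/DeepSpeed versions listed above, bitsandbytes quantization and ZeRO-3 parameter partitioning generally do not mix, because ZeRO-3 shards and lazily materializes weights while bitsandbytes must see the full tensor to quantize it. The usual workarounds are switching to a ZeRO-2 DeepSpeed config or dropping --quantization_bit 4. A guard one could add to a launch script (a sketch: is_deepspeed_zero3_enabled is a real transformers helper, the surrounding logic is illustrative):

# Sketch of an early guard. Note that is_deepspeed_zero3_enabled() only
# reflects the DeepSpeed config once TrainingArguments have been constructed.
from transformers.integrations import is_deepspeed_zero3_enabled

def assert_quantization_compatible(quantization_bit):
    if quantization_bit is not None and is_deepspeed_zero3_enabled():
        raise ValueError(
            "bitsandbytes quantization needs materialized weights, but "
            "ZeRO-3 initializes parameters on the meta device; use a "
            "ZeRO-2 config or remove --quantization_bit."
        )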
