

[gpt] Gpt2 fix half precision causal mask #23256

Merged
younesbelkada merged 4 commits into huggingface:main on May 11, 2023

Conversation

younesbelkada (Contributor)

What does this PR do?

Applies a fix similar to #23136, but for GPT2.

To reproduce:

```python
import torch
from transformers import AutoModelForCausalLM

# Loading in 8bit force-casts parameters and buffers to float16,
# which also hits the causal mask buffer in the attention layers.
model = AutoModelForCausalLM.from_pretrained("gpt2", device_map="auto", load_in_8bit=True)
inputs = torch.LongTensor([[1, 1, 1], [1, 2, 1]]).to(0)

print(model(inputs))
```

The explanation is the same as in the linked PR:

When going through the `low_cpu_mem_usage` path, each parameter is force-cast to the expected dtype, which is force-set to `torch.float16` for 8bit models.

Therefore, for 8bit models (and half-precision models in general), the causal mask is always force-cast to float16: it is part of the model's state dict, so it is loaded from the Hub whenever the checkpoint contains it.
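
For context, the boolean mask feeds a `torch.where` call inside GPT2's attention, so once the buffer is force-cast to float16 the condition is no longer boolean and the masking breaks. A minimal, self-contained sketch of that pattern (paraphrased from `modeling_gpt2.py`, with names and shapes simplified):

```python
import torch

def causal_masked_weights(attn_weights: torch.Tensor, bias: torch.Tensor) -> torch.Tensor:
    """GPT2-style causal masking; `bias` is the boolean lower-triangular buffer."""
    query_length, key_length = attn_weights.size(-2), attn_weights.size(-1)
    causal_mask = bias[:, :, key_length - query_length : key_length, :key_length]
    mask_value = torch.finfo(attn_weights.dtype).min
    mask_value = torch.tensor(mask_value, dtype=attn_weights.dtype, device=attn_weights.device)
    # torch.where needs a boolean condition: if `bias` has been force-cast
    # to float16 during loading, this is the line that misbehaves.
    return torch.where(causal_mask, attn_weights, mask_value)

# The buffer as GPT2 registers it: a boolean lower-triangular matrix.
max_positions = 8
bias = torch.tril(torch.ones((max_positions, max_positions), dtype=torch.bool)).view(
    1, 1, max_positions, max_positions
)
print(causal_masked_weights(torch.randn(1, 1, 3, 3), bias))
```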

The fix is to register the buffer with `persistent=False` and to add its key patterns to `_keys_to_ignore_on_load_unexpected` (to remove the warnings), so the causal mask is never loaded from the state dict and assigned to the buffer. All causal masks that are saved as buffers should follow the same pattern to avoid unexpected behavior.
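
A minimal sketch of what that change looks like (illustrative only; the exact diff is in this PR's files, and `SketchAttention` is a made-up stand-in for `GPT2Attention`):

```python
import torch
from torch import nn

class SketchAttention(nn.Module):
    """Stand-in for GPT2Attention, showing only the buffer registration."""

    def __init__(self, max_positions: int = 1024):
        super().__init__()
        # persistent=False keeps the mask out of the state dict entirely,
        # so from_pretrained can never force-cast it while loading.
        self.register_buffer(
            "bias",
            torch.tril(torch.ones((max_positions, max_positions), dtype=torch.bool)).view(
                1, 1, max_positions, max_positions
            ),
            persistent=False,
        )
        self.register_buffer("masked_bias", torch.tensor(-1e4), persistent=False)

print("bias" in SketchAttention().state_dict())  # False: the mask is never serialized
```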

Some users reported that they were also able to reproduce this on the PyTorch main branch without `load_in_8bit`. I didn't manage to reproduce it that way; I will take a deeper look.

cc @amyeroberts

@younesbelkada changed the title from "[gpt] Gpt2 fix 8bit inference" to "[gpt] Gpt2 fix half precision causal mask" on May 10, 2023
HuggingFaceDocBuilderDev commented on May 10, 2023:

The documentation is not available anymore as the PR was closed or merged.

amyeroberts (Collaborator) left a comment:

Thanks for fixing! I just have a question about `_keys_to_ignore_on_load_missing` for the decision transformer.

```diff
@@ -746,7 +747,8 @@ class DecisionTransformerPreTrainedModel(PreTrainedModel):
     base_model_prefix = "decision_transformer"
     main_input_name = "states"
     supports_gradient_checkpointing = False
-    _keys_to_ignore_on_load_missing = [r"position_ids"]
+    _keys_to_ignore_on_load_missing = [r"position_ids", r"h\.\d+\.attn\.masked_bias", r"h\.\d+\.attn\.bias"]
```
amyeroberts (Collaborator) commented:

Should `r"h\.\d+\.attn\.masked_bias"` and `r"h\.\d+\.attn\.bias"` be in `_keys_to_ignore_on_load_missing`?

younesbelkada (Contributor, Author) replied on May 10, 2023:

Ah, you are right, it should probably be in `_keys_to_ignore_on_load_unexpected` only; will modify that!
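
For readers following along, the difference between the two attributes (a hedged summary; the regex patterns below mirror the ones in this PR): `_keys_to_ignore_on_load_missing` suppresses warnings for keys the model expects but the checkpoint lacks, while `_keys_to_ignore_on_load_unexpected` suppresses warnings for keys the checkpoint carries but the model no longer loads. Since the causal mask buffers are now non-persistent, old checkpoints still contain them, so they belong in the unexpected list:

```python
# Illustrative sketch only; the real attributes live on the pretrained
# model classes in modeling_gpt2.py / modeling_decision_transformer.py.
class GPT2PreTrainedModelSketch:
    # expected by the model, but allowed to be absent from the checkpoint
    _keys_to_ignore_on_load_missing = [r"position_ids"]
    # present in older checkpoints, but no longer loaded now that the
    # causal mask buffers are registered with persistent=False
    _keys_to_ignore_on_load_unexpected = [r"h\.\d+\.attn\.bias", r"h\.\d+\.attn\.masked_bias"]
```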

amyeroberts (Collaborator) left a comment:

LGTM! Thanks for fixing :)


@younesbelkada younesbelkada merged commit ca26699 into huggingface:main May 11, 2023
@younesbelkada younesbelkada deleted the gpt2-fix-inferencnce branch May 11, 2023 07:32
sheonhan pushed a commit to sheonhan/transformers that referenced this pull request May 15, 2023
* fix gpt2 inference

* fixup

* no need to be in `_keys_to_ignore_on_load_missing`
gojiteji pushed a commit to gojiteji/transformers that referenced this pull request Jun 5, 2023
* fix gpt2 inference

* fixup

* no need to be in `_keys_to_ignore_on_load_missing`
novice03 pushed a commit to novice03/transformers that referenced this pull request Jun 23, 2023
* fix gpt2 inference

* fixup

* no need to be in `_keys_to_ignore_on_load_missing`