Cannot generate w/ do_sample=True on a LoRA model #81

Closed
minimaxir opened this issue Feb 13, 2023 · 2 comments

Comments

@minimaxir

From the last cell of the OPT Notebook, adding do_sample=True to generate() raises:

/opt/conda/lib/python3.7/site-packages/peft/peft_model.py:550 in generate

   547
   548     def generate(self, **kwargs):
   549         if not isinstance(self.peft_config, PromptLearningConfig):
❱  550             return self.base_model.generate(**kwargs)
   551         else:
   552             if "input_ids" not in kwargs:
   553                 raise ValueError("input_ids must be provided for Peft model generation")

/opt/conda/lib/python3.7/site-packages/torch/autograd/grad_mode.py:27 in decorate_context

    24         @functools.wraps(func)
    25         def decorate_context(*args, **kwargs):
    26             with self.clone():
❱   27                 return func(*args, **kwargs)
    28         return cast(F, decorate_context)
    29
    30     def _wrap_generator(self, func):

/opt/conda/lib/python3.7/site-packages/transformers/generation/utils.py:1442 in generate

   1439                 output_scores=generation_config.output_scores,
   1440                 return_dict_in_generate=generation_config.return_dict_in_generate,
   1441                 synced_gpus=synced_gpus,
❱  1442                 **model_kwargs,
   1443             )
   1444
   1445         elif is_beam_gen_mode:

/opt/conda/lib/python3.7/site-packages/transformers/generation/utils.py:2462 in sample

   2459
   2460             # pre-process distribution
   2461             next_token_scores = logits_processor(input_ids, next_token_logits)
❱  2462             next_token_scores = logits_warper(input_ids, next_token_scores)
   2463
   2464             # Store scores, attentions and hidden_states when required
   2465             if return_dict_in_generate:

/opt/conda/lib/python3.7/site-packages/transformers/generation/logits_process.py:92 in __call__

    89                     )
    90                 scores = processor(input_ids, scores, **kwargs)
    91             else:
❱   92                 scores = processor(input_ids, scores)
    93         return scores
    94
    95

/opt/conda/lib/python3.7/site-packages/transformers/generation/logits_process.py:297 in __call__

   294     def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.
   295         top_k = min(self.top_k, scores.size(-1))  # Safety check
   296         # Remove all tokens with a probability less than the last token of the top-k
❱  297         indices_to_remove = scores < torch.topk(scores, top_k)[0][..., -1, None]
   298         scores = scores.masked_fill(indices_to_remove, self.filter_value)
   299         return scores
   300
RuntimeError: "topk_cpu" not implemented for 'Half'
@younesbelkada
Contributor

Hi @minimaxir
Thanks for the issue!
In fact, you need to put your input_ids on the correct device; calling generate with input_ids.to(0) should fix your issue.
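
A hedged sketch of that fix, continuing the illustrative setup above (device index 0 is assumed, per the comment; sampling parameters are illustrative):

```python
# Continuing the sketch above: move the tokenized batch onto the GPU before
# calling generate, so every tensor in the sampling path lives on CUDA.
batch = tokenizer("Two things are infinite: ", return_tensors="pt").to(0)

with torch.no_grad():
    output_tokens = model.generate(
        **batch,
        max_new_tokens=50,
        do_sample=True,  # sampling now works: top-k runs on CUDA fp16 tensors
        top_k=50,
        top_p=0.95,
    )

print(tokenizer.decode(output_tokens[0], skip_special_tokens=True))
```

Equivalently, moving the whole BatchEncoding with batch.to("cuda") (as in the next comment) has the same effect.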

@minimaxir
Author

Yep, a batch.to('cuda') did the trick. Thanks!
