Use peft for RLHF #71

Closed
younesbelkada opened this issue Feb 11, 2023 · 5 comments

@younesbelkada
Contributor

Feature request

We should leverage trl (https://github.com/lvwerra/trl), the recent library from Hugging Face for RLHF, to apply PPO using peft and LoRA.

I think peft should just work out of the box. The first step could be to adapt the gpt2-sentiment.py script (https://github.com/lvwerra/trl/blob/main/examples/sentiment/scripts/gpt2-sentiment.py) to use peft.
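
For readers who are new to LoRA, here is a rough, dependency-free sketch of the idea behind it: the pretrained weight stays frozen and only a small low-rank update is trained, which is what makes it attractive for PPO. This is not how peft or trl implement it; the class `LoRALinear`, the helper `matmul`, and the sizes are made up for illustration only.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Toy linear layer y = x @ (W + (alpha / r) * A @ B), with W frozen.

    Hypothetical illustration only; not the peft implementation.
    """

    def __init__(self, d_in, d_out, r=4, alpha=8, seed=0):
        rng = random.Random(seed)
        self.scale = alpha / r
        # Frozen "pretrained" weight (random here, purely for illustration).
        self.W = [[rng.gauss(0, 0.02) for _ in range(d_out)] for _ in range(d_in)]
        # Trainable low-rank factors. B starts at zero, so the adapted
        # layer initially computes exactly the same output as the frozen one.
        self.A = [[rng.gauss(0, 0.01) for _ in range(r)] for _ in range(d_in)]
        self.B = [[0.0] * d_out for _ in range(r)]

    def forward(self, x):
        base = matmul(x, self.W)                   # frozen path
        delta = matmul(matmul(x, self.A), self.B)  # low-rank update path
        return [[b + self.scale * d for b, d in zip(b_row, d_row)]
                for b_row, d_row in zip(base, delta)]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

    def frozen_params(self):
        return len(self.W) * len(self.W[0])


layer = LoRALinear(d_in=768, d_out=768, r=4)
print(layer.trainable_params(), layer.frozen_params())  # 6144 589824
```

With these assumed sizes, only about 1% of the layer's parameters would need gradients and optimizer state, which is the saving we would hope to get when running PPO on larger models.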

@BirgerMoell

@pacman100 @younesbelkada I could work on this if there is any interest. What would be the first steps?

@younesbelkada
Contributor Author

Thanks for your interest @BirgerMoell!! That would definitely be helpful.
We first need to check the compatibility of peft and trl (it might not work out of the box, so we need to double-check). We will most likely adapt the gpt2 script ourselves to make sure we fix whatever needs fixing; after that, maybe you could help us convert the t5 script? Let me know what you think!
If you want to take on the challenge and start converting the gpt2 script, feel free to do so and we'll be happy to review your Pull Request / code changes! 🔥

@BirgerMoell

That sounds like a good plan: you start with the gpt2 script and then I can do the t5 script. I'll understand much more by following along with your work.
Btw, I'm on the Hugging Face Discord if you want to message me; my username is birger.

@pacman100 added the wip label Feb 24, 2023

@younesbelkada
Contributor Author

The PR huggingface/trl#163 should add peft support to trl. Once it is merged, we may also need to convert the other scripts, such as t5 summarization.

@younesbelkada
Contributor Author

Hi everyone,
Closing this issue, as we now officially support PEFT in TRL, which makes it possible to apply RLHF + PEFT.

3 participants