Use peft for RLHF #71
Comments
@pacman100 @younesbelkada I could work on this if there is any interest. What would be the first steps?
Thanks for your interest @BirgerMoell!! That would definitely be helpful
I think it's a good idea for you to start with the gpt2 script, and then I can do the t5 script. That way I can learn much more by following along with your work.
The PR huggingface/trl#163 should add
Hi everyone,
Feature request
We should leverage trl (https://github.com/lvwerra/trl), the recent library from Hugging Face for RLHF, to apply PPO using peft and LoRA.
I think peft should just work out of the box. The first step could be to adapt the gpt2-sentiment.py script (https://github.com/lvwerra/trl/blob/main/examples/sentiment/scripts/gpt2-sentiment.py) to use peft.