Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
Open-source pre-training implementation of Google's LaMDA in PyTorch. Adding RLHF similar to ChatGPT.
The ParroT framework enhances and regulates translation abilities during chat, built on open-source LLMs (e.g., LLaMA-7b, Bloomz-7b1-mt) together with human-written translation and evaluation data.
[CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model"
Product analytics for AI Assistants
Aligning LLM Agents by Learning Latent Preference from User Edits
[ NeurIPS 2023 ] Official Codebase for "Aligning Synthetic Medical Images with Clinical Knowledge using Human Feedback"
Reinforcement Learning from Human Feedback with 🤗 TRL
[ICML 2024] Code for the paper "Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases"
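Most of the RLHF projects listed above share a common core: a reward model trained on human preference pairs, typically with a Bradley-Terry-style loss that pushes the score of the chosen response above the rejected one. A minimal plain-Python sketch of that loss (illustrative only; it does not reproduce any listed repo's API):

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry preference loss used in RLHF reward modeling.

    Returns -log(sigmoid(r_chosen - r_rejected)): the negative log
    probability that the human-preferred response scores higher.
    """
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The loss shrinks as the reward model ranks the chosen response
# further above the rejected one, and grows when the ranking is wrong.
print(preference_loss(1.0, 0.0))   # small: correct ranking
print(preference_loss(0.0, 1.0))   # large: inverted ranking
```

In libraries such as 🤗 TRL this loss is computed over batches of tokenized (prompt, chosen, rejected) triples, but the scalar form above is the underlying objective.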