This project aims to provide clean implementations of imitation and reward learning algorithms. We currently have implementations of Behavioral Cloning, DAgger (with synthetic examples), density-based reward modeling, Maximum Causal Entropy Inverse Reinforcement Learning, Adversarial Inverse Reinforcement Learning, Generative Adversarial Imitation Learning, and Deep RL from Human Preferences.
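Each of these algorithms is exposed as a Python module; the import sketch below shows one plausible layout under imitation.algorithms. The module names are an assumption based on recent releases and may differ in your installed version.

```python
# Rough map from the algorithms listed above to modules in the package.
# Module names are assumptions based on recent releases; check your installed version.
from imitation.algorithms import bc                      # Behavioral Cloning
from imitation.algorithms import dagger                  # DAgger
from imitation.algorithms import density                 # density-based reward modeling
from imitation.algorithms import mce_irl                 # Maximum Causal Entropy IRL
from imitation.algorithms.adversarial import airl, gail  # AIRL and GAIL
from imitation.algorithms import preference_comparisons  # Deep RL from Human Preferences
```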
To install the latest PyPI release:
pip install imitation
Alternatively, to install the latest version from source:
git clone http://github.com/HumanCompatibleAI/imitation
cd imitation
pip install -e .
Optionally, to run experiments on MuJoCo environments, follow the instructions here to install mujoco_py v1.5.
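After installing, a quick sanity check is to import the package from Python. This is a minimal sketch; printing imitation.__file__ simply confirms which installation was picked up.

```python
# Minimal sanity check that the installation is importable.
import imitation
from imitation.algorithms import bc  # one of the bundled algorithms

print("imitation imported from", imitation.__file__)
```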
We provide several CLI scripts as a front-end to the algorithms implemented in imitation. These use Sacred for configuration and replicability.
# Train PPO agent on pendulum and collect expert demonstrations. Tensorboard logs saved in quickstart/rl/
python -m imitation.scripts.train_rl with pendulum common.fast train.fast rl.fast fast common.log_dir=quickstart/rl/
# Train GAIL from demonstrations. Tensorboard logs saved in output/ (default log directory).
python -m imitation.scripts.train_adversarial gail with pendulum common.fast demonstrations.fast train.fast rl.fast fast demonstrations.rollout_path=quickstart/rl/rollouts/final.pkl
# Train AIRL from demonstrations. Tensorboard logs saved in output/ (default log directory).
python -m imitation.scripts.train_adversarial airl with pendulum common.fast demonstrations.fast train.fast rl.fast fast demonstrations.rollout_path=quickstart/rl/rollouts/final.pkl
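The demonstrations.rollout_path argument above points at the expert rollouts saved by the train_rl script. The sketch below inspects them from Python, assuming the file is a plain pickle of trajectory objects with an acts attribute; the exact serialization format may differ between imitation versions.

```python
import pickle

# Load the expert rollouts written by the train_rl command above.
# Assumes a plain-pickle format; your imitation version may provide a
# dedicated loader in imitation.data instead.
with open("quickstart/rl/rollouts/final.pkl", "rb") as f:
    trajectories = pickle.load(f)

print(f"{len(trajectories)} trajectories, "
      f"{sum(len(traj.acts) for traj in trajectories)} transitions in total")
```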
Tips:
- Remove the "fast" options from the commands above to allow training to run to completion.
- Running python -m imitation.scripts.train_rl print_config will list Sacred script options. These configuration options are documented in each script's docstrings.
- For more information on how to configure Sacred CLI options, see the Sacred docs.
See examples/quickstart.py for an example script that loads CartPole-v1 demonstrations and trains BC, GAIL, and AIRL models on that data.
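For a rough idea of the Python API used in that script, the sketch below trains BC on previously saved demonstrations. The rollout path is a placeholder, gym may need to be swapped for gymnasium depending on your installed versions, and the bc.BC argument names (e.g. demonstrations, rng) have changed between imitation releases, so treat this as an approximation of examples/quickstart.py rather than a drop-in script.

```python
import pickle

import gym  # may need to be gymnasium, depending on your installed versions
import numpy as np

from imitation.algorithms import bc
from imitation.data import rollout

# Load expert trajectories (placeholder path; substitute your own rollouts).
with open("path/to/cartpole_rollouts.pkl", "rb") as f:
    trajectories = pickle.load(f)

# Flatten trajectories into individual transitions, which BC consumes.
transitions = rollout.flatten_trajectories(trajectories)

env = gym.make("CartPole-v1")
bc_trainer = bc.BC(
    observation_space=env.observation_space,
    action_space=env.action_space,
    demonstrations=transitions,
    rng=np.random.default_rng(0),  # required in recent releases; absent in older ones
)
bc_trainer.train(n_epochs=1)  # behavioral cloning on the loaded demonstrations
```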
We also implement a density-based reward baseline. You can find an example notebook here.
To cite this project in publications:

@misc{wang2020imitation,
  author = {Wang, Steven and Toyer, Sam and Gleave, Adam and Emmons, Scott},
  title = {The {\tt imitation} Library for Imitation Learning and Inverse Reinforcement Learning},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/HumanCompatibleAI/imitation}},
}
To contribute, see CONTRIBUTING.md.