Single-Life Reinforcement Learning

This code supplements the following paper:

You Only Live Once: Single-Life Reinforcement Learning

Instructions

Install MuJoCo.

Install the following libraries:

sudo apt update
sudo apt install libosmesa6-dev libgl1-mesa-glx libglfw3

Install dependencies:

conda env create -f conda_env.yml
conda activate slrl

Sample commands

As a starting point, envs/ contains the environments, and data/ contains the corresponding prior data and pretrained models (including Q-functions) for the Pointmass and Cheetah environments.

Run Q-Weighted Adversarial Learning (QWALE) in the Pointmass environment

python train.py use_discrim=True rl_pretraining=True q_weights=True online_steps=200000 env_name=pointmass

Run SAC fine-tuning in the Pointmass environment

python train.py use_discrim=False rl_pretraining=True q_weights=False online_steps=200000 env_name=pointmass
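The two sample commands above differ only in a couple of overrides. As a rough sketch, the snippet below rebuilds both command lines from dictionaries and prints the keys that differ. The helper names `build_command` and `diff_overrides` are hypothetical and not part of this repository; only the override keys and values are taken from the commands above.

```python
# Overrides copied verbatim from the two sample commands above.
QWALE = {
    "use_discrim": True,
    "rl_pretraining": True,
    "q_weights": True,
    "online_steps": 200000,
    "env_name": "pointmass",
}
# SAC fine-tuning reuses the same overrides with two flags switched off.
SAC_FINETUNE = {**QWALE, "use_discrim": False, "q_weights": False}


def build_command(overrides):
    """Return the argv list for `python train.py key=value ...` (hypothetical helper)."""
    return ["python", "train.py"] + [f"{k}={v}" for k, v in overrides.items()]


def diff_overrides(a, b):
    """Return {key: (a_value, b_value)} for keys whose values differ (hypothetical helper)."""
    return {k: (a[k], b[k]) for k in a if a[k] != b[k]}


if __name__ == "__main__":
    print(" ".join(build_command(QWALE)))
    print(" ".join(build_command(SAC_FINETUNE)))
    print(diff_overrides(QWALE, SAC_FINETUNE))
```

Under these assumptions, the only differing keys are use_discrim and q_weights; both runs keep rl_pretraining=True.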

Because the distribution shift in a single-life trial is random, results can vary widely between runs, so evaluating a method usually requires many seeds.
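One way to launch several seeds is a small wrapper around the QWALE command, sketched below. It assumes that train.py accepts a `seed=<int>` override; that key is not shown in this README, so check config.yaml for the actual name before use. The helpers `seed_commands` and `run_all` are hypothetical. By default the script only prints the commands (dry run) and launches nothing.

```python
import shlex
import subprocess

# QWALE command from the sample above, split into argv form.
BASE = [
    "python", "train.py",
    "use_discrim=True", "rl_pretraining=True", "q_weights=True",
    "online_steps=200000", "env_name=pointmass",
]


def seed_commands(seeds, seed_key="seed"):
    """Return one argv list per seed.

    ASSUMPTION: `seed_key` names a config entry that train.py reads.
    It is not confirmed by this README; verify it in config.yaml.
    """
    return [BASE + [f"{seed_key}={s}"] for s in seeds]


def run_all(seeds, dry_run=True):
    """Print each command; run them one after another only if dry_run is False."""
    for cmd in seed_commands(seeds):
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(range(5))  # dry run: prints five commands without launching them
```

Running the seeds sequentially keeps the sketch simple; each single-life trial is independent, so the commands could equally be dispatched in parallel.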

Citation

@article{chen2022you,
  title={You Only Live Once: Single-Life Reinforcement Learning},
  author={Chen, Annie S and Sharma, Archit and Levine, Sergey and Finn, Chelsea},
  journal={Neural Information Processing Systems},
  year={2022}
}
