Abstract

Learned world models summarize an agent's experience to facilitate learning complex behaviors. While learning world models from high-dimensional sensory inputs is becoming feasible through deep learning, there are many potential ways to derive behaviors from them. We present Dreamer, a reinforcement learning agent that solves long-horizon tasks from images purely by latent imagination. Dreamer learns behaviors efficiently by propagating analytic gradients of learned state values back through trajectories imagined in the compact latent state space of a learned world model. On 20 challenging visual control tasks, Dreamer exceeds existing approaches in data efficiency, computation time, and final performance.
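The core idea, propagating analytic gradients of returns back through imagined trajectories, can be illustrated with a toy sketch. This is not the paper's implementation: the learned world model is replaced by a known one-dimensional linear dynamics, the action model by a single scalar parameter, and learned state values by plain imagined rewards. All names and numbers below are illustrative assumptions.

```python
# Toy sketch (assumed, not Dreamer's actual code): a linear latent model
# s_{t+1} = a*s_t + b*u_t with policy u_t = theta*s_t and reward r_t = -s_t**2.
# We imagine a trajectory and accumulate the analytic gradient dJ/dtheta
# through it (forward-mode chain rule), then improve the policy by ascent.

def imagine_and_grad(theta, s0, horizon, a=0.9, b=0.5):
    """Roll out an imagined trajectory; return the imagined return J
    and its exact gradient dJ/dtheta through the whole rollout."""
    s, ds_dtheta = s0, 0.0          # state and its sensitivity to theta
    J, dJ_dtheta = 0.0, 0.0
    for _ in range(horizon):
        J += -s * s                  # imagined reward r_t = -s_t**2
        dJ_dtheta += -2.0 * s * ds_dtheta
        u = theta * s                # action from the policy
        du_dtheta = s + theta * ds_dtheta        # chain rule through policy
        # Step the latent dynamics and its sensitivity together.
        s, ds_dtheta = a * s + b * u, a * ds_dtheta + b * du_dtheta
    return J, dJ_dtheta

# Gradient ascent on the imagined return: the policy learns to damp the
# state, driving the closed-loop factor a + b*theta toward zero.
theta = 0.0
for _ in range(200):
    J, g = imagine_and_grad(theta, s0=1.0, horizon=15)
    theta += 0.05 * g
```

In Dreamer the same principle applies with a learned recurrent world model, a neural action model, and learned value estimates in place of raw rewards; automatic differentiation takes the role of the hand-written chain rule above.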

Behaviors Learned by Dreamer

Videos: Cheetah Run, Hopper Hop, Walker Run, Quadruped Run, Cup Catch, Cartpole Sparse, Pendulum Swingup, Acrobot Swingup, Reacher Easy, Reacher Hard, Finger Spin, Finger Turn Hard.

Atari and DMLab with Discrete Actions

Videos: Fishing Derby, Ice Hockey, Kung Fu Master, Assault, Boxing, Hero, Freeway, Frostbite, Montezuma, Pong, Tennis, Collect Objects.

Multi-Step Video Predictions

Top: holdout sequences. Middle: predictions. Bottom: differences.

Read the Paper for Details [PDF]
