Blog

Reinforcement Learning with Policy Gradients: A TensorFlow Implementation of “Pong from Pixels”

Andrej Karpathy wrote a great post last year on how to train a neural network to play the Atari game Pong using the policy gradients reinforcement learning (RL) algorithm. Given the game’s state as input, the neural network outputs the probability with which we should move the Pong paddle up; otherwise, the paddle moves down.
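To make the idea concrete, here is a minimal pure-Python sketch of two pieces such an agent needs: sampling an action from the network’s output probability, and turning the per-frame rewards into discounted returns. The function names, the `gamma=0.99` default, and the choice to restart the running sum at every nonzero reward (which in Pong marks the end of a point) are my assumptions for illustration; this is not the code from the notebook.

```python
import random


def sample_action(p_up, rng=random):
    """Return 'UP' with probability p_up, otherwise 'DOWN'."""
    return "UP" if rng.random() < p_up else "DOWN"


def discount_rewards(rewards, gamma=0.99):
    """Compute discounted returns for a list of per-frame rewards.

    Assumption: a nonzero reward (+1 or -1) ends a point in Pong, so the
    running sum is reset there and credit does not leak across points.
    """
    discounted = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        if rewards[t] != 0:
            running = 0.0  # point boundary: start a fresh sum
        running = running * gamma + rewards[t]
        discounted[t] = running
    return discounted
```

Frames that lead up to a won point end up with positive returns and frames that lead up to a lost point with negative ones, which is the signal the policy update needs.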

I converted Karpathy’s NumPy-only approach to TensorFlow inside a Jupyter notebook. I also created a class to represent the agent playing the game, and I put all of the code to run the Pong simulation inside that class. Here’s the GitHub gist, which is best viewed by clicking the link below the embedding 🙂

view raw Pong-Playing TensorFlow Neural Network.ipynb hosted with ❤ by GitHub

Here’s a short GIF of some gameplay. The neural-network agent is on the right, and the built-in AI is on the left.

After ~3,000 parameter updates, the Pong-playing neural network can beat the built-in AI more often than not. What’s interesting to me is that this network looks simpler than one that you’d use for MNIST, and it doesn’t require data with labels to learn!
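The reason no labeled data is needed is that the sampled actions stand in for labels, and each one is weighted by how well things turned out afterwards. Here is a rough sketch of that loss in plain Python; the function name and argument layout are hypothetical, and the notebook computes the equivalent quantity with TensorFlow ops rather than a loop.

```python
import math


def policy_gradient_loss(probs_up, actions, advantages):
    """Average of -advantage * log(probability of the action actually taken).

    probs_up:   network outputs, P(move up), one per frame
    actions:    1 if the sampled action was 'up', 0 if it was 'down'
    advantages: discounted returns (hypothetical inputs for illustration)
    """
    total = 0.0
    for p, a, adv in zip(probs_up, actions, advantages):
        log_prob = math.log(p if a == 1 else 1.0 - p)
        total -= adv * log_prob
    return total / len(actions)
```

Minimizing this makes actions followed by positive returns more likely and actions followed by negative returns less likely.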

November 6, 2017
Artificial Neural Network in Python

My research group has been discussing Artificial Neuron-Glia Networks lately. These algorithms add artificial astrocytes to the traditional Artificial Neural Network scheme, and they may also feature a Genetic Algorithm in lieu of back-propagation. See http://www.ncbi.nlm.nih.gov/pubmed/21526157 for an example.

To better understand the implementation of a neural net, I constructed one that is capable of giving an approximation to sin(x). I relied on intuition that I developed while reading a blog post that a classmate linked me to. I strongly recommend the post if you’re interested in ANNs: http://karpathy.github.io/neuralnets/.
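For readers who want a feel for what such a network involves, here is a small self-contained sketch: one hidden layer of tanh units trained with per-sample gradient descent to fit sin(x) on [-π, π]. Everything here (the function name, layer size, learning rate, and number of epochs) is an assumption chosen for illustration, and it is not the code in my repository.

```python
import math
import random


def train_sin_net(hidden=8, epochs=200, lr=0.01, seed=0):
    """Fit a 1 -> hidden -> 1 tanh network to sin(x); return (initial_mse, final_mse, predict)."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # input-to-hidden weights
    b1 = [0.0] * hidden                               # hidden biases
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden-to-output weights
    b2 = 0.0                                          # output bias
    xs = [-math.pi + 2 * math.pi * i / 49 for i in range(50)]

    def predict(x):
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        return sum(w2[j] * h[j] for j in range(hidden)) + b2, h

    def mse():
        return sum((predict(x)[0] - math.sin(x)) ** 2 for x in xs) / len(xs)

    initial = mse()
    for _ in range(epochs):
        for x in xs:
            y, h = predict(x)
            err = y - math.sin(x)  # derivative of 0.5 * err**2 w.r.t. y
            for j in range(hidden):
                # Back-propagate through the tanh unit before updating w2[j].
                grad_h = err * w2[j] * (1 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_h * x
                b1[j] -= lr * grad_h
            b2 -= lr * err
    return initial, mse(), predict
```

Calling `initial, final, predict = train_sin_net()` and comparing `predict(1.0)[0]` with `math.sin(1.0)` gives a quick sanity check of the fit.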

My neural net is available for you to view and modify via GitHub: https://github.com/bbartoldson/examples/blob/master/hacker_ANN/net.py.

December 22, 2015
