<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Marcus Alder on Medium]]></title>
        <description><![CDATA[Stories by Marcus Alder on Medium]]></description>
        <link>https://medium.com/@marcusa314?source=rss-caa961dea52------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/2*oAcxJ2Rbdtdg9_dS4P5-LA.jpeg</url>
            <title>Stories by Marcus Alder on Medium</title>
            <link>https://medium.com/@marcusa314?source=rss-caa961dea52------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Tue, 07 Apr 2026 23:30:06 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@marcusa314/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Visualizing Words]]></title>
            <link>https://medium.com/@marcusa314/visualizing-words-377624cb20c7?source=rss-caa961dea52------2</link>
            <guid isPermaLink="false">https://medium.com/p/377624cb20c7</guid>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[text-mining]]></category>
            <category><![CDATA[nlp]]></category>
            <category><![CDATA[visualization]]></category>
            <category><![CDATA[python]]></category>
            <dc:creator><![CDATA[Marcus Alder]]></dc:creator>
            <pubDate>Sat, 22 Feb 2020 20:00:24 GMT</pubDate>
            <atom:updated>2020-02-22T20:00:24.818Z</atom:updated>
            <content:encoded><![CDATA[<h4>PCA and clustering in Python</h4><p>In this post, I’ll show how to use a few NLP techniques to transform words into mathematical representations and plot them as points, as well as provide some examples. The graph below was created from the <em>Star Wars </em>wiki <a href="https://starwars.fandom.com/wiki/Main_Page">Wookieepedia</a> and colored with a clustering algorithm.</p><figure><img alt="3D Plot of Star Wars characters" src="https://cdn-images-1.medium.com/max/800/1*Y7K_-6HZqbii1V8Ms6SALQ.gif" /><figcaption>Plot of characters, locations, and organizations from Star Wars</figcaption></figure><p>The words’ coordinates come from <strong>word embeddings </strong>(<strong>word vectors</strong>), which are built from the contexts each word appears in. The vectors have properties related to the words’ meanings, approximately satisfying equations like (vector for “Paris”) - (vector for “France”) + (vector for “Italy”) ≈ (vector for “Rome”), i.e. you can take Paris, substitute France out for Italy, and you’ll get Rome. Clustering and plotting also reveal interesting patterns; if you’ve watched <em>Star Wars </em>you might notice the clustering algorithm has unknowingly separated people, places, and organizations.</p><p>All the code is available at <a href="https://github.com/LogicalShark/wordvec">github.com/LogicalShark/wordvec</a></p><h3>Collecting Data</h3><p>Find an appropriate corpus for your analysis, or for more general uses like analyzing countries or movies you could use a <a href="https://en.wikipedia.org/wiki/List_of_text_corpora">generic text corpus</a>. 
Get enough data to generate meaningful word embeddings, though even 50KB or less can be sufficient if it’s all relevant.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*SoZ1fQnKehlgxCu7" /><figcaption>The dump will be XML, which you may want to preprocess</figcaption></figure><p>I analyzed proper names from franchises I personally like, and if you want to do the same I recommend checking the franchise’s <a href="https://www.fandom.com/">Fandom</a> wiki for a database dump at “whatever.fandom.com/wiki/Special:Statistics” (although sadly some don’t provide database dumps).</p><h3>Generating Word Vectors</h3><p><a href="https://github.com/LogicalShark/wordvec/blob/master/wvgen.py">wvgen.py on github</a></p><p>To create the vectors we need the words they correspond to, which requires splitting the text into words. We can then use Word2Vec (a word vector creation model) to create the vectors. To get a list of words I used NLTK’s word-tokenizer, and for a Word2Vec implementation I used gensim. Here are some more details on processing the text:</p><p><strong>Memory Limitations</strong>: The input was too large to manipulate all at once, but since Word2Vec can take an iterator as input, I made a custom iterator which takes files one line at a time for tokenization.</p><p><strong>Synonyms</strong>: Some names are written in multiple ways that we want represented by one vector, like Donkey Kong = DK or Obi-Wan Kenobi = Ben Kenobi. Instead of combining the output vectors, we can prevent the problem with string replacement on the input. For example, replacing all instances of “Donkey Kong” with “DK” means this character is represented by a single vector for the word “DK,” instead of two.</p><p><strong>Multi-word Expressions</strong>: Sometimes we want one vector to represent multiple words (e.g. Han Solo, Peach’s Castle), but the words get split by tokenization and become separate vectors. 
I used NLTK’s multi-word expression tokenizer (MWETokenizer), which lets you add these names as custom phrases to be re-concatenated after word tokenization. An alternative would be string replacement again (replace “Han Solo” with “HanSolo”), but I found MWETokenizer to be simpler.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/982/1*x3g7uFBVdP70zjE9FXpp9A.png" /><figcaption>Summary of the custom iterator</figcaption></figure><p><strong>Summary:</strong> The iterator takes each line in the file, makes replacements for synonym consistency, performs word tokenization, and finally condenses multi-word expressions. Word2Vec takes the iterator as an argument in lieu of a word list and generates a model with the word vectors.</p><h3>Graphing Word Vectors</h3><p><a href="https://github.com/LogicalShark/wordvec/blob/master/wvplot.py">wvplot.py on github</a></p><p>Words appear in greatly varied contexts, meaning the vectors have many features. Instead of graphing three raw features, we can use <strong>Principal Component Analysis (PCA) </strong>to calculate the linear combinations of features giving the orthogonal axes with the greatest variance. To further visualize patterns, the points’ text colors are set with <strong>k-means++ clustering </strong>(using sklearn), which automatically creates k “clusters.” I used matplotlib for graphing, giving an interactive graph like this:</p><figure><img alt="Plot of names in Super Mario" src="https://cdn-images-1.medium.com/max/800/1*ekOy3Jgj5_EC-H_cSXIWuA.gif" /><figcaption>Plot of characters, locations, and games in the Super Mario franchise. The overlapping red points include all the locations (e.g. Peach’s Castle, New Donk City) and some characters (e.g. Dry Bones, Hammer Bro)</figcaption></figure><p>The <em>Super Mario</em> Fandom wiki was used to generate the vectors. A k=3 clustering seems to create a “main character” cluster, a “location/secondary character” cluster, and a “game” cluster. 
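</p><p>The PCA and clustering step can be sketched like this, with random vectors standing in for a trained model so the snippet runs on its own (the plotting calls are left commented):</p>

```python
# Sketch: project word vectors to 3D with PCA and color by k-means cluster.
# Random vectors stand in for model.wv[words] from the trained Word2Vec model.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
words = ["Mario", "Luigi", "Peach", "Bowser", "Yoshi", "Toad"]
vectors = rng.normal(size=(len(words), 100))  # stand-in for real embeddings

coords = PCA(n_components=3).fit_transform(vectors)   # 3 axes of greatest variance
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(vectors)

# Plotting with matplotlib, one colored text label per word:
# import matplotlib.pyplot as plt
# ax = plt.figure().add_subplot(projection="3d")
# for (x, y, z), word, c in zip(coords, words, labels):
#     ax.text(x, y, z, word, color=plt.cm.tab10(c))
# plt.show()
```

<p>Note that k-means runs on the full-dimensional vectors, so cluster boundaries need not look clean in the projected plot.</p><p>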
Daisy, Waluigi, and Toad are appropriately positioned between main and secondary characters. If Donkey Kong seems a bit closer to the game cluster than the other main characters, it’s because “Donkey Kong” is both a character and the original arcade game!</p><figure><img alt="2D plot of characters in The Office" src="https://cdn-images-1.medium.com/max/1024/1*7vcu14G3lZcdEN67lNpkOQ.png" /><figcaption>2D Plot example using characters from The Office (TV), k=4</figcaption></figure><p>Sometimes 2D plots are enough to show patterns. The x-axis positions and clusters in the above plot approximately correlate with the screen presence of each character, matching this chart:</p><figure><img alt="Graph of characters from The Office by their number of lines of dialogue" src="https://cdn-images-1.medium.com/max/1024/0*cV5hSHYRm1O-KIzg" /><figcaption>Source and more data on <a href="https://www.reddit.com/r/dataisbeautiful/comments/6qbn4x/total_line_count_of_main_characters_in_the_office/dkw0fsl/">this reddit post</a></figcaption></figure><p><strong>Minor spoilers for <em>Hollow Knight</em> in the second plot below!</strong></p><figure><img alt="3D plot of League of Legends champions" src="https://cdn-images-1.medium.com/max/1024/1*nHWQ7ooR57xKvFBk_Og2kQ.png" /><figcaption>All League of Legends champions with k=5. 
Clusters are determined not only by features visible in the plot but also by many unseen features, which is why the red cluster is not a clear, separate group</figcaption></figure><h3>More Plots and Word Vector Arithmetic</h3><p><a href="https://github.com/LogicalShark/wordvec/blob/master/wvarith.py">wvarith.py on github</a></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*-ypEJQiSkkWgNz3ftS275g.png" /><figcaption>Locations and characters from the indie game Hollow Knight, k=3</figcaption></figure><p>For arithmetic, the function most_similar_cosmul finds the words whose vectors best approximate the sum of the vectors for the words in positive minus those in negative. I found it useful to pass one more positive word than negative words, for a net of one word vector; the simplest case is a single word in positive and an empty negative.</p><p><a href="https://github.com/LogicalShark/wordvec/blob/master/wvlinear.py">wvlinear.py</a> uses this function to search for equations. However, unrelated words often combine by pure coincidence, so I recommend looking for relationships yourself with <a href="https://github.com/LogicalShark/wordvec/blob/master/wvarith.py">wvarith.py</a>.</p><p>Thanks for reading! All the code and some sample models are available at <a href="https://github.com/LogicalShark/wordvec">github.com/LogicalShark/wordvec</a>. Let me know if you have any questions!</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[The Gopher with Artificial Intelligence]]></title>
            <link>https://medium.com/google-cloud/the-gopher-with-artificial-intelligence-67715aeb13de?source=rss-caa961dea52------2</link>
            <guid isPermaLink="false">https://medium.com/p/67715aeb13de</guid>
            <category><![CDATA[go]]></category>
            <category><![CDATA[unity]]></category>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[game-development]]></category>
            <category><![CDATA[google-cloud-platform]]></category>
            <dc:creator><![CDATA[Marcus Alder]]></dc:creator>
            <pubDate>Thu, 22 Aug 2019 17:55:56 GMT</pubDate>
            <atom:updated>2019-08-22T20:09:30.020Z</atom:updated>
            <content:encoded><![CDATA[<h4>Using Supervised ML for Games</h4><p>Machine learning is used everywhere these days. For the Go language conference <a href="https://www.gophercon.com/">GopherCon</a> 2019, we put Go’s Gopher mascot into an endless running game. By collecting data from human players and training a model on GCP, we were able to create an AI player that scored better than many of the humans. The backend code is available in <a href="https://github.com/GoogleCloudPlatform/golang-samples/tree/master/getting-started/gopher-run">this repo</a>.</p><h3>The Game</h3><p>Gopher Run was playable on Chromebooks at Google Cloud’s GopherCon booth. The main controls are simple: jump with up arrow and roll with down arrow. The levels are procedurally generated, so they’re infinite and change every time (but contain similar patterns). You take damage if you run into spikes or bugs, and after losing your three lives, the number of coins you collected is your score. The player speeds up over time at predetermined intervals, making the game harder as you progress. The game has a few other mechanics, but they’re not relevant to this post.</p><figure><img alt="Sample gameplay" src="https://cdn-images-1.medium.com/max/600/0*RbJv-YjQHxua2PNK" /><figcaption><em>Gameplay example</em></figcaption></figure><figure><img alt="Faster gameplay" src="https://cdn-images-1.medium.com/max/600/0*G8Zw8lfS0RjQj8oL" /><figcaption><em>It speeds up as you progress, requiring faster reaction times</em></figcaption></figure><h3>The AI</h3><p>The AI created for this game uses <strong>supervised learning</strong>, meaning it trains on the top players’ behaviors, unlike <strong>reinforcement learning </strong>which trains by playing and improving itself. 
While reinforcement learning improves continuously and doesn’t require storing player data, it takes much longer to train (the supervised algorithm only needs to run once) and is harder to set up and implement.</p><p>The training data consists of snapshots of the game recorded every half-second and whenever the player performs an action. This way the algorithm sees how the player reacts to different situations (by jumping, rolling, or doing nothing). After training and deploying the model, the AI’s actions are determined by giving the current game state to the deployed model, which predicts what a player would most likely do in that situation.</p><p>I used the Unity engine for the game and the Google Cloud AI platform for ML, but this works with any engine that can send HTTP requests and any ML service (or custom ML code).</p><h3>Creating the Input</h3><p>If you’re making this kind of AI, one of the more challenging tasks is determining what to put in the input. To keep everything fast and memory efficient it’s important to condense a game state into only the most helpful information. To record the player’s situation, I stored their current y position and vertical velocity. Since I wanted the AI to avoid obstacles, I looked at the nearest three bugs and one spike ahead of the player, and stored their y positions and x distances from the player (normalized by player speed, see note below). A data point was collected every time the player performed an action as well as every half-second to capture times when the player was doing nothing. The data was stored in a structure in Unity until score submission. 
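</p><p>A snapshot can be as simple as one flat row per game state. This sketch shows the shape of the data; all names are illustrative (the actual collection code is in the linked repo), shown in Python rather than the project’s C#:</p>

```python
# Sketch of one training snapshot, mirroring the inputs described above.
# All names and the CSV layout are illustrative, not the project's actual schema.
import csv

PLAYER_SPEED = 12.0  # units/second, example value

def make_row(player_y, v_y, spike, bugs, action):
    """Flatten a game state into one CSV row.

    spike: (dx, y) of the nearest spike ahead; bugs: list of three (dx, y).
    dx values are divided by speed so the feature is time-to-reach in seconds.
    """
    row = [player_y, v_y, spike[0] / PLAYER_SPEED, spike[1]]
    for dx, y in bugs:
        row += [dx / PLAYER_SPEED, y]
    row.append(action)  # target column: "jump", "roll", or "none"
    return row

# Example: a spike 6 units ahead at 12 units/second is 0.5 seconds away
row = make_row(1.0, 0.0, (6.0, 0.0), [(8.0, 1.5), (10.0, 2.0), (14.0, 1.0)], "jump")
with open("run_data.csv", "a", newline="") as f:
    csv.writer(f).writerow(row)
```

<p>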
Then it was sent to a CSV file in Cloud Storage; to prepare for training, the top 10 players’ files were collected.</p><figure><img alt="Example of a game state with labeled distances" src="https://cdn-images-1.medium.com/max/961/0*QrYFuTM7XFqrOJII" /><figcaption>Diagram of input variables</figcaption></figure><p>The diagram shows the variables collected from the nearest spike and nearest three bugs. The information given to the algorithm includes the y variables, each dx variable divided by the player’s speed, as well as the player’s vertical velocity.</p><p><em>Note on normalization</em>: Horizontal spacing of objects scales with speed — compare the two gameplay gifs in the first section. Consider that a jump at high speed covers more horizontal distance despite having the same peak height and airtime. Dividing by speed means the input variable is the time it will take to reach the hazard (e.g. a spike 6 units ahead ÷ a speed of 12 units/second = 0.5 seconds to reach the spike), and this is what’s actually important for timing inputs like jumps.</p><h3>Training</h3><p>I used GCloud’s built-in <a href="https://xgboost.readthedocs.io/en/latest/">XGBoost</a> framework with a classification objective (multi:softmax), which took around 7 minutes to train each time. More complicated behavior might be better suited to neural networks or regression objectives; choose an objective appropriate for your input and the behavior you want from the AI. For example, I could have used a regression objective with -1 = roll, 0 = do nothing, 1 = jump. I elected not to since the player’s input was not analog (the buttons were either held or not, it wasn’t a 360° control stick), and it wouldn’t handle jumping while rolling at the same time.</p><p>Most games will probably only require a one-time training job. However, for the purposes of the demo, we wanted to see the AI improve as players improved and contributed better data, so we re-trained at regular intervals. 
You can automate this with a shell script (<a href="https://github.com/GoogleCloudPlatform/golang-samples/blob/master/getting-started/gopher-run/cmd/training.sh">here’s</a> the one I made). It uses the gcloud setup commands from the AI Platform <a href="https://cloud.google.com/ml-engine/docs/algorithms/linear-start#setup">quickstart</a>, then loops the training and version creation commands (also in the quickstart). This creates a trained model in a Cloud Storage directory, overwriting the model from the previous training job, and creates a new version of the deployed model (whose job-dir parameter points to the training output directory). You do have to deploy the model manually to create the first version (easy with the cloud console), but afterwards the new versions use the latest training output and set themselves as the default version, so prediction requests will always use the latest model.</p><h3>Using ML Predictions</h3><p>The AI player is identical to a normal player, but keyboard input is disabled and it instead repeatedly makes requests to the ML model. Requests to a deployed model in the GCloud AI platform are formatted as the input minus the target column (i.e. the action, which is what’s being predicted). The model then returns one of the possible target classes, indicating whether a player in this game state would be likely to roll, jump, or do nothing, and the AI calls the corresponding roll or jump method.</p><p><em>Note on GCloud outputs</em>: Even though the multi-class classification algorithm uses strings for the class column, a quirk of Google Cloud AI is that the prediction returns a float instead of a string, either 0.0 (first class seen in the input), 1.0 (second class seen in the input), 2.0, etc. 
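</p><p>Here is an illustrative sketch of that request/response shape in Python (the endpoint, auth, and helper names are assumptions, not the project’s actual code; the real client lives in the Unity/Go code):</p>

```python
# Sketch of the AI player's request/response loop (names illustrative).
# The network call is factored out so the pure mapping logic stands alone.
CLASS_ORDER = ["jump", "roll", "none"]  # assumed class ordering in the training CSV

def features_to_instance(player_y, v_y, hazards):
    """Request body row: the input columns minus the target (action) column."""
    inst = [player_y, v_y]
    for dx, y in hazards:
        inst += [dx, y]
    return inst

def choose_action(prediction):
    # The deployed model returns 0.0 / 1.0 / 2.0 rather than the class string
    return CLASS_ORDER[int(prediction)]

# In the game loop (MODEL_URL is hypothetical; real requests need auth):
# import requests
# body = {"instances": [features_to_instance(1.0, 0.0, [(6.0, 0.0)])]}
# pred = requests.post(MODEL_URL, json=body).json()["predictions"][0]
# act(choose_action(pred))
```

<p>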
Since the correspondence between the floats and the classes changes depending on the order the classes are seen in the input, prefix the csv training file with one dummy data line for each class, so the class represented by 0.0 is always jump, 1.0 is always roll, etc.</p><h3>Running the AI</h3><p>The earliest versions didn’t get very far. One early version with not enough data jumped every time it reached the ground. The AI quickly improved as it got more data and human players did better, but somewhat plateaued at a high score of 268 on the first day, which was enough to put it in the top 10 leaderboard until it got pushed out around the end of the day. On the second day it kept improving and at the end it reached 370. In comparison, human players typically scored 10–30 on their first run, and the leaderboard scores were around the 200–800 range on day 1 and 600–2000 on day 2 (much higher due to the returning players).</p><figure><img alt="Early AI gameplay" src="https://cdn-images-1.medium.com/max/600/0*VAK4GRDb3A1Xjfv6" /><figcaption><em>Early AI struggled with roll timing</em></figcaption></figure><figure><img alt="Late AI gameplay" src="https://cdn-images-1.medium.com/max/600/0*yK-ckANaap6aqA1o" /><figcaption><em>Later AI was much better</em></figcaption></figure><h3>Takeaways</h3><p><strong>Optimize performance and memory usage.</strong></p><p>There was a bug in which I accidentally collected data every frame, which destroyed the framerate and crashed the WebGL build by using up all the memory (even with the relatively simple data points I was collecting). It’s always good to follow standard best practices, like keeping Unity’s Update function as small as possible.</p><p><strong>Time to query the model is non-negligible in a fast-paced game.</strong></p><p>The AI’s gameplay had short but noticeable pauses while waiting for responses from the model. 
Under other circumstances it might be possible to send multiple data points at once, since the API takes in a list. I requested one data point at a time because I didn’t know where the AI player was going to be in the future or what its vertical velocity would be. I also set up the HTTP requests to run asynchronously, but the game moved too fast and the responses often arrived only after the AI had already collided with upcoming hazards.</p><p>Because of the aforementioned constraint, I only queried the AI every half-second (more often at higher speeds). This meant that the AI was limited in how often it could perform a new action. I had “stop rolling” counted as its own distinct input, so a pattern which required rolling followed by a pattern which required jumping would sometimes see the AI successfully roll under the first part, then stop rolling, then before another half-second had passed it would run into the second pattern because it hadn’t yet made a new request (which would have told it to jump).</p><p><strong>This kind of ML works well and can be used for more complex tasks.</strong></p><p>After getting enough data, the AI was consistently good (outside of the issue in the above paragraph, which was a limitation of the implementation rather than the model). I used the ML output in a very direct way by having the AI copy the action predicted for a player, but there are more creative possibilities when you know what the player is likely to do in any situation. If the player always jumps over certain bug patterns instead of rolling, the level generation algorithm could adapt and increase the frequency of a similar pattern with added bugs above, forcing the player to roll. In a game like PAC-MAN or Bomberman, the AI could cut the player off by predicting where they will go next. There are endless applications in every genre, so I encourage you to use ML for something creative in your own projects.</p><p>Have you implemented any games? 
Do you think AI could be trained to play your game? Let me know in the comments!</p><p>Special thanks to Tyler Bui-Palsulich and Franzi Hinkelmann as well as Dane Liergaard, Jon Foust, and everyone on the Go and Cloud DPE teams in Google NYC.</p><p>The Go/Bash code is available <a href="https://github.com/GoogleCloudPlatform/golang-samples/tree/master/getting-started/gopher-run">here</a>, and the Unity project is currently not available but will be linked here if that changes. More information on GCP AI and its pricing model can be found <a href="https://cloud.google.com/ai-platform/">here</a>.</p><hr><p><a href="https://medium.com/google-cloud/the-gopher-with-artificial-intelligence-67715aeb13de">The Gopher with Artificial Intelligence</a> was originally published in <a href="https://medium.com/google-cloud">Google Cloud - Community</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
    </channel>
</rss>