Agent Robot Game

Planning I did for the Project
How the project looks

Inspiration

I was inspired by the idea of implementing AI agents that interact with each other in a simulated environment, choose from a wide variety of actions, and eventually learn to pick the actions that maximize their reward. If you've watched the YouTube channel "Primer", I wanted to simulate something like the simulations in their videos.

Additionally, I wanted to integrate hardware as well. The idea I thought would be really interesting was to give the user a chip they can hold that, depending on how they turned or moved it in the real world, would also move inside the game environment it was connected to, accurately matching its real-world movements. The hardware could also have a microphone, so the user could give the environment spoken language as input.

What it does

Right now it doesn't do much. It tries to simulate two kinds of entities, "villains" and "citizens", each with their own actions and their own rewards/punishments that encourage them to act differently in the environment. In practice this doesn't seem to work, as all the robots just try to hug the corners of the board. The robot aspect hasn't been implemented either.
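To make the villain/citizen split concrete, here is a minimal sketch of how per-role rewards and punishments could be written down. The event names and values are hypothetical and chosen only for illustration; they are not the ones used in the project.

```python
# Hypothetical events and values, for illustration only.
REWARDS = {
    "villain": {"caught_citizen": 1.0, "hit_wall": -0.1},
    "citizen": {"escaped": 1.0, "was_caught": -1.0, "hit_wall": -0.1},
}


def reward_for(role, events):
    """Sum the rewards of every event an entity of this role saw in one step.

    Events that are not listed for the role contribute 0.
    """
    table = REWARDS[role]
    return sum(table.get(event, 0.0) for event in events)
```

Keeping the values in one table per role makes it easier to see whether something like a wall penalty is accidentally pushing every entity toward the same behavior.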

How we built it

I decided to use Unity since it can be used for simulations, I was familiar with how it works, and I knew it supported RL algorithms such as PPO. I used the Unity ML-Agents package to help with sending the state and receiving actions plus rewards/punishments. I had a separate Python server that the Unity game connected to via a socket; this server actually ran the PPO algorithm and optimized the policy over time. If I had had time to do the hardware as well, I likely would have used a battery, an ESP32-S3, an accelerometer, a gyroscope, and a magnetometer.
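The Python side of that socket loop can be sketched roughly as follows. This is only an illustration under stated assumptions: the newline-delimited JSON message format, the `agent_id`/`state`/`action` field names, the action names, and port 5005 are all hypothetical, and the policy is a random placeholder where the PPO policy would go. It is not the protocol ML-Agents itself uses.

```python
import json
import random
import socketserver

# Hypothetical action names; the project's real action set is not listed here.
ACTIONS = ["move_up", "move_down", "move_left", "move_right", "stay"]


def choose_action(state):
    """Placeholder policy: ignores the state and picks a random action.

    A trained PPO policy would replace this function.
    """
    return random.choice(ACTIONS)


def handle_line(line):
    """Decode one JSON state message and encode the reply with the chosen action."""
    message = json.loads(line)
    action = choose_action(message["state"])
    return json.dumps({"agent_id": message["agent_id"], "action": action})


class AgentHandler(socketserver.StreamRequestHandler):
    """Answers each newline-terminated message sent by the game client."""

    def handle(self):
        for raw in self.rfile:
            reply = handle_line(raw.decode("utf-8"))
            self.wfile.write((reply + "\n").encode("utf-8"))


def serve(host="127.0.0.1", port=5005):
    """Block forever, serving one game connection at a time."""
    with socketserver.TCPServer((host, port), AgentHandler) as server:
        server.serve_forever()
```

Calling `serve()` starts the server; the game would then send one line such as `{"agent_id": 3, "state": [0.1, 0.2]}` per agent step and read one line back with the chosen action.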

Challenges we ran into

I had 2 midterms to study for, as well as various other group project tasks to take care of. By the time I could start on the project, it was 11 PM. At first it went well: I had an idea that I thought was interesting, and I had gotten the base of the project set up. However, then I noticed how time was running out, and I gave in to the devil's temptation: vibe coding. It caused the project to take a few big leaps forward in progress, only to repeatedly do backflips for the rest of the time I was working on it, with me running into several issues that were very hard to debug since I didn't even know how the code worked. I probably would've gotten the same amount of work done had I not used vibe coding, and I at least would have enjoyed my time a lot more.

Accomplishments that we're proud of

Nothing really, at least in regard to the actual product itself, as the project is severely lacking. However, one thing I am proud of is the idea behind it. While it doesn't have many real-world applications, I feel that simulating an entire agentic environment, and then putting actual hardware into that simulated environment, is a very interesting idea. I'd like to one day (hopefully over spring break) go back and legitimately implement this project. Additionally, I am proud of my initial planning for the project, which included documenting the possible states, actions, and rewards/punishments for each type of entity.

What we learned

I have learned that I need to get better at vibe coding if I intend to do it, or at least that vibe coding does not work nearly as well when it comes to game development tasks. On a positive note, I did learn that I have the potential to come up with fun ideas if I give them enough thought.

What's next for Agent Robot Game

Hopefully over spring break, or by summer break at the very latest, I plan to do a legitimate implementation of this project, since I believe it'd be a very fun project to work on.

Built With

Updates

Mahd Malik started this project — Mar 07, 2026 11:38 PM EST
