- Players listen to the AI game master narrate during play; it has just given them a scenario that requires action.
- Character selection: players choose a class, and can also pick the theme and how many players will take part.
- Further gameplay, with the game master presenting another situation.
AI-Powered Dungeon Master
The Idea
We set out to build an AI-powered Dungeon Master that makes getting into Dungeons & Dragons as simple and immersive as possible — no rulebooks, no prep time, and no prior experience required.
The goal was to create a generative storytelling assistant that:
- Reacts to real dice rolls on a physical table
- Understands voice commands and video input
- Narrates a dynamic D&D story in real time
- Helps beginners start playing without needing to buy a book or learn complex rules
D20mind gives players the ability to dive right into a tabletop adventure with almost no setup.
Tech Stack
- Frontend: Node.js with custom React components
- State Management: Local Storage (for players, HP, actions)
- AI Integration: Gemini API for story generation and context processing
- Vision: Analyzes camera snapshots of the board and dice
- Speech-to-Text: Converts player voice input into actions
- Text-to-Speech: Narrates the game using natural voice synthesis
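As a rough illustration of the Local Storage state management listed above, here is a minimal sketch. The storage key, the `Player` and `GameState` shapes, and all function names are assumptions made for this example, not the project's actual code. The store is passed in as a small interface so the same functions work with `window.localStorage` in the browser or with an in-memory stand-in elsewhere.

```typescript
// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical shape of the saved state; the real project's fields may differ.
interface Player {
  name: string;
  characterClass: string;
  hp: number;
}

interface GameState {
  players: Player[];
  actions: string[]; // log of actions taken so far
}

const STORAGE_KEY = "d20mind-state"; // assumed key name

function saveState(store: KeyValueStore, state: GameState): void {
  store.setItem(STORAGE_KEY, JSON.stringify(state));
}

function loadState(store: KeyValueStore): GameState {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return { players: [], actions: [] };
  try {
    return JSON.parse(raw) as GameState;
  } catch {
    // Corrupted entry: start fresh rather than crash the session.
    return { players: [], actions: [] };
  }
}

// Apply damage without mutating the previous state; HP never drops below 0.
function applyDamage(state: GameState, playerName: string, damage: number): GameState {
  return {
    ...state,
    players: state.players.map((p) =>
      p.name === playerName ? { ...p, hp: Math.max(0, p.hp - damage) } : p
    ),
    actions: [...state.actions, `${playerName} takes ${damage} damage`],
  };
}

// In-memory stand-in for window.localStorage, for use in tests or Node.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
  };
}
```

In a browser, `saveState(window.localStorage, state)` would persist the session across page reloads; keeping the update functions pure makes them easy to call from React state setters.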
Lessons Learned
We discovered that sometimes simpler solutions are better. While the concept was ambitious and had many complex moving parts, we learned the importance of:
- Prioritizing core features over perfect polish
- Using fallback systems when AI or APIs failed
- Keeping the UX clean despite technical complexity
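One way to picture a fallback system for an unreliable AI step is a chain of sources for a dice roll. The sketch below is hypothetical: the write-up does not say which fallbacks were used, so the vision-then-manual-then-virtual order, the `RollSource` labels, and the function names are all assumptions for illustration.

```typescript
// Where a roll value came from, so the UI can tell players when a fallback was used.
type RollSource = "vision" | "manual" | "virtual";

interface RollResult {
  value: number;
  source: RollSource;
}

// A d20 reading is only trusted if it is a whole number between 1 and 20.
function isValidD20(value: unknown): value is number {
  return (
    typeof value === "number" &&
    Number.isInteger(value) &&
    value >= 1 &&
    value <= 20
  );
}

// Try the camera reading first, then a value typed in by the player,
// and finally a virtual roll so the game never stalls.
// `rng` is injectable so the virtual roll can be made deterministic in tests.
function resolveRoll(
  visionReading: unknown,
  manualEntry?: number,
  rng: () => number = Math.random
): RollResult {
  if (isValidD20(visionReading)) {
    return { value: visionReading, source: "vision" };
  }
  if (isValidD20(manualEntry)) {
    return { value: manualEntry, source: "manual" };
  }
  return { value: Math.floor(rng() * 20) + 1, source: "virtual" };
}
```

Returning the source alongside the value lets the interface stay honest with players, for example by showing a small note when the roll was not read from the physical die.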
Key Takeaways:
- Integrating Gemini's multimodal capabilities in real time was tricky, especially formatting and handling image input.
- Handling audio and video streams, speech recognition, and voice synthesis together was more complex than expected.
- Having everything work in sync required constant debugging and testing.
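To make the image-formatting point concrete, here is a small sketch of turning a camera snapshot into a request body. It assumes the snapshot arrives as a base64 data URL (as `canvas.toDataURL` produces) and that the body follows the `contents` / `parts` / `inline_data` layout of Gemini's REST `generateContent` endpoint as commonly documented; check the field names against the current API reference before relying on them. The helper names are made up for this example and are not taken from the project.

```typescript
// Request-body types for this sketch; field names are an assumption based on
// the Gemini REST documentation and should be verified against the current docs.
interface TextPart {
  text: string;
}

interface InlineImagePart {
  inline_data: { mime_type: string; data: string };
}

interface GenerateContentBody {
  contents: { parts: (TextPart | InlineImagePart)[] }[];
}

// Split "data:image/jpeg;base64,AAAA..." into its MIME type and raw base64 payload.
// The API expects the bare base64 string, without the "data:...;base64," prefix.
function splitDataUrl(dataUrl: string): { mimeType: string; base64: string } {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(dataUrl);
  if (!match) {
    throw new Error("Expected a base64 data URL");
  }
  return { mimeType: match[1], base64: match[2] };
}

// Combine a text prompt and one board snapshot into a single request body.
function buildBoardRequest(prompt: string, snapshotDataUrl: string): GenerateContentBody {
  const { mimeType, base64 } = splitDataUrl(snapshotDataUrl);
  return {
    contents: [
      {
        parts: [
          { text: prompt },
          { inline_data: { mime_type: mimeType, data: base64 } },
        ],
      },
    ],
  };
}
```

Forgetting to strip the data URL prefix is an easy mistake with this kind of input, which is why the sketch fails loudly on anything that is not a base64 data URL.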
Hackathon Challenges
This project was built during a 24-hour hackathon, and time pressure was very real.
- Gemini API integration took significant trial and error.
- Speech and vision features required fine-tuned formatting and logic.
- Building something of this scale in one day pushed us technically and creatively.
Despite the challenges, we’re proud of how far we got and what we created.
What’s Next?
There’s a lot of room for future development. Next, we want to make the AI components respond faster and improve the accuracy of the board-scanning feature.
Final Thoughts
D20mind is a glimpse at the future of tabletop RPGs — immersive, accessible, and AI-driven. By combining real-time voice, video, and AI storytelling, we’ve created a tool that makes adventures more magical and approachable for new and experienced players alike.


