Inspiration

We wanted to move beyond the typical "Input to Output" AI generator. Inspired by the collaborative nature of crewed missions (from Star Trek to pirate ships), we envisioned a Voyage Command Centre where users aren't just prompting an AI but actively role-playing alongside AI crewmates. We aimed to create a truly immersive, multiplayer experience where AI agents have distinct personalities and voices, making the planning phase of a journey feel like a story unfolding in real time.

What it does

Voyager AI is a collaborative mission planning platform.

  • Select Your World: Users choose a voyage archetype (Space, Pirate, Cyberpunk, etc.).
  • Generate Inventory: Using Gemini AI + DALL-E 2, the system generates a custom survival kit with unique items and visuals tailored to the mission description.
  • Command Centre (The Core): Users enter a real-time chat room where they can take on roles like "Captain" or "Mechanic." They are joined by two AI agents:
    • The Navigator: A logic-driven planner who suggests routes.
    • The Watchman: A paranoid safety officer who critiques plans.

  Both agents speak (via Text-to-Speech) and react dynamically to the user's role and the specific inventory items available.

How we built it

We architected a decoupled, full-stack application optimised for the Arm64 architecture on AWS EC2.

  • Backend (FastAPI & WebSockets): We built a custom WebSocket manager in Python to handle real-time communication. Key features include isolated chat rooms (one room per voyage type) and role locking (only one user can claim the Captain role per room).
  • AI Agents (Google Gemini): We utilised the Google Gen AI SDK to inject dynamic context into our agents. The backend constructs prompts on the fly, feeding the agents the current Voyage Type, User Role, and Shop Inventory so their responses are highly specific and immersive.
  • Voice (ElevenLabs): To bring the characters to life, we integrated the ElevenLabs API. We built a custom synchronous message queue in the React frontend to buffer AI messages, ensuring audio clips play sequentially like a movie dialogue rather than overlapping chaotically.
  • Infrastructure (Docker & AWS): We containerised the entire stack using Docker, optimising our builds for the Arm64 architecture, and deployed to an AWS EC2 t4g.small (Arm64) instance for seamless multi-device connectivity.
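In practice, the room isolation and role locking described above amount to a small piece of per-room bookkeeping. A minimal sketch (class and method names are illustrative, not our actual code):

```python
from dataclasses import dataclass, field

@dataclass
class Room:
    """One chat room per voyage type; tracks which roles are taken."""
    voyage_type: str
    members: dict = field(default_factory=dict)   # user_id -> role
    taken_roles: set = field(default_factory=set)

class RoomManager:
    """Keeps voyages isolated and enforces one user per role."""
    def __init__(self):
        self.rooms: dict[str, Room] = {}

    def join(self, voyage_type: str, user_id: str, role: str) -> bool:
        room = self.rooms.setdefault(voyage_type, Room(voyage_type))
        if role in room.taken_roles:
            return False            # role already locked (e.g. a second Captain)
        room.members[user_id] = role
        room.taken_roles.add(role)
        return True

    def leave(self, voyage_type: str, user_id: str) -> None:
        room = self.rooms.get(voyage_type)
        if room and user_id in room.members:
            room.taken_roles.discard(room.members.pop(user_id))
```

Because rooms are keyed by voyage type, a "Captain" in the Space room never collides with a "Captain" in the Pirate room.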
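The dynamic context injection works by assembling the current voyage state into the system prompt before every model call. A simplified sketch of the idea (the prompt wording, field names, and the helper itself are illustrative):

```python
import json

def build_agent_prompt(agent_persona: str, voyage_type: str,
                       user_role: str, inventory: list[dict]) -> str:
    """Assemble the dynamic context fed to an AI agent on each turn."""
    return (
        f"You are {agent_persona} aboard a {voyage_type} voyage.\n"
        f"The human you are talking to is the crew's {user_role}.\n"
        "Only reference items from this inventory:\n"
        f"{json.dumps(inventory, indent=2)}\n"
        "Stay in character and keep replies short."
    )

prompt = build_agent_prompt(
    "the Navigator, a logic-driven route planner",
    "Space", "Captain",
    [{"name": "Plasma Cutter", "qty": 1}, {"name": "Ration Pack", "qty": 6}],
)
```

Passing the inventory as structured JSON, rather than prose, is what lets the agents reference specific items the crew actually owns.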

Challenges we ran into

  • Audio Synchronisation: Initially, our AI agents would speak over each other, creating a chaotic audio experience. We had to architect a custom message queue system in the frontend that intercepts incoming AI messages, buffers them, and only plays the next audio clip once the previous one has finished.
  • Docker on Arm64: Ensuring all our dependencies (especially Python ML libraries and Node.js binaries) were compatible with the AWS EC2 (Arm64) architecture required careful selection of Docker base images and multi-stage builds.
  • WebSockets & Next.js Compatibility: Our initial attempts to proxy WebSocket connections through Next.js failed repeatedly due to protocol mismatch issues. We solved this by bypassing the proxy and having the client connect directly to the Python backend, though this introduced CORS and network discovery challenges when deploying to AWS.
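The fix for the audio overlap is a producer/consumer queue that refuses to start the next clip until the current one has finished. The real implementation lives in the React frontend; the idea, sketched here in Python with illustrative names:

```python
import asyncio

class AudioQueue:
    """Buffers incoming AI messages and plays their audio one at a time,
    so agent voices never overlap."""
    def __init__(self, play_clip):
        self._queue = asyncio.Queue()
        self._play_clip = play_clip   # coroutine that plays one clip fully

    def enqueue(self, clip):
        self._queue.put_nowait(clip)

    async def drain(self):
        """Wait until every queued clip has finished playing."""
        await self._queue.join()

    async def run(self):
        while True:
            clip = await self._queue.get()
            await self._play_clip(clip)   # block until finished, then advance
            self._queue.task_done()
```

Because `run` awaits each clip to completion before pulling the next message, the dialogue plays sequentially even when both agents reply at once.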

Accomplishments that we're proud of

  • Seamless Multi-Device Experience: We are incredibly proud that a user on a phone and a user on a laptop can join the same "room" instantly, see each other's messages, and hear the AI respond to both of them in real time.
  • Efficient Deployment: We successfully containerised a complex full-stack app (Next.js + FastAPI + WebSockets) and deployed it to a cost-effective t4g.small instance, proving that high-quality AI apps don't need massive compute resources if architected correctly.

What we learned

  • Prompt Engineering is Logic: Writing good prompts isn't just about English; it's about logic injection. Passing structured data (inventory JSON) into the prompt context dramatically improved the AI's ability to be useful rather than just conversational.
  • Arm64 is Ready: Building for Arm was seamless once we configured our Docker containers correctly.

What's next for Voyager AI

  • Persistent Campaigns: Currently, sessions are short-lived. We plan to implement persistent storage so crews can save their progress, inventory, and chat history, turning single sessions into long-running campaigns.
  • Procedural Event Engine: Instead of just planning, we want users to play the mission. We will build a procedural event engine where the Navigator and Watchman generate random encounters (e.g., "Asteroid Field Detected" or "Kraken Attack"), requiring the crew to spend their inventory items to survive.
  • More Archetypes: Expanding the roster with more specialised roles (e.g., Medic, Diplomat, Hacker) and voyage types (e.g., Fantasy Dungeon, Noir Detective Agency) to cater to diverse storytelling genres.
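As a rough sketch of the direction for the event engine (the encounter tables, item names, and the function itself are hypothetical future work, not shipped code):

```python
import random

# Hypothetical per-archetype encounter tables: (event name, items that counter it)
ENCOUNTERS = {
    "Space":  [("Asteroid Field Detected", {"Plasma Cutter"}),
               ("Hull Breach", {"Sealant Foam"})],
    "Pirate": [("Kraken Attack", {"Harpoon"}),
               ("Naval Blockade", {"Forged Papers"})],
}

def roll_encounter(voyage_type: str, inventory: list[str]) -> dict:
    """Pick a random encounter; the crew survives only if their inventory
    contains an item that counters it, spending that item in the process."""
    name, counters = random.choice(ENCOUNTERS[voyage_type])
    usable = counters & set(inventory)
    return {"event": name, "survived": bool(usable), "spent": sorted(usable)}
```

Tying survival to the generated inventory is what would turn the shop phase into a meaningful strategic choice rather than flavour.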

Built With

  • Python / FastAPI
  • Next.js (React)
  • WebSockets
  • Google Gemini
  • DALL-E 2
  • ElevenLabs
  • Docker
  • AWS EC2 (Arm64)