Inspiration
Our project was inspired by our own experiences preparing for job interviews. One team member practiced behavioral interviews using ChatGPT’s voice feature, while another used a platform called InStage to rehearse interview responses. While these tools were helpful, they lacked the feeling of interacting with a real person in a real environment. We wanted to take the idea of practicing conversations with AI one step further by bringing it into augmented reality, allowing users to practice social interactions in their actual surroundings.
What it does
parsAR is an augmented reality app that helps users practice real‑world conversations with an AI-powered virtual person. Users speak naturally, and the virtual character listens and responds in real time, simulating an authentic face‑to‑face interaction. Each session is recorded, and after the conversation ends, the app provides personalized feedback to help users improve their communication and social confidence.
How we built it
parsAR was built using Unity for the augmented reality experience. We used ElevenLabs for both speech‑to‑text and text‑to‑speech, enabling natural voice interactions. User input is processed by Google’s Gemini API, which generates realistic conversational responses and coaching insights. Session recordings are analyzed using TwelveLabs to extract key moments and support post‑conversation feedback.
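To make the flow above concrete, here is a minimal sketch of one conversation turn: speech is transcribed, the running conversation is sent to a language model, and the reply is turned back into audio. It is written in Python for brevity and is not the project's actual code. The class name, the three callables, and the stub functions are all hypothetical; in a real build they would wrap the ElevenLabs and Gemini calls, which are deliberately left out here.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical stage signatures. A real build would wrap the speech and
# language-model services behind these; nothing here calls a real API.
Transcribe = Callable[[bytes], str]
GenerateReply = Callable[[List[Tuple[str, str]]], str]
Synthesize = Callable[[str], bytes]


@dataclass
class ConversationTurnPipeline:
    """Runs one user turn: speech-to-text, reply generation, text-to-speech."""
    transcribe: Transcribe
    generate_reply: GenerateReply
    synthesize: Synthesize
    # (speaker, text) pairs, kept so the model sees the whole conversation.
    history: List[Tuple[str, str]] = field(default_factory=list)

    def handle_turn(self, user_audio: bytes) -> bytes:
        user_text = self.transcribe(user_audio)
        self.history.append(("user", user_text))
        reply_text = self.generate_reply(self.history)
        self.history.append(("character", reply_text))
        return self.synthesize(reply_text)


# Stub stages for illustration only: "audio" is just UTF-8 text here.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_generate_reply(history: List[Tuple[str, str]]) -> str:
    _, last_text = history[-1]
    return f"You said: {last_text}. Tell me more."


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = ConversationTurnPipeline(
    fake_transcribe, fake_generate_reply, fake_synthesize
)
reply_audio = pipeline.handle_turn(b"I led a team project")
```

Keeping each stage behind a plain function makes it possible to test the turn logic with stubs like these before any real service is wired in.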
Challenges we ran into
During the first couple of hours, we thought we had settled on an idea: a browser extension for identifying fake products and websites on Shopify. However, the idea, albeit useful, felt better suited to a personal project and wasn't innovative enough. We learned that it's okay to scrap an idea because, after all, brainstorming is one of the hardest parts. While working in Unity, we also had a hard time getting the AI character to respond; it would either repeat what we said or give no response at all. However, we decided as a group to stick with ElevenLabs and TwelveLabs because they were perfect for our idea.
Accomplishments that we're proud of
We're extremely proud of our frontend and UI, which make parsAR feel futuristic, and of our backend logic, which integrates ElevenLabs, TwelveLabs, Unity, JavaScript, and more. Many of the technologies were new to us, and we had to experiment with them in a relatively short time frame. Ultimately, the hackathon wasn't just about winning, but about making an innovative product and networking. We met so many cool new people at events such as CS Jeopardy and the Spicy Indomie eating challenge.
What we learned
We learned how to develop in Unity with C# and expanded our knowledge of RESTful APIs. We became more familiar with the TTS and STT features of the ElevenLabs API. This was also our first time incorporating a VR headset to make the end-user experience more engaging.
What's next for parsAR
We would love for the AI agents' speech to feel more human and to incorporate body gestures, just as a person would in real life. We also want to develop more agents to cover a wider range of personality types: an introvert who won't initiate the conversation; an extrovert who might keep you on your toes with relentless questions; and a philosopher who asks questions that really make you pause and think.
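One possible way to represent these personality types is as small configuration records that get turned into instructions for the language model. The sketch below is only an illustration in Python, not something we have built: the `Persona` fields, the `PERSONAS` table, and the prompt wording are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    """Hypothetical description of one practice-partner personality."""
    name: str
    initiates_conversation: bool
    style: str


# Illustrative entries based on the three personalities described above.
PERSONAS = {
    "introvert": Persona(
        "introvert", False,
        "Give short answers and wait for the user to lead.",
    ),
    "extrovert": Persona(
        "extrovert", True,
        "Ask rapid follow-up questions and keep the pace high.",
    ),
    "philosopher": Persona(
        "philosopher", True,
        "Ask open-ended questions that invite reflection.",
    ),
}


def build_system_prompt(persona: Persona) -> str:
    """Turn a persona record into an instruction string for the model."""
    if persona.initiates_conversation:
        opener = "Start the conversation yourself."
    else:
        opener = "Do not speak first; respond only after the user does."
    return (
        f"You are playing a {persona.name} in a practice conversation. "
        f"{persona.style} {opener}"
    )


introvert_prompt = build_system_prompt(PERSONAS["introvert"])
```

With this shape, adding a new personality would mean adding one entry to the table rather than changing the conversation logic.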
Built With
- android
- c-sharp
- eleven-labs
- gemini-api
- javascript
- node.js
- unity