Inspiration

This idea came from my own travel struggles. When I was exploring cities alone, I often found myself staring at amazing buildings or monuments without knowing their stories. Language barriers made it harder, and I would end up spending more time scrolling through Google or social media trying to find information than actually enjoying the moment. It felt like I was researching instead of experiencing. That’s when I thought: what if there was a way to just look at something and instantly know its story, fun facts, or background? No endless scrolling, no language issues, just pure discovery in the moment of curiosity. That’s how the idea was born: a travel platform that makes exploring easy, fun, and meaningful.

What it does

XploreBirdie is like having a local friend by your side while you travel. Just look around, and it instantly recognizes the landmarks and places around you. It has deep knowledge about the city, and EchoXR shares quick stories, local tips, fun facts, or even a little history about what is in front of you. No more pulling out your phone and scrolling through endless pages to figure out what you’re seeing. Instead, you get instant answers that spark curiosity and help you decide whether to explore further or simply move on and keep enjoying your journey.

The product is prototyped on the Meta Quest 3, with the goal of running on the Meta MR glasses expected to launch in the near future.

How we built it

We built XploreBirdie by bringing different technological blocks together: Unity, the Meta SDK, Roboflow, and Botpress. Unity is our main development environment, where we designed the XR experience and interactions. The Meta SDK made it possible to run everything smoothly on the Meta Quest and connect the headset features with our application. For landmark recognition, we trained and deployed a model using Roboflow, which allowed us to scan the user’s surroundings and detect points of interest in real time. Once a landmark is identified, we call an AI API to generate quick, engaging audio responses that turn raw data into human-like storytelling. We used Botpress autonomous agents with prompt engineering, coupled with a deep knowledge base about travel locations and powered by an orchestration of multiple models: GPT-5 and GPT-5 mini as the LLMs, Deepgram for STT, and Hume AI for TTS. We connected image processing and detection, XR compatibility, and AI-generated narration into one seamless flow.
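As an illustration, here is a minimal sketch of the landmark-detection step using Roboflow’s Python SDK. The project name, version number, and API key are placeholders, and the actual pipeline runs inside the Unity client; this only shows how a captured frame can be sent to a trained Roboflow model and the top detection handed to the narration step.

```python
from roboflow import Roboflow

# Placeholder credentials and model identifiers, for illustration only.
rf = Roboflow(api_key="YOUR_ROBOFLOW_API_KEY")
model = rf.workspace().project("landmark-detector").version(1).model

# Run inference on a single frame grabbed from the passthrough camera.
result = model.predict("frame.jpg", confidence=40, overlap=30).json()

# Keep only the most confident detection and pass its label downstream.
predictions = result.get("predictions", [])
if predictions:
    top = max(predictions, key=lambda p: p["confidence"])
    print(f"Detected landmark: {top['class']} (confidence {top['confidence']})")
    # top["class"] is what the agent would receive to generate the story.
```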

Challenges we ran into

We faced challenges early on: even finding the right team and developers, aligning on the vision, and deciding which features to keep, given the time crunch at the event. We had many ideas but had to scale them down to the essentials: scanning the surroundings, recognizing the landmark, and telling its story and relevant information. The tech side was also tricky; keeping the technologies working in harmony meant retraining and optimizing the model until it could reliably identify landmarks on the go. Getting text-to-speech and speech-to-text to run smoothly, and handling the turns between listening and speaking between the user and the assistant across four APIs (Roboflow, Botpress, Deepgram, Hume AI) without delays, was really hard.

Technical challenges also included: integrating multiple APIs, syncing voice input, and balancing efficiency with a UI the user can actually understand; and mixing the knowledge base with context engineering to give the AI agent its personality and the deep local information layer it needs. The turn-taking between the voice APIs is sketched below.
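As a rough illustration of the turn-taking problem, this sketch shows the single-speaker-at-a-time loop we aimed for: listen, think, then speak, and never listen while narration is playing. The function bodies are stand-in stubs (the real calls go through Deepgram, Botpress, and Hume AI from Unity); only the state handling is the point here.

```python
from enum import Enum, auto

class Turn(Enum):
    LISTENING = auto()   # capturing the traveller's question
    THINKING = auto()    # waiting on the agent / LLM
    SPEAKING = auto()    # playing back the narrated answer

# Hypothetical stand-ins for the real Deepgram, Botpress, and Hume AI calls.
def transcribe_question() -> str:                      # STT (Deepgram in our stack)
    return "Who built this cathedral?"

def ask_agent(question: str, landmark: str) -> str:    # Botpress agent + LLM
    return f"{landmark}: a quick story answering '{question}'."

def speak(text: str) -> None:                          # TTS (Hume AI in our stack)
    print(f"[narrating] {text}")

def conversation_loop(landmark: str, turns: int = 1) -> None:
    state = Turn.LISTENING
    for _ in range(turns * 3):          # one full cycle = listen, think, speak
        if state is Turn.LISTENING:
            question = transcribe_question()
            state = Turn.THINKING
        elif state is Turn.THINKING:
            answer = ask_agent(question, landmark)
            state = Turn.SPEAKING
        else:
            speak(answer)               # never listen while the assistant speaks
            state = Turn.LISTENING

conversation_loop("Cologne Cathedral")
```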

Accomplishments that we're proud of

One of our biggest wins was finding the right team. We started as strangers, but quickly discovered common interests and focused on the features that really mattered — using the passthrough camera and AI to bring landmarks to life.

We’re proud of how we divided the idea into clear technical components and then stitched them back together into one working flow. From linking the Meta Quest hardware to building the UI, training each team member to use the device, adding image recognition, layering in AI narration, and finally enriching it with local knowledge for smarter stories, every step felt like progress and a learning curve.

On the technical side, we achieved synchronization between the different APIs and connected Unity with the Meta SDK. The proof of concept works: you can scan, recognize a landmark, and instantly hear a narrated story that feels alive, and you can interact with the agent by asking questions and getting real local answers.

Most of all, we’re proud that we turned an idea into a functioning prototype by combining our diverse skills and creativity in just a short time.

What we learned

We learned about coordinating, communicating, and brainstorming our ideas into concrete features, and about breaking the idea down into its technical components.

Some of the technical concepts we learned are: Unity integration, Botpress bot creation, TTS/STT integration, autonomous agents, context engineering, knowledge base creation, and training an image recognition model with Roboflow.
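As a small example of the STT piece, this is roughly how a recorded question can be sent to Deepgram’s pre-recorded transcription endpoint. The API key and audio file name are placeholders, and in the actual prototype the request is made from the Unity client rather than from Python.

```python
import requests

DEEPGRAM_KEY = "YOUR_DEEPGRAM_KEY"   # placeholder

def transcribe_wav(path: str) -> str:
    """Send a recorded question to Deepgram's pre-recorded transcription API."""
    with open(path, "rb") as audio:
        resp = requests.post(
            "https://api.deepgram.com/v1/listen",
            params={"model": "nova-2", "smart_format": "true"},
            headers={
                "Authorization": f"Token {DEEPGRAM_KEY}",
                "Content-Type": "audio/wav",
            },
            data=audio,
        )
    resp.raise_for_status()
    return resp.json()["results"]["channels"][0]["alternatives"][0]["transcript"]

print(transcribe_wav("question.wav"))
```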

What's next for XploreBirdie (Echo_XR)

  • Expanding the knowledge base with more cities, landmarks, and local stories.
  • Optimizing for real-time performance with faster recognition and smoother narration.
  • Adding personalization features, like letting users choose whether they prefer history, culture, food, or architecture insights.
  • Enabling multi-language support to break language barriers for global travelers.
  • Preparing for Meta MR Glasses integration to make the experience more immersive and natural.
  • Adding community contributions, where locals can share their own stories, fun facts, and hidden gems.

Built With

Unity, Meta SDK (Meta Quest 3), Roboflow, Botpress, GPT-5 / GPT-5 mini, Deepgram, Hume AI