Note

We came up with the idea and implemented it entirely within the 3 days we were given (realistically 2). We haven't built anything like this before, and we made it specifically for CalCodeFest. Hope you have fun watching our demo!

Inspiration

Many people spend a significant amount of time commuting or travelling, and they often wish to use this time productively. Traditional learning methods, such as watching YouTube videos, require visual attention and aren't safe while driving or practical during hands-free activities. On the other hand, regular podcast apps might not have a podcast or audio lesson on the topic you want to explore, especially if you're into niche subjects. This makes it hard to learn new concepts or explore interesting topics during travel time without compromising safety.

What it does

RainPod generates interactive podcasts using AI agents:

  • Engaging Host AI: Mimics charismatic podcast hosts to guide discussions (think Joe Rogan).
  • Domain Expert AI: Provides in-depth explanations of selected topics in a digestible manner, responding to listeners' questions in real time.
  • Interactive Agentic Framework: The user can interrupt the podcast in real time to ask questions, steer the conversation in a slightly different direction, or even just chat about coffee. The podcast still flows naturally and picks up almost instantly, as if you were guest #3!

How we built it

  • Designed and implemented an Interactive Agentic Framework (IAP) using CrewAI, Gemini, and WebSockets.
  • Integrated AI agents with host and expert roles, ensuring seamless audio interactions.
  • Optimized text-to-speech for clarity and engagement, and streamlined user interaction through UI improvements.
  • Wrote custom search tools for the AI agents to keep the discussions grounded in factual data and reduce hallucinations.
  • Used WebSockets to handle continuous streaming from the backend server and to keep the backend and frontend conversation histories synchronized.
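
The interruptible host/expert flow described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not our actual implementation: the `Podcast` class, its `speak` method (a stand-in for a real agent call), and the turn-taking rule are all assumptions made for the example.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Podcast:
    """Hypothetical turn loop: host and expert alternate unless the listener interrupts."""
    topic: str
    history: list = field(default_factory=list)
    questions: asyncio.Queue = field(default_factory=asyncio.Queue)

    async def speak(self, role: str, prompt: str) -> str:
        # Placeholder for a real agent/LLM call; here it only formats a string.
        await asyncio.sleep(0)
        return f"{role} on {prompt}"

    async def run(self, max_turns: int) -> list:
        next_role = "host"
        for _ in range(max_turns):
            if not self.questions.empty():
                # A pending listener question takes priority over the planned flow.
                question = self.questions.get_nowait()
                self.history.append(("listener", question))
                role, prompt = "expert", question
                next_role = "host"  # the host picks the thread back up afterwards
            else:
                role, prompt = next_role, self.topic
                next_role = "expert" if role == "host" else "host"
            line = await self.speak(role, prompt)
            self.history.append((role, line))
        return self.history


async def demo() -> list:
    pod = Podcast(topic="black holes")
    pod.questions.put_nowait("what is Hawking radiation?")
    return await pod.run(max_turns=3)


if __name__ == "__main__":
    for role, line in asyncio.run(demo()):
        print(f"{role}: {line}")
```

In a real deployment the question queue would be fed by the WebSocket handler, so a question can arrive at any point between turns.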

Tools used:

Gemini API, CrewAI, pyTTS, FastAPI, Next.js, Google speech recognition, WebSockets

Challenges we ran into

  1. Integrating audio seamlessly across frontend and backend. We found that plain HTTP requests weren't a good fit for continuous audio, so we took the time to rework the architecture around WebSockets.
  2. Adapting to AI agent complexities and ensuring responsive interactions.
  3. Overcoming remote collaboration hurdles and managing time constraints effectively.
  4. Keeping the backend and frontend conversation histories synchronized while streaming continuously over WebSockets. Initially the backend ran too far ahead of playback, so by the time a listener's question arrived, it referred to an outdated point in the conversation.
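
One way to keep the backend from running ahead of playback is sketched below. It is an illustration under assumptions, not our exact code: the `TurnPacer` class, the `max_ahead` window, and the acknowledgement scheme (the frontend reporting the last turn it finished playing) are hypothetical names and choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TurnPacer:
    """Hypothetical guard that stops the backend from running far ahead of playback."""
    max_ahead: int = 2   # assumed window size; would be tuned to audio length
    generated: int = 0   # sequence number of the last turn sent to the frontend
    played: int = 0      # last sequence number the frontend acknowledged as played
    history: list = field(default_factory=list)

    def can_generate(self) -> bool:
        # Only generate while the backend is fewer than max_ahead turns in front.
        return self.generated - self.played < self.max_ahead

    def record_turn(self, text: str) -> int:
        if not self.can_generate():
            raise RuntimeError("frontend has not caught up yet")
        self.generated += 1
        self.history.append((self.generated, text))
        return self.generated

    def ack(self, seq: int) -> None:
        # The frontend reports the last turn it finished playing.
        self.played = max(self.played, min(seq, self.generated))

    def heard_so_far(self) -> list:
        # Context for answering a question: only turns the listener actually heard.
        return [text for seq, text in self.history if seq <= self.played]
```

When a question arrives, answering it against `heard_so_far()` rather than the full backend history keeps the reply tied to what the listener actually heard.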

Accomplishments that we're proud of

Successfully deploying functional AI agents in podcast host and expert roles, making learning more accessible during commutes.

What we learned

  1. Practical applications of AI in educational content creation.
  2. Effective integration of audio technologies in real-time environments.
  3. Teamwork and project management skills in a hackathon setting.

What's next for RainPod

Future developments include:

  • Enhancing emotional nuances in text-to-speech for richer listener experiences.
  • Improving UI for intuitive navigation and user feedback.
  • Scaling AI capabilities to cover diverse topics and languages, expanding global accessibility.
