Inspiration
We noticed how often people struggle with clarity in conversations: using filler words, speaking too fast, or interrupting without realizing it. These habits don’t just affect confidence in classrooms or job interviews; they can also increase stress and worsen anxiety, making everyday communication harder for people with social or mental health challenges. In healthcare, communication breakdowns have been estimated to contribute to roughly 80% of serious medical errors (Joint Commission), and by some estimates social anxiety affects as many as 1 in 4 people worldwide. Most communication coaching tools are either too clinical or too distracting. We wanted to design something lightweight, real-time, and wearable that supports both everyday users and healthcare contexts by offering subtle feedback in the moment and reflective insights afterward.

What it does
ConvoSenses is a proactive conversation intelligence tool that works both during a conversation and after it:
- Real-time AR feedback: Subtle cues on the Spectacles HUD (e.g., “Take 3s pause” or “Let them finish”), with five visual indicators (Fluency, Prosody, Pragmatics, Consideration, Time Balance) that shift from green to red based on performance. This helps users manage speaking anxiety and improve conversational balance without distraction.
- Evaluation + Feedback from the LLM-as-a-Judge: The agent holistically evaluates the speaker's performance against quantifiable metrics and reports what went well, what could be improved, and how to improve it.
  • Filler & Fluency → Number of filler words and words per minute.
  • Prosody → Speaking pace, frequency of pauses, and volume variance.
  • Pragmatics → Was the question answered? Was there rambling?
  • Empathy and Politeness → Presence of low confidence or uncertainty, acknowledgment, and interruptions.
  • Turn-Taking → Interruption ratio (relative to other speakers) and share of speaking time.
- Web Dashboard: After each conversation, results are saved to a secure website where every category is scored from 1–5. The dashboard visualizes:
  • Communication health metrics for fluency, empathy, clarity, and pacing.
  • Session summaries with “What Went Right,” “Areas for Attention,” and actionable recommendations for stress reduction and improved dialogue.
  • Trend graphs and history that allow individuals, students, or even therapists to track long-term progress over time.
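
As a rough illustration of the Filler & Fluency and Turn-Taking metrics listed above, here is a minimal Python sketch that works on diarized transcript segments. The `Segment` shape, the `FILLERS` list, and the definition of an interruption (a different speaker starting before the previous segment has ended) are all assumptions made for this example, not the project's actual implementation.

```python
import re
from dataclasses import dataclass

# Hypothetical filler list; note that "like" will also match legitimate uses.
FILLERS = {"um", "uh", "like", "basically", "actually"}


@dataclass
class Segment:
    """One diarized chunk of speech (assumed shape, times in seconds)."""
    speaker: str
    start: float
    end: float
    text: str


def words(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def fluency(segments: list[Segment], speaker: str) -> tuple[int, float]:
    """Return (filler_count, words_per_minute) for one speaker."""
    own = [s for s in segments if s.speaker == speaker]
    tokens = [w for s in own for w in words(s.text)]
    speaking_seconds = sum(s.end - s.start for s in own)
    filler_count = sum(1 for w in tokens if w in FILLERS)
    wpm = len(tokens) / (speaking_seconds / 60) if speaking_seconds > 0 else 0.0
    return filler_count, wpm


def turn_taking(segments: list[Segment], speaker: str) -> tuple[float, float]:
    """Return (share_of_speaking_time, interruption_ratio) for one speaker.

    Overlapping speech is counted once per speaker in the total, and the
    ratio is this speaker's interruptions divided by all interruptions.
    """
    ordered = sorted(segments, key=lambda s: s.start)
    total = sum(s.end - s.start for s in ordered)
    own = sum(s.end - s.start for s in ordered if s.speaker == speaker)
    by_me = by_others = 0
    for prev, cur in zip(ordered, ordered[1:]):
        # Assumed definition: a new speaker starts before the previous one ends.
        if cur.speaker != prev.speaker and cur.start < prev.end:
            if cur.speaker == speaker:
                by_me += 1
            else:
                by_others += 1
    interruptions = by_me + by_others
    share = own / total if total else 0.0
    ratio = by_me / interruptions if interruptions else 0.0
    return share, ratio
```

Raw numbers like these would then feed the LLM-as-a-Judge alongside the transcript, so the judge can reason about semantics while the counts stay deterministic.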

How we built it
- Frontend: Base-44 + TailwindCSS for the web dashboard, Lens Studio + Spectacles for AR HUD.
- Backend: Python + TypeScript orchestrating LangChain pipelines.
- AI Agents: Gemini (used as an LLM-as-a-Judge) with LangChain for a multi-agent evaluation framework, plus Base-44 to decide which feedback to show in real time and how to rate sessions from 1–5.
- Speech & Analysis: Google Cloud for speech-to-text/diarization, Gemini API for semantic insights and empathetic tone detection.
- Storage: MySQL for session history, scores, and progress trends, with privacy controls for healthcare compliance.

Challenges we ran into
- Designing a UI minimal enough for AR glasses while still giving actionable cues.
- Balancing real-time agentic feedback with latency and user privacy, critical in healthcare.
- Creating meaningful scores from messy, natural conversations.
- Integrating Fetch.ai, Base-44, LangChain, and Gemini API into a seamless, low-latency pipeline.

Accomplishments that we're proud of
- Built a working HUD that provides only micro-feedback, reducing anxiety instead of adding distraction.
- Designed a visual dashboard that feels intuitive, with trend graphs and breakdowns useful for both personal growth and therapeutic settings.
- Got the multi-agent flow working: ASR → analysis → agent reasoning → web visualization.
- Created a modular scoring framework (each category scored 1–5) that can be reused in healthcare, education, and enterprise training.
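
One way such a modular scoring framework could look is sketched below in Python: each category carries its own cut points that map a raw metric onto a 1–5 score, and the score drives the HUD indicator color. The cut-point values, the `Category` class, and the intermediate "yellow" state are assumptions for illustration; the write-up only states that categories are scored 1–5 and that indicators shift from green to red.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    """A scoring category with four ascending cut points (hypothetical values)."""
    name: str
    cuts: tuple[float, float, float, float]
    higher_is_better: bool = True

    def score(self, value: float) -> int:
        """Map a raw metric value onto a 1-5 score."""
        band = bisect_right(self.cuts, value)  # 0..4, counting cut points <= value
        return band + 1 if self.higher_is_better else 5 - band


def indicator_color(score: int) -> str:
    """Translate a 1-5 score into an indicator color (yellow is an assumed midpoint)."""
    if score >= 4:
        return "green"
    if score == 3:
        return "yellow"
    return "red"


# Example category: fillers per minute, where fewer is better.
FILLER_RATE = Category("Filler & Fluency", (1.0, 3.0, 5.0, 8.0), higher_is_better=False)
```

Because each category only needs a name, cut points, and a direction, a new domain (say, enterprise training) could swap in its own thresholds without touching the scoring logic.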

What we learned
- AR UI design forces you to be concise: sometimes two words are better than two sentences.
- Communication health requires combining raw metrics (WPM, interruptions) with higher-level semantic signals (empathy, acknowledgment).
- Multi-agent AI needs guardrails to avoid overwhelming or stressing the user.
- Building for Spectacles + Lens Studio gave us new insight into designing for everyday wellness and healthcare wearables.
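
The guardrail idea mentioned above can be made concrete with a small cue throttle. The Python sketch below is one possible approach, not the project's implementation: the `CueThrottle` class and its default intervals are hypothetical, and the caller passes in the current time so the logic stays easy to test.

```python
from typing import Optional


class CueThrottle:
    """Limit how often HUD cues are shown (hypothetical guardrail)."""

    def __init__(self, min_gap_s: float = 10.0, repeat_cooldown_s: float = 60.0):
        self.min_gap_s = min_gap_s                   # minimum gap between any two cues
        self.repeat_cooldown_s = repeat_cooldown_s   # minimum gap before repeating a cue
        self._last_any: Optional[float] = None
        self._last_by_cue: dict[str, float] = {}

    def allow(self, cue: str, now: float) -> bool:
        """Return True and record the cue if it may be shown at time `now` (seconds)."""
        if self._last_any is not None and now - self._last_any < self.min_gap_s:
            return False
        last_same = self._last_by_cue.get(cue)
        if last_same is not None and now - last_same < self.repeat_cooldown_s:
            return False
        self._last_any = now
        self._last_by_cue[cue] = now
        return True
```

With the defaults, a speaker who keeps talking fast would see “Take 3s pause” at most once a minute, and never two different cues within ten seconds of each other.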

What's next for ConvoSenses
- Add gamification to encourage continued practice, reducing long-term anxiety.
- Expand multilingual support with Gemini API for global access.
- Pilot in therapy and wellness programs to support patients with social anxiety or speech challenges.
- Scale backend with Google Cloud + MySQL for larger deployments in classrooms, healthcare, and enterprise.
