Inspiration

Every year, millions of people apply for US visas, and the interview is the most nerve-wracking part of the process. A single poorly worded answer can lead to denial, affecting dreams of education, work, or family reunification. I noticed that while there are plenty of written guides, I couldn't find a realistic way to practice the actual voice interview with immediate feedback.

I wanted to build something that could help applicants walk into their visa interview feeling prepared and confident. The idea was simple: create an AI consular officer that conducts realistic voice interviews and provides actionable feedback to improve your chances of approval.

What it does

Prepared is an AI-powered voice interview practice platform that simulates real US visa interviews. Here's how it works:

  1. Personalized Setup: Users complete onboarding with their visa type (F-1, H-1B, B-1/B-2, etc.), country, field of study/work, and other details
  2. Voice-First Interview: An AI consular officer conducts a realistic interview through natural voice conversation - no typing, just like the real thing
  3. Analysis: After the interview, users receive detailed feedback including:
    • Approval likelihood percentage
    • Performance breakdown (clarity, confidence, specificity, return intent)
    • Red flags identified
    • Strengths recognized
    • Actionable recommendations

How I built it

Tech Stack:

  • Frontend: React + Vite, TailwindCSS, ElevenLabs React SDK
  • Backend: Node.js/Express, Google Cloud Run
  • AI: ElevenLabs Conversational AI + Google Gemini 1.5 Flash
  • Database: Google Cloud Firestore
  • Auth: Firebase Authentication
  • Hosting: Firebase Hosting (frontend), Cloud Run (backend)

The Technical Challenge - Dual AI Integration:

The most interesting part was getting ElevenLabs and Gemini to work together seamlessly:

  1. Custom LLM Endpoint: I built an OpenAI-compatible /chat/completions endpoint that acts as a bridge between ElevenLabs and Gemini
  2. Server-Sent Events (SSE): Implemented streaming responses so ElevenLabs can synthesize speech in real time as Gemini generates text
  3. Conversation History Management: Each interview maintains context by building proper conversation history in Gemini's chat format
  4. Dynamic System Prompts: The AI interviewer's behavior is customized based on visa type, country-specific concerns, and user profile
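
As a rough illustration of the streaming step, the sketch below formats generated text as OpenAI-style chat-completion chunks sent over SSE. The `data: <json>` framing, the `chat.completion.chunk` object type, and the closing `data: [DONE]` line follow the OpenAI streaming format; the function names `toSseEvent` and `endOfStreamEvents` and the id and model values are hypothetical and not taken from the project's code.

```typescript
// Shape of one streamed chunk in the OpenAI chat-completions format.
interface ChatCompletionChunk {
  id: string;
  object: "chat.completion.chunk";
  created: number; // Unix timestamp in seconds
  model: string;
  choices: {
    index: number;
    delta: { role?: "assistant"; content?: string };
    finish_reason: "stop" | null;
  }[];
}

// Wrap one piece of generated text as a single SSE event.
// (Hypothetical helper: the name and parameters are illustrative.)
function toSseEvent(text: string, id: string, model: string, created: number): string {
  const chunk: ChatCompletionChunk = {
    id: id,
    object: "chat.completion.chunk",
    created: created,
    model: model,
    choices: [{ index: 0, delta: { content: text }, finish_reason: null }],
  };
  // An SSE event is a "data:" line followed by a blank line.
  return "data: " + JSON.stringify(chunk) + "\n\n";
}

// Events that close the stream: an empty delta with finish_reason "stop",
// then the literal [DONE] sentinel.
function endOfStreamEvents(id: string, model: string, created: number): string[] {
  const last: ChatCompletionChunk = {
    id: id,
    object: "chat.completion.chunk",
    created: created,
    model: model,
    choices: [{ index: 0, delta: {}, finish_reason: "stop" }],
  };
  return ["data: " + JSON.stringify(last) + "\n\n", "data: [DONE]\n\n"];
}
```

In an Express handler, each returned string would be passed to `res.write(...)` after setting the `Content-Type: text/event-stream` header, so the client receives text as soon as it is generated.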

Real-time Audio Visualization: I implemented custom Web Audio API processing to detect when the user or AI is speaking, showing animated blue waves for AI and green waves for the user. This creates an immersive, natural conversation experience.
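
One common way to detect speech from Web Audio data is to compute the RMS level of each analyser frame and compare it to a threshold. The sketch below shows that idea as pure functions. It assumes frames come from `AnalyserNode.getByteTimeDomainData`, which fills a `Uint8Array` with samples centered at 128. The names `rmsLevel` and `isSpeaking` and the 0.05 threshold are illustrative assumptions, not the project's actual values.

```typescript
// Assumed threshold on a 0..1 scale; a real app would tune this by ear.
const SPEAKING_THRESHOLD = 0.05;

// Root-mean-square level of one time-domain frame, normalized to roughly 0..1.
// Byte samples use 128 as the zero line, so we re-center before squaring.
function rmsLevel(frame: Uint8Array): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < frame.length; i++) {
    const centered = (frame[i] - 128) / 128;
    sumSquares += centered * centered;
  }
  return Math.sqrt(sumSquares / frame.length);
}

// True when the frame is loud enough to count as speech under the threshold.
function isSpeaking(frame: Uint8Array, threshold: number = SPEAKING_THRESHOLD): boolean {
  return rmsLevel(frame) > threshold;
}
```

In the browser, a `requestAnimationFrame` loop could call `analyser.getByteTimeDomainData(frame)` on one analyser per audio source (microphone and AI playback) and use `isSpeaking(frame)` to choose which wave animation to show.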

Smart Decision-Making: After each response, the AI evaluates:

  • Question count (minimum 3, target 5, maximum 8)
  • Information gathered (study/work plans, funding, return intent)
  • Red flags detected (immigrant intent, weak ties, vague answers)
  • When to end (clear red flag → 1 more question → decision)
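
To make these rules concrete, here is one way they could be written as a plain function. In the project the decision is made by the AI through its prompt, so this is only an illustrative restatement, not the actual implementation. The type and field names are hypothetical, and the ordering of the rules (for example, that the 3-question minimum takes priority over a red flag) is an assumption.

```typescript
type NextAction = "ask_question" | "final_question" | "decide";

// Hypothetical snapshot of the interview so far.
interface InterviewState {
  questionsAsked: number;
  requiredInfoGathered: boolean; // plans, funding, and return intent all covered
  redFlagDetected: boolean;      // immigrant intent, weak ties, vague answers
  questionsSinceRedFlag: number; // follow-ups asked after the first clear red flag
}

const MIN_QUESTIONS = 3;
const TARGET_QUESTIONS = 5;
const MAX_QUESTIONS = 8;

function nextAction(state: InterviewState): NextAction {
  // Hard cap: never go past the maximum.
  if (state.questionsAsked >= MAX_QUESTIONS) return "decide";
  // Assumption: always ask the minimum number of questions first.
  if (state.questionsAsked < MIN_QUESTIONS) return "ask_question";
  // Clear red flag: one more question, then decide.
  if (state.redFlagDetected) {
    return state.questionsSinceRedFlag >= 1 ? "decide" : "final_question";
  }
  // Normal path: stop at the target once the key information is in.
  if (state.requiredInfoGathered && state.questionsAsked >= TARGET_QUESTIONS) {
    return "decide";
  }
  return "ask_question";
}
```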

Challenges I ran into

The biggest challenge was the ElevenLabs-Gemini integration. Getting two AI systems to communicate wasn't straightforward: ElevenLabs expects the OpenAI message format, while Gemini uses its own. I had to:

  • Convert between message formats
  • Handle streaming properly with SSE
  • Manage conversation context across both systems
  • Debug CORS issues between frontend/backend/ElevenLabs
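
The format conversion can be sketched roughly as follows. OpenAI-style messages use the roles `system`, `user`, and `assistant` with a `content` string, while Gemini's chat format uses `user` and `model` turns with a `parts` array and takes the system instruction separately. The function name `toGemini`, the `systemText` field, and the choice to merge consecutive same-role messages are assumptions of this sketch, not details from the project.

```typescript
interface OpenAiMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface GeminiContent {
  role: "user" | "model";
  parts: { text: string }[];
}

// Hypothetical container for the converted conversation.
interface GeminiConversation {
  systemText: string;        // passed to Gemini as the system instruction
  contents: GeminiContent[]; // chat history in Gemini's turn format
}

function toGemini(messages: OpenAiMessage[]): GeminiConversation {
  const systemTexts: string[] = [];
  const contents: GeminiContent[] = [];
  for (const message of messages) {
    // System messages are not chat turns in Gemini; collect them separately.
    if (message.role === "system") {
      systemTexts.push(message.content);
      continue;
    }
    const role = message.role === "assistant" ? "model" : "user";
    const last = contents[contents.length - 1];
    if (last && last.role === role) {
      // Assumption: merge consecutive same-role messages so turns alternate.
      last.parts.push({ text: message.content });
    } else {
      contents.push({ role: role, parts: [{ text: message.content }] });
    }
  }
  return { systemText: systemTexts.join("\n\n"), contents: contents };
}
```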

Getting the AI to behave realistically required extensive prompt engineering with:

  • Precise question style guidelines (5-10 words, complete sentences)
  • Decision-making logic after each response
  • Question count enforcement
  • Mandatory ending statements (approval/denial)
  • Country and visa-type specific context
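
A minimal sketch of how such a prompt might be assembled from the user's profile is shown below. The `ApplicantProfile` fields, the `buildSystemPrompt` name, and the exact prompt wording are all hypothetical; only the constraints themselves (5-10 word questions, 3 to 8 questions with a target of 5, a mandatory approval or denial statement) come from the description above.

```typescript
// Hypothetical subset of the onboarding data.
interface ApplicantProfile {
  visaType: string; // e.g. "F-1", "H-1B", "B-1/B-2"
  country: string;
  field: string;    // field of study or work
}

// Build the interviewer's system prompt from the applicant's profile.
function buildSystemPrompt(profile: ApplicantProfile): string {
  const lines: string[] = [
    "You are a US consular officer conducting a " + profile.visaType + " visa interview.",
    "The applicant is from " + profile.country + ". Field of study or work: " + profile.field + ".",
    "Ask one question at a time, as a complete sentence of 5 to 10 words.",
    "Ask at least 3 and at most 8 questions, aiming for 5.",
    "After each answer, decide whether to continue or to conclude.",
    "Always end with an explicit approval or denial statement.",
  ];
  return lines.join("\n");
}
```

Keeping the prompt as a list of short rules makes it easier to add or adjust visa-specific and country-specific lines without rewriting the whole prompt.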

Accomplishments that I'm proud of

  1. Seamless Voice Experience: The interview feels natural - no typing, no buttons, just conversation. The audio visualization adds to the immersion.

  2. Technical Sophistication: Successfully orchestrated two complex AI systems (ElevenLabs + Gemini) to work as one unified interviewer with custom LLM integration and SSE streaming.

  3. Attention to Detail: The AI interviewer mirrors the phrasing of a real consular officer, from "Good morning, please give your passport" to the final approval/denial statement.

  4. Impact Potential: This addresses a real need. 10M+ visa interviews happen annually, and this tool could genuinely help applicants prepare better.

What's next for Prepared

My long-term plan is:

  1. Re-enable Practice Mode with real-time coaching where the AI pauses mid-interview to explain red flags and suggest better ways to answer.
  2. Build mobile apps for iOS and Android so people can practice anywhere, anytime.
  3. Add custom scenario practice for challenging situations like previous visa denials, funding gaps, or gap years in education.
  4. Partner with universities and immigration consultants to make this an official preparation tool for their students and clients.

My goal with Prepared is simple: make visa interviews less intimidating, and help qualified applicants present their cases clearly and get approved.
