Inspiration
MoodShaker started with a simple (and very common) feeling: we want a great cocktail at home—but not the bar tab, not the “what do I even order?” panic, and definitely not ten browser tabs of recipes that read like homework.
At the same time, cocktails have become more than “a drink.” They’re a moment—a little ritual that’s aesthetic, shareable, and personal. We found that many people want to make cocktails at home, and younger drinkers especially want recommendations that match their mood and taste, not just generic “top drinks.” But the current DIY experience is messy: jargon, confusing ratios, too many tools, and too much searching.
So we asked: what if learning cocktails felt like talking to a friend behind the bar—someone who listens, surprises you with something that fits, and teaches you step-by-step?
What it does
MoodShaker is a cast of real-time voice bartenders designed for non-professionals.
- Pick a bartender persona (beginner → curious → advanced), based on tools/effort level.
- Talk naturally (2–3 prompts) to capture vibe + constraints (mood, taste, safety/allergies).
- Get three mood-driven recommendations as clear, scannable cards.
- Choose one and follow a guided making animation you can pause anytime for exact ratios + tips.
- Missing ingredients? MoodShaker suggests substitutions so you can still make your version.
- Rate/save/share, and MoodShaker learns so next time it asks fewer questions and recommends better.
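The substitution step above can be sketched as a simple lookup against the user's pantry. The ingredient table and helper name here are illustrative assumptions, not MoodShaker's actual data:

```python
# Hypothetical substitution table: maps a missing ingredient to ranked fallbacks.
SUBSTITUTIONS = {
    "triple sec": ["cointreau", "orange juice + simple syrup"],
    "simple syrup": ["honey syrup", "agave syrup"],
    "angostura bitters": ["orange bitters"],
}

def suggest_substitutions(recipe_ingredients, pantry):
    """For each recipe ingredient the user is missing, propose known fallbacks."""
    suggestions = {}
    for ingredient in recipe_ingredients:
        if ingredient not in pantry:
            suggestions[ingredient] = SUBSTITUTIONS.get(ingredient, [])
    return suggestions

print(suggest_substitutions(
    ["gin", "simple syrup", "lemon juice"],
    pantry={"gin", "lemon juice", "honey syrup"},
))
# → {'simple syrup': ['honey syrup', 'agave syrup']}
```

In the real product the substitution list would come from the model rather than a static table, but the contract is the same: missing ingredients in, ranked alternatives out.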
How we built it
System design
We built MoodShaker end-to-end on Gemini, combining voice, tool calling, and structured outputs.
Flow: Conversation → Signals → Tool Call → Cards → Selection → Steps → Rating → Memory
- Gemini Live runs a bidirectional real-time voice session (mic audio in, bartender voice out).
- Each bartender persona has a clear personality + tier, and its own distinct voice—so the cast genuinely feels different, not just “reskinned.”
- During natural conversation, Gemini extracts lightweight but high-signal context: mood/state; occasion + setting (e.g., celebrating vs. winding down, hosting vs. solo, at-home vibe/weather); taste preferences; and allergies/dietary constraints. These signals—together with the selected persona’s personality/tier—drive both the recommendations and how we teach the drink.
- When context is sufficient, Gemini triggers tool calls:
  - get_recommendations → returns 3 recommendation cards
  - select_cocktail → generates step-by-step mixing instructions mapped to animation/video beats
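As a sketch, the two tools can be declared in the JSON-schema style that Gemini function calling accepts. The parameter fields here are illustrative assumptions, not MoodShaker's exact schema:

```python
# Hypothetical function declarations for the two tools, in the JSON-schema
# shape used by Gemini function calling. Property names are illustrative.
GET_RECOMMENDATIONS = {
    "name": "get_recommendations",
    "description": "Return 3 cocktail cards matching the user's mood, setting, taste, and constraints.",
    "parameters": {
        "type": "object",
        "properties": {
            "mood": {"type": "string"},
            "setting": {"type": "string"},
            "taste": {"type": "array", "items": {"type": "string"}},
            "allergies": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["mood"],
    },
}

SELECT_COCKTAIL = {
    "name": "select_cocktail",
    "description": "Generate step-by-step mixing instructions for the chosen card.",
    "parameters": {
        "type": "object",
        "properties": {"cocktail_id": {"type": "string"}},
        "required": ["cocktail_id"],
    },
}

# Passed to the Live session config so the model can request either tool.
TOOLS = [{"function_declarations": [GET_RECOMMENDATIONS, SELECT_COCKTAIL]}]
```

The model decides when it has gathered enough conversational signal to emit one of these calls; the server executes it and streams the result back into the session.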
Structured output + tiers
To keep recommendations consistent per persona, we use structured output with Pydantic schemas + enum constraints. This ensures each bartender can only recommend cocktails within its tier (tools/complexity), preventing “advanced-only” recipes from leaking into the beginner experience.
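A minimal sketch of that tier constraint, assuming Pydantic is available. The cocktail names and field names are placeholders; the real per-tier catalog is larger:

```python
from enum import Enum
from pydantic import BaseModel, ValidationError

class Tier(str, Enum):
    BEGINNER = "beginner"
    CURIOUS = "curious"
    ADVANCED = "advanced"

# Hypothetical beginner-tier allow-list; the enum is what stops the model
# from emitting an out-of-tier drink in structured output.
BeginnerCocktails = Enum("BeginnerCocktails", {
    "GIN_TONIC": "gin & tonic",
    "SCREWDRIVER": "screwdriver",
})

class BeginnerCard(BaseModel):
    name: BeginnerCocktails   # enum constraint: beginner tier only
    tier: Tier
    why_it_fits: str          # mood/taste rationale shown on the card
    tools: list[str]

card = BeginnerCard(name="gin & tonic", tier="beginner",
                    why_it_fits="bright and low-effort", tools=["glass", "spoon"])

try:
    # An advanced-only recipe fails validation instead of leaking through.
    BeginnerCard(name="ramos gin fizz", tier="beginner",
                 why_it_fits="...", tools=["shaker"])
except ValidationError:
    print("rejected: outside beginner tier")
```

Because the schema is enforced at parse time, a mis-tiered recommendation becomes a validation error on the server rather than a confusing card in the beginner UI.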
Conceptually, we treat recommendations as a constrained scoring problem:
$$ \text{score}(c)=w_m\,\text{match}_{\text{mood}}(c)+w_s\,\text{match}_{\text{setting}}(c)+w_t\,\text{match}_{\text{taste}}(c)+w_i\,\text{match}_{\text{inventory}}(c)-w_a\,\text{penalty}_{\text{allergy}}(c) $$
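A stdlib-only sketch of that scoring rule; the weights and the Jaccard-overlap match functions are illustrative placeholders, not tuned values:

```python
# Illustrative weights; the allergy weight is large so any allergen
# effectively disqualifies the drink.
WEIGHTS = {"mood": 0.35, "setting": 0.2, "taste": 0.25, "inventory": 0.2, "allergy": 10.0}

def overlap(a, b):
    """Jaccard overlap between two tag sets, as a stand-in for match(c)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def score(cocktail, signals):
    """score(c) = w_m*match_mood + w_s*match_setting + w_t*match_taste
                + w_i*match_inventory - w_a*penalty_allergy"""
    s = (WEIGHTS["mood"] * overlap(cocktail["moods"], signals["moods"])
         + WEIGHTS["setting"] * overlap(cocktail["settings"], signals["settings"])
         + WEIGHTS["taste"] * overlap(cocktail["tastes"], signals["tastes"])
         + WEIGHTS["inventory"] * overlap(cocktail["ingredients"], signals["inventory"]))
    if set(cocktail["ingredients"]) & set(signals["allergies"]):
        s -= WEIGHTS["allergy"]
    return s
```

Ranking the catalog by this score and taking the top three yields the recommendation cards, with the allergy term guaranteeing unsafe drinks never surface.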
Backend sketch (tool-calling loop)
```python
# Pseudocode illustrating the tool-calling pipeline
def on_live_session(user_audio_stream):
    # 1) Gemini Live handles real-time conversation
    signals = extract_signals_from_conversation()  # mood/state, occasion+setting, taste, allergies

    if gemini_requests("get_recommendations"):
        cards = gemini_flash_structured(
            schema=RecommendationSchema,  # Pydantic + enums per persona tier
            context=signals,
        )
        return cards

    if gemini_requests("select_cocktail"):
        steps = gemini_flash_structured(
            schema=StepByStepSchema,
            context=selected_card + signals,
        )
        return steps

def on_rating(rating, session_context):
    update_user_profile(session_context, rating)  # fewer questions next time, better picks
```
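One possible shape for the memory step, keeping a running taste profile so later sessions can skip questions. The field names and the 1–5 rating scale are assumptions for illustration:

```python
# Hypothetical in-memory profile store keyed by user id.
PROFILES: dict[str, dict] = {}

def update_user_profile(session_context, rating):
    """Fold one rated session into the user's taste profile."""
    profile = PROFILES.setdefault(session_context["user_id"],
                                  {"taste_scores": {}, "sessions": 0})
    # Positive ratings reinforce the chosen drink's tastes; negative ones demote them.
    direction = 1 if rating >= 4 else -1  # assumes a 1-5 rating scale
    for taste in session_context["chosen_tastes"]:
        scores = profile["taste_scores"]
        scores[taste] = scores.get(taste, 0) + direction
    profile["sessions"] += 1
    return profile
```

On the next session, high-scoring tastes can pre-fill the taste signal, which is what lets the bartender ask fewer questions over time.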
UX + visuals
We designed the product to feel low-friction and playful:
- Minimal prompts, friend-like tone (while matching each persona’s personality)
- Cards that summarize the drink at a glance
- Guided animation with pause-to-see-details
- Figma motion planning + screen recording for the demo
Challenges we ran into
- Making voice feel human (not like a quiz): We learned that fewer prompts and better pacing beat “smarter questions.” Keeping it to 2–3 prompts made the experience feel friendly.
- Balancing creativity with control: Cocktails need personality, but also correctness (ratios/tools/safety). Structured output + tier constraints kept the UX coherent.
- Explaining the system quickly: Hackathon time is brutal, so we focused on a clean narrative: talk → cards → guided making → share/save, then a minimal “under the hood” icon diagram.
Accomplishments that we're proud of
- Real-time, bidirectional voice bartender: built a truly voice-first experience on Gemini Live — mic audio in, bartender voice out — so it feels like a conversation, not a form.
- Low-friction UX with high-quality results: simplified the flow into a short chat (2–3 prompts) that reliably produces three clear recommendations.
- Full demo under tight time: delivered a polished end-to-end prototype (UI + screen recording + motion planning + voiceover) that clearly communicates both impact and technical depth.
What we learned
- Gemini Live (real-time voice conversation)
- Gemini Flash (structured recommendations + step generation)
- Server-side tool calling (get_recommendations, select_cocktail)
- Python + Pydantic (schema + enum constraints)
- Figma (UI + motion planning)
What's next for MoodShaker
- Voice UX, fully conversational: push “UX inside the dialogue” — fewer UI steps, smarter prompts, and a bartender that guides naturally through voice.
- Better mixing animations: keep polishing the modular animation system so steps feel smoother, more dynamic, and less repetitive across drinks.
- Mobile app version for sharing: build a mobile experience optimized for capturing the moment — take a photo after making, then generate a branded, same-style share card that matches MoodShaker’s look and feel.
Built With
- fastapi
- gemini-2.5-flash-native-audio-preview
- gemini-3-flash-preview
- geminiflash
- geminiliveapi
- googlecloudbuild
- googlecloudrun
- python
- react18
- serversidefunctioncalling
- tailwindcss
- typescript
- vite
- webaudioapi