Hey builders! Stop typing, and start interacting! We are moving beyond the text box. The future isn't about just chatting with AI; it's about immersive, real-time experiences. To celebrate the power of multimodal AI, we're challenging you to build the next generation of agents that can help you see, hear, speak, and create in the Gemini Live Agent Challenge.
Requirements
What to Build
Entrants must develop a NEW next-generation AI Agent that utilizes multimodal inputs and outputs and moves beyond simple text-in/text-out interactions. Projects should leverage Google's Live API with the creative power of video/image generation to solve complex problems or create entirely new user experiences within one of these three categories:
- Live Agents
  - Focus: Real-time Interaction (Audio/Vision).
  - Build an agent that users can talk to naturally and that can be interrupted. This could be a real-time translator, a vision-enabled customized tutor that "sees" your homework, or a customer support voice agent that handles interruptions gracefully.
  - Mandatory Tech: Must use the Gemini Live API or ADK. The agents must be hosted on Google Cloud.
- Creative Storyteller
  - Focus: Multimodal Storytelling with Interleaved Output.
  - Build an agent that thinks and creates like a creative director, seamlessly weaving together text, images, audio, and video in a single, fluid output stream. Leverage Gemini's native interleaved output to generate rich, mixed-media responses that combine narration with visuals, explanations with generated imagery, or storyboards with voiceover, all in one cohesive flow. Examples include interactive storybooks (text + generated illustrations inline), a marketing asset generator (copy + visuals + video in one go), educational explainers (narration woven with diagrams), and a social content creator (caption + image + hashtags together).
  - Mandatory Tech: Must use Gemini's interleaved/mixed output capabilities. The agents must be hosted on Google Cloud.
- UI Navigator
  - Focus: Visual UI Understanding & Interaction.
  - Build an agent that becomes the user's hands on screen. The agent observes the browser or device display, interprets visual elements with or without relying on APIs or DOM access, and performs actions based on user intent. Examples include a universal web navigator, a cross-application workflow automator, or a visual QA testing agent.
  - Mandatory Tech: Must use Gemini multimodal capabilities to interpret screenshots/screen recordings and output executable actions. The agents must be hosted on Google Cloud.
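For the UI Navigator category, "output executable actions" implies that whatever the model proposes should be checked before anything is clicked or typed. Here is a minimal sketch of that validation step. The JSON shape, the action names (`click`, `type`, `scroll`), and the `UIAction` and `parse_action` names are all hypothetical illustrations, not part of the Gemini API or the challenge rules. The model call itself is omitted.

```python
import json
from dataclasses import dataclass

# Hypothetical action vocabulary; not defined by the challenge or by Gemini.
ALLOWED_ACTIONS = {"click", "type", "scroll"}


@dataclass(frozen=True)
class UIAction:
    kind: str
    x: int
    y: int
    text: str = ""


def parse_action(raw: str, screen_width: int, screen_height: int) -> UIAction:
    """Validate a JSON action string (assumed to come from the model)."""
    data = json.loads(raw)
    kind = data.get("action")
    if kind not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {kind!r}")
    x, y = int(data["x"]), int(data["y"])
    # Reject coordinates that fall outside the captured screenshot.
    if not (0 <= x < screen_width and 0 <= y < screen_height):
        raise ValueError(f"coordinates ({x}, {y}) are off-screen")
    return UIAction(kind=kind, x=x, y=y, text=str(data.get("text", "")))


if __name__ == "__main__":
    sample = '{"action": "click", "x": 120, "y": 48}'
    print(parse_action(sample, screen_width=1280, screen_height=720))
```

Rejecting unknown actions and off-screen coordinates up front is one simple way to keep a misread screenshot from turning into an unintended click.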
All projects MUST:
- Leverage a Gemini model
- Be built using either the Google GenAI SDK or ADK (Agent Development Kit)
- Use at least one Google Cloud service
What to Submit
- Text Description: A summary of the Project's features and functionality, technologies used, information about any other data sources used, and your findings and learnings as you worked through the project.
- URL to your Public Code Repository: Let us see how you built it!
  - Include spin-up instructions in your README so the judges can see that your project is reproducible.
- Proof of Google Cloud Deployment: Provide proof (separate from your demo) that your Project's backend is running on Google Cloud. Proof can be either (1) a quick screen recording that shows the behind-the-scenes of your app running on GCP (e.g., console logs or the console view of a deployment) or (2) a link to a code file in your code repo that demonstrates use of Google Cloud services and APIs (e.g., API calls to Vertex AI endpoints).
- Architecture Diagram: A clear visual representation of your system (e.g., how Gemini connects to your backend, database, and frontend).
  - Pro tip: Add this to the file upload or image carousel so it's easy for judges to find!
- Demonstration Video:
  - Is under 4 minutes long
  - Demos your multimodal/agentic features working in real time (no mockups)
  - Pitches your project: what problem did you solve, and what value does your solution bring?
For bonus points, you can optionally do any or all of the following:
- Publish a piece of content (blog, podcast, video) covering how the project was built with Google AI models and Google Cloud. You must include language that says you created the piece of content for the purposes of entering this hackathon. When sharing on social media, use the hashtag #GeminiLiveAgentChallenge.
- Prove you automated your Cloud deployment using scripts or infrastructure-as-code tools. This code must be included in your public repository.
- Sign up for a Google Developer Group and provide a link to your public GDG profile.
Prizes
Grand Prize
• $25,000 in USD
• $3,000 in Google Cloud Credits for use with a Cloud Billing Account
• Virtual Coffee with a Google Team Member
• Social Promo
• Maximum of two (2) Google Cloud Next 2026 conference tickets for two (2) teammates (April 22-24, 2026) (Value: $2,299 each)
• Maximum of two (2) travel stipends for airfare and hotel to Google Cloud Next 2026 in Las Vegas, NV for two (2) teammates (maximum of $3,000 USD each)
• Opportunity to demo your Project in a Google Cloud Next 2026 presentation (additional requirements in Official Rules)
Best of Live Agents
• $10,000 in USD
• $1,000 in Google Cloud Credits for use with a Cloud Billing Account
• Virtual Coffee with a Google Team Member
• Social Promo
• Maximum of two (2) Google Cloud Next 2026 conference tickets for two (2) teammates (April 22-24, 2026) (Value: $2,299 each)
Best of Creative Storytellers
• $10,000 in USD
• $1,000 in Google Cloud Credits for use with a Cloud Billing Account
• Virtual Coffee with a Google Team Member
• Social Promo
• Maximum of two (2) Google Cloud Next 2026 conference tickets for two (2) teammates (April 22-24, 2026) (Value: $2,299 each)
Best of UI Navigators
• $10,000 in USD
• $1,000 in Google Cloud Credits for use with a Cloud Billing Account
• Virtual Coffee with a Google Team Member
• Social Promo
• Maximum of two (2) Google Cloud Next 2026 conference tickets for two (2) teammates (April 22-24, 2026) (Value: $2,299 each)
Best Multimodal Integration & User Experience
• $5,000 in USD
• $500 in Google Cloud Credits for use with a Cloud Billing Account
Best Technical Execution & Agent Architecture
• $5,000 in USD
• $500 in Google Cloud Credits for use with a Cloud Billing Account
Best Innovation & Thought Leadership
• $5,000 in USD
• $500 in Google Cloud Credits for use with a Cloud Billing Account
Honorable Mentions
• $2,000 in USD
• $500 in Google Cloud Credits for use with a Cloud Billing Account
Judges
Abhijeet Rajwade
Senior Customer Engineer, AI Infra
Abhishek Dharmaratnakar
Staff Software Engineer, YouTube Premium & AI Labs
Alex Moore
Head of AI Infrastructure Customer Engineering EMEA
Alexandre Debargis
Customer Engineer, Data Analytics
Amar Muni
Principal Architect
Ankit Virmani
Customer Engineer - AI Infra
Annie Wang
Software Engineer
Anuj Shah
Senior Software Engineer
Anushree Sinha
Software Engineer, GenAI Search Notifications
Ayo Adedeji
Developer Relations Engineer
Chloe Gaudreau
Customer Engineer, AI ML
Christina Lin
Developer Relations Engineering Manager
Debanshu Das
Senior Software Engineer
Kaz Sato
Staff Developer Advocate
Kent Hua
Customer Engineer Specialist, Apps
Kevin Lamenzo
Technical Writer
Khushan Adatiya
Senior Software Engineer
Manas Srivastava
Customer Engineer
Miguel Leon
Outcome Customer Engineer
Nicolas Fadli
Customer Engineer, Platform
Nitin Soni
Consulting account lead
Nivedita Kumari
Data Analytics Customer Engineer
Olivier Bourgeois
Developer Relations Engineer
Pritam Pal
Customer Engineer
Priya Pandey
Developer Relations Engineering Manager
Sagar Malla
AI/ML Outcome Customer Engineer
Sathya AG
Principal Architect, Retail
Shankar Athinarayanan
Customer Engineer - Platform
Shobhit Gupta
Solutions Architect
Shruti Dhumak
Head of Customer Engineering
Shub Shrivastava
Customer Engineer AI Infra/AppMod
Smitha Kolan
Developer Relations Engineer
Srivaths Ranganathan
Staff Software Engineer, YouTube
Subu Ramkumar
Customer Engineer
Sukesh Kumar
AI/ML Customer Engineer
Tom McGrath
Customer Engineering Manager
Uday Korat
Software Engineer
Urvish Pandya
Technical Program Manager
Yashesh Shroff
AI Infra Customer Engineer
Zaid Elkhateeb
Customer Engineer
Judging Criteria
- Innovation & Multimodal User Experience (40%): Does the project break the "text box" paradigm? Does the agent help "See, Hear, and Speak" in a way that feels seamless? Does it have a distinct persona/voice? Is the experience "Live" and context-aware, or does it feel disjointed and turn-based?
- Technical Implementation & Agent Architecture (30%): Does the code effectively utilize the Google GenAI SDK or ADK? Is the backend robustly hosted on Google Cloud? Is the agent logic sound? Does it handle errors gracefully? Does the agent avoid hallucinations? Is there evidence of grounding?
- Demo & Presentation (30%): Does the video define the problem and solution? Is the architecture diagram clear? Is there visual proof of Cloud deployment? Does the video show the actual software working?
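The three percentages combine into a single score as a weighted sum. The sketch below assumes, purely for illustration, that each criterion is rated on a 1-to-5 scale. The scale, the `weighted_score` helper, the criterion keys, and the sample ratings are assumptions, not details stated on this page, and bonus points are not modeled.

```python
# Criterion weights as listed in the judging criteria (40% / 30% / 30%).
# The dictionary keys are hypothetical short names for the three criteria.
WEIGHTS = {
    "innovation_ux": 0.40,
    "technical": 0.30,
    "demo": 0.30,
}


def weighted_score(ratings: dict[str, float]) -> float:
    """Return the weighted sum of per-criterion ratings."""
    if set(ratings) != set(WEIGHTS):
        raise ValueError("ratings must cover exactly the three criteria")
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Sample ratings on the assumed 1-to-5 scale.
    example = {"innovation_ux": 4, "technical": 5, "demo": 3}
    print(round(weighted_score(example), 2))
```

Because Innovation & Multimodal User Experience carries the largest weight, a one-point gain there moves the total more than a one-point gain in either of the other two criteria.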
