Blink AI is our hackathon submission: an autonomous home assistant that integrates voice, vision, and device control through MCP servers and A2A protocols, all containerized on GCR. At its core lies a custom Llama 3.1 8B model fine-tuned via Weave for sub-100 ms inference on home-automation dialogues.

Inspiration
We set out to build a home assistant that truly “just works,” freeing users from juggling multiple apps and manual routines. Drawing on real-world frustrations—delayed voice commands, fragmented device ecosystems, and brittle automations—we imagined a system that “listens” and “acts” as seamlessly as flipping a light switch, but from anywhere.

What it does
Blink AI captures natural-language voice commands (“Hey Blink, goodnight”) and:

Parses intent and parameters (devices, actions, schedules) via our fine-tuned Llama 3.1 8B model.

Routes commands over a low-latency MCP message bus to IoT devices.

Manages secure credentials and third-party services via Agent-to-Agent (A2A) flows.

Provides end-to-end containerized delivery on Google Container Registry, enabling instant deployment and autoscaling.
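The intent-parsing step above can be sketched as follows. This is a toy illustration: the model call is mocked with a canned response, and the names (`Intent`, `parse_command`) are illustrative, not Blink AI's actual API.

```python
import json
from dataclasses import dataclass

@dataclass
class Intent:
    """Structured result of parsing a voice command (illustrative schema)."""
    action: str
    devices: list
    schedule: object  # e.g. a schedule string, or None for "run now"

def parse_command(utterance: str) -> Intent:
    # In Blink AI this would be a call to the fine-tuned Llama 3.1 8B model;
    # here a canned JSON response stands in for the model output.
    model_output = json.dumps({
        "action": "scene.goodnight",
        "devices": ["lights.all", "thermostat", "locks.front_door"],
        "schedule": None,
    })
    payload = json.loads(model_output)
    return Intent(payload["action"], payload["devices"], payload["schedule"])

intent = parse_command("Hey Blink, goodnight")
print(intent.action)        # scene.goodnight
print(len(intent.devices))  # 3
```

The structured `Intent` is what gets handed to the messaging layer, so downstream device routing never has to re-interpret free-form text.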

How we built it
Model fine-tuning: Leveraged Weave’s distributed framework to perform parameter-efficient tuning of Llama 3.1 8B on a curated dataset of home-automation dialogues.
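Parameter-efficient tuning of this kind (e.g. LoRA) freezes the base weight matrix W and learns only a small low-rank update, so the effective weight is W + A·B. A stdlib toy of that idea, not the actual Weave training code:

```python
# Toy low-rank adaptation: the frozen base weight W is combined with a
# learned rank-1 update A @ B, so only A and B's few entries are trained.

def matmul(X, Y):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2x2 base weight
A = [[0.5], [0.0]]             # trainable 2x1 factor
B = [[0.0, 2.0]]               # trainable 1x2 factor

W_eff = add(W, matmul(A, B))   # effective weight = W + A @ B
print(W_eff)  # [[1.0, 1.0], [0.0, 1.0]]
```

For a d×d weight, the update costs 2·d·r parameters instead of d², which is why an 8B model can be tuned on hackathon-scale hardware.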

Messaging stack: Developed an MCP (Model Context Protocol) layer for sub-50 ms pub/sub communication; integrated A2A for OAuth-style credential handoffs.
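The pub/sub routing idea can be sketched with a minimal in-process bus carrying JSON frames. This is an assumption-laden toy (class and topic names are invented); the real layer runs over a network transport with proper MCP framing:

```python
import json
from collections import defaultdict

class Bus:
    """Minimal in-process pub/sub bus (illustrative; real transport differs)."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, payload: dict):
        # Serialize to a JSON frame, then deliver the decoded payload to
        # every handler registered on the topic.
        frame = json.dumps({"topic": topic, "payload": payload})
        for handler in self.subscribers[topic]:
            handler(json.loads(frame)["payload"])

received = []
bus = Bus()
bus.subscribe("device/lights", received.append)
bus.publish("device/lights", {"action": "off", "room": "bedroom"})
print(received)  # [{'action': 'off', 'room': 'bedroom'}]
```

Keeping payloads as plain JSON is what makes the later A2A integration possible, since both sides speak the same wire format.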

Containerization & deployment: Packaged inference, gateway, and front-end services into Docker images and pushed to GCR; orchestrated on Kubernetes.
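The build-and-push step can be sketched with standard Docker and kubectl commands; the project ID, image name, and manifest path below are placeholders, not Blink AI's actual registry paths.

```shell
# Placeholder project/image names; substitute your own GCP project.
docker build -t gcr.io/PROJECT_ID/blink-inference:v1 ./inference
docker push gcr.io/PROJECT_ID/blink-inference:v1

# Roll the new image out on the cluster (manifest path is illustrative).
kubectl apply -f k8s/inference.yaml
```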

Challenges we ran into
Fine-tuning instability: Initial hyperparameters caused mode collapse; debugging required multiple training runs and gradient-norm monitoring.
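Gradient-norm monitoring of the kind mentioned boils down to computing the global L2 norm of all gradients each step and flagging or clipping spikes. A stdlib toy, not the actual training loop:

```python
import math

def global_grad_norm(grads):
    """L2 norm over all gradient values (flattened)."""
    return math.sqrt(sum(g * g for g in grads))

def clip_by_norm(grads, max_norm):
    """Scale gradients down uniformly if their global norm exceeds max_norm."""
    norm = global_grad_norm(grads)
    if norm <= max_norm:
        return grads
    scale = max_norm / norm
    return [g * scale for g in grads]

grads = [3.0, 4.0]                 # global norm = 5.0
clipped = clip_by_norm(grads, 1.0)
print(global_grad_norm(clipped))   # ~1.0
```

Logging this norm per step is often enough to spot a run heading toward collapse before the loss curve shows it.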

Data curation: Generating realistic command–response pairs required using Claude to sanitize and expand traces—an iterative process that consumed significant time.
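In outline, the curation loop sanitizes each trace and drops duplicates. The regex redaction rule below is purely illustrative (the actual pipeline used Claude for sanitization), and all names are invented:

```python
import re

def sanitize(trace: str) -> str:
    # Redact anything that looks like an email address (illustrative rule;
    # the real pipeline used Claude for sanitization, not regexes).
    return re.sub(r"\S+@\S+", "[REDACTED]", trace)

def dedupe(traces):
    """Drop exact duplicates while preserving order."""
    seen, out = set(), []
    for t in traces:
        if t not in seen:
            seen.add(t)
            out.append(t)
    return out

raw = [
    "turn off the lights for alice@example.com",
    "turn off the lights for alice@example.com",
    "set thermostat to 68",
]
clean = dedupe(sanitize(t) for t in raw)
print(clean)
```

Sanitizing before deduplication matters: two traces that differ only in the redacted portion collapse into one example instead of inflating the dataset.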

Protocol integration: Harmonizing MCP’s JSON-RPC framing with A2A’s JSON-based exchanges introduced edge-case parsing bugs.

Accomplishments that we're proud of
Achieved sub-100 ms end-to-end inference latency on real hardware.

Demonstrated reliable multi-device orchestration: lights, thermostat, locks, and calendar invites—all from a single utterance.

Fully automated CI/CD pipeline pushing images to GCR, enabling one-click redeploys.

What we learned
Best practices for trace curation and dataset hygiene using Claude.

Designing agent-based reasoning loops for voice assistants.

Building resilient MCP messaging layers and secure A2A authentication flows.

What’s next for Blink AI
Vision integration: Add object-detection capabilities for camera-triggered automations.

Adaptive learning: Enable on-device personalization of the fine-tuned model via continual learning.

Open API: Publish a developer SDK so others can build Blink AI “skills.”
