Inspiration

We wanted to move beyond rigid, pre-programmed buttons and complex dashboards. Inspired by the seamless interaction of Jarvis from Iron Man, we set out to create a world where humans can command complex hardware as easily as speaking to a friend, using Higgs to bridge the gap between intent and action.

What it does

EchoControl is a voice-first orchestration layer that lets users operate robots using natural language. By combining high-accuracy Speech-to-Text (STT) with Higgs-driven tool calling, the system parses spoken commands, identifies the necessary robotic functions, and executes them in real time, turning "pick up that tool" into a precise physical maneuver.

How we built it

Intelligence: We used the Higgs framework to handle intent classification and tool-selection logic.

The Ear: A low-latency Speech-to-Text (STT) model captures and cleans the audio input.

The Bridge: We developed custom middleware that maps Higgs' JSON tool outputs to robotic API endpoints.

The Cloud: We call the Claude API to process tool calling within the orchestration layer.

The Hardware: Integrated via a standardized control protocol so the robot reacts to tool calls instantly.
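The Bridge layer above can be sketched in a few lines: take a tool-call object (name plus arguments) and translate it into a request for the robot's control API. Every tool name, endpoint path, and function here is illustrative, not our actual code.

```python
# Hypothetical registry mapping tool names to robot API endpoints.
ROBOT_ENDPOINTS = {
    "move_arm": "/api/arm/move",
    "grip": "/api/gripper/close",
    "release": "/api/gripper/open",
}

def dispatch_tool_call(tool_call: dict) -> dict:
    """Translate a tool-call dict (name + arguments) into a request
    payload for the robot's control API."""
    name = tool_call["name"]
    endpoint = ROBOT_ENDPOINTS.get(name)
    if endpoint is None:
        raise ValueError(f"Unknown tool: {name}")
    return {"endpoint": endpoint, "payload": tool_call.get("arguments", {})}

# Example: a tool call shaped like the model might emit.
req = dispatch_tool_call({"name": "grip", "arguments": {"force": 0.5}})
print(req)  # {'endpoint': '/api/gripper/close', 'payload': {'force': 0.5}}
```

Keeping this mapping as a plain lookup table means adding a new robot capability is one registry entry plus one tool description, with no changes to the dispatch logic.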

Challenges we ran into

The biggest hurdle was latency. Going from audio to text, then through a logic model, and finally to a mechanical move can feel sluggish. We had to optimize our Higgs prompt structures and stream the STT data to ensure the robot felt responsive rather than delayed. Eventually, we want to run this on the edge, since our robot will be deployed in public and will encounter spontaneous, unpredictable input.
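The streaming idea is the key to the latency fix: act on partial transcripts as they arrive instead of waiting for the final one. A minimal sketch, with the STT stream mocked by a generator (real streaming STT clients yield partial hypotheses in roughly this shape):

```python
def partial_transcripts():
    # Stand-in for a streaming STT client yielding partial hypotheses.
    yield "pick"
    yield "pick up"
    yield "pick up that tool"

def first_actionable(stream, keywords=("pick up", "put down", "stop")):
    """Return the first partial transcript containing a known command
    phrase, so downstream parsing can start before the utterance ends."""
    for partial in stream:
        if any(k in partial for k in keywords):
            return partial
    return None

print(first_actionable(partial_transcripts()))  # prints "pick up"
```

In the full pipeline the returned partial would be handed to the logic model immediately, overlapping inference time with the remainder of the user's speech.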

Accomplishments that we're proud of

Zero-friction tool calling: the system can handle ambiguous speech, so users never have to pull out their phones to make the robot perform specific actions.

What we learned

We discovered that the "human" element of robotics is all about context. Building robust tool descriptions within Higgs is just as important as the code itself; the better the model understands the capability of the tool, the more reliable the robot becomes.
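To make the lesson concrete, here is the difference between a sparse and a rich tool description, in the Anthropic Claude tool-use schema (`name`, `description`, `input_schema`). The `move_arm` tool and its workspace numbers are hypothetical:

```python
# A description that tells the model almost nothing.
sparse_tool = {
    "name": "move_arm",
    "description": "Moves the arm.",
    "input_schema": {"type": "object", "properties": {}},
}

# A description that does real work: units, frame of reference,
# when to use it, and physical limits (all values illustrative).
rich_tool = {
    "name": "move_arm",
    "description": (
        "Move the robot arm end effector to an (x, y, z) position in "
        "meters, measured from the base. Call this before grip or "
        "release. Positions outside the arm's reach will be rejected."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "x": {"type": "number", "description": "Meters from base, forward"},
            "y": {"type": "number", "description": "Meters from base, left"},
            "z": {"type": "number", "description": "Meters above base"},
        },
        "required": ["x", "y", "z"],
    },
}
```

With the sparse version the model has to guess units and argument names; with the rich version it can ground "pick up that tool" in an actual reachable coordinate.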

What's next for EchoControl: Real-time Robotic Orchestration via Higgs

We plan to implement visual feedback loops, allowing the robot to "see" the tool it is calling. We are also looking into multi-turn voice conversations, where EchoControl can ask clarifying questions like, "Where is the house I need to deliver this package to?" before taking action.
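The multi-turn behavior reduces to a simple control loop: if the parsed command is missing a required slot, emit a clarifying question instead of executing. A sketch under assumed, illustrative structures (the slot names and return shapes are not our real schema):

```python
def next_step(command: dict) -> dict:
    """Decide whether to execute a parsed command or ask a follow-up.

    `command` is a dict of slots extracted from the user's speech;
    `action` and `destination` are hypothetical required slots."""
    required = {"action", "destination"}
    missing = required - command.keys()
    if missing:
        slot = sorted(missing)[0]
        return {"type": "ask", "question": f"What is the {slot}?"}
    return {"type": "execute", "command": command}

# An under-specified delivery command triggers a question, not an action.
print(next_step({"action": "deliver_package"}))
# {'type': 'ask', 'question': 'What is the destination?'}
```

The model's answer to each question fills a slot and the loop repeats until the command is fully specified, which keeps the robot from acting on ambiguous speech.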
