Inspiration
Drone operators in contested environments shouldn't need to be experts to be effective. Current C2 systems demand precise inputs (coordinates, menus, keypads) at exactly the moment operators need to be making decisions, not navigating interfaces. We wanted to ask: what if you could just talk to the drone?
What It Does
Cicada turns natural speech into controlled, safe drone action. An operator says "Go to barracks, then fly to the command center, then land," and the system handles everything else. It transcribes locally, parses intent, queues commands, plans obstacle-free routes around no-fly zones, and executes on a simulated ArduPilot drone in real time. Unsafe commands are rejected with an explanation. The operator can always override. Inexperienced operators can be mission-effective from their first command. If you can talk, you can fly.
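As a rough illustration of the parse-and-queue step, the sketch below splits a sequential utterance like the one above into ordered commands using plain regular expressions. The verb list, the `parse_utterance` name, and the shape of the output dictionaries are assumptions made for this example, not Cicada's actual grammar or intent schema.

```python
import re

# Hypothetical patterns: the real command grammar is not given in the writeup.
_GOTO = re.compile(r"^(?:go|fly|move|head)\s+to\s+(?:the\s+)?(?P<target>.+)$")
_LAND = re.compile(r"^land$")
# Clause separators: ", then", "and then", "then", or a bare comma.
_SPLIT = re.compile(r",?\s*\b(?:and\s+)?then\b\s*|\s*,\s*")


def parse_utterance(text):
    """Split a spoken sentence into an ordered list of command dicts."""
    cleaned = text.lower().strip().rstrip(".!")
    commands = []
    for clause in _SPLIT.split(cleaned):
        clause = clause.strip()
        if not clause:
            continue
        if _LAND.match(clause):
            commands.append({"action": "land"})
            continue
        match = _GOTO.match(clause)
        if match:
            commands.append({"action": "goto", "target": match.group("target")})
        else:
            # Keep unrecognized clauses visible instead of silently dropping them.
            commands.append({"action": "unknown", "raw": clause})
    return commands


if __name__ == "__main__":
    for cmd in parse_utterance(
        "Go to barracks, then fly to the command center, then land"
    ):
        print(cmd)
```

Unrecognized clauses are returned as `unknown` rather than discarded, so a downstream check can reject them with an explanation.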
How We Built It
We built a full pipeline in a single day across five layers:
Whisper (faster-whisper): local speech-to-text with silence detection, zero internet dependency
Groq (llama-3.3-70b-versatile): natural language to structured JSON intent, with a regex fallback when the LLM is unavailable
ARA* Pathfinder: anytime repairing A* automatically reroutes around no-fly zones when a direct path is unsafe
pymavlink: MAVLink protocol for drone control and telemetry to ArduPilot SITL
Flask + Vanilla JS: REST backend with Server-Sent Events pushing live telemetry to a tactical Canvas map in the browser
The safety validator sits between the parser and the pathfinder — every command is checked before anything moves.
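To illustrate the rerouting idea in the pathfinding layer, here is a simplified anytime search on a grid. It reruns weighted A* with a shrinking inflation factor and keeps the shortest path found. This is deliberately not full ARA*, which reuses search state between iterations instead of restarting. The grid encoding (1 marks a no-fly cell), the function names, and the epsilon schedule are all assumptions for this sketch.

```python
import heapq


def weighted_astar(grid, start, goal, eps):
    """Weighted A* on a 4-connected grid; cells equal to 1 are no-fly."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance, admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(eps * heuristic(start), 0, start)]
    best_cost = {start: 0}
    parent = {start: None}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry, a cheaper route was found later
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == 1:
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                parent[nxt] = cell
                priority = new_cost + eps * heuristic(nxt)
                heapq.heappush(open_heap, (priority, new_cost, nxt))
    return None  # goal unreachable


def anytime_plan(grid, start, goal, eps_schedule=(3.0, 2.0, 1.0)):
    """Return the shortest path found while tightening eps toward 1.0."""
    best = None
    for eps in eps_schedule:
        path = weighted_astar(grid, start, goal, eps)
        if path is None:
            return None  # no route exists at any inflation factor
        if best is None or len(path) < len(best):
            best = path
    return best
```

A large `eps` tends to find some valid route quickly; the final pass at `eps = 1.0` is ordinary A* and returns a shortest route on this grid. A real-time caller could stop the loop early and fly the best path found so far.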
Challenges We Ran Into
Whisper quirks: NATO callsigns like "Tango-1" transcribe as "tango one." We built a normalizer to handle number words, homophones, and fuzzy matching against the IFF table.
asyncio + Flask conflict: running async MAVSDK inside a synchronous Flask app required a background thread and command queue to avoid deadlocks.
Coordinate system translation: mapping compound layout coordinates to MAVLink's local NED frame required careful ENU ↔ NED conversion.
Stall detection: drones occasionally got stuck against walls mid-route. We built a stall detector that aborts early rather than hanging the queue indefinitely.
Venue WiFi pressure: we designed Whisper to run fully offline from the start, which paid off under heavy network load.
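The callsign normalizer described in the first item could be sketched roughly as follows, using only the Python standard library. The word tables, the `normalize_callsign` name, the `Name-N` callsign format, and the 0.6 similarity cutoff are assumptions for illustration; the actual IFF table and matching rules are not described in this writeup.

```python
import difflib
import re

# Hypothetical lookup tables, kept small on purpose.
NUMBER_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}
# Only unambiguous homophones; "to" and "for" are left alone because they
# also occur as ordinary words in commands.
HOMOPHONES = {"won": "one", "too": "two", "fore": "four", "ate": "eight"}


def normalize_callsign(phrase, known_callsigns, cutoff=0.6):
    """Map a transcribed phrase such as 'tango one' to a known callsign."""
    tokens = re.findall(r"[a-z0-9]+", phrase.lower())
    tokens = [HOMOPHONES.get(t, t) for t in tokens]
    tokens = [NUMBER_WORDS.get(t, t) for t in tokens]
    words = "".join(t for t in tokens if not t.isdigit())
    digits = "".join(t for t in tokens if t.isdigit())
    candidate = f"{words}-{digits}" if digits else words
    # Compare case-insensitively, then return the canonical spelling.
    lookup = {c.lower(): c for c in known_callsigns}
    matches = difflib.get_close_matches(candidate, list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None


if __name__ == "__main__":
    table = ["Tango-1", "Tango-2", "Bravo-1"]
    print(normalize_callsign("tango one", table))
    print(normalize_callsign("tengo won", table))
```

Returning `None` when nothing clears the cutoff lets the caller ask the operator to repeat the callsign instead of guessing.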
Accomplishments That We're Proud Of
A full voice-to-drone pipeline that actually works end-to-end under live demo conditions
Sequential command queuing: "go to A, then B, then land" parsed and executed in order with just-in-time validation
ARA* pathfinding that silently reroutes around obstacles the operator didn't know were there
A tactical map that renders the compound, drone position, flight trails, and threat zones in real time
Full voice loop: Whisper in, browser TTS out. The drone talks back.
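Sequential queuing with just-in-time validation can be pictured as a single background worker that pulls commands off a queue and validates each one immediately before executing it. The sketch below uses only `threading` and `queue` from the standard library. The `CommandWorker` class, its callback signatures, and the choice to continue after a rejected command are assumptions, not the project's actual design.

```python
import queue
import threading


class CommandWorker:
    """Runs queued commands in order on one background thread."""

    def __init__(self, validate, execute):
        # validate(cmd) -> (ok: bool, reason: str); execute(cmd) -> None
        self._validate = validate
        self._execute = execute
        self._queue = queue.Queue()
        self.log = []
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, commands):
        """Enqueue commands; returns immediately so a web handler never blocks."""
        for cmd in commands:
            self._queue.put(cmd)

    def _run(self):
        while True:
            cmd = self._queue.get()
            try:
                if cmd is None:  # sentinel from stop()
                    return
                # Validate at execution time, not at submission time, so the
                # check sees the state of the world just before the move.
                ok, reason = self._validate(cmd)
                if ok:
                    self._execute(cmd)
                    self.log.append(("done", cmd))
                else:
                    self.log.append(("rejected", cmd, reason))
            finally:
                self._queue.task_done()

    def wait(self):
        """Block until every queued command has been processed."""
        self._queue.join()

    def stop(self):
        self._queue.put(None)
        self._thread.join()
```

A request handler would call `submit()` and return at once, while the worker thread owns all interaction with the vehicle, which keeps blocking drone calls out of the web request path.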
What We Learned
Keep safety deterministic. The LLM handles language; it never touches the block/confirm/allow decision. That boundary made the system auditable and predictable under pressure.
Reliability beats features. A system that does five things perfectly is more impressive in a live demo than one that does ten things flakily.
Design for the operator, not the architecture. The confirmation prompt, the command log, the voice feedback: these aren't polish, they're the product.
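A deterministic block/confirm/allow check can be as small as a pure function over geometry, with no model in the loop. The sketch below is one hypothetical way to express that boundary. The rectangular zone model, the 5 m confirmation margin, and the specific rules are assumptions for illustration, not Cicada's actual policy.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    CONFIRM = "confirm"
    BLOCK = "block"


@dataclass(frozen=True)
class NoFlyZone:
    """Axis-aligned rectangle in local map coordinates (metres)."""
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def distance_to(self, x, y):
        """Distance from a point to the rectangle; 0.0 if the point is inside."""
        dx = max(self.x_min - x, 0.0, x - self.x_max)
        dy = max(self.y_min - y, 0.0, y - self.y_max)
        return (dx * dx + dy * dy) ** 0.5


def check_goto(x, y, zones, margin=5.0):
    """Same inputs always give the same decision, so every call is auditable."""
    for zone in zones:
        if zone.distance_to(x, y) == 0.0:
            return Decision.BLOCK, f"target is inside no-fly zone {zone.name}"
    for zone in zones:
        if zone.distance_to(x, y) < margin:
            return Decision.CONFIRM, f"target is within {margin} m of {zone.name}"
    return Decision.ALLOW, "target is clear"
```

Because the function depends only on its arguments, each decision can be logged alongside its inputs and reproduced exactly later.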
What's Next for Cicada
Real hardware integration: the pymavlink layer is hardware-agnostic. Cicada can command a real drone with minimal changes.
Expanded IFF and rules of engagement: richer contact classification and dynamic no-fly zone updates from live sensor feeds.
Multi-operator support: multiple voice inputs commanding Alpha and Bravo independently from different positions.
Edge deployment: package Whisper + Groq fallback into a fully offline system for denied-communications environments.
The architecture is modular by design. Swap the drone API, swap the voice engine: the safety layer stays intact.
Team
David Cadena, Spencer Muller, Steven Brar, Kyle Chau, James Kyle
Built during the Voice-Driven UxS Hackathon.