HandFlow
Inspiration
Using a computer should feel fast and effortless, but it doesn't.
Weโre still clicking through menus, switching tabs, and repeating the same small actions over and over.
There are tools to help, like macro pads, but they're expensive and limited.
And while smart glasses promise the future, they focus on flashy AI features instead of solving this everyday frustration.
We wanted to fix something simpler, but more meaningful:
making interaction with your computer faster, more productive, and more accessible.
With just a $5 camera, HandFlow turns the glasses you already wear into smart glasses, unlocking a new way to control your computer.
One camera. Unlimited control.
What it does
HandFlow transforms a glasses-mounted camera into a versatile gesture-controlled input system.
- Screen Macro Pad: Raise your hand and a customizable macro pad appears on-screen; activate its buttons with a simple touch gesture.
- Paper Macro Pad: Fold a single sheet of paper using our origami-inspired design to create 24 programmable buttons (3 faces × 8 buttons). A full macro pad that fits in your pocket.
- Knuckle Buttons: Each finger knuckle becomes a programmable hotkey: 7 buttons you always have with you.
- Free-Space Gestures: Perform 7 distinct hand gestures in mid-air, each mapped to any computer action you choose.
- Virtual Touchscreen: Turn any non-touch display into a touchscreen using camera tracking and ArUco marker calibration.
- Live Capture: Pinch to select any region in view. The selection is saved to your clipboard for instant pasting into your notes, or Gemini automatically analyzes the content and suggests context-aware actions.
- Customizable Gestures: Every gesture, button, and action is fully customizable through a simple, intuitive GUI.
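The virtual touchscreen above boils down to a planar homography: once the ArUco markers framing the display are located in the camera frame, any fingertip position can be projected into screen coordinates. Below is a minimal NumPy sketch of that mapping; the marker positions, resolution, and function names are illustrative, and the real system uses OpenCV's ArUco detector to find the corners.

```python
import numpy as np

def compute_homography(src, dst):
    """Estimate a 3x3 homography mapping src -> dst via the DLT method.
    src, dst: (4, 2) arrays of corresponding points."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the null vector of A (last row of V^T from the SVD)
    _, _, Vt = np.linalg.svd(np.array(A))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def camera_to_screen(H, point):
    """Project a camera-frame fingertip point into screen coordinates."""
    p = H @ np.array([point[0], point[1], 1.0])
    return p[:2] / p[2]

# Four marker centres detected in the camera frame (example values)
camera_corners = np.array([[120, 80], [520, 95], [510, 390], [130, 400]], float)
# Corresponding screen corners for a 1920x1080 display
screen_corners = np.array([[0, 0], [1920, 0], [1920, 1080], [0, 1080]], float)

H = compute_homography(camera_corners, screen_corners)
x, y = camera_to_screen(H, (320, 240))  # fingertip seen mid-frame
```

With the homography computed once per calibration, each fingertip landmark costs only a single 3×3 matrix multiply per frame.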
How we built it
- Hand Tracking: MediaPipe (21 landmarks at ~20 FPS)
- Gesture Recognition: engineered MediaPipe's 21 landmarks into 96 features, fed into a TCN (16 classes, ~5 ms inference)
- Detection: OpenCV + ArUco markers for macropad detection
- Knuckle Input: Palm orientation + rotated hit-testing
- AI Integration: Gemini 2.5 Flash for real-time visual understanding
- Training Pipeline: Docker + Akash GPU cloud
- Interface: NiceGUI + CustomTkinter overlay
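As a rough illustration of the landmark-to-feature step, the sketch below builds a 96-dimensional descriptor from the 21 MediaPipe landmarks. The exact feature set here is an assumption, not HandFlow's actual one: we guess 60 wrist-relative, scale-normalized coordinates plus 36 pairwise distances between 9 assumed keypoints.

```python
import numpy as np

WRIST = 0
# 9 keypoints whose pairwise distances capture hand shape: 5 fingertips
# plus 4 finger MCP joints (an assumed choice, not the authors' exact set)
KEYPOINTS = [4, 8, 12, 16, 20, 5, 9, 13, 17]

def hand_features(landmarks):
    """landmarks: (21, 3) array of (x, y, z) from MediaPipe Hands.
    Returns a 96-dim vector: 60 wrist-relative, scale-normalized
    coordinates + 36 pairwise keypoint distances."""
    lm = np.asarray(landmarks, float)
    rel = lm - lm[WRIST]                    # translation invariance
    scale = np.linalg.norm(rel[9]) or 1.0   # middle-finger MCP as hand size
    rel = rel / scale                       # scale invariance
    coords = rel[1:].ravel()                # 20 landmarks x 3 = 60 features
    pts = rel[KEYPOINTS]
    dists = [np.linalg.norm(pts[i] - pts[j])                 # 9*8/2 = 36
             for i in range(len(pts)) for j in range(i + 1, len(pts))]
    return np.concatenate([coords, dists])

feat = hand_features(np.random.rand(21, 3))
print(feat.shape)  # (96,)
```

Subtracting the wrist and dividing by a reference bone length makes the features invariant to where the hand sits in the frame, which matters a lot with a head-mounted camera that never holds still.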
Challenges we ran into
- Balancing gesture sensitivity vs false positives
- Handling unstable head-mounted camera perspectives
- Fixing coordinate mismatches from MediaPipe transformations
- Syncing real-time detection with UI processes
- Training under time constraints with limited data (we had to build the entire dataset ourselves)
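The coordinate-mismatch bugs mostly come from MediaPipe reporting landmarks as normalized [0, 1] coordinates in a (usually mirrored) camera frame. A small helper of the kind that conversion needs, with hypothetical names and an assumed selfie-view flip:

```python
def landmark_to_pixels(nx, ny, frame_w, frame_h, mirrored=True):
    """Convert MediaPipe's normalized landmark coords ([0, 1], origin at
    the top-left) to pixel coordinates, undoing the horizontal flip
    commonly applied for selfie view."""
    if mirrored:
        nx = 1.0 - nx
    # Clamp: landmarks can fall slightly outside [0, 1] near frame edges
    px = min(max(nx, 0.0), 1.0) * (frame_w - 1)
    py = min(max(ny, 0.0), 1.0) * (frame_h - 1)
    return int(round(px)), int(round(py))
```

Forgetting the mirror flip, or mixing frame resolution with screen resolution, produces cursors that drift the wrong way; centralizing the conversion in one function is what finally made these bugs tractable.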
Accomplishments we're proud of
- Real-time gesture system with sub-5ms model inference
- Significantly improved our own workflow and productivity
- Turning a single camera into a full input system
- Seamless fusion of gestures, physical inputs, and AI
- Fully working end-to-end prototype
What we learned
- Small models can be fast and powerful
- Physical + digital hybrid interfaces feel intuitive and natural
- Coordinate systems are the hardest bugs to debug
What's next for HandFlow
- Improve robustness across lighting and motion conditions
- Develop a wireless, ultra-compact wearable system, for example a ring or smartwatch that activates the camera on demand, detecting your hand only when needed for maximum convenience and battery savings
- Expand gesture set and personalization
- Enable deeper OS-level integrations and workflows