HandFlow
Inspiration
Using a computer should feel fast and effortless, but it doesn't.
Weโre still clicking through menus, switching tabs, and repeating the same small actions over and over.
There are tools to help, like macro pads, but they're expensive and limited.
And while smart glasses promise the future, they focus on flashy AI features instead of solving this everyday frustration.
We wanted to fix something simpler, but more meaningful:
making interaction with your computer faster, more productive, and more accessible.
With just a $5 camera, HandFlow turns the glasses you already wear into smart glasses, unlocking a new way to control your computer.
One camera. Unlimited control.
What it does
HandFlow transforms a glasses-mounted camera into a versatile gesture-controlled input system.
- Screen Macro Pad: Raise your hand and a customizable macro pad appears on-screen; activate its buttons with a simple touch gesture.
- Paper Macro Pad: Fold a single sheet of paper using our origami-inspired design to create 24 programmable buttons (3 faces × 8 buttons). A full macro pad that fits in your pocket.
- Knuckle Buttons: Each finger knuckle becomes a programmable hotkey: 7 buttons you always have with you.
- Free-Space Gestures: Perform 7 distinct hand gestures in mid-air, each mapped to any computer action you choose.
- Virtual Touchscreen: Turn any non-touch display into a touchscreen using camera tracking and ArUco marker calibration.
- Live Capture: Pinch to select any region in view. The selection is saved to your clipboard for instant pasting into your notes, or Gemini automatically analyzes the content and suggests context-aware actions.
- Customizable Gestures: Every gesture, button, and action is fully customizable through a simple, intuitive GUI.
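The virtual touchscreen above boils down to a planar homography: once the ArUco markers framing the display are located in the camera frame, any fingertip position can be projected into screen coordinates. Below is a minimal NumPy sketch of that mapping; the marker positions, resolution, and function names are illustrative, and the real system uses OpenCV's ArUco detector to find the corners.

```python
import numpy as np

def compute_homography(src, dst):
    """Estimate a 3x3 homography mapping src -> dst via the DLT method.
    src, dst: (4, 2) arrays of corresponding points."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the null vector of A (last row of V^T from the SVD)
    _, _, Vt = np.linalg.svd(np.array(A))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def camera_to_screen(H, point):
    """Project a camera-frame fingertip point into screen coordinates."""
    p = H @ np.array([point[0], point[1], 1.0])
    return p[:2] / p[2]

# Four marker centres detected in the camera frame (example values)
camera_corners = np.array([[120, 80], [520, 95], [510, 390], [130, 400]], float)
# Corresponding screen corners for a 1920x1080 display
screen_corners = np.array([[0, 0], [1920, 0], [1920, 1080], [0, 1080]], float)

H = compute_homography(camera_corners, screen_corners)
x, y = camera_to_screen(H, (320, 240))  # fingertip seen mid-frame
```

With the homography computed once per calibration, each fingertip landmark costs only a single 3×3 matrix multiply per frame.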
How we built it
- Hand Tracking: MediaPipe (21 landmarks at ~20 FPS)
- Gesture Recognition: engineered MediaPipe's 21 landmarks into 96 features, fed into a TCN (16 classes, ~5 ms inference)
- Detection: OpenCV + ArUco markers for macropad detection
- Knuckle Input: Palm orientation + rotated hit-testing
- AI Integration: Gemini 2.5 Flash for real-time visual understanding
- Training Pipeline: Docker + Akash GPU cloud
- Interface: NiceGUI + CustomTkinter overlay
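As a rough illustration of the landmark-to-feature step, the sketch below builds a 96-dimensional descriptor from the 21 MediaPipe landmarks. The exact feature set here is an assumption, not HandFlow's actual one: we guess 60 wrist-relative, scale-normalized coordinates plus 36 pairwise distances between 9 assumed keypoints.

```python
import numpy as np

WRIST = 0
# 9 keypoints whose pairwise distances capture hand shape: 5 fingertips
# plus 4 finger MCP joints (an assumed choice, not the authors' exact set)
KEYPOINTS = [4, 8, 12, 16, 20, 5, 9, 13, 17]

def hand_features(landmarks):
    """landmarks: (21, 3) array of (x, y, z) from MediaPipe Hands.
    Returns a 96-dim vector: 60 wrist-relative, scale-normalized
    coordinates + 36 pairwise keypoint distances."""
    lm = np.asarray(landmarks, float)
    rel = lm - lm[WRIST]                    # translation invariance
    scale = np.linalg.norm(rel[9]) or 1.0   # middle-finger MCP as hand size
    rel = rel / scale                       # scale invariance
    coords = rel[1:].ravel()                # 20 landmarks x 3 = 60 features
    pts = rel[KEYPOINTS]
    dists = [np.linalg.norm(pts[i] - pts[j])                 # 9*8/2 = 36
             for i in range(len(pts)) for j in range(i + 1, len(pts))]
    return np.concatenate([coords, dists])

feat = hand_features(np.random.rand(21, 3))
print(feat.shape)  # (96,)
```

Subtracting the wrist and dividing by a reference bone length makes the features invariant to where the hand sits in the frame, which matters a lot with a head-mounted camera that never holds still.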
Challenges we ran into
- Balancing gesture sensitivity vs false positives
- Handling unstable head-mounted camera perspectives
- Fixing coordinate mismatches from MediaPipe transformations
- Syncing real-time detection with UI processes
- Training under time constraints with limited data (we had to build the entire dataset ourselves)
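The coordinate-mismatch bugs mostly come from MediaPipe reporting landmarks as normalized [0, 1] coordinates in a (usually mirrored) camera frame. A small helper of the kind that conversion needs, with hypothetical names and an assumed selfie-view flip:

```python
def landmark_to_pixels(nx, ny, frame_w, frame_h, mirrored=True):
    """Convert MediaPipe's normalized landmark coords ([0, 1], origin at
    the top-left) to pixel coordinates, undoing the horizontal flip
    commonly applied for selfie view."""
    if mirrored:
        nx = 1.0 - nx
    # Clamp: landmarks can fall slightly outside [0, 1] near frame edges
    px = min(max(nx, 0.0), 1.0) * (frame_w - 1)
    py = min(max(ny, 0.0), 1.0) * (frame_h - 1)
    return int(round(px)), int(round(py))
```

Forgetting the mirror flip, or mixing frame resolution with screen resolution, produces cursors that drift the wrong way; centralizing the conversion in one function is what finally made these bugs tractable.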
Accomplishments we're proud of
- Real-time gesture system with sub-5ms model inference
- Significantly improved our own workflow and productivity
- Turning a single camera into a full input system
- Seamless fusion of gestures, physical inputs, and AI
- Fully working end-to-end prototype
What we learned
- Small models can be fast and powerful
- Physical + digital hybrid interfaces feel intuitive and natural
- Coordinate systems are the hardest bugs to debug
What's next for HandFlow
- Improve robustness across lighting and motion conditions
- Develop a wireless, ultra-compact wearable system, for example a ring or smartwatch that activates the camera on demand, detecting your hand only when needed for maximum convenience and battery savings
- Expand gesture set and personalization
- Enable deeper OS-level integrations and workflows