Inspiration
Video editing is powerful but intimidating. Professional tools like Premiere Pro have steep learning curves that lock out creators with great ideas but limited time. We asked: what if editing were as simple as describing what you want? What if AI could handle the technical complexity while you focus on creativity?
What it does
CutOS is an AI-first video editor that understands natural language. Type "split this clip at 10 seconds and add a vintage effect" and it happens instantly. Upload a video, describe your vision, and watch our AI agent split clips, apply effects, remove green screens, dub into 29 languages, isolate voices, and create AI-powered morph transitions. It combines professional multi-track editing with conversational AI assistance, making video editing accessible to everyone.
How we built it
- Frontend: Next.js 16 with React 19, Tailwind CSS, and WebGL for GPU-accelerated effects
- AI Stack: OpenAI GPT-4o for the editing agent, TwelveLabs Marengo 3.0 for semantic video search, ElevenLabs for AI dubbing/voice isolation, and Kling API for morph transitions
- Backend: Supabase for auth, storage, and real-time sync
- Architecture: Built a tool-based agent system where GPT-4o analyzes timeline state and executes editing operations through structured tools, with streaming responses for real-time feedback
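To make the architecture concrete, here is a minimal sketch of the tool-calling loop, assuming the OpenAI Node SDK. The tool names, parameter shapes, and the `editorDispatch` helper are illustrative stand-ins, not the actual CutOS schema (the real agent streams responses and exposes many more tools):

```ts
import OpenAI from "openai";

const openai = new OpenAI();

// Illustrative tool schema -- names and parameters are hypothetical.
const tools: OpenAI.Chat.Completions.ChatCompletionTool[] = [
  {
    type: "function",
    function: {
      name: "split_clip",
      description: "Split a timeline clip at a given time in seconds.",
      parameters: {
        type: "object",
        properties: {
          clipId: { type: "string" },
          at: { type: "number", description: "split point, seconds" },
        },
        required: ["clipId", "at"],
      },
    },
  },
  {
    type: "function",
    function: {
      name: "apply_effect",
      description: "Apply a named effect (e.g. 'vintage', 'noir') to a clip.",
      parameters: {
        type: "object",
        properties: {
          clipId: { type: "string" },
          effect: { type: "string" },
        },
        required: ["clipId", "effect"],
      },
    },
  },
];

// Hypothetical dispatcher that routes a tool call into editor reducers.
function editorDispatch(name: string, args: Record<string, unknown>) {
  console.log("executing", name, args);
}

export async function runAgent(command: string, timelineState: unknown) {
  const response = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      // The current timeline snapshot rides along on every request.
      {
        role: "system",
        content: `You are a video editing agent. Timeline: ${JSON.stringify(timelineState)}`,
      },
      { role: "user", content: command },
    ],
    tools,
  });

  // A chained command ("split at 5s, apply noir") comes back as
  // multiple tool calls, executed here in order.
  for (const call of response.choices[0].message.tool_calls ?? []) {
    editorDispatch(call.function.name, JSON.parse(call.function.arguments));
  }
}
```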
Challenges we ran into
- Performance: Real-time video preview with effects was initially laggy. We solved it with WebGL shaders and a requestAnimationFrame-based render loop that separates per-frame visual feedback from React state updates (sketched after this list)
- AI Context Management: The agent needed to understand complex timeline states, so we built a dynamic system prompt that serializes every clip, position, and effect into each request (see the prompt-builder sketch below)
- Async AI Operations: Dubbing and voice isolation take 30-60 seconds, so we implemented async handling with upload status tracking and automatic timeline replacement once results come back (see the polling sketch below)
- Tool Chaining: Teaching the AI to execute multiple operations in sequence ("split at 5s, apply noir, move to track 2") required careful prompt engineering and action parsing
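The render-loop fix, as a simplified sketch: per-frame drawing reads effect parameters from a mutable ref, so scrubbing or dragging a slider repaints the preview without forcing React re-renders. A 2D canvas context stands in for our WebGL pipeline here:

```ts
import { useEffect, useRef } from "react";

// Simplified preview loop: visual feedback is driven by
// requestAnimationFrame and a mutable ref, decoupled from React state.
export function usePreviewLoop(
  draw: (ctx: CanvasRenderingContext2D, params: { brightness: number }) => void
) {
  const canvasRef = useRef<HTMLCanvasElement>(null);
  const paramsRef = useRef({ brightness: 1 });

  useEffect(() => {
    const ctx = canvasRef.current?.getContext("2d");
    if (!ctx) return;
    let frame = 0;
    const tick = () => {
      draw(ctx, paramsRef.current); // repaint every frame
      frame = requestAnimationFrame(tick);
    };
    frame = requestAnimationFrame(tick);
    return () => cancelAnimationFrame(frame);
  }, [draw]);

  // Gestures mutate paramsRef.current directly; the committed value is
  // written back to React state only when the gesture ends.
  return { canvasRef, paramsRef };
}
```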
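And a sketch of the dynamic system prompt, assuming a hypothetical clip shape (the real schema carries more fields). Serializing one compact line per clip keeps the model grounded in the timeline while staying token-cheap:

```ts
// Hypothetical timeline shape -- the real schema is richer.
interface Clip {
  id: string;
  track: number;
  start: number;    // timeline position, seconds
  duration: number; // seconds
  effects: string[];
}

export function buildSystemPrompt(clips: Clip[]): string {
  const lines = clips.map(
    (c) =>
      `${c.id} | track ${c.track} | ${c.start}s-${c.start + c.duration}s` +
      (c.effects.length ? ` | effects: ${c.effects.join(", ")}` : "")
  );
  return [
    "You are CutOS, an AI video editing agent.",
    "Current timeline:",
    ...lines,
    "Resolve references like 'this clip' to the user's selection.",
  ].join("\n");
}
```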
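The async flow for dubbing and voice isolation looks roughly like the following; `startDub`, `fetchJob`, and the two UI helpers are hypothetical stand-ins for our real backend and editor calls:

```ts
type JobStatus = { state: "queued" | "processing" | "done" | "failed"; url?: string };

// Hypothetical stand-ins for the real backend and editor calls.
async function startDub(clipId: string, language: string) {
  return { jobId: `dub_${clipId}_${language}` };
}
async function fetchJob(jobId: string): Promise<JobStatus> {
  return { state: "done", url: `https://example.com/${jobId}.mp4` };
}
function setClipStatus(clipId: string, status: string) { console.log(clipId, status); }
function replaceClipMedia(clipId: string, url: string) { console.log(clipId, url); }

async function pollUntilDone(jobId: string, intervalMs = 3000): Promise<string> {
  for (;;) {
    const job = await fetchJob(jobId);
    if (job.state === "done" && job.url) return job.url;
    if (job.state === "failed") throw new Error(`job ${jobId} failed`);
    await new Promise((r) => setTimeout(r, intervalMs));
  }
}

export async function dubClip(clipId: string, language: string) {
  setClipStatus(clipId, "dubbing"); // progress indicator in the UI
  const { jobId } = await startDub(clipId, language);
  const url = await pollUntilDone(jobId);
  replaceClipMedia(clipId, url);    // automatic timeline replacement
  setClipStatus(clipId, "ready");
}
```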
Accomplishments that we're proud of
- Zero Learning Curve: Complete beginners can edit videos by just describing what they want
- True Multi-Modal AI: Combines vision (video search), audio (dubbing/isolation), and generation (morph transitions) in one seamless workflow
- Professional Features: Despite being AI-powered, it's a real non-linear editor with a multi-track timeline, effects, trim handles, and magnetic snapping (see the snapping sketch after this list)
- Instant Gratification: From drag-and-drop to AI commands, every interaction feels immediate and responsive
- 29-Language Dubbing: Translations that preserve the original emotion and timing, democratizing global content creation
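Magnetic snapping is one of those small NLE details that makes editing feel right; at its core it's just a nearest-edge search within a threshold. A minimal sketch (in practice the threshold would scale with zoom level):

```ts
// Snap a dragged clip edge to the nearest snap point (other clips'
// starts/ends, the playhead, markers) within `threshold` seconds.
export function snapTime(t: number, snapPoints: number[], threshold = 0.1): number {
  let best = t;
  let bestDist = threshold;
  for (const p of snapPoints) {
    const d = Math.abs(p - t);
    if (d < bestDist) {
      bestDist = d;
      best = p;
    }
  }
  return best;
}

// snapTime(9.96, [0, 5, 10, 12.5]) -> 10 (pulled flush to the clip at 10s)
```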
What we learned
- AI UX Design: The best AI tools don't feel like chatbots; they feel like magic. Streaming responses, optimistic updates, and automatic replacements make AI feel instantaneous (see the optimistic-update sketch after this list)
- Prompt Engineering at Scale: System prompts need to be dynamic and context-aware. We learned to balance completeness with token efficiency
- WebGL Optimization: GPU-accelerated effects are essential for real-time video editing in the browser
- User Intent Parsing: Natural language is ambiguous; designing tools that handle edge cases ("split this" when there are 5 clips) required thoughtful defaults and clarification flows (sketched below)
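The optimistic-update pattern mentioned above, sketched with a hypothetical `Edit` type and `persistEdit` standing in for the Supabase write: apply the edit locally first, and roll back only if persistence fails.

```ts
type Edit = { redo: () => void; undo: () => void };

// Hypothetical stand-in for the real Supabase write.
async function persistEdit(edit: Edit): Promise<void> {
  /* upsert timeline rows here */
}

export async function applyOptimistically(edit: Edit) {
  edit.redo(); // the user sees the result instantly
  try {
    await persistEdit(edit);
  } catch {
    edit.undo(); // restore the previous state and surface a retry toast
  }
}
```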
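And the disambiguation defaults, roughly: prefer an explicit selection, then a single clip under the playhead, and only then ask the user. Names and shapes here are illustrative:

```ts
interface Clip { id: string; track: number; start: number; duration: number }

export function resolveTargetClip(
  clips: Clip[],
  selectedId: string | null,
  playhead: number
): Clip | "needs_clarification" {
  // 1. An explicit selection always wins.
  const selected = clips.find((c) => c.id === selectedId);
  if (selected) return selected;

  // 2. Otherwise, "this clip" means the clip under the playhead --
  //    but only if exactly one clip is there across all tracks.
  const underPlayhead = clips.filter(
    (c) => playhead >= c.start && playhead < c.start + c.duration
  );
  if (underPlayhead.length === 1) return underPlayhead[0];

  // 3. Ambiguous (several overlapping clips, or none): ask the user.
  return "needs_clarification";
}
```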
What's next for CutOS
- AI Video Generation: Generate B-roll, transitions, and effects from text descriptions
- Collaborative Editing: Real-time multi-user editing powered by Supabase
- Auto-Enhance: "Make this look cinematic" should intelligently analyze footage and apply color grading, cuts, and pacing
- Export to Social: One-click optimization for TikTok, Instagram, YouTube with platform-specific formatting
- Voice Cloning: Dub videos while preserving the original speaker's voice characteristics
- Mobile App: Bring AI-powered editing to iOS and Android with the same conversational interface