StoreUFO: Project Story
Inspiration
I was frustrated paying $109/month across my tools: Google Drive ($10) and Restream ($99), while relying on Google Photos for AI search. When I discovered Cloudflare R2 costs just $0.015/GB, I wondered: why can't I add intelligence to raw storage myself?
The challenge: R2 is just object storage—no search, no streaming, no AI. I decided to build a platform that lets users bring their own R2/S3 credentials (BYOC model) while I provide the intelligence layer on top. This keeps their data sovereign while giving them enterprise features at storage-only costs.
What it does
StoreUFO transforms S3-compatible storage into an intelligent media platform:
1. AI Image Search ("Project Memories")
- Upload images to your own R2/S3 bucket
- Background workers analyze images using Gemini Vision API
- Search by natural language: "sunset photos", "photos with cars", "festival pictures"
- Works in any language—the AI translates queries to English keywords before searching
- Bulk analysis trigger: "Analyze All" button for batch processing
2. 24/7 Live Streaming
- Create playlists from videos in your storage
- Stream to YouTube, Twitch, Facebook simultaneously
- Loops continuously for 24/7 broadcasting
- Direct-from-storage: videos never touch our servers
- Custom encoding settings: resolution, bitrate, FPS configuration
- Stream scheduling with timezone support
- Real-time monitoring: FFmpeg logs, FPS, bitrate metrics
3. Multi-Cloud Management
- Single dashboard for R2, S3, MinIO buckets
- Organization → Project → Storage Account hierarchy
- Credential verification before adding storage accounts
- Automatic storage allocation based on available space
- Public/private storage separation
4. File Management
- Direct presigned URL uploads
- Folder upload support (no limits)
- File expiry settings (days, milliseconds, or never)
- Generate public or presigned URLs for sharing
- Automatic file type detection (image/video/document)
- File and folder deletion with storage cleanup
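The expiry options above (days, milliseconds, or never) can be modeled with a small helper. This is a minimal sketch; `resolveExpiry` is a hypothetical name, not StoreUFO's actual API.

```javascript
// Hypothetical helper: turn a user-supplied expiry setting into an
// absolute Date (or null for "never"), mirroring the options above.
function resolveExpiry(setting, now = new Date()) {
  if (!setting || setting.mode === "never") return null;
  const ms =
    setting.mode === "days"
      ? setting.value * 24 * 60 * 60 * 1000
      : setting.value; // mode === "milliseconds"
  return new Date(now.getTime() + ms);
}

// Example: a file that expires in 7 days
const expiresAt = resolveExpiry({ mode: "days", value: 7 });
```

A background job can then compare `expiresAt` against the current time when deciding which files to clean up.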
5. API-Based File Access
- RESTful APIs for programmatic file management
- Integrate storage into your apps and websites
- Generate shareable URLs (public or time-limited presigned)
- Full CRUD operations: upload, list, retrieve, delete
- JWT authentication for secure API access
How we built it
Technology Stack:
- Frontend: Next.js 16 with React 19, TanStack Query for server state, shadcn/ui components, Tailwind CSS
- Backend: Node.js with Express, RESTful API architecture
- AI Engine: Google Gemini 2.5 Flash for vision and natural language processing
- Streaming: FFmpeg for video transcoding and RTMP broadcasting
- Storage Integration: AWS SDK v3 (compatible with R2, S3, MinIO)
- Database: MongoDB for metadata, Redis for job queues and caching
- Background Processing: Bull queues for async tasks (AI analysis, stream management)
Key Implementation Highlights:
BYOC Architecture
- Users provide their own storage credentials (R2/S3/MinIO)
- Platform never stores user files—only manages metadata
- Direct client-to-storage uploads via presigned URLs
AI Analysis Pipeline
- Background workers trigger Gemini Vision API on image upload
- Extracts objects, scenes, and contextual descriptions
- Results indexed in MongoDB for fast natural language search
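The search step of the pipeline above can be sketched as a keyword match over the indexed analysis results. `searchImages` and the `aiKeywords` field are illustrative names, and production code would run this as a MongoDB query rather than an in-memory filter.

```javascript
// Sketch: each analyzed image stores the keywords extracted by
// Gemini Vision; a query (already translated to English) matches
// against them. Names are illustrative, not StoreUFO's schema.
function searchImages(images, query) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return images.filter((img) =>
    terms.some((t) => img.aiKeywords.some((k) => k.includes(t)))
  );
}

const images = [
  { key: "beach.jpg", aiKeywords: ["sunset", "ocean", "sky"] },
  { key: "car.jpg", aiKeywords: ["car", "street"] },
];
// searchImages(images, "sunset photos") matches only beach.jpg
```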
MCP Tools Integration
- Custom function declarations bridge Gemini AI with MongoDB
- Three tools: `search_images`, `get_memories_by_date`, `get_project_stats`
- AI translates natural language queries into precise database operations
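A function declaration for the `search_images` tool might look like the sketch below. The parameter schema and descriptions are illustrative (the exact field casing depends on the Gemini SDK version in use); only the tool name comes from the project itself.

```javascript
// Sketch of a Gemini function declaration for the search_images tool.
// Declaring parameters explicitly is what lets the model emit precise,
// validated database operations instead of free-form guesses.
const searchImagesTool = {
  name: "search_images",
  description:
    "Search a project's analyzed images by English keywords. " +
    "Only images that actually exist in the database are returned.",
  parameters: {
    type: "object",
    properties: {
      projectId: { type: "string", description: "Project to search in" },
      keywords: {
        type: "array",
        items: { type: "string" },
        description: "English keywords translated from the user query",
      },
    },
    required: ["projectId", "keywords"],
  },
};
```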
FFmpeg Streaming Engine
- Custom process manager spawns FFmpeg with playlist files
- Uses the `concat` demuxer for seamless video transitions
- The `tee` muxer enables simultaneous multi-platform broadcasting
- Redis jobs handle presigned URL refresh every 10 hours
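Putting those pieces together, the FFmpeg invocation can be sketched as an argument builder. The flags follow standard FFmpeg usage for the `concat` demuxer and `tee` muxer; `buildStreamArgs` is a hypothetical helper, not StoreUFO's actual process manager, and real encoding settings (bitrate, FPS) would be added per the user's configuration.

```javascript
// Sketch: loop a concat playlist of presigned HTTPS URLs and fan the
// single encoded output out to several RTMP endpoints via the tee muxer.
function buildStreamArgs(playlistPath, rtmpUrls) {
  const tee = rtmpUrls.map((u) => `[f=flv]${u}`).join("|");
  return [
    "-re",                        // read input at native frame rate
    "-f", "concat",               // concat demuxer: seamless transitions
    "-safe", "0",
    "-protocol_whitelist", "file,http,https,tcp,tls", // allow presigned URLs
    "-stream_loop", "-1",         // loop forever for 24/7 broadcasting
    "-i", playlistPath,
    "-c:v", "libx264",
    "-c:a", "aac",
    "-map", "0",
    "-f", "tee", tee,             // one encode, many destinations
  ];
}
```

The resulting array would be passed to `child_process.spawn("ffmpeg", args)`, with stderr parsed for the FPS/bitrate metrics mentioned above.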
Multi-Cloud Abstraction
- Single service layer abstracts provider differences (R2 vs S3 endpoints)
- Runtime provider detection based on credential format
- Automatic storage allocation based on available space
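Runtime provider detection can be sketched from the endpoint URL alone. The hostname patterns below are the publicly documented ones for R2 and S3, but the helper name and the MinIO fallback are illustrative assumptions.

```javascript
// Sketch: infer the storage provider from the credential's endpoint.
function detectProvider(endpoint) {
  if (!endpoint) return "s3"; // no custom endpoint -> assume AWS S3
  const host = new URL(endpoint).hostname;
  if (host.endsWith("r2.cloudflarestorage.com")) return "r2";
  if (host.endsWith("amazonaws.com")) return "s3";
  return "minio"; // any other S3-compatible endpoint (e.g. self-hosted)
}
```

The service layer can then branch on the result only where providers genuinely differ, keeping the rest of the code provider-agnostic.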
Challenges we ran into
1. Making Static Storage "Stream-able": FFmpeg expects file paths, but we're working with authenticated cloud storage. We had to generate time-limited presigned URLs and figure out how to refresh them mid-stream without dropping the broadcast. The solution involved background Redis jobs and dynamic playlist updates—tricky to get right without interrupting the live feed.
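The refresh cycle for this can be expressed as a repeating queue job. This is a sketch assuming Bull's `repeat.every` option; the queue and job names are hypothetical, and the 10-hour cadence only works because the URLs are signed for longer than that.

```javascript
// Sketch: schedule presigned-URL refresh as a repeating Bull job.
const TEN_HOURS_MS = 10 * 60 * 60 * 1000;

const refreshJob = {
  name: "refresh-presigned-urls",       // illustrative job name
  data: { streamId: "example-stream" }, // which stream's playlist to rewrite
  opts: { repeat: { every: TEN_HOURS_MS }, removeOnComplete: true },
};
// With a Bull queue instance:
//   queue.add(refreshJob.name, refreshJob.data, refreshJob.opts)
// The worker re-signs each URL and updates the playlist file in place,
// so FFmpeg picks up fresh URLs without the broadcast restarting.
```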
2. AI Hallucinations: During development, the AI would sometimes "invent" files that didn't exist when users searched. We solved this by implementing MCP tools that strictly validate results—the AI can only return what MongoDB actually finds. This was a critical lesson in grounding LLMs with real data.
3. Scope Limitations: We wanted AI search for videos too, but video analysis is computationally expensive. We had to make the hard decision to limit the Memories feature to images only for this version. Managing user expectations around this limitation was important.
4. Multi-Platform Streaming Complexity: Each streaming platform (YouTube, Twitch, Facebook) has different RTMP requirements. YouTube wants 2-second keyframe intervals, Twitch prefers 4-second. We built platform-specific validation to ensure streams work correctly before users go live.
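The keyframe-interval validation can be sketched as a small lookup: FFmpeg's `-g` flag sets the GOP size in frames, so a per-platform interval in seconds is multiplied by the stream's FPS. Only the YouTube and Twitch values come from the text above; the helper itself is illustrative.

```javascript
// Sketch: platform-specific GOP sizing for the -g flag.
const KEYFRAME_SECONDS = { youtube: 2, twitch: 4 };

function gopSize(platform, fps) {
  const seconds = KEYFRAME_SECONDS[platform];
  if (!seconds) throw new Error(`Unknown platform: ${platform}`);
  return seconds * fps; // e.g. pass as ["-g", String(gopSize("youtube", 30))]
}
```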
Accomplishments that we're proud of
Bringing enterprise features to consumer costs: The BYOC model means users pay only for storage (~$1-5/month with R2), yet get features that would cost $100+/month with traditional SaaS tools.
The "Memories" feature genuinely feels magical: Being able to ask "show me sunset photos" and have it actually work across thousands of images—without any manual tagging—is incredibly satisfying. Multi-language support makes it accessible globally.
Production-ready streaming: We tested 6-hour continuous streams and they held up. The URL refresh mechanism works seamlessly, and multi-platform broadcasting is stable. This isn't just a hackathon demo—it's actually usable.
Clean architecture: Despite the complexity (AI + streaming + multi-cloud), the codebase is organized and maintainable. The separation between storage providers, AI processing, and streaming makes it easy to extend.
What we learned
FFmpeg is incredibly powerful but temperamental: We learned to parse stderr line-by-line for metrics and errors. Understanding the concat demuxer and tee muxer capabilities opened up possibilities we didn't initially know existed.
MCP is the future of AI tool integration: Writing custom function declarations to let Gemini query our database was a game-changer. It's more reliable than prompt engineering alone and makes the AI's capabilities explicit and controllable.
Database indexing matters at scale: Our first implementation took 2 seconds to search 1000 images. After adding compound indexes on (projectId, aiProcessingStatus, type), it dropped to 200ms. Performance optimization is critical for good UX.
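The compound index described above, expressed as the key specification it would take in the MongoDB Node.js driver (the collection name is illustrative):

```javascript
// Key order matters: queries filtering on projectId alone, or on
// projectId + aiProcessingStatus, can still use this index (the
// index-prefix rule), while the later fields narrow it further.
const imageSearchIndex = { projectId: 1, aiProcessingStatus: 1, type: 1 };

// With a driver connection, roughly:
//   db.collection("images").createIndex(imageSearchIndex);
```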
BYOC is a compelling model: Users love keeping control of their data. Not having to worry about storing user files simplified our architecture and reduced liability. It's win-win.
What's next for StoreUFO
- Video moment search: Extend AI analysis to video content—sample frames at intervals and enable queries like "find when the speaker mentions cloud computing"
- Collaborative features: Allow teams to share projects and collaborate on playlists
- Advanced analytics: Track stream viewership, popular videos, and engagement metrics
- Webhook integrations: Let external services trigger streams or receive notifications on events
Built With
- ai
- amazon-web-services
- cloudflare
- css
- express.js
- ffmpeg
- gemini
- mongodb
- next.js
- node.js
- r2
- react
- redis
- shadcn/ui
- tailwind
- typescript