StoreUFO: Project Story
Inspiration
I was frustrated paying $109/month across my tools: Google Drive ($10) and Restream ($99), while relying on Google Photos for AI search. When I discovered Cloudflare R2 costs just $0.015/GB, I wondered: why can't I add intelligence to raw storage myself?
The challenge: R2 is just object storage—no search, no streaming, no AI. I decided to build a platform that lets users bring their own R2/S3 credentials (BYOC model) while I provide the intelligence layer on top. This keeps their data sovereign while giving them enterprise features at storage-only costs.
What it does
StoreUFO transforms S3-compatible storage into an intelligent media platform:
1. AI Image Search ("Project Memories")
- Upload images to your own R2/S3 bucket
- Background workers analyze images using Gemini Vision API
- Search by natural language: "sunset photos", "photos with cars", "festival pictures"
- Works in any language—the AI translates queries to English keywords before searching
- Bulk analysis trigger: "Analyze All" button for batch processing
2. 24/7 Live Streaming
- Create playlists from videos in your storage
- Stream to YouTube, Twitch, Facebook simultaneously
- Loops continuously for 24/7 broadcasting
- Direct-from-storage: videos never touch our servers
- Custom encoding settings: resolution, bitrate, FPS configuration
- Stream scheduling with timezone support
- Real-time monitoring: FFmpeg logs, FPS, bitrate metrics
3. Multi-Cloud Management
- Single dashboard for R2, S3, MinIO buckets
- Organization → Project → Storage Account hierarchy
- Credential verification before adding storage accounts
- Automatic storage allocation based on available space
- Public/private storage separation
4. File Management
- Direct presigned URL uploads
- Folder upload support (no limits)
- File expiry settings (days, milliseconds, or never)
- Generate public or presigned URLs for sharing
- Automatic file type detection (image/video/document)
- File and folder deletion with storage cleanup
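The expiry options above (days, milliseconds, or never) can be modeled with a small helper. This is a minimal sketch; `resolveExpiry` is a hypothetical name, not StoreUFO's actual API.

```javascript
// Hypothetical helper: turn a user-supplied expiry setting into an
// absolute Date (or null for "never"), mirroring the options above.
function resolveExpiry(setting, now = new Date()) {
  if (!setting || setting.mode === "never") return null;
  const ms =
    setting.mode === "days"
      ? setting.value * 24 * 60 * 60 * 1000
      : setting.value; // mode === "milliseconds"
  return new Date(now.getTime() + ms);
}

// Example: a file that expires in 7 days
const expiresAt = resolveExpiry({ mode: "days", value: 7 });
```

A background job can then compare `expiresAt` against the current time when deciding which files to clean up.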
5. API-Based File Access
- RESTful APIs for programmatic file management
- Integrate storage into your apps and websites
- Generate shareable URLs (public or time-limited presigned)
- Full CRUD operations: upload, list, retrieve, delete
- JWT authentication for secure API access
How we built it
Technology Stack:
- Frontend: Next.js 16 with React 19, TanStack Query for server state, shadcn/ui components, Tailwind CSS
- Backend: Node.js with Express, RESTful API architecture
- AI Engine: Google Gemini 2.5 Flash for vision and natural language processing
- Streaming: FFmpeg for video transcoding and RTMP broadcasting
- Storage Integration: AWS SDK v3 (compatible with R2, S3, MinIO)
- Database: MongoDB for metadata, Redis for job queues and caching
- Background Processing: Bull queues for async tasks (AI analysis, stream management)
Key Implementation Highlights:
BYOC Architecture
- Users provide their own storage credentials (R2/S3/MinIO)
- Platform never stores user files—only manages metadata
- Direct client-to-storage uploads via presigned URLs
AI Analysis Pipeline
- Background workers trigger Gemini Vision API on image upload
- Extracts objects, scenes, and contextual descriptions
- Results indexed in MongoDB for fast natural language search
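The search step of the pipeline above can be sketched as a keyword match over the indexed analysis results. `searchImages` and the `aiKeywords` field are illustrative names, and production code would run this as a MongoDB query rather than an in-memory filter.

```javascript
// Sketch: each analyzed image stores the keywords extracted by
// Gemini Vision; a query (already translated to English) matches
// against them. Names are illustrative, not StoreUFO's schema.
function searchImages(images, query) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return images.filter((img) =>
    terms.some((t) => img.aiKeywords.some((k) => k.includes(t)))
  );
}

const images = [
  { key: "beach.jpg", aiKeywords: ["sunset", "ocean", "sky"] },
  { key: "car.jpg", aiKeywords: ["car", "street"] },
];
// searchImages(images, "sunset photos") matches only beach.jpg
```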
MCP Tools Integration
- Custom function declarations bridge Gemini AI with MongoDB
- Three tools: `search_images`, `get_memories_by_date`, `get_project_stats`
- AI translates natural language queries into precise database operations
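A function declaration for the `search_images` tool might look like the sketch below. The parameter schema and descriptions are illustrative (the exact field casing depends on the Gemini SDK version in use); only the tool name comes from the project itself.

```javascript
// Sketch of a Gemini function declaration for the search_images tool.
// Declaring parameters explicitly is what lets the model emit precise,
// validated database operations instead of free-form guesses.
const searchImagesTool = {
  name: "search_images",
  description:
    "Search a project's analyzed images by English keywords. " +
    "Only images that actually exist in the database are returned.",
  parameters: {
    type: "object",
    properties: {
      projectId: { type: "string", description: "Project to search in" },
      keywords: {
        type: "array",
        items: { type: "string" },
        description: "English keywords translated from the user query",
      },
    },
    required: ["projectId", "keywords"],
  },
};
```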
FFmpeg Streaming Engine
- Custom process manager spawns FFmpeg with playlist files
- Uses the `concat` demuxer for seamless video transitions
- The `tee` muxer enables simultaneous multi-platform broadcasting
- Redis jobs handle presigned URL refresh every 10 hours
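Putting those pieces together, the FFmpeg invocation can be sketched as an argument builder. The flags follow standard FFmpeg usage for the `concat` demuxer and `tee` muxer; `buildStreamArgs` is a hypothetical helper, not StoreUFO's actual process manager, and real encoding settings (bitrate, FPS) would be added per the user's configuration.

```javascript
// Sketch: loop a concat playlist of presigned HTTPS URLs and fan the
// single encoded output out to several RTMP endpoints via the tee muxer.
function buildStreamArgs(playlistPath, rtmpUrls) {
  const tee = rtmpUrls.map((u) => `[f=flv]${u}`).join("|");
  return [
    "-re",                        // read input at native frame rate
    "-f", "concat",               // concat demuxer: seamless transitions
    "-safe", "0",
    "-protocol_whitelist", "file,http,https,tcp,tls", // allow presigned URLs
    "-stream_loop", "-1",         // loop forever for 24/7 broadcasting
    "-i", playlistPath,
    "-c:v", "libx264",
    "-c:a", "aac",
    "-map", "0",
    "-f", "tee", tee,             // one encode, many destinations
  ];
}
```

The resulting array would be passed to `child_process.spawn("ffmpeg", args)`, with stderr parsed for the FPS/bitrate metrics mentioned above.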
Multi-Cloud Abstraction
- Single service layer abstracts provider differences (R2 vs S3 endpoints)
- Runtime provider detection based on credential format
- Automatic storage allocation based on available space
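Runtime provider detection can be sketched from the endpoint URL alone. The hostname patterns below are the publicly documented ones for R2 and S3, but the helper name and the MinIO fallback are illustrative assumptions.

```javascript
// Sketch: infer the storage provider from the credential's endpoint.
function detectProvider(endpoint) {
  if (!endpoint) return "s3"; // no custom endpoint -> assume AWS S3
  const host = new URL(endpoint).hostname;
  if (host.endsWith("r2.cloudflarestorage.com")) return "r2";
  if (host.endsWith("amazonaws.com")) return "s3";
  return "minio"; // any other S3-compatible endpoint (e.g. self-hosted)
}
```

The service layer can then branch on the result only where providers genuinely differ, keeping the rest of the code provider-agnostic.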
Challenges we ran into
1. Making Static Storage "Stream-able": FFmpeg expects file paths, but we're working with authenticated cloud storage. We had to generate time-limited presigned URLs and figure out how to refresh them mid-stream without dropping the broadcast. The solution involved background Redis jobs and dynamic playlist updates—tricky to get right without interrupting the live feed.
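The refresh cycle for this can be expressed as a repeating queue job. This is a sketch assuming Bull's `repeat.every` option; the queue and job names are hypothetical, and the 10-hour cadence only works because the URLs are signed for longer than that.

```javascript
// Sketch: schedule presigned-URL refresh as a repeating Bull job.
const TEN_HOURS_MS = 10 * 60 * 60 * 1000;

const refreshJob = {
  name: "refresh-presigned-urls",       // illustrative job name
  data: { streamId: "example-stream" }, // which stream's playlist to rewrite
  opts: { repeat: { every: TEN_HOURS_MS }, removeOnComplete: true },
};
// With a Bull queue instance:
//   queue.add(refreshJob.name, refreshJob.data, refreshJob.opts)
// The worker re-signs each URL and updates the playlist file in place,
// so FFmpeg picks up fresh URLs without the broadcast restarting.
```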
2. AI Hallucinations: During development, the AI would sometimes "invent" files that didn't exist when users searched. We solved this by implementing MCP tools that strictly validate results—the AI can only return what MongoDB actually finds. This was a critical lesson in grounding LLMs with real data.
3. Scope Limitations: We wanted AI search for videos too, but video analysis is computationally expensive. We had to make the hard decision to limit the Memories feature to images only for this version. Managing user expectations around this limitation was important.
4. Multi-Platform Streaming Complexity: Each streaming platform (YouTube, Twitch, Facebook) has different RTMP requirements. YouTube wants 2-second keyframe intervals, Twitch prefers 4-second. We built platform-specific validation to ensure streams work correctly before users go live.
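The keyframe-interval validation can be sketched as a small lookup: FFmpeg's `-g` flag sets the GOP size in frames, so a per-platform interval in seconds is multiplied by the stream's FPS. Only the YouTube and Twitch values come from the text above; the helper itself is illustrative.

```javascript
// Sketch: platform-specific GOP sizing for the -g flag.
const KEYFRAME_SECONDS = { youtube: 2, twitch: 4 };

function gopSize(platform, fps) {
  const seconds = KEYFRAME_SECONDS[platform];
  if (!seconds) throw new Error(`Unknown platform: ${platform}`);
  return seconds * fps; // e.g. pass as ["-g", String(gopSize("youtube", 30))]
}
```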
Accomplishments that we're proud of
Bringing enterprise features to consumer costs: The BYOC model means users pay only for storage (~$1-5/month with R2), yet get features that would cost $100+/month with traditional SaaS tools.
The "Memories" feature genuinely feels magical: Being able to ask "show me sunset photos" and have it actually work across thousands of images—without any manual tagging—is incredibly satisfying. Multi-language support makes it accessible globally.
Production-ready streaming: We tested 6-hour continuous streams and they held up. The URL refresh mechanism works seamlessly, and multi-platform broadcasting is stable. This isn't just a hackathon demo—it's actually usable.
Clean architecture: Despite the complexity (AI + streaming + multi-cloud), the codebase is organized and maintainable. The separation between storage providers, AI processing, and streaming makes it easy to extend.
What we learned
FFmpeg is incredibly powerful but temperamental: We learned to parse stderr line-by-line for metrics and errors. Understanding the concat demuxer and tee muxer capabilities opened up possibilities we didn't initially know existed.
MCP is the future of AI tool integration: Writing custom function declarations to let Gemini query our database was a game-changer. It's more reliable than prompt engineering alone and makes the AI's capabilities explicit and controllable.
Database indexing matters at scale: Our first implementation took 2 seconds to search 1000 images. After adding compound indexes on (projectId, aiProcessingStatus, type), it dropped to 200ms. Performance optimization is critical for good UX.
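The compound index described above, expressed as the key specification it would take in the MongoDB Node.js driver (the collection name is illustrative):

```javascript
// Key order matters: queries filtering on projectId alone, or on
// projectId + aiProcessingStatus, can still use this index (the
// index-prefix rule), while the later fields narrow it further.
const imageSearchIndex = { projectId: 1, aiProcessingStatus: 1, type: 1 };

// With a driver connection, roughly:
//   db.collection("images").createIndex(imageSearchIndex);
```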
BYOC is a compelling model: Users love keeping control of their data. Not having to worry about storing user files simplified our architecture and reduced liability. It's win-win.
What's next for StoreUFO
- Video moment search: Extend AI analysis to video content—sample frames at intervals and enable queries like "find when the speaker mentions cloud computing"
- Collaborative features: Allow teams to share projects and collaborate on playlists
- Advanced analytics: Track stream viewership, popular videos, and engagement metrics
- Webhook integrations: Let external services trigger streams or receive notifications on events
Built With
- ai
- amazon-web-services
- cloudflare
- css
- express.js
- ffmpeg
- gemini
- mongodb
- next.js
- node.js
- r2
- react
- redis
- shadcn/ui
- tailwind
- typescript