StyleAssist: AI Closet Organizer & Virtual Stylist Hackathon Submission (MVP)
Inspiration
The primary inspiration for StyleAssist came from addressing the widespread problem of decision fatigue: the paradox of a closet full of clothes and nothing to wear. We also sought to solve a fundamental user problem, namely that existing AI stylists are widely distrusted and perceived as "horrible" or "dumb as hell". Our core goal is to deliver the confidence to wear the perfect outfit, validated by AI you can see and trust.
What it does
StyleAssist is an AI-native wardrobe management application that uses an advanced technology stack to streamline organization and style validation:
- 10x Cataloging: We use a multimodal LLM (Claude API) to instantly generate all metadata (Category, Color, Style, Fabric, Occasion) from a photo, which eliminates the manual, time-consuming effort that causes user churn in competitor apps.
- Contextual Styling: An AI stylist suggests complete outfits based on user context (e.g., occasion) by leveraging the rich, nuanced LLM metadata.
- Virtual Try-On (VTO) Validation: The app generates a photorealistic image of the suggested outfit on the user's own body using generative AI (Nano Banana/Gemini API). This VTO acts as the crucial "trust layer" that validates the AI's suggestions.
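In practice, the cataloging step returns structured metadata that the app must validate before storing an item. A minimal sketch of that validation, assuming the model is prompted to reply with a flat JSON object (field names illustrative, matching the metadata listed above):

```python
import json

# The fields we prompt the vision model to return for each garment photo.
REQUIRED_FIELDS = {"category", "color", "style", "fabric", "occasion"}

def parse_item_metadata(raw: str) -> dict:
    """Parse and validate the JSON metadata returned by the vision model."""
    data = json.loads(raw)
    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"model response missing fields: {sorted(missing)}")
    return data

# Illustrative model output for a photo of a navy shirt:
sample = (
    '{"category": "top", "color": "navy", "style": "casual",'
    ' "fabric": "cotton", "occasion": "everyday"}'
)
item = parse_item_metadata(sample)
```

Validating up front keeps malformed model output from ever reaching the wardrobe database.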
How we built it
StyleAssist was built using a robust, cloud-native architecture hosted on Daytona, aggregating several advanced services into a seamless experience:
- Frontend: The client application is an iOS App (Swift/SwiftUI) that handles camera capture and local data storage.
- Core Backend: A Python/FastAPI application running on Daytona handles all heavy AI processing and API aggregation.
- AI Services:
  - Claude API for complex clothing analysis and metadata generation.
  - Nano Banana (Gemini API) for the photorealistic Virtual Try-On inference.
  - ElevenLabs for converting outfit suggestions into natural voice output.
- Infrastructure & Observability:
  - Galileo traces and monitors LLM performance in real time.
  - Supabase serves as the current database.
  - Tigris handles image storage.
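At a high level, the backend aggregates these services into a single pipeline per request. A simplified sketch of that orchestration, with every external call stubbed out (function names are illustrative, not the real API surface of Claude, Gemini, or ElevenLabs):

```python
# Stubs standing in for the real service calls made by the FastAPI backend.
def analyze_clothing(photo: bytes) -> dict:
    return {"category": "top", "color": "navy"}  # Claude vision analysis

def suggest_outfit(metadata: dict, occasion: str) -> dict:
    summary = (
        f"Pair the {metadata['color']} {metadata['category']}"
        f" with chinos for {occasion}."
    )
    return {"summary": summary}  # stylist prompt against the LLM

def render_try_on(photo: bytes, outfit: dict) -> bytes:
    return b"<png bytes>"  # Nano Banana / Gemini VTO inference

def synthesize_voice(text: str) -> bytes:
    return b"<mp3 bytes>"  # ElevenLabs text-to-speech

def style_request(photo: bytes, occasion: str) -> dict:
    """One end-to-end request: catalog -> suggest -> try on -> speak."""
    metadata = analyze_clothing(photo)
    outfit = suggest_outfit(metadata, occasion)
    return {
        "metadata": metadata,
        "outfit": outfit,
        "vto_image": render_try_on(photo, outfit),
        "audio": synthesize_voice(outfit["summary"]),
    }

result = style_request(b"...", "dinner")
```

In the real backend each stub becomes an async HTTP call inside a FastAPI route, with Galileo tracing wrapped around each model invocation.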
Challenges we ran into
The development required overcoming several integration and processing hurdles:
- Different Integrations: Orchestrating multiple disparate, cutting-edge AI services (LLMs, vision models, TTS) within the FastAPI backend was technically complex.
- Processing Images: We faced challenges in correctly handling the inputs and outputs for the image-based models, including encoding/decoding and ensuring the VTO model produced realistic results.
- Bulk Image Parallelism: Processing user-uploaded bulk images, especially the high-compute VTO feature, requires further infrastructure changes to achieve true parallel processing without bottlenecks.
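One way to tame the bulk-upload bottleneck is a semaphore-limited fan-out, so that only a fixed number of expensive VTO or tagging calls run concurrently. A sketch of the pattern, with the remote call replaced by a short sleep:

```python
import asyncio

async def process_image(name: str) -> str:
    """Stand-in for one expensive VTO/tagging API call."""
    await asyncio.sleep(0.01)
    return f"{name}:done"

async def process_batch(names: list[str], limit: int = 4) -> list[str]:
    """Process a bulk upload with at most `limit` calls in flight."""
    sem = asyncio.Semaphore(limit)

    async def bounded(name: str) -> str:
        async with sem:
            return await process_image(name)

    # gather preserves input order even though calls finish out of order.
    return await asyncio.gather(*(bounded(n) for n in names))

results = asyncio.run(process_batch([f"img{i}" for i in range(8)]))
```

Tuning `limit` trades throughput against per-call rate limits and GPU cost on the inference side.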
Accomplishments that we're proud of
We are most proud of:
- Working Prototype: Being able to achieve a fully functional, end-to-end working prototype that integrates multiple complex AI services in such a short span of time.
- Frictionless Cataloging: Implementing the 10x LLM-tagging advantage, which solves the market's primary onboarding barrier by eliminating manual data entry.
- Core Feature Validation: Successfully integrating the photorealistic VTO as the "trust layer" to validate the AI's creative suggestions, which is the app's key differentiator.
What we learned
The project provided several key insights:
- User Demand & Problem: There is clear user demand for a product that finally solves the decision fatigue problem, making this a high-impact space to build in.
- Cost Optimization: Cost optimization for the expensive generative VTO feature is critical to the long-term viability of the business.
- Data as the Moat: The sustainable competitive advantage relies on building a proprietary "Style-LLM" by capturing user feedback on taste and personality.
- Database Shift: We settled on Supabase as the current database during development.
What's next for StyleAssist
The immediate next steps will focus on enhancing user interaction and core functionality:
- Voice Agent Development: Implement a conversational voice agent that uses the ElevenLabs model to talk to the user and respond to natural language queries like, "What should I wear today?" The agent will use contextual data (weather, occasion) and outfit data to provide suggestions and a Virtual Try-On image. The agent will also handle follow-up questions.
- Enhanced Search Feature: Implement a robust search capability to easily allow the user to search for and find specific outfits or clothing items within their closet.
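A first cut of the planned voice agent could route a free-text query through a context lookup and return both a text reply and synthesized audio. A sketch with the weather and TTS calls stubbed out (all names hypothetical):

```python
def get_weather() -> str:
    return "rainy"  # stub for a weather API lookup

def speak(text: str) -> bytes:
    return text.encode()  # stub for the ElevenLabs TTS call

def answer_wear_query(query: str) -> dict:
    """Handle queries like 'What should I wear today?' using context."""
    weather = get_weather()
    suggestion = (
        "A waterproof jacket over your navy top"
        if weather == "rainy"
        else "Your linen shirt and light chinos"
    )
    reply = f"It's {weather} today, so try: {suggestion.lower()}."
    return {"text": reply, "audio": speak(reply)}

response = answer_wear_query("What should I wear today?")
```

Follow-up questions would carry the previous suggestion as conversation state, and the same reply text would drive the VTO render.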