Inspiration

We've all been there: scrolling Instagram, seeing an amazing dish, and thinking "I want that, but what even is it?" Or walking down the street, hands full, wishing we could just ask for restaurant recommendations instead of typing.

Existing food apps make you type, scroll, and filter. Worse, they forget your allergies and dietary needs every single time. We wanted something more natural, like asking a friend who knows every restaurant in town and remembers you can't eat gluten.

What it does

Yelp Scout is an AI food assistant with three input modes:

  • 📷 Photo Search — Snap any dish, AI identifies it, finds restaurants serving it nearby
  • 🎤 Voice Chat — Speak naturally, get spoken recommendations back
  • ⌨️ Text — Traditional search with smart time-based suggestions

The app auto-learns and remembers:

  • 🍕 Favorite cuisines ("I love sushi" → saved)
  • 🥗 Dietary restrictions ("I'm vegan" → saved)
  • ⚠️ Allergies ("I have a nut allergy" → saved and respected in future recommendations)

No settings page needed. Just chat naturally — we pick up on it.

How I built it

Layer      Tech
Frontend   React 19, TypeScript, Tailwind CSS, Leaflet
Vision AI  OpenAI GPT-4o via InsForge AI SDK
Food Data  Yelp Chat API v2
Voice      Web Speech API + OpenAI TTS
Backend    InsForge (Postgres, Auth, Edge Functions)

The architecture flows like this: User Input (📷/🎤/⌨️) → GPT-4o Vision → Yelp API → Results + Map

Preference detection happens server-side: our edge function scans each message for cuisine keywords and dietary patterns, then auto-updates the user's profile in Postgres.
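As a rough illustration of that keyword scan, here is a minimal TypeScript sketch. Everything in it is an assumption made for the example: the `Preferences` shape, the keyword lists, and the `detectPreferences` function are hypothetical, and the real edge function and Postgres schema may look quite different. It only shows the general idea of matching whole words and requiring a cue phrase before saving anything.

```typescript
// Hypothetical profile shape; the project's real Postgres schema is not shown here.
interface Preferences {
  cuisines: Set<string>;
  dietary: Set<string>;
  allergies: Set<string>;
}

// Illustrative keyword lists, not the project's actual vocabulary.
const CUISINES = ["sushi", "pizza", "thai", "mexican", "indian"];
const DIETS = ["vegan", "vegetarian", "gluten-free", "halal", "kosher"];
const ALLERGENS = ["nut", "peanut", "shellfish", "dairy", "egg", "soy"];

function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Whole-word match with an optional plural "s", so "nut" matches "nuts" but not "peanut".
function hasWord(text: string, word: string): boolean {
  return new RegExp(`\\b${escapeRegExp(word)}s?\\b`, "i").test(text);
}

// Scans one chat message and records any preferences it states. Mutates and returns `prefs`.
function detectPreferences(message: string, prefs: Preferences): Preferences {
  const text = message.toLowerCase();

  // "I love sushi": only save a cuisine when the user states a liking.
  if (/\bi (love|like|enjoy)\b/.test(text)) {
    for (const c of CUISINES) if (hasWord(text, c)) prefs.cuisines.add(c);
  }
  // "I'm vegan" / "I am gluten-free"
  if (/\bi(['’]m| am)\b/.test(text)) {
    for (const d of DIETS) if (hasWord(text, d)) prefs.dietary.add(d);
  }
  // "I have a nut allergy" / "I'm allergic to shellfish"
  if (/\ballerg(y|ies|ic)\b/.test(text)) {
    for (const a of ALLERGENS) if (hasWord(text, a)) prefs.allergies.add(a);
  }
  return prefs;
}
```

Requiring a cue phrase ("I love", "I'm", "allergy") keeps a plain query such as "Where can I find pizza?" from being saved as a favorite; in a real deployment the returned sets would then be written to the user's profile row.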

Challenges I faced

  1. Speech re-triggering: Auto-speak kept replaying when switching tabs. Fixed by tracking all spoken message IDs in a Set instead of just the last one.

  2. Photo → Text pipeline: Getting GPT-4o to return structured JSON consistently required careful prompt engineering.
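Both fixes can be sketched in a few lines of TypeScript. This is an illustration under assumptions, not the app's actual code: `ChatMessage`, `createAutoSpeaker`, `DishGuess`, and `parseDishGuess` are hypothetical names, the `speak` callback stands in for whatever wraps the browser's speech output, and the `dish`/`cuisine` fields are a guess at what the vision prompt might ask for.

```typescript
// Hypothetical message shape for the chat view.
interface ChatMessage {
  id: string;
  text: string;
}

// Returns a function that speaks each message at most once.
// `speak` is injected so the sketch runs outside a browser.
function createAutoSpeaker(speak: (text: string) => void) {
  const spokenIds = new Set<string>(); // every ID ever spoken, not just the last one
  return (messages: ChatMessage[]): void => {
    for (const m of messages) {
      if (spokenIds.has(m.id)) continue; // already spoken: skip on re-render or tab switch
      spokenIds.add(m.id);
      speak(m.text);
    }
  };
}

// Hypothetical structured result requested from the vision model.
interface DishGuess {
  dish: string;
  cuisine: string;
}

// Defensive parse: tolerates extra prose around the JSON object and
// returns null instead of throwing when the reply is not usable.
function parseDishGuess(raw: string): DishGuess | null {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  try {
    const data: unknown = JSON.parse(raw.slice(start, end + 1));
    if (typeof data === "object" && data !== null) {
      const rec = data as Record<string, unknown>;
      if (typeof rec.dish === "string" && typeof rec.cuisine === "string") {
        return { dish: rec.dish, cuisine: rec.cuisine };
      }
    }
  } catch {
    // Malformed JSON falls through to the null return below.
  }
  return null;
}
```

Tracking only the last spoken ID breaks as soon as the message list is re-evaluated in a different order or from the top, which is why a Set of all IDs is the safer check. On the parsing side, validating the shape after `JSON.parse` complements prompt engineering, since the caller can retry or fall back to text search when `null` comes back.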

What I learned

  • Building voice-first UX is harder than it looks: timing, interruptions, and feedback all matter
  • InsForge made backend setup incredibly fast (auth, DB, edge functions in minutes)
  • Multi-modal AI (vision + text + voice) creates a much more natural user experience
  • Auto-detecting preferences from natural language is powerful but requires robust keyword matching

What's next

  • Group dining mode (resolve conflicts: "Find food for a vegan + someone gluten-free")
  • Swipe-to-decide for indecisive moments
  • Reservation booking integration

Built With

  • insforge
  • openai
  • react
  • yelpapi