Inspiration

Having come into this hackathon wanting to build a solid foundation in RAG and agentic workflows, we firmly believed that competing in T-Mobile's challenge would more than fulfill our desire to learn. We saw an opportunity to tackle a real problem: customer sentiment is scattered across the internet, but there's no single place to understand what customers actually feel. From Reddit threads to review sites, each source tells part of the story, but piecing them together manually is impossible at scale. We wanted to build something that could listen to thousands of voices and turn them into actionable insights, using cutting-edge AI to bridge the gap between raw data and real understanding.

What it does

T-Time is an AI-powered command center for customer experience teams. We continuously scan the web for T-Mobile-related conversations, over 40,000 vectors (and growing), to turn raw sentiment into live, actionable insight. With simple natural language, teams can instantly surface metrics like customer satisfaction, response times, churn risk, and regional sentiment: no SQL, no dashboards, just questions asked in plain English and answered with real customer data.

Core Features:

  • 💬 RAG-Powered Chat: Ask questions like "What are customers saying about 5G coverage in California?" and get answers grounded in actual customer feedback, not hallucinated responses
  • 📊 AI-Generated Metrics Dashboard: Real-time happiness index, sentiment trends, and period-over-period comparisons all generated by AI analyzing thousands of data points
  • 🗺️ Interactive Global Sentiment Map: Click anywhere in the world to see location-specific sentiment, with AI extracting geographic insights from unstructured text
  • 🔄 Multi-Source Data Unification: Three platforms, one unified view

How we built it

The Stack:

  • Frontend: Next.js + TypeScript + TailwindCSS + tRPC for type-safe API calls
  • Backend API: Node.js with tRPC routing, orchestrating embedding generation, semantic search, and LLM inference
  • Embedding Service: FastAPI + Sentence-Transformers (e5-base-v2, 768-dim) on a VPS running cron jobs on our scraper scripts
  • LLM Service: FastAPI wrapper around Ollama with Nemotron-mini, running on an NVIDIA A100 (80GiB) GPU with 16 CPUs and 160GiB RAM (a minimal sketch of this wrapper follows the list)
  • Vector Database: Pinecone storing 40,000+ vectors for semantic search
  • Data Collection: Python scrapers with PRAW (Reddit API), Consumer Affairs JSON processing, and BeautifulSoup4 for web scraping
  • Authentication: Better Auth with GitHub OAuth + email/password
  • Database: PostgreSQL + Drizzle ORM
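
To give a sense of the LLM service mentioned above, here is a minimal sketch of a FastAPI wrapper around Ollama's documented `/api/generate` endpoint. The `/chat` route name and request model are our own illustrative choices, not part of Ollama:

```python
# Minimal sketch of a FastAPI wrapper around Ollama's HTTP API.
import httpx
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's documented endpoint

class ChatRequest(BaseModel):
    prompt: str

@app.post("/chat")  # route name is ours, not part of Ollama
async def chat(req: ChatRequest):
    async with httpx.AsyncClient(timeout=120) as client:
        resp = await client.post(OLLAMA_URL, json={
            "model": "nemotron-mini",
            "prompt": req.prompt,
            "stream": False,  # return one JSON object instead of a stream
        })
        resp.raise_for_status()
        return {"response": resp.json()["response"]}
```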

The Architecture:

  1. Data Collection Layer: Python scrapers continuously pull data from Reddit (r/tmobile) and review sites. Each piece of feedback gets sentiment-analyzed using CardiffNLP Twitter-RoBERTa (3-class sentiment), embedded with e5-base-v2 (768 dimensions), and deduplicated via MD5 hashing before being upserted to Pinecone.
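
In code, the dedupe-and-upsert step looks roughly like this (a sketch: the field names and the `index` handle are illustrative):

```python
# Sketch of the dedupe-and-upsert step of the ingest pipeline.
import hashlib

BATCH_SIZE = 100  # Pinecone upserts go out 100 vectors at a time

def feedback_id(text: str) -> str:
    # MD5 of the normalized text doubles as the vector ID, so a
    # re-scraped duplicate overwrites itself instead of piling up.
    return hashlib.md5(text.strip().lower().encode("utf-8")).hexdigest()

def upsert_feedback(index, records: list[dict]) -> None:
    # Each record carries 'text', a 768-dim 'embedding', and 'metadata'.
    vectors = [(feedback_id(r["text"]), r["embedding"], r["metadata"])
               for r in records]
    for i in range(0, len(vectors), BATCH_SIZE):
        index.upsert(vectors=vectors[i:i + BATCH_SIZE])
```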

  2. RAG Pipeline: When a user asks a question, we:

    • Generate an embedding for their query using our FastAPI embedding service
    • Perform semantic search in Pinecone to retrieve the top-20 most relevant pieces of customer feedback
    • Inject this context + conversation history (last 10 messages) into the LLM
    • Get a response grounded in actual customer data, not training data
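
Condensed into code, the query path looks roughly like this (a sketch: the service URLs, Pinecone index handle, and prompt wording are illustrative, and `call_llm` is the fallback helper sketched under the next item; e5 models expect search queries prefixed with `query: `):

```python
# Rough sketch of the RAG query path (URLs and names illustrative).
import httpx

def answer(question: str, history: list[dict], index) -> str:
    # 1. Embed the query via the FastAPI embedding service
    #    (e5 models expect search queries prefixed with "query: ").
    emb = httpx.post(
        "http://embed-service:8000/embed",  # illustrative URL
        json={"text": f"query: {question}"},
    ).json()["embedding"]

    # 2. Semantic search: top-20 most relevant feedback vectors.
    res = index.query(vector=emb, top_k=20, include_metadata=True)
    context = "\n".join(m["metadata"]["text"] for m in res["matches"])

    # 3. Sliding window: only the last 10 messages ride along.
    recent = history[-10:]

    # 4. Ground the model in retrieved feedback, not training data.
    prompt = (
        "Answer using ONLY the customer feedback below.\n\n"
        f"Feedback:\n{context}\n\nHistory: {recent}\n\nQuestion: {question}"
    )
    return call_llm(prompt)  # GPU-first with Gemini fallback (next item)
```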
  3. Dual LLM Strategy: We run Nemotron-mini on a remote A100 GPU for 3-5x faster metric inference compared to cloud APIs. If the GPU fails, we automatically fall back to the Gemini API, with exponential backoff retry logic ensuring maximum uptime. Gemini is great for conversation, but we can scale it back a little when it comes to crunching the numbers.
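
A sketch of that fallback logic, assuming an illustrative self-hosted endpoint and the `google-generativeai` client (the exact Gemini model name here is a stand-in):

```python
# Sketch of the GPU-first, Gemini-fallback strategy (helper names ours).
import time
import httpx
import google.generativeai as genai

def call_llm(prompt: str) -> str:
    # Try the self-hosted Nemotron-mini first: 3 attempts,
    # with exponential backoff (1s, 2s, 4s) between them.
    for attempt in range(3):
        try:
            resp = httpx.post("http://gpu-a100:8001/chat",  # illustrative URL
                              json={"prompt": prompt}, timeout=60)
            resp.raise_for_status()
            return resp.json()["response"]
        except httpx.HTTPError:
            time.sleep(2 ** attempt)

    # GPU unreachable: fall back to the Gemini API.
    model = genai.GenerativeModel("gemini-1.5-flash")  # model name illustrative
    return model.generate_content(prompt).text
```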

  4. Location Intelligence: Our metrics dashboard uses Gemini AI to extract location names from unstructured customer feedback, geocode them, and display sentiment on an interactive world map. Click New York, see what New Yorkers are saying. Click Texas, see Texas sentiment. All in real time.

  5. Email System: An email notification system integrated into the codebase alerts teams when sentiment shifts significantly, helping them stay proactive rather than reactive.

Challenges we ran into

1. Keeping hallucinations and variations to a minimum. LLMs love to make things up. Without RAG, asking "What are customers saying about 5G?" would result in generic training data regurgitation. We solved this by:

  • Implementing semantic search to retrieve real customer feedback
  • Injecting top-20 relevant vectors into every LLM prompt
  • Building conversation memory (last 10 messages) to maintain context without token bloat
  • Adding fallback mechanisms: if embeddings fail, we use keyword-based sentiment matching

2. Having the app work at a manageable speed. With 40K+ vectors, multi-service orchestration (frontend → tRPC → embedding service → Pinecone → LLM), and GPU inference, latency was a beast. We tackled it with:

  • Cloud GPU inference (Ollama Nemotron-mini) for 3-5x speedup vs larger models
  • Singleton pattern for ML models (load once, reuse forever)
  • Multi-stage Docker builds to pre-download embedding models, reducing cold start from 120s → 30s
  • Exponential backoff retry logic with 3 attempts per request
  • Batch processing for Pinecone upserts (100 vectors at a time)
  • Health checks every 30s on the embedding service to catch failures early
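
The singleton trick is small but effective; here is a minimal sketch for the embedding service (the model ID is the public `intfloat/e5-base-v2` checkpoint, and the module-level cache is the whole point):

```python
# Load-once singleton for the embedding model: the first request
# pays the load cost; every later request reuses the same instance.
from functools import lru_cache
from sentence_transformers import SentenceTransformer

@lru_cache(maxsize=1)
def get_model() -> SentenceTransformer:
    return SentenceTransformer("intfloat/e5-base-v2")  # 768-dim

def embed(text: str) -> list[float]:
    return get_model().encode(text, normalize_embeddings=True).tolist()
```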

3. Multi-source data chaos. Reddit returns nested JSON with t3_ prefixes. Threads uses the Meta Graph API with different field names. Consumer Affairs nests author objects inside other objects. We:

  • Built a unified metadata schema with 15-20 standardized fields
  • Created platform-specific extensions (Consumer Affairs adds rating, location, verified_purchase)
  • Implemented MD5-based deduplication to prevent duplicate vectors
  • Wrote robust error handling to skip malformed data without halting execution
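
To give a flavor of that normalization layer, here is a simplified sketch (the field set is trimmed down from the real 15-20-field schema, and the source field names are illustrative):

```python
# Simplified normalizers: every platform collapses to one schema.
def normalize_reddit(post: dict) -> dict:
    return {
        "id": post["name"].removeprefix("t3_"),  # strip Reddit's t3_ prefix
        "platform": "reddit",
        "text": post.get("selftext") or post.get("title", ""),
        "author": post.get("author", "unknown"),
        "created_at": post.get("created_utc"),
    }

def normalize_consumer_affairs(review: dict) -> dict:
    author = review.get("author") or {}  # author object nested inside the review
    return {
        "id": str(review["id"]),
        "platform": "consumer_affairs",
        "text": review.get("text", ""),
        "author": author.get("name", "unknown"),
        "created_at": review.get("date"),
        # platform-specific extensions
        "rating": review.get("rating"),
        "location": author.get("location"),
        "verified_purchase": review.get("verified_purchase", False),
    }
```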

4. Sentiment model bias. Our initial sentiment model (distilbert-base-uncased-finetuned-sst-2-english) was trained on movie reviews, leading to a heavy negative bias on customer service language. We upgraded to CardiffNLP Twitter-RoBERTa, which supports 3-class sentiment (positive/neutral/negative) and was trained on social media data, a much better fit for our use case.
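
With the `transformers` pipeline, swapping in the CardiffNLP model takes a few lines (the `-latest` checkpoint returns human-readable labels; the example score is illustrative):

```python
# 3-class sentiment with CardiffNLP's Twitter-RoBERTa via transformers.
from transformers import pipeline

sentiment = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-roberta-base-sentiment-latest",
)

print(sentiment("The 5G coverage in my area has been great lately"))
# e.g. [{'label': 'positive', 'score': 0.98}]
```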

5. Location extraction from text. Customers don't say "I'm at coordinates [-74.006, 40.7128]"; they say "I'm in New York" or "Coverage sucks in Austin, TX". We solved this with:

  • Gemini AI parsing of Pinecone results to extract location names
  • Location validation against Consumer Affairs metadata (city, state, country)
  • Geocoding pipeline to convert text → [longitude, latitude] arrays
  • Interactive map visualization with click-to-explore sentiment details
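
A rough sketch of that text-to-coordinates path; the Gemini prompt and the use of geopy's Nominatim geocoder here are illustrative stand-ins, not our exact pipeline:

```python
# Sketch of the text -> [longitude, latitude] pipeline (stand-in code).
import json
import google.generativeai as genai
from geopy.geocoders import Nominatim

def extract_locations(feedback: str) -> list[str]:
    # Ask Gemini for place names as bare JSON (assumes no markdown fences).
    model = genai.GenerativeModel("gemini-1.5-flash")
    resp = model.generate_content(
        "List the place names mentioned in this customer feedback as a "
        f"JSON array of strings (or [] if none):\n{feedback}"
    )
    return json.loads(resp.text)

def geocode(place: str) -> list[float] | None:
    # Text -> [longitude, latitude], matching GeoJSON ordering.
    loc = Nominatim(user_agent="t-time-demo").geocode(place)
    return [loc.longitude, loc.latitude] if loc else None

print(geocode("Austin, TX"))  # e.g. [-97.74, 30.27]
```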

Accomplishments that we're proud of

  • 🚀 40,000+ customer vectors indexed and searchable - a production-scale dataset
  • ⚡ 2-5 second end-to-end query latency - despite multi-service orchestration
  • 🎯 99.9% uptime with dual LLM fallback - GPU-first, cloud-backed reliability
  • 🌍 Real-time location-based sentiment extraction - AI turning text into coordinates
  • 🧠 RAG pipeline with conversation memory - grounded responses, not hallucinations
  • 🔄 Multi-source data unification - three platforms, one schema, zero duplicates
  • 🛠️ Production-grade architecture - Docker, health checks, retries, logging, auth
  • 💡 Multi-agent integration - 5 tools for Google ADK agents to autonomously analyze sentiment and ship insights to users

We didn't just build a hackathon demo; we built a data analysis machine that you can play with right now!

What we learned

RAG is non-negotiable for real-world AI applications. Without semantic search grounding LLM responses in actual data, we'd just be building a fancy chatbot that makes stuff up.

Cloud GPU inference is a game-changer. Running Nemotron-mini on an A100 gave us 3-5x faster inference than larger models, deep customization, and full data privacy.

Unified data schemas are hard but worth it. Normalizing Reddit, Threads, and Consumer Affairs data into a single metadata format took time, but it unlocked cross-platform queries and made our RAG pipeline possible.

Multi-stage Docker builds save cold start time. Pre-downloading our 600MB embedding model during the build reduced startup from 120s → 30s.

Type safety prevents bugs at scale. TypeScript + Zod validation + tRPC caught errors before they hit production.

Conversation memory management is an art. Keeping the last 10 messages (sliding window) preserved context while preventing token bloat.

AI can extract structured data from unstructured text. Using Gemini to parse location names from customer feedback and geocode them unlocked our global sentiment map feature.

What's next for T-Time

  • 🔮 Real-time streaming updates: WebSocket integration for live sentiment changes as new data arrives
  • 📈 Historical trend analysis: Track sentiment shifts over weeks/months to identify patterns
  • 🤖 Autonomous agent workflows: Expand MCP integration to let AI agents autonomously investigate sentiment anomalies
  • 🌐 Multi-language support: Expand beyond English to analyze global T-Mobile sentiment
  • 🎯 Advanced NLP topic extraction: Replace basic keyword frequency with transformer-based topic modeling
  • ⚖️ A/B testing framework: Compare different prompt strategies and model configurations
  • 🔐 Enterprise security: Role-based access control, audit logs, SOC 2 compliance
  • 📊 Custom metric creation: Let teams define their own happiness metrics using natural language

T-Time started as a hackathon project, but it's architected to scale. With 40,000+ vectors today, we're ready for millions tomorrow.


Built with ❤️ and too much tiramisu at HackUTD
