The inspiration for TalkSense-AI came from witnessing the real-world challenges faced by call center agents who often struggle to deliver quick, accurate responses during live customer interactions. Traditional systems lack real-time AI-driven support, resulting in slower resolutions and inconsistent customer experiences.
We envisioned a solution that could listen, analyze, and assist in real time, turning every conversation into an opportunity for better service. With advances in speech recognition and generative AI, it became feasible to build an intelligent co-pilot for live calls.
What It Does
TalkSense-AI is a comprehensive Live Call Analytics & AI Assistance Platform that empowers agents and supervisors with:
- Real-time Speech-to-Text Transcription using browser microphone input
- AI-Powered Call Assistance providing contextual, in-call suggestions
- Sentiment Analysis to track customer emotions throughout the conversation
- AI Chatbot powered by Bedrock for instant Q&A from your knowledge base
- Call Simulator for realistic training and mock call scenarios
- Data Management System to upload, parse, and analyze call recordings, PDFs, and documents
- Post-Call Analytics with detailed insights like talk time, interruptions, resolution patterns
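As a rough illustration of the post-call metrics listed above, talk time and interruptions can be derived from speaker-labelled transcript segments. The `Segment` type, the speaker labels, and the overlap rule for counting interruptions are assumptions made for this sketch, not the platform's actual data model.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str   # hypothetical labels, e.g. "AGENT" or "CUSTOMER"
    start: float   # seconds from call start
    end: float     # seconds from call start


def talk_time(segments):
    """Total speaking time per speaker, in seconds."""
    totals = {}
    for seg in segments:
        totals[seg.speaker] = totals.get(seg.speaker, 0.0) + (seg.end - seg.start)
    return totals


def count_interruptions(segments):
    """Count segments where a different speaker starts before the previous segment ends."""
    ordered = sorted(segments, key=lambda s: s.start)
    count = 0
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.speaker != prev.speaker and cur.start < prev.end:
            count += 1
    return count


call = [
    Segment("AGENT", 0.0, 4.0),
    Segment("CUSTOMER", 3.5, 9.0),   # starts while the agent is still talking
    Segment("AGENT", 9.5, 12.0),
]
print(talk_time(call))
print(count_interruptions(call))
```

A production version would likely read the segment timings from the transcription results rather than from hand-built objects.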
How We Built It
We designed a full-stack system with a React.js + Vite frontend, a FastAPI backend, and AWS services (Amazon Transcribe, Bedrock, Comprehend, S3, and DynamoDB) for transcription, generative suggestions, sentiment analysis, and storage.
Architecture Overview
Frontend (React + Vite)
- Real-time audio capture via MediaRecorder or Web Audio API
- WebSocket-based transcript updates and agent suggestions
- Dynamic UI updates using React state, Radix UI, and Tailwind CSS
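// Establish WebSocket for live call analytics (opened once on mount, closed on unmount)
const wsRef = useRef(null);
useEffect(() => {
  wsRef.current = new WebSocket('ws://localhost:8000/ws/live-call-analytics');
  return () => wsRef.current?.close();
}, []);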
Backend (FastAPI + AWS)
- Real-time audio ingestion over WebSocket
- Amazon Transcribe for live speech-to-text with Call Analytics metadata
- Bedrock inference to generate contextual suggestions from the knowledge base
- Sentiment analysis using Amazon Comprehend
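# Real-time WebSocket endpoint for streaming transcription
from fastapi import FastAPI, WebSocket
from amazon_transcribe.client import TranscribeStreamingClient

app = FastAPI()

@app.websocket("/ws/live-call-analytics")
async def live_call_analytics(websocket: WebSocket):
    await websocket.accept()  # complete the WebSocket handshake first
    # AWS_REGION is assumed to be defined in the app's configuration
    client = TranscribeStreamingClient(region=AWS_REGION)
    stream = await client.start_call_analytics_stream_transcription(
        language_code="en-US",
        media_sample_rate_hz=16000,  # must match the sample rate of the browser audio
        media_encoding="pcm",
    )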
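The sentiment step mentioned in the backend list could look roughly like the sketch below. It calls Amazon Comprehend's `detect_sentiment` through a client passed in by the caller; the function name `customer_sentiment` and the choice to skip empty utterances are assumptions for illustration, not the project's actual code.

```python
def customer_sentiment(comprehend_client, text, language_code="en"):
    """Return (label, confidence) for one utterance, or None for empty text.

    comprehend_client is expected to be created by the caller, e.g.
    boto3.client("comprehend", region_name=...), so this function stays
    free of credentials and is easy to test with a stand-in client.
    """
    text = text.strip()
    if not text:
        # Assumption for this sketch: empty utterances are skipped, not scored.
        return None
    response = comprehend_client.detect_sentiment(Text=text, LanguageCode=language_code)
    label = response["Sentiment"]            # POSITIVE, NEGATIVE, NEUTRAL or MIXED
    scores = response["SentimentScore"]      # keys: Positive, Negative, Neutral, Mixed
    return label, scores[label.capitalize()]
```

Calling this once per finalized customer utterance, rather than on every partial transcript, would keep the number of Comprehend requests manageable during a live call.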
Key Implementation Features
- WebSocket Integration for low-latency audio and data transfer
- Amazon Transcribe Call Analytics with speaker separation
- Amazon Bedrock (Titan Text) for real-time generative AI responses
- S3-based Knowledge Base for document storage and retrieval
- Sentiment Analysis using Amazon Comprehend
- Post-call Summarization using Bedrock with structured prompts
- Secure Upload System for audio files, PDFs, and more
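To make "structured prompts" for post-call summarization more concrete, here is a minimal sketch that builds a sectioned prompt and sends it to a Titan Text model through the Bedrock runtime `invoke_model` call. The prompt wording, the section names, the generation settings, and the model ID `amazon.titan-text-express-v1` are assumptions; the project does not state which Titan Text variant or template it used.

```python
import json

# Hypothetical template: fixed sections make the model's output easier to parse.
SUMMARY_TEMPLATE = (
    "You are summarizing a customer support call.\n"
    "Transcript:\n{transcript}\n\n"
    "Write the summary using exactly these sections:\n"
    "Issue:\nResolution:\nFollow-up actions:\n"
)


def build_summary_request(transcript_lines, max_tokens=512):
    """Build the JSON request body in the Titan Text format."""
    prompt = SUMMARY_TEMPLATE.format(transcript="\n".join(transcript_lines))
    return json.dumps({
        "inputText": prompt,
        "textGenerationConfig": {
            "maxTokenCount": max_tokens,
            "temperature": 0.2,   # assumed: low temperature for consistent summaries
            "topP": 0.9,
        },
    })


def summarize_call(bedrock_runtime, transcript_lines,
                   model_id="amazon.titan-text-express-v1"):
    """bedrock_runtime is expected to be boto3.client("bedrock-runtime", ...)."""
    response = bedrock_runtime.invoke_model(
        modelId=model_id,
        body=build_summary_request(transcript_lines),
        contentType="application/json",
        accept="application/json",
    )
    payload = json.loads(response["body"].read())
    return payload["results"][0]["outputText"].strip()
```

Passing the client in as an argument keeps the prompt-building logic testable without AWS credentials.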
Challenges We Faced
- Real-time Audio Processing: Achieving low-latency browser-to-AWS streaming with synchronized audio
- WebSocket Stability: Managing concurrent WebSocket sessions for multiple users
- Multi-Service Integration: Coordinating Bedrock, Transcribe, Comprehend, and S3 within the same workflow
- Context Management: Maintaining continuous conversation context for live AI prompts
- CORS Configuration: Ensuring secure cross-origin access for APIs and sockets
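One common way to handle the context-management challenge above is a rolling window of recent turns that is re-rendered into each prompt. The sketch below shows that idea; the class name `ConversationContext`, the default window size, and the "SPEAKER: text" rendering are assumptions for illustration, not the approach the project necessarily took.

```python
from collections import deque


class ConversationContext:
    """Keeps only the most recent turns so live prompts stay short."""

    def __init__(self, max_turns=6):
        # deque with maxlen silently drops the oldest turn when full
        self._turns = deque(maxlen=max_turns)

    def add_turn(self, speaker, text):
        text = text.strip()
        if text:  # ignore empty or whitespace-only transcript fragments
            self._turns.append((speaker, text))

    def as_prompt(self):
        """Render the retained turns as 'SPEAKER: text' lines, oldest first."""
        return "\n".join(f"{speaker}: {text}" for speaker, text in self._turns)


ctx = ConversationContext(max_turns=2)
ctx.add_turn("CUSTOMER", "My router keeps dropping.")
ctx.add_turn("AGENT", "Let me check the line.")
ctx.add_turn("CUSTOMER", "It started yesterday.")
print(ctx.as_prompt())
```

A fixed window trades completeness for latency: older turns are lost, so a longer call may also need a running summary alongside the recent turns.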
What We Learned
- Implementing real-time GenAI workflows in production-grade stacks
- Deep-dive into Amazon Transcribe Call Analytics and speaker separation
- Designing efficient WebSocket-based pipelines
- Crafting effective AI prompts for contextual assistance
- Leveraging cloud-native AWS architectures for scalable AI products
What's Next for TalkSense-AI
- Multi-Language Support for global customer interaction
- Advanced Analytics Dashboard with real-time KPIs and escalation tracking
- CRM Integrations (Salesforce, HubSpot) for data syncing
- Mobile App for agents and managers on the go
- Custom ML Models for industry-specific classification (e.g., healthcare, legal, banking)
Built With
Languages & Frameworks
- Python (FastAPI, Uvicorn)
- JavaScript (React, Vite)
- HTML5/CSS3 (Tailwind CSS)
Cloud Services
- Amazon Web Services (AWS)
- Amazon Transcribe Call Analytics
- Amazon Bedrock (Titan Text)
- Amazon S3 (Storage)
- Amazon Comprehend (Sentiment Analysis)
- Amazon DynamoDB (Conversation Storage)
APIs & Libraries
- WebSocket API
- Amazon Transcribe Streaming SDK
- Boto3 (AWS SDK for Python)
- PyPDF2 (PDF Document Parsing)
- Radix UI (Component Library)
Development Tools
- React Router
- python-dotenv
- Pydantic
- aiohttp
Real-time Technologies
- WebSocket Protocol
- Audio Streaming APIs
- Real-time Transcription
- Live Sentiment Analysis