TalkSense AI | Devpost

Train the Knowledge Base for Real-Time Assistance
Context-Aware Chatbot Trained on the Knowledge Base
Real-Time Call Assistance Panel

The inspiration for TalkSense-AI came from witnessing the real-world challenges faced by call center agents who often struggle to deliver quick, accurate responses during live customer interactions. Traditional systems lack real-time AI-driven support, resulting in slower resolutions and inconsistent customer experiences.

We envisioned a solution that could listen, analyze, and assist in real time, turning every conversation into an opportunity for better service. With advancements in speech recognition and generative AI, we realized it was finally possible to create an intelligent co-pilot for live calls.

What It Does

TalkSense-AI is a comprehensive Live Call Analytics & AI Assistance Platform that empowers agents and supervisors with:

Real-time Speech-to-Text Transcription using browser microphone input
AI-Powered Call Assistance providing contextual, in-call suggestions
Sentiment Analysis to track customer emotions throughout the conversation
AI Chatbot powered by Bedrock for instant Q&A from your knowledge base
Call Simulator for realistic training and mock call scenarios
Data Management System to upload, parse, and analyze call recordings, PDFs, and documents
Post-Call Analytics with detailed insights such as talk time, interruptions, and resolution patterns
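
As a rough illustration of the post-call metrics listed above, the sketch below computes per-speaker talk time and counts interruptions from timestamped transcript segments. The `Segment` type, its field names, and the definition of an interruption (a segment that starts before the previous segment, from a different speaker, has ended) are assumptions made for this example, not the project's actual schema.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed utterance (hypothetical shape, times in seconds)."""
    speaker: str
    start: float
    end: float


def talk_time(segments: list[Segment]) -> dict[str, float]:
    """Sum the spoken duration per speaker."""
    totals: dict[str, float] = {}
    for seg in segments:
        totals[seg.speaker] = totals.get(seg.speaker, 0.0) + (seg.end - seg.start)
    return totals


def count_interruptions(segments: list[Segment]) -> int:
    """Count segments that start before the previous segment, from a different speaker, has ended."""
    ordered = sorted(segments, key=lambda s: s.start)
    interruptions = 0
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.speaker != prev.speaker and cur.start < prev.end:
            interruptions += 1
    return interruptions


call = [
    Segment("AGENT", 0.0, 4.0),
    Segment("CUSTOMER", 3.5, 9.0),  # starts before the agent finishes
    Segment("AGENT", 9.5, 12.0),
]
print(talk_time(call))
print(count_interruptions(call))
```

Comparing only adjacent segments keeps the sketch short; a production version would likely need to handle longer overlaps and more than two speakers.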

How We Built It

We designed a full-stack system using a React.js + Vite frontend, a FastAPI backend, and a set of AWS services (Amazon Transcribe, Bedrock, Comprehend, S3, and DynamoDB) that handle transcription, text generation, sentiment analysis, and storage.

Architecture Overview

Frontend (React + Vite)

Real-time audio capture via MediaRecorder or Web Audio API
WebSocket-based transcript updates and agent suggestions
Dynamic UI updates using React state, Radix UI, and Tailwind CSS

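// Establish WebSocket for live call analytics (inside a component; useRef and useEffect come from 'react')
const wsRef = useRef(null);
useEffect(() => {
  wsRef.current = new WebSocket('ws://localhost:8000/ws/live-call-analytics');
  return () => wsRef.current?.close(); // close the socket when the component unmounts
}, []);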

Backend (FastAPI + AWS)

Real-time audio ingestion over WebSocket
Amazon Transcribe for live speech-to-text with Call Analytics metadata
Bedrock inference to generate contextual suggestions from the knowledge base
Sentiment analysis using Amazon Comprehend

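# Real-time WebSocket endpoint for streaming transcription
# (`app` is the FastAPI instance; AWS_REGION is assumed to be defined elsewhere in the backend)
from amazon_transcribe.client import TranscribeStreamingClient
from fastapi import WebSocket

@app.websocket("/ws/live-call-analytics")
async def live_call_analytics(websocket: WebSocket):
    await websocket.accept()  # complete the WebSocket handshake before streaming audio
    client = TranscribeStreamingClient(region=AWS_REGION)
    stream = await client.start_call_analytics_stream_transcription(
        language_code="en-US",
        media_sample_rate_hz=16000,
        media_encoding="pcm"
    )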
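
The stream above is configured for 16 kHz PCM audio. Below is a minimal sketch of how incoming bytes might be sliced into fixed-duration chunks before being forwarded to the transcription stream. It assumes 16-bit mono samples and a 100 ms chunk size, and the helper names are hypothetical rather than taken from the project.

```python
SAMPLE_RATE_HZ = 16000  # matches media_sample_rate_hz in the endpoint
BYTES_PER_SAMPLE = 2    # assumption: 16-bit mono PCM


def chunk_size_bytes(chunk_ms: int = 100) -> int:
    """Number of bytes of audio in one chunk of the given duration."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * chunk_ms // 1000


def iter_chunks(buffer: bytes, chunk_ms: int = 100):
    """Yield fixed-size chunks in order; the last one may be shorter."""
    size = chunk_size_bytes(chunk_ms)
    for offset in range(0, len(buffer), size):
        yield buffer[offset:offset + size]


one_second = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)  # one second of silence
chunks = list(iter_chunks(one_second))
print(len(chunks), len(chunks[0]))
```

Small, evenly sized chunks keep latency predictable; the right duration for a given deployment is something to tune rather than a fixed rule.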

Key Implementation Features

WebSocket Integration for low-latency audio and data transfer
Amazon Transcribe Call Analytics with speaker separation
Amazon Bedrock (Titan LLM) for real-time generative AI responses
S3-based Knowledge Base for document storage and retrieval
Sentiment Analysis using Amazon Comprehend
Post-call Summarization using Bedrock with structured prompts
Secure Upload System for audio files, PDFs, and more
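
To make the sentiment step more concrete, here is a hedged sketch of labelling each utterance with Amazon Comprehend's `detect_sentiment` call. The function name `track_sentiment` and the `FakeComprehend` stand-in are illustrative assumptions. The stand-in lets the example run without AWS credentials, whereas a real deployment would pass in `boto3.client("comprehend")`.

```python
def track_sentiment(utterances, comprehend_client, language_code="en"):
    """Return one sentiment label per utterance, in order."""
    labels = []
    for text in utterances:
        response = comprehend_client.detect_sentiment(
            Text=text, LanguageCode=language_code
        )
        # Comprehend reports POSITIVE, NEGATIVE, NEUTRAL, or MIXED
        labels.append(response["Sentiment"])
    return labels


class FakeComprehend:
    """Offline stand-in that mimics the shape of the Comprehend response."""

    def detect_sentiment(self, Text, LanguageCode):
        label = "NEGATIVE" if "not working" in Text.lower() else "NEUTRAL"
        return {"Sentiment": label}


labels = track_sentiment(
    ["My router is not working.", "I restarted it twice."],
    FakeComprehend(),
)
print(labels)
```

Passing the client in as a parameter keeps the function easy to test and leaves region and credential handling to the caller.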

Challenges We Faced

Real-time Audio Processing: Achieving low-latency browser-to-AWS streaming with synchronized audio
WebSocket Stability: Managing concurrent WebSocket sessions for multiple users
Multi-Service Integration: Coordinating Bedrock, Transcribe, Comprehend, and S3 within the same workflow
Context Management: Maintaining continuous conversation context for live AI prompts
CORS Configuration: Ensuring secure cross-origin access for APIs and sockets
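
One common way to approach the context-management challenge is a rolling window that keeps only the most recent turns, so each live prompt stays bounded in size. The sketch below illustrates that idea. The `ConversationContext` class and its method names are hypothetical, and it is not necessarily how TalkSense-AI handles context.

```python
from collections import deque


class ConversationContext:
    """Keep only the most recent turns of a conversation."""

    def __init__(self, max_turns: int = 6):
        # A bounded deque drops the oldest turn automatically when full
        self._turns = deque(maxlen=max_turns)

    def add_turn(self, speaker: str, text: str) -> None:
        self._turns.append((speaker, text.strip()))

    def render(self) -> str:
        """Format the retained turns as 'SPEAKER: text' lines."""
        return "\n".join(f"{speaker}: {text}" for speaker, text in self._turns)


context = ConversationContext(max_turns=2)
context.add_turn("CUSTOMER", "My invoice is wrong.")
context.add_turn("AGENT", "Let me check that for you.")
context.add_turn("CUSTOMER", "It shows two charges.")
print(context.render())
```

With `max_turns=2`, the first customer turn is discarded once the third turn arrives, which trades some long-range context for a predictable prompt length.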

What We Learned

Implementing real-time GenAI workflows in production-grade stacks
Deep-dive into Amazon Transcribe Call Analytics and speaker separation
Designing efficient WebSocket-based pipelines
Crafting effective AI prompts for contextual assistance
Leveraging cloud-native AWS architectures for scalable AI products
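
As an example of the kind of prompt construction this involves, the sketch below assembles a request body in the Amazon Titan Text format (`inputText` plus `textGenerationConfig`) and extracts the generated text from a response. The prompt wording, function names, sample knowledge-base snippet, and parameter values are assumptions for illustration. The response shown is a hand-written sample rather than real model output.

```python
import json


def build_titan_request(transcript: str, kb_snippets: list[str], max_tokens: int = 256) -> str:
    """Build a JSON request body in the Titan Text format."""
    knowledge = "\n".join(f"- {snippet}" for snippet in kb_snippets)
    prompt = (
        "You are assisting a call center agent during a live call.\n"
        f"Knowledge base excerpts:\n{knowledge}\n\n"
        f"Recent conversation:\n{transcript}\n\n"
        "Suggest one short reply the agent could give next."
    )
    return json.dumps({
        "inputText": prompt,
        "textGenerationConfig": {"maxTokenCount": max_tokens, "temperature": 0.2},
    })


def extract_suggestion(response_body: str) -> str:
    """Pull the generated text out of a Titan Text response body."""
    return json.loads(response_body)["results"][0]["outputText"].strip()


body = build_titan_request(
    "CUSTOMER: My invoice is wrong.",
    ["Refunds take 5 business days."],  # made-up snippet for the example
)
# With AWS access, `body` would be sent through the bedrock-runtime
# client's invoke_model call and the response body read back as JSON.
sample_response = json.dumps({"results": [{"outputText": " I can help with that. "}]})
print(extract_suggestion(sample_response))
```

Separating prompt construction from the network call makes it easier to iterate on wording and to test prompts offline.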

What's Next for TalkSense-AI

Multi-Language Support for global customer interaction
Advanced Analytics Dashboard with real-time KPIs and escalation tracking
CRM Integrations (Salesforce, HubSpot) for data syncing
Mobile App for agents and managers on the go
Custom ML Models for industry-specific classification (e.g., healthcare, legal, banking)

Built With

Languages & Frameworks

Python (FastAPI, Uvicorn)
JavaScript (React, Vite)
HTML5/CSS3 (Tailwind CSS)

Cloud Services

Amazon Web Services (AWS)
Amazon Transcribe Call Analytics
Amazon Bedrock (Titan Text)
Amazon S3 (Storage)
Amazon Comprehend (Sentiment Analysis)
Amazon DynamoDB (Conversation Storage)

APIs & Libraries

WebSocket API
Amazon Transcribe Streaming SDK
Boto3 (AWS SDK for Python)
PyPDF2 (PDF Document Parsing)
Radix UI (Component Library)

Development Tools

React Router
Python-dotenv
Pydantic
Aiohttp

Real-time Technologies

WebSocket Protocol
Audio Streaming APIs
Real-time Transcription
Live Sentiment Analysis

Updates

Gyana Ranjan started this project — Jul 10, 2025 11:39 PM EDT
