Inspiration

Remote users on a variety of devices need fact checking at scale, and certain documents in our current workflow require advanced OCR.

What it does

Project Concept: DocuChat AI

DocuChat AI is an intelligent document assistant that combines conversational AI with multi-modal document understanding. Users can chat naturally and upload PDFs, images of documents, and handwritten forms to get insights, summaries, and answers, all secured behind OAuth2 authentication and presented in a comfortable, responsive pastel UI.

How we built it

Key Technical Stack:

LLM: NVIDIA Llama-3.1-Nemotron-Nano-8B-v1 (reasoning model)
Embeddings: NVIDIA NemoRetriever-300M-Embed-v1 (for RAG)
Deployment: Amazon EKS or SageMaker
Auth: OAuth2 (Google/GitHub)
Frontend: React with responsive design, pastel color scheme
Backend: FastAPI/Node.js with vector database for retrieval
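The retrieval side of this stack (embed the document chunks, find the ones closest to the question, pass them to the LLM as context) can be sketched as follows. This is a minimal illustration and not the project's actual code: `embed` is a hypothetical bag-of-words stand-in for the NVIDIA embedding model, `InMemoryIndex` is a hypothetical stand-in for the vector database, and `build_prompt` is an assumed prompt format.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Hypothetical stand-in for the embedding model: sparse word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryIndex:
    """Hypothetical stand-in for the vector database."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, chunk: str) -> None:
        # Store each chunk next to its vector so search never re-embeds it.
        self._items.append((chunk, embed(chunk)))

    def search(self, query: str, k: int = 2) -> list[str]:
        # Rank all stored chunks by similarity to the query, best first.
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assumed prompt layout: retrieved chunks first, then the question."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


index = InMemoryIndex()
index.add("The invoice total is 420 dollars")
index.add("The patient signed the consent form on Monday")
index.add("Shipping takes five business days")

question = "What is the invoice total?"
top_chunks = index.search(question, k=1)
prompt = build_prompt(question, top_chunks)
```

In a real deployment the `embed` call would go to the embedding service and `search` to the vector database; the resulting `prompt` is what would be sent to the LLM.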

Core Features:

🔐 OAuth2 authentication (Google/GitHub)
💬 Natural conversational AI using Llama-3.1-Nemotron-Nano-8B-v1
📄 PDF document upload and Q&A
🖼️ Image document processing with OCR
✍️ Handwritten form recognition
🎨 Responsive pastel UI optimized for extended use
🔍 RAG pipeline with NemoRetriever embeddings
☁️ Deployed on AWS EKS/SageMaker as NIM microservices
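The OAuth2 sign-in listed among the features begins by redirecting the user to the provider's authorization endpoint. The sketch below shows that first step of the authorization code flow for Google, using only the standard library. The function name, the default scopes, and the example client ID and redirect URI are illustrative assumptions, not values from the project.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Google's OAuth 2.0 authorization endpoint.
GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"


def build_authorization_url(client_id, redirect_uri,
                            scopes=("openid", "email", "profile"), state=None):
    """Return (url, state) for the first leg of the authorization code flow.

    The caller should keep `state` in the user's session and compare it with
    the value the provider sends back, to guard against CSRF.
    """
    state = state or secrets.token_urlsafe(16)
    params = {
        "client_id": client_id,        # hypothetical value in the usage below
        "redirect_uri": redirect_uri,  # must match what is registered with the provider
        "response_type": "code",       # request an authorization code
        "scope": " ".join(scopes),     # space-delimited scope list
        "state": state,
    }
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}", state


url, state = build_authorization_url(
    "demo-client-id", "https://example.com/auth/callback", state="fixed-for-demo"
)
query = parse_qs(urlparse(url).query)
```

After the user approves access, the provider redirects back to `redirect_uri` with a `code` parameter, which the backend would exchange for tokens in a separate server-side request.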

Challenges we ran into

Accomplishments that we're proud of

What we learned

What's next for DocuChat AI
