Inspiration
In today’s information-heavy world, valuable knowledge is often trapped inside lengthy and static PDF documents — research papers, reports, manuals, policies, and more. Searching through these manually is slow and inefficient. We wanted to build something that doesn’t just read documents but actually understands them — an intelligent assistant that turns passive information into an active conversation. That’s how CogniDoc was born — to make interacting with documents as easy as talking to a knowledgeable expert.
What it does
CogniDoc is an AI-powered web application that transforms any static PDF into an interactive conversational experience. Users upload a PDF, ask questions in natural language, and instantly receive intelligent, context-aware answers grounded in the document's content. Whether it's a 300-page research report or a short user manual, CogniDoc lets users explore information conversationally: no scrolling, no keyword-hunting, just clear, focused answers.

Key features:

- 📄 Dynamic PDF upload: instantly process and analyze any PDF document.
- 🤖 AI-powered Q&A: answers are generated by a Retrieval-Augmented Generation (RAG) pipeline for accuracy and relevance.
- 🧠 Semantic understanding: interprets the meaning of questions, not just keywords.
- ⚙️ Scalable full-stack design: React frontend and FastAPI backend, decoupled for performance and flexibility.
- 💾 Persistent vector storage: Astra DB powers low-latency semantic search.
How we built it
We built CogniDoc as a fully decoupled full-stack application with a focus on modularity, scalability, and modern AI integration.

Frontend (React.js): a clean, intuitive interface where users upload PDFs and interact with the AI through a real-time, chat-style Q&A experience.

Backend (FastAPI + Python): handles file uploads, text extraction, and communication between the frontend and the AI pipeline.

AI pipeline (RAG architecture):

- Document ingestion: the uploaded PDF is parsed and its text extracted.
- Text chunking & embedding: LangChain splits the text into chunks, which are embedded using Hugging Face sentence transformers.
- Storage: embeddings are stored in Astra DB's vector database for efficient retrieval.
- Query processing: when a user asks a question, the backend performs a semantic search in Astra DB to find the most relevant chunks.
- Answer generation: the retrieved context is passed to the Google Gemini API, which generates a coherent, grounded answer.

Deployment: frontend on Netlify, backend on Render, with CI/CD integration via Git.
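The pipeline above can be sketched end to end in a few dozen lines. This is a simplified, self-contained illustration, not our production code: a toy bag-of-words embedding stands in for the Hugging Face sentence transformer, an in-memory list stands in for Astra DB, and answer generation is shown only up to the point where the grounded prompt would be sent to Gemini. All function names here are illustrative.

```python
import math
from collections import Counter

def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks (stand-in for LangChain's splitter)."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; the real app uses sentence transformers."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Ingestion: chunk the document and store (chunk, embedding) pairs in memory
# (the real app writes these to Astra DB).
document = (
    "CogniDoc turns PDFs into conversations. "
    "Embeddings are stored in Astra DB for retrieval. "
    "Answers are generated by the Gemini API from retrieved context."
)
store = [(c, embed(c)) for c in chunk_text(document, chunk_size=60, overlap=15)]

# Query processing: rank stored chunks by similarity to the question.
def retrieve(question: str, k: int = 2) -> list[str]:
    q = embed(question)
    ranked = sorted(store, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

# Answer generation: assemble the grounded prompt that would go to Gemini.
def build_prompt(question: str) -> str:
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("Where are embeddings stored"))
```

Swapping the toy pieces for real ones (a sentence-transformer model, an Astra DB collection, a Gemini call) changes the implementations but not the shape of the pipeline, which is what made the design easy to iterate on.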
Challenges we ran into
- PDF complexity: extracting clean text from PDFs with inconsistent layouts and formatting was surprisingly challenging.
- Embedding optimization: balancing chunk size, semantic accuracy, and query latency required careful experimentation.
- Context management: ensuring the Gemini model remained grounded in the document (and didn't hallucinate) took multiple iterations of prompt tuning.
- Integration friction: managing data flow between React, FastAPI, Astra DB, and Gemini while maintaining low latency was a key engineering hurdle.
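The grounding problem is usually attacked at the prompt level: the model is told to answer only from retrieved context and to refuse otherwise. The template below illustrates that general technique; the wording and function name are generic examples, not CogniDoc's actual prompt.

```python
# A generic grounded-prompt template (illustrative; not CogniDoc's exact prompt).
GROUNDED_PROMPT = """You are answering questions about a specific document.
Use ONLY the context below. If the context does not contain the answer,
reply exactly: "I could not find that in the document."

Context:
{context}

Question: {question}
Answer:"""

def build_grounded_prompt(context_chunks: list[str], question: str) -> str:
    """Fill the template with retrieved chunks before calling the LLM."""
    # Separators help the model treat each retrieved chunk as a distinct source.
    context = "\n---\n".join(context_chunks)
    return GROUNDED_PROMPT.format(context=context, question=question)

prompt = build_grounded_prompt(
    ["Astra DB stores the embeddings.", "Gemini generates the final answer."],
    "Where are embeddings stored?",
)
```

Giving the model an explicit refusal phrase matters: without a sanctioned "I don't know" response, it tends to fill gaps with plausible-sounding fabrications.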
Accomplishments that we're proud of
- Implemented a complete RAG pipeline from scratch, from data ingestion to intelligent response generation.
- Built a scalable, production-ready architecture with clear separation between frontend, backend, and AI components.
- Achieved real-time semantic search with persistent vector storage using Astra DB.
- Created a smooth, chat-like user experience that feels intuitive and responsive.
- Most importantly, built something that genuinely changes how people interact with written information.
What we learned
- How to design and deploy a modular AI architecture that integrates multiple complex systems.
- The importance of data preprocessing in achieving high-quality embeddings and accurate retrieval.
- That prompt engineering and context optimization are critical for reliable AI performance.
- The nuances of frontend-backend coordination in real-time AI applications.
What's next for CogniDoc
We see CogniDoc as more than a hackathon project: it's a foundation for the next generation of document intelligence tools. Our roadmap includes:

- 🔍 Multi-document querying: ask questions across multiple PDFs simultaneously.
- 🗣️ Voice interaction: talk to your documents using speech recognition and voice output.
- 🧾 Citation support: display document references or page numbers for each answer.
- ☁️ User accounts & document history: save and revisit uploaded documents and past conversations.
- 🚀 Enterprise integration: connect CogniDoc with internal knowledge bases for business use cases.