Caerii/pdf-parser-ai


PDF Parser AI (Client-Side)

Overview

PDF Parser AI is a client-side React application that lets users upload a PDF document and extract structured process information from it. The app uses the Marker API to parse the PDF and the Google Gemini API to analyze the parsed content, then presents a summarized, step-by-step process extracted from the uploaded document.

Demo Video

A short preview of the PDF Parser AI in action (using Marker API and Gemini 2.5 Flash):

▶️ Download the demo (MP4, 9MB)


What It Does

  • Upload a PDF: Users can upload any PDF document (e.g., research papers, reports, tutorials).
  • Parse & Extract: The app sends the PDF to the Marker API, which converts it to markdown and extracts images and metadata.
  • AI Analysis: The extracted content is analyzed by the Gemini API, which summarizes the document, identifies key steps, findings, or processes, and returns a structured JSON response.
  • Display Results: The UI presents the AI-generated summary, process steps, key findings, and any extracted images in a clear, user-friendly format.
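
As a rough illustration of the AI Analysis step, the sketch below parses and validates a structured JSON response before it reaches the UI. The `DocumentAnalysis` shape and its field names (`summary`, `steps`, `keyFindings`) are assumptions made for this example, as are the helper names; the repository's actual response schema may differ.

```typescript
// Hypothetical shape of the structured response. The field names are
// assumptions for illustration, not taken from the repository.
interface DocumentAnalysis {
  summary: string;
  steps: string[];
  keyFindings: string[];
}

// Three backticks, built at runtime so this example can sit inside a code fence.
const FENCE = "`".repeat(3);

// Model output is sometimes wrapped in a markdown code fence; remove it if present.
function stripCodeFence(raw: string): string {
  let text = raw.trim();
  if (text.startsWith(FENCE)) {
    // Drop the opening fence line, including any language tag such as "json".
    const firstNewline = text.indexOf("\n");
    text = firstNewline === -1 ? "" : text.slice(firstNewline + 1);
    // Drop the closing fence if there is one.
    if (text.trimEnd().endsWith(FENCE)) {
      text = text.trimEnd().slice(0, -FENCE.length);
    }
  }
  return text.trim();
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Parse the raw model text and fall back to empty lists for optional fields.
function parseAnalysis(raw: string): DocumentAnalysis {
  const data: unknown = JSON.parse(stripCodeFence(raw));
  if (typeof data !== "object" || data === null) {
    throw new Error("Expected a JSON object");
  }
  const record = data as Record<string, unknown>;
  const summary = record["summary"];
  const steps = record["steps"];
  const keyFindings = record["keyFindings"];
  if (typeof summary !== "string") {
    throw new Error("Missing or invalid 'summary'");
  }
  return {
    summary,
    steps: isStringArray(steps) ? steps : [],
    keyFindings: isStringArray(keyFindings) ? keyFindings : [],
  };
}
```

Validating at this boundary means the display components can rely on every field being present, even when the model omits a list or wraps its answer in a code fence.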

Key Features

  • Client-side only: All logic runs in the browser; no backend required.
  • Modular architecture: Components, hooks, and services are separated for maintainability.
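  • Environment-based config: All API keys and endpoints are loaded from .env rather than hard-coded in the source. Because the app is client-side, Vite inlines VITE_-prefixed variables into the browser bundle, so the keys remain visible to anyone who can load the app.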
  • Live processing status: Users see real-time progress, timing, and token usage.
  • Error handling: Friendly error messages and robust handling of API failures.
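
One way to produce friendly error messages is to map a failed API call to user-facing text in a single place. This is a minimal sketch under assumptions: the `ApiFailure` type, the `toFriendlyMessage` name, and the message wording are hypothetical and not taken from the repository.

```typescript
// Hypothetical description of a failed HTTP call to one of the two services.
interface ApiFailure {
  service: "Marker" | "Gemini";
  status: number; // HTTP status code returned by the service
}

// Translate a failure into a message suitable for showing to the user.
function toFriendlyMessage(failure: ApiFailure): string {
  const { service, status } = failure;
  if (status === 401 || status === 403) {
    return `The ${service} API key was rejected. Check your .env file.`;
  }
  if (status === 429) {
    return `${service} is rate limiting requests. Please try again shortly.`;
  }
  if (status >= 500) {
    return `${service} is temporarily unavailable. Please try again later.`;
  }
  return `${service} could not process the request (status ${status}).`;
}
```

Keeping the mapping in one function lets every component show consistent wording without knowing about status codes.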

High-Level Architecture

  • UI Components: File upload, processing status, and results display are modular React components.
  • Custom Hooks: State and workflow logic (e.g., PDF processing, timers) are encapsulated in hooks.
  • API Services: Marker and Gemini API interactions are abstracted into service classes.
  • Config & Types: Centralized configuration and TypeScript types ensure maintainability and type safety.
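
To illustrate how workflow state might be kept out of the UI components, here is a sketch of a pure reducer that tracks the processing stage, elapsed time, and token usage. The stage names, action types, and state fields are assumptions for this example; the repository's hooks may be organized differently.

```typescript
// Hypothetical processing stages for the upload-to-results workflow.
type Stage = "idle" | "parsing" | "analyzing" | "done" | "error";

interface ProcessingState {
  stage: Stage;
  startedAt: number | null; // timestamp in ms when processing began
  elapsedMs: number;
  tokensUsed: number;
  error: string | null;
}

// Each action carries the current time so the reducer stays pure and testable.
type ProcessingAction =
  | { type: "start"; now: number }
  | { type: "parsed"; now: number }
  | { type: "analyzed"; now: number; tokensUsed: number }
  | { type: "failed"; now: number; message: string };

const initialState: ProcessingState = {
  stage: "idle",
  startedAt: null,
  elapsedMs: 0,
  tokensUsed: 0,
  error: null,
};

function elapsedSince(state: ProcessingState, now: number): number {
  return state.startedAt === null ? 0 : now - state.startedAt;
}

function processingReducer(
  state: ProcessingState,
  action: ProcessingAction,
): ProcessingState {
  switch (action.type) {
    case "start":
      // A new upload resets any previous result or error.
      return { ...initialState, stage: "parsing", startedAt: action.now };
    case "parsed":
      return { ...state, stage: "analyzing", elapsedMs: elapsedSince(state, action.now) };
    case "analyzed":
      return {
        ...state,
        stage: "done",
        elapsedMs: elapsedSince(state, action.now),
        tokensUsed: action.tokensUsed,
      };
    case "failed":
      return {
        ...state,
        stage: "error",
        elapsedMs: elapsedSince(state, action.now),
        error: action.message,
      };
  }
}
```

In a React component, a reducer like this could be passed to `useReducer`, with the services dispatching actions as each stage completes.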

Getting Started

cd frontend
npm install
npm run dev

Environment Variables (.env)

Create a .env file in the frontend directory with the following variables:

| Variable Name | Description | Example Value |
| --- | --- | --- |
| VITE_MARKER_API_KEY | Marker API key | grab key from www.datalab.to |
| VITE_GEMINI_API_KEY | Google Gemini API key | grab key from ai.google.dev |
| VITE_MARKER_API_BASE_URL | Marker API base URL (proxy) | /api/marker |
| VITE_GEMINI_MODEL | Gemini model name | gemini-1.5-flash |

Example .env file:

# API Keys
VITE_MARKER_API_KEY=your_marker_api_key_here
VITE_GEMINI_API_KEY=your_gemini_api_key_here

# API Configuration
VITE_MARKER_API_BASE_URL=/api/marker
VITE_GEMINI_MODEL=gemini-1.5-flash
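
The variables above could be read and validated once at startup so that a missing key fails early with a clear message. The sketch below assumes a hypothetical `loadConfig` helper and `AppConfig` shape; falling back to the example values for the base URL and model is also an assumption, not documented behavior.

```typescript
// Hypothetical centralized configuration object.
interface AppConfig {
  markerApiKey: string;
  geminiApiKey: string;
  markerApiBaseUrl: string;
  geminiModel: string;
}

type Env = Record<string, string | undefined>;

// Return a trimmed value or throw if the variable is missing or blank.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value.trim();
}

function loadConfig(env: Env): AppConfig {
  return {
    markerApiKey: requireVar(env, "VITE_MARKER_API_KEY"),
    geminiApiKey: requireVar(env, "VITE_GEMINI_API_KEY"),
    // Assumed defaults: these mirror the example values in the table above.
    markerApiBaseUrl: env["VITE_MARKER_API_BASE_URL"]?.trim() || "/api/marker",
    geminiModel: env["VITE_GEMINI_MODEL"]?.trim() || "gemini-1.5-flash",
  };
}
```

In a Vite app the argument would typically be `import.meta.env`; taking the environment as a parameter keeps the function easy to test outside the browser.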

About

test module to parse PDFs intelligently
