PDF Parser AI is a client-side React application that lets users upload a PDF document and extract structured process information from it. The app uses the Marker API to parse the PDF and the Google Gemini API to analyze the content, then presents a summarized, step-by-step process extracted from the uploaded document.
A short preview of the PDF Parser AI in action (using Marker API and Gemini 2.5 Flash):
- Upload a PDF: Users can upload any PDF document (e.g., research papers, reports, tutorials).
- Parse & Extract: The app sends the PDF to the Marker API, which converts it to markdown and extracts images and metadata.
- AI Analysis: The extracted content is analyzed by the Gemini API, which summarizes the document, identifies key steps, findings, or processes, and returns a structured JSON response.
- Display Results: The UI presents the AI-generated summary, process steps, key findings, and any extracted images in a clear, user-friendly format.
- Client-side only: All logic runs in the browser; no backend required.
- Modular architecture: Components, hooks, and services are separated for maintainability.
- Environment-based config: All API keys and endpoints are loaded from .env for security.
- Live processing status: Users see real-time progress, timing, and token usage.
- Error handling: Friendly error messages and robust handling of API failures.
- UI Components: File upload, processing status, and results display are modular React components.
- Custom Hooks: State and workflow logic (e.g., PDF processing, timers) are encapsulated in hooks.
- API Services: Marker and Gemini API interactions are abstracted into service classes.
- Config & Types: Centralized configuration and TypeScript types ensure maintainability and type safety.
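The workflow and service separation described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `PdfParser` and `ContentAnalyzer` interfaces, the `ParsedPdf` and `ProcessAnalysis` shapes, and the `processPdf` and `parseAnalysis` functions are all hypothetical names chosen for the example, and the JSON field names (`summary`, `steps`, `keyFindings`) are assumptions about what the analysis response might contain.

```typescript
// Hypothetical shapes; the real project's types may differ.
interface ParsedPdf {
  markdown: string;
  images: Record<string, string>; // assumed: image name -> encoded image data
}

interface ProcessAnalysis {
  summary: string;
  steps: string[];
  keyFindings: string[];
}

type Stage = "parsing" | "analyzing" | "done";

// Each external API sits behind a small interface so the workflow
// can be exercised with fakes instead of real network calls.
interface PdfParser {
  parse(pdfBytes: ArrayBuffer): Promise<ParsedPdf>;
}

interface ContentAnalyzer {
  analyze(markdown: string): Promise<string>; // raw model text, expected to be JSON
}

// Validate the model's reply rather than trusting its shape.
function parseAnalysis(raw: string): ProcessAnalysis {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    throw new Error("Analysis response was not valid JSON");
  }
  const obj = (data ?? {}) as Record<string, unknown>;
  if (typeof obj.summary !== "string") {
    throw new Error("Analysis response is missing a summary");
  }
  // Keep only string entries; anything else is dropped.
  const toStrings = (value: unknown): string[] =>
    Array.isArray(value)
      ? value.filter((item): item is string => typeof item === "string")
      : [];
  return {
    summary: obj.summary,
    steps: toStrings(obj.steps),
    keyFindings: toStrings(obj.keyFindings),
  };
}

// Parse first, then analyze, reporting each stage so a UI can show progress.
async function processPdf(
  pdfBytes: ArrayBuffer,
  parser: PdfParser,
  analyzer: ContentAnalyzer,
  onStage: (stage: Stage) => void = () => {},
): Promise<{ parsed: ParsedPdf; analysis: ProcessAnalysis; elapsedMs: number }> {
  const started = Date.now();
  onStage("parsing");
  const parsed = await parser.parse(pdfBytes);
  onStage("analyzing");
  const analysis = parseAnalysis(await analyzer.analyze(parsed.markdown));
  onStage("done");
  return { parsed, analysis, elapsedMs: Date.now() - started };
}
```

In a React app, a custom hook could call a function like `processPdf` and store the reported stage and elapsed time in state, while the concrete Marker and Gemini service classes implement the two interfaces.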
cd frontend
npm install
npm run dev

Create a .env file in the frontend directory with the following variables:
| Variable Name | Description | Example Value |
|---|---|---|
| VITE_MARKER_API_KEY | Marker API key | grab key from www.datalab.to |
| VITE_GEMINI_API_KEY | Google Gemini API key | grab key from ai.google.dev |
| VITE_MARKER_API_BASE_URL | Marker API base URL (proxy) | /api/marker |
| VITE_GEMINI_MODEL | Gemini model name | gemini-1.5-flash |
Example .env file:
# API Keys
VITE_MARKER_API_KEY=your_marker_api_key_here
VITE_GEMINI_API_KEY=your_gemini_api_key_here
# API Configuration
VITE_MARKER_API_BASE_URL=/api/marker
VITE_GEMINI_MODEL=gemini-1.5-flash
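
As an illustration of how these variables might be read in one place, here is a minimal sketch of a config loader. The `AppConfig` type and `loadConfig` function are hypothetical names, not taken from the project, and falling back to the example values from the table above is an assumption about sensible defaults. Vite exposes variables prefixed with `VITE_` to client code through `import.meta.env`, and their values are included in the built bundle.

```typescript
// Hypothetical config module; only the VITE_* variable names come from the README.
interface AppConfig {
  markerApiKey: string;
  geminiApiKey: string;
  markerApiBaseUrl: string;
  geminiModel: string;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): AppConfig {
  // Fail early with a clear message when a required key is absent or blank.
  const required = (name: string): string => {
    const value = env[name]?.trim();
    if (!value) {
      throw new Error(`Missing required environment variable: ${name}`);
    }
    return value;
  };

  return {
    markerApiKey: required("VITE_MARKER_API_KEY"),
    geminiApiKey: required("VITE_GEMINI_API_KEY"),
    // Assumed defaults, mirroring the example values in the table.
    markerApiBaseUrl: env.VITE_MARKER_API_BASE_URL?.trim() || "/api/marker",
    geminiModel: env.VITE_GEMINI_MODEL?.trim() || "gemini-1.5-flash",
  };
}
```

Inside the Vite app this could be called once at startup as `loadConfig(import.meta.env)`. Taking the environment as a parameter keeps the function testable outside the browser.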