PDF manipulation, processing, and management toolkit for the Pi coding agent.
- Extract text from PDFs with layout preservation
- Extract tables as structured data
- Merge multiple PDFs into one
- Split PDFs into individual pages or ranges
- Rotate pages by 90°, 180°, or 270°
- Add watermarks (text overlay on all pages)
- Encrypt PDFs with password protection
- Decrypt password-protected PDFs
- Fill PDF forms (fillable fields and annotation-based)
- Convert PDFs to images for visual analysis
- OCR scanned PDFs (with pytesseract)
- Get PDF metadata (title, author, page count, encryption status)
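To make the split feature above concrete, here is a minimal sketch of splitting a PDF into fixed-size chunks with pypdf, one of the packages this toolkit requires. The function names `chunk_ranges` and `split_into_chunks` and the `chunk_<n>.pdf` output naming are assumptions for illustration; this is not the extension's actual implementation.

```python
def chunk_ranges(page_count: int, chunk_size: int) -> list[tuple[int, int]]:
    """Return 0-based half-open (start, stop) ranges of at most chunk_size pages."""
    if page_count < 0 or chunk_size < 1:
        raise ValueError("page_count must be >= 0 and chunk_size must be >= 1")
    return [
        (start, min(start + chunk_size, page_count))
        for start in range(0, page_count, chunk_size)
    ]


def split_into_chunks(src: str, chunk_size: int, prefix: str = "chunk") -> list[str]:
    """Write each chunk to '<prefix>_<n>.pdf' and return the output paths.

    Hypothetical helper; the output naming scheme is an assumption.
    """
    # Imported lazily so chunk_ranges stays usable without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src)
    outputs = []
    ranges = chunk_ranges(len(reader.pages), chunk_size)
    for number, (start, stop) in enumerate(ranges, 1):
        writer = PdfWriter()
        for index in range(start, stop):
            writer.add_page(reader.pages[index])
        path = f"{prefix}_{number}.pdf"
        writer.write(path)
        outputs.append(path)
    return outputs
```

For a 50-page document and a chunk size of 10, `chunk_ranges(50, 10)` yields five ranges, and `split_into_chunks` would write one file per range. Keeping the range arithmetic separate from the file I/O makes the boundary logic easy to check on its own.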
pi install npm:@joemccann/pi-pdf
Or for project-local installation:
pi install -l npm:@joemccann/pi-pdf
Requires Python 3.9+ with these packages:
pip install pypdf pdfplumber reportlab
Optional (for OCR and image conversion):
pip install pdf2image pytesseract pypdfium2 Pillow
Optional CLI tools:
# macOS
brew install poppler qpdf
# Ubuntu/Debian
apt install poppler-utils qpdf
Check all dependencies:
node scripts/check-deps.js
The LLM can call these tools directly:
| Tool | Description |
|---|---|
| pdf_info | Get metadata, page count, encryption & form-field status |
| pdf_extract_text | Extract text from pages with layout preservation |
| pdf_extract_tables | Extract structured table data |
| pdf_merge | Merge multiple PDFs into one |
| pdf_split | Split a PDF into pages or ranges |
| pdf_rotate | Rotate pages by 90/180/270 degrees |
| pdf_encrypt | Password-protect a PDF |
| pdf_decrypt | Remove password protection |
| pdf_watermark | Add text watermark overlay |
| pdf_to_images | Convert pages to PNG images |
| pdf_form_fields | Extract fillable form field info |
| pdf_fill_form | Fill in PDF form fields |
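Several of the tools above (for example pdf_split and pdf_rotate) work on pages or page ranges. This README does not specify the range syntax the tools accept, so the following is only a sketch under an assumed 1-based spec such as `1-3,5`; `parse_page_ranges` is a hypothetical helper, not part of the extension.

```python
def parse_page_ranges(spec: str, page_count: int) -> list[int]:
    """Turn a 1-based spec like "1-3,5" into sorted 0-based page indices.

    The spec syntax is an assumption for illustration. An open-ended part
    such as "8-" runs to the last page; "-2" starts at the first page.
    """
    indices: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text) if start_text.strip() else 1
            end = int(end_text) if end_text.strip() else page_count
        else:
            start = end = int(part)
        if start < 1 or end > page_count or start > end:
            raise ValueError(f"invalid page range {part!r} for {page_count} pages")
        indices.update(range(start - 1, end))
    return sorted(indices)


if __name__ == "__main__":
    print(parse_page_ranges("1-3,5", page_count=10))
```

Returning 0-based indices matches how Python PDF libraries such as pypdf index `reader.pages`, while the 1-based input matches how people usually refer to pages.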
The pdf skill provides comprehensive PDF processing guidance that the agent loads on demand when working with PDF files. It covers:
- All Python libraries (pypdf, pdfplumber, reportlab, pypdfium2)
- JavaScript libraries (pdf-lib, pdfjs-dist)
- CLI tools (qpdf, pdftotext, pdftk)
- Form filling workflows (fillable and non-fillable PDFs)
- Advanced reference documentation
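Because the skill spans several optional libraries and CLI tools, it can help to know which ones are present on a given machine. The sketch below probes for them using only the Python standard library. It is not a port of scripts/check-deps.js, whose behavior is not described here, and the name `probe_dependencies` is an assumption.

```python
import importlib.util
import shutil

# Import names for the pip packages named in this README (Pillow imports as PIL).
PYTHON_MODULES = ["pypdf", "pdfplumber", "reportlab", "pypdfium2",
                  "pdf2image", "pytesseract", "PIL"]
# Executables: pdftotext ships with poppler; qpdf and pdftk are separate tools.
CLI_TOOLS = ["qpdf", "pdftotext", "pdftk"]


def probe_dependencies(modules=PYTHON_MODULES, tools=CLI_TOOLS) -> dict[str, bool]:
    """Map each module or executable name to whether it was found."""
    found = {}
    for name in modules:
        # find_spec locates a top-level module without importing it.
        found[name] = importlib.util.find_spec(name) is not None
    for name in tools:
        # shutil.which searches PATH the way a shell would.
        found[name] = shutil.which(name) is not None
    return found


if __name__ == "__main__":
    for name, present in probe_dependencies().items():
        print(f"{name}: {'ok' if present else 'missing'}")
```

A missing entry only means the corresponding features (for example OCR without pytesseract) are likely unavailable; the core packages are pypdf, pdfplumber, and reportlab.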
Just ask Pi to work with PDFs naturally:
"Extract the tables from invoice.pdf"
"Merge report1.pdf, report2.pdf and report3.pdf into combined.pdf"
"Split this 50-page PDF into chunks of 10 pages"
"Add a CONFIDENTIAL watermark to the document"
"Password-protect the contract PDF"
"Fill out this tax form with my information"
"How many pages are in this PDF?"
# Run all tests
npm test
# Run unit tests only
npm run test:unit
# Run integration tests only
npm run test:integration
pi-pdf/
├── package.json              # Pi package manifest
├── README.md
├── LICENSE
├── extensions/
│   └── index.ts              # Extension tools (pdf_info, pdf_merge, etc.)
├── skills/
│   └── pdf/
│       ├── SKILL.md          # Main skill file with usage guide
│       ├── REFERENCE.md      # Advanced reference documentation
│       ├── FORMS.md          # Form filling workflow guide
│       └── scripts/          # Python helper scripts
│           ├── check_bounding_boxes.py
│           ├── check_fillable_fields.py
│           ├── convert_pdf_to_images.py
│           ├── create_validation_image.py
│           ├── extract_form_field_info.py
│           ├── extract_form_structure.py
│           ├── fill_fillable_fields.py
│           └── fill_pdf_form_with_annotations.py
├── scripts/
│   └── check-deps.js         # Dependency checker
└── tests/
    ├── fixtures/             # Test PDF files
    ├── unit/                 # Unit tests
    └── integration/          # Integration tests
MIT