51,056 questions
Best practices
1 vote, 10 replies, 147 views
What is the most secure way to embed a PDF (preview) into a website?
Aim
Just embed a PDF (preview) into a website. This is very likely a well-known use case.
Given I have this use case for showing a PDF preview, I am searching for a solution. And yes I know e.g. MDN ...
3 votes, 0 answers, 111 views
Calling remove_from_tree on outline item causes "Removed child does not appear to be a tree item" error
I want to change bookmark (aka outline item) colors in a PDF document from black to blue.
It's my understanding that pypdf can only add outline items; you can't change the properties of those already ...
2 votes, 2 answers, 70 views
Pandoc not displaying front matter (YAML) date when compiled
I'm using Pandoc to convert my 4,000-word Markdown file to PDF. My problem is that the date won't appear in the PDF. Here's my system info:
pandoc 3.7.0.2
Features: -server +lua
Scripting engine: ...
-7 votes, 0 answers, 149 views
Recognize Boxes on Form W-2C PDF in Python [closed]
I have 100+ Forms W-2C in a single PDF, and I need to extract the populated fields into an Excel file where each Form W-2C field is a header and there is one row for each Form W-2C.
I am trying to ...
1 vote, 0 answers, 66 views
How to programmatically build /StructTreeRoot tags using pypdf/pypdf2 without breaking existing PDF visual content?
I am developing a Python FastAPI backend to inject PDF accessibility compliance tags (WCAG 2.1) into existing documents generated from Adobe InDesign. The layout and visual binary bytes must remain ...
Advice
0 votes, 1 reply, 110 views
How can I make Firefox's PDF viewer automatically re-load the PDF if it changes?
I use Firefox as my PDF viewer while I write them in another program. How can I make the PDF automatically refresh in Firefox when the file changes?
I am open to workarounds like refreshing every 5 ...
Tooling
0 votes, 3 replies, 96 views
How to access PDF files which have an empty password in PHP
I need to programmatically edit many PDFs across a WordPress site. Using mPDF I can access and edit many of them just fine. The issue is a lot of these files are also ...
Advice
0 votes, 9 replies, 248 views
Looking for a completely free C# PDF library with fluent API and low memory usage
I’m building a .NET application that needs to generate PDF documents dynamically, and I’m trying to choose an appropriate library.
I’m specifically looking for a library that meets the following ...
Advice
1 vote, 6 replies, 245 views
How can I export a PDF file from Laravel application data?
I’m working on a Laravel application where I need to generate and export a PDF file based on database records (such as appointments, payments, or invoices).
I want to be able to:
Generate a PDF from ...
Tooling
1 vote, 5 replies, 484 views
Best Open Source Models or Libraries for Accurate PDF Data Extraction?
I am looking for the best open-source models, frameworks, or libraries for extracting text and structured data from PDFs with high accuracy.
My use case includes:
Extracting text from scanned and ...
Tooling
0 votes, 4 replies, 111 views
Turn scanned form into JSON
I need to turn a list of scanned PDFs, each containing a form filled in by people (some of the answers are handwritten, and the form contains a mix of text and checkboxes), into JSON that I can then use to ...
3 votes, 3 answers, 98 views
iText: offline PAdES validation fails even though DSS is timestamped
The following piece of code tries to validate a PDF using the iText signature validator in offline mode.
The repro archive contains:
timestamped.pdf
SwissSignRootCA.cer
SwissSignTARootCA.cer
Repro files:...
0 votes, 1 answer, 95 views
Should the XRef stream offset always be stored, in that very stream or any other?
Consider:
C:\>cpdf -create-pdf -create-objstm -stdout
%PDF-2.0
%
% skipped
%
5 0 obj
<</Type/XRef/W[1 2 1]/Root 3 0 R/Size 5/Length 20
% skipped
Note that "Size" is not one more (...
0 votes, 2 answers, 73 views
Laravel Storage::response() returns download instead of displaying PDF inline
I'm trying to display a PDF file inline in the browser using Laravel, but instead of showing the file, the browser downloads it.
Here is my code:
return Storage::disk('public')->response(
$path,...
Advice
0 votes, 0 replies, 60 views
What kind of tech stack is most appropriate for reducing redundant manual tasks to a minimum on Windows desktops?
If I wanted to go about creating an app to be deployed on laptops/desktops (Windows mostly) - either as localhost:800X in the browser or maybe even as a desktop app (using Electron?) to accomplish ...