Clinton automatically renames PDF invoices and bills according to their contents, such as dates, sums, and other relevant information.
I got frustrated with certain SaaS vendors naming their invoices with e.g. just an opaque number, making it impossible to easily identify them for correlating with expense management systems.
This is obviously rough around the edges.
- Extract dates, sums, and other key information from PDF files using
pdfminerand custom rules. - Copy-rename files based on extracted information
Bill. Bill Clinton.
To start using Clinton:
- Clone this repository.
- Install the required dependencies;
pip install -e .should do. (Remember to use a virtualenv!) - Unless you happen to use the exact same cloud services I do, you may need to augment the files in
clinton/maps. Contributions are welcome! - Run
python -m clinton *.pdfto see what it'd extract from the files; run with the-d DIRparameter to have it copy-rename them toDIR.
Clinton is licensed under the MIT License. Feel free to use, modify, and distribute the code in accordance with its terms.