So over the weekend I vibe-coded a Python script to validate data in my ledger journals, and I thought it would be worth sharing. I've skimmed the code and done some basic checking that it works. It's also heavily tied to how I capture information in hledger.
I have my journals split by financial year, which runs from 1 July to 30 June in my part of the world. I reference receipts using a comment containing a relative file path. All receipt filenames start with "YYYYMMDD - ".
Currently receipts are stored in ./receipts/, so the comment looks like:
; Receipt ./receipts/YYYYMMDD - Invoice.pdf
I keep everything else, like payslips, in an information folder. Those files follow the same naming format, so they appear in comments like:
; ./information/YYYYMMDD - Payslip.pdf
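To make the comment convention concrete, here is a minimal sketch of how such references could be pulled out of a journal. It is not the code from the repo. The function name `extract_references` and the regex are illustrative assumptions: a reference is assumed to be a `;` comment whose path starts with `./` and runs to the end of the line (paths contain spaces, so the match cannot stop at whitespace).

```python
import re

# Assumption: the path starts at the first "./" after a ';' and runs to the
# end of the line, because filenames like "YYYYMMDD - Invoice.pdf" have spaces.
REF_RE = re.compile(r";.*?(\./\S.*?)\s*$")


def extract_references(journal_text):
    """Return the relative paths mentioned in ';' comments, in file order."""
    refs = []
    for line in journal_text.splitlines():
        match = REF_RE.search(line)
        if match:
            refs.append(match.group(1))
    return refs
```

A comment with optional leading text ("; Receipt ./receipts/...") and one without ("; ./information/...") are both picked up, while ordinary comments that contain no `./` are ignored.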
This script is supposed to check 3 things:
- all files referenced in a journal exist
- all files in folders in the same location as the journal are referenced in a journal, excluding things like .git
- filenames match the YYYYMMDD format at the start and are in the right financial year (with a whitelist for things that may violate this check)
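For anyone curious how these three checks might look in code, here is a rough sketch under my own assumptions, not the implementation in the repo. All function names are hypothetical. The financial year is identified by the calendar year it starts in (1 July), and the whitelist is assumed to be a plain collection of filenames.

```python
import re
from datetime import date, datetime
from pathlib import Path

DATE_PREFIX = re.compile(r"^(\d{8})")  # YYYYMMDD at the start of the filename


def financial_year_start(day: date) -> int:
    """Calendar year in which the 1 July - 30 June year containing `day` begins."""
    return day.year if day.month >= 7 else day.year - 1


def missing_files(journal_dir, references):
    """Check 1: referenced paths that do not exist relative to the journal."""
    return [ref for ref in references if not (Path(journal_dir) / ref).is_file()]


def unreferenced_files(journal_dir, references, ignored=(".git",)):
    """Check 2: files in folders next to the journal that no journal mentions."""
    root = Path(journal_dir)
    known = {(root / ref).resolve() for ref in references}
    orphans = []
    folders = sorted(p for p in root.iterdir() if p.is_dir() and p.name not in ignored)
    for folder in folders:
        for path in sorted(folder.rglob("*")):
            if path.is_file() and path.resolve() not in known:
                orphans.append(path.relative_to(root).as_posix())
    return orphans


def filename_problem(name, fy_start, whitelist=()):
    """Check 3: describe what is wrong with `name`, or return None if it is fine."""
    if name in whitelist:
        return None
    match = DATE_PREFIX.match(name)
    if not match:
        return "does not start with YYYYMMDD"
    try:
        day = datetime.strptime(match.group(1), "%Y%m%d").date()
    except ValueError:
        return "date prefix is not a real date"
    if financial_year_start(day) != fy_start:
        return "date falls outside the financial year"
    return None
```

With these definitions, a file dated 30 June 2025 belongs to the financial year starting in 2024, so `filename_problem("20250630 - Invoice.pdf", 2024)` returns `None`, while the same call with a 1 July 2025 date reports a problem.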
The script can also run as a git hook. It can be found on GitHub at KhaineBOT/hledger-file-reference-checker.