Python script to validate files referenced in hledger comments

So over the weekend I vibe coded a Python script to validate data in my ledger journals, and I thought it would be worth sharing. Fair warning: I've only skimmed the code and done some basic checks that it works, and it's heavily tied to how I capture information in hledger.

I have my journals split by financial year, which is 1 July - 30 June in my part of the world. I reference receipts using a comment and a relative file path. All receipt filenames start with YYYYMMDD -

Currently receipts are stored in ./receipts/ so the comment looks like

; Receipt ./receipts/YYYYMMDD - Invoice.pdf

I keep everything else, like payslips, in an information folder. Those files follow the same naming format, so they appear in comments like

; ./information/YYYYMMDD - Payslip.pdf
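
To make the comment convention concrete, here is a rough sketch of how references like the two above could be pulled out of a journal. This is not the code from the repo; the regex and the function name `extract_references` are my own illustration, and it assumes each reference is a ; comment with an optional one-word label (such as Receipt) followed by a path that starts with ./ and runs to the end of the line.

```python
import re

# Assumed pattern (not taken from the actual script): a ';' comment,
# an optional one-word label such as "Receipt", then a relative path
# starting with "./" that runs to the end of the line.
REF_PATTERN = re.compile(r";\s*(?:\w+\s+)?(\./\S.*?)\s*$")


def extract_references(journal_text):
    """Return the relative file paths found in journal comments, in order."""
    refs = []
    for line in journal_text.splitlines():
        match = REF_PATTERN.search(line)
        if match:
            refs.append(match.group(1))
    return refs
```

Because the filenames contain spaces, the sketch takes everything up to the end of the line as the path, so it would not cope with extra text after the filename.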

This script is supposed to check 3 things:

  1. all files referenced in a journal exist
  2. all files in folders in the same location as the journal are referenced in a journal, excluding things like .git
  3. filenames match the YYYYMMDD format at the start and are in the right financial year (with a whitelist for things that may violate this check)
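
For anyone who wants to see the idea without opening the repo, the three checks above can be sketched roughly like this. Again, this is an illustration and not the actual script: all function names are made up, the financial year is passed in as its starting year (so 2024 means 1 July 2024 to 30 June 2025), the whitelist is assumed to be a collection of bare filenames, and the list of ignored folders is assumed to be just .git.

```python
import re
from datetime import date, datetime
from pathlib import Path

# Filenames are expected to look like "YYYYMMDD - Something.pdf".
DATE_PREFIX = re.compile(r"^(\d{8}) - ")


def filename_date(name):
    """Return the date encoded at the start of a filename, or None."""
    match = DATE_PREFIX.match(name)
    if not match:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d").date()
    except ValueError:  # eight digits that are not a real date
        return None


def in_financial_year(day, start_year):
    """True if day falls within 1 July start_year .. 30 June start_year + 1."""
    return date(start_year, 7, 1) <= day <= date(start_year + 1, 6, 30)


def check_references(refs, base_dir, start_year, whitelist=()):
    """Checks 1 and 3: referenced files exist and are dated correctly."""
    problems = []
    base = Path(base_dir)
    for ref in refs:
        path = base / ref
        if not path.is_file():
            problems.append(f"missing: {ref}")
        if path.name in whitelist:  # hypothetical whitelist of filenames
            continue
        day = filename_date(path.name)
        if day is None:
            problems.append(f"bad date prefix: {ref}")
        elif not in_financial_year(day, start_year):
            problems.append(f"wrong financial year: {ref}")
    return problems


def unreferenced_files(refs, base_dir, ignored_dirs=(".git",)):
    """Check 2: files in folders next to the journal that nothing references."""
    base = Path(base_dir)
    referenced = {(base / ref).resolve() for ref in refs}
    extra = []
    for folder in base.iterdir():
        if not folder.is_dir() or folder.name in ignored_dirs:
            continue
        for path in folder.rglob("*"):
            if path.is_file() and path.resolve() not in referenced:
                extra.append(path.relative_to(base).as_posix())
    return sorted(extra)
```

Both functions take the list of paths found in the journal comments plus the folder the journal lives in, and return a list of problems, so an empty list means everything checked out.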

The script can also run as a git hook. It can be found on GitHub: KhaineBOT/hledger-file-reference-checker (Script to check files referenced in hledger comments)
