Using (local) LLMs with hledger records and reports

I’ve spent some time recently playing with LLMs and my financial records, and I want to share my findings. Most of them are trivial, but I hope that they could nevertheless be useful or provide food for thought.

I’m quite paranoid about sending my personal info to third parties, so I was not comfortable giving commercial providers like OpenAI or Anthropic access to my records as-is – or even providers that promise some degree of anonymity, like Deepinfra or Venice.ai. I decided there are two ways to work around this limitation: anonymising my data and using local inference.

On anonymising data

I found that you can go surprisingly far if you convert your quantitative facts into qualitative or comparative descriptions. Instead of saying “I earn $X, my expenses are $E, my house costs $Y, my mortgage is $Z, and my total estate is $T”, you can say something like “I am interested in estate planning; my estate is more than [some threshold]” or “My earnings put me in [some bracket], my mandatory expenses are X% of that, and further voluntary expenses are Y% of that”, and so on. Some of the answers or facts can be pure yes/no binary answers: yes, I have kids; no, I do not have a Roth IRA; etc.

I found that you can almost always find a way to give a qualitative answer that moves the discussion forward meaningfully. Your initial prompt could say, “I am not comfortable disclosing exact amounts, but I am OK giving qualitative answers. If you identify meaningful boundaries or limits, you can ask me how my figures compare to them”, to ensure that the model formulates its questions in a way that works with qualitative answers.
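To make the idea concrete, here is a minimal sketch of the bucketing step: before pasting anything into a prompt, map exact figures to broad bands and percentages. The band edges and labels here are invented for illustration, not values from the thread.

```python
# Sketch: convert exact figures into qualitative statements before
# sharing them with a model. Band edges/labels are arbitrary examples.

def band(value, edges, labels):
    """Return the label of the first band whose upper edge exceeds value."""
    for edge, label in zip(edges, labels):
        if value < edge:
            return label
    return labels[-1]

def anonymise(income, expenses):
    """Describe income as a broad band and expenses as a share of income."""
    income_band = band(
        income,
        edges=[50_000, 100_000, 200_000],
        labels=["low", "middle", "upper-middle", "high"],
    )
    expense_pct = round(100 * expenses / income)
    return (f"My earnings put me in a {income_band} income band; "
            f"my expenses are roughly {expense_pct}% of that.")

print(anonymise(income=87_000, expenses=52_200))
# -> My earnings put me in a middle income band; my expenses are roughly 60% of that.
```

The model never sees the raw numbers, only the band and the ratio, which is usually enough for it to reason about your situation.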

Depending on the questions and answers, this could still amount to a significant disclosure of personal circumstances, of course :slight_smile:, which brings me to the next point.

Local inference

When running local models, I found that I get the most value from opencode (a CLI tool similar to Claude Code) or open-notebooklm, rather than from a simple chat-like interface.

Opencode can grep through your journals and reports, can be instructed to run "hledger" directly to extract the numbers it needs, and in general works best when you are exploring things.
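For anyone curious what “running hledger directly” looks like under the hood, here is a hedged sketch of how a script (or an agent) might shell out to hledger and capture a report. It assumes hledger is on your PATH and your journal is found via the usual LEDGER_FILE lookup; the helper just builds the argument list.

```python
# Sketch: shelling out to hledger to extract numbers on demand.
# Assumes hledger is installed and can find your journal file.
import subprocess

def hledger_cmd(report, *query, output="csv"):
    """Build an hledger command line: report name, query terms, output format."""
    return ["hledger", report, *query, "-O", output]

def run_report(report, *query):
    """Run the report and return its stdout (requires hledger installed)."""
    return subprocess.run(
        hledger_cmd(report, *query),
        capture_output=True, text=True, check=True,
    ).stdout

# Example: monthly expense totals as CSV, e.g. run_report("balance", "expenses", "-M")
print(hledger_cmd("balance", "expenses", "-M"))
# -> ['hledger', 'balance', 'expenses', '-M', '-O', 'csv']
```

CSV output (`-O csv`) is handy here because it is trivial for a model (or a follow-up script) to parse.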

Open-notebooklm works best when you want to combine your records with other documents. For example, you can create a notebook, upload your tax code, your tax filing, and your end-of-year reports from hledger, and ask the model to verify that the saved tax form lines up with your financial records (double-checking your work).
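The "double-checking" step boils down to a tolerance comparison: tax forms are usually rounded to whole currency units, so an exact match against an hledger total is too strict. A tiny sketch, with made-up figures:

```python
# Sketch: compare a tax-form line against an hledger report total,
# allowing for whole-unit rounding on the form. Figures are invented.

def matches_form(report_total, form_line, tolerance=0.50):
    """True if the rounded form figure is within tolerance of the report total."""
    return abs(report_total - form_line) <= tolerance

# hledger says interest income was 1234.37; the form shows 1234.
print(matches_form(1234.37, 1234))  # True
print(matches_form(1234.37, 1230))  # False: worth a closer look
```

This is roughly the check you are asking the model to perform in prose, line by line.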

I know that hledger MCP exists, but I haven't tried it yet.

Dear reader, share your stories from the trenches :slight_smile: Are you using LLMs? What for, and how?

5 Likes

I am way, way, way deep down this rabbit hole and for me, there is no going back!

I just prepared my entire 2025 taxes for my two small businesses using codex and a single “skill” I made called “prepare-tax-packet”.

The results were absolutely perfect. I iterated with codex (an agent like opencode, Claude Code, etc.) to build the first version of the skill. I then simply ran it on the second business, and it was complete in about 8 minutes. All of this is enabled by the incredible access and tools that plain text accounting (hledger, in my case) gives us. Go PTA!!

I created the HLedger MCP, so I will link it here: GitHub - iiAtlas/hledger-mcp: A local MCP server for interacting with the HLedger cli

It is helpful if you want to use your data in the Claude or ChatGPT desktop apps. I often “chat with my finances” there, but for taking actions I rely on the HLedger CLI (no MCP) and either the Claude Code or Codex harnesses. Both are more than capable.

I have skills for importing receipts, importing statements, even reconciling and doing monthly rollups. I have a Resend Inbox set up so that I can email myself a receipt; it runs through my import codex skill and adds itself to the journal. Absolutely incredible!!! I’ve never had so much fun with my accounting :slight_smile:

1 Like

So you are basically just accepting the risk of data leaks/inclusion in training sets and using commercial LLM providers, right?

1 Like

Yes. I can only concern myself with so many things in the world. If they want to know that I paid $4.00 for my GitHub subscription, I accept it. I recognize this may not be the case for everyone!

If that’s not acceptable to you, ollama is a great alternative, allowing you to run any number of local models.

What local models did you use with opencode?

(Just a warning for people who don’t know: while opencode itself is open source, it defaults to using a proprietary remote LLM provider, so make sure you configure it for local use first!)

2 Likes

So far I’ve used qwen3-coder-30b for coding the various supplementary scripts for hledger, and gpt-oss (20b and 120b) for working with the actual financial reports. Qwen3 is a non-thinking model, and I found it faster and easier to work with for coding tasks; it also has a sizable context window. Gpt-oss is palpably better when “reasoning” about finance, though, and at processing large volumes of text (such as tax codes). Just my $0.02; I do not claim to be any sort of authority on this topic.

3 Likes