Is it possible to combine 2 csv transactions into 1 using hledger?

I have a csv file from a crypto exchange which looks something like this:

date,transaction id,currency,amount
2025-09-27,abc123,USD,-20
2025-09-27,abc123,BTC,1.2
2025-09-10,xyz456,USD,-55
2025-09-10,xyz456,BTC,3.1

Basically each purchase is split into 2 records, the fiat currency deducted and the cryptocurrency purchased, and they share the same transaction id.

Is it possible for hledger to automatically detect the 2 csv rows with identical IDs and create a single transaction from it?

PS: less important, but I have a similar situation with currency conversion transactions on my credit-card statement. When I purchase something in a different currency, a 2nd transaction shows up with the same description as the main one plus a special string, something like ` CCY CONVERSION`.

Not directly; you’d have to configure a data cleaning command to preprocess the csv.

Paypal csv is like this with currency conversions. I handle that by generating two transactions, each one balanced by equity:conversion.

Thanks simonmic! I was hoping to hear from you :)

As a matter of fact I came across the data cleaning command yesterday and that’s what led me to ask this question. I have been handling the currency conversion fee with a custom Python script that scans all transactions, finds the conversion transactions, and adds identifying info to them so that the csv rules file can auto-categorize them. In short, a lot of Python magic which I would like to get rid of or at least simplify.

Even with data cleaning, there is no way to do this out of the box, right? If I understand it correctly, I’d have to write a script that gets called by the data-cleaning command.

Still, I appreciate the data-cleaning step now baked into the CSV rules. It eliminates a big preprocessing step for me.

I’ll try your method of using a temporary account to hold the transfer/conversion.


I just realized that I won’t be able to use the @@ format for these purchases if I split them into separate transactions. So I won’t be able to use P directives to track the performance or PNL.

I think writing a script which can combine 2 transactions into 1 based on the value of a csv column might be a better solution for me.

No way with hledger to build a single entry based on two CSV records, no; you’d have to preprocess and merge all the required info into one record first.
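
For what it's worth, a merge step like that could be sketched in Python roughly as follows. This is only a minimal sketch, assuming every transaction id has exactly one negative (paid) row and one positive (bought) row, as in the sample above. The output column names (`paid_amount`, `bought_currency` and so on) are made up for illustration; nothing in hledger requires them.

```python
import csv
import io
import sys
from collections import defaultdict

# Sample rows copied from the question above.
SAMPLE = """\
date,transaction id,currency,amount
2025-09-27,abc123,USD,-20
2025-09-27,abc123,BTC,1.2
2025-09-10,xyz456,USD,-55
2025-09-10,xyz456,BTC,3.1
"""

# Hypothetical column names for the merged file.
FIELDS = ["date", "id", "paid_amount", "paid_currency",
          "bought_amount", "bought_currency"]


def merge_pairs(src):
    """Merge the two rows sharing a transaction id into one wide record."""
    groups = defaultdict(list)
    for row in csv.DictReader(src):
        groups[row["transaction id"]].append(row)

    merged = []
    for txn_id, rows in groups.items():
        # Assumption: the negative row is what was paid, the other what was bought.
        paid = [r for r in rows if r["amount"].strip().startswith("-")]
        bought = [r for r in rows if not r["amount"].strip().startswith("-")]
        if len(paid) != 1 or len(bought) != 1:
            raise ValueError(f"{txn_id}: expected one paid row and one bought row")
        merged.append({
            "date": paid[0]["date"],
            "id": txn_id,
            "paid_amount": paid[0]["amount"].strip().lstrip("-"),
            "paid_currency": paid[0]["currency"],
            "bought_amount": bought[0]["amount"].strip(),
            "bought_currency": bought[0]["currency"],
        })
    return merged


def write_merged(rows, out):
    """Write the merged records as CSV with a header line."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Demo on the sample data; a real script would read the exchange's file instead.
write_merged(merge_pairs(io.StringIO(SAMPLE)), sys.stdout)
```

With both sides of the purchase on one record, a rules file has everything it needs to build a single entry, including the amounts on each side that a cost would be computed from.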

Here are the rules I came up with. (The if block is just to order the postings in a sequence that I like.)

skip 1
fields date,id,_currency,_amount
intra-day-reversed
description %id

account1 equity:conversion
account2 assets:exchange:%_currency
amount -%_amount %_currency

if %_amount ^-
 account1 assets:exchange:%_currency
 account2 equity:conversion
 amount %_amount %_currency

And here's the conversion:

$ hledger -f a.csv print
2025-09-10 xyz456
    assets:exchange:USD         -55 USD
    equity:conversion            55 USD

2025-09-10 xyz456
    equity:conversion          -3.1 BTC
    assets:exchange:BTC         3.1 BTC

2025-09-27 abc123
    assets:exchange:USD         -20 USD
    equity:conversion            20 USD

2025-09-27 abc123
    equity:conversion          -1.2 BTC
    assets:exchange:BTC         1.2 BTC

Yes, I think you're right that with separate entries like these, you can't record the costs, you can't --infer-costs, you can't --infer-market-prices, and you can't show a bal --gain report.

Your rules are way simpler than the hodge-podge I came up with, haha. Thanks for that!

And yeah, I did a few experiments. Can't get the cost with this. Wrote a short python script which combines the transactions.

Thanks for your help!


Trying not to be too pedantic, but it might be better to say, “Is it possible to combine 2 csv postings into one transaction using hledger?”

I believe this would be a nice feature for hledger (and all PTA programs). Also, rather than combining only two lines, it would be nice to combine n lines that share the same transaction_id field.
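
Until something like that exists, the n-line case can be approximated outside hledger. Here is a rough Python sketch, not a proposal for how hledger should do it: it groups any number of rows by id and prints one journal entry per id, with one posting per row. The `assets:exchange:` account prefix is borrowed from the rules later in this thread, and the third `abc123` row is invented purely to show a group of more than two. As far as I know hledger can infer a conversion cost only when an entry involves exactly two commodities, so treat that as something to verify for your own data.

```python
import csv
import io

# Rows from the original question, plus one invented extra row
# (the -0.5 USD line) to illustrate a group with more than two records.
SAMPLE = """\
date,transaction id,currency,amount
2025-09-27,abc123,USD,-20
2025-09-27,abc123,BTC,1.2
2025-09-27,abc123,USD,-0.5
2025-09-10,xyz456,USD,-55
2025-09-10,xyz456,BTC,3.1
"""


def group_by_id(src, id_field="transaction id"):
    """Collect CSV rows into lists keyed by id, keeping first-seen order."""
    groups = {}
    for row in csv.DictReader(src):
        groups.setdefault(row[id_field], []).append(row)
    return groups


def to_journal(groups):
    """Render each group as one journal entry with one posting per row."""
    entries = []
    for txn_id, rows in groups.items():
        # Assumption: all rows in a group share a date; the first row's is used.
        lines = [f"{rows[0]['date']} {txn_id}"]
        for r in rows:
            # Two spaces separate the account name from the amount.
            lines.append(
                f"    assets:exchange:{r['currency']}  {r['amount']} {r['currency']}"
            )
        entries.append("\n".join(lines))
    return "\n\n".join(entries) + "\n"


print(to_journal(group_by_id(io.StringIO(SAMPLE))), end="")
```

The output can be saved to a journal file and checked with `hledger -f FILE print` before relying on it.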

Transactions is a generic term, and usually what the exporting app calls them. I usually say "csv records".

There's a question of how much complexity is too much for csv rules, which are supposed to be simple and approachable for non-programmers. They could easily grow into yet another never finished programming language.

I.e., at some point a person with advanced custom needs should just write a conversion script in a real programming language.

But hledger csv rules also have some smarts that make them better at doing this robustly. I feel the sweet spot is to use them as much as possible; and when necessary, assist them by preprocessing (or generating) the csv with an external script.