CHIP-2025-03: Merkle Header Commitment for Enhanced SPV Scalability

This CHIP proposes adding a Merkle tree commitment of all block headers to Bitcoin Cash, with the Merkle root embedded in each block’s coinbase transaction.
This enables simplified payment verification (SPV) clients to verify past block headers and transactions using a single root and compact proofs, reducing sync data from ~72 MB (full header chain as of March 2025) to under 15 kB for a day of headers (144 blocks, assuming 10-minute intervals) plus a single root and proof.
The scheme improves SPV scalability while maintaining security and compatibility with Bitcoin Cash’s design.
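For anyone who wants to check the size figures above, here is a quick back-of-the-envelope sketch. The 80-byte header size and the 144 blocks per day come from the text; `ASSUMED_HEIGHT` is my own rough estimate of the chain height in March 2025, not an exact number.

```python
HEADER_BYTES = 80          # fixed size of a serialized block header
BLOCKS_PER_DAY = 144       # 24 h * 6 blocks/h at 10-minute intervals
ASSUMED_HEIGHT = 890_000   # assumption: approximate BCH height, March 2025


def header_bytes(n_headers: int) -> int:
    """Raw download size, in bytes, of n consecutive block headers."""
    return n_headers * HEADER_BYTES


FULL_CHAIN_MB = header_bytes(ASSUMED_HEIGHT) / 1_000_000
DAY_OF_HEADERS_BYTES = header_bytes(BLOCKS_PER_DAY)

print(f"full header chain: ~{FULL_CHAIN_MB:.1f} MB")
print(f"one day of headers: {DAY_OF_HEADERS_BYTES} bytes")
```

Under that height assumption the full chain comes to roughly 71 MB and a day of headers to 11,520 bytes, which is consistent with the "~72 MB" and "under 15 kB" figures.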

Previously mentioned here:

and credits to @tom for floating the idea here: https://github.com/cculianu/Fulcrum/issues/187

and @cculianu for already implementing it in Fulcrum / Electron Cash.

The CHIP would basically add this commitment to coinbase transactions, and make compact SPV verification fully trustless.

6 Likes

As the maintainer of an SPV-over-merkle-blocks wallet, I don’t see any need for any protocol changes.

This proposal will actually have a counter-productive effect on scalability.

2 Likes

Ok, but would you care to use Fulcrum / Electron Cash’s super-Merkle proofs (as you indicated in that GitHub issue)? I guess developer checkpoints would work for you, but then you’re trusting the developer.

Changing the checkpoint would let the developer fool SPV clients into accepting a proof of a non-existent historical TX. That said, because the code is open source, anyone could independently verify the developer checkpoint against known headers and catch such an attempt (assuming anyone would actually be checking).

With commitments in consensus, the entire network would be verifying them, automatically.

How so? It would only take about 20 hashes to update the root on each new block, that’s less hashing than what’s needed for just one or two normal transactions.
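A rough sketch of where "about 20" comes from, assuming a binary Merkle tree over all headers in which appending a leaf touches one node per level. `ASSUMED_HEIGHT` is my own estimate of the current chain height, and the exact tree layout the CHIP would use is not specified here.

```python
HASH_BYTES = 32            # SHA-256 digest size
ASSUMED_HEIGHT = 890_000   # assumption: approximate number of headers


def merkle_depth(n_leaves: int) -> int:
    """Number of levels above the leaves in a binary tree with n leaves."""
    if n_leaves < 1:
        raise ValueError("need at least one leaf")
    # Smallest d such that 2**d >= n_leaves, using integer arithmetic only.
    return (n_leaves - 1).bit_length()


depth = merkle_depth(ASSUMED_HEIGHT)
proof_bytes = depth * HASH_BYTES   # one sibling hash per level

print(f"tree depth: {depth} levels")
print(f"inclusion proof: {proof_bytes} bytes")
```

With about 890k leaves the depth is 20, so an inclusion proof is 20 sibling hashes (640 bytes). The depth only reaches 21 once the chain passes 1,048,576 headers.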

3 Likes

There’s no need for trust, actually. We can implement this without consensus change, but clients would still have to download the whole header chain once - and they’d then compute the root by themselves and throw away the header chain. They’d do this only on initial sync. Later they could just get the header segment from tip back to the root & tip segment they had kept from before, and prune again to new root and tip segment.

SPV servers could be upgraded to offer header Merkle proofs for clients that do this. So, even right now we could save clients the burden of storing the whole header chain!
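To make the "compute the root, then ask servers for proofs" idea concrete, here is a minimal sketch. It assumes a Bitcoin-style Merkle tree over double-SHA256 header hashes, duplicating the last node on odd-sized levels. Whether Fulcrum or the CHIP uses exactly this layout is an assumption on my part, and the function names are made up for illustration.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as used for block header hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _next_level(level):
    """Pair up nodes; duplicate the last one if the level is odd-sized."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    level = list(leaves)
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_branch(leaves, index):
    """Sibling hashes needed to prove that leaves[index] is under the root."""
    level, branch = list(leaves), []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        branch.append(level[index ^ 1])   # sibling at this level
        level = _next_level(level)
        index //= 2
    return branch


def verify_branch(leaf, index, branch, root):
    node = leaf
    for sibling in branch:
        # Even index: node is the left child; odd index: the right child.
        node = dsha256(node + sibling) if index % 2 == 0 else dsha256(sibling + node)
        index //= 2
    return node == root


# Stand-in 80-byte "headers"; a real client would use the downloaded chain.
headers = [bytes([i]) * 80 for i in range(11)]
leaves = [dsha256(hdr) for hdr in headers]

root = merkle_root(leaves)            # the client keeps this, drops the headers
proof = merkle_branch(leaves, 5)      # a server would send this on request
print(verify_branch(leaves[5], 5, proof, root))
```

The client side only needs `verify_branch` plus the stored root; `merkle_branch` is the part a server would run.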

So what is the benefit of adding roots to consensus? It would offer clients a flexible scale for initial trustless sync: they could skip downloading the whole header chain and download just the last 100k blocks (fixed 8MB vs ever-growing chain, now at 72 MB), verify the 100k-deep root, compute a 1k or 100-deep root and throw away the 8MB.

3 Likes

Sounds absolutely awesome! :+1::+1:

Seems like something this obviously great would have been implemented long ago. Why wasn’t it?

Are there any potential downsides we should be aware of?

1 Like

When you make such claims, you should always substantiate them more. I mean, the proposal sounds like something we should have done 8 years ago.

Can you give an example of why and how it would affect scalability?

5 Likes

This won’t work, sadly. It’s a bit of a chicken-and-egg problem, and headers are unique in that they cannot be “committed to” on the blockchain itself. What do I mean by that?

Think about what a commitment on the blockchain means:

It means that some piece of data X can be trusted, because there is PoW backing it. Therefore that piece of data, X, is difficult to forge. To know X is real, you can look up a block, and if you see X in the block (say X is a tx, or a UTXO, or a CashToken, or something like some future UTXO commitment)… you know that X is real because you do the following:

  1. You validate all headers leading up to the block containing X by downloading the headers and validating PoW.
  2. You either obtain the block containing X itself, or obtain a merkle proof for the txn where X appears in that block (to save b/w).

All of this necessarily requires that you see all the headers to validate PoW in the first place.
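For reference, step 1 above looks roughly like this in code. This is a simplified sketch: it checks that each 80-byte header meets the target encoded in its own `bits` field and links to the previous header, but it deliberately leaves out the difficulty-adjustment rules a real client also enforces. `mine_demo_header` is a made-up helper that only exists to produce test data at a regtest-style trivial difficulty.

```python
import hashlib
import struct


def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def bits_to_target(bits: int) -> int:
    """Expand the compact 'bits' field into the full 256-bit target."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


def check_header_chain(headers) -> bool:
    """True if every header meets its own target and links to its parent."""
    prev_hash = None
    for raw in headers:
        if len(raw) != 80:
            return False
        block_hash = dsha256(raw)
        bits = struct.unpack_from("<I", raw, 72)[0]      # bytes 72..75
        if int.from_bytes(block_hash, "little") > bits_to_target(bits):
            return False                                  # not enough PoW
        if prev_hash is not None and raw[4:36] != prev_hash:
            return False                                  # broken prev-hash link
        prev_hash = block_hash
    return True


def mine_demo_header(prev_hash: bytes, bits: int = 0x207FFFFF) -> bytes:
    """Hypothetical helper: brute-force a nonce at a trivial difficulty."""
    nonce = 0
    while True:
        # version, prev hash, tx merkle root, time, bits, nonce = 80 bytes
        raw = struct.pack("<I32s32sIII", 1, prev_hash, b"\x00" * 32, 0, bits, nonce)
        if int.from_bytes(dsha256(raw), "little") <= bits_to_target(bits):
            return raw
        nonce += 1


chain = [mine_demo_header(b"\x00" * 32)]
for _ in range(4):
    chain.append(mine_demo_header(dsha256(chain[-1])))

print(check_header_chain(chain))
```

The point is that the loop has to visit every header from the starting point up to the block of interest, which is exactly the cost being discussed.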

Putting a header commitment H in a block is not going to help you at all.

To know that header commitment H is valid, you need to validate all PoW leading up to the block containing H.

To do that… you need to download all the headers anyway and verify they link up and PoW is valid. Which is the very thing you were trying to not have to do in the first place with this proposal!

So it’s a chicken-and-egg problem.


In summary: This information doesn’t belong on the blockchain, because it cannot be verified without downloading all headers in the first place – the very thing this CHIP tries to optimize away!

This information might be useful to some SPV wallet but it can be obtained from an oracle anyway or from a hard-coded checkpoint in the wallet. It would have exactly the same security properties if done that way as it would have if put it on the blockchain.

It’s a unique chicken-and-egg problem to headers specifically – since they are the backbone of bitcoin and are the 1 thing that must be validated to know anything else is real. Therefore any header commitment in Bitcoin cannot work… sadly.

4 Likes

I am not a developer but I have found grok with the think function has really helped me understand these more technical debates. Someone might find this conversation I had with it about this chip useful https://x.com/i/grok/share/cMmWQmYaRK9Gwe2hG8JK85tW9
You’ll notice near the bottom is where I feed it the actual text of mr autists proposal where it then understands the technicals at a better level and gives an updated answer to the chicken and egg question.

1 Like

The grok analysis is indeed very good. I am amazed at how good AIs are getting these days. I don’t agree with its conclusion necessarily but it’s compelling to read the way it reasons.

Let me phrase why this won’t work another way – think about how to attack a chain that is “secured” by header merkle root commitments (on-chain) vs one that just uses an oracle as an off-chain speed-hack for SPV clients.


The way to scam with the oracle approach (no on-chain commitments) is:

  • You make up a fake merkle root that covers block headers that do not exist: Fake blocks 0 → N-D-1
  • And then to convince the SPV client this proof is “real” you forge some PoW-correct and difficult blocks (or headers) from N-D to N. The SPV client checks that block header N-D-1 is covered by the merkle root (via a merkle proof of inclusion), then it sees block N-D links to block N-D-1 via prevBlockhash… and it validates PoW for blocks N-D → N (the present).

Total cost of forgery: D+1 blocks’ worth of PoW.


The way to scam with the on-chain header commitments approach is:

  • You make up a fake block N-D that isn’t real, containing a fake merkle root (this costs 1 block’s worth of PoW)
  • The merkle proof covers block N-D-1, so you can use that “proof” to “prove” N-D-1 is included in the set, and by extension, fake-block N-D is “real”.
  • Then you also need to forge headers N-D → N, again respecting PoW.

Total cost of forgery: D+1 blocks’ worth of PoW, identical to the oracle case.

To me these seem like identical costs of forgery, more or less.


Edit: @lucasmcducas indicated to me that he asked grok and it agrees: https://x.com/i/grok/share/lq55v4dbFbq5RNhiuMM0542yr

You do need to scroll down to the end.

Side-Note: AI agreeing with an argument is not proof the argument is sound. It’s just sometimes useful to ask AI since it can write back to you in very good language so it helps to see some nicely written text as to how it reasons… sometimes helping elevate the discussion (or sometimes not).

3 Likes

Exactly. Readers can use it to try to better understand what each of us is saying, but please don’t post walls of AI-generated text here and clutter the discussion. These Grok links are a convenient way to bring an AI consultant into the workflow. Here’s an update: https://x.com/i/grok/share/ZMxi39n7GwUQBm7Z1Aga7Jm4j

I don’t need to forge PoW, you see, because when generating the merkle root I can just use the legitimate N-D-1 header, which will correctly link with the legitimate N-D to N segment. Assuming the client checks only the N-D-1 to N-D link, I have full freedom to rewrite all blocks 0 to N-D-2, and it will cost me 0 PoW.

That’s not the case if merkle root is checked by consensus and committed in coinbase.

The root’s inclusion in the coinbase means every full node and miner has checked its accuracy against the chain they’ve agreed upon. If we extracted the root from N-D’s coinbase, then altering N-D-2 requires rewriting block N-D (to change the commitment verified by the network from real to fake) and all subsequent blocks, which is infeasible due to the cumulative PoW.
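Here is a small sketch of the zero-PoW forgery against an off-chain root, to make the argument concrete. It uses a toy Bitcoin-style Merkle tree over stand-in header hashes; the labels like `real-7` are invented for the demo, and index 7 plays the role of the legitimate N-D-1 header that the client checks the link against.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _pad(level):
    """Duplicate the last node when a level has an odd number of nodes."""
    return level + [level[-1]] if len(level) % 2 else level


def merkle_root(leaves):
    level = list(leaves)
    while len(level) > 1:
        level = _pad(level)
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_branch(leaves, index):
    level, branch = list(leaves), []
    while len(level) > 1:
        level = _pad(level)
        branch.append(level[index ^ 1])
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return branch


def verify_branch(leaf, index, branch, root):
    node = leaf
    for sibling in branch:
        node = dsha256(node + sibling) if index % 2 == 0 else dsha256(sibling + node)
        index //= 2
    return node == root


# Honest history: header hashes 0..7, where index 7 stands for N-D-1.
real = [dsha256(b"real-%d" % i) for i in range(8)]
# Attacker history: invented headers 0..6, but the genuine N-D-1 at index 7.
fake = [dsha256(b"fake-%d" % i) for i in range(7)] + [real[7]]

honest_root = merkle_root(real)   # what consensus would have committed
forged_root = merkle_root(fake)   # what a dishonest oracle could hand out

# The anchor header checks out against the forged root, at zero PoW cost...
anchor_ok = verify_branch(real[7], 7, merkle_branch(fake, 7), forged_root)
# ...and so does an invented historical header.
fake_ok = verify_branch(fake[3], 3, merkle_branch(fake, 3), forged_root)
# Against the root committed by the honest chain, the invented header fails.
fake_vs_honest = verify_branch(fake[3], 3, merkle_branch(fake, 3), honest_root)

print(anchor_ok, fake_ok, fake_vs_honest)
```

The forged root is accepted only because nothing ties it to the PoW chain. Once the root has to come out of a coinbase that is itself buried under D blocks, swapping it means redoing that work.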

The bit that’s key to understanding is this: given a header chain 0 to N and some intermediate block M, the segment 0 to M only proves the ancestry of block M, while the contents of block M have only 1 block’s worth of PoW backing them.
It is the segment M to N that increases the PoW backing the contents of block M.

So, these coinbase commitments would be able to prove PoW up to value determined by user (e.g. 10k blocks), but they would not be able to prove ancestry.
Suppose BCH introduced the commitments, and then the network forked again to BCH1 and BCH2 and both kept the commitment scheme.
Both chains add 10k blocks more.
Now, a client asks for a root + header segment 1k long: how does it know which network it belongs to? It doesn’t.

But how is that different from the current situation? A client is given both the BTC and BCH full header chains: how does it know which network is which? It needs to know the hash of the first block after the common block.

The same method works to distinguish BCH1 from BCH2: the client would ask for a Merkle proof for the first block after the common block and match it against a known fork.

2 Likes

Hmm yes. You are right. The way merkle trees work, yeah.

Ok, so then they are not equivalent in forging cost. That was the key to our disagreement. I apologize for getting this wrong initially.

I stand corrected.


That being said, I still don’t want to mess with consensus for this. It’s not incredibly useful. In practice, just hard-coding header merkle roots in clients and then having clients use those merkle roots… seems to work fine. Headers are only 80 bytes, so even a client that is older and is 100k headers behind is downloading 8 MB… which these days is like the size of 1 image.

I think adding this to consensus won’t have a huge positive impact, in other words, beyond what clients already do.

2 Likes

Yeah that’s OK, I still want to polish the CHIP to lay out the case well, even though I don’t feel like we really need this any time soon. It would be a nice to have, so users could pick something between full headers and trusting a checkpoint.

1 Like

Yes it’s good to have this out there. If we ever go to like 2 minute or shorter block times – then this becomes increasingly relevant/necessary/helpful.

2 Likes