CHIP 2021-07 UTXO Fastsync

tom · November 5, 2023, 9:27am

This still seems unclear to people.

UTXOs, as restored by the CHIP, contain enough data to allow a full node to sync and to build on top of the restored UTXO set.
Wallets don’t use UTXOs, wallets use transactions.
A UTXO set is not useful to them. No wallet exists that will take a UTXO as a historical ‘state’.
More to the point, there is no way to prove to that thin wallet that the historical transaction is actually valid.

So for wallets, such a full node, restored from fastsync, is only useful for full transactions received after the sync point. It cannot have any history at all for the addresses it owns from before the sync point.

This is the current state of the tech. As is always the case in software, this can change, because software can be changed.

But before that is the case, I will object to any consensus changes because I consider this feature to not be finished yet.

Jonas · November 5, 2023, 12:40pm

I do agree with what you are saying above. UTXO fast-sync (with or without UTXO commitments) is indeed only useful for full nodes to be able to fully validate the blockchain from the point of synchronization. Any wallet can thus only fetch history from that point forward.

I think it is important to talk about what the benefits of such functionality might be.
One use-case off the top of my head:
Let’s say that there is a CashToken launched called DummyCoin with genesis at block X. DummyCoin gains a lot of popularity with an entire eco-system around it with independently developed DummyCoin mobile wallets, DummyCoin merchant checkout services and so on. Each user/service/organization that wishes to launch a fully validating node exclusively for DummyCoin would be uninterested in whatever happened before block X and wouldn’t need to spend the time and storage of processing decades of redundant data.
For those the ability to quickly launch a node from, or slightly before, block X gains real value since there will be more fully validating nodes that can serve all DummyCoin SPV wallets with history.

This is not mutually exclusive with nodes keeping history going back to genesis.

Jonas · November 5, 2023, 12:59pm

I know this has been a somewhat controversial subject in the past and many people don’t agree with me, but…
I think synchronizing wallet history from genesis will eventually be a premium service. Without UTXO commitments I’m afraid that getting recent history will also come at a (monetary) cost.
There simply aren’t incentives for keeping public nodes open for everyone when the cost rises.

bitcoincashautist · November 5, 2023, 1:43pm

I described one of the benefits during a brainstorm session on Telegram: trustlessly bootstrapping historyless SPV-serving nodes:

1. download block headers, verify them
2. get SPV proof for latest coinbase TX that has utxo commitment in it, verify that
3. get UTXO snapshot, verify that against the commitment
4. ask SPV serving nodes for proof for each UTXO in the set
5. sync up normally from there

With this you not only get a 100% verified UTXO snapshot, you also gather SPV proofs for each of them. With this you can be sure you have all the UTXOs at height N and all the proofs for them.
This pruned node can serve UTXO snapshot to other nodes, too.

Imagine some hypothetical future where blocks are 32 MB, history is 10 TB, and UTXO state is 30 GB. At a 10k-block interval, fast-sync would require a download of 30 GB + 32 MB * 10000 == 350 GB. Not so much compared to a full download of 10 TB. At 4 MB/s, fast-sync would take roughly a day, and full IBD would take roughly a month.
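The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it uses the hypothetical sizes from the post, assumes decimal units (1 GB = 1000 MB), and counts download time only, not validation time.

```python
# Rough check of the fast-sync vs. full-IBD download figures.
# All sizes are the hypothetical ones from the post, in decimal units.
MB = 1
GB = 1000 * MB
TB = 1000 * GB
SECONDS_PER_DAY = 86_400

block_size = 32 * MB       # hypothetical block size
history_size = 10 * TB     # hypothetical full chain history
utxo_size = 30 * GB        # hypothetical UTXO snapshot size
interval_blocks = 10_000   # blocks to replay after the snapshot (worst case)
bandwidth = 4 * MB         # download speed per second

fast_sync_download = utxo_size + block_size * interval_blocks
fast_sync_days = fast_sync_download / bandwidth / SECONDS_PER_DAY
full_ibd_days = history_size / bandwidth / SECONDS_PER_DAY

print(f"fast-sync download: {fast_sync_download / GB:.0f} GB")  # 350 GB
print(f"fast-sync time:     {fast_sync_days:.1f} days")         # 1.0 days
print(f"full IBD time:      {full_ibd_days:.1f} days")          # 28.9 days
```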

Then, if you really need it, you can fill in the spent history at your convenience.

We should separate the fast-sync and commitment CHIPs, or should I say - we should make a standalone commitments CHIP to hash things out since only the fast-sync CHIP now exists.
The commitments are the minimum consensus change needed to enable fast-sync to be trustless (fast-sync is possible now, but you need a trusted commitment, which Verde already implemented, and there’s been some work on BCHN too).
Fast-sync is the tech stack to make use of the commitments and capture the benefits of having consensus-validated commitments: trustless bootstrapping from a snapshot.

@joshmg put it well in recent hangouts on X: “So that’s the difference between fast-syncing and UTXO commitments, UTXO commitments are actually making it a part of the rules that make a block valid, and fast-syncing is the benefit you get from having a UTXO commitment.”

tom · November 5, 2023, 8:35pm

This ignores the point I made.

The fastsync is about UTXOs; wallets (and SPV) are about transactions.

You can’t just replace one with the other and pretend you can get SPV proofs that are meant for transactions, for utxos.

If you think you can, please build it. Prove me wrong.

Until then, I don’t think fastsync in consensus makes much sense. The risk is too high that changes (by hardfork) will be needed to make the normal usecase work.

bitcoincashautist · November 6, 2023, 8:02am

Yes, I was inaccurate, sorry, let’s try again:

Bootstrapping historyless SPV-serving nodes:

1. Download block headers, verify them.
2. Get SPV proof for latest coinbase TX that has the UTXO commitment of interest in it, verify that against the headers from (1.).
3. Get UTXO snapshot, verify that against the commitment in (2.).
4. From the UTXO snapshot, generate a non-duplicate list of TXIDs that have at least one UTXO in them. Ask SPV serving nodes for each transaction and SPV proof for each.
5. Sync up normally from there.
6. The node can now serve historyless SPV wallets and also help bootstrap other such nodes by providing them with UTXO snapshot & SPV proofs for TXs still having UTXOs in them.
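A minimal sketch of what step 4 involves on the requesting node, assuming the snapshot exposes its entries as (txid, output index) pairs and that the SPV proof is a standard Bitcoin-style merkle branch. The function names and the snapshot representation are illustrative assumptions, not taken from the CHIP or any node implementation.

```python
import hashlib
from typing import Iterable, List, Tuple


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, as used for transaction and merkle hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def unique_txids(outpoints: Iterable[Tuple[str, int]]) -> List[str]:
    """Collapse (txid, vout) outpoints into a sorted list of distinct TXIDs."""
    return sorted({txid for txid, _vout in outpoints})


def verify_merkle_branch(txid: bytes, branch: List[bytes], pos: int,
                         merkle_root: bytes) -> bool:
    """Recompute the merkle root from a leaf and its branch.

    All hashes are in internal byte order (the reverse of the hex usually
    displayed). At each level the low bit of `pos` tells whether the current
    node is the right child (1) or the left child (0).
    """
    node = txid
    for sibling in branch:
        node = sha256d(sibling + node) if pos & 1 else sha256d(node + sibling)
        pos >>= 1
    return node == merkle_root


# Toy block with two made-up "transactions" to exercise the check.
tx_a, tx_b = sha256d(b"tx a"), sha256d(b"tx b")
root = sha256d(tx_a + tx_b)
print(verify_merkle_branch(tx_b, [tx_a], 1, root))
```

The merkle root would come from the headers verified in step 1, so a passing check ties each fetched transaction to a block the node already trusts by proof of work.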

ShadowOfHarbringer · November 6, 2023, 9:11am

I agree.

This is in line with Satoshi’s “Nodes will be server farms”.

Of course, the corpos and operators running these farms will not be willing to pay the costs of maintaining all the tree back to genesis for no compensation.

But I also think that it is important that some independent/hobbyist/academic/government-subsidized nodes remain, so it can be verified at any time that somebody is not cheating and supplying clients with fake history for some nefarious/for-profit reason that we cannot imagine today.

bitcoincashautist · November 6, 2023, 9:16am

Here’s some empirical evidence of there always being some copies of history: people seed Torrents, r/DataHoarder exists.

If Bitcoin Cash grows, having an archival node will come with big bragging rights.

ShadowOfHarbringer · November 6, 2023, 9:20am

Yes, also this is not going to happen in the next 20-30 years, because technology is still following Moore’s Law.

Hard drive space and Internet connection speed are rising geometrically. Much faster than adoption of BCH can probably proceed (that is, assuming no Black Swan-type scenario happens that causes billions of people to suddenly want to use Crypto).

Jonas · November 6, 2023, 9:31am

I think the crux of the problem is the details of this step

A client that just synced the UTXO set could in theory connect to every known Electrum server and issue blockchain.transaction.get and blockchain.transaction.get_merkle for each UTXO to achieve it. (No I don’t think this is a good solution)

Or, since the block height of each UTXO is known, set the appropriate bloom filter and request a merkleblock for the specific block from nodes with full history. Rinse and repeat for each UTXO.

Or the node gets the data from a side-channel not specified in this spec.

One key point is that those transactions and proofs are not contained within the snapshot and we are still dependent on someone being around and serving it in some way.
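To make the Electrum option concrete, here is a rough sketch that only builds the request lines and does not touch the network. The two method names are the ones mentioned above; the parameter lists (`tx_hash`, and `tx_hash` plus `height`) and the newline-delimited JSON-RPC framing reflect my understanding of the Electrum protocol and should be checked against the server's protocol documentation. `build_requests` is a hypothetical helper.

```python
import json
from itertools import count
from typing import Iterable, List, Tuple


def build_requests(txs: Iterable[Tuple[str, int]]) -> List[str]:
    """For each (txid_hex, block_height) emit two JSON-RPC request lines:
    one asking for the raw transaction, one for its merkle proof."""
    ids = count(1)  # JSON-RPC request ids, unique per connection
    lines = []
    for txid, height in txs:
        calls = (
            ("blockchain.transaction.get", [txid]),
            ("blockchain.transaction.get_merkle", [txid, height]),
        )
        for method, params in calls:
            request = {"jsonrpc": "2.0", "id": next(ids),
                       "method": method, "params": params}
            lines.append(json.dumps(request) + "\n")
    return lines


# Made-up txid and height, for illustration only.
for line in build_requests([("ab" * 32, 800_000)]):
    print(line, end="")
```

Two requests per transaction still having unspent outputs is what makes this approach heavy: the request count scales with the number of distinct TXIDs in the snapshot.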

To summarize:
For a node to be able to fully validate everything (i.e. being a “full node”) from the point of the UTXO snapshot nothing more is needed. This is the problem solved by this CHIP and a miner validated commitment in a future CHIP could make this fully trust-less.

For a node to be able to serve history (transactions and proofs) from the point of the UTXO snapshot, additional data needs to be fetched and validated. To me this seems like a separate CHIP if we would like a standardized way of doing it.

tom · November 6, 2023, 10:14am

So,
you just turned the small utxo set download into “download all full transactions that still have at least 1 unspent output in them”.

Maybe good to re-do the math (how big the download is) and the fastsync chip based on this new approach.

I suspect it changes a LOT.

When someone implements that, so we can be certain it actually works (not just theoretically), I’d love to see some discussion on how useful or viable it is for fastsync.

bitcoincashautist · November 6, 2023, 12:10pm

That’s only if you want to run your node as a historyless SPV server.

If you just need a node wallet, you don’t need the SPV stuff since you’ll have obtained the UTXOs from the snapshot and you’ll continue to maintain it as part of normal node activities.

tom · November 6, 2023, 12:26pm

You are not wrong, but for that usecase I suggest you just run an SPV wallet instead. It’s massively less download for the same gain.
Also permissionless, trustless and actually works on-chain today.

bitcoincashautist · November 6, 2023, 1:26pm

and fully dependent on altruistic nodes that need to sync all history from genesis, and be ready to support such wallets’ queries.

tom · November 6, 2023, 8:21pm

yes,

so let’s aim to make the fastsync actually capable of doing so. Because if we don’t then the vast majority of people that are indeed going to use SPV wallets can no longer use BitcoinCash as more and more fastsync nodes start popping up.

As I pointed out for some time, there is a logical error in your steps to get from the basic sync as the CHIP now states to actually being able to serve wallets. If you don’t believe me, then prove me wrong by building your suggestion in some proof of concept.

Arguing here makes no sense. This forum software is telling me I’ve commented enough on this thread. And, you know, it’s right. I’m not going to continue nicely trying to save you from the mistakes I see (for 18 months now) and you are not seeing.

Build it, saves us a lot of talk.

sandakersmann · June 10, 2024, 7:40am

BLISS Presentation: Committing to UTXOs with Calin Culianu

Jonas · October 29, 2024, 6:58pm

It recently occurred to me that an SPV wallet doesn’t need the transactions and proofs from the timepoint before the UTXO snapshot. The node can just give the wallet the relevant UTXOs and the wallet is able to spend them by signing with SIGHASH_UTXOS.

This will simplify everything significantly since a node serving thin wallets doesn’t need the additional historical data.

bitcoincashautist · October 29, 2024, 7:06pm

Yeah! In that case, no need for this:

You can safely assume that the UTXOs you were given are legitimate and try to spend them with SIGHASH_UTXOS. If the TX fails, it can only be for one of two reasons:

The UTXO doesn’t exist (TXID:n doesn’t match any UTXO)
The UTXO exists but data is corrupt (TXID:n matches some UTXO, but you got the wrong contents for it)

bitcoincashautist · November 2, 2024, 6:53pm

It has come to my attention that Kaspa has already implemented UTXO commitments in block header and new nodes sync off headers and UTXO snapshots. I dug out a blog post by Yonatan Sompolinsky:

Accordingly, Kaspa nodes prune block data by default, and new nodes by default do not request historical data, rather, they sync in SPV mode, i.e., by downloading and verifying only block headers. I reiterate that this is not a stronger trust assumption than a history-verifying node, rather a different requirement. The node then requests the UTXO set from untrusted peers in the network, and verifies it against the UTXO commitment embedded inside the latest received header (technically, this is done against the latest pruning point). If those do not match, the node bans the sending peers, requests the UTXO set from new untrusted peers, and repeats the process. If those match, the node verifies that no unexpected inflation occurred by comparing the sum of UTXOs to the specified minting schedule, a comparison for which block headers suffice.

Not only that, but they’re not using EC multiset, but something else. I asked them about it and they said Muhash had better performance, and they were in fact using EC multiset and then replaced it by Muhash, as evident in this PR: Replace ECMH with Muhash by elichai · Pull Request #1624 · kaspanet/kaspad · GitHub
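For readers unfamiliar with the idea, here is a toy sketch of a MuHash-style multiset hash: the set is committed to as the product of per-element values modulo a large prime, so the result does not depend on insertion order and an element can be removed by multiplying with its modular inverse. Assumptions: the modulus is the one I understand Bitcoin Core's MuHash3072 to use, SHAKE-256 stands in for the real hash-then-expand step, and `ToyMuHash` is a made-up name. This is not the kaspad or BCHN code and is not compatible with either.

```python
import hashlib

# Assumed modulus: 2^3072 - 1103717 (as in Bitcoin Core's MuHash3072).
P = 2**3072 - 1103717


def element(data: bytes) -> int:
    """Map a serialized UTXO to a nonzero number mod P.
    Simplification: SHAKE-256 expands the input to 3072 bits."""
    value = int.from_bytes(hashlib.shake_256(data).digest(384), "little") % P
    return value or 1  # avoid the (astronomically unlikely) zero element


class ToyMuHash:
    def __init__(self) -> None:
        self.acc = 1  # empty set

    def add(self, data: bytes) -> None:
        self.acc = self.acc * element(data) % P

    def remove(self, data: bytes) -> None:
        # Multiply by the modular inverse (Python 3.8+ three-argument pow).
        self.acc = self.acc * pow(element(data), -1, P) % P

    def digest(self) -> bytes:
        return hashlib.sha256(self.acc.to_bytes(384, "little")).digest()


# Same final set reached by different routes gives the same commitment.
a = ToyMuHash()
for utxo in (b"utxo1", b"utxo2", b"utxo3"):
    a.add(utxo)
a.remove(b"utxo2")

b = ToyMuHash()
b.add(b"utxo3")
b.add(b"utxo1")

print(a.digest() == b.digest())  # True
```

The incremental add/remove property is what lets a node update the commitment block by block instead of rehashing the whole UTXO set.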

cculianu · March 30, 2025, 1:09pm

Note I am aware of MuHash and we will evaluate it too. We already added it as a potential hasher to BCHN: src/crypto/muhash.h · master · Bitcoin Cash Node / Bitcoin Cash Node · GitLab