Do you think it would make sense to split the UTXO buckets using sorting based on locking script instead?
I’m not sure how it would be implemented, but the idea is that the UTXO set would be sorted by locking script and then split so that all 128 buckets are of uniform, or nearly uniform, size.
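A minimal sketch of that sort-and-split step, just to make the idea concrete. It assumes "uniform size" means UTXO count rather than byte count, that ordering is plain bytewise comparison of the locking script, and that a UTXO is a hypothetical `(locking_script, serialized_utxo)` pair; none of these are part of the existing spec.

```python
from typing import List, Tuple

# Hypothetical layout for illustration: (locking_script, serialized_utxo).
Utxo = Tuple[bytes, bytes]


def split_by_locking_script(utxos: List[Utxo], bucket_count: int = 128) -> List[List[Utxo]]:
    """Sort UTXOs by locking script and cut them into near-equal buckets."""
    # Bytewise lexicographic order on the locking script (an assumption).
    ordered = sorted(utxos, key=lambda utxo: utxo[0])

    # Spread the remainder over the first buckets so sizes differ by at most 1.
    base, extra = divmod(len(ordered), bucket_count)

    buckets: List[List[Utxo]] = []
    start = 0
    for index in range(bucket_count):
        size = base + (1 if index < extra else 0)
        buckets.append(ordered[start:start + size])
        start += size
    return buckets
```

Note that a purely count-based cut can place UTXOs with the same locking script on both sides of a bucket boundary.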
Here, the snapshot SubBucket Metadata may then look like:
Field Name | Byte Count | Format | Description |
---|---|---|---|
public key | 33 | public key, big endian | The secp256k1 point of the EC multiset for this UTXO snapshot sub-bucket. |
start lockingScriptByteCount | 1-4 | compact variable length integer | The number of bytes in the starting** locking script of the SubBucket (as a compact variable length integer). |
start lockingScript | ? | big endian | The starting** locking script of the SubBucket (“scriptPubKey”, “pk_script”). |
byte count | 8 | integer, little endian | The number of bytes within this UTXO snapshot sub-bucket. The sum of these must equal the byte count of the snapshot bucket. |
** The ending lockingScript of a SubBucket is the starting lockingScript of the next.
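With metadata like this, a wallet could work out which SubBuckets to request for a given locking script by binary searching the start locking scripts. The sketch below assumes the wallet already holds the start scripts as a sorted list of `bytes`; the function name is hypothetical. Because a SubBucket's ending script equals the next one's starting script, a script that sits exactly on a boundary may appear in more than one SubBucket, so the function returns every candidate index.

```python
from bisect import bisect_left, bisect_right
from typing import List


def candidate_sub_buckets(start_scripts: List[bytes], locking_script: bytes) -> List[int]:
    """Return indexes of SubBuckets that may contain `locking_script`.

    `start_scripts` is the sorted list of starting locking scripts,
    one per SubBucket (hypothetical in-memory form of the metadata).
    """
    if not start_scripts:
        return []

    # Last SubBucket whose start is strictly below the script: it may still
    # end with this script if the script sits on a boundary.
    first = max(bisect_left(start_scripts, locking_script) - 1, 0)
    # Last SubBucket whose start is at or below the script.
    last = max(bisect_right(start_scripts, locking_script) - 1, 0)
    return list(range(first, last + 1))
```

For example, with start scripts `00`, `40`, `80`, `c0`, a query for `50` yields only SubBucket 1, while a query for `80` yields SubBuckets 1 and 2.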
Advantages:
- This can foster a new kind of light wallet that can easily query its UTXO set from nodes. The wallet can query extra SubBuckets, or SubBuckets from different nodes, to circumvent nodes that omit UTXOs by not following the ordering. By checking the ordering in the SubBucket and its EC multiset, the wallet can determine whether the node follows the UTXO ordering.
- The wallet may maintain the privacy of the user by downloading extra SubBuckets to obfuscate the intended locking script query. These extra SubBuckets can also be used to check the UTXO ordering of the node.
- Such wallets won’t have to build a history of transactions over the blockchain or query the entire UTXO set. This gets even safer if the EC multiset is committed to the block header.
- Since such a UTXO set is generated once per block, a node does not need to generate additional filters for each wallet subscribing to a locking script (or maybe my understanding of SPV is at fault).
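The ordering check a wallet would run on a downloaded SubBucket could be as simple as the sketch below. It only covers the ordering and boundary part; verifying the EC multiset public key needs a secp256k1 library and is left out. The function name and the plain list-of-scripts input are assumptions for illustration.

```python
from typing import List, Optional


def sub_bucket_is_ordered(
    scripts: List[bytes],
    start_script: bytes,
    next_start_script: Optional[bytes] = None,
) -> bool:
    """Check that a SubBucket's locking scripts respect the claimed ordering.

    `scripts` holds the locking scripts in the order the node sent them.
    `next_start_script` is the start of the following SubBucket, or None
    for the last SubBucket.
    """
    # Scripts must be in non-decreasing bytewise order.
    if any(left > right for left, right in zip(scripts, scripts[1:])):
        return False
    # Nothing may sort before the advertised start of this SubBucket.
    if scripts and scripts[0] < start_script:
        return False
    # Nothing may sort after the start of the next SubBucket.
    if next_start_script is not None and scripts and scripts[-1] > next_start_script:
        return False
    return True
```

A failed check tells the wallet that the node is not following the ordering, so it can discard the response and ask another node.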
Disadvantages:
- Adds the overhead of splitting the UTXO set algorithmically. However, clever schemes for passing UTXOs from one SubBucket to another can leverage the O(n) operation of generating an EC multiset.
- Nodes may not follow the UTXO ordering closely, which would lead to UTXOs being omitted.
- Increases the size of the P2P UTXO Commitments (“utxocmts”) message.
- The UTXO set may be flooded with UTXOs that share the same locking script, making the ordering and SubBucket formation difficult.