CHIP 2021-01 PMv3: Version 3 Transaction Format

bitjson · February 23, 2022, 6:33pm

Hi all,

I’ve been doing a lot of research on contract-validated token solutions for Bitcoin Cash, and I’m hoping to get some feedback on an early-draft proposal:

PMv3: A Draft v3 Transaction for Bitcoin Cash:

This specification describes a version 3 transaction format for Bitcoin Cash. This format:

Enables fixed-size inductive proofs (for covenants like CashTokens) by allowing transactions to optionally include a hash for any unlocking bytecode.

Unifies the 4 transaction integer formats around the existing virtual machine integer format (Script Numbers) with a parsing-enabled derivative called Ranged Script Numbers (RSN).

Defines an upgrade path for the subdivision of output values (“fractional satoshis”).

Reduces transaction sizes by cutting wasted bytes: ~12 bytes for small transactions, and ~3 additional bytes per input/output.

Minimizes changes to the existing transaction format.

My goal is to make it possible to develop full token systems within the existing VM system and without requiring miners to track or validate new kinds of data.

For a high-level introduction to Bitcoin Cash contracts, covenants, and this inductive proof strategy, please see: CashTokens: Build Decentralized Applications on Bitcoin Cash.

I also recently wrote Prediction Markets on Bitcoin Cash to highlight a major use case. I named the proposal (PMv3) after this use case to disambiguate from any other v3 transaction proposals.

Beyond enabling miner-validated and SPV-compatible tokens, I’m convinced that a fixed-size inductive proof solution can make BCH a more efficient, alternative platform for many applications which are otherwise being developed for Ethereum. (Maintaining decentralized application state in covenants seems to offer significant, long-term scaling advantages over Ethereum’s global-state strategy.)

I’ve tried to summarize some of the other options I’ve experimented with in the Rationale section, but I’m hoping to expand it based on feedback or changes recommended here.

What do you think? I’d appreciate any questions, comments, or feedback you have. Thanks!

bitjson · February 5, 2021, 8:37pm

New transaction versions are expected to be a lot of work for everyone in the ecosystem, so it’s naturally going to be hard to “cut a release”. Here are some topics I think are important but are not directly addressed in PMv3 because they require research I don’t expect to be ready for a May 2022 deployment.

I would appreciate comments about anything missing below, and especially if I’ve misjudged a topic which should be addressed in a v3 format.

Fractional Satoshis

A strategy for subdividing satoshis will probably first become important when 1 satoshi is too expensive as a transaction fee. This happens when the price of BCH rises to around $100,000 USD, when 1 satoshi equals today’s median fee of $0.001 (2020 purchasing power).

Fortunately, even if this price rise happens before May 2022, fractional satoshi support still isn’t urgent: even for small transactions, fees can remain below $0.01 USD through another 10x price increase to $1M USD. (Note: values are in 2020 USD purchasing power. Even in a hyper-inflationary USD scenario, this timeline is not accelerated.) BCH could easily absorb most of the world’s financial markets while keeping fees below $0.10 in today’s USD.
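
The price thresholds above follow from the fixed ratio of 100,000,000 satoshis per BCH. A minimal sketch of that arithmetic (the function name is illustrative, and it only converts a BCH price into the value of one satoshi; how many satoshis a transaction pays is not modeled):

```python
from fractions import Fraction

SATOSHIS_PER_BCH = 100_000_000


def satoshi_value_usd(bch_price_usd: int) -> Fraction:
    """USD value of a single satoshi at the given BCH price."""
    return Fraction(bch_price_usd, SATOSHIS_PER_BCH)


# The two price points discussed above, in USD per BCH.
for price in (100_000, 1_000_000):
    print(f"BCH at ${price:,}: 1 satoshi = ${float(satoshi_value_usd(price))}")
```

Exact fractions are used so the comparison against round dollar amounts is not affected by floating-point rounding.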

A move to fractional satoshis is one-way, and would likely benefit from more research. With all of this in mind, I think it’s safe to wait for a future transaction format before locking-in a fractional satoshi solution. I think this can safely wait until 2024 or later.

Metadata, Tags, and Other Structural Changes

Proposals have been made to fundamentally alter the structure of transactions, changing how data is represented or allowing for other, new metadata fields. PMv3 avoids large structural changes, making the minimum-possible, incremental modifications. Most transaction infrastructure, libraries, and APIs should be easily adapted to the PMv3 format.

Larger structural changes may enable new use cases, but it seems the research required will not be ready for a 2022 release.

“Data Carrier”/`OP_RETURN` Field(s)

PMv3 makes OP_RETURN outputs equally efficient vs. a new, “data carrier” field. This should make it easier to standardize support for transactions with multiple OP_RETURN outputs without requiring further changes to the transaction format.

Further research can focus on fee-estimation/fee-setting for transactions carrying multiple data outputs (ensuring concerns about blockchain growth are addressed). PMv3 unblocks this research, since a special “data carrier field” in the transaction format is no longer needed.

32-byte P2SH

Pay-2-Script-Hash (P2SH) on BCH currently uses 20-byte hashes. This is insufficient to protect against 80-bit collision attacks in low-trust multisig contracts. Contracts and protocols can be designed to work around this potential vulnerability, but future transaction versions should attempt to solve it by adding support for paying to longer script hashes.

There are many possible solutions to this issue, and I think research will not be ready prior to May 2022.

MAST, Taproot, etc.

Merklized abstract syntax trees (MAST) and related ideas could significantly reduce transaction sizes and improve on-chain privacy. (A solution in this area will probably also serve as a solution to the “32-byte P2SH” issue above.) This is a huge area for research, and I think a solution should wait until after May 2022. See also: Taproot on BCH by @markblundeberg.

I’ve tried to cover my thought process on what “makes the cut” for PMv3. Have I missed anything that should be addressed by May 2022? Thanks!

im_uname · January 20, 2021, 9:36pm

Wouldn’t the basics of 32-byte P2SH be a relatively low-hanging fruit? Simply use a well-established hashing algorithm that is not SHA256 and outputs 256-bit hashes (say, SHA3), and we’re good to go without changing anything else.

bitjson · January 20, 2021, 9:56pm

Wouldn’t the basics of 32-byte P2SH be a relatively low-hanging fruit? Simply use a well-established hashing algorithm that is not SHA256 and outputs 256-bit hashes (say, SHA3), and we’re good to go without changing anything else.

Yes, definitely! We don’t even need a new hashing algorithm, OP_HASH256 already exists and would work perfectly. Probably the simplest option would be to use the same strategy as BIP16: Pay to Script Hash. Right now any pattern following OP_HASH160 [20-byte-hash-value] OP_EQUAL uses the P2SH infrastructure. We could do the same with OP_HASH256 [32-byte-hash-value] OP_EQUAL.
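
To make the "pattern" idea concrete, here is a rough sketch of how a node might recognize the two locking bytecode shapes. The opcode values are the standard ones; the function names and the 32-byte variant are hypothetical and are not specified by PMv3 or any existing proposal:

```python
OP_HASH160 = 0xA9
OP_HASH256 = 0xAA
OP_EQUAL = 0x87


def is_p2sh20(locking_bytecode: bytes) -> bool:
    """Match OP_HASH160 <20-byte push> OP_EQUAL (the existing BIP16 pattern)."""
    return (
        len(locking_bytecode) == 23
        and locking_bytecode[0] == OP_HASH160
        and locking_bytecode[1] == 20  # opcode that pushes the next 20 bytes
        and locking_bytecode[22] == OP_EQUAL
    )


def is_p2sh32(locking_bytecode: bytes) -> bool:
    """Hypothetical: match OP_HASH256 <32-byte push> OP_EQUAL."""
    return (
        len(locking_bytecode) == 35
        and locking_bytecode[0] == OP_HASH256
        and locking_bytecode[1] == 32  # opcode that pushes the next 32 bytes
        and locking_bytecode[34] == OP_EQUAL
    )
```

Each additional pattern is one more check like this that has to run against every output being spent.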

The details get a little messy though – this adds another “pattern hack” which the VM must test for in each UTXO (not too bad, but some will consider it technical debt). Maybe worse, transactions will likely be replayable on any chains that have forked from BCH since the BCH-BTC split. So BCH holders would need to avoid accidentally sending funds from other forks to these new 32-byte P2SH addresses, as those funds could be freely claimed by anyone who knows the hash preimage.

These are definitely solvable issues, but they add another dimension of complexity to a v3 transaction proposal. And once we make the change, we’re stuck with our solution. (And again, it’s a pretty rare kind of attack. Complex multisig wallets which are sophisticated enough to be vulnerable also have a lot of options for preventing it.)

We’re almost guaranteed to get a solution “for free” as part of any MAST, Taproot, or similar upgrade (which lets outputs pay to a tree of hashes). As far as I know, everyone who has worked on the question has decided it’s best to wait for that kind of solution. (But maybe @markblundeberg knows more?)

cculianu · January 25, 2021, 12:58pm

The RSN format you proposed is great as an idea in that it would save a ton of space. I’m all in favor of something like it. However the details of the implementation as you specified need to be discussed and possibly modified.

0x82 as -2. Or that 0x80 is negative 0… is bizarre for most programmers. The binary tx format should try and be as “normal” as possible. I don’t think striving for CScriptNum compatibility here is a feature, but rather it is a bug.

We should strive to be 2’s complement and not the weird bitcoin CScriptNum business for the encapsulation format (a tx is just an encapsulation format).

IF your encoding format for these integers is signed, THEN it should use 2’s complement integers.

IF your encoding format only wants to encode unsigned integers, then the first control byte should be something like 0xfa or something along those lines for subsequent lengths 2-7.

Please clarify whether RSN plans on possibly encoding NEGATIVE numbers or not. If it does not, then definitely it should be using unsigned.

If you will need NEGATIVE numbers, then it should be using 2’s complement signed integers.

cculianu · January 25, 2021, 12:57pm

The more I think about it the more I think you want RSN to not allow for negative numbers normally – and if they do they should just be 2’s complement. Consider the common case of an RSN being used to encode a byte length.

Say it’s encoding the number 250. In your proposal such a number would take more than 1 byte to encode. Any number >127 would be more than 1 byte. Very often for most of the tx format, all numbers are unsigned. I believe the binary format of the tx should optimize for this. This way you can encode byte lengths up to 248 or whatever as 1 byte, or other numbers up to 248 as 1 byte, etc.

Using CScriptNum as the encoding we throw away half the useful range of our numbers. CScriptNum is fundamentally flawed and too hard to use – we don’t want this implementation detail exposed to a top-level binary format. On top of being too complicated – It also wastes space!

In the (rare) case where you want to encode a negative number, just take the 2’s complement representation of the bitpattern as the value. If done correctly, even values like -15 (0xf1) end up as 1 byte anyway.
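
For illustration, the two single-byte representations being contrasted can be sketched like this. The helper names are made up for this example, and the second function only models the one-byte case of the Script Number style (magnitude in the low 7 bits, sign in the top bit):

```python
def twos_complement_byte(value: int) -> bytes:
    """Encode a value in the range -128..127 as one two's complement byte."""
    return value.to_bytes(1, "little", signed=True)


def sign_magnitude_byte(value: int) -> bytes:
    """Encode a value in the range -127..127 with the sign in the top bit."""
    magnitude = abs(value)
    if magnitude > 0x7F:
        raise ValueError("value does not fit in one sign-magnitude byte")
    return bytes([magnitude | (0x80 if value < 0 else 0x00)])


print(twos_complement_byte(-15).hex())  # two's complement form of -15
print(sign_magnitude_byte(-15).hex())   # sign-magnitude form of -15
print(sign_magnitude_byte(-2).hex())    # the "0x82 as -2" case mentioned earlier
```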

Also I should note that negative numbers in the tx format are rare – and when they do occur the reader knows to expect them and can cast the 2’s complement bit pattern they read to negative when interpreting it.

In short: We SHOULD NOT use CScriptNum for the tx format lengths/numbers. We SHOULD however make everything be a minimally encoded integer in the spirit of RSN (without the CScriptNum compatibility). We want to minimize space – and we want to make it as obvious as possible to programmers. CScriptNum is confusing and not obvious and should not be exposed on the tx binary format layer.

Script authors are free to wrestle with the CScriptNum format if they are authoring scripts or whatever – that’s up to them – it’s on a deeper layer.

The basic encapsulation layer of the tx format though should not deal with this bizarro way of encoding data.

tom · January 25, 2021, 5:31pm

First of all, you haven’t made the actual link between the need of a ‘witness’[1] and the proofs. How are inductive proofs going to be used and accessed from Script?

The next part also left me a bit puzzled :

Hashed witnesses are a minimal strategy for allowing descendant transactions to include their parent transactions as unlocking data without being forced to include their full grandparent transactions (and so on). If the parent transaction chose to provide a hash for any of its unlocking scripts, child transactions can reduce the byte size of their own unlocking data by referencing only the hash.

Can you explain the parent/child relationship here? It reads like the VM would need to access not just the locking script, but maybe also the locking-script of a parent-transaction (of a parent-tx etc). I’m sure I read that wrong…

RSN

The RSN format looks a little familiar, but at the same time it’s weird. It looks homegrown. Some observations:

You argue for a fork of the VM numbers based on its familiarity. Unfortunately there are many ecosystem tools/libs that never touch the VM or anything more than a p2pkh script, but which do actually parse the transaction and they would encounter these RSNs.
I disagree with your argument which is used as support for this being the best format.
Better integration of transaction formats and script may be addressed more easily using native introspection. See Can we get native transaction introspection on Bitcoin Cash?. Would you agree that if NativeIntrospection is implemented that the majority of the cases for scripts to understand VarInts disappears?
I’m biased towards reusing existing standards: Variable-width encoding - Wikipedia
You may like to check out CMF (which uses the encoding already used in Bitcoin) here. It has the restarting-ability and is closer to the standards linked in Wikipedia above.

multisig

Pay-2-Script-Hash (P2SH) on BCH currently uses 20-byte hashes. This is insufficient to protect against 80-bit collision attacks in low-trust multisig contracts

We have a working multisig for such situations in Schnorr setups, using normal P2PKH.
Next we have the opportunity to do multisig without p2sh, if people really want to use the operator for multisig for this.

I’m bringing this up because I’m a strong believer in small incremental changes. I really don’t like a big set of changes that shock the system. Remember, in the ecosystem of Bitcoin Cash the full nodes are only a tiny little subset. Most of the work happens elsewhere, a lot of it closed-source.

Small incremental changes.

footnotes

[1] ‘witness’ is a weird name that has been implied to be something smart people know about. I prefer naming that enriches understanding for us all.

bitjson · January 25, 2021, 7:15pm

Hey @cculianu – thank you for reviewing!

I think this cuts to the heart of the issue:

CScriptNum is fundamentally flawed and too hard to use – we don’t want this implementation detail exposed to a top-level binary format. […]

The basic encapsulation layer of the tx format though should not deal with this bizarro way of encoding data.

Unfortunately, the VM is already deeply-dependent on the precise format of CScriptNum. E.g. <127> <1> OP_ADD OP_SHA256 must be the SHA256 hash of 0x8000, the CScriptNum encoding of 128.
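
As a minimal sketch of why the exact encoding matters here, the following encodes integers the way Script Numbers are minimally encoded (little-endian magnitude, sign carried in the top bit of the last byte) and hashes the result. The function name is illustrative and this is not taken from any node implementation:

```python
import hashlib


def encode_script_num(value: int) -> bytes:
    """Minimally encode an integer as a Script Number (CScriptNum)."""
    if value == 0:
        return b""  # zero is the empty byte string
    magnitude = abs(value)
    out = bytearray()
    while magnitude:
        out.append(magnitude & 0xFF)  # little-endian magnitude bytes
        magnitude >>= 8
    if out[-1] & 0x80:
        # The top bit is already used by the magnitude, so an extra
        # byte is appended to carry the sign.
        out.append(0x80 if value < 0 else 0x00)
    elif value < 0:
        out[-1] |= 0x80  # set the sign bit on the last byte
    return bytes(out)


result = encode_script_num(127 + 1)
print(result.hex())                        # the bytes OP_ADD leaves on the stack
print(hashlib.sha256(result).hexdigest())  # what OP_SHA256 would then produce
```

Any contract that commits to such a hash depends on every implementation producing the same bytes for 128.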

Choosing a more common number encoding would likely require an entirely new VM version, breaking compatibility with all past UTXOs. Whether we like it or not, we’re stuck with CScriptNum in the transaction format unless we plan to swap out a majority of the existing VM and create a migration path. In comparison, adding a v3 transaction format seems very tame (and requires no migration).

If spreading the use of CScriptNum encoding is distasteful enough that we want to avoid it at all costs, it is also possible to handle this issue by implementing OP_NUM2VARINT and OP_VARINT2NUM operations:

One alternative solution is to implement OP_NUM2VARINT and OP_VARINT2NUM operations. This would allow for slightly better compression of VarInts between 128 and 252 (2 bytes), but in practice relatively few transactions use those numbers for input count, output count, or bytecode lengths. This inefficiency is also dwarfed by the space savings of compressing Version (3 bytes), Outpoint Index (approx. 3 bytes per input), and output Value (2-7 bytes per output).

Note, many other P2P protocol messages use other integer formats, and these remain unchanged. This includes the P2P protocol tx message header, which should continue to use the standard 24-byte header format.

I should note, just a few months ago I would have considered it revolting to spread CScriptNum to any other part of the protocol.

But practically, I have no great arguments for avoiding its use in the rest of the transaction message (remember, it’s already used by all unlocking and locking bytecode). The VarInt format is only slightly superior for the relatively rare values between 128 and 252, and unifying the 4 integer encodings used in transactions could pay much larger byte-saving dividends by simplifying contracts.

More directly: CScriptNum is very unusual, but I don’t think that makes it entirely bad. To its credit, CScriptNum is fundamentally variable-length, making it both efficient for small values and flexible to very large values. The possibility of “negative zero” seems wrong, but the end of a little-endian encoded number is a reasonable place for encoding the number’s sign, and it leaves open several bit-manipulation tricks and minimal-encoding validation strategies for contracts.

Overall, if the Bitcoin Cash VM had to be stuck with any number encoding, I’m not disappointed that Satoshi settled on CScriptNum.

The more I think about it the more I think you want RSN to not allow for negative numbers normally

Yes, I definitely agree. I don’t see any use cases for negative numbers in the transaction format. The only value for using a number system which could otherwise support negative numbers (like CScriptNum) is to offer compatibility with the number system already used in VM bytecode. It seems reasonable to simply declare RSN to be “unsigned”, leaving the negative range of first bytes (0x80 through 0xff) open for other uses (and 0x82 to 0x87 already reserved for indicating byte length).

Using CScriptNum as the encoding we throw away half the useful range of our numbers.

To clarify, because CScriptNum is fundamentally variable-length, we only lose part of the range at each byte length. In practice, this difference is extremely small and probably only matters below 255. As mentioned above, I think the real-world savings of using CScriptNum would easily exceed the benefit of adding VarInt support to the VM, but if that seems unlikely to anyone, I could formalize the napkin-math with a simulation using on-chain data.
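
A rough version of that napkin math, comparing byte lengths only. It assumes plain minimally encoded Script Numbers and the existing VarInt (CompactSize) thresholds; the RSN prefix bytes from the draft are not modeled, so treat the result as indicative rather than exact:

```python
def script_num_length(value: int) -> int:
    """Bytes needed to minimally encode a non-negative Script Number."""
    if value == 0:
        return 0
    # One bit beyond the magnitude is needed for the sign.
    return (value.bit_length() + 1 + 7) // 8


def varint_length(value: int) -> int:
    """Bytes needed for the existing VarInt (CompactSize) encoding."""
    if value < 0xFD:
        return 1
    if value <= 0xFFFF:
        return 3
    if value <= 0xFFFFFFFF:
        return 5
    return 9


# Values below 70,000 for which VarInt is strictly shorter.
varint_wins = [v for v in range(70_000) if varint_length(v) < script_num_length(v)]
print(varint_wins[0], varint_wins[-1], len(varint_wins))
```

Under these assumptions the only values where VarInt is shorter are 128 through 252, which matches the range mentioned above.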

Regardless, the availability of a “known-invalid” range in RSN is likely worth it for other reasons alone – it gives us a huge amount of flexibility for future “fractional satoshi” encodings and is even valuable for bitcoin signed messages or Proof-of-Reserve protocols.

Again, thank you for these comments!

Using CScriptNum for other transaction integers is probably the most unexpected choice made in this proposal, so it’s something I expect we’ll want to review carefully.

cculianu · January 25, 2021, 7:38pm

No no no. I think we’re talking about two different things here and I think we can likely agree and find common ground.

I am not proposing tossing out CScriptNum from the script data – or from the bitcoin VM. The bitcoind VM can do whatever it wants. It’s a VM. The script data is just a payload. The encapsulation format (the tx format) doesn’t care what’s in the payload… does it?

As far as the bitcoin VM is concerned it can continue to read and write CScriptNum in the format it expects.

But there is no reason to e.g. encode the number of VINs in a tx in that format, or to encode the “value” of an output in that format. Is there?

Or will bitcoin scripts “see” the binary data of the enclosing tx envelope?

These are two different layers we are talking about, conceptually. One is the binary format of the enclosing tx, the other is the data of the payload of bitcoin script. Unless there is crosstalk between the 2 layers – there is no need to use the CScriptNum on the enclosing layer. Is there crosstalk? Will / do bitcoin scripts “write” values out to a tx?!? Or do they “read” back the values from the tx they are enclosed within?

And if they actually do that – you can always “lie” to the scripts and present such data items AS CScriptNum in the VM.

Or do you envision the scripts doing some fancy crypto hashing/sig checks on random bits of data in the outside/enclosing tx “envelope”?

Choosing a more common number encoding would likely require an entirely new VM version,

Why? Again… two different layers. 1 is the enclosing tx “envelope”, the other is the script data. I am not proposing changing the script data or the VM. Just the “envelope” should not use CScriptNum. The data inside can use it, so as to not break things, as you say.

Using CScriptNum for other transaction integers is probably the most unexpected choice made in this proposal, so it’s something I expect we’ll want to review carefully.

I still don’t think it’s necessary. The serialization/deserialization format should not use it. If you want, you can present a “view” to the bitcoin VM “as if” values it cares about are using this format.

Unless for some reason your scripts will author tx’s themselves from within bitcoin script… and hash the binary data and do other weirdness – I fail to see how the bitcoin VM’s “view” of binary numbers impacts the enclosing TX’s binary format on the wire/on disk…

tom · January 27, 2021, 9:11pm

Today the VM numbers and the transaction-level numbers are not the same. Making them still not the same, but in a different way, in the future doesn’t affect the VM at all. They are still completely separate systems.

Why would you want to make those encodings the same in the first place?

bitjson · January 27, 2021, 10:25pm

@tom and I talked a bit more offline, but I also wanted to answer here for anyone else reading.

I just finished writing a full overview of Hashed Witnesses here:

Hashed Witnesses: Build Decentralized Applications on Bitcoin Cash

That post tries to describe the problem and this solution in detail, and also how CashTokens enable a lot of other applications: contract-based & miner-validated tokens, tradable synthetic assets, decentralized exchanges, and prediction markets.

For those reading this thread later, these answers will be much more intuitive after reading the post above.

Can you explain the parent/child relationship here? It reads like the VM would need to access not just the locking script, but maybe also the locking-script of a parent-transaction (of a parent-tx etc).

Yes, that’s correct, and in some cases, the grandparent locking script.

The contract first confirms that it’s been given its true parent TX (by comparing the hash with its own outpoint transaction hash), then it inspects the parent to confirm the parent’s 0th output is the expected token covenant. (It also does the same validation for the grandparent transaction.) Here’s the explanation from the Token Transfer (...tx3) script in the CashTokens template:

/**
 * This script is the key to the inductive proof. It's 
 * designed to be unspendable unless:
 * 1. Its parent was the mint transaction funded by the transaction
 *    hash claimed (requires full parent transaction), or
 * 2. It was previously spent successfully – the parent's 0th input's
 *    UTXO used this covenant, i.e. the parent's 0th-input-parent's
 *    0th output uses this covenant. (Validated by checking the full
 *    parent and grandparent transactions.)
 * 
 * With these limitations, if a user can move the CashToken,
 * we know the CashToken has the lineage it claims.
 */

Would you agree that if NativeIntrospection is implemented that the majority of the cases for scripts to understand VarInts disappears?

Yes, most VarInt parsing use cases are better solved by native introspection. But for any cases where the introspection is not on the current transaction, introspection opcodes probably won’t help (e.g. in CashTokens, we’re actually inspecting the parent and grandparent token TXs for validity).

I wrote here about Serialized Transaction Introspection, but I think it’s probably wise to first wait and see what other kinds of applications use ancestor transaction introspection, and only if it becomes common enough, we might eventually want to accelerate it with specialized opcodes.

‘witness’ is a weird name

I agree, sorry… I settled for “Hashed Witnesses” initially because it’s best compared against Segregated Witness. SegWit fully excludes the unlocking bytecode from the transaction, creating a new section of the block for it (and some other changes). “Hashed Witnesses” simply hashes the unlocking bytecode, compressing it into a fixed size within the transaction.

I’m definitely open to better names though. Maybe I should have just gone with Hashed Unlocking Bytecode.

bitjson · January 27, 2021, 10:44pm

Thank you for taking the time to review this! And sorry I hadn’t finished writing that Hashed Witnesses overview before today! Hopefully that helps to address the rationale in wanting all transaction integer formats to be consistent.

We could definitely implement OP_NUM2VARINT and OP_VARINT2NUM operations instead, but I think internal consistency between the VM and transaction format will pay serious dividends in the long run. And saving ~5% in transaction size and paving the way for fractional satoshis are both nice bonuses.

tom · January 28, 2021, 7:21am

May I suggest a ‘cost / complexity’ section to your proposal? This is a biggy. Pruning logic would change very considerably, validation would become much more expensive when counted in IO-reads (up to 3x, assuming it never goes beyond the grand-parent).

Questions that are still open:

Can you put the witness inside the txid-calculation? Why not?

How does Bitcoin-Script get access to the parent transaction-data?

It’s not segregated, it’s not a witness, I’m still unsure what it is, I’ll read more and maybe when I finally understand the thing you built I can come up with some more descriptive names. For now I’d suggest ‘hashed-output-script’.

bitjson · January 28, 2021, 6:06pm

Thanks again for all the thoughtful questions, these have been really helpful in clarifying things for an eventual FAQ.

Cost/Complexity

May I suggest a ‘cost / complexity’ section to your proposal? This is a biggy. Pruning logic would change very considerably, validation would become much more expensive when counted in IO-reads (up to 3x, assuming it never goes beyond the grand-parent).

That’s a good idea.

To clarify though, I don’t think pruning logic should change at all. Nodes aren’t providing any new data to the VM – only the same UTXO information for each transaction input. (Nodes are not providing access to raw parent or grandparent transaction data, that data is actually being pushed in the unlocking bytecode with the signatures.) So there is up to 3x as much data in these transactions vs. normal transactions (~600 bytes rather than ~200 bytes), but I don’t think that has a meaningful effect on node IO.

Introspecting Parent Transactions

Can you put the witness inside the txid-calculation? Why not?
How does Bitcoin-Script get access to the parent transaction-data?

Not without having transactions grow each time the token is moved.

The root issue is: the child transaction needs to first verify that it’s been given the real contents of its parent. So it hashes the data it’s given to get a txid, then it compares that txid to the 0th outpoint transaction hash (which it knows is the correct parent txid). If they are the same, it can safely inspect the parent transaction for validity.
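
Outside the VM, that check amounts to something like the sketch below. The function names are illustrative; inside a contract the same comparison would be done with OP_HASH256 and OP_EQUAL on data pushed in the unlocking bytecode:

```python
import hashlib


def hash256(data: bytes) -> bytes:
    """Double SHA-256, the hash used to derive a transaction's txid."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def is_true_parent(claimed_parent_tx: bytes, outpoint_tx_hash: bytes) -> bool:
    """Check that the serialized transaction we were handed really is the parent.

    `outpoint_tx_hash` is the 32-byte hash as it appears in the spending
    input's outpoint (the byte order used in serialization, which is the
    reverse of how txids are usually displayed).
    """
    return hash256(claimed_parent_tx) == outpoint_tx_hash
```

Only after this comparison succeeds is it safe to parse the claimed parent and inspect its outputs.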

So really, inductive proofs are already easy to do in the current VM, there’s just this tiny implementation detail that breaks everything: we have to hash the whole parent transaction (including its proofs) to get the same hash which is already present in the outpoint transaction hash. Without being able to authenticate that the data we’re given is “the real parent transaction data”, we can’t validate it.

So to find other ways to accomplish this (other than allowing the unlocking bytecode/witness to be hashed inside the txid serialization), what we’re really looking for is ways to introspect elements of the parent transaction. But right now, nodes only have two relevant pieces of data available:

the UTXO (the locking bytecode and the value in satoshis)
the txid

So unless we provide data other than the txid as a way of introspecting the parent transaction, we need to modify the txid so it can be used in proofs which don’t grow from transaction to transaction.

Aside, this suggests a potential future optimization: if token transactions begin taking up more than half of network throughput, it might be worth providing some new VM state which allows for introspection of the parent transaction without having to provide the whole transaction. E.g. some sort of merkle proof which allows access to contents of parent and grandparent transactions. (Though we probably should not worry about it until these tokens have serious traction on the network and we’ve validated that sort of optimization would be worth it.)

Naming

For now I’d suggest ‘hashed-output-script’.

It’s the input script that’s being hashed, so we could maybe go with “hashed-input-script”.

I’m personally not a fan of “script” because it’s really vague: scripts are typically human readable, and when you compile a script, it’s technically “bytecode” inside the actual transaction. (E.g. a JS library method asks for a “script”: do you give it the ASM string? UTF8 encoded? Pre-compiled to bytecode? etc.)

“Input” can also be vague because you could be talking about the bytecode in the input or the bytecode “provided by” the input (the input’s locking bytecode). If you try walking someone through a complex contract, “input” and “output” quickly become meaningless words. So I also try to be maximally descriptive by referring to them specifically as “locking” or “unlocking”.

I think the most “correct” name would be Unlocking Bytecode Hash, which may optionally be provided in place of the Unlocking Bytecode, then the Hashed Unlocking Bytecode is appended at the end of the TX. (Quite the mouthful, but maybe better than throwing the word “witness” into the mix.)

I also think we should call “Script” (the language) “BCH VMB” for BCH Virtual Machine Bytecode, but I’ll stick with “Script” until a critical mass of developers are tired of that ambiguity.

bitjson · February 3, 2021, 12:37am

The following began as a response to @andrewstone’s review on Reddit, but it probably belongs here (and it’s also too long for me to post there).

Comparisons with the Group Tokenization Proposal (Previously OP_GROUP)

Hey @andrewstone, thanks for reviewing! Sorry I missed your review earlier.

Aside: is there some other public forum where OP_GROUP has been discussed in technical detail before? (I see a few reddit threads, but they’re pretty thin on substance.) Also is this Google Doc the primary spec for Group Tokenization right now?

Before getting into details, I just wanted to clarify: I believe hashed witnesses are the smallest-possible-tweak to enable tokens using only VM bytecode. My goal is to have something we can confidently deploy in 2022. I don’t intend for this to stop research and development on other large-scale token upgrades.

If anything, I’d like to let developers start building contract-based tokens, then after we’ve seen some usage, we’ll find ways to optimize the most important use cases using upgrades like Group Tokenization.

Some responses and questions:

Group Tokenization increases a transaction’s size by 32 bytes per output. So pretty much the same in terms of size scalability. (And the latest revision includes a separate token quantity field, rather than using the BCH amount. This is a design decision that I assume CashTokens will also need).

Hashed witnesses increase the size by 32 bytes per input rather than per output, and token transactions will typically only need one. Do you have any test vectors or more implementation-focused specs where I can dig into the expected size of token mint, redeem, and transfer transactions with Group Tokenization?

For CashTokens, a 32-level merkle tree proof (>4 billion token holders) requires 192 bytes of redeem bytecode and 1024 bytes of proof. That overhead would apply only to mint and redeem transactions. For transfers, the inductive proof requires only the serialization of the parent and grandparent transaction, so at ~330 bytes each, the full unlocking bytecode is probably ~1,000 bytes. (If you’re curious, you can review the code in the CashTokens Demo.)
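
The proof-size figure follows from one 32-byte sibling hash per tree level. A small sketch of that arithmetic, along with a generic merkle path check: the use of double SHA-256 and the left/right ordering rule here are assumptions for illustration and may not match what the CashTokens Demo does:

```python
import hashlib

HASH_SIZE = 32  # bytes per sibling hash in the proof


def hash256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def proof_size_bytes(levels: int) -> int:
    """One sibling hash is supplied for each level of the tree."""
    return levels * HASH_SIZE


def max_leaves(levels: int) -> int:
    """A binary tree with this many levels below the root has 2**levels leaves."""
    return 2 ** levels


def verify_merkle_proof(leaf: bytes, index: int, siblings: list, root: bytes) -> bool:
    """Walk from a leaf hash up to the root using the supplied sibling hashes."""
    node = leaf
    for sibling in siblings:
        if index & 1:  # current node is a right child
            node = hash256(sibling + node)
        else:          # current node is a left child
            node = hash256(node + sibling)
        index >>= 1
    return node == root


print(proof_size_bytes(32), max_leaves(32))
```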

On quantity/amount: I didn’t propose a new quantity field since it’s pretty easy to use satoshis to represent different token quantities for CashTokens which need to support it. And it’s easy for covenants to prevent users from incorrectly modifying the value.

But on the CPU use side there’s a need to verify the inductive proof, so CashTokens would use more processing power.

Verifying the proof costs less than verifying a signature – two OP_HASH256s and some OP_EQUALs. There’s no fancy math: we just check the hash of the parent transactions, then check that they had the expected locking bytecode. (Check out the demo, it’s quite easy to review.)
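As a rough illustration of how little work that is, here is a Python sketch of the check applied to one ancestor. It is not the covenant bytecode from the demo: it assumes the unlocking data supplies each ancestor transaction in three pieces (everything before the locking bytecode, the locking bytecode itself, everything after), and the function name and parameters are hypothetical. How the expected hash and bytecode are obtained from the spending transaction is left out.

```python
import hashlib


def hash256(data: bytes) -> bytes:
    """Double SHA-256, the operation performed by OP_HASH256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def check_ancestor(
    prefix: bytes,
    locking_bytecode: bytes,
    suffix: bytes,
    expected_hash: bytes,
    expected_locking_bytecode: bytes,
) -> bool:
    """Hypothetical check for one ancestor transaction.

    1. The supplied pieces must hash to the transaction hash being spent from.
    2. The middle piece must be the locking bytecode the covenant expects.
    """
    serialized = prefix + locking_bytecode + suffix
    return (
        hash256(serialized) == expected_hash
        and locking_bytecode == expected_locking_bytecode
    )


# Toy data: these byte strings are placeholders, not a real transaction encoding.
covenant = b"\xaa" * 23
parent_prefix, parent_suffix = b"\x01" * 50, b"\x02" * 10
parent_hash = hash256(parent_prefix + covenant + parent_suffix)

parent_ok = check_ancestor(parent_prefix, covenant, parent_suffix, parent_hash, covenant)
```

Running the same function once for the parent and once for the grandparent accounts for the two hash operations mentioned above; everything else is byte comparison.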

Both require a hard fork. Since CashTokens is creating a new transaction format it is likely a larger change.

This seems… unlikely

The Group Tokenization spec is ~24 pages (~43 pages with appendices and examples), and that doesn’t include a lot of implementation detail. The PMv3 proposal is just a few pages, and includes a full binary format breakdown/comparison and test vectors.

Also, doesn’t the latest Group Tokenization spec also include a new transaction format? “We define a new transaction format that is used for transactions with groups (or with BCH).”

The inductive proof logic is also extremely critical code, even though it seems to not be part of miner consensus. Any bug in it (in any client) would allow transactions to be committed to the blockchain that appear valid on that client (but no others). In the context of SLP, I call this problem a “silent fork” (clients have forked their UTXO set – that is, their tracking of who owns what coins – but the blockchain has not forked). In many ways a silent fork is worse than an accidental blockchain fork because the problem can remain latent and exploited for an indeterminate amount of time. In contrast, a bug that causes an accidental blockchain fork is immediately obvious.

The inductive proof actually happens entirely in VM bytecode (“Script”) – if different implementations have bugs which cause the VM to work differently between full nodes, that is already an emergency, unrelated to CashTokens.

If wallets have bugs which cause them to parse/sign transactions incorrectly, their transactions will be rejected, since they won’t satisfy the covenant(s). (Or they’ll lose access to the tokens, just as buggy wallets currently lose access to money sent to wrong addresses).

On the other hand, the “silent fork” you described is possible with Group Tokenization. Group Tokenization is a much larger protocol change, where lots of different clients will need to faithfully implement a variety of specific rules around group creation, group flags, group operations, group “ancillary information”, “group authority UTXOs”, authority “capabilities”, “subgroup” delegation, the new group transaction format, “script templates”, and “script template encumbered groups”. It’s plausible that someone will get something wrong.

Again, my intention is not to have the Group Tokenization proposal abandoned – a large-scale “rethink” might be exactly what we need. I just want a smaller, incremental change which lets developers get started on tokens without risking much technical debt (if, e.g. we made some bad choices in rushing out a version of Group Tokenization).

One scalability difference is that in the Group Tokenization proposal, the extra 32 bytes are the group identifier. So they repeat in every group transaction, and in general there are a lot more transactions than groups. They will therefore be highly compressible on disk or over the network, just by substituting small numbers for the currently most popular groups (e.g. LZW compression).

However, each inductive proof hash in CashTokens is unique and pseudo-random, and therefore not compressible.

Yes, and moreover, the largest part of CashToken transactions will be their actual unlocking bytecode, since the network doesn’t keep track of any state for them. That’s why they are so low-risk: Hashed Witnesses/CashTokens specifically avoid making validation more stateful.
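The compressibility point quoted above is easy to demonstrate. The sketch below uses Python's `zlib` (DEFLATE, not LZW, so it is a stand-in for whatever compression a node would actually use), and the group identifier and hashes are made-up values.

```python
import hashlib
import zlib

OUTPUTS = 1000  # number of 32-byte fields in each sample

# One group identifier repeated in every output (made-up value).
group_id = hashlib.sha256(b"example group").digest()
repeated_ids = group_id * OUTPUTS

# A different pseudo-random 32-byte hash per input (made-up values).
unique_hashes = b"".join(
    hashlib.sha256(i.to_bytes(4, "little")).digest() for i in range(OUTPUTS)
)

repeated_compressed = len(zlib.compress(repeated_ids))
unique_compressed = len(zlib.compress(unique_hashes))

print(f"repeated group IDs: {len(repeated_ids)} -> {repeated_compressed} bytes")
print(f"unique hashes:      {len(unique_hashes)} -> {unique_compressed} bytes")
```

The repeated identifiers shrink to a small fraction of their raw size, while the unique hashes stay at roughly their original size.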

Without a doubt, a solution which makes transaction validation more stateful can probably save some storage space, but even comparing to current network activity, CashTokens would be among today’s smaller transactions. CashFusion transactions are regularly 10x-100x larger than CashToken transactions, and we’re not urgently concerned about their size. And CashToken fees will be as negligible as expected on BCH: users will pay fractions of a cent in 2020 USD per transaction.

There’s some huge misunderstanding that Group Tokens impacts scalability. It doesn’t, except in the sense that the blockchain is now carrying transactions for both BCH and USDT (for example). No token protocol will be able to have on-chain tokens without having token transactions! I’ve come to believe that people who take this scalability argument are being disingenuous – they should just say they don’t want tokens – rather than concern trolling.

Described that way – that BCH should be for BCH only – I can respect but entirely disagree with that argument. And I think that the market agrees with me based on the performance of BCH over the last few years.

To be fair, the original OP_GROUP proposal had a far larger impact on scaling, and people might not know there’s a new spec. Also, the latest spec reduces the amount of new state introduced into transaction validation, but it’s still not “new state”-free. There are real tradeoffs being made, even if you and I agree that tokens might be worth it.

A smaller, easily-reviewed change like PMv3 would give us a great opportunity to showcase the value of contract-integrated tokens on BCH. If we see a lot of on-chain applications, larger upgrade ideas like Group Tokenization will probably get more interest.

And with an active token-contract developer community, we’d have a lot more visibility into the token features BCH needs to support!

andrewstone · February 3, 2021, 3:25am

Hashed witnesses increase the size by 32 bytes per input rather than per output, and token transactions will typically only need one. Do you have any test vectors or more implementation-focused specs where I can dig into the expected size of token mint, redeem, and transfer transactions with Group Tokenization?

I understand. But there’s ultimately a 1-1 correspondence between inputs and outputs. Although, yes the outputs become part of the UTXO so there’s more database use for data in the outputs.

For CashTokens, a 32-level merkle tree proof (>4 billion token holders) requires 192 bytes of redeem bytecode and 1024 bytes of proof. That overhead would apply only to mint and redeem transactions. For transfers, the inductive proof requires only the serialization of the parent and grandparent transaction, so at ~330 bytes each, the full unlocking bytecode is probably ~1,000 bytes. (If you’re curious, you can review the code in the CashTokens Demo.)

So that’s a lot more data per transaction. For group tokens there’s no additional unlocking bytecode – it’s the normal P2PKH or P2SH style locking.

On quantity/amount: I didn’t propose a new quantity field since it’s pretty easy to use satoshis to represent different token quantities for CashTokens which need to support it. And it’s easy for covenants to prevent users from incorrectly modifying the value.

The original OP_GROUP tokenization did the same. I got a lot of push back so I added a token quantity field.

(In that same gist you mention bigints. I’ve implemented something at www.nextchain.cash if you are interested.)

Do you have any test vectors or more implementation-focused specs where I can dig into the expected size of token mint, redeem, and transfer transactions with Group Tokenization?

This seems… unlikely

Really. At its heart all you need to implement group tokens is:

1. a for loop that adds up the tokenized inputs
2. a for loop that adds up the tokenized outputs
3. compare them to make sure input quantity == output quantity.
4. a few if statements to ignore 3 in the case of mint and melt.
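The list above translates almost directly into code. The following Python sketch is only an illustration of that shape, not the nextchain implementation: the `TokenAmount` type is hypothetical, and the `mint_allowed` / `melt_allowed` sets are a stand-in for however the real spec decides that a transaction carries mint or melt authority.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TokenAmount:
    group_id: Optional[bytes]  # None means a plain BCH input/output
    quantity: int


def sum_by_group(items):
    """Add up token quantities per group, skipping non-token items."""
    totals = defaultdict(int)
    for item in items:
        if item.group_id is not None:
            totals[item.group_id] += item.quantity
    return totals


def groups_balance(inputs, outputs, mint_allowed=frozenset(), melt_allowed=frozenset()):
    """True if every group balances, except where mint/melt is allowed (assumed sets)."""
    totals_in = sum_by_group(inputs)    # loop over the tokenized inputs
    totals_out = sum_by_group(outputs)  # loop over the tokenized outputs
    for group_id in set(totals_in) | set(totals_out):
        quantity_in = totals_in.get(group_id, 0)
        quantity_out = totals_out.get(group_id, 0)
        if quantity_out > quantity_in and group_id in mint_allowed:
            continue  # minting: outputs may exceed inputs
        if quantity_out < quantity_in and group_id in melt_allowed:
            continue  # melting: inputs may exceed outputs
        if quantity_in != quantity_out:
            return False
    return True


GROUP = b"\x11" * 32  # made-up group identifier
transfer_ok = groups_balance(
    [TokenAmount(GROUP, 10), TokenAmount(None, 5000)],
    [TokenAmount(GROUP, 4), TokenAmount(GROUP, 6), TokenAmount(None, 4800)],
)
```

A mint would be the same call with more tokens in the outputs than the inputs and `mint_allowed={GROUP}`.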

Calling it 1 page of consensus code is very generous – I implemented it in a very clean, sparse manner. Now, back in the day, some people threw a bunch of crap problems at it to make it a monster change, probably so it would never be implemented. Like insisting on a new transaction format because “data shouldn’t be put in the output script”.

The Group Tokenization spec is ~24 pages (~43 pages with appendices and examples), and that doesn’t include a lot of implementation detail.

The original specification was just a few pages. There was a lot of push back that the spec and group tokens were not comprehensive enough. ABC claimed to need ERC-721-like functionality or nothing. So this subsequent document lays out an entire vision, including a few years of additional features, including proposing features like transaction introspection, covenants, subgroups (NFTs), etc.

On the other hand, the “silent fork” you described is possible with Group Tokenization. Group Tokenization is a much larger protocol change, where lots of different clients will need to faithfully implement a variety of specific rules around group creation, group flags, group operations, group “ancillary information”, “group authority UTXOs”, authority “capabilities”, “subgroup” delegation, the new group transaction format, “script templates”, and “script template encumbered groups”. It’s plausible that someone will get something wrong.

A “silent” fork is not possible. A mis-implementation will cause a blockchain fork. But like I’ve already said, regardless of the above concepts, it’s actually a short consensus implementation, and lays out an entire roadmap. Stuff like implementing the new group transaction format doesn’t make technical sense (and wasn’t in OP_GROUP). It was ABC yanking me around with their obsessive desire to refactor and perfectionize every piece of code.

I hadn’t realized that the script enforces the inductive proof – I thought that you were proposing that that proof be part of the merkle proof of tx inclusion (and the client verifies it). In that case, you are right, your proposal also won’t suffer from “silent forks”.

Anyway, I’ve implemented it in the nextchain experimental testnet (minus the make-work “requirements” like a new tx format). You can head over to www.nextchain.cash to see details.

To be fair, the original OP_GROUP proposal had a far larger impact on scaling, and people might not know there’s a new spec.

It didn’t have any different impact on scaling.

The “new state” is a group identifier per token output, so similar to your proposal which puts 1 per input and not a significant scaling issue. Also it seems like you require much larger scripts…

A smaller, easily-reviewed change like PMv3

The original OP_GROUP was 1 page of isolated consensus code that implemented 1 new function: “do group inputs and outputs balance?”. The revised Group Tokenization is a bit bigger, maybe 2-3 pages, because it implements features like mint/melt authorities. These features basically say “sometimes inputs and outputs shouldn’t balance…” so are conceptually simple code.

bitjson · February 3, 2021, 7:31am

Thanks for the clarifications!

Are there any other public forums with good technical discussion on Group Tokens? I’d love to have more background if it’s available anywhere.

Anyway, I’ve implemented it in the nextchain experimental testnet […]

Ah, nice. (Code here.)

So in summary, maybe the best comparison is:

Group Tokenization (prev. OP_GROUP) vs. CashTokens

Group Tokenization aims to be an all-inclusive framework for BCH token support – it’s much like Ethereum’s ERC20, except Group Tokens would be built directly into the BCH consensus protocol, including “groups”, “authorities”, “capabilities”, “subgroups”, etc.

CashTokens are “userland” protocols: they are completely implemented using the VM bytecode language. Like any contract, new ideas (for e.g. minting, redeeming, and token management) can be designed and deployed by anyone, whenever they want. The only prerequisite for people to get started building CashToken-style covenants is the ability to succinctly introspect parent transactions (e.g. a solution like hashed witnesses).
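To make "succinctly" concrete, here is a toy size model in Python. It assumes each token transaction must carry its parent and grandparent in its unlocking bytecode, and compares embedding them in full with embedding a version where each ancestor's unlocking bytecode is replaced by a 32-byte hash. The constant `FIXED_BYTES` and both function names are illustrative assumptions, not figures from the PMv3 spec.

```python
HASH_SIZE = 32
FIXED_BYTES = 298  # toy assumption: bytes in a transaction other than its unlocking bytecode


def tx_size_full(generation: int) -> int:
    """Size of a transaction that embeds its ancestors in full."""
    return FIXED_BYTES + unlocking_size_full(generation)


def unlocking_size_full(generation: int) -> int:
    """Unlocking bytecode size when parent and grandparent are embedded whole,
    including their own (already large) unlocking bytecode."""
    if generation < 2:
        return 0  # toy assumption: the first two transactions carry no proof
    return tx_size_full(generation - 1) + tx_size_full(generation - 2)


def unlocking_size_hashed(generation: int) -> int:
    """Unlocking bytecode size when each embedded ancestor carries only a
    32-byte hash in place of its unlocking bytecode."""
    if generation < 2:
        return 0
    return 2 * (FIXED_BYTES + HASH_SIZE)


for generation in range(2, 8):
    print(generation, unlocking_size_full(generation), unlocking_size_hashed(generation))
```

In this model the fully embedded proof grows with every transfer, while the hashed version stays at a constant size no matter how long the token's history is.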

(discussion from CashToken Devs)

And again, I don’t think these are mutually exclusive upgrades: inductive proofs are needed for several covenant applications which aren’t likely to be covered by a built-in token protocol (e.g. weighted, blinded voting for two-way-bridged prediction markets). There are also definitely applications which could be optimized beyond inductive proofs – a future upgrade which lets us “group” and save token balances in with the UTXO set could save a lot of bytes per transaction.

andrewstone · February 3, 2021, 2:01pm

“it’s much like Ethereum’s ERC20”

I wouldn’t put it that way. As I’m sure you know (but just setting the stage), Ethereum is basically a replicated computer with shared state. Things like tokens are built on top of that computer. Something like CashTokens uses that same architecture, since it puts the logic that creates tokens into the BCH scripting language.

This is a very fundamental architectural decision. It’s much more fundamental IMHO than saying that Group Tokenization is like ERC20 because they both support feature X. In fact, that’s not even accurate. ERC20 just supports transfer – it’s up to each contract to layer on all the other features that Ethereum has taught us that we need.

The ramifications of this architectural decision influence EVERYTHING downstream. There are advantages and disadvantages.

Disadvantages of implementing tokens in the blockchain script:

Safety: every token can be implemented differently so really before holding a token every person should review that code (which may be closed source).
Scalability (CPU): no matter how much more efficient the blockchain scripting language gets, it’s unlikely to exceed the efficiency of native code. Additionally, the native code can leverage miner consensus and therefore “do” a lot less – in this case just a simple sum and compare of token inputs and outputs.
Scalability (Space): the scripts that implement and enforce token constraints will take up space in transactions. Admittedly this space is very compressible using the macro scripting architecture that I laid out a few years ago but still it cannot be smaller than no script…

Advantages of implementing tokens in the blockchain script:

Flexibility: There’s just the one, but it’s a big one!

So let’s talk product philosophy, because excepting BTC with its 1st mover advantages, blockchains are effectively products competing for users. Here’s a key product design idea:

If you can’t be better in all respects than an established competitor, then at least be different.

Why? Because your differences will mean that you are better at some things (while being worse at others). Ideally of course, you want to be better at MORE things than you are worse at. But even if you are worse at more things, your different product can find applications that specifically need what you are better at.

This is the philosophy I think we should use with Bitcoin Cash. Versus Bitcoin, we’ve made it worse on low-end machines/internet connections to make it better at handling large volumes of transactions. Versus Ethereum, we shouldn’t make it a (worse) general-purpose computer (and no, we are never going to slowly build BCH script into something that can outperform the ETH VM, which is IIRC going to leverage WebAssembly). Instead, let’s make it really good at handling crypto-financial computing. The way to do this is to have powerful primitives for that task. This is why I want super-efficient native tokens, rather than implementing them at the scripting layer. This is why I want OP_PUSH_TX_DATA (transaction introspection) rather than using the inefficient CDS “trick”. This is why nextchain.cash implements variable-sized big nums rather than picking 256 bits (say). Ethereum defines uint256… my big num implementation will be (perhaps – it’s using libgmp where very clever people are focused on performance and writing asm implementations for common architectures) very slightly less efficient at 256-bit numbers but more efficient at every other size.

tom · February 3, 2021, 6:46pm

I’m still behind on reviewing the inductive proofs and other details from Jason, but I do like the discussion Andrew started.

What I specifically appreciate in Bitcoin is its scaling proposition. Validating a transaction and/or script is very easy to shard over multiple processors; all parts are easy to parallelize. The only thing that we can’t parallelize is the UTXO set. We need some locking in order to make sure we can detect a double spend within a single block.
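A minimal Python sketch of that split, under loud assumptions: `validate_scripts` is a placeholder for the stateless per-transaction work, transactions are plain dicts with an `inputs` list of `(txid, index)` tuples, and outputs created inside the block are not modeled. None of this mirrors any particular node implementation.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class UtxoSet:
    """Shared set of unspent outpoints; the only state that needs a lock."""

    def __init__(self, unspent):
        self._unspent = set(unspent)
        self._lock = threading.Lock()

    def spend(self, outpoints) -> bool:
        """Atomically spend all outpoints; False if any is missing or already spent."""
        with self._lock:
            if len(set(outpoints)) != len(outpoints):
                return False  # the same outpoint listed twice in one transaction
            if not all(outpoint in self._unspent for outpoint in outpoints):
                return False  # unknown or double-spent outpoint
            self._unspent.difference_update(outpoints)
            return True


def validate_scripts(tx) -> bool:
    """Placeholder for script/signature checks, which touch no shared state."""
    return True


def validate_block(utxos: UtxoSet, transactions):
    """Validate transactions in parallel; only the UTXO update is serialized."""

    def validate(tx):
        return validate_scripts(tx) and utxos.spend(tx["inputs"])

    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(validate, transactions))


utxos = UtxoSet({("a", 0), ("b", 0)})
block = [
    {"inputs": [("a", 0)]},
    {"inputs": [("b", 0)]},
    {"inputs": [("a", 0)]},  # double spend of ("a", 0) within the block
]
results = validate_block(utxos, block)
```

Exactly one of the two transactions spending `("a", 0)` is accepted; which one depends on thread scheduling, which is why the lock is needed at all.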

Based on the fact that the UTXO set is the scaling bottleneck, I’m indeed happy with the general design in Jason’s stuff: it doesn’t add anything to the UTXO set.

The OP_GROUP and maybe the GROUP one too (I mirror Jason’s request for some forum or documentation on it) have a rather severe problem: they add a second 32-byte identifier to every UTXO entry. As the current lookup is just a single 32-byte entry (and maybe an output index), this is a rather large difference that will have an instant and negative effect on scaling.

bitjson · February 3, 2021, 10:03pm

Related discussion about how inductive proof strategies can allow covenants to offload functionality to child covenants, allowing public covenants to be parallelized. This reduces competition over who gets to make the next covenant transaction (avoiding competing chains of unconfirmed covenant transactions):