CHIP 2022-02 CashTokens: Token Primitives for Bitcoin Cash

It’s not just about the data, although I arrived there thinking in that direction, trying to achieve functionality similar to the “detached proofs” of PMv3. That led me to a version (4.3, I think) where the whole genesis TX (except the genesis IDs being generated) was the preimage, with input scripts individually hashed, but that was too complex and still rigid, so I dropped it and went back to a simple hash of the full prevout ref + the prevout itself, thinking the same thing: you can move one step back and unwrap the TXID to prove parts of the “pre-genesis”.

Then when you pinged me I was thinking “what could it possibly be?”, and I figured it must be that you had cracked the genesis setup! That led me to revisit some old ideas (ID = 00…00 for pre-genesis, with the next TX generating the real ID), but then it hit me: genesis naturally belongs on inputs! Little did I know that you had cracked something else that was giving me a headache. Since subgroups got dropped I never looked back at adding more data, but I had tried to achieve something similar to your “commitment” by overloading the amount field and implementing a groupType = NFT | hasAmount, which would give you 8 “free” bytes on an NFT.

Anyway, that was just background. The point is that all existing UTXOs should have equal “genesis potential”, and that creators of UTXOs should be able to make a covenant that can prohibit, ignore, or require a token genesis on top of itself, with parameters specified by the covenant creator, without having to make sure they put it in the right slot. Also, if genesis is implicit, then you have to do it the roundabout way: verify your genesis output’s introspected ID against the input’s introspected prevout TXID; if it matches, it’s a genesis. And what if you want to verify the supply being created at genesis? Then you have to tally the genesis TX outputs from within Script, which is something we’re trying to avoid, no?

“using transaction IDs as category IDs enables token-related applications to be powered by existing indexes”

I’m well aware of this benefit and the trade-off; it’s just that I’m no longer convinced it’s a good one now that I see this genesis-on-input approach.

You’d be sacrificing a genesis approach that fits in perfectly with the Script VM and other BCH blockchain primitives in order to make some secondary jobs easier, jobs that weren’t really hard to begin with.

From talking with some others I got the impression that this easy indexing is a solution looking for a problem. The whole ecosystem is used to indexing token IDs, and if you get a random token and don’t want to use an index, then finding the genesis TX is a simple matter of tracing a chain back from your UTXO to the genesis TXO.

“Is there some application of genesis commitments which cannot be supported by pre-genesis commitments?”

Yeah, pre-genesis covenants, like these 2 examples:

  • Require that 10,000 sats be paid into some P2SH covenant and an NFT created. The target covenant can later easily verify the NFT’s genesis and that it’s being burned, then release the 10,000 sats. A cool aspect of this: you can take the 10,000 sats from any covenant UTXO, even ones not created by you, because the covenant only verifies the NFT template. (Bitauth IDE example)
  • Require a token genesis to be created. Just that lets us do something interesting: define a standard where tokens are created from a public covenant like this (or authenticated by verifying a “detached owner” at txid/n+1). This way, services can subscribe to a single address to be updated whenever a new standard-token genesis is created. (Bitauth IDE example; the underlying genesis check is sketched below.)
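To make the second example concrete, here’s a rough TypeScript sketch (hypothetical record shapes, not any real library API) of the genesis rule these covenants lean on: an output only creates a new category when its category ID equals the TXID of a prevout with output index 0 spent in the same transaction.

  interface Outpoint { txid: string; index: number }           // prevout reference being spent
  interface TokenOutput { category: string; amount: bigint }   // simplified token prefix on an output

  // An output is a valid genesis only if its category matches the TXID of some
  // prevout (at output index 0) consumed by the same transaction.
  const isGenesisOutput = (output: TokenOutput, spentOutpoints: Outpoint[]): boolean =>
    spentOutpoints.some((outpoint) => outpoint.index === 0 && outpoint.txid === output.category);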

This is clear to me now; it was not about this, I had misunderstood your spec. My question was about what OP_UTXOTOKENCATEGORY would return if executed on an input whose prevout is a txid/0 and that prevout already carries a token. But from the spec it’s clear: it would return the “old” token’s category, and Script running on the input would have to look at the outputs to know whether it’s creating a new category, and of what kind.


Sorry for the late reply - there are some excellent points addressing my concerns. Some more thoughts for @bitjson:

Exactly – dropped. If we want logical consistency in the VM, token amounts should never exceed the maximum VM number. This is important for contracts because it eliminates a class of overflow bugs. Note, however, that it’s not a limitation in practice: other token standards built on top of CashTokens can allow for an unlimited supply.
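To make that overflow point concrete, here’s a minimal TypeScript sketch; the ceiling constant is my assumption of the maximum VM number (2^63 - 1).

  const MAX_VM_NUMBER = 2n ** 63n - 1n; // assumed per-category ceiling (maximum VM number)

  // Because every amount and every per-category total must stay within this bound,
  // contracts can sum amounts without a separate overflow check.
  const sumCategoryAmounts = (amounts: bigint[]): bigint => {
    const total = amounts.reduce((sum, amount) => sum + amount, 0n);
    if (total > MAX_VM_NUMBER) throw new Error('per-category supply exceeds the maximum VM number');
    return total;
  };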

Note that even if we had some concept of “minting buckets”, off-chain token issuers could still fail to “properly label” their reserved supply (such that circulating supply is easy to calculate by all indexing applications). So the best we can do is make sure issuers know that “labelling” their reserved supply is possible (and token standards built on top of CashTokens should encourage compatibility too).

Note taken. I suppose the “right way” to do mintable fungibles, then, is to establish a very large bucket that is locked away and accessible only to a baton NFT. Standardizing a specific way to do this as “mintable” will help application interfaces deal with it, and will separate the intentions of a “very high supply token” and a “mintable token”. One way to be very clear about it is perhaps to standardize “mintable” tokens as tokens that are simply minted with the maximum VM supply, then further define how the reserve can be held.

On this point: (also thanks @bitcoincashautist for clarifying some of that above!)

“The reserved supply of a fungible token category can be computed by retrieving all UTXOs which contain token prefixes matching the category ID, removing provably-destroyed outputs (spent to OP_RETURN), and summing the amounts held in prefixes which have either the minting or mutable capability.”
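For reference, a minimal TypeScript sketch of that quoted computation (the record shape is a hypothetical indexer view, not a node API):

  interface TokenUtxo {
    category: string;                      // 32-byte category ID (hex)
    amount: bigint;                        // fungible amount in the token prefix
    capability?: 'minting' | 'mutable';    // NFT capability, if any
    provablyDestroyed: boolean;            // true if the output pays to OP_RETURN
  }

  const reservedSupply = (utxos: TokenUtxo[], categoryId: string): bigint =>
    utxos
      .filter((utxo) => utxo.category === categoryId && !utxo.provablyDestroyed)
      .filter((utxo) => utxo.capability === 'minting' || utxo.capability === 'mutable')
      .reduce((sum, utxo) => sum + utxo.amount, 0n);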

Is there any advantage to limiting reserves to mutable/minting tokens? It seems to me that they don’t confer any additional advantage compared to vanilla NFTs; in fact, using them may complicate identification of reserve pots down the line.

While this does not affect consensus, when specifying standards for mintable tokens I’d actually do the opposite: reserves should go to a vanilla NFT so they remain in one easily identifiable chain, and we can further specify that they need to be at output x of the genesis TX for easy identification, etc. Other configurations are possible but would take the form of additional standards.

Thanks for bringing that up – I’m now thinking that paragraph in the CHIP’s rationale is just incorrect: it’s almost certainly more efficient (in terms of real-world transaction costs) to have liquidity pools for various tokens which allow you to hand the pool some tokens and receive some BCH. If you wanted to pay for a transaction in another asset, you would simply swap it and not claim all of the released BCH (paying some BCH as a fee). This is far simpler to coordinate, and doesn’t run into the “coincidence of wants” issue where the miner of the next block might not care about the particular token you’re hoping to use for fees. So token holders can already pay fees in BCH using atomic swaps, and that paragraph can be deleted. (Done 👍, thanks!)

I don’t think it’s sufficient to delete the miner fee rationale - with that gone, the only remaining rationale is safety, and there are ways to address that (for example, by requiring a SIGHASH_TOKEN bit in inputs spending tokens) without losing the freedom to destroy. I feel strongly that wallet owners should be able to destroy their tokens and recover the sats; I can attempt a PR to the specs if necessary.


They could, with OP_RETURN, but it would be cheaper with implicit destruction, so there’s an argument for allowing implicit destruction.

If you had 10 token UTXOs with distinct category IDs, you’d need 10 inputs and 10 OP_RETURN outputs to “eat” each token + 1 pure BCH change output.

With implicit destruction, you’d need just the 10 inputs and 1 BCH change output.

Wallet safety is one argument for keeping it explicit, although there may be alternatives as you suggested.

I wonder whether there are some Script VM implications, but I can’t think of any right now. It’s just one more way to burn: 1bitcoineater, a 0-amount OP_RETURN, or omission (which needs a >= instead of == balancing rule).
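For clarity, here’s a TypeScript sketch of that balancing difference, applied per category and ignoring genesis (names and record shapes are illustrative):

  type CategorySums = Map<string, bigint>;

  const tally = (tokens: { category: string; amount: bigint }[]): CategorySums => {
    const sums: CategorySums = new Map();
    for (const { category, amount } of tokens) sums.set(category, (sums.get(category) ?? 0n) + amount);
    return sums;
  };

  const categoriesBalanced = (inputs: CategorySums, outputs: CategorySums, allowImplicit: boolean): boolean => {
    // No category may emit more than its inputs provide (genesis ignored in this sketch).
    for (const [category, outputSum] of outputs)
      if ((inputs.get(category) ?? 0n) < outputSum) return false;
    if (allowImplicit) return true; // the >= rule: any shortfall on the outputs is simply burned
    // The == rule (explicit-only destruction): every input amount must reappear on the outputs.
    for (const [category, inputSum] of inputs)
      if ((outputs.get(category) ?? 0n) !== inputSum) return false;
    return true;
  };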


Posting a review mixed with suggestions here.


@bitcoincashautist and I worked through the token genesis question offline. TL;DR: if any relevant use cases are discovered in the future, there are at least two good alternatives to a built-in category preimage data structure.

First observation: we haven’t yet identified any use case that requires a contract to “accept” tokens of an unknown category where some vetting process isn’t already required, e.g. a vote of shareholders, some staking process, submission with a liquidity pool of BCH, etc. Since these processes already imply that economic actors are somehow vetting the unknown token category (e.g. reviewing the details of the issuing covenant, verifying the tokens’ utility in an external system, checking that a pegged asset has audited reserves, etc.), (re)verifying a few details of the unknown token category’s genesis transaction in the on-chain contract bytecode doesn’t offer any additional value (it just wastes transaction bytes/fees).

However, if someone did discover a use case in the future, there are at least two good options available without modifying the current specification:

  1. Using the pre-genesis transaction as a commitment structure – e.g. committing data to OP_RETURN outputs or specifying a covenant which enforces constraints around token genesis, and/or
  2. Covenants-as-standards - well-known public covenants which oversee the creation of many new tokens according to strict rules, allowing those covenants to:
    1. Enforce uniqueness across a large, managed category of NFTs, and/or
    2. Freely issue “certification tokens” – tokens which attest to the newly created token category’s compliance, e.g. fixed supply below some limit, no minting tokens, minting tokens assigned to a strict covenant, etc. Such certification tokens could also be issued to public covenants which prevent them from being moved or destroyed, allowing any transaction to temporarily borrow them for a proof (and then return them to the same covenant).

Interestingly enough, the covenants-as-standards strategy is far more efficient than either commitment structure option. It cuts down the transaction size cost of verifying category genesis properties to 36 bytes (<index> OP_UTXOTOKENCATEGORY <covenant_managed_category_id> OP_EQUAL) rather than the hundreds of bytes required for contracts to reconstruct, verify, and inspect a category ID’s preimage (whether that preimage is a transaction or a new data structure).

So with that, we concluded it makes sense to leave that part of the CHIP as-is (using transaction IDs as category IDs).


Thanks for the comments @im_uname, @bitcoincashautist, and @emergent_reasons! I’ll try to respond here to everything in this thread, then respond on GitHub to activity there.

Yes, exactly. I think that would be a good start for standardizing “mintable, fungible tokens” in some “SLPv2” specification. I think it’s also a good idea for standards to attempt to conform to how covenants will issue such tokens by recommending that trusted token issuers hold their “reserved” tokens in outputs with either minting or mutable capabilities. (More in the response below.)

If I’m understanding the question – yes. If the goal were only to standardize a method for identifying circulating/reserved tokens, I would argue that definition belongs in complete token standards rather than in this CHIP.

The rationale behind including these particular supply definitions in this CHIP is that there is an important, “emergent standard” inherent to how most covenants must issue tokens (if they use the minimum possible contract/transaction sizes): most covenants hold easily identifiable reserves of their own, unissued tokens. This supply of unissued covenant-held tokens will almost always be held in a top-level or depository child covenant, many of which will use a tracking token with the mutable capability. Simpler covenants may also directly hold the minting capability, rather than isolating it to a privileged child covenant (somewhat like a linux superuser account). In both cases, the “reserved” supply is easy to calculate by outside observers (and without adding complexity to the covenant to accommodate some standard).

Given this reality, it makes sense for higher level token standards to attempt to standardize around compatible definitions, ensuring good application-layer compatibility between on-chain and off-chain token uses.

On higher level standards: I agree – it’s a great idea to standardize around minting and holding all reserves in the 0th transaction output. (Bonus: Bitauth-supporting infrastructure like Chaingraph already supports recursive lookups of the 0th output.) So this CHIP’s only contribution in that respect would be to clarify that each of those 0th outputs must have either the minting or mutable capability, ensuring that supply calculation for standards-compliant tokens issued by centralized entities is compatible with supply calculation for covenant-issued tokens.

This is a great point, thanks for noticing @im_uname! As @bitcoincashautist mentioned, requiring OP_RETURN for token destruction makes cleaning up “token dust” much more expensive than some sort of SIGHASH_TOKEN. A signing serialization type flag also simplifies away the awkward destruction-policy difference between fungible or immutable tokens and minting or mutable tokens (where the former currently cannot be implicitly destroyed, but the latter can).

In fact, to prevent offline signers from being misled by signing requests that omit token information, we also need to add the token information directly to the signing serialization. Maybe we include the full contents of the token output prefix (same encoding) after value in the algorithm for SIGHASH_TOKEN signatures?

If you’d be interested in sending a PR, I’d definitely appreciate it! (I’m also happy to write the update or any relevant rationale, just let me know what you’re interested in doing.)

Right, it will still be possible for tokens to be explicitly burned to OP_RETURN outputs (e.g. for protocols which require proof of burn), but the SIGHASH_TOKEN strategy is both more efficient and closes some possible vulnerabilities in offline signers.

Aside: funds/tokens sent to 1bitcoineater... are actually not safe for any token standard to consider “burned” – those funds can be unlocked in the future if someone manages to acquire that private key (e.g. a break in the crypto, even decades from now). Even more plausibly, a very expensive collision search could eventually create a new P2PKH “burn address” with a known private key. That’s one reason we can be pretty confident in the CHIP’s token supply definitions after only removing OP_RETURN outputs: OP_RETURN covers all provably unspendable outputs which can be generated by standard transactions. (To provably burn tokens with a different locking bytecode – e.g. OP_0 – each user would have to manually submit/mine a non-standard transaction.)


Thanks, I’ll try to submit a PR this week. :)


That captures and summarizes it - for now :)
Let’s keep it on the back burner while we focus on other things, and maybe we’ll later hit some “Aha!” moment when we start working out examples and standards.

Just to add some closing thoughts: I wouldn’t mind if genesis were left as-is, because I see workarounds for what I had in mind with using the TXID, although I’m not fully convinced: if you need workarounds to achieve something, then maybe the whole thing should be reworked so you don’t need those workarounds.

I feel like our genesis setup could become an important “commitment primitive”, as it can preserve and carry any TXID and with that preserve a proof that something happened in the past. With that in mind, I’d like to future-proof it and make it as generic and flexible as possible, so we don’t wake up one day thinking, like we did with TXID: damn, it would be really nice if the TXID hashed locking scripts individually when constructing the preimage.
We’re looking at it from different angles/philosophies, I guess. I’m thinking: if we’re introducing blockchain-native primitives, then they should allow maximum expressiveness for blockchain-native agents (contract entities).


Yes. I thought it would be “automatically” included, as if it were part of the hashOutputs scriptPubKey, but it’s better to spell it out.

This comment made me realize something: the prevout’s token payload is special because the signing serialization sometimes uses the locking script and sometimes the redeem script, so we should add the prevout’s PFX inside the scriptCode, prepended to the actual script.


(I think you mean hashPrevouts?) Right, hashPrevouts commits to the full contents of all the transactions with outputs being spent by the current transaction. The main reason value is included separately by BIP143 (and then by BCH’s signing serialization algorithm) is to make verification easy for offline signers. In practice, many offline signers weren’t actually verifying the values from the source transactions because to do so could require transferring, decoding, and inspecting MBs of raw transactions. Committing to the value directly in the signing serialization allows for equivalent security using much simpler offline signing implementations. (Even if all transaction data must be transferred e.g. by keyboard or a QR code.)

I’m not sure if I understand the logic here, but my initial reaction is that scriptCode is already a pretty complicated idea – I think we’re best off leaving it as-is and just committing to the full token prefix directly after value. (And only if SIGHASH_TOKEN is present.)
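Put differently, here is a sketch of the field ordering being discussed: a BIP143-style preimage with the spent UTXO’s token prefix inserted directly after value. This reflects the proposal in this thread, not necessarily the final CHIP serialization.

  // Order of fields in the sketched SIGHASH_TOKEN signing serialization:
  const signingSerializationFields = [
    'nVersion',
    'hashPrevouts',
    'hashSequence',
    'outpoint',
    'scriptCode',     // left as-is, per the comment above
    'value',
    'tokenPrefix',    // full token prefix of the UTXO being spent, only when SIGHASH_TOKEN is set
    'nSequence',
    'hashOutputs',
    'nLocktime',
    'sighashType',
  ] as const;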


I thought that it commits only to prevout refs (which by themselves commit to the prevout satoshi amount and locking script through the TXID). So I meant to include the token prefix the same way as value, so that you can see what you’re signing without having to obtain the parent TXes. We’re thinking the same here, but I had a different place for it in mind.

This sounds good, and cleaner :)

No, there I meant the newly created outputs, although that was not really applicable to your comment about offline signing. The newly created outputs’ token payloads need to make it into the signature preimage(s), too. We should spell out how both prevouts and new outputs are to be handled where signing is concerned.


Finally read through all this as part of my review. The amount of constructive debate and collaboration going on is amazing.


Just had time to finish digesting the proposal, and I finally get it. My initial impression a couple of months ago was uncertain-to-negative, as it seemed to be a shortcut that added application logic to layer 1. However, the reasoning provided around the elemental nature of the two types of tokens is sound, and the CHIP is really impressive and detailed. I hope it succeeds, and I’ve gotten all excited about the possibilities for things to build on top of it.


Great to know! Can’t wait till Jason wraps up his other projects so we can push this forward together!

PS @rnbrady can we add that as a quote for the CHIP?


Yes, sure thing and let me know how else I can help.


Making a comment here as a placeholder: we need to address activation strategy in the CHIP.

While nonstandard, it is possible that transactions generated before activation will contain “valid-looking” token outputs, and since the ruleset doesn’t exist before activation, they can be “wrong” in a wide variety of ways - including but not limited to duplicate category IDs, invalid-when-summed amounts, nonsense NFT capabilities, nonexistent genesis, and so on.

Declaring these pre-activation outputs invalid might be simple as a thought experiment, but it incurs technical debt in practice - we’d need a separate pass checking UTXO height to determine the validity of all token transactions. Not ideal…

… but these should not be a big deal in practice even if we adopt the ruleset as-is! These “fake” UTXOs can simply be declared “valid if they exist at activation”. This may lead to shenanigans involving any categories that use TXIDs that exist pre-activation, but there is a clean way around it: implementers (of wallets, services, smart contract providers, etc.) mostly need to be aware to do a de novo two-step genesis for any tokens they generate; the shenanigans only apply to users who venture into directly using pre-activation UTXOs for genesis.

Does this mean we don’t actually need to do much? Yes, but we do also need to address this point in the specs, lest people get confused about what the best practices are.


Good points, @im_uname.

Just to elaborate: there are a bunch of corner cases here, all somewhat related to the way this works.

What do you do if you, as a node, saved a scriptPubKey to your UTXO DB some time ago, and it has the prefix byte? Now some new tx, post-activation, wishes to spend that UTXO. So you deserialize the coin and, lo and behold, it looks like it has the PREFIX_BYTE.

  • What do you do if the SPK passes muster and deserializes correctly as a [tokenData + SPK] byte blob? (It has OK capabilities, a positive amount, a short-enough commitment, etc.) Now there is a “fake” token that can be spent… which is what @im_uname is discussing above…
    • This has implications for token amounts. It’s possible for a category’s total to exceed INT64_MAX if someone makes a bogus token pre-activation that has the same category ID as a real token from post-activation… Now your inputs can sum to more than INT64_MAX. This is a caveat for node implementors to worry about (a defensive sketch follows after this list)…
  • The other case is what happens if the TXO fails to deserialize: while it used to be just an opaque byte blob containing SPK bytes, there are now new rules about SPKs with the prefix having to follow the new token binary format (commitment length, etc.). So maybe the TXO doesn’t deserialize correctly as a token… but you thought it was a token!
    • Do you deserialize it anyway and just throw all the bytes (including the PREFIX) into scriptPubKey (as it was when it was created, really)? (Note this would be an unspendable TXO, but the behavior still needs to be specified.)
    • I mention this only as a corner case because one can imagine some node software assuming that “illegal” PREFIX_BYTE-containing SPKs are impossible if they come from the internal node DB, and if that assumption doesn’t hold, one can imagine software crashing when it hits the condition it thought was impossible…
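On the amounts caveat above, a defensive TypeScript sketch (illustrative only, not any node’s actual code):

  const INT64_MAX = 2n ** 63n - 1n;

  // Accumulate token input amounts in a bigint so a bogus pre-activation “token” sharing a
  // category ID with a real one can’t silently wrap a 64-bit sum; reject if the bound is hit.
  const sumInputAmountsChecked = (amounts: bigint[]): bigint => {
    let total = 0n;
    for (const amount of amounts) {
      total += amount;
      if (total > INT64_MAX) throw new Error('token input sum exceeds INT64_MAX');
    }
    return total;
  };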

There are also other caveats with respect to activation… that are subtle … which I can go into later.


Great points, thanks for bringing it up @im_uname and @cculianu! That definitely needs to be addressed in the spec.

One useful observation: any locking bytecode that begins with PREFIX_TOKEN (0xd0) is currently provably unspendable. That’s not true for every occurrence of 0xd0 within locking bytecode, but because 0xd0 is an unknown opcode, an OP_IF/OP_ENDIF block can’t cross from unlocking to locking bytecode, and unlocking bytecode has been push-only since Nov 2019, we know that 0xd0 cannot be the first byte in any successful locking bytecode evaluation (and this has been the case since Satoshi).
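A trivial sketch of that check (TypeScript, illustrative; the prefix value is from the CHIP):

  const PREFIX_TOKEN = 0xd0;

  // Under pre-upgrade rules, a locking bytecode beginning with the token prefix byte can never
  // be satisfied, so such outputs are effectively OP_RETURN-like (provably unspendable).
  const isPreUpgradeUnspendable = (lockingBytecode: Uint8Array): boolean =>
    lockingBytecode.length > 0 && lockingBytecode[0] === PREFIX_TOKEN;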

So until the upgrade, all outputs with a 0xd0 prefix are practically equivalent to OP_RETURN outputs, and can reasonably be pruned from the UTXO set at upgrade time. (In fact, implementations could reasonably prune lots of other provably unspendable UTXOs from their databases, but in most other cases that probably wouldn’t be worth the effort, makes UTXO commitments harder to standardize, etc.)

After that, there shouldn’t be any need to keep track of “fake token” outputs. While it still requires some activation logic, at least node implementations don’t have to pay a cost after the upgrade block.

One caveat with this strategy (and if we go with it, this should be in the spec): any token transactions prepared in advance of the upgrade should use locktime to ensure the transaction isn’t included in a pre-upgrade block. (Even if the transaction is broadcast after the new rules are expected to be in effect, it’s still possible for a backdated re-org to burn those funds.) Of course, creating tokens doesn’t require significant funds, so many users might not care if their token-creating dust gets burned by a malicious re-org (especially people creating tokens in the first few blocks, many of which will just be upgrade lock-in transactions). But it’s worth mentioning for completeness.

Related, I think this is also the right strategy for handling “improperly encoded token prefixes” after activation. If PREFIX_TOKEN is followed by an invalid token encoding, we can’t really assume the locking bytecode was intending to encode tokens (and, e.g., attempt to somehow slice off the invalid token prefix to allow the BCH dust to be spent). Instead, it’s sending dust to a provably unspendable output, and can just be treated like any OP_RETURN output. (The transaction would be non-standard anyway due to the unrecognized contract type, so in practice this would only happen if a miner deliberately mined the nonstandard transaction.)


You can’t “just prune” things from the UTXO set following a certain rule when they’re later slated to be spendable again - the “pruning because unspendable” needs an activation in itself, or else you get a consensus failure.

2 Likes