Native Group Tokenization

It might be nice to have a thread dedicated to discussing proposals for implementing a built-in token protocol on Bitcoin Cash. I think @andrewstone’s Group Tokenization (prev. OP_GROUP) is the main public proposal currently, but if anyone knows of competing proposals, this may be a good place to discuss.

There’s some related discussion here (comparing Group Tokens to the more “userland” CashTokens idea):

However, it seems very possible a future BCH might support both 1) “parent transaction introspection” in the VM system (which can be used for CashTokens, but is also important for e.g. parallelizing covenant applications) and 2) a universal, reliable, built-in token system. Assuming it’s possible to do both, maybe this is a good place to collect discussion on (2).


A good starting argument I’ve seen in favor of a “built-in” protocol is @andrewstone’s comment here:

Disadvantages of implementing tokens in the blockchain script:

  1. Safety: every token can be implemented differently, so really, before holding a token, every person should review that code (which may be closed source).
  2. Scalability (CPU): no matter how much more efficient the blockchain scripting language gets, it’s unlikely to exceed the efficiency of native code. Additionally, the native code can leverage miner consensus and therefore “do” a lot less – in this case just a simple sum-and-compare of token inputs and outputs.
  3. Scalability (Space): the scripts that implement and enforce token constraints will take up space in transactions.

[…]

So let’s talk product philosophy, because excepting BTC with its 1st mover advantages, blockchains are effectively products competing for users. Here’s a key product design idea:

If you can’t be better in all respects than an established competitor, then at least be different.

Why? Because your differences will mean that you are better at some things (while being worse at others). Ideally of course, you want to be better at MORE things than you are worse at. But even if you are worse at more things, your different product can find applications that specifically need what you are better at.

This is the philosophy I think we should use with Bitcoin Cash. Versus Bitcoin, we’ve made it worse on low-end machines/internet connections to make it better at handling large volumes of transactions. Versus Ethereum, we shouldn’t make it a (worse) general-purpose computer (and no, we are never going to slowly build BCH Script into something that can outperform the ETH VM, which IIRC is going to leverage WebAssembly). Instead, let’s make it really good at handling crypto-financial computing. The way to do this is to have powerful primitives for that task. This is why I want super-efficient native tokens, rather than implementing them at the scripting layer.

Some initial comments and questions:

I definitely see the safety-value of reliable standards – we don’t want users to have to re-validate the implementation of every standardizable token.

I think a “userland” protocol design like SLP covers this, though. And I’m much more confident that bad design decisions in a userland protocol can be rectified – e.g. Ethereum now has ERC20, ERC721, ERC777, ERC1155, and more. If we needed to improve our standard (as Ethereum has demonstrated is likely), we’d need to do so on the consensus level every time, rather than allowing that experimentation to happen in less-dangerous userland.

Also, I expect the most important kinds of tokens are extremely hard to generalize: synthetic assets, “VoteCoins” in Hivemind-like prediction markets, permissions and shares in liquidity pools for BCH DEXs, etc.

These types of tokens need to interact with complex covenants, and I’m not confident that the current Group Tokenization spec applies to these cases. Group Tokenization’s Script Templates and Contract Encumbered Groups seem aimed at this, but those solutions look like a much more complex way to accomplish goals similar to CashToken’s parent transaction validation. (@andrewstone, would you disagree?)

I’d almost prefer the Group Tokenization proposal focus more directly on the more easily standardizable token types: fixed-supply tokens, custodial stablecoins, loyalty tokens, etc. I think there’s a case to be made that inductive proofs are overkill for these applications, and a simple, miner-validated token scheme could save bandwidth (the CPU and Space Scalability points above).

2 Likes

Group tokens just add token semantics to an output. That output can still be constrained by ANY implementable smart contract. I believe that many different asset types and semantics are implementable with a Group token primitive and a smart contract.

Today, implementing covenants (outputs that put constraints on child scripts) is very hacky and painful via CDS. Script Templates and Contract Encumbered Groups make covenants simple. The group author just sets a flag bit in the group, and miners enforce that grouped outputs have the same script constraint as a grouped input. Since that flag bit is part of the 32-byte group identifier, covenants are implemented with no additional information in the transaction. This is very space- and CPU-efficient compared to CDS.
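To make the rule concrete, here is a minimal sketch of the check (all type names and the flag-bit position here are illustrative assumptions, not the actual BU code):

#include <array>
#include <cstdint>
#include <vector>

using GroupId = std::array<uint8_t, 32>;
using Script = std::vector<uint8_t>;

struct GroupedCoin
{
    GroupId group;           // 32-byte group identifier
    Script constraintScript; // the output's script template
};

// Assumed convention: one bit of the group id marks the group as
// contract-encumbered (the exact bit position is made up here).
static bool IsCovenantGroup(const GroupId &id) { return id[31] & 0x01; }

// Each covenant-grouped output must carry the same script constraint as
// some grouped input of the same group; no extra transaction data needed.
bool CheckEncumberedGroup(const std::vector<GroupedCoin> &ins,
                          const std::vector<GroupedCoin> &outs)
{
    for (const GroupedCoin &out : outs)
    {
        if (!IsCovenantGroup(out.group)) continue; // plain group: no covenant
        bool matched = false;
        for (const GroupedCoin &in : ins)
            if (in.group == out.group && in.constraintScript == out.constraintScript)
            {
                matched = true;
                break;
            }
        if (!matched) return false; // covenant violated
    }
    return true;
}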

2 Likes

Responding to: CashTokens & PMv3: fixed-size inductive proofs, transaction integer compatibility - #22 by andrewstone – cc @tom, @andrewstone

It has been well documented, implemented, and even ported to and deployed on the ION cryptocurrency.

Is there any more technical documentation on the serialization of group token transactions? Or maybe can you link directly to the implementation and some test vectors?

Is IIP0002 the only documentation of the ION implementation?

Here are unit tests and functional test files:

The current implementation uses the OP_GROUP instruction, so serialization is exactly like that of a normal transaction. Here is a grouped P2SH output construction:

script << group.bytes() << SerializeAmount(amt) << OP_GROUP << OP_HASH160 << ToByteVector(dest) << OP_EQUAL;
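To unpack that: the script pushes the 32-byte group id and the serialized token quantity, then OP_GROUP (which the script machine treats as a no-op marker), followed by the ordinary P2SH template. A reader-side sketch, assuming only simple direct pushes (real parsing must also handle the PUSHDATA opcodes, and the OP_GROUP value below is a placeholder):

#include <cstddef>
#include <cstdint>
#include <vector>

enum : uint8_t { OP_GROUP = 0xee }; // placeholder value for illustration

struct GroupAnnotation
{
    std::vector<uint8_t> groupId;  // 32-byte group identifier
    std::vector<uint8_t> quantity; // serialized token amount
    bool present = false;
};

// If a scriptPubKey begins with <groupId> <quantity> OP_GROUP, the output
// is grouped; otherwise it is a normal BCH output.
GroupAnnotation ParseGroup(const std::vector<uint8_t> &script)
{
    GroupAnnotation ann;
    std::size_t pos = 0;
    auto readPush = [&](std::vector<uint8_t> &out) -> bool {
        if (pos >= script.size()) return false;
        uint8_t len = script[pos]; // direct push opcodes 0x01..0x4b only
        if (len == 0 || len > 0x4b || pos + 1 + len > script.size()) return false;
        out.assign(script.begin() + pos + 1, script.begin() + pos + 1 + len);
        pos += 1 + len;
        return true;
    };
    if (!readPush(ann.groupId) || ann.groupId.size() != 32) return ann;
    if (!readPush(ann.quantity)) return ann;
    if (pos < script.size() && script[pos] == OP_GROUP) ann.present = true;
    return ann;
}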

I was not directly involved in the port to ION, so I do not know about their documentation. For sure, should Group Tokenization become a candidate for inclusion in BCH, an implementation specification would need to be created that details the specific implementation decisions that affect consensus (like the order and format of the group id and the token quantity). But since the changes are actually minimal, this would not be a very large document.

2 Likes

Today, implementing covenants (outputs that put constraints on child scripts) is very hacky and painful via CDS. Script Templates and Contract Encumbered Groups make covenants simple.

The OP_CHECKSIG + OP_CHECKDATASIG hack is definitely inefficient, but I think that inefficiency is well-solved by native introspection opcodes (like these – which effectively cost nothing to implement in the current VM). Native introspection opcodes are important for a lot of other applications, so I think they have a good chance of being deployed in the near future anyways.

If native introspection opcodes were introduced, would you still expect Script Templates and Contract Encumbered Groups to be worth including in a native token upgrade? Or would you consider cutting scope?

1 Like

The Group Tokenization document is the original document to propose introspection opcodes, AFAIK.

I think that we’d need to come up with a specific proposal that implements covenants with a fragment of BCH script and then compare its particulars with Group Tokenization’s contract encumbered groups. If the BCH script fragment efficiently meets the functionality, then sure, cutting scope is fine.

One problem that I can anticipate in writing that fragment is how to identify which outputs go with which covenant in mixed multi-token-type and BCH transactions. Also, without loops it’s hard to check all outputs.

PS: I added a few questions to your cashtokens gist…

1 Like

Maybe a good example would be a Contract Encumbered Groups version of the “depository corporation” in the CashToken demo?

The user pays 10,000 satoshis to get a token. They can transfer that token to a 2-of-3 multisig, then back to a single sig, then deposit the token and get back their 10,000 satoshis from the parent covenant.

(The description from the demo:)

This template provides scripts for each stage of a simplified “depository” CashToken Corporation, where users may deposit 10,000 satoshis in exchange for a CashToken. This implementation supports exactly 4 homogeneous tokens. Tokens are tracked in a 2-level Merkle tree:

    rt
   /  \
  b0   b1
 / \   / \
a0 a1 a2 a3

The covenant begins with no depositors (all leaves start as the hash of 0) and a dust balance. A user deposits 10,000 satoshis, receiving a CashToken (recorded at a0) secured by a single key. Then the CashToken is “notarized” to prove it is a descendant of the mint transaction. Once notarized, the owner transfers the CashToken to a 2-of-3 multisig for stronger security. From there, the owner transfers the token to a different owner (now single sig). Finally, the new owner redeems the CashToken with the corporation, withdrawing their 10,000 satoshi deposit, and replacing a0 in the tree with the hash of 0 (making the leaf available to other depositors).
[…]

(CashToken Demo)
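For reference, the tree arithmetic in the demo is tiny: with four leaves there are only three hashes to compute. A sketch (the hash function here is a toy stand-in; the demo defines the real one):

#include <array>
#include <cstddef>
#include <cstdint>

using Hash = std::array<uint8_t, 32>;

// Toy stand-in so the sketch is self-contained; the demo hashes the
// concatenation of the two children with a real cryptographic hash.
Hash H(const Hash &l, const Hash &r)
{
    Hash out{};
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = static_cast<uint8_t>((l[i] * 31u) ^ (r[i] + i));
    return out;
}

// rt = H(b0, b1), where b0 = H(a0, a1) and b1 = H(a2, a3).
Hash MerkleRoot4(const std::array<Hash, 4> &a)
{
    return H(H(a[0], a[1]), H(a[2], a[3]));
}

// Redeeming replaces one leaf with the hash of 0 and recomputes rt; the
// covenant verifies the depositor's two-element Merkle proof against rt.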

Andrew, in response to you saying that there is no extra complexity for the UTXO set in GROUP (or did you mean OP_GROUP?): maybe you can explain how a miner knows the ‘tokenid’ of an output.

Please link to some actual documentation of your proposal that helps readers understand.

You linked above to a unit test, but that only has an OP_GROUP, and in interpreter.cpp the OP_GROUP implementation is missing (it doesn’t do any checking). I can’t find any details on how this is supposed to work.

Please help us understand why you claim that the UTXO will not have any negative scaling impact from one or both of your group/op_group proposals.

2 Likes

@andrewstone pinging you about my post above, you likely never got a notification about the question 🙂

@tom You said this:

The OP_GROUP and maybe the GROUP one too (I mirror Jason’s request for some forum or documentation on it) have a rather severe problem: they add a second 32-byte identifier to every UTXO entry. As the current lookup is just a single 32-byte entry (and maybe an output-index), this is a rather large difference that will have an instant and negative effect on scaling.

There is no second lookup. The lookup (key) remains “just a single 32-byte entry (and maybe an output-index)”. Once a full node has retrieved the UTXO, there is additional data (which is part of the script): the group id. A miner never needs to look up UTXOs by Group ID. Miners just look up the UTXOs in a transaction the normal way…
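In other words, the shape of the database does not change; only the stored value grows. Schematically (field names are illustrative, not the actual BU structures):

#include <array>
#include <cstdint>
#include <vector>

struct UtxoKey
{
    std::array<uint8_t, 32> txid; // the same single 32-byte lookup entry
    uint32_t vout;                // output index
};

struct UtxoValue
{
    int64_t satoshis;
    std::vector<uint8_t> scriptPubKey; // a grouped output carries its 32-byte
                                       // group id here, inside the script
};

// Validation fetches UtxoValue by UtxoKey exactly as today; consensus never
// needs an index keyed by group id.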

There is lots of extensive documentation, which has been linked many times. If you are serious about a re-assessment, I will dig it all up and link it again.

The OP_GROUP implementation is exactly correct in interpreter.cpp. No checking is needed in the script machine. The OP_GROUP opcode simply identifies data that is treated in a special manner.

To implement “first class” tokens, group enforcement cannot be part of script execution… that would give script authors the option to enforce token quantity or not. Instead, quantity is guaranteed across a transaction, just like it is for the native token (hence the name “first class”), unless the transaction has spent mint or melt authorities.

The code that enforces Group semantics is here: src/consensus/grouptokens.cpp · NextChain · Andrew Stone / BCHUnlimited · GitLab. It’s the single function CheckGroupTokens, and it returns true if the transaction is valid with respect to Group properties, and false if it’s invalid.

1 Like

Please, do. What is the homepage for your project? I think it helps the discussion of native tokens if each project that has done work in this direction shares its online docs.

The point of having a ‘homepage’ is that blogs tend to get outdated as proposals change. A simple technical spec that doesn’t have any “this changed since last version” stuff in there is needed for people to actually understand the current version. Please help me understand.

You also link to src/consensus/grouptokens.cpp · NextChain · Andrew Stone / BCHUnlimited · GitLab, which does a second lookup… notice the UTXO view being passed into the method.

Granted, it just does that to get the output script that is being spent, so this looks like something that could be fixed with the right architecture. But you wrote “group enforcement cannot be part of script execution”. Is that the reason you didn’t manage to eliminate this expensive lookup?

Thank you for linking to the code; it’s quite a lot of it, I don’t think it’s simple. And from casual inspection it looks a little like there is no way to set ‘amount’. Meaning that you need to burn satoshis to have tokens. That means they are colored coins, not full tokens. Did I get that right?

which does a second lookup… notice the UTXO view being passed into the method.

No, that is a view CACHE. There is no second lookup; the data is accessed from the cache.

And from casual inspection it looks a little like there is no way to set ‘amount’. Meaning that you need to burn satoshis to have tokens. That means they are colored coins, not full tokens. Did I get that right?

No, there’s a token amount: tokenGrp.quantity

Thank you for linking to the code; it’s quite a lot of it, I don’t think it’s simple.

Not much code is simple when you casually glance at it. It’s an entire token implementation in one function that at its heart really is a simple set of “for” loops tallying up grouped quantities in inputs and outputs and then comparing them. I’m not a miracle worker.

The version I linked also implements subgroups, authorities, and script templates (covenants), so complexity beyond simple tokens exists. If BCH doesn’t want those features (or wants a phased implementation) then a lot of code can be removed.
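For readers who want the gist without opening the file, the balancing core reduces to something like this (types simplified and names illustrative; the real CheckGroupTokens also handles the authority, subgroup and template features just mentioned, and must guard against quantity overflow):

#include <cstdint>
#include <map>
#include <vector>

using GroupId = std::vector<uint8_t>;

struct GroupedAmount
{
    GroupId group;     // empty for plain (ungrouped) BCH inputs/outputs
    uint64_t quantity; // token quantity carried by this input or output
};

bool CheckGroupTokensSketch(const std::vector<GroupedAmount> &inputs,
                            const std::vector<GroupedAmount> &outputs)
{
    std::map<GroupId, uint64_t> inTally, outTally;
    for (const auto &in : inputs)
        if (!in.group.empty()) inTally[in.group] += in.quantity;
    for (const auto &out : outputs)
        if (!out.group.empty()) outTally[out.group] += out.quantity;
    // With no mint or melt authority spent, every group must balance exactly.
    return inTally == outTally;
}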

1 Like

If you have it cached, that’s great. But that is an implementation detail, something that can (and probably will) change as the block size grows.
Notice it doesn’t bypass the hotzone.
The design may not introduce a scaling bottleneck, but your proof-of-concept code absolutely does.

Thank you for providing many more details; please get back to us on the documentation of your ideas!

I don’t know what you are talking about with terms like “hotzone”.

This is a perfect cache (nothing is automatically removed). The data is guaranteed to be in it once it’s been loaded. This is the normal behaviour of that object, which is part of the Satoshi codebase, so I think you ought to be familiar with it. This cache is loaded during normal transaction validation. So that’s the one lookup needed. It will not overrun memory because the cache is exclusively for the inputs and processing of this one transaction.

The design may not introduce a scaling bottleneck, but your proof-of-concept code absolutely does.

It does not. Full node evaluation time for a transaction before Group Tokenization is C * O(N), where N is the size of the transaction+prevouts (and C frankly is quite large, given the fairly careless approach to serialization in much of Satoshi’s code base). In other words, it’s linear in the size of the transaction+prevouts. Group Tokenization adds a few more passes over the transaction+prevouts data, so it adds a small constant to C, i.e. the whole process is (g+C) * O(N).

These additional passes to implement Group are not even needed if one desired to make a very efficient implementation: the “Group” information can be collected during normal tx validation. However, I wanted to isolate the Group logic into its own function for clarity and ease of analysis, so in this implementation of Groups the code does its own pass over the transaction data.

I don’t know how many times I can say the same thing and yet you still disbelieve me. Rather than make repeated incorrect assumptions that things were implemented stupidly, you could ask questions or study the code. But if you are simply going to disbelieve what I say, then asking me questions will not help you; you’ll have to study the code.

1 Like

The part that is protected by a memory lock, typically a mutex. Adding a hotzone to a codepath lowers its ability to gain from multi-threading.

Perhaps. I’m a bit worried, as your checking code clearly does call methods that lock mutexes on UTXO classes. You may be right that this is not an issue; I can’t tell without testing.

The longer-term scaling of BCH is essential, and we all know from experience and many years of testing that, to not shoot ourselves in the foot, we must not accept consensus changes that destroy our ability to massively multi-thread transaction validation, and we also know that we must avoid taxing the UTXO set more, because the UTXO db is the one bottleneck in our wish to multi-thread validation.

I’m a bit sad that you react so dismissively, in a “trust me or read the code” manner, about this. I pointed out what the worries are, where specifically they lie, and what the goals are (I repeated the goals in the paragraph above).

I want to remind you that it is up to you to convince the rest of the ecosystem that your solution is something that we want to reimplement in our own nodes, that this is a solution that wallets want to write code for, and that miners want to burn cycles on. Without your push, nothing happens (and history shows me right, right?)

And I don’t want you pointing to code that (as shown above) causes more confusion. Why are you unwilling to document the basic concepts? If you can’t be bothered to put up a homepage with a clear description of what it is you are selling, then how on earth do you think you can convince all those people to invest time in it?

I mean, it might be cool, but with your attitude I’ll never learn more about it.

1 Like

@Tom I’m sorry to be a little short. If you are truly interested in Group tokens and pursuing first class tokens for BCH I’m super happy to help.

It’s just that in my imagination of how this might work, we’d first “all” (or many of us) agree that first class tokens are a priority. Then the token authors would go off and prepare something. Then they’d present, and we’d pick one.

Instead what’s happening here is you are repeatedly making incorrect assertions that I need to run off and correct. This is inherently antagonistic, and I have to wonder if it’s worth the time, because I don’t know your commitment, or Bitcoin Cash’s as a whole, to adding first class tokens. Is this just idle chatter for you, or do you believe that first class tokens are important for BCH? And in essence, wouldn’t it be better if you asked a question rather than made a claim?

1 Like

I see some problems with that approach, the main one being that one team preparing something on its own isn’t the best way to get the best solution, because it lets bad assumptions go on for far too long. Take the approach you took of making the code easy to read, separated into one file: you now see that I actually get worried about what may simply be implementation details of the BU UTXO implementation.

Instead what we are aiming for (and this is the basic concept behind the network discussions) is a shorter lead time. You have an idea, you talk about it with other devs, you have some dirty proof-of-concept, you present it. Small iterations and repeated feedback from your peers. Then when it gets big enough, you start to include the customers too in this feedback cycle.

This has several benefits, the most obvious being that we challenge the assumptions one always makes about these things early on. Different viewpoints are good, and the earlier in the design process, the better. It also has the rubber-duck benefit (Rubber duck debugging - Wikipedia). And naturally you get the benefit of talking to people smarter than oneself.

These short loops are, incidentally, the basis of most software development approaches, from agile to XP. Their literature can explain it better if you want to dive in.

This is equally frustrating for me: I want to understand the high level first, and you give me the implementation details, which are unique to BU’s codebase! To get the actual high-level approach out of that is… difficult.

Notice that people are reading along; this thread got linked on reddit just yesterday. My personal opinion is indeed that native tokens (in general) are far superior to SLP, but the important part here is that our talking is seen by all, and my worry about scaling is something many more people worry about. While native tokens are important, the money use case is still the root of our chain. Because if bch-is-cash fails, the entire chain fails, and it leads us down the road of extremely high fees like BTC and ETH.

And in the end all your work is for naught if we can’t convince the wider ecosystem that this design should be activated on the network. And that goes back to my earlier suggestion of how to approach this with smaller steps and more devs talking between themselves. When you explain to me (and those reading along) how it actually works and I understand it, it becomes easier to support your work. And soon you have a growing wave of support with many people willing to help. See my writeup, The story of how BCH’s 2020 DAA came to be, on how this helps immensely towards actually activating something like Group.

The fact is that Jason started doing this, and in mere weeks has more support than you. Because devs won’t support a proposal they don’t understand.

2 Likes

Hi all, I’d like to help here.

I will do my best to understand Group Tokens, and then to present them in a way that Tom requires. Any pointers to where I can find the relevant documentation would be helpful.

How I see it, Tom cares about the process a lot and refuses to dig deeply into content if it’s not presented according to the process. That’s fine and understandable. We’re doing peer review, but we need to fit it into a form so peers will want to do the reviewing. So there are 2 ways to go about this: nag Tom to look into it even though it’s not presented in the way he requires, or help Andrew present it according to spec. I’d like to help by helping Andrew present it. I started some discussions on Reddit, and that’s all good for bringing attention, but not for packaging it into the form the process requires and moving things forward here.

I will likely need to bother people to explain to me things that need explaining, and I’ll find channels to do that where it doesn’t increase the noise here.

Here’s what I have so far.

Somebody has to do the math for tokens to function. I believe it would be better for the whole ecosystem if miners did it. Do we want such tokens, or not? Because if we don’t want miners to do some arithmetic in a scalable way, we’re stuck with SLP, and that isn’t really taking off, because it lacks a competitive advantage and has other issues to do with the hacky way it had to be implemented, which couldn’t be avoided given the historical context. Now we are here, and we could move forward. SLP now competes with other blockchains, so what would be the problem if it also competed with a solution on the same blockchain, and if that solution were better than other blockchains? The ecosystem would benefit, and maybe we’d attract more users, adoption and talent instead of having them build elsewhere.

This is not a technical argument, but still an important one. Do we want it? Why do we want it? Who will have to “pay” for it? Is the price acceptable for what we’d be getting? What do the miners think about paying that CPU price? What about the opportunity cost of not implementing it? Anyway…

Below is how I addressed some concerns on Reddit. I think that’s a start, and I hope I got it right and readable for laymen, but if not, please correct me.

Processing a BCH transaction today can use CPU time that is bounded by a linear function of transaction size.

Linear scaling would continue to hold even if Group Tokens were introduced. Right now I take Andrew’s claim(s) at face value, but it would be nice if others verified it at this stage, and it would be a must to verify it were it to be included in the HF.

A peer-review process. But we have to get the peers to review. Seems like Andrew could use some help motivating his peers to review, or presenting his work in a reviewable way. That’s what prompted me to get the ball rolling.

This includes database operations, since each of these can be performed in linear time and the number of these is bounded linearly by the size of the transaction.

Isn’t it bounded by the number of outputs, though? Size can mean different things. SLP tokens add to the size of transactions, do they not? Anything in the OP_RETURN adds to the size, and that hasn’t been a problem, even though anyone is allowed to put whatever they want in there to increase the size of a transaction, even now.

Thing is, with OP_RETURN, miners don’t have to do anything with that data, so it only increases the kilobytes of data passed around; it doesn’t add cycles. More outputs, linearly more time; bigger outputs (more data in each), less than linearly more. Why? Because miners have to doSomething() with every output, whether it contains a token transaction or not. That doSomething() is quite big: they have to verify the signatures, perform crypto math, etc. Every output increases the number of times a miner has to call doSomething() by 1. Adding Group Tokens doesn’t increase that number. It adds a little basic bookkeeping math inside the doSomething() function. It piggy-backs on something miners have to do anyway, and then, when all the doSomething() calls are finished, it checks the signatures + this group-token running total and says whether the TX is valid or not.
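To illustrate the piggy-backing (toy pseudocode in C++, reusing the doSomething() naming from above; none of this is the actual BU code):

#include <cstdint>
#include <map>
#include <vector>

struct Output { std::vector<uint8_t> group; uint64_t tokenQty; /* ... */ };

using Tally = std::map<std::vector<uint8_t>, uint64_t>;

// The per-output work miners already do; the group bookkeeping is the one
// extra addition at the end.
void doSomething(const Output &out, Tally &tally)
{
    // ... existing validation: scripts, signatures, value checks ...
    if (!out.group.empty())
        tally[out.group] += out.tokenQty; // the little bookkeeping math
}

// After all doSomething() calls, the input-side and output-side tallies are
// compared once per transaction to accept or reject it.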

Every Group Token TX is also a BCH TX because it has to pay the fee. This is no different from some TX which amounts to 0 BCH and has something in the OP_RETURN: it spends the entirety of its inputs on fees and gets that piece of data written on the blockchain. The only difference is that the OP_RETURN TX doesn’t ask the miner to add a few numbers to check whether the data in the OP_RETURN makes sense. Today, SLP users/nodes do that math.

Somebody has to do the math for tokens to function. The argument is: it would be better for the whole ecosystem if miners did it.

Processing time would continue to be bounded by a linear function, with a slightly altered slope. That’s it. That’d be the cost of processing, if my understanding is correct.

Moreover any changes should not significantly increase the size of the UTXO database.

Why shouldn’t it increase just a little, though? And how do we define what is little and what is significant? We aren’t the stakeholders here. It’s not our CPU and RAM that will have to process this, so we should really be asking nodes & miners that.

Would you work just a little extra to have the best simple token operations on the market? That’s what we have to ask nodes & miners. There are always trade-offs. Will refusing to take a little sacrifice now prevent BCH from achieving its potential? There’s an opportunity cost involved, which grows every day we’re not taking action. Users will enjoy the benefit for “free”; we’re not the stakeholders. All we have to do is pay the fee if we want to use it. Users will also enjoy the benefit of adoption by both other users and new developers who may come to use our first class tokens. And having proper tokens should help there. They will all pay for the services provided by our blockchain through BCH fees.

There are two limiting possible UTXO implementations: a “fast” one that keeps database entries in RAM hashed by identifier; and a “cheap” one that stores database entries on random access storage.

We’re far behind the hardware. Changing the slope of linear scaling won’t make us get ahead of the hardware just like that, if ever. Maybe this could be argued better, but I’m not equipped with the arguments right now.

Any need for locking adds huge difficulty to the designer of node software.

Agreed. Thing is, it looks like Group Tokens aren’t locking, at least that’s what I see from recent discussions. If, when you process each output, you have to call doSomethingWithOutput(), then this function can be executed for each output in parallel. Then you have to tally the outputs of the transaction, and again we add a little math to doSomethingWithTXes(). This one has to wait for all output processors to finish, which it has to do anyway because it has to tally the BCH balance of the TX. So no extra locks there, either.

Where are the locks?
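Schematically, it is an embarrassingly parallel map over outputs followed by one sequential merge, the same join point that already exists for tallying the BCH balance. A sketch (not any node's actual code):

#include <cstdint>
#include <future>
#include <map>
#include <vector>

struct Output { std::vector<uint8_t> group; uint64_t tokenQty; };

using PartialTally = std::map<std::vector<uint8_t>, uint64_t>;

// Independent per-output work: no shared state, so no mutex needed here.
PartialTally doSomethingWithOutput(const Output &out)
{
    PartialTally t;
    // ... the usual per-output validation work would happen here ...
    if (!out.group.empty()) t[out.group] += out.tokenQty;
    return t;
}

// Runs the per-output work in parallel, then merges the partial tallies
// sequentially after all tasks finish (the single existing join point).
PartialTally doSomethingWithTXes(const std::vector<Output> &outputs)
{
    std::vector<std::future<PartialTally>> tasks;
    for (const Output &out : outputs)
        tasks.push_back(std::async(std::launch::async, doSomethingWithOutput, out));
    PartialTally total;
    for (auto &task : tasks)
        for (const auto &kv : task.get())
            total[kv.first] += kv.second;
    return total;
}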

2 Likes

So I’m working on something higher-level, and in the spirit of this guideline I invite anyone interested to get involved!
Here’s the working doc: https://www.reddit.com/r/btc/comments/lkb7ez/we_want_first_class_tokens_on_bch_part_2n_and_we/

1 Like

@andrewstone if you’re willing to dig it up, I’d love to see what I missed. I want to prepare some kind of compendium about group tokens… here’s a start

Timeline

2017-10-16 Intro

https://medium.com/@g.andrew.stone/bitcoin-scripting-applications-representative-tokens-ece42de81285

2017-11-19 BUIP

https://github.com/BitcoinUnlimited/BUIP/blob/master/077.md

2017-11-22 Chris Pacia’s explanation

https://web.archive.org/web/20191106134854if_/https://www.yours.org/content/colored-coins-in-bitcoin-cash-b26804e05964/

2018-05-21 Group Tokenization document

https://docs.google.com/document/d/1X-yrqBJNj6oGPku49krZqTMGNNEWnUJBRFjX7fJXvTs/edit

2021-01-27 Jonathan Toomim comments

https://youtu.be/GbS7Q4pb-yg?t=3017

2021-02-03 Jason’s thread

https://bitcoincashresearch.org/t/native-group-tokenization/278

2021-02-12 BU Poll

https://twitter.com/BitcoinUnlimit/status/1360297716757245956

2021-02-13 George and Andrew Interview

https://www.reddit.com/r/btc/comments/liflma/bitcoin_cash_needs_a_big_tech_project_to_show_it/

2021-02-13 We want first class tokens on BCH! Part 1/N, motivation

https://www.reddit.com/r/btc/comments/liz5xf/we_want_first_class_tokens_on_bch/

2021-02-15 We want first class tokens on BCH! Part 2/N, scalability, stakeholders

https://www.reddit.com/r/btc/comments/lkb7ez/we_want_first_class_tokens_on_bch_part_2n_and_we/