Native Group Tokenization

Responding to: CashTokens & PMv3: fixed-size inductive proofs, transaction integer compatibility - #22 by andrewstone – cc @tom, @andrewstone

It has been well documented, implemented, and even ported to and deployed on the ION cryptocurrency.

Is there any more technical documentation on the serialization of group token transactions? Or maybe can you link directly to the implementation and some test vectors?

Is IIP0002 the only documentation of the ION implementation?

Here are unit tests and functional test files:

The current implementation uses the OP_GROUP instruction, so serialization is exactly like a normal transaction. Here is a grouped P2SH output construction:

script << group.bytes() << SerializeAmount(amt) << OP_GROUP << OP_HASH160 << ToByteVector(dest) << OP_EQUAL;

I was not directly involved in the port to ION, so I do not know about their documentation. Certainly, should Group Tokenization become a candidate for inclusion in BCH, an implementation specification would need to be created that details the specific implementation decisions that affect consensus (like the order and format of the group id and the token quantity). But since the changes are actually very minimal, this would not be a very large document.


Today, implementing covenants (outputs that put constraints on child scripts) is very hacky and painful via CDS. Script Templates and Contract Encumbered Groups make covenants simple.

The OP_CHECKSIG + OP_CHECKDATASIG hack is definitely inefficient, but I think that inefficiency is well-solved by native introspection opcodes (like these – which effectively cost nothing to implement in the current VM). Native introspection opcodes are important for a lot of other applications, so I think they have a good chance of being deployed in the near future anyways.

If native introspection opcodes were introduced, would you still expect Script Templates and Contract Encumbered Groups to be worth including in a native token upgrade? Or would you consider cutting scope?


The Group Tokenization document is the original document to propose introspection opcodes, AFAIK.

I think that we’d need to come up with a specific proposal that implements covenants with a fragment of BCH script and then compare its particulars with Group Tokenization’s contract encumbered groups. If the BCH script fragment efficiently meets the functionality, then sure, cutting scope is fine.

One problem that I can anticipate in writing that fragment is how to identify which outputs go with which covenant in mixed multi-token-type and BCH transactions. Also, without loops it’s hard to check all outputs.

PS: I added a few questions to your cashtokens gist…


Maybe a good example would be a Contract Encumbered Groups version of the “depository corporation” in the CashToken demo?

The user pays 10,000 satoshis to get a token. They can transfer that token to a 2-of-3 multisig, then back to a single sig, then deposit the token and get back their 10,000 satoshis from the parent covenant.

(The description from the demo:)

This template provides scripts for each stage of a simplified “depository” CashToken Corporation, where users may deposit 10,000 satoshis in exchange for a CashToken. This implementation supports exactly 4 homogeneous tokens. Tokens are tracked in a 2 level Merkle tree:

   /  \
  b0   b1
 / \   / \
a0 a1 a2 a3

The covenant begins with no depositors (all leaves start as the hash of 0) and a dust balance. A user deposits 10,000 satoshis, receiving a CashToken (recorded at a0) secured by a single key. Then the CashToken is “notarized” to prove it is a descendant of the mint transaction. Once notarized, the owner transfers the CashToken to a 2-of-3 multisig for stronger security. From there, the owner transfers the token to a different owner (now single sig). Finally, the new owner redeems the CashToken with the corporation, withdrawing their 10,000 satoshi deposit, and replacing a0 in the tree with the hash of 0 (making the leaf available to other depositors).

(CashToken Demo)

Andrew, in response to you saying that there is no extra complexity for the UTXO set in GROUP (or did you mean OP_GROUP?): maybe you can explain how a miner knows the ‘tokenid’ of an output.

Please link to some actual documentation of your proposal that helps readers understand.

You linked above to a unit test, but that only has an OP_GROUP, and in interpreter.cpp the OP_GROUP implementation is missing (it doesn’t do any checking). I can’t find any details on how this is supposed to work.

Please help us understand why you claim that the UTXO will not have any negative scaling impact from one or both of your group/op_group proposals.


@andrewstone pinging you about my post above; you likely never got a notification about the question 🙂

@tom You said this:

The OP_GROUP, and maybe the GROUP one too (I mirror Jason’s request for some forum or documentation on it), have a rather severe problem: they add a second 32-byte identifier to every UTXO entry. As the current lookup is just a single 32-byte entry (and maybe an output-index), this is a rather large difference that will have an instant and negative effect on scaling.

There is no second lookup. The lookup key remains “just a single 32-byte entry (and maybe an output-index)”. Once a full node has retrieved the UTXO, there is additional data (which is part of the script) which is the group id. A miner never needs to look up UTXOs by Group ID; miners just look up the UTXOs in a transaction the normal way…

There is lots of extensive documentation, which has been linked many times. If you are serious about a re-assessment, I will dig it all up and link it again.

The OP_GROUP implementation is exactly correct in interpreter.cpp. No checking is needed in the script machine. The OP_GROUP opcode simply identifies data that is treated in a special manner.

To implement “first class” tokens, group enforcement cannot be part of script execution… that would give script authors the option of enforcing token quantity or not. Instead, quantity is guaranteed across a transaction, just as it is for the native token (hence the name “first class”), unless the transaction has spent mint or melt authorities.

The code that enforces Group semantics is here: src/consensus/grouptokens.cpp · NextChain · Andrew Stone / BCHUnlimited · GitLab. It’s the single function CheckGroupTokens; it returns “true” if the transaction is valid with respect to Group properties, and “false” if it’s invalid.


Please, do. What is the homepage for your project? I think it helps the discussion of native tokens if each project that has done work in this direction shares its online docs.

The point of having a ‘homepage’ is that blogs tend to get outdated as proposals change. A simple technical spec that doesn’t have any “this changed since last version” stuff in there is needed for people to actually understand the current version. Please help me understand.

You also link to src/consensus/grouptokens.cpp · NextChain · Andrew Stone / BCHUnlimited · GitLab which does a second lookup… Notice the passing in of the utxo view in the method.

Granted, it just does that to get the output script that is being spent, so this looks like something that could be fixed with the right architecture. But you wrote that “group enforcement cannot be part of script execution”; is that the reason you didn’t manage to eliminate this expensive lookup?

Thank you for linking to the code; it’s quite a lot of it, and I don’t think it’s simple. And from casual inspection it looks a little like there is no way to set an ‘amount’, meaning that you need to burn satoshis to have tokens. That would make them colored coins, not full tokens. Did I get that right?

which does a second lookup… Notice the passing in of the utxo view in the method.

No, that is a view CACHE. There is no second lookup; the data is accessed from the cache.

And from casual inspection it looks a little like there is no way to set an ‘amount’, meaning that you need to burn satoshis to have tokens. That would make them colored coins, not full tokens. Did I get that right?

No, there’s a token amount: tokenGrp.quantity

Thank you for linking to the code; it’s quite a lot of it, and I don’t think it’s simple.

Not much code is simple when you casually glance at it. It’s an entire token implementation in one function that, at its heart, really is a simple set of “for” loops tallying up grouped quantities in inputs and outputs and then comparing them. I’m not a miracle worker.

The version I linked also implements subgroups, authorities, and script templates (covenants), so there is complexity beyond simple tokens. If BCH doesn’t want those features (or wants a phased implementation), then a lot of code can be removed.


If you have it cached, that’s great. But that is an implementation detail, something that can (and probably will) change as the block size grows.
Notice it doesn’t bypass the hotzone.
The design may not introduce a scaling bottleneck, but your proof-of-concept code absolutely does.

Thank you for providing many more details, please get back to us on the documentation of your ideas!

I don’t know what you are talking about with terms like “hotzone”.

This is a perfect cache (nothing is automatically removed). The data is guaranteed to be in it once it’s been loaded. This is the normal behaviour of that object, which is part of the Satoshi codebase, so I think you ought to be familiar with it. This cache is loaded during normal transaction validation, so that’s the one lookup needed. It will not overrun memory because the cache is used exclusively for the inputs and processing of this one transaction.

This may not introduce a scaling bottleneck, but your code proof of concept absolutely does.

It does not. Full-node evaluation time for a transaction before Group Tokenization is C * O(N), where N is the size of the transaction+prevouts (and C frankly is quite large, given the fairly careless approach to serialization in much of Satoshi’s codebase). In other words, it’s linear in the size of the transaction+prevouts. Group Tokenization adds a few more passes over the transaction+prevouts data, so it adds a small constant g to C; i.e. the whole process is (g+C) * O(N).

These additional passes to implement Group are not even needed if one desired a very efficient implementation: the Group information can be collected during normal tx validation. However, I wanted to isolate the Group logic into its own function for clarity and ease of analysis, so in this implementation of Groups the code does its own pass over the transaction data.

I don’t know how many times I can say the same thing and yet you still disbelieve me. Rather than make repeated incorrect assumptions that things were implemented stupidly, you could ask questions or study the code. But if you are simply going to disbelieve what I say, then asking me questions will not help you; you’ll have to study the code.


The part that is protected by a memory lock, typically a mutex. Adding a hotzone to a codepath lowers its ability to gain from multi-threading.

Perhaps. I’m a bit worried because your checking code clearly does call methods that lock mutexes on UTXO classes. You may be right that this is not an issue; I can’t tell without testing.

The longer-term scaling of BCH is essential, and we all know from experience and many years of testing that, to not shoot ourselves in the foot, we must not accept consensus changes that destroy our ability to massively multi-thread transaction validation, and we also know that we must avoid taxing the UTXO set further, because the UTXO db is the one bottleneck in our wish to multithread validation.

I’m a bit sad that you react so dismissively and in a “trust me or read the code” manner about this. I pointed out what the worries are, where specifically they arise, and what the goals are (I repeated the goals in the paragraph above).

I want to remind you that it is up to you to convince the rest of the ecosystem that your solution is something we want to reimplement in our own nodes, that wallets want to write code for, and that miners want to burn cycles on. Without your push, nothing happens (and history shows me right, right?).

And I don’t want you pointing to code that (as shown above) causes more confusion. Why are you unwilling to document the basic concepts? If you can’t be bothered to put up a homepage with a clear description of what you are selling, then how on earth do you think you can convince all those people to invest time in it?

I mean, it might be cool, but with your attitude I’ll never learn more about it.


@Tom I’m sorry to be a little short. If you are truly interested in Group tokens and in pursuing first-class tokens for BCH, I’m super happy to help.

It’s just that in my imagination of how this might work, we’d first “all” (or many of us) agree that first-class tokens are a priority. Then the token authors would go off and prepare something. Then they’d present, and we’d pick one.

Instead, what’s happening here is that you are repeatedly making incorrect assertions that I need to run off and correct. This is inherently antagonistic, and I have to wonder if it’s worth the time, because I don’t know your commitment, or Bitcoin Cash’s as a whole, to adding first-class tokens. Is this just idle chatter for you, or do you believe that first-class tokens are important for BCH? And in essence, wouldn’t it be better if you asked a question rather than made a claim?


I see some problems with that approach, the main one being that one team preparing something on its own isn’t the best way to get the best solution, because it lets bad assumptions go on for far too long. Take the approach you took of making the code easy to read, separated into one file: you now see that I actually get worried about what may simply be implementation details of the BU UTXO implementation.

Instead, what we are aiming for (and this is the basic concept behind the network discussions) is a shorter lead time. You have an idea, you talk about it with other devs, you have some dirty proof-of-concept, you present it. Small iterations and repeated feedback from your peers. Then, when it gets big enough, you start to include the customers in this feedback cycle too.

This has several benefits, the most obvious being that we challenge the assumptions one always makes about these things early on. Different viewpoints are good, and the earlier in the design process, the better. It also has the rubber-duck benefit (Rubber duck debugging - Wikipedia). And naturally you get the benefit of talking to people smarter than oneself.

These short loops are, incidentally, at the basis of most software development approaches, from agile to XP. They can explain it better if you want to dive in.

This is equally frustrating for me: I want to understand the high level first, and you give me the implementation details, which are unique to BU’s codebase! Getting the actual high-level approach out of that is… difficult.

Notice that people are reading along; this thread got linked on Reddit just yesterday. My personal opinion is indeed that native tokens (in general) are far superior to SLP, but the important part here is that our conversation is seen by all, and my worry about scaling is something many more people share. While native tokens are important, the money use case is still the root of our chain, because if bch-is-cash fails, the entire chain fails, and that leads us down the road of extremely high fees like BTC and ETH.

And in the end all your work is for naught if we can’t convince the wider ecosystem that this design should be activated on the network. That goes back to my earlier suggestion of approaching this with smaller steps and more devs talking among themselves. When you explain to me (and those reading along) how it actually works and I understand it, it becomes easier to support your work. And soon you have a growing wave of support, with many people willing to help. See my writeup of The story of how BCH’s 2020 DAA came to be on how this helps immensely towards actually activating something like Group.

The fact is that Jason started doing this, and in mere weeks has more support than you, because devs won’t support a proposal they don’t understand.


Hi all, I’d like to help here.

I will do my best to understand Group Tokens, and then to present them in a way that Tom requires. Any pointers to where I can find the relevant documentation would be helpful.

The way I see it, Tom cares about the process a lot and refuses to dig deeply into content if it’s not presented according to that process. That’s fine and understandable. We’re doing peer review, but we need to fit it into a form that peers will want to review. So there are two ways to go about this: nag Tom to look into it even though it’s not presented in the way he requires, or help Andrew present it according to spec. I’d like to help by helping Andrew present it. I started some discussions on Reddit, and that’s all good for bringing attention, but not good for packaging it into the form the process requires and moving things forward here.

I will likely need to bother people to explain to me things that need explaining, and I’ll find channels to do that where it doesn’t increase the noise here.

Here’s what I have so far.

Somebody has to do the math for tokens to function. I believe it would be better for the whole ecosystem if miners did it. Do we want such tokens, or not? Because if we don’t want miners to do some arithmetic in a scalable way, we’re stuck with SLP, and that isn’t really taking off, because it lacks a competitive advantage and has other issues to do with its hacky implementation, which couldn’t be avoided given the historical context. Now we are here and we can move forward. SLP already competes with other blockchains, so what would be the problem if it also competed with a solution on the same blockchain, and if that solution were better than other blockchains? The ecosystem would benefit, and maybe we’d attract more users, adoption, and talent instead of having them build elsewhere.

This is not a technical argument, but still an important one. Do we want it? Why do we want it? Who will have to “pay” for it? Is the price acceptable for what we’d be getting? What do the miners think about paying that CPU price? What about opportunity cost of not implementing it? Anyway…

Below is how I addressed some concerns on Reddit. I think that’s a start, and I hope I got it right and readable for laymen; if not, please correct me.

Processing a BCH transaction today can use CPU time that is bounded by a linear function of transaction size.

Linear scaling would continue to hold, even if Group Tokens were introduced. Right now I take Andrew’s claim(s) at face value, but it would be nice if others verified it at this stage, and it would be a must to verify were it to be included in the HF.

A peer-review process. But we have to get the peers to review. It seems like Andrew could use some help motivating his peers to review, or presenting his work in a reviewable way. That’s what prompted me to get the ball rolling.

This includes database operations, since each of these can be performed in linear time and the number of these is bounded linearly by the size of the transaction.

Isn’t it bounded by the number of outputs, though? Size can mean different things. SLP tokens add to the size of transactions, do they not? Anything in the OP_RETURN adds to the size, and that hasn’t been a problem, even though anyone is allowed to put in whatever they want to increase the size of a transaction, even now.

Thing is, with OP_RETURN, miners don’t have to do anything with that data, so it only increases the size in kilobytes of data passed around; it doesn’t add cycles. More outputs, linearly more time. Bigger outputs because of more data in each, less than linear. Why? Because miners have to doSomething() with every output, whether it contains a token transaction or not. That doSomething() is quite big: they have to verify the signatures, perform crypto math, etc. Every output increases the number of times a miner has to call doSomething() by 1. Adding Group Tokens doesn’t increase that number. It adds a little basic bookkeeping math inside the doSomething() function. It piggybacks on something miners have to do anyway, and then, when all the doSomething() calls have finished, checks the signatures plus this group-token running total and says whether the TX is valid or not.

Every Group Token TX is also a BCH TX, because it has to pay the fee. This is no different from some TX that amounts to 0 BCH and has something in the OP_RETURN: it spends the entirety of its inputs on fees and gets that piece of data written on the blockchain. The only difference is that the OP_RETURN TX doesn’t ask the miner to add a few numbers to check whether the data in the OP_RETURN makes sense; today, SLP users/nodes do that math.

Somebody has to do the math for tokens to function. The argument is that it would be better for the whole ecosystem if miners did it.

Processing time would continue to be bounded by a linear function, with a slightly altered slope. That’s it. That’d be the cost of processing, if my understanding is correct.

Moreover any changes should not significantly increase the size of the UTXO database.

Why shouldn’t it increase just a little, though? And how do we define what is little and what is significant? We aren’t the stakeholders here. It’s not our CPU and RAM that will have to process this, so we should really be asking nodes & miners that.

Would you do just a little extra work to have the best simple-token operations on the market? That’s what we have to ask nodes & miners. There are always trade-offs. Will refusing a small sacrifice now prevent BCH from achieving its potential? There’s an opportunity cost involved, which grows every day we don’t take action. Users will enjoy the benefit for “free”; we’re not the stakeholders. All we have to do is pay the fee if we want to use it. Users will also enjoy the benefit of adoption by other users and by new developers who may come to use our first-class tokens, and having proper tokens should help there. They will all pay for the services provided by our blockchain through BCH fees.

There are two limiting possible UTXO implementations: a “fast” one that keeps database entries in RAM hashed by identifier; and a “cheap” one that stores database entries on random access storage.

We’re far behind the hardware. Changing the slope of linear scaling won’t put us ahead of the hardware just like that, if ever. Maybe this could be argued better, but I’m not equipped with arguments right now.

Any need for locking adds huge difficulty to the designer of node software.

Agreed. Thing is, it looks like Group Tokens don’t require locking, at least that’s what I see from recent discussions. If, when you process each output, you have to doSomethingWithOutput(), then this function can be executed for each output in parallel. Then you have to tally the outputs of the transaction, and again we add a little math to doSomethingWithTXes(). This one has to wait for all the output processors to finish, which it has to do anyway, because it has to tally the BCH balance of the TX. So no extra locks there, either.

Where are the locks?


So I’m working on something higher level, and in the spirit of this guideline I invite anyone interested to get involved!
Here’s the working doc


@andrewstone if you’re willing to dig it up, I’d love to see what I missed. I want to prepare some kind of compendium about group tokens… here’s a start


2017-10-16 Intro

2017-11-19 BUIP

2017-11-22 Chris Pacia’s explanation

2018-05-21 Group Tokenization document

2021-01-27 Jonathan Toomim comments

2021-02-03 Jason’s thread

2021-02-12 BU Poll

2021-02-13 George and Andrew Interview

2021-02-13 We want first class tokens on BCH! Part 1/N, motivation

2021-02-15 We want first class tokens on BCH! Part 2/N, scalability, stakeholders

Looks like I missed quite a bit of discussion over Valentine’s Day weekend! Going to try to compress responses into one post:

Group Tokenization Needs a Spec

@tom and others have been extremely generous with their time, deeply reviewing this topic and several other proposals in this forum.

I don’t think this is about process: Group Tokenization simply does not have a specification. It has a 43 page, mostly-prose, Google Doc with no public edit history and very little provided rationale about specific technical decisions. (Edit: In fact, I think it was edited as I wrote this comment? it is now 44 pages.)

Huge segments of the Group Tokenization proposal don’t seem to be formally specified at all: “group authority UTXOs”, authority “capabilities”, “subgroup” delegation, the new group transaction format, “script templates”, and “script template encumbered groups” – all of these are described in prose, but the reader is left to guess about important details. In my review, I tried to assume the best in each case, but I can see why others find that frustrating.

After this discussion, I was reasonably convinced that “native” tokens could be implemented without negatively impacting scaling. I’m not yet sure whether the latest Group Tokenization proposal does so successfully, but we’ll see once a draft specification exists. (I think earlier versions of the proposal did impact scaling, but I can’t verify without a history of spec changes.) Regardless, a complete specification would be the best way to put all these fears to rest.

@andrewstone: would you consider developing an “implementation-ready” specification like PMv3? We need some concise, details-only document of the precise changes and any relevant test vectors. It’s valuable to include rationale, but please provide it in a truncatable “Rationale” section at the end. I think it’s also very important that the specification be source-controlled in a Git repo.

Unanswered Concerns

We still don’t have answers for some of the concerns I mentioned at the beginning of this thread:

So, @andrewstone:

  1. How are you confident that the current Group Tokenization proposal is “complete” and doesn’t need to first be tested in a “userland” system like CashTokens? Or do you expect future upgrades can correct any deficiencies we discover after this Group Tokenization proposal is rolled out?
  2. Can you provide any full example of “group tokens” interacting with covenants?

On (2), I posed a small challenge a couple of weeks ago:

I’m asking this again because in reading your recent comparison of Group Tokens and CashTokens on Reddit, you still seem to think “CashTokens” is an alternative to Group Tokens. I want to suggest: the most interesting use cases allowed by parent transaction introspection (“CashTokens”) are not possible to achieve with Group Tokens. Group Tokens need parent transaction introspection too. I think if you try to implement this covenant (as might be used by a side-chained prediction market) you will find that you need a solution like hashed witnesses.

Unless someone can demonstrate how Group Tokens might avoid the need for hashed witnesses, my current preference is that we first implement a smaller change like PMv3 in May 2022. This would give end-user-developers complete flexibility in implementing creative new token designs at no risk to the network. (And PMv3/hashed witnesses are important for use cases other than tokens, so they will remain valuable regardless of whatever “token solution” ultimately sees the most adoption.)

After some miner-validated token standards emerge on the market, we could eventually choose one like this Group Tokenization proposal to “bless” as the standard (e.g. in May 2023). With good real-world examples, we’d be able to more effectively evaluate features and tradeoffs. It seems premature to optimize specific token use cases – using new, permanent consensus changes and data structures – before we can quantify the value of those optimizations.


That “spec” doc is more than a spec: it is a roadmap with multiple options considered, full of usage examples, etc.

Step 1 of the roadmap is simple: it involves a new opcode, OP_GROUP, with simple verification requirements. Yes, the doc writes that an older spec required looking around, but the current one does not, so I think it scales well and avoids the problems Toomim was talking about.

Actual manipulation of the tokens is achieved by cleverly using the spec above; it’s not part of the consensus-touching spec, but it will be part of a userland spec, a user manual. How to do X? How to mint, how to melt, how to change an authority, etc. That all happens in userland; OP_GROUP simply enables such solutions.

The most basic implementation is so simple it’s beautiful, and I think that’s confusing everyone, because we don’t know what we’re discussing: step 1, or something down the line that may or may not come after step 1. That’s where a spec for just step 1 will come in handy, which I volunteer to write once I’ve gathered enough knowledge.

From previous correspondence and reading about how group tokens are supposed to work: you just attach a special message to each output. It’s so simple it’s beautiful, because it uses BCH for double-spend checks (every group TX is also a BCH TX); then, when it’s time to check the TX balance, you piggyback on the reads, writes, and deletes you have to do anyway for BCH. Once you’ve read a group UTXO, you add the group amount to some cache, i.e. running sum(s); once you’ve deleted it, you subtract from the sum; and at the end you check the balance of both BCH and whatever group tokens you encountered and give a yes or no for the TX. It’s as simple as that. There are no extra loops and no looking at other TXOs; all the magic happens inside the existing loops we use to check the BCH balance. I’d be happy if Andrew could confirm my understanding.

Now, if you want to be a group-token block explorer, only then do you have to index the tokens, but that can be done separately; that’s userspace, not a node/mining job. You can’t have tokens without someone doing the work; the question is who does which part of the job to have them. With SLP it’s all userland. With group tokens, miners will do a tiny but important part: the basic accounting equation.
