CHIP 2021-05 Bounded Looping Operations

Hi all, I’m re-proposing this for activation in 2026. No changes to the technical specification, just cleaned up background info and removed the note that it’s waiting on CHIP 2021-05 Targeted Virtual Machine Limits (now part of the 2025 upgrade :rocket:)

CHIP:

I posted some discussion here:

https://x.com/bitjson/status/1867470832030855335

5 Likes

Nice! It’s missing a note on the interaction with VM limits: of course, each executed opcode inside a loop should add to the running opcode and hashing cost, and if that exceeds the budget the TX would fail.

Good approach to limit total unrolled size to 10k, so we can’t have a situation where a loop ends up executing 100k cheap ops and bloats the script while still staying under the total op cost budget.
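
For illustration (my own example, not from the CHIP): a degenerate countdown like this would evaluate a handful of cheap opcodes a hundred thousand times, so the running operation cost would exceed the per-input budget long before the loop finishes and validation would simply fail:

<100000>                        // Counter
OP_BEGIN
  OP_1SUB                       // Decrement the counter
  OP_DUP OP_NOT                 // Exit once the counter reaches zero
OP_UNTIL
OP_DROP OP_1                    // Clean up, leave true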

5 Likes

Yes, thanks! Added an issue here: Drop `Repeated Bytes` · Issue #5 · bitjson/bch-loops · GitHub

We can actually drop the Repeated Bytes concept altogether: potential abuse is fully handled by the VM limits, and there are plenty of important algorithms in the zero-knowledge proof and post-quantum world that will require more (but computationally cheap) instructions. (I wanted to get the other proposals out yesterday and didn’t want to make any technical changes to loops without a changelog and updated test vectors. Working on this next.)

2 Likes

Any need for something to break out of loops early? Like OP_BREAK?

3 Likes

Ok, thank you all for your patience here, the CHIP is updated:

  • The Repeated Bytes counter is eliminated (the VM Limits CHIP already sets much stricter limits than Repeated Bytes ever did)
  • Links to implementations and test vectors are added
  • A changelog is now included, with this version marked as v1.2.0

Also updated in Libauth and Bitauth IDE’s next branches (already live on https://ide.bitauth.com), and the test vectors and performance benchmarks have been updated/expanded.

It should be possible to rewrite any program that uses OP_BREAK (and/or OP_CONTINUE) into something using e.g. OP_IF/OP_ELSE/OP_ENDIF. So OP_BREAK wouldn’t enable new use cases; it would only save a few bytes per loop here and there.
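
For example, a hypothetical sketch (mine, not from the CHIP): summing UTXO values but “breaking” once the running total exceeds some threshold just folds the break condition into the OP_UNTIL check:

<0> <0>                               // index, total
OP_BEGIN
  OP_OVER OP_UTXOVALUE OP_ADD         // Add UTXO value at index to total
  OP_SWAP OP_1ADD OP_SWAP             // Increment index
  OP_DUP <100000> OP_GREATERTHAN      // "Break" condition: total exceeds 100,000 sats
  <2> OP_PICK OP_TXINPUTCOUNT
  OP_GREATERTHANOREQUAL               // Normal condition: all inputs visited
  OP_BOOLOR                           // Exit the loop on either condition
OP_UNTIL
OP_NIP                                // Drop index, leaving the (possibly partial) total

The early exit costs one extra comparison and an OP_BOOLOR rather than a dedicated OP_BREAK codepoint.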

Most opcodes we’ve added so far are enabling new use cases and/or saving hundreds/thousands of bytes, so comparatively it’s hard to justify taking up additional codepoints to save a few bytes in a small subset of contracts. (Though I wouldn’t necessarily be against OP_BREAK and/or OP_CONTINUE if someone were to demonstrate their value.)

I’ll add a mention in rationale, thanks!

4 Likes

True. Codepoints are very limited, and if we use them all up we’ll be forced to go with 2-byte codepoints … which is also potentially wasteful.

2 Likes

Small issue:
There’s still a reference to a Repeated Bytes counter under “Technical Specification”

1 Like

Ah – just removed that outdated reference, thank you!

1 Like

What influenced the choice of naming? I see that in other Forths out there, it’s DO...LOOP and/or other keywords that are used, never BEGIN...UNTIL.

My problem is with the choice of OP_BEGIN which I feel is not indicative of a loop.

In older languages like Pascal, BEGIN just meant the start of a code block (like the curly brace { in C or JavaScript).

I wonder whether OP_BEGIN is unexpected as the keyword for the start of a loop, since it defies expectations compared to other languages… perhaps?

Maybe OP_LOOP...OP_UNTIL would be a better choice of names?

Or, do you envision re-using OP_BEGIN to signify some other construct in the future?

2 Likes

Hmm. Never mind. I actually see there are some Forths out there that do use exactly BEGIN...UNTIL. So this does keep with the tradition of at least some Forths. See: Simple Loops (Gforth Manual)

I rescind my previous comment.

1 Like

I want to add to the discussion that loops would make array data-structure abstractions much more useful in high-level languages compiled to BCH script.

‘For each element in the array, do x.’ Currently, variable-sized array abstractions are not really possible.

IMO the use cases in the CHIP should be updated. Different people seem to have the understanding that this functionality isn’t really required by contract authors in practice yet. I strongly disagree, but the CHIP is very much lacking in this regard, as the use cases are described in a very abstract way and haven’t been updated since 2021…

It would also be good to refer to the native introspection CHIP to emphasize that aggregation was expected to be delivered in this way:

While it would be possible to define aggregate operations like OP_TXINPUTVALUE and OP_TXOUTPUTVALUE to make this derived state accessible to all contracts, this proposal considers standardization of any derived state to be premature

Now there’s also aggregation of CashTokens in the in- and outputs, so it saves having to introduce at least 4 different opcodes.
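
To make it concrete, a rough sketch (mine, using the existing CashTokens introspection opcodes) of summing the fungible token amounts across all outputs with a loop:

<0> <0>                                 // index, total
OP_BEGIN
  OP_OVER OP_OUTPUTTOKENAMOUNT OP_ADD   // Add token amount of output at index (0 if it has no tokens)
  OP_SWAP OP_1ADD OP_SWAP               // Increment index
  OP_OVER OP_TXOUTPUTCOUNT OP_EQUAL
OP_UNTIL                                // Loop until index equals the output count
OP_NIP                                  // Drop index, leaving the total fungible token amount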

Further, it would become possible to construct tx.hashOutputs and tx.hashPrevouts like we had before native introspection. This could be useful so that a multi-step contract can commit to later payouts, like demoed in my ‘upgraded sha gate’ contract.

7 Likes

Seems like an obvious include to me. Every programming language basically has/needs loops; I don’t know why BCH script would be any different. As long as it’s limited by VM Limits & fees (same as everything else), I can’t see a problem.

I agree with Mathieu that some more examples with CashTokens would be helpful.

If we need examples of use cases, consider that the BLISS Jessica/Velma CashTokens will have additional functionality (eventually), and we’re going to need to loop over them.

Loop over BLISS CashTokens:
if (commitment <= 100) then --do Jessica stuff--
else if (commitment <= 200) then --do Velma stuff--
else if (commitment <= 300) then --do ...2026 upgrade mascot stuff--

That kind of thing.

3 Likes

Regarding the use cases, I’d like to add that looping operations are especially useful for having variable-length array types in high-level languages like CashScript. Then you can do operations on each element of the array.

With the recent discussion of BitCANN to create a domain name system on BCH, an issue has been how to enforce that UTF‑8 hex strings contain only allowed characters. Ideally this would be done with a looping construction performing a check on each character of the string.
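
A rough sketch (mine, not from BitCANN or the CHIP) of what such a check could look like, assuming the name bytes are on the stack and only lowercase a-z is allowed:

// Stack: [name]                     // Non-empty string to validate
OP_BEGIN
  <1> OP_SPLIT                       // Split off the first character
  OP_SWAP                            // [rest, char]
  OP_DUP <97> OP_GREATERTHANOREQUAL  // char >= 'a'
  OP_SWAP <122> OP_LESSTHANOREQUAL   // char <= 'z'
  OP_BOOLAND OP_VERIFY               // Fail immediately on a disallowed character
  OP_SIZE <0> OP_NUMEQUAL            // Done when nothing remains
OP_UNTIL
OP_DROP                              // Drop the empty remainder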

As mentioned above, loops also allow emulating math constructions like exponentiation with variable exponents.
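
For example, a small sketch (mine) of computing base^exponent by repeated multiplication, assuming the exponent is at least 1:

// Stack: [base, exponent]
<1>                               // Result accumulator
OP_BEGIN
  <2> OP_PICK OP_MUL              // result = result * base
  OP_SWAP OP_1SUB OP_SWAP         // Decrement exponent
  OP_OVER <0> OP_NUMEQUAL         // Exit once the exponent reaches zero
OP_UNTIL
OP_NIP OP_NIP                     // Drop exponent and base, leaving the result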

I think loops would be especially useful for implementing ZK-cryptography operations in script

5 Likes

Thanks for the comments everyone! Just a bump to note that the CHIP now links the BCHN implementation and directly commits the updated test vectors (now covering a few thousand more cases and benchmarks :rocket:)

6 Likes

After chatting with other developers who are reviewing this CHIP: once people get the hang of Forth-style loops, they quickly notice the artificially heavy “cost” of each instruction vs. its actual compute time.

The oldest snippet in the CHIP (from 2021) is actually the perfect example:

<0> <0>
OP_BEGIN                            // Loop (re)starts with index, value
  OP_OVER OP_UTXOVALUE OP_ADD       // Add UTXO value at index to total
  OP_SWAP OP_1ADD OP_SWAP           // Increment index
  OP_OVER OP_TXINPUTCOUNT OP_EQUAL
OP_UNTIL                            // Loop until index equals the TX input count
OP_NIP                              // Drop index, leaving sum of UTXO values

This loop steps over each UTXO and sums up their values: literally just adding small numbers in a tight loop, which all of today’s node implementations power through many times faster than e.g. hashing or signature checking of the same “cost”.

However, the “Base Instruction Cost” of 100 means that every single pass over every single instruction – even unexecuted instructions inside OP_IF/OP_ENDIF blocks – costs nearly half of a SHA256 hash iteration.

In practice that means this chunk of bytecode only “pays for its own compute” up to ~10 UTXOs. Base Instruction Cost absorbs almost all of the budget, with a tiny sliver taken by the addition itself.
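
Back-of-the-envelope: the loop body above evaluates roughly ten instructions per iteration (the nine between OP_BEGIN and OP_UNTIL, plus OP_UNTIL itself), so at a base instruction cost of 100 each pass adds about 1,000 to the operation cost (roughly five hash digest iterations’ worth of budget), before the OP_ADD’s own tiny arithmetic cost is even counted.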

Of course, this was a careful choice in the VM Limits upgrade: Rationale: Selection of Base Instruction Cost. In short: even if all of our review had missed some specific performance issue before the May 2025 upgrade – in any one of the node implementations – the worst impact couldn’t have magnified attacks by more than 10x.

Since our measured safety margin was 10x to 100x, the base instruction cost of 100 was conservative enough to give us a complete, additional layer of protection – between 1x and 10x more conservative than the specific-costs layer – even if our review had missed a critical vulnerability.

Anyways, all this to say: I think the loops CHIP should also reduce Base Instruction Cost another level, from 100 to 10, to be more in line with actual performance.

For context, note that BTC’s Taproot upgrade established an implicit Base Instruction Cost of 1 – meaning 100x lower than ours – such that performance issues are much more likely to become real DoS vectors, especially in alternative implementations (not to mention the technical debt that would be incurred to support loops or functions).

In our case, even though the 2025 upgrade is live and no further performance issues were found, I still think it’s wise to iterate slowly – lowering to a base instruction cost of 10 retains at least a 10x margin of safety over the BTC VM, and wouldn’t impact our current 10-100x margin of safety (see the benchmarks here – various fast-instruction-evaluation benchmarks on BCHN are so fast that none made the worst-cases graph; you’ll have to look at the CSVs).

Why should the Loops CHIP lower Base Instruction Cost?

I want to note that “lowering base instruction cost” is very specific to the loops CHIP: it doesn’t really make sense unless you’re doing cheap, tight iteration, and the only way 99% of contract developers will encounter this unusual behavior is when using a simple loop to do something they think should be very cheap (and they’re right) – e.g. summing the value of more than 10 UTXOs.

In general, we don’t want tooling like CashScript or AlbaDsl to have to deal with this weird edge case (which would look like this), particularly since its primary, network-defensive purpose has passed.

On a more practical level, it’s also not reasonable to thoroughly benchmark a base instruction cost “lowering” without loops. With loops, we can simply benchmark that every single instruction inside a tight loop still costs much less than hashing (which again, is itself far, far faster than signature checking), and we’re literally doing that benchmarking this year: as part of the Loops CHIP.


I’m leaving the cost at 100 for now in Bitauth IDE’s 2026 testing preview, so you can easily see for yourself how silly the limit is with various kinds of fast loops.

Sorry for the length, and sorry I haven’t summarized all this in the CHIP yet. (PRs welcome!)

I’m trying to wrap up a different (post-quantum) project before taking some personal time.
(Note that the post-quantum project isn’t related to this change, and doesn’t need a lower base instruction cost. The primary beneficiaries of this change will be new developers and tooling maintainers, who won’t have to learn or develop workarounds for these edge cases.)

For now, I just pushed the one-sentence spec change here: https://github.com/bitjson/bch-loops/commit/9c1f55d9625e210c9c6eff113c3fa34049b586a5

Any comments or feedback appreciated, either here or in the repo. Thanks!

5 Likes

Please keep separate proposals for very different components in separate CHIPs.

Changing the cost of the administrative system affects ALL of the opcodes; as such, it would be confusing to mix those different concepts with this proposal.
People that don’t follow this CHIP would be very surprised at the cross-over effect elsewhere.

As we see the same issue with an unrelated CHIP proposing changes to the number of bytes allowed in a token UTXO, I’d argue a simple “update misc limits” CHIP would be a good way to combine them and not have a load of CHIPs.

2 Likes

This is a very good idea. The changes from the P2S proposal that fall outside the P2S part, plus the loops cost change, would make a nice package to provide as a separate VM-Limits-cleanup-related CHIP.

Great to see this very practical consideration being taken into account! Indeed if summing the values of 10 inputs causes you to run into the VM limits then this defeats much of the practical purpose of loops in the first place!

Even though the CHIP was first proposed in 2021, it’s this interaction with the VM limits that defines how usable loops in script will be in practice.

Totally agree on this point, and I think it makes total sense for it to be part of this CHIP.

Indeed, the last thing we want is for the average CashScript dev to have to become an expert at budgeting ‘opcode cost’ just to be able to use loops normally inside their contracts. Up to now the VM limits have been in the background as protection against excessive resource consumption by attackers; we want to keep it this way… in the background.

The analysis in the VM limits CHIP makes it clear this base cost was chosen with orders of magnitude of safety margin. Loops change how contract developers would run into these limits, so it makes sense to revisit this parameter as part of this CHIP; the alternative is to have pretty much all developers who use the looping operations complaining about this base cost (and, by extension, about the way loops were introduced/designed).


I’m somewhat frustrated by the immediate bikeshedding responses (about structure instead of contents), which totally ignore how fundamentally both aspects are related.
The VM limits CHIP was always clear that this base cost was expected to be lowered over time; it was designed with this in mind:

Because base instruction cost can be reduced - but not increased - in future upgrades without invalidating contracts, a base cost of 100 is more conservative than lower values (like 10, 1, or 0). Additionally, a relatively-high base instruction cost leaves room for future upgrades to differentiate the costs of existing low and fixed-cost instructions, if necessary.

5 Likes

Indeed, the last thing we want is for the average CashScript dev to have to become an expert at budgeting ‘opcode cost’ just to be able to use loops normally inside their contracts. Up to now the VM limits have been in the background as protection against excessive resource consumption by attackers; we want to keep it this way… in the background.

Completely agree with this; I think it makes sense to adjust the base cost in this CHIP.

I’d also like to further emphasize the importance of this CHIP. Beyond all the other applications, existing contract systems would have been much smaller and better designed if loops had already existed.

BitCANN, OpenCashDAO, Power Function, accumulation contracts, and loops’ future use in verifying proofs in ZKPs are just a few examples. It would be great/essential to have this CHIP included in the May 2026 upgrade with the adjusted base cost.

5 Likes