CHIP 2021-05 Bounded Looping Operations

bitjson · May 28, 2021, 6:20pm

The Bitcoin Cash VM is strictly limited to prevent maliciously-designed transactions from requiring excessive resources during transaction validation.

Loops were originally excluded from this design on the mistaken assumption that they risk consuming excessive validation resources. In reality, a small number of VM operations make up the majority of resource usage during transaction validation, and any implementation of bounded loops can be many orders of magnitude faster than the most pathological VM bytecode constructions.

Introducing this basic control flow structure would make BCH contracts significantly more powerful and efficient.

This proposal includes two new BCH virtual machine (VM) operations, OP_BEGIN and OP_UNTIL , which enable a variety of loop constructions in BCH contracts without increasing the processing or memory requirements of the VM.

The precise implementation is influenced by the BEGIN ... UNTIL indefinite loop construction in the Forth programming language (upon which the rest of the Bitcoin Cash VM is based). This particular construction offers maximal flexibility: definite loops can be easily emulated with these indefinite loops, but indefinite loops could not be emulated with definite loops.

Review and feedback are appreciated!

proteusguy · May 29, 2021, 6:19am

I support this improvement. Pretty close to what I had argued for when Bitcoin’s forth-like VM was first announced. Bounded non-determinism rather than unlimited Turing completeness. No need for gas. All leaving this out did was create huge transactions for unrolled fixed loops and eliminate any chance for flexible but safe dynamic runtime logic.

freetrader · May 29, 2021, 10:59am

I see there are some example cases in the CHIP around how this may be used.

What I did not find mentioned (maybe I overlooked something) is mention of looping – possibly with nesting – of some expensive operations like signature verifications, possibly in some degenerate scripts which may be small in terms of script size, run fine (even within the bounds of REPEATED_BYTES) but take a long time to execute and hence pose a new risk if they are not easily spotted ahead of execution time…

At the moment we have a transaction’s size as a proxy for its complexity – sigops which skew this are already subject to separate regulations. This makes is somewhat easier to spot potential “troublemakers”.

I think loops could pose a difficulty for keeping script verification performance well scalable (i.e. open up new DoS vectors) if they are not screened with some care.
Certainly this seems like a juicy CHIP to review in terms of possible pathologies.

tom · May 29, 2021, 4:54pm

Doesn’t this solve the issues you described? SigChecks protection Protocol Upgrade

freetrader · May 29, 2021, 7:01pm

At first glance, I think it would solve it for signature checks, yes. Thanks Tom.

proteusguy · May 30, 2021, 10:31am

Yes counting the difference in position between the begin/until against total script size limits (which actually penalizes code within the loop that may perform if branches) is a vastly cheaper method of cost accounting transaction executions than gas. Elegant and simple to implement. This is a really powerful improvement to bitcoinscript that only has upside in terms of expressiveness and physical tx size savings.

bitcoincashautist · February 19, 2022, 3:15pm

Great work! This proposal is well thought out and allows much more utility without significantly increasing node operating costs, and it would be great if it could make it into 2023 or 2024 upgrade window. Clever way to limit the “damage” by essentially interpreting it as a macro, it can’t expand to more than the equivalent “manual” looping 10,000b script. Paired with hash counter of the other CHIP, I think we can unlock a lot of utility at negligible or even 0 expense.

MathieuG · June 18, 2023, 10:19am

I think bounded loops would be a very nice addition to the BCH VM!
Now that Bitcoin Cash scriptability has become much more powerful with CashTokens it would be good to document the practical needs for “Simplifying Repeated Procedures” like merkle proof validation and “Aggregation” like aggregating value of all inputs. This way we can better understand its practical relevancy.

For example the workaround for a person adding multiple UTXOs to a smart contract and the contract aggregating the input values would be a simple consolidation beforehand. We already see this type of “consolidate UTXOs” button used for the Paytaca browser extension. Only In the case where the added inputs are controlled by different parties would this not be possible.

Any practical & pressing needs for merkle validation would also be good to be documented. Merkle root validation was an important part of pmv3 tokenization but luckily that complexity was avoided by the CashTokens CHIP which replaced it.

MathieuG · July 5, 2023, 11:50am

It the research topic “The need for additional Math opcodes” @bitcoincashautist discussed how bounded loops can be used to emulate missing math opcodes like pow & log.

Further in the topic it was discussed how bounded loops could be used to emulate higher order division together with op_muldiv

So although there’s no pressing need for aggregation/merkle root validation, bounded loops enable emulating opcodes like pow & log and enable higher precision math division (uint126)

bitjson · December 13, 2024, 7:51am

Hi all, I’m re-proposing this for activation in 2026. No changes to the technical specification, just cleaned up background info and removed the note that it’s waiting on CHIP 2021-05 Targeted Virtual Machine Limits (now part of the 2025 upgrade )

CHIP:

I posted some discussion here:

https://x.com/bitjson/status/1867470832030855335

bitcoincashautist · December 13, 2024, 9:03am

Nice! It’s missing a note on interaction with VM limits: of course each executed opcode inside loops should add to the running opcode and hash cost and if it exceeds the budget the TX would fail.

Good approach to limit total unrolled size to 10k, so we can’t have a situation where a loop ends up executing 100k cheap ops and bloats the script while still staying under the total op cost budget.

bitjson · December 13, 2024, 6:43pm

Yes, thanks! Added an issue here: Drop `Repeated Bytes` · Issue #5 · bitjson/bch-loops · GitHub

We can actually drop the Repeated Bytes concept altogether, potential abuse is fully handled by VM limits, and there are plenty of important algorithms in the zero-knowledge proof and post-quantum world that will require more (but computationally cheap) instructions. (Wanted to get the other proposals out yesterday and didn’t want to make any technical changes to loops without a changelog and updated test vectors. Working on this next.)

cculianu · December 14, 2024, 1:26pm

Any need for something to break out of loops early? Like OP_BREAK?

bitjson · December 19, 2024, 6:26am

Ok, thank you all for your patience here, the CHIP is updated:

The Repeated Bytes counter is eliminated (the VM Limits CHIP already sets much stricter limits than Repeated Bytes ever did)
Links to implementations and test vectors
now includes a changelog, with this version marked as v.1.2.0.

Also updated in Libauth and Bitauth IDE’s next branches (live already on https://ide.bitauth.com) and the test vectors and performance benchmarks have been updated/expanded.

bitjson · December 19, 2024, 6:38am

It should be possible to rewrite any program that uses OP_BREAK (and/or an OP_CONTINUE) into something using e.g. OP_IF/OP_ELSE/OP_ENDIF. So OP_BREAK wouldn’t enable new use cases, and it would only save a few bytes per loop here and there.

Most opcodes we’ve added so far are enabling new use cases and/or saving hundreds/thousands of bytes, so comparatively it’s hard to justify taking up additional codepoints to save a few bytes in a small subset of contracts. (Though I wouldn’t necessarily be against OP_BREAK and/or OP_CONTINUE if someone were to demonstrate their value.)

I’ll add a mention in rationale, thanks!

cculianu · December 19, 2024, 10:04am

True. Code points are very limited and if we use them all up we are forced to go with 2 byte code points … which is also potentially wasteful.

AsashoryuMaryBerry · December 26, 2024, 12:22am

Small issue:
There’s still a reference to a Repeated Bytes counter under “Technical Specification”

bitjson · December 29, 2024, 4:21am

Ah – just removed that outdated reference, thank you!

cculianu · January 27, 2025, 3:46pm

~~What influenced the choice of naming? I see that in other forth(s) out there, it’s DO.... LOOP and/or other keywords are used, never BEGIN....UNTIL.~~

~~My problem is with the choice of OP_BEGIN which I feel is not indicative of a loop.~~

~~In older languages like pascal, BEGIN just meant the start of a code block (like the curly brace { in C or JavaScript).~~

~~I wonder whether OP_BEGIN is unexpected as the keyword for the start of a loop … since it defies expectations as compared to other languages… perhaps?~~

~~Maybe OP_LOOP… OP_UNTIL would be a better choice of names?~~

~~Or, do you envision re-using OP_BEGIN to signify some other construct in the future?~~

cculianu · January 27, 2025, 3:46pm

Hmm. Nevermind. I actually see there are some forth(s) out there that do exactly BEGIN UNTIL. So this does keep with the tradition of at least some forths. See: Simple Loops (Gforth Manual)

I rescind my previous comment.