Hi all, I’m back from leave and catching up here.
Much of the past few weeks’ discussion about “code that writes code” misunderstands what is already practical on BCH.
Reenabling opcodes (2018) closed the 2011-era “static analysis” discussion for BCH
With regard to “static analysis”, please review “Aside: the 2011-era ‘static analysis’ debate” from April 23 and “Rationale: Non-Impact on Performance or Static Analysis” (in the CHIP since May 2). I’ll highlight an excerpt from the CHIP:
> While the Bitcoin VM initially had few limits, following the 2010 emergency patches, it could be argued that the expected runtime of a contract was (in principle) a function of the contract’s length, i.e. long contracts take longer to validate than short contracts.
>
> In practice, expensive operations (hashing and signature checking) have always dominated other operations by many orders of magnitude, but some developers still considered it potentially useful that contracts could “in principle” be somehow “reviewed” prior to execution, limiting wasted computing resources vs. “fully evaluating” the contract.
>
> As is now highlighted by more than a decade of disuse in node implementations and other software: such “review” would necessarily optimize for the uncommon case (an invalid transaction from a misbehaving peer) by penalizing performance in the common case (standard transactions), leading to worse overall validation performance and network throughput – even if validation cost weren’t dominated by the most expensive operations.
The 2018 restoration of disabled opcodes further reduced the plausibility of non-evaluated contract analysis by reenabling opcodes that could unpredictably branch or operate on hashes (`OP_SPLIT`, bitwise operations, etc.). For example, the pattern `OP_HASH256 OP_1 OP_SPLIT OP_DROP OP_0 OP_GREATERTHAN OP_IF OP_HASH256 ...` branches based on the result of a hash, obviating any further attempt to inspect without computing the hash. Note that simply “counting” `OP_HASH256` operations here also isn’t meaningful: valid contracts can rely on many hashing operations (e.g. Merkle trees), and the more performance-relevant digest iteration count of any evaluation depends on the precise input lengths of hashed stack items; the existence of hash-based branching implies that such lengths cannot be predicted without a vulnerability in the hashing algorithm.
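To make the branching concrete, here’s that same fragment annotated (a sketch; the trailing logic stays elided, as in the original):

```
// Sketch: a contract fragment whose control flow depends on a hash result.
// No analyzer can know which branch runs without actually computing the hash.
OP_HASH256            // replace the top stack item with its double-SHA256 digest
OP_1 OP_SPLIT         // split the 32-byte digest after its first byte
OP_DROP               // drop the remainder, keeping only the first byte
OP_0 OP_GREATERTHAN   // is that byte, read as a number, greater than zero?
OP_IF                 // branch chosen by the (unpredictable) hash output
    OP_HASH256        // e.g. hash again; analysis must evaluate to proceed
    // ... (further hash-dependent logic elided)
OP_ENDIF
```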
Later, the SigChecks upgrade (2020) resolved long-standing issues with the SigOps limit by counting signature checks over the course of evaluation rather than by attempting to naively parse VM bytecode. Finally, the VM limits (2025) upgrade retargeted all VM limits to similar density-based evaluation limits – limits that scale with the byte length of the evaluated code, so budgets grow with transaction size rather than being fixed per contract.
In summary, fast validation is a fundamental and intentional feature of the VM itself – critical for the overall throughput of the Bitcoin Cash network. Hypothetical “pre-validation” of contracts never offered improved performance, would have unnecessarily complicated all implementing software, and has been further obviated by multiple protocol upgrades in the intervening years.
We already have “code that writes code”
PSA: CashVM is Turing complete within atomic transactions following the CashTokens (2023) upgrade. This was a core motivation behind PMv3 and, ultimately, CashTokens itself – see “Proofs by Induction” and this later CashTokens announcement post.
In fact, BCH was arguably Turing complete following the 2018 upgrade (and more practically after 2019), but the lack of introspection and details of `OP_CHECKSIG`’s behavior caused transaction sizes to explode, quickly hitting limits after a few iterations; see the illustration in “Fixed-Size Inductive Proofs”.
Anyways, with CashTokens we can now efficiently continue execution across transaction inputs. Inputs can use introspection to reference each other, and it’s very practical to create complex graphs of contracts both within and across transactions. Jedex (2022) included some of the first published examples.
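For instance, here’s a minimal sketch of one input referencing a sibling input via introspection (the `<expected_bytecode>` placeholder is hypothetical, not from any published contract):

```
// Minimal sketch: an input that "references" a sibling input via introspection.
<0>                   // index of the sibling input to inspect
OP_UTXOBYTECODE       // locking bytecode of the UTXO being spent by input 0
<expected_bytecode>   // hypothetical: the sibling contract we require
OP_EQUALVERIFY        // abort unless input 0 spends exactly that contract
// ... remainder of this input's own validation logic
```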
So, PSA:
- Bitcoin Cash is Turing complete.
- It is already simple today to write “code that writes code”.
- It is already simple today to write code that executes “arbitrary code” provided at spend time.
Here’s a simple, practical example of “code that writes code”. This contract executes “arbitrary code” – inserted by the user’s wallet at spend time – within a single, atomic transaction. Again, this can already be done on mainnet BCH today, without any 2026 CHIPs. (Contracts and detailed comments are here, published July 1):
In the Quantumroot Schnorr + LM-OTS Vault, we defer specifying the quantum signing serialization algorithm until spending time. This maximizes privacy and efficiency:
- Vaults can support multiple serialization functions without publishing unused options to the blockchain, and
- Signing serialization algorithms and other vault behavior can be upgraded locally (via software update to the user’s wallet) without sweeping any UTXOs.
To do this, the `Quantum Lock` contract 1) verifies that an appropriate quantum signature covers the `quantum_signed_message`, then 2) executes `quantum_signed_message`, a short script that commits to a specific transaction digest hash and then checks that the surrounding transaction matches it. This is particularly necessary because we cannot rely on the quantum-vulnerable `OP_CHECKSIG` or `OP_CHECKDATASIG` opcodes to extract a signing serialization, so we must construct our own via introspection to prevent a quantum attacker from malleating the transaction in the mempool.
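For intuition, a `quantum_signed_message` might look something like the following sketch. This is not the published contract – a real message must commit to every malleable part of the transaction, and the fields chosen here are illustrative only:

```
// Sketch of a possible `quantum_signed_message` (illustrative fields only; a
// real message commits to a full signing serialization, not just these).
OP_TXVERSION              // push this transaction's version
OP_TXLOCKTIME OP_CAT      // append this transaction's locktime
<0> OP_OUTPUTBYTECODE
OP_CAT                    // append output 0's locking bytecode
OP_HASH256                // digest the "serialized" fields
<committed_digest>        // the digest the quantum signature committed to
OP_EQUAL                  // succeed only if the transaction matches
```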
One option to downlevel this contract for CashVM 2025 without modifying any functionality: we can set up the `quantum_signed_message` in a “function output” using an instantly-spent setup transaction (note this could be funded by a “quantum hot wallet” which doesn’t defer specification of the signing serialization algorithm). The `Quantum Lock` output is then modified to accept an index in place of `quantum_signed_message`. To extract the offloaded signing serialization + commitment, the downleveled `Quantum Lock` simply inspects the P2SH redeem bytecode (`OP_INPUTBYTECODE`) at the provided index (or, to save bytes in this particular case, sign the P2SH `OP_UTXOBYTECODE` directly, ignoring the P2SH template). Upon a valid signature, the `Quantum Lock` contract succeeds, delegating validation to whatever arbitrary code is included in the “function output”, all within the same atomic “execution transaction”.
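A rough sketch of that byte-saving delegation step (illustrative only; the LM-OTS verification itself is elided):

```
// Sketch: downleveled Quantum Lock delegation (OP_UTXOBYTECODE variant).
// Unlocking data supplies the index of the "function output" being spent.
// stack: ... <quantum_signature> <function_input_index>
OP_UTXOBYTECODE   // locking bytecode of the UTXO spent at that input index
// ... verify <quantum_signature> covers this bytecode (LM-OTS check elided);
// on success, the function input's own evaluation enforces the offloaded
// code within the same atomic execution transaction.
```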
Some important things to note:
- The “arbitrary code” is specifically executed within the same, atomic transaction, not in a setup transaction. It’s easy, for example, to place something like a Zero-Confirmation Escrow on the execution transaction, and we can even build this directly into a larger DEX or decentralized application in which external contracts or behaviors rely on the atomicity of the execution transaction.
- The invoking contract is requiring the execution of “arbitrary code” based on the result of a computation. Here it first checked a quantum signature, but it could have just as easily `OP_CAT`ed-away, relying entirely on metaprogramming and equivalence: `<snippet_a> <snippet_b> OP_CAT OP_SIZE OP_SWAP OP_CAT <index> OP_INPUTBYTECODE OP_EQUAL` (see the annotated sketch after this list).
- These “function outputs” today have to self-ensure their own spend authorization and non-malleation to avoid griefing in the mempool (e.g. here it’s quantum signed). In fact, such function outputs “work” today even without those protections, unexpectedly creating real, practical security and denial-of-service issues that will only become apparent once attackers notice that a contract’s author didn’t protect those outputs in the mempool.
- Even though the Bitcoin Cash VM supports this use case today and real products can fully rely on it, the overall interaction is wasteful (in transaction bytes, fees, and node validation resources), and the setup transaction introduces the possibility of network latency or interruption causing poor user experiences. That flakiness is fine for some products (e.g. primarily-online, async, decentralized applications), but it will appear as real-world lag in others (e.g. products that must work fast and reliably in person). If end users are ever frustrated by the lag, “the BCH network” will reasonably be blamed as “slow” or “laggy”, despite the contract working correctly from the perspective of zero-conf systems. If we want BCH to be widely used permissionless money for the world, we should aim to alleviate such impediments to practical usefulness.
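As promised above, here’s the equivalence check from the second bullet, annotated (a sketch; `<snippet_a>`, `<snippet_b>`, and `<index>` are placeholders):

```
// Annotated sketch: build code at runtime, then verify that another input
// in this transaction carries exactly that code.
<snippet_a> <snippet_b>   // two code fragments, provided as stack data
OP_CAT                    // code = snippet_a || snippet_b
OP_SIZE OP_SWAP OP_CAT    // prefix code with its length; for 1-75 byte
                          // results, this byte is exactly the push opcode,
                          // yielding "unlocking bytecode that pushes code"
<index>                   // which input to check (provided at spend time)
OP_INPUTBYTECODE          // that input's full unlocking bytecode
OP_EQUAL                  // true only if it is precisely a push of our code
```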
Hopefully someone will find all of this to be an interesting thought experiment because:
Loops CHIP would make the above discussion irrelevant
Confusingly, some stakeholders are supportive of native loops but simultaneously cite this “code that writes code” behavior as a concern blocking lock-in of the Functions CHIP.
Above, I’ve demonstrated that we don’t even need loops to write “code that writes code” on mainnet today. Obviously, the activation of native loops would make these positions even more logically incompatible.
Shoutout: TurtleVM – CashVM running inside of CashVM
Related: thanks to @albaDsl for demonstrating last month how practical it is to implement a CashVM interpreter on CashVM with the TurtleVm Proof of Concept. Very cool project!
Summary
Thank you to @cculianu, @bitcoincashautist, @Jonas and others who have dedicated time and effort to examining this “code that writes code” contention over the past few weeks.
The underlying issue is a misunderstanding: BCH already supports “code that writes code” today – and all other trappings of Turing completeness.
That’s a good thing: powerful contract capabilities make Bitcoin Cash more useful as permissionless money for the world.