CHIP 2024-12 OP_EVAL: Function Evaluation

To clarify a point - I’m talking about contract-based 0-days, not something that breaks out of the VM.


Based on discussion elsewhere, apparently I need to clarify that my concerns are about usages of EVAL outside the window of what most people probably consider acceptable. Regardless of how it’s intended to be used, it works the way it works, including in all the ways that are informally considered “unacceptable”, and it will surely be used in those ways.

A quote from discussion:

It might be worth articulating the point of view that you’re concerned about “OP_EVAL being used unsafely for callbacks/user-provided-functions/remote-code” in contracts […] to shave some time for others just so that it doesn’t get muddled.

:pray:

That’s why I’m proposing to sacrifice a bit of simplicity and byte-efficiency at the very-well-controlled VM level in order to reduce unexpected outcomes in the wild west of contracts managing billions of dollars.

Additionally, nothing about this would block eventually adding a raw OP_EVAL if it becomes apparent that it is in fact super valuable to allow those remote-code (or similar) use cases.


It’s the full picture. We just add 1 bit of state to stack items. They get it only if explicitly pushed via a data push op. The bit is preserved/inherited only with stack opcodes (dup, swap, roll, etc.) and cleared with all others (cat, split, etc.).

Putting something on the stack via introspection DOES NOT give it the executable bit.

So we preserve input locality, but you can still push something via input’s data push, then dup it and have it verified against something on another input or output, and then execute the item still having the bit from the input-local push.
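The bit-tracking rule described above can be modelled outside the VM. This is a minimal Python sketch, not the proposal's implementation: the `Item` and `Stack` classes and their method names are hypothetical, and only a few opcodes are represented (a data push, an introspection push, dup, swap, cat).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    data: bytes
    executable: bool = False  # the proposed 1 bit of state per stack item


class Stack:
    def __init__(self):
        self.items = []

    def push_data(self, data):
        # Explicit data push in the running script: the only way to get the bit.
        self.items.append(Item(data, executable=True))

    def push_introspection(self, data):
        # Result of an introspection opcode: never executable.
        self.items.append(Item(data, executable=False))

    def dup(self):
        # Stack opcodes copy or move items unchanged, so the bit is inherited.
        self.items.append(self.items[-1])

    def swap(self):
        self.items[-1], self.items[-2] = self.items[-2], self.items[-1]

    def cat(self):
        # Any opcode that produces new bytes clears the bit.
        b = self.items.pop()
        a = self.items.pop()
        self.items.append(Item(a.data + b.data, executable=False))

    def top_is_executable(self):
        return self.items[-1].executable


stack = Stack()
stack.push_data(b"\x51")            # pushed by the script: executable
stack.dup()                         # the copy keeps the bit
dup_keeps_bit = stack.top_is_executable()
stack.cat()                         # concatenation clears it
cat_keeps_bit = stack.top_is_executable()
stack.push_introspection(b"\x51")   # same bytes, but not executable
introspection_has_bit = stack.top_is_executable()
```

In this model a pushed item can still be duplicated and compared against introspected data, and the remaining copy stays executable, which is the input-local pattern the post describes.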

In the future, we may want to revise this: introspection opcodes could also set the bit while mutations remain prohibited (any modification loses the bit). This would make it TX-local (you could place some eval script in an op_return and have all your inputs call it without replicating it).


Yes. Basically:

  1. Do your thing with main program, commit part of stack that must not be mutated by user-provided code.
  2. Let user-provided code run and do its thing.
  3. Verify stack state matches the state committed in 1., continue with the main program.

For example: the code in 3. needs to operate on the results of 1. and 2., but we must prevent 2. from modifying the result of 1. How do we do that?

// 1. run some code that ends with 1 stack item
{some code}
// Commit the result to an opreturn (creator of tx must set it so it matches the result)
OP_DUP <0> OP_OUTPUTBYTECODE <2> OP_SPLIT OP_NIP OP_EQUALVERIFY

// 2. Evaluate some user-provided code
<0> OP_UTXOTOKENCOMMITMENT OP_EVAL

// 3. Verify it added just 1 stack item...
OP_DEPTH <2> OP_EQUALVERIFY
//  ...and didn't mess with result of 1.
//  (OP_OVER copies the result of 1. so it stays on the stack for the code below)
OP_OVER <0> OP_OUTPUTBYTECODE <2> OP_SPLIT OP_NIP OP_EQUALVERIFY
{some more code that does something with results of 1. and 2.}
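
The three steps above can also be modelled in plain Python to see what the guard does and does not catch. This is only a sketch under assumptions: `guarded_eval`, `ScriptFailure`, and the two sample callbacks are hypothetical names, the OP_RETURN commitment is reduced to a plain `committed` value, and user code is a Python callable that mutates a list standing in for the stack.

```python
class ScriptFailure(Exception):
    """Raised where the real script would fail verification."""


def guarded_eval(result_1, committed, user_code):
    # 1. The transaction creator must have committed the same value.
    if result_1 != committed:
        raise ScriptFailure("commitment does not match result of step 1")
    stack = [result_1]

    # 2. Run the user-provided code against the shared stack.
    user_code(stack)

    # 3. It must have added exactly one item...
    if len(stack) != 2:
        raise ScriptFailure("user code changed the stack depth unexpectedly")
    # ...and must not have touched the committed item.
    if stack[0] != committed:
        raise ScriptFailure("user code modified the result of step 1")
    return stack


def honest(stack):
    stack.append(b"\x02")


def hostile(stack):
    stack[0] = b"\xff"      # overwrite the earlier result
    stack.append(b"\x02")


ok = guarded_eval(b"\x01", b"\x01", honest)
try:
    guarded_eval(b"\x01", b"\x01", hostile)
    hostile_rejected = False
except ScriptFailure:
    hostile_rejected = True
```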

I think I get it now. Smart! Quite simple also. And if we ever add op_eval, it can use exactly the same code path except ignore the bit. Right?


It’s all just op_eval, with/without the executable bit tracking and requirement. So, yes it can later use the same code path except ignore the bit.

To clarify, this is meant for having op_eval but with the extra rule that it can eval only stack items that have the executable bit. It would fail the script if it tried to eval a stack item without the bit.

Later, if we wanted to allow cross-input eval or mutable eval scripts, we could extend the bit to be set for results of introspection, too, or remove the executable bit tracking altogether and not require it by op_eval.
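
The "same code path, with or without the bit requirement" idea can be sketched as a single function with one flag. This is a hedged illustration rather than node code: `StackItem`, `ScriptFailure`, `op_eval`, and the `require_bit` parameter are hypothetical names, and the interpreter is replaced by a stand-in callable.

```python
from typing import Callable, NamedTuple


class StackItem(NamedTuple):
    data: bytes
    executable: bool


class ScriptFailure(Exception):
    """Raised when the script must fail."""


def op_eval(item: StackItem, run: Callable[[bytes], object], require_bit: bool = True):
    # Restricted mode: refuse anything that did not keep its executable bit.
    if require_bit and not item.executable:
        raise ScriptFailure("OP_EVAL on a non-executable stack item")
    # Shared code path: from here on both variants behave identically.
    return run(item.data)


def run(code: bytes) -> int:
    return len(code)  # stand-in for the real interpreter


pushed = StackItem(b"\x51", executable=True)
introspected = StackItem(b"\x51", executable=False)

restricted_result = op_eval(pushed, run)
raw_result = op_eval(introspected, run, require_bit=False)
try:
    op_eval(introspected, run)
    restricted_rejects = False
except ScriptFailure:
    restricted_rejects = True
```

Relaxing the rule later would then amount to either setting the bit in more places or calling the same path with the requirement switched off.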


I was asked in another thread to comment on OP_DEFINE / OP_INVOKE vs OP_EVAL so will respond here.

In general I think the evaluation feature is an immensely valuable addition to the bytecode language. It brings functions and recursion which are essential building blocks. Together with loops it completes the key features of a simple but expressive language. And these features are fully constrained by the VM sandbox and VM limits. After adding OP_EVAL to albaVm it became simple to implement functions such as exponentiation, merge sort, and basic elliptic curve multiplication.

I prefer OP_EVAL (possibly combined with OP_PUSH_EXECUTABLE) over OP_DEFINE / OP_INVOKE. There is a deeper, fundamental difference between these two solutions for function evaluation, even though they look similar. Basically, OP_EVAL fits into our current stack-based model of computation, whereas OP_DEFINE / OP_INVOKE extends it by introducing global state (Jason also brings up this topic in his CHIP). The global state is assign-once, but even so it does change things a bit. A couple of examples:

1)

With OP_DEFINE / OP_INVOKE, expressions involving function definitions are no longer self contained. You can paste the following expression involving the pow function (recursively implemented using OP_EVAL) anywhere into your own code where an integer is expected and it should work (e.g. in bitauth IDE 2026):

... (your code)
<2>
<16>
<0x3178009c635177777767785297009c63527952795296527976627695777777675279537953795194537976629577777768687662>
OP_EVAL
... (your code)

Had it been implemented using OP_DEFINE / OP_INVOKE, then the above evaluation would fail in case pow used a function slot that was already occupied. This makes interactive bitauth/REPL use more complicated and is as far as I can see a divergence from what we have today.

This also has implications for example when sharing libraries of compiled functions between tools. Now we need a linker to patch up function definitions from separately compiled modules so that their function slot usage does not overlap, instead of just bringing them in as is.
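
To make the linker point concrete, here is a small Python sketch of the slot relocation such a tool would need. It is an assumption-laden model: modules are plain dicts from local slot to a function name standing in for the body, `link` is a hypothetical helper, and patching the invoke sites inside the bytecode is not modelled.

```python
def link(modules):
    """Give every function in every module a globally unique slot.

    Each module is a dict {local_slot: function_body}. Returns the merged
    table and, per module, the local-to-global mapping that a real linker
    would use to patch that module's invoke sites.
    """
    table = {}
    relocations = []
    for module in modules:
        mapping = {}
        for local_slot in sorted(module):
            global_slot = len(table)          # next free slot
            table[global_slot] = module[local_slot]
            mapping[local_slot] = global_slot
        relocations.append(mapping)
    return table, relocations


math_lib = {0: "pow", 1: "sqrt"}
sort_lib = {0: "merge_sort"}                  # clashes with math_lib on slot 0
table, relocations = link([math_lib, sort_lib])
```

With OP_EVAL-style functions no such pass is needed, since each function body travels on the stack rather than through a shared table.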

2)

OP_DEFINE / OP_INVOKE allows the function table to be used as a global assign-once array. A value can be assigned to a slot in the table by assembling a function that returns the value and OP_DEFINEing it to that slot. Two expressions at separate places in a program may have an agreement to pass data via function slot x. This way an expression is no longer only a function of its arguments on the stack, but also has access to global state calculated somewhere else in the program.
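
A toy Python model shows how an assign-once function table doubles as global state. The `FunctionTable` class, the `producer` and `consumer` helpers, and slot number 7 are all hypothetical and chosen only for illustration; function bodies are Python callables acting on a list that stands in for the stack.

```python
class FunctionTable:
    def __init__(self):
        self._slots = {}

    def define(self, slot, body):
        # Assign-once: a slot can never be redefined.
        if slot in self._slots:
            raise ValueError(f"slot {slot} is already defined")
        self._slots[slot] = body

    def invoke(self, slot, stack):
        self._slots[slot](stack)


table = FunctionTable()


def producer(value):
    # "Assemble a function that returns the value" and define it in slot 7.
    table.define(7, lambda stack: stack.append(value))


def consumer(stack):
    # Reads state that never appeared among its own stack arguments.
    table.invoke(7, stack)
    return stack[-1]


producer(42)
seen = consumer([])

try:
    producer(43)
    redefinition_allowed = True
except ValueError:
    redefinition_allowed = False
```

Here `consumer` is called with an empty stack yet still observes the value, so its result depends on something other than its arguments.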


My sense is that we should continue to keep the bytecode language purely stack based and not also introduce global state. If we want to explicitly call out all “lambda creation sites” then I prefer the eval-bit suggestion by @bitcoincashautist / @im_uname (https://github.com/bitjson/bch-functions/issues/2). Although, currently my overall preference is to just have OP_EVAL on its own.


This is a great argument, thanks for joining and bringing it up! It also complicates cross-input code sharing when some other input defines functions inside functions and you need the whole thing. You can’t just slice the code out and execute it from the running input’s context, because slots could clash.

Which version? The “only push opcodes set the bit” version has the problem that you can’t reuse code from other inputs.

We’d need an OP_PUSH_EXECUTABLE as you suggested. It could actually work the same as OP_DEFINE, but instead of adding to the table it just sets the bit on the top stack item.

Or we need a 3rd stack for executable blobs, where OP_DEFINE pushes new definitions, <n> OP_INVOKE executes the n-deep item on the executable stack, and <m> OP_UNDEFINE clears the top m items. The stack wouldn’t have to be left empty when the main script finishes; the purpose of UNDEFINE is to allow inner scripts or callers to clean up and not mess up callers’ depth references.
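
A rough Python model of that third stack, to show why UNDEFINE keeps depth references stable. Everything here is an assumption made for illustration: the `ExecutableStack` class, depth 0 meaning the most recent definition, and definitions represented as strings instead of bytecode.

```python
class ExecutableStack:
    def __init__(self):
        self._defs = []

    def define(self, body):
        # OP_DEFINE: push a new definition.
        self._defs.append(body)

    def invoke(self, n):
        # <n> OP_INVOKE: fetch the n-deep definition (0 = most recent).
        if not 0 <= n < len(self._defs):
            raise IndexError("no definition at that depth")
        return self._defs[-1 - n]

    def undefine(self, m):
        # <m> OP_UNDEFINE: drop the top m definitions.
        if not 0 <= m <= len(self._defs):
            raise IndexError("cannot clear more definitions than exist")
        del self._defs[len(self._defs) - m:]

    def depth(self):
        return len(self._defs)


ex = ExecutableStack()
ex.define("caller_helper")
before = ex.invoke(0)

# An inner script defines two helpers of its own...
ex.define("inner_a")
ex.define("inner_b")
inner_top = ex.invoke(0)

# ...and cleans up, so the caller's depth references are valid again.
ex.undefine(2)
after = ex.invoke(0)
```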

Thank you for reviewing!

I appreciate this and agree with the general principle. :pray: I want to note that this specific topic – function/“word” definition – is a spot where our model actually diverges from Forth dialects (and most concatenative and stack-based languages). Functions can of course be handled exclusively via stack scheduling and stack juggling (or just duplicated, as we do today), but the VM state in question (the “wordlist”) is generally considered a core element of stack-based models.

Fun, thanks for sharing! Have you compared the bytecode length and opCost of the OP_BEGIN/OP_UNTIL equivalent? (And/or vs. more efficient pow algorithms?) I’ve found loops are usually more efficient for the internal implementations, with functions primarily for high-level factoring e.g. CHIP 2024-12 OP_EVAL: Function Evaluation - #46 by bitjson

I expect many compilers will optimize index assignment, as it’s very easy (even via statically-applied transformations) to save bytes by assigning OP_0 through OP_16 to the most-commonly-encoded functions (not necessarily the same as most-commonly-invoked, and inlined functions don’t need an assignment at all). In this case though – assuming optimization isn’t important, no inlining, and/or a loop implementation isn’t preferred – the lambda could accept the index(es) to use or even be passed defined function identifiers, in addition to (lambda) function bodies (like OP_EVAL). (Aside: you might find bch-wizard useful.)
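
The index-assignment optimization mentioned above could look roughly like this in a compiler pass. This is a sketch under assumptions: `assign_indices` and `push_size` are hypothetical helpers, the input is a flat list with one entry per encoded invocation site, and the size model only covers indices 0 to 127.

```python
from collections import Counter


def assign_indices(encoded_invocations):
    """Give the lowest indices to the functions encoded most often."""
    counts = Counter(encoded_invocations)
    # Most frequent first; ties broken by name so the result is deterministic.
    ranked = sorted(counts, key=lambda name: (-counts[name], name))
    return {name: index for index, name in enumerate(ranked)}


def push_size(index):
    """Bytes needed to push a function index onto the stack."""
    if 0 <= index <= 16:
        return 1      # dedicated one-byte opcodes OP_0 through OP_16
    if 17 <= index <= 127:
        return 2      # one-byte length prefix plus one data byte
    raise ValueError("indices outside 0..127 are not modelled in this sketch")


sites = ["hash"] * 5 + ["pow"] * 2 + ["sort"]
indices = assign_indices(sites)
total_push_bytes = sum(push_size(indices[name]) for name in sites)
```

Counting encoded sites rather than runtime invocations matches the distinction drawn in the post: a function called once inside a loop is encoded once, however many times it runs.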

Certainly possible, the mutable version of that is described in Rationale: Immutability of Function Bodies. It’s quite inefficient vs. optimal stack scheduling though. Today’s equivalent is essentially what CashScript does already: naive deep picking from a stack area that the contract treats as a set of global registers.


Withdrawing OP_EVAL

Hi everyone,

On this BCH podcast we discussed why I think function support is important for Bitcoin Cash contracts, and I shared some context on the range of implementation options. In my continuing due diligence, I’ve concluded that the two-opcode “proper functions/word definition” (OP_DEFINE/OP_INVOKE) approach is a better technical choice than OP_EVAL.

I was initially more wary of the two-opcode approach: it’s slightly less byte-efficient in the simplest case (+3 bytes) and theoretically less byte-efficient for complex contracts, assuming a sufficiently advanced compiler (optimal stack scheduling).

However, experimentation has shifted my perspective since December:

  • OP_EVAL-based functions require exceptional integration effort in compilers and tooling, entailing considerable additional risk of compilation bugs. OP_DEFINE/OP_INVOKE offers equivalent capabilities at greater safety and minimal implementation cost.

  • The theoretical optimizations made possible by OP_EVAL are tiny and fundamentally temporary: the optimizations could reduce “glue code” between fixed business logic in some contracts, but a future upgrade enabling better deduplication of reused contract bytecode (e.g. read-only inputs) could fully eliminate those bytes from practical transaction sizes and blockchain storage growth. Therefore, even with extremely low time-preference, it would be a very low-impact use of time/resources to implement and verify the most aggressive, theoretical OP_EVAL-based optimizations (function juggling + stack scheduling integration; in fact, requires some novel development WRT published literature) for the minimal additional savings (max 3 bytes per non-inlined function after compilation) vs. a naive “deep pick” approach (e.g. what CashScript currently does). That means: OP_DEFINE/OP_INVOKE would likely remain both safer and more efficient in actual practice for at least the next year or two (other than the single-OP_EVAL case), with OP_EVAL only becoming temporarily more efficient in relatively rare cases, and only if 1) a novel, aggressive optimizer gets built and verified despite the meager return on investment, and 2) contract authors are willing to trade some safety and external auditability for those meager savings. (For reference, basic stack scheduling in tools like CashScript could save ~2 bytes per contract data element in nearly every contract. That optimization is a prerequisite to the OP_EVAL one – and far more commonly applicable than OP_EVAL’s max 3-byte optimization per non-inlineable function – but to my knowledge even that prerequisite remains unimplemented in any BCH-targeting compiler.)

  • At a much higher level: language-level function definition makes compilation or ports from other languages (EVM, WASM, JS, etc.) far safer and lower cost. Any deficiency in Bitcoin Cash’s function support (e.g. requiring stack scheduling, not allowing stack inputs, etc.) necessarily creates quirks and unexpected edge cases, often resulting in less-safe workarounds and harder-to-audit artifacts.

Summary

  • OP_EVAL is less safe and generally less efficient than OP_DEFINE/OP_INVOKE.

  • OP_EVAL’s minimal theoretical savings (3 bytes per defined, non-inlined function) require novel research and riskier compilation/audit tooling, and those specific savings may ultimately have a zero-byte impact on transaction sizes.

  • Proper function definition – OP_DEFINE/OP_INVOKE – simplifies compilation/ports from other languages (EVM, WASM, JS, etc.), improving the availability and safety of development tooling.


Based on this research, I’m withdrawing my advocacy for OP_EVAL and modifying the proposal to split it into two operations: OP_DEFINE and OP_INVOKE.

Most of the CHIP remains unchanged, but to minimize confusion, I’ve bumped the version to v2.0.0 and renamed the CHIP: CHIP-2025-05 Functions: Function Definition and Invocation.

Previous links continue to work, but I’ve also updated the repo to be titled bch-functions:


I started a topic with the updated title:


Just for the record: I agree with every word in this essay.
