CHIP 2024-12 OP_EVAL: Function Evaluation

bitjson · January 17, 2025, 11:39am

Word/function signature verification

I see some discussion about function signatures being generally useful in programming (I agree), and that being somehow an argument for OP_EXEC and/or “stack isolation”.

Please note that Libauth and Bitauth IDE have supported compile-time verification of “function signatures” and behavior since 2020 – I’ve found it invaluable for optimizing various constructions (e.g. P2SH assurance contract - #15 by bitjson predates CashTokens lock-in).

Adding – via a consensus upgrade – some half-baked runtime checks to approximate function signatures wouldn’t simplify or improve our current capabilities.

bitjson · January 17, 2025, 12:35pm

If I understand correctly, @andrewstone designed Nexa’s OP_EXEC.

@andrewstone, do you disagree with my review of OP_EXEC? I also responded to you here:

https://x.com/bitjson/status/1880227699198714348

GAndrewStone: TBH, tl;dr. It allows the template to execute untrusted holder constraints.

OP_EXEC is like requiring motorcycle helmets in a swimming pool.

It simply misunderstands contract development.

Even with the correction I described (@NexaMoney’s version is even more nonsensical) – OP_EXEC adds no security and harms protocol complexity, contract complexity, and overall transaction sizes when compared to OP_EVAL.

If you disagree, it should be easy to provide a counterexample that doesn’t hand-wave about context (e.g. leaving a blank for the “untrusted code”). Given any particular contract, what exploit is prevented by OP_EXEC’s stack isolation? Please be sure to include threat model info, then I can help you optimize it by switching to OP_EVAL.

Again, unless someone can produce a credible counterexample (with enough context to review), I don’t plan to spend more time on OP_EXEC.

tom · January 17, 2025, 10:08pm

Thanks Jason for a very enlightening series of posts.

You asked a bunch of times to show a “counterexample”, which was your choice of words indicating an example that did something nasty. Inside of the VM. Knowing full well that a VM is already a sand-box.
Repeating that so often while ignoring all the actually quotable objections I made is really quite enlightening.

You’re arguing that stack isolation is “not good” and that “experienced programmers will be misled” by “intuition”. Yeah, because who would have the audacity to ask people with experience! Where is the fun in that?

Ok, lets leave the schoolyard and sum up the actual facts;

op-exec from nexa is super expensive in every usage. Nobody likes that one.
op-eval avoids a function signature and avoid stack safety to be less expensive.
a simple op-exec1 (lets use my design for discussion sake) is not expensive. Is actually cheaper than op-eval for repeated cases. And Jason stated he expects hundreds or more calls in some cases. Arguing cost makes op-eval lose.
a simple op-exec1 design with pushing of a single int for argument count allows stack safety.

A argument count as part of a function description has 2 main advantages:

a toolset like an IDE can do compile time checking. Notice that “compile time” is per definition at the time of transaction construction. I mean, no full node compiles, so this one should have been obvious. But we got someone misunderstanding this, so lets be clear. A transaction with a single output that has a bunch of pre-defined subroutines can be used by a contract you’re building. You’ll compile the contract using the binary library and voila, you have compile time safety.
Or, in short, this 1 byte that indicates the number of arguments is the simplest form of an API-docs for your IDE to use. Pretty cheap, if you ask me!
a much more important usage of an argument count is the stack protection.
Now, this is not about being able to escape your VM. That suggestion is more like a “when did you stop hitting your wife” question.
Instread, this is about avoiding all the hallmarks of unsafe code.

Arguing “it is not worth anything” is a great way of saying you don’t actually have an argument against it, you just don’t like it. Fine, don’t like it. It still is valuable to others and practically free.

Unsafe code in this context is about unpredictable code. If I use this library method in my contract, will it do funny stuff? Will my money be stolen because I didn’t manage to understand the code I used from another dev?

Being able to chop up a script on-stack, partly execute it with usage of codeseparators and indeed altering or reading the stack outside your subroutine, those are all really nasty things that makes code very hard to understand and near impossible to test to be “correct” when used in not yet written scripts.

But the bottom line here is that this is money. This is not some toy virtual machine to do cool new things with. Well, not new, nothing here is novel. Jason may think he came up with this all on his own, but sorry to say that he’s about 60 years late to that party. One way or another any and all options we pick have been tried before.
Maybe that knowledge helps letting the ego’s deflate and we can get back to picking something that actually is good for Bitcoin Cash.

I started the discussion some time ago here:

My opinions on each item above in the original post.

And, to repeat, op-eval doesn’t just allow spaghetti code that can steal your funds, it ALSO has a pretty serious security issue by default:
code can come from any place without being known at the time of building the transaction. That allows injection of untrusted code and that means bugs can steal your money.

You can claim you can write code to verify your inputs, but first of all this after DECADES is still the number one issue in software: unverified inputs. (xkcd).
And that just throws out of the window the entire argument that it would somehow be cheaper byte-wise, being forced to write your own input verification.

None of the arguments hold water, which is why Jason isn’t replying to my posts, which is why he uses very angry and dismissive language (tit for tat) because he knows he’s in the wrong and his lovely op-eval is rotten under the surface.

bitcoincashautist · January 17, 2025, 10:35pm

Which others, how many smart contracts have you designed? Let’s hear from these others.

It’s not even his (it came from Gavin, remember?) and it is certainly not rotten, it is the preferred solution.

bitjson · January 18, 2025, 3:35am

@tom again, unless someone can produce a credible counterexample (with enough context to review), I don’t plan to spend more time reviewing stack isolation.

And I’d go back further than that! There are many decades of prior art here – if anything, this CHIP is taken more from Chuck Moore than Gavin or any of the other 2012 proposals. OP_EVAL gets BCH VM bytecode to a “fully capable” Forth dialect.

On the other hand, the various 2012 proposals did nonsensical things like clearing stacks and preventing “nesting” – i.e. functions calling functions – and generally misunderstood the control stack and/or how non-trivial Forth programs are factored. (Understandable for the time: VM limits were a huge, untenable problem that overshadowed clear thinking about a lot of topics, and most “smart contract” ideas were very hypothetical. We’re spoiled now to have more certainty on both.)

bitcoincashautist · January 18, 2025, 3:38pm

It will be a NOP only if it passess, else it will fail the TX, that’s just like some:
<0> OP_UTXOBYTECODE <0> OP_OUTPUTBYTECODE OP_EQUALVERIFY sequence. The individual locktime/sequence opcodes work the same. If it passes it’s like a NOP from PoV of surrounding code, otherwise it fails the TX because predicate not satisfied.
You can’t replace that with data because result depends on TX context, and the purpose is to force spender to set the TX context right.

OP_EXEC makes it possible for untrusted parties to insert their predicate checks into the placeholder inside the main contract, without the possibility of breaking “outer” predicate checks created by the designer.

Consider running OP_EXEC with arguments 0 and 0 (which means the untrusted_code can’t affect the main stack): it can only be some kind of user-specified -VERIFY sequence.

{some trusted code} <untrusted_code> <0> <0> OP_EXEC {some trusted code}

Now imagine the preceding trusted code does some calculation, and the succeeding code is supposed to continue it. The latter code can trust the stack state that the former code resulted in, because the OP_EXEC guarantees that it couldn’t have changed it.

but if you have {some trusted code} <untrusted_code> OP_EVAL {some trusted code} then the untrusted_code could’ve changed the result of the preceding block, and the succeeding block would have to treat the stack state as untrusted data - just as if it’s being executed after input’s data pushes - the spender could’ve provided anything to be run as untrusted_code.

EVAL could also be used to create a slot in the contract for user-set predicate check, but then it must be either called as either first or last in the contract (and if last then the main must execute a final VERIFY to “lock-in” whatever checks it did).

Here’s a full example (link to open it in BitauthIDE).

Unlocking script:

// Next user-set constraint commitment, decided by the spender of this TX
<0x4ae81572f06e1b88fd5ced7a1a000945432e83e1551e6f721ee9c00b8cc33260>
// Reveal user-commited constraint for this spend
// (the one committed by the previous TX)
<0x51>

Locking script:

// Example of some fixed covenant code
OP_INPUTINDEX OP_UTXOVALUE OP_2 OP_DIV
OP_INPUTINDEX OP_OUTPUTVALUE OP_EQUALVERIFY

// Force inheriting the fixed covenant part, while allowing the user
// to change only the commitment for the user-set constraint
OP_ACTIVEBYTECODE <33> OP_SPLIT OP_DROP
OP_ROT OP_CAT
<0x8862> OP_CAT
OP_HASH256 <0x87> OP_CAT <0xaa20> OP_SWAP OP_CAT
OP_INPUTINDEX OP_OUTPUTBYTECODE
OP_EQUALVERIFY

// Verify additional constraints committed by the previous TX
OP_DUP
OP_SHA256
OP_PUSHBYTES_32
// Current user-set constraint commitment
0x4ae81572f06e1b88fd5ced7a1a000945432e83e1551e6f721ee9c00b8cc33260
OP_EQUALVERIFY
OP_EVAL

tom · January 18, 2025, 11:19pm

And ignoring all the actual relevant points against your chip that have been made.

But, really, you’re barking up the wrong tree. You are the one making a suggestion that goes againts decades of actual software engineering practices.
You are violently against a practically free way of avoiding said problems.
And said barking is without any arguments. The wider bitcoin cash ecosystem doesn’t need you to “review” anything. This specific feature is not hard to do. If you refuse to even address actual critisism of your proposal, then please go away. We don’t need that. We need someone that actually can work with others. More people working together get better results.

At this point NOT accepting that there is a one byte-costing approach which avoids well documented and known problems is just plain malicious.

Again, you have given NO arguments why you don’t want this one feature in the list of possible features for the common idea of subroutines. While not even mentioning, let alone discussing, any other features that would be pretty cool to have.

bitcoincashautist · January 18, 2025, 10:40pm

You’d know, considering you’re doing all the barking with no arguments.

tom · January 18, 2025, 11:51pm

I would be very interested in hearing if contract coders see value in features like:

having a subroutine that is able to alter the stack in a way that goes against the basic “function” based design of cashscript. Or any programming language using function design.
Specifically, a subroutine would be able to remove more from the stack than “expected” by the function signature. Which means you can’t expect it to behave the same even if you call it with the exact same arguments.
the ability to have the unlocking script have a push which is your unlocking code. Not like p2sh where the hash has to match, but without hash verification. A transaction mined on-chain that you can write code to unlock.
the ability to cut / join and otherwise alter a subroutine’s code.

Anyone interested in any of those features? Is the risk of making mistakes worth it to you?

bitcoincashautist · January 19, 2025, 10:33am

Script is not CashScript. Script is more akin to assembly, which is why EVAL is fine.

If you wrote a function in CashScript that takes in 2 args and returns 1 then the compiler would create a well-behaved Script bytecode for that function - which doesn’t even try to break out by consuming more than 2 stack items, so why would we need low-level guarantees when we’re already controlling the bytecode to be eval’d and can ourselves guarantee that it adheres to calling convention or have the compiler produce compliant bytecode?

A function written in CashScript can’t surprise the compiler because compiler is the one creating the bytecode - compiler decides what’s the max. number of stack items it will pop during execution.

If the contract author wants to make his program modular, to occasionally “load” some of his code to be executed from input’s data or from another input/output, he always needs to authenticate the code against a commitment (dup, hash, equalverify) - even when using OP_EXEC - which is why Jason makes a good argument that OP_EXEC doesn’t really add security, it only adds some flexibility for one class of use-cases that would be akin to 3rd party plugins, but we don’t see much use for those.

emergent_reasons · January 21, 2025, 7:20am

Heavy debate on merits and details is very important. Attacks on character, ad hominem, passive aggression, caricaturing others, etc. are not warranted or welcome here. Fair warning.

bitcoincashautist · January 21, 2025, 8:10am

I pasted the whole HTML of this page to Claude and asked him to find examples of:

Here are the receipts, for posteriority:

Looking through the thread, here are some examples of unconstructive behavior:

From Post #23 (Tom):

Accuses Jason of deliberately ignoring points: “Repeating that so often while ignoring all the actually quotable objections I made is really quite enlightening.”

Sarcastic/dismissive: “Yeah, because who would have the audacity to ask people with experience! Where is the fun in that?”

Personal attack suggesting malicious intent: “he knows he’s in the wrong and his lovely op-eval is rotten under the surface”

Questions motives: “Jason may think he came up with this all on his own, but sorry to say that he’s about 60 years late to that party”

From Post #27 (Tom):

Hostile/dismissive: “If you refuse to even address actual critisism of your proposal, then please go away”

Accuses of malicious intent: “is just plain malicious”

From Post #28 (bitcoincashautist to Tom):

Retaliatory snark: “You’d know, considering you’re doing all the barking with no arguments.”

In Post #31, a moderator (emergent_reasons) had to step in and warn: “Heavy debate on merits and details is very important. Attacks on character, ad hominem, passive aggression, caricaturing others, etc. are not warranted or welcome here. Fair warning.”

The exchange appears to have devolved from technical discussion into personal attacks, particularly from Post #23 onward.

emergent_reasons · January 21, 2025, 8:27am

Can just paste the whole html. We are living in the future

Ok back to OP_EVAL.

Jonas · January 21, 2025, 10:45am

I’ve been thinking about this for a while and I certainly don’t think stack protection should be implemented for OP_EVAL.
Messing with the stack is a feature that could open the door for many compile time optimizations.
A contrived example could be a reusable bytecode to clean up the stack:

<push bytecode to pop everything recursively from the stack>
<some more pushes>
[...]
OP_DEPTH
OP_IF
  // Do some stuff, leave "true" on stack to exit
  OP_IF
    <index to stack clean bytecode> OP_PICK OP_EVAL
  OP_ENDIF
OP_ENDIF
OP_DEPTH
OP_IF
  // Do some more stuff, leave "true" on stack to exit
  OP_IF
    <index to stack clean bytecode> OP_PICK OP_EVAL
  OP_ENDIF
OP_ENDIF
[...]

Jonas · January 21, 2025, 1:01pm

Another example:

Given some arbitrary data blob you want to push all bytes directly after 0x00 as a new item on the stack, i.e. 0x0001000204560078 → 0x01, 0x02, 0x78.
It’s possible to do a recursive bytesequence to do those pushes and execute with OP_EVAL.

For OP_EXEC you’d have to first do OP_EXEC on one bytecode that does the counting and returns that as a push so you know how many items the real function will produce that you could give as an argument to another bytecode.

tom · January 21, 2025, 2:49pm

Great question!

Look at Jonas’s post to get the answer. Because people will try to work around it.

And more to the point, because safety of your money should not be optional.

The important part here is that this stack protection which Jason is focusing on fighting is actually practically free.
So the question really that should be asked is why there is pushback at all. Why are some people very actively advocating against a free stack protection.

Jonas · January 21, 2025, 3:18pm

What? I wrote examples in low level Script, not CashScript.

Safety from what? A misbehaving compiler (like, for instance cashc)?

I literally gave from-the-top-of-my-head examples of things that would be more convoluted (i.e. not free) with stack protection.
In the first example each IF-block would need to end with an explicit number of OP_DROPs.
In the second example there would be two different OP_EXECs, one for counting the number of pushes the second one would produce and pushing that number as an input to the second OP_EXEC. Since the second example is recursive a counter for the OP_EXEC parameter count would also need to be tracked thru the callstack.
Adding stack protection and pushing a number for specifying the number of parameters adds no value. Just because the bytecode can’t do unspecified things with the stack doesn’t make it “secure”.

That being said, I still think OP_EVAL (and friends) is a can of worms. I am not (yet) in favor of adding it.

bitjson · January 21, 2025, 3:19pm

Very glad to see others experimenting with OP_EVAL and reviewing OP_EXEC/stack isolation, thanks @bitcoincashautist and @Jonas!

Ignoring the missing context for a moment: this just describes a poorly-factored program. Concatenative languages are underappreciated:

With OP_EXEC: trusted_code_a <untrusted_code> <0> <0> OP_EXEC trusted_code_bc
Saving 2+ bytes with OP_EVAL: trusted_code_ab <untrusted_code> OP_EVAL trusted_code_c

But again – everything that matters is being hidden behind undefined untrusted_code and trusted_code identifiers. It’s not possible to properly review any code snippet without context.

In this case, you’ve restated (omitting context) a less-efficient formulation of my attempted steelman of OP_EXEC (“User-provided pure functions”). The same questions apply:

Why untrusted_code rather than untrusted_result? I.e. what specific feature of this scenario makes it impossible or less efficient to provide code rather than a precomputed result?
Which party first created this contract? Why were they unable to “compile in” the untrusted_code?
1. If not the originator, whose wallet provided the untrusted_code?
2. How did that user/wallet audit the resulting contract behavior (including the “untrusted” code)?
3. If applicable, how did the covenant or other contract parties review the untrusted_code before somehow internalizing it or signing on to the new contract? Given a malicious input at this stage, can the covenant be permanently frozen? (I.e. is there already a critical vulnerability in the evaluation prior to the evaluation in which OP_EXEC is assumed to add some sort of security?)
Can this scenario be made more efficient by removing the OP_EVAL/OP_EXEC and using a pre-commitment structure (“covenants-as-standards”), where the untrusted_code gets “compiled” into resulting child covenants?
1. If so, does that need to happen on-chain, or are we wasting on-chain bytes to pretend that end-user wallets don’t need to audit the full child covenant’s behavior (including the “untrusted” code).
Specifically, what “untrusted” code can make the contract misbehave?
1. Who stands to lose money if it misbehaves? (If it’s just the current spender – again, we’re dealing with a wallet bug.)
2. If a third party or the covenant are at risk: is there an equivalent exploit path that does not rely on stack item counts? I.e. if the “untrusted” code is a market making algorithm, can it be rewritten to offer the exploiter more control over the “price” such that stealing funds does not require any sort of stack manipulation, only an unexpected mathematical result which otherwise obeys the “function signature”?

Again:

tom · January 21, 2025, 4:06pm

In Bitcoin Cash we have today efforts underway that try to define a system of “templates”, whic are close to libbitauth templates.

The core features aimed at is that normal users can use this library of pre-defined ASM to build their transactions on top of.

Adding any sort of eval/exec to this mix allows those templates to reference library code. Which I’m sure will happen, as NPM shows, people are doing that kind of optimizations because we are mostly lazy beings. Nothing wrong with that.

The “untrusted” concept is thus absolutly relevant in the wider scope of things.

Using either proposal you can get that code from another output using introspection. But that code may not be vetted as well as it should have been. Which makes it untrusted.

The basic concept of either op-eval/op-exec both allow the unlocking script to have a push with that code, absolutely qualifies as untrusted if you don’t go through the extra steps of validating it.

So, in short, the idea that we need not add (free!!!) protection rules because there is no way to mistakingly end up running “bad” code in your transaction is naive. Nobody downloads plain executables from the Internet either, right? In reality they do, virus’ protection companies exist. That’s the reality we need to operate in.

Wallet bugs are certainly possible. Please don’t dismiss that as a problem that doesn’t exist or can be ignored.

bitcoincashautist · January 21, 2025, 4:14pm

So what? You can replace a .dll on your system with a hacker’s. Why would you do it, though? Just verify the hash so you know what you’re loading.

It’s agnostic of the source. You can just have all the pushes of eval scripts you need be part of the locking bytecode which makes them trusted and you can just call them with <N> OP_PICK OP_EVAL whenever you need them.

As the contract author - choice is yours. You want trusted? Either provide it as part of main script’s pushes (part of locking bytecode), or hash-verify if you load externally (either by unlocking data push or introspection). You want untrusted? Then refactor your contract so those come after the last VERIFY of the main’s logic you want to protect.

Script is low-level, we don’t need it to hold people’s hands.