CHIP-2025-05 Functions: Function Definition and Invocation Operations

OK, based on your past discussions in eval/subroutines I thought you wanted to prohibit the non-authenticated case with plain OP_DEFINE, which is why I strongly resisted your OP_DEFINE_VERIFY. If it’d just be an addition, then it’s on you to make the case for an additional opcode just to shave off 3 bytes of a particular bytecode pattern. Why not optimize P2PKH then? We could have <sig> <pubkey> <hash> OP_CHECK_PUBKEY_AND_SIG instead of <sig> <pubkey> OP_DUP OP_HASH160 <hash> OP_EQUALVERIFY OP_CHECKSIG. That would save a meaningful number of bytes, because P2PKH is something like 99% of transactions.
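To make the byte count concrete, the pattern in question is presumably something like the following (a sketch only; the exact stack order of OP_DEFINE and the proposed OP_DEFINE_VERIFY is assumed here for illustration):

<fn_body> OP_DUP OP_SHA256 <fn_hash> OP_EQUALVERIFY <fn_id> OP_DEFINE // verify-then-define today
<fn_body> <fn_hash> <fn_id> OP_DEFINE_VERIFY // the proposed combined opcode

The second form saves exactly the OP_DUP, OP_SHA256, and OP_EQUALVERIFY opcodes – the 3 bytes being discussed.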

4 Likes

Given that it’s optional, it could very well be the case that no one uses it, which leaves GP’s qualms as they are.

I don’t think you are reading the room. Make this a separate CHIP.

3 Likes

@tom, good to hear that you are viewing OP_DEFINE_VERIFY as an addition to the OP_DEFINE / OP_INVOKE opcodes and not as a replacement for OP_DEFINE. It seems you are in agreement with the key aspects of ‘CHIP-2025-05 Functions’.

From what I understand, your main concern is about code/data that is external to the contract and the risks around executing it. Take the example of pulling in code from other UTXOs referenced by a transaction. In a future scenario, assuming we have read-only inputs & P2S, this technique could perhaps allow for a form of shared libraries on the blockchain (I’d like to hear other people’s views on whether this is realistic and desirable).

Let’s take my collection of elliptic curve functions as an example. If it were sitting in a UTXO (assuming it fits), then a script in a transaction referencing this UTXO could fetch the library code using OP_UTXOBYTECODE, extract the function bodies using OP_SPLIT, perform any necessary transformations, and then define the library functions in its own local function table.
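As a rough sketch of how that could look in script (the offsets and the OP_DEFINE stack order are assumptions for illustration; OP_UTXOBYTECODE and OP_SPLIT behave as they do today):

<utxo_index> OP_UTXOBYTECODE // push the library UTXO’s locking bytecode
<fn_offset> OP_SPLIT OP_NIP // discard everything before the function body
<fn_length> OP_SPLIT OP_DROP // keep exactly the function body
<fn_id> OP_DEFINE // register it in the local function table (repeated per function)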

However, one issue with using OP_DEFINE_VERIFY for this is the number of bytes needed for the hashes. The EC library has about 20 functions. To bring them in with OP_DEFINE_VERIFY would require ~20*32 = ~640 bytes of hash data. The library as a whole is ~700 bytes, so this would be a considerable overhead.

Compare this with verifying a single 32-byte hash across the whole library and using plain OP_DEFINE to define the functions. I can see that you would value having the hash taken right before defining the function, to ensure that we e.g. got the OP_SPLIT and transformations right. But from a security standpoint I believe that having trusted code verify the library hash and then extract and define the functions is equivalent.
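In script, the single-hash variant might look something like this (again a sketch; <lib_hash> would be hard-coded in the locking script):

<utxo_index> OP_UTXOBYTECODE // fetch the full library bytecode
OP_DUP OP_SHA256 <lib_hash> OP_EQUALVERIFY // one 32-byte hash check covers all ~20 functions
// …then OP_SPLIT the now-verified bytes and OP_DEFINE each function as above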

I think your point is that the presence of an OP_DEFINE_VERIFY opcode could nudge the developer to do the right thing and not forget to verify hashes of external code. But just as a developer could forget to take a hash across the library, they could forget to use OP_DEFINE_VERIFY instead of OP_DEFINE (as @jonas also pointed out). So I’m not convinced about this opcode.

BTW, I saw your ‘OP_RUNSUB2’ proposal in the withdrawn CHIP and found it interesting. The fact that it requires transaction input validation to be serialized might be too much of a downside for it to be viable, though. Otherwise, a version of it could have been a way to greatly simplify accessing external functions.

6 Likes

Thanks for the review,

I agree that we’re honestly just scratching the surface wrt opportunities for scripting functions.

I have seen a lot of ideas that would probably need refinement and iteration to make them great. I’m comparing this to the “Group” idea, which on its own would indeed have worked, but iterations and hard work turned it into a great solution instead, and we now have FTs/NFTs that are stellar.

Your point about a library of functions being expensive if each one is hashed is one I also made in the past, when I suggested the concept called MAST: a trivial addition that would allow some or many functions to be verified with only a minimal number of hash bytes.
Additionally, a hash for code should likely be allowed to be a 20-byte one, not 32 bytes. That would shrink the ~640 bytes of hash data in your 700-byte setup to something like 22 bytes instead.

That’s the point that seems to go unacknowledged: it really is not equivalent. And that is the main point where I expect people to start losing money.

You using OP_SPLIT and getting executable code from it is not some limiting factor in a UTXO setup.

A P2S script gets stored on the blockchain until it is spent. If that script contains an eval that gets its code from anywhere other than an output already on the blockchain, it becomes a puzzle piece that you need to find the matching piece for. But that doesn’t have to be the original piece; any thief can just create a new piece that fits. The moment they find something that fits, they can take the money stored on that UTXO.
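To illustrate with a deliberately broken example (hypothetical pattern; the OP_DEFINE stack order is assumed): a locking script that defines and invokes whatever body the spender pushes, with no hash check, is spendable by anyone:

// unlocking script: <any_body> – entirely attacker-controlled
<fn_id> OP_DEFINE // define the unverified push as a function
<fn_id> OP_INVOKE // run it; a thief just pushes a body like OP_1 and takes the funds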

People have attacked me saying that nobody will ever make this mistake, while I actually expect it to be a very common mistake for people who use P2S / OP_EVAL as a replacement for P2SH. Security is hard; mistakes are extremely common.


Anyway, there is a lot that can still be improved in functions. We stole the initial idea from Gavin, realizing it was quite simplistic. Then we looked at Nexa and realized that one was over-engineered and limiting. The first iteration was OK, but I am happy to see that Jason took my idea of using OP_DEFINE instead of OP_PICK to handle the code, which is absolutely a great improvement, and everyone agreed on it being so.
So, this CHIP is better than it was at the beginning of the year, but iterating on ideas could very likely have made it better still. I’ve tried, but I’ve openly been attacked in a way that makes it look like I didn’t like functions, while I’ve always been ambivalent. Very nasty politics there.

Yeah, Jason prefers long-form docs instead of posts here. As such, it was an amalgamation of ideas, not a polished solution, as the status indicates. It packed in a LOT of concepts. It’s been out there for the entire year, but its ideas have been mostly ignored. Well, the “define” idea has been taken, which was the most important one; that makes it worth it.
Yet I feel we lost opportunities this year.

I don’t have anything against Jason’s Functions CHIP as it stands today, as the bar of it not hurting people has been reached. I do think we can do better, and ideally Jason decides to move it to next year to let it mature.

Arguably, most naive contract authors will likely be using higher-level tooling like CashScript or some other library to create their contracts. You’ve noted that your idea could be an additional opcode like OP_DEFINE_VERIFY (or maybe OP_INVOKE_VERIFY?). Couldn’t this just be a macro provided by some higher-level tooling?

In a MAST setup for this example, where would the Merkle proofs for the individual functions be stored? Each proof would be several hashes long.

You only need the merkle-root: one hash for all functions you import. I think if you google “MAST Bitcoin” you’ll find some documentation on this design. There is a BIP, but that one is probably too technical for most.

The tooling people haven’t really shown an interest in this concept; the last message from Mathieu here on this topic was a “maybe”, months ago.
It would be nice to get more people working across layers involved in solving problems; I do agree with you there.

Most of the past year’s discussions have been with jonas and bca vehemently disagreeing with the statement that authors could lose money by not verifying their inputs.
If we accept that this is indeed possible for a group of usages, we can move forward and try to solve it.

Maybe the best idea is to have an OP_DEFINE_MAST opcode for 2027; that may be the best bang for the buck. It depends a bit on people actually being interested in working on solving problems, rather than dismissing problems that the experts won’t have.

That’s it. I’m out. Unclear if I ever return here to BCR.

Don’t leave, you are very much valued by the community!

4 Likes

There have been a few MAST proposals, so it would be helpful if you linked to the one you are referring to that avoids Merkle-proofs.

If we look at, for example, BIP 341 (Taproot), we can see that to spend a Taproot output one needs to provide a control block which contains the Merkle-proof. It can hold up to 128 hashes (32 bytes each), depending on the size of the Merkle tree. This is what proves the inclusion of the script in the Merkle tree. For a tree over, say, 20 leaf scripts, each proof is ceil(log2 20) = 5 hashes, i.e. 160 bytes per individually revealed script.

Probably the easiest way to understand it is to look at a block header. It has one merkle-root, while the number of transactions in a block is variable.

The merkle-tree is built by hashing the actual [data] (here, a series of transactions) in a specified way. If you provide all transactions, you need not provide any hashes; the merkle root is there to verify that all the data is proper and byte-for-byte as expected.
Look up the size of the block header: it is a standard, unchangeable 80 bytes. That is because it only holds the merkle-root and no other hashes.

MAST as a way of verifying the content of scripts would work identically. Regardless of how many scripts you supply in the unlocking script, the MAST operation just hashes all of them into a tree, which results in a single hash that is then compared to the one stored on the blockchain in the output we are unlocking.

What may be confusing is that (see the SPV chapter in the Bitcoin whitepaper) merkle trees have a second feature that is used in MAST: you can omit a piece of [data] at unlocking time and instead provide its hash. This may be useful in some cases.

Edit:

So, in short, MAST uses merkle-trees, but in the normal case they are only in-memory, not shipped. You don’t need to store them on-chain.
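To make the full-reveal case concrete, here is what hashing four supplied scripts into a merkle-root could look like with today’s opcodes (a hand-unrolled sketch; script cannot loop, and a real contract would first duplicate the bodies so they can still be defined afterwards):

// unlocking script pushes the four function bodies: <s3> <s2> <s1> <s0>
OP_SHA256 // h0 = SHA256(s0)
OP_SWAP OP_SHA256 // h1 = SHA256(s1)
OP_CAT OP_SHA256 // p01 = SHA256(h0 || h1)
OP_SWAP OP_SHA256 // h2 = SHA256(s2)
OP_ROT OP_SHA256 // h3 = SHA256(s3)
OP_CAT OP_SHA256 // p23 = SHA256(h2 || h3)
OP_CAT OP_SHA256 // root = SHA256(p01 || p23)
<expected_root> OP_EQUALVERIFY // one stored hash verifies all four scripts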

Yes, Merkle trees can be represented by a single root hash. Inclusion proofs (Merkle-proofs), however, consist of a Merkle-path from the leaf node to the root as illustrated by the Taproot BIP.
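For contrast, a script-level inclusion proof for a single leaf of a two-level tree could look like this (a sketch; child positions are fixed as “left” for simplicity, and a real proof also needs direction information per level):

// unlocking script: <leaf_script> <sibling_hash> <uncle_hash>
OP_ROT OP_SHA256 // h = SHA256(leaf_script)
OP_ROT OP_CAT OP_SHA256 // p = SHA256(h || sibling_hash)
OP_SWAP OP_CAT OP_SHA256 // root = SHA256(p || uncle_hash)
<expected_root> OP_EQUALVERIFY // the proof grows by one 32-byte hash per tree level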

Thanks for engaging in conversation. I have what I need regarding OP_DEFINE_VERIFY. My conclusion is, as before, that a separate opcode is not justified but that tooling can define it as a macro.

1 Like

The whole merkle-root and MAST discussion was a bit off topic here, yes.

1 Like

Yes, better to collect these ideas into a separate CHIP.

I think that’s a bit of a trivialization of the whole debate… input verification is obviously programming-101-level stuff… I don’t have a stake there, but let’s try to stay focused on the big picture?

I think it’s been established that having a quick and easy way to verify some bytecode before defining/invoking it is probably a good idea. The disagreement seems to come from whether or not it should be baked into the protocol as its own opcode.

I personally think no, it’s the responsibility of a higher-level tool. (Don’t forget to null-terminate your C strings, btw!)

Probably because those people are pragmatic and will build for things that exist today. If Functions makes it to mainnet, I’m sure someone will create such tooling if the need arises… maybe even you could?

…what?

This kind of quipping is really unproductive imo. Yes, I know it’s not just you. General statement.

3 Likes

My personal goal is always to avoid any and all personal attacks and just focus on the tech. In the last year or two, specifically with a small number of people here, the efforts were very much one-sided and productivity went negative.

I’ve always trusted that a moderator or otherwise independent third party could step in and stop personal attacks, correct (intentional) misinterpretations, and such. But this has not happened; we even had the most useful moderator on Telegram, who tried, leave. As such, I’m thinking we need to call such problems what they are.
Maybe a new moderator steps up and we can get back to being civil. That would be my dream.

I think a define opcode that validates the pushes as being what was expected is useful.
Imagine the use case where I have an output:
[ripe-160 hash] [id-start] [script-count] OP_DEFINE_MAST

Then an unlocking script has a bunch of pushes for your to-eval scripts. Say, 10 pushes of scripts.
This is a neat replacement for P2SH, giving you much more power and flexibility for the complex type of things,
while validating your inputs with little to no overhead.
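The matching unlocking script could then be as simple as (a sketch; OP_DEFINE_MAST is hypothetical, so its exact semantics here are assumptions):

<script_0> <script_1> … <script_9> // ten plain pushes of function bodies
// OP_DEFINE_MAST in the locking script would hash all [script-count] pushes into a
// merkle tree, require the root to match [ripe-160 hash], and define the bodies as
// functions [id-start] through [id-start]+9, ready to invoke.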

I’d probably use it, but I don’t know if others would. Maybe we can turn the whole topic around and get people to come together on a good solution, not for this upgrade but for the next one.

I think one of the most beautiful things about getting Functions is that you could actually define these things as functions on the script level and publish them on-chain for anyone else to use… imagine using introspection to compose functions stored on UTXOs as some kind of on-chain library… such a library would be widely auditable, tamper-proof, and verifiable by hash… what do you think of that as a solution to the problems you’re describing?

3 Likes

Bruh.

This platform is pretty unmoderated; otherwise, an independent third party would have stepped in and temp-banned you long ago.

What you are doing is repeatedly pretending that the arguments of the other side never happened and just keep peddling your point of view.

It’s extremely frustrating, because essentially you are ignoring the existence of other people and their opinions completely.

You have been doing it here and you have been doing it in AMM transaction-related discussions for over a year.

This is not how it looks from here. Actually the opposite.

3 Likes

If you look at my CHIP from last January, it has quite a lot of work in that direction.
There are various really interesting things possible, and cheaply, if you do it correctly. But they have not been explored, mentioned, or discussed. (Well, except for some throwing of shade.)

We are missing out on the opportunity to do this without cut-and-paste of byte-arrays by ignoring a much nicer way that I described here: BitcoinCash/CHIP-subroutines: Declare and call subroutines with some new opcodes. - CHIP-subroutines - BitcoinCash Code

OP_RUNSUB2 is identical to OP_RUNSUB except that it fetches the subroutine list from earlier processed inputs on this same transaction.

To put that in the perspective of Jason’s CHIP: imagine an ‘OP_DEFINE’ done in one of those UTXOs you’re talking about, you using that as one of your inputs, and then being able to use that script with basically no overhead (no introspection code, no split / cat, etc.).
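Invocation could then be a single operation along these lines (purely illustrative; OP_RUNSUB2 comes from the withdrawn CHIP, and the operand order here is an assumption):

<input_index> <subroutine_id> OP_RUNSUB2 // run a subroutine defined by an earlier input of this transaction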

Maybe we can do that in a future opcode, but my idea has been ignored so far. Maybe you can think about it; since it is not a perfect fit, it will need work. But it would be nice to see it taken up if people see a core of usefulness.

I think that is orthogonal to what I’ve been talking about. Different use cases and different security requirements.
I mean, not everyone will use those provided scripts. There will be people using their own, for instance in a ‘beta’ release. And those people will need to verify their inputs to avoid their funds being lost.

Thank you to all contributors and reviewers so far – I’ve frozen the Functions CHIP for lock-in at v2.0.2 (26e22566), and stakeholder statements will be periodically updated through November 1. Final approval requests will go out in early October.

Please feel free to open issues in the repo for further comments, clarifications, or feedback, and please continue to publish and/or send pull requests with stakeholder statements.

This CHIP is integrated and live on the Sept. 15 public test network (tempnet) – please see that topic for details on how to set up a tempnet node and experiment with the upgrade 🚀

4 Likes