CHIP-2025-05 Functions: Function Definition and Invocation Operations

Maybe I’m unable to explain it well, which is why there is a long-form write-up that I’ve linked to.

I’ve said this to you specifically, BCA, several years in a row now: if we could get on a phone call for a much higher-bandwidth conversation, then maybe this is solvable in an hour. It probably was in the ABLA case, where you ended up apologizing for not understanding what I had been saying for nine months and acknowledging that I actually was right… A phone call could have shaved off those nine months and avoided a lot of problems, not to mention a lot of frustration all around.
I’m trying to be patient about it. Honestly, I am.

But it is frustrating that nobody seems to be able to understand the simple issue of code insertion that I explained with an actual example. If anyone does understand it, please repeat it in your own words so we can see where the confusion lies.

To be clear, the “issue” has not changed since day one. My argument is exactly the same as it has always been, which is why we use the git repo: to make clear that it is immutable.

Bringing up “library distribution” (a concept that is in its infancy and can absolutely change in the future) is weird.
It may indicate that you are thinking about things from ONE specific expected usage point of view. But since this is a low-level programming-language component, the usage I explained is also possible and very likely to be how programmers will use it, as it is actually closer to how Bitcoin has operated for a decade.

Please do just talk about this in a level-headed way so we can improve the proposal, as that is my intention.
It is possible to improve and fix, as long as we are able to talk about it without pointing fingers.

Thanks

The examples I’ve seen could easily be classified as programmer errors. Someone being able to write a broken, exploitable contract shouldn’t be an obstacle to new functionality; by that standard we wouldn’t be able to deploy any VM features at all.

6 Likes

Hehe, you’re not wrong. But “programming” is a 60-year-old profession, and as those who have been here a long time know, the entire space is geared towards improving the status quo for programmers.
That is the main reason new languages get released all the time: improving the experience for programmers, with less ability to shoot yourself in the foot. (That’s a 1980s programmer reference, apologies.)

I’m not sure why there is any push back at all here.
It’s not like I’m asking for much. It’s just that Jason doesn’t acknowledge he got ideas from me, or something, even though I did applaud him for taking my suggested approach. I don’t really care about getting credit for my many solutions already deployed in BCH, but it would be nice to not get push back every, EVERY single time.

And I have given a minimally intrusive (as in, zero things taken away) solution, which doesn’t seem to get discussed either…

So I will keep insisting on trying to improve things. Maybe others will join in and we can all have a great Bitcoin Cash scripting engine.

Tom, I may have missed some context, so to be fully clear:

  1. Is the solution you are advocating for to add OP_DEFINE_VERIFY in addition to the two opcodes (OP_DEFINE / OP_INVOKE) in CHIP-2025-05 Functions? Or is it to replace OP_DEFINE with OP_DEFINE_VERIFY?

  2. Is the above your preferred solution, or are you still in favor of multi-byte OP_DEFINE as in your withdrawn CHIP?

1 Like

That’s applicable to high-level languages.
BCH Script already contains footguns (in the name of flexibility), and it’s plain weird to reserve an opcode for adding some “safety” for one specific use case.

If a UTXO is locked and the script requires you to provide some data and a signature, you need to validate the pubkey.
If a UTXO is locked and the script requires you to provide a specific NFT commitment, you need to validate the category.
If a UTXO is locked and the script requires you to provide a specific code blob, you need to validate the hash.

Why do we need extra belt and suspenders for the last example?

Validating “unknown” code against a hash is just one way of trusting external code; another might be to validate that a specific NFT is used as an input, or that the code is signed by a known pubkey. Do we need extra opcodes for those as well?
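
To make the shared pattern explicit, here is a minimal Python model of the idea common to all three examples: the locking side commits to a hash, the spender supplies the data, and the script checks that the data hashes to the commitment. The helper names are made up, and the hash choice is just illustrative:

```python
import hashlib

def hash256(data: bytes) -> bytes:
    """Double SHA-256, the usual Bitcoin commitment hash."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def spend_is_valid(committed_hash: bytes, provided_blob: bytes) -> bool:
    # Whatever the spender provides (a pubkey, an NFT commitment, or a code
    # blob) must hash to the value the locking side committed to.
    return hash256(provided_blob) == committed_hash

expected_blob = b"\x51"            # stand-in for a pubkey / commitment / code blob
lock = hash256(expected_blob)      # this is what the output commits to

assert spend_is_valid(lock, expected_blob)   # the intended blob unlocks
assert not spend_is_valid(lock, b"\x00")     # anything else does not
```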

1 Like

Thanks, good question.

The withdrawn CHIP bundled a bunch of ideas together. Stack protection, sharing scripts between UTXOs, MAST, etc. are probably too advanced, and while I think they are cheap enough to add, there seems to be resistance because people don’t seem to need them today.
So the CHIP remains withdrawn. I’m not pushing for those ideas. I won’t object to good ideas being stolen, though.

Indeed, the idea of having an OP_DEFINE_VERIFY next to the proposed op-define was made specifically because the two opcodes have very different use cases.

Today we have p2sh, which basically means you have zero code in your output: you present all your code at unlocking time.

The combination of op_define and op_define_verify covers the basics of repeatedly calling a method, but the verify variant ties directly to the existing way p2sh is used.
What the verify variant allows is for people to create an output of type p2s, BUT one that runs code that may be kept secret until unlocking.
During unlocking (in the input) the code is pushed, and that is done securely because its hash is already on the blockchain.
So you have re-created p2sh from normal components, using a p2s output and an op-define-verify.
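
Sketched in Python below; the exact stack order and naming of op_define_verify here are my own assumption for illustration, not a spec. The point is only that the p2s output stores just the hash, and the function is registered only if the revealed body matches it:

```python
import hashlib

def hash256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

FUNCTION_TABLE: dict[int, bytes] = {}

def op_define_verify(slot: int, body: bytes, committed_hash: bytes) -> None:
    """Sketch: register the function only if the revealed body matches the
    hash committed in the locking script; otherwise fail, like a VERIFY."""
    if hash256(body) != committed_hash:
        raise ValueError("script failure: body does not match committed hash")
    FUNCTION_TABLE[slot] = body

# Locking time: the p2s output carries only the hash of the (secret) code.
secret_code = bytes.fromhex("5187")     # stand-in body, e.g. OP_1 OP_EQUAL
committed = hash256(secret_code)

# Unlocking time: the spender reveals the code; it is accepted only if it matches.
op_define_verify(0, secret_code, committed)          # accepted
# op_define_verify(0, b"attacker code", committed)   # would raise
```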

With this basic concept you can build out the ideas. For instance, you can create a poor man’s MAST by defining functions only inside an if-block, so you can omit that script from the unlocking data when it is not needed.

Additionally, you can use the verify feature to securely ‘pick’ data from another input on the same transaction, even doing things like OP_CAT and similar, because the end result is protected by the hash. This means that if I create a transaction with 10 outputs that reuse the same script, I can avoid copy-pasting it across all of them; instead a couple of opcodes are used to fetch it.
While the original (non-verify) opcode can do this too, an attacker could then spend it by building a transaction that supplies a new output with code that hijacks the original locking scripts, which would mean losing money. Using op_define_verify in your script instead solves that problem.

Naturally, the normal define does not go away, so all the things people have described before remain possible.

2 Likes

Here’s the link to the referenced Telegram discussion: Telegram: View @bchbuilders

It is useless either way, as @Jonas illustrated above (which seems to have gone over Tom’s head).

If we’d replace OP_DEFINE with OP_DEFINE_VERIFY, then people could still make scripts that run “untrusted” code by doing […] <any blob> OP_DUP OP_HASH256 <hardcoded hash> OP_DEFINE_VERIFY […]. I thought Tom was suggesting this option, which is why I strongly resisted it: it would hamper cases where you don’t need that specific kind of authentication, like when the blob push is in the locking script. There it is already committed and immutable, so there is no need for an additional hash verification wasting 32 bytes with each define. This is the same way legacy P2PK outputs don’t need to hash-verify the pubkey, and yet nobody can change the evaluated pubkey when spending the UTXO, because it is defined in the locking script rather than pushed by the spender as in the P2PKH case.

Btw, we already had a contract that footgunned itself by forgetting to authenticate what should be authenticated: the first version of tapswap had a bug where pat changed my contract from P2PK to P2PKH but forgot to add the hash verification. Thankfully nobody exploited it, and pat “upgraded” all contracts by using the exploit and moving people’s offers to the new version.

If we’d add OP_DEFINE_VERIFY alongside OP_DEFINE, then I wouldn’t object as strongly, because it’s just an optimization of a particular use of OP_DEFINE. It’s probably not worth the trouble, though: it would expand the scope of the Functions CHIP, and the extra bikeshedding could prevent it from getting over the line for ’26 activation. If Tom wants this in addition to OP_DEFINE, he should make a stand-alone CHIP to try to add it.

3 Likes

Nobody is suggesting to do that.

Apart from BCA, nobody seems to have been bikeshedding the idea of an additional verify opcode. It solves a lot of cases, including the GP one linked in another thread.

So if he says he won’t object (strongly), then I think there is rough consensus.

OK, based on your past discussions in eval/subroutines I thought you wanted to prohibit the non-authenticated case with plain OP_DEFINE, which is why I strongly resisted your OP_DEFINE_VERIFY. If it would just be an addition, then it’s on you to make the case for an extra opcode just to shave off 3 bytes from a particular bytecode pattern. Why not optimize P2PKH then? We could have <sig> <pubkey> <hash> OP_CHECK_PUBKEY_AND_SIG instead of <sig> <pubkey> OP_DUP OP_HASH160 <hash> OP_EQUALVERIFY OP_CHECKSIG. That would save a meaningful number of bytes, because P2PKH is something like 99% of transactions.

4 Likes

Given that it’s optional, it could very well be the case that no one uses it, which leaves GP’s qualms as they are.

I don’t think you are reading the room. Make this a separate CHIP.

3 Likes

@tom, good to hear that you are viewing OP_DEFINE_VERIFY as an addition to the OP_DEFINE / OP_INVOKE opcodes and not as a replacement for OP_DEFINE. It seems you are in agreement with the key aspects of ‘CHIP-2025-05 Functions’.

From what I understand, your main concern is about code/data that is external to the contract and the risks around executing it. Take the example of pulling in code from other UTXOs referenced by a transaction. In a future scenario, assuming we have read-only inputs & P2S, this technique could perhaps allow for a form of shared libraries on the blockchain (I’d like to hear other people’s views on whether this is realistic and desirable).

Let’s take my collection of elliptic curve functions as an example. If it was sitting in a UTXO (assuming it fits), then a script in a transaction referencing this UTXO could fetch the library code using OP_UTXOBYTECODE, extract the function bodies using OP_SPLIT, perform any necessary transformations, and then define the library functions in its own local function table.

However, one issue with using OP_DEFINE_VERIFY for this is the number of bytes needed for the hashes. The EC library has about 20 functions. To bring them in with OP_DEFINE_VERIFY would require ~20*32 = ~640 bytes of hash data. The library as a whole is ~700 bytes, so this would be a considerable overhead.

Compare this with just verifying a single 32-byte hash across the whole library and using plain OP_DEFINE to define the functions. I can see that you would value having the hash taken right before defining the function, to ensure that we e.g. got the OP_SPLIT and transformations right. But from a security standpoint I believe that having trusted code verify the library hash and then extract and define the functions is equivalent.
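
For what it’s worth, here is a small Python sketch of the byte accounting and of the “trusted wrapper” approach I mean: one hash check over the whole library blob, followed by splitting and defining each function with plain defines. The function sizes and names are made up for illustration:

```python
import hashlib

def hash256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

# Hypothetical: 20 functions of ~35 bytes each, roughly a 700-byte library.
functions = [bytes([i]) * 35 for i in range(20)]
library = b"".join(functions)

per_function_hashes = len(functions) * 32   # one 32-byte hash per define -> 640 bytes
single_library_hash = 32                    # one hash over the whole blob
print(per_function_hashes, single_library_hash)

# Trusted-wrapper approach: verify the whole blob once, then split and define.
committed_library_hash = hash256(library)

def import_library(blob: bytes) -> list[bytes]:
    if hash256(blob) != committed_library_hash:
        raise ValueError("library blob does not match committed hash")
    # stand-in for the OP_SPLIT sequence carving out each function body
    return [blob[i * 35:(i + 1) * 35] for i in range(20)]

assert import_library(library) == functions
```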

I think your point is that the presence of an OP_DEFINE_VERIFY opcode could nudge the developer to do the right thing and not forget to verify the hashes of external code. But just as a developer could forget to take a hash across the library, they could forget to use OP_DEFINE_VERIFY instead of OP_DEFINE (as @jonas also pointed out). So I’m not convinced about this opcode.

BTW, I saw your ‘OP_RUNSUB2’ proposal in the withdrawn chip and found it interesting. The fact that it requires transaction input validation to be serialized might be too much of a downside to make it viable though. Otherwise, a version of it could have been a way to greatly simplify accessing external functions.

6 Likes

Thanks for the review,

I agree that we’re honestly just scratching the surface wrt opportunities for scripting functions.

I have seen a lot of ideas that would probably need refinement and iteration to make them perfect. I’m comparing this to the “group” idea, which on its own would indeed have worked, but the iterations and hard work turned it into a great solution instead, and we now have FTs/NFTs that are stellar.

Your point about the library of functions being expensive if each function is hashed is one I also made in the past, while suggesting the concept called MAST, which would be a trivial addition and would allow some or many functions to be verified with only a minimal number of hash bytes.
Additionally, a hash for code should likely be allowed to be a 20-byte one, not a 32-byte one, turning the ~640 bytes of hash data in your setup into roughly 22 bytes instead.

That’s the point that seems to be unacknowledged: it really is not equivalent. And that is the main place where I expect people to start losing money.

Your using OP_SPLIT to get executable code out of a blob is not some limiting factor in a UTXO setup.

A p2s script gets stored on the blockchain until it is spent. If that script contains an eval that gets its code from anywhere other than the output that is already on the blockchain, it becomes a puzzle piece for which you need to find the matching piece. But that doesn’t have to be the original piece; any thief can simply create a new piece that fits. The moment they find something that fits, they can take the money stored in that UTXO.

People have attacked me over this, saying that nobody will ever make this mistake, while I actually expect it to be a very common mistake for people who use p2s / op-eval as a replacement for p2sh. Security is hard; mistakes are extremely common.
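
To make the failure mode concrete, here is a Python model of it; the names and the faked “execution” are mine, purely illustrative. If the locking side only says “run whatever code the spender pushes”, a thief doesn’t need the original code at all: any blob that succeeds fits the puzzle. Committing to the code’s hash is what removes that freedom:

```python
import hashlib

def hash256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def unsafe_lock(spender_code: bytes) -> bool:
    """Model of an eval that runs whatever the spender pushes.
    'Execution' is faked: any blob ending in 0x51 (OP_1) counts as success."""
    return spender_code.endswith(b"\x51")

def safe_lock(spender_code: bytes, committed: bytes) -> bool:
    """Same, but the revealed code must first match a hash fixed at locking time."""
    return hash256(spender_code) == committed and spender_code.endswith(b"\x51")

original = bytes.fromhex("76a951")       # the piece the author intended
committed = hash256(original)

attacker = b"\x51"                         # a trivial new piece that also "fits"
assert unsafe_lock(attacker)               # without the hash check, the thief spends
assert not safe_lock(attacker, committed)  # the commitment blocks the substitute
assert safe_lock(original, committed)      # while the original still unlocks
```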


Anyway, there is a lot that could still be improved in functions. We stole the initial idea from Gavin, realizing it was quite simplistic. Then we looked at Nexa and realized that one was over-engineered and limiting. The first iteration was OK, but I am happy to see that Jason took my idea of using op-define instead of op-pick to handle the code, which is absolutely a great improvement, and everyone agreed that it was.
So this CHIP is better than it was at the beginning of the year, but iterating on ideas could very likely have made it better still. I’ve tried, but I’ve openly been attacked in a way that makes it look like I didn’t like functions, while I’ve always been ambivalent. Very nasty politics there.

Yeah, Jason prefers long-form docs instead of posts here. As such, it was an amalgamation of ideas, not a polished solution, as the status indicates. It packed in a LOT of concepts. It’s been out there for the entire year, but its ideas have been mostly ignored. Well, the “define” idea has been taken, which was the most important one, so that makes it worth it.
Still, I feel we lost opportunities this year.

I don’t have anything against Jason’s Functions CHIP as it stands today, as the bar of it not hurting people has been reached. I do think we can do better, and ideally Jason decides to move it to next year to let it mature.

Arguably, most naive contract authors will likely be using higher-level tooling like cashscript or some other library to create their contracts. You’ve noted that your idea could be an additional opcode like OP_DEFINE_VERIFY (or maybe OP_INVOKE_VERIFY?). Couldn’t this just be a macro provided by some higher-level tooling?

In a MAST setup for this example, where would the Merkle proofs for the individual functions be stored? Each proof would be several hashes long.

You only need the Merkle root: one hash for all the functions you import. I think if you google “MAST Bitcoin” you’ll find some documentation on this design. There is a BIP, but that one is probably too technical for most.

The tooling people haven’t really shown an interest in this concept; the last message from Mathieu here on this topic was a “maybe”, and that was months ago.
It would be nice to get more people working across layers involved in solving problems; I do agree with you there.

Most of the past year’s discussions have been with jonas and bca vehemently disagreeing with the statement that authors could lose money by not verifying their inputs.
If we accept that this is indeed possible for a group of usages, we can move forward and try to solve it.

Maybe the best idea is to have an OP_DEFINE_MAST opcode for 2027; that may be the best bang for the buck. It depends a bit on people actually being interested in working on solving problems, rather than dismissing them as problems the experts won’t have.

That’s it. I’m out. Unclear if I’ll ever return here to BCR.

Don’t leave, you are very much valued by the community!

4 Likes

There have been a few MAST proposals, so it would be helpful if you linked to the one you are referring to that avoids Merkle proofs.

If we look at, for example, BIP 341 (Taproot), we can see that to spend a Taproot output one needs to provide a control block which contains the Merkle proof. It can hold up to 128 hashes (32 bytes each), depending on the size of the Merkle tree. This is what proves the inclusion of the script in the Merkle tree.

Probably the easiest way to understand it is to look at a block header. It has one Merkle root, while the number of transactions in a block is variable.

The Merkle tree is built by hashing the actual [data] (here a series of transactions) in a specified way. If you provide all the transactions, you need not provide any hashes; the Merkle root is there to verify that all the data is proper, byte for byte as expected.
Look up the size of the block header: it is a standard, unchangeable 80 bytes. That is because it only holds the Merkle root and no other hashes.

MAST as a way of verifying the content of scripts would work identically. Regardless of how many scripts you supply in the unlocking script, the MAST operation just hashes all of them into a tree, which results in a single hash that is then compared to the one stored on the blockchain in the output being unlocked.

What may be confusing is that Merkle trees have a second feature that is used in MAST (see the SPV chapter in the Bitcoin whitepaper): you can omit a piece of [data] at unlocking time and instead provide its hash, which may be useful in some cases.

Edit:

So, in short, MAST uses Merkle trees, but in the normal case they exist only in memory. They are not shipped; you don’t need to store them on chain.
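
A minimal Python sketch of what I mean, using Bitcoin-style double SHA-256 and made-up function bodies. Only the 32-byte root would need to sit in the output; the tree itself is recomputed in memory from whatever the spender supplies:

```python
import hashlib

def hash256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Bitcoin-style Merkle root: hash the leaves, then pairwise-hash upward,
    duplicating the last node whenever a level has an odd count."""
    level = [hash256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [hash256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

functions = [b"func-a", b"func-b", b"func-c"]   # stand-in function bodies
committed_root = merkle_root(functions)          # the only hash stored in the output

# At unlocking time, recompute from the supplied bodies and compare to the root.
assert merkle_root([b"func-a", b"func-b", b"func-c"]) == committed_root
assert merkle_root([b"func-a", b"evil", b"func-c"]) != committed_root
```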

Yes, Merkle trees can be represented by a single root hash. Inclusion proofs (Merkle proofs), however, consist of a Merkle path from the leaf node to the root, as illustrated in the Taproot BIP.
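
For completeness, a matching sketch of that side of it (same hashing convention as above; the proof encoding here is my own, purely illustrative): verifying one leaf against the root requires the sibling hashes along the path, which is the per-script overhead being discussed.

```python
import hashlib

def hash256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def verify_proof(leaf: bytes, proof: list[tuple[bytes, str]], root: bytes) -> bool:
    """Walk from the leaf to the root using the sibling hashes in the proof.
    Each proof element is (sibling_hash, 'L' or 'R'), giving the sibling's side."""
    node = hash256(leaf)
    for sibling, side in proof:
        node = hash256(sibling + node) if side == "L" else hash256(node + sibling)
    return node == root

# Tiny two-leaf tree: root = H(H(a) + H(b)).
a, b = b"func-a", b"func-b"
root = hash256(hash256(a) + hash256(b))

# Proving inclusion of `a` needs one sibling hash (H(b)) on the right.
assert verify_proof(a, [(hash256(b), "R")], root)
assert not verify_proof(b"evil", [(hash256(b), "R")], root)
```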

Thanks for engaging in conversation. I have what I need regarding OP_DEFINE_VERIFY. My conclusion is, as before, that a separate opcode is not justified but that tooling can define it as a macro.

1 Like