CHIP-2025-05 Functions: Function Definition and Invocation Operations

You misunderstood the problem people are trying to solve. I honestly don’t remember anyone suggesting a problem with code mutation.

The problem is about mixing data and code: turning data stack items into code stack items.
Both are not just allowed by your CHIP, they are specifically part of your design requirements.

You can, for instance, do these:

  • copy another output’s script, cut it up, paste things in, and then turn it into callable code.
  • have the user push data in the unlocking script and, without any check that it is “correct”, simply execute it.

I don’t mind you having your own terminology for things; the functionality is the point.
The functionality:

At the time the output is signed and broadcast, the code that is going to run at unlocking is likewise set and unchangeable.

You can meet that requirement in more than one way; the P2SH solution is to hash the code and store the hash on the blockchain. I think that works quite well.
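
For reference, the familiar P2SH (BIP16) template:

  locking script:   OP_HASH160 <20-byte hash of redeem script> OP_EQUAL
  unlocking script: <args...> <redeem script>
  // at spend time the node hashes the pushed redeem script, checks it
  // against the committed hash, and only then executes it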

The concerns are described in long form here: BitcoinCash/CHIP-subroutines: Declare and call subroutines with some new opcodes. - CHIP-subroutines - BitcoinCash Code
please read them, as you didn’t address them, and from your message it looks like you think you did, so I guess some re-reading would be in order.

If script authors can use OP_DEFINE together with introspection, they can also use this where applicable:
[…] OP_DUP OP_HASH256 <hardcoded hash> OP_EQUALVERIFY […]
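
Spelled out in context (a sketch only; the exact OP_DEFINE operand handling is per the CHIP):

  <function body>                                    // pushed by the spender in the unlocking script
  OP_DUP OP_HASH256 <hardcoded hash> OP_EQUALVERIFY  // abort unless it matches the committed hash
  <index> OP_DEFINE                                  // only now does the data become callable code
  […]
  <index> OP_INVOKE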

3 Likes

You don’t even understand the problem. Your main contribution to the topic has been noise.

7 Likes

I think Jason has a point here! And I like the original version of Functions. If we can somehow address General Protocol’s concerns and move on with this, it would be a wonderful upgrade for our ecosystem.

General Protocol article

I believe Jason understands the problem, as clearly stated in the GP article above. We need a productive solution, please. Time is limited, and we are near the finish line to ship this safely and demonstrate our trust in the VM limits upgrade as well.

3 Likes

Given that we can go the other way today (OP_ACTIVEBYTECODE, for example), I don’t see the horror here for a low-level language like Script. And as said many times before, it’s fundamentally possible to do this with emulators such as TurtleVM.

7 Likes

No, time is not limited. Maybe the amount of changes in one year is fine without this one; they are separate suggestions for a reason.
And most importantly, this issue has been on the table in clear language for 8 months. Bad planning on Jason’s side is no reason for haste on the part of the Bitcoin Cash community.

Maybe I’m unable to explain it well, which is why there is a long form that I’ve linked to.

I’ve said this now several years in a row to you specifically, BCA: if we could have a phone conversation, a much higher-bandwidth meeting, then maybe this is solvable in an hour. It probably was in the ABLA case, where you ended up apologizing for not understanding what I had been saying for 9 months and admitting that I was actually right… A phone call could have shaved off those 9 months and avoided a lot of problems, not to mention a lot of frustration all around.
I’m trying to be patient about it. Honestly, I am.

But it is frustrating that nobody seems to be able to understand the simple issue of code insertion, which I explained with an actual example. If anyone does, please repeat it in your own words so we can see where the confusion lies.

To be clear, the “issue” has not changed since day one. My argument is exactly the same as it has always been, which is why we use the git repo: to make clear it is immutable.

Bringing up “library distribution” (a concept that is in its infancy and can absolutely change in the future) is weird.
It may indicate that you are thinking about things from ONE specific expected-usage point of view. But as this is a low-level programming-language component, the usage that I explained is also possible and very likely to be how programmers will use it, as it is actually closer to how Bitcoin has operated for a decade.

Please just talk about this in a level-headed way so we can improve the proposal, as that is my intention.
Because it is possible to improve and fix, as long as we are able to talk about it without pointing fingers.

Thanks

The examples I’ve seen could easily be classified as programmer errors. Someone being able to write a broken, exploitable contract shouldn’t be an obstacle to new functionality; by that standard we wouldn’t be able to deploy any VM features at all.

6 Likes

Hehe, you’re not wrong. But “programming” is a 60-year-old profession, and as those who have been here a long time know, the entire space is geared towards improving the status quo for programmers.
The main reason new languages get released all the time is exactly this: improving the experience for programmers. Less ability to shoot yourself in the foot. (That’s a 1980s programmer reference, apologies.)

I’m not sure why there is any push-back at all here.
It’s not like I’m asking for much. It’s just that Jason doesn’t acknowledge he got ideas from me, or something, while I did applaud him for taking my suggested approach. I don’t really care about getting credit for my many solutions already deployed in BCH, but it would be nice to not get push-back every, EVERY single time.

And I have given a minimally intrusive (as in: zero things taken away) solution, which doesn’t seem to get discussed either…

So I insist on trying to improve things. Maybe others will join in and we can all have a great Bitcoin Cash scripting engine.

Tom, I may have missed some context, so to be fully clear:

  1. Is the solution you are advocating for to add OP_DEFINE_VERIFY in addition to the two opcodes (OP_DEFINE / OP_INVOKE) in CHIP-2025-05 Functions? Or is it to replace OP_DEFINE with OP_DEFINE_VERIFY?

  2. Is the above your preferred solution, or are you still in favor of multi-byte OP_DEFINE as in your withdrawn CHIP?

1 Like

That’s applicable to high-level languages.
BCH Script already contains foot-guns (in the name of flexibility), and it’s plain weird to reserve an opcode for adding some “safety” to one specific use-case.

If a UTXO is locked and the script requires you to provide some data and a signature, you need to validate the pubkey.
If a UTXO is locked and the script requires you to provide a specific NFT commitment, you need to validate the category.
If a UTXO is locked and the script requires you to provide a specific code blob, you need to validate the hash.
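
In script terms the three cases have the same shape (sketches; the OP_DEFINE operand order is illustrative):

  // 1. spender pushes <sig> <pubkey>; the script validates the pubkey:
  OP_DUP OP_HASH160 <pubkey hash> OP_EQUALVERIFY OP_CHECKSIG
  // 2. spender provides an NFT input; the script validates the category:
  <input index> OP_UTXOTOKENCATEGORY <expected category> OP_EQUALVERIFY
  // 3. spender pushes <code blob>; the script validates the hash:
  OP_DUP OP_HASH256 <code hash> OP_EQUALVERIFY <index> OP_DEFINE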

Why do we need extra belt and suspenders for the last example?

Validating “unknown” code with a hash is just one way of trusting external code; another might be to validate that a specific NFT is used as an input, or that the code is signed by a known pubkey. Do we need extra opcodes for those as well?

1 Like

Thanks, good question.

The withdrawn CHIP bundled a bunch of ideas together. Stack protection, sharing scripts between UTXOs, MAST, etc. are probably too advanced, and while I think they are cheap enough to add, there seems to be resistance because people don’t seem to need them today.
So the CHIP remains withdrawn. I’m not pushing for those ideas. I won’t object to good ideas being stolen, though.

Indeed, the idea of having an OP_DEFINE_VERIFY next to the proposed OP_DEFINE is specifically because the two opcodes have very different use cases.

Today we have P2SH, which basically means you have zero code in your output: you present all your code at unlocking time.

The combination of OP_DEFINE and OP_DEFINE_VERIFY allows the basics of repeatedly calling a method, but the verify variant ties directly into the existing P2SH way of working.
What the verify variant allows is for people to create an output of type P2S that nevertheless runs code which may be held secret until unlocking.
During unlocking (in the input) the code is pushed, and that is done securely because its hash is already on the blockchain.
So you have re-created P2SH from ordinary components, using a P2S output and an OP_DEFINE_VERIFY.
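
A sketch of that construction, assuming OP_DEFINE_VERIFY pops a hash and a body and fails unless the body hashes to it:

  // unlocking script (input): push the code that was kept secret until now
  <function body>
  // locking script (plain p2s output, already on the blockchain):
  <hash of function body> OP_DEFINE_VERIFY  // fails if the pushed body doesn't match
  <index> OP_INVOKE                         // run the now-authenticated code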

With this basic concept you can build out the ideas. For instance, you can create a poor man’s MAST by defining functions only inside an if-block, so that a branch’s script can be omitted from the unlocking data when it is not needed.
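
For example (same hypothetical semantics):

  // unlocking: push only the body of the branch that will actually run
  <body of chosen branch> <branch selector>
  // locking:
  OP_IF
    <hash of branch A> OP_DEFINE_VERIFY
  OP_ELSE
    <hash of branch B> OP_DEFINE_VERIFY
  OP_ENDIF
  <index> OP_INVOKE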

Additionally, you can use the verify feature to securely ‘pick’ data from another input in the same transaction, even after doing things like OP_CAT, because the end result is protected by the hash. This means that if I create a transaction with 10 outputs that reuse a script, I can avoid copy-pasting that script across all of them; instead, a couple of opcodes fetch it.
While the original (non-verify) opcode can do this too, without verification a thief could spend such an output by building a transaction that supplies a new output with code that hacks the original locking scripts, which would mean losing money. Using OP_DEFINE_VERIFY in your script instead solves that problem.
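
Roughly (a sketch; OP_UTXOBYTECODE and OP_SPLIT already exist today):

  <0> OP_UTXOBYTECODE                       // fetch the locking bytecode of input 0
  <offset> OP_SPLIT OP_NIP                  // optionally cut out just the shared function
  <hash of expected code> OP_DEFINE_VERIFY  // a substituted input cannot fool this check
  <index> OP_INVOKE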

Naturally, the normal define does not go away, so all the things people have explained before remain possible.

2 Likes

Here’s the link to the referenced Telegram discussion: Telegram: View @bchbuilders

It is useless either way, as @Jonas illustrated above (which seems to have gone over Tom’s head).

If we’d replace OP_DEFINE with OP_DEFINE_VERIFY, then people could still make scripts that run “untrusted” code by doing […] <any blob> OP_DUP OP_HASH256 <hardcoded hash> OP_DEFINE_VERIFY […]. I thought Tom was suggesting this option, which is why I strongly resisted it: it would hamper cases where you don’t need that specific kind of authentication, like when the blob push is in the locking script. There it is already committed and immutable, so there is no need for an additional hash verification that wastes 32 bytes with each define. In the same way, legacy P2PK outputs don’t need to hash-verify the pubkey, and yet nobody can change the evaluated pubkey when spending the UTXO, because it’s defined in the locking script rather than pushed by the spender as in the P2PKH case.

Btw, we already had a contract that footgunned itself by forgetting to authenticate what should be authenticated: the first version of tapswap had a bug where pat changed my contract from P2PK to P2PKH but forgot to add the hash verification. Thankfully nobody exploited it, and pat “upgraded” all contracts by using the exploit himself and moving people’s offers to the new version.

If we’d add OP_DEFINE_VERIFY alongside OP_DEFINE, then I wouldn’t object as strongly, because it’s just an optimization of a particular use of OP_DEFINE. It’s probably not worth the trouble, though: it would expand the scope of the Functions CHIP, and the extra bike-shedding could prevent it from getting over the line for ’26 activation. If Tom wants this in addition to OP_DEFINE, he should make a stand-alone CHIP to try to add it.

3 Likes

Nobody is suggesting doing that.

Apart from BCA, nobody seems to have been bike-shedding the idea of an additional verify opcode. It solves a lot of cases, including the GP one linked in another thread.

So if he says he won’t object (strongly), then I think there is rough consensus.

OK, based on your past discussions in eval/subroutines I thought you wanted to prohibit the non-authenticated case of plain OP_DEFINE, which is why I strongly resisted your OP_DEFINE_VERIFY. If it would just be an addition, then it’s on you to make the case for an additional opcode that merely shaves 3 bytes off a particular bytecode pattern. Why not optimize P2PKH then? We could have <sig> <pubkey> <hash> OP_CHECK_PUBKEY_AND_SIG instead of <sig> <pubkey> OP_DUP OP_HASH160 <hash> OP_EQUALVERIFY OP_CHECKSIG. That would save a meaningful number of bytes, because P2PKH is something like 99% of transactions.

4 Likes

Given that it’s optional, it could very well be that no one uses it, which leaves GP’s qualms exactly as they are.

I don’t think you are reading the room. Make this a separate CHIP.

3 Likes

@tom, good to hear that you view OP_DEFINE_VERIFY as an addition to the OP_DEFINE / OP_INVOKE opcodes and not as a replacement for OP_DEFINE. It seems you are in agreement with the key aspects of ‘CHIP-2025-05 Functions’.

From what I understand, your main concern is about code/data that is external to the contract and the risks of executing it. Take the example of pulling in code from other UTXOs referenced by a transaction. In a future scenario, assuming we have read-only inputs & P2S, this technique could perhaps allow a form of shared libraries on the blockchain (I’d like to hear other people’s views on whether this is realistic and desirable).

Let’s take my collection of elliptic-curve functions as an example. If it was sitting in a UTXO (assuming it fits), then a script in a transaction referencing this UTXO could fetch the library code using OP_UTXOBYTECODE, extract the function bodies using OP_SPLIT, perform any necessary transformations, and then define the library functions in its own local function table.
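
Concretely, something like this (a sketch; the input index, offset and length are hypothetical):

  <1> OP_UTXOBYTECODE        // fetch the library UTXO's bytecode
  <offset> OP_SPLIT OP_NIP   // drop everything before the function body
  <length> OP_SPLIT OP_DROP  // drop everything after it
  <index> OP_DEFINE          // register the body in the local function table
  // ...repeated for each of the ~20 functions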

However, one issue with using OP_DEFINE_VERIFY for this is the number of bytes needed for the hashes. The EC library has about 20 functions; bringing them in with OP_DEFINE_VERIFY would require ~20 × 32 = ~640 bytes of hash data. The library as a whole is ~700 bytes, so this would be a considerable overhead.

Compare this with verifying a single 32-byte hash across the whole library and using plain OP_DEFINE to define the functions. I can see that you would value having the hash taken right before defining each function, to ensure that we e.g. got the OP_SPLIT and transformations right. But from a security standpoint, I believe having trusted code verify the library hash and then extract and define the functions is equivalent.
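
I.e. (sketch):

  <1> OP_UTXOBYTECODE
  OP_DUP OP_HASH256 <library hash> OP_EQUALVERIFY  // one 32-byte hash authenticates the whole library
  // ...then OP_SPLIT out and OP_DEFINE each function from the verified blob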

I think you are counting on the presence of an OP_DEFINE_VERIFY opcode nudging the developer to do the right thing and not forget to verify hashes of external code. But just as a developer could forget to take a hash across the library, they could forget to use OP_DEFINE_VERIFY instead of OP_DEFINE (as @jonas also pointed out). So I’m not convinced about this opcode.

BTW, I saw your ‘OP_RUNSUB2’ proposal in the withdrawn CHIP and found it interesting. The fact that it requires transaction input validation to be serialized might be too much of a downside for it to be viable, though. Otherwise, a version of it could have been a way to greatly simplify access to external functions.

6 Likes

Thanks for the review,

I agree that we’re honestly just scratching the surface wrt opportunities for scripting functions.

I have seen a lot of ideas that would probably need refinement and iteration to make them perfect. I’m comparing this to the “group” idea, which on its own would indeed have worked, but the iterations and hard work turned it into a great solution instead, and we now have FTs/NFTs that are stellar.

Your point about the library of functions being expensive if each is hashed is one that I also made in the past, while suggesting the concept called MAST, which would be a trivial addition allowing some or many functions to be verified with only a minimal number of hash bytes.
Additionally, a hash for code should likely be allowed to be 20 bytes, not 32, making the hash overhead in your 700-byte setup ~22 bytes instead of ~640.

That’s the point that seems to go unacknowledged: it really is not equivalent. And that is the main point where I expect people to start losing money.

You using OP_SPLIT and deriving executable code from it is not some limiting factor in a UTXO setup.

A P2S script is stored on the blockchain until it is spent. If that script contains an eval that gets its code from anywhere other than the output that is already on the blockchain, it becomes a puzzle piece that you need to find the matching piece for. But that doesn’t have to be the original piece: any thief can just create a new piece that fits. The moment they find something that fits, they can take the money stored in that UTXO.
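
The mistake looks innocent enough (sketch; operand order illustrative):

  // VULNERABLE p2s locking script: no hash check before defining
  <index> OP_DEFINE <index> OP_INVOKE  // body comes straight from the unlocking script
  // a thief "solves the puzzle" by pushing any body that leaves TRUE on the stack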

People have attacked me, saying that nobody will ever make this mistake, while I actually expect it to be a very common mistake for people who use P2S / OP_EVAL as a replacement for P2SH. Security is hard; mistakes are extremely common.


Anyway, there is a lot that can still be improved in Functions. We stole the initial idea from Gavin, realizing it was quite simplistic. Then we looked at Nexa and realized that one was over-engineered and limiting. The first iteration was OK, but I am happy to see that Jason took my idea of using OP_DEFINE instead of OP_PICK to handle the code, which is absolutely a great improvement, and everyone agreed it was.
So this CHIP is better than it was at the beginning of the year, but iterating on the ideas could very likely have made it better still. I’ve tried, but I’ve openly been attacked in a way that makes it look like I didn’t like Functions, while I’ve always been ambivalent. Very nasty politics there.

Yeah, Jason prefers long-form docs instead of posts here. As such, it was an amalgamation of ideas, not a polished solution, as its status indicates. It packed in a LOT of concepts. It’s been out there for the entire year, but its ideas have been mostly ignored. Well, the “define” idea has been taken, which was the most important one; that makes it worth it.
Yet I feel we lost opportunities this year.

I don’t have anything against Jason’s Functions CHIP as it stands today, as the bar of it not hurting people has been reached. I do think we can do better, and ideally Jason decides to move it to next year to let it mature.

Arguably, most naive contract authors will likely be using higher-level tooling like CashScript or some other library to create their contracts. You’ve noted that your idea could be an additional opcode like OP_DEFINE_VERIFY (or maybe OP_INVOKE_VERIFY?). Couldn’t this just be a macro provided by some higher-level tooling?

In a MAST setup for this example, where would the Merkle proofs for the individual functions be stored? Each proof would be several hashes long.