CHIP 2021-05 Targeted Virtual Machine Limits

The specification should state at what point during execution the stack memory usage limit is enforced


I would suggest the specification state that the 130,000-byte limit is enforced after the currently-executed opcode completes.

Why specify this? Because it makes the behavior unambiguous and future-proofs the specification.

Note: the current stack depth limit of 1,000 items is likewise only enforced after execution of the current opcode completes.

Further rationale: we may introduce future opcodes that are complex and that temporarily exceed the limits, only to drop back below them once the opcode completes. As such, the limit should be specified to apply at a specific point, and it makes sense to mirror the current behavior of the stack depth limit (which is only applied after the current opcode completes execution).
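To illustrate the suggested enforcement point, here is a minimal sketch; it is not taken from the CHIP or any node implementation, and names like `ExecuteOpcode`, `StackMemoryUsage`, and `MAX_STACK_MEMORY_USAGE` are hypothetical:

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <stdexcept>
#include <vector>

using StackItem = std::vector<uint8_t>;
using Stack = std::vector<StackItem>;

constexpr size_t MAX_STACK_DEPTH = 1000;          // existing item-count limit
constexpr size_t MAX_STACK_MEMORY_USAGE = 130000; // proposed byte limit

// Total bytes held by all items currently on the stack.
size_t StackMemoryUsage(const Stack& stack) {
    return std::accumulate(stack.begin(), stack.end(), size_t{0},
        [](size_t sum, const StackItem& item) { return sum + item.size(); });
}

// Stand-in for real opcode execution; a real opcode may temporarily push
// intermediate results that exceed the limits while it runs.
void ExecuteOpcode(uint8_t opcode, Stack& stack) {
    stack.push_back(StackItem{opcode}); // placeholder behavior
}

void EvalScript(const std::vector<uint8_t>& script, Stack& stack) {
    for (uint8_t opcode : script) {
        ExecuteOpcode(opcode, stack);
        // Both limits are enforced only here, after the opcode completes,
        // mirroring how the 1000-item depth limit behaves today.
        if (stack.size() > MAX_STACK_DEPTH)
            throw std::runtime_error("stack depth limit exceeded");
        if (StackMemoryUsage(stack) > MAX_STACK_MEMORY_USAGE)
            throw std::runtime_error("stack memory usage limit exceeded");
    }
}
```

The key point is only that both checks run after `ExecuteOpcode` returns, never in the middle of an opcode.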

4 Likes

In the spec, what is the definition of an “evaluation” for the purposes of hitting the hash ops limit? For example, a P2SH spend:

(1) evaluates the locking script, hashing the redeem script via e.g. OP_HASH160 or whatever;
(2) then performs another evaluation using the redeem script (which may itself contain further hashing opcodes).

When (2) begins evaluating the redeem script, is the hash count reset to 0, or does it continue from where (1) left off?

EDIT: So far in my test implementation, I am having it not reset to 0 as it proceeds to the redeem script, so the hashOps count is 1 when the redeemScript begins execution in the P2SH case.

EDIT2: For the hash limit: OP_CHECKSIG and friends don’t count towards the hash ops limit, right?
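For reference, this is roughly how my test implementation treats it, as a simplified sketch (not an excerpt from any node; `ScriptExecutionMetrics`, `EvalScript`, and `VerifyP2shInput` are names made up for illustration):

```cpp
#include <cstdint>
#include <vector>

// Hashing opcodes (standard Bitcoin script values).
constexpr uint8_t OP_RIPEMD160 = 0xa6, OP_SHA1 = 0xa7, OP_SHA256 = 0xa8,
                  OP_HASH160 = 0xa9, OP_HASH256 = 0xaa;

struct ScriptExecutionMetrics {
    uint32_t hashOps = 0; // cumulative count for the whole input
};

// Toy evaluator for illustration: it only counts the dedicated hashing
// opcodes (as my test implementation currently does); a real evaluator would
// execute each opcode and check the limit as it goes.
bool EvalScript(const std::vector<uint8_t>& script, ScriptExecutionMetrics& metrics) {
    for (uint8_t opcode : script) {
        if (opcode >= OP_RIPEMD160 && opcode <= OP_HASH256) ++metrics.hashOps;
    }
    return true;
}

bool VerifyP2shInput(const std::vector<uint8_t>& scriptSig,
                     const std::vector<uint8_t>& lockingScript,
                     const std::vector<uint8_t>& redeemScript) {
    ScriptExecutionMetrics metrics; // one counter shared across all evaluations
    if (!EvalScript(scriptSig, metrics)) return false;
    // (1) Locking-script evaluation: its OP_HASH160 bumps hashOps to 1.
    if (!EvalScript(lockingScript, metrics)) return false;
    // (2) Redeem-script evaluation: the same metrics object is reused, so it
    //     starts with hashOps == 1 instead of being reset to 0.
    return EvalScript(redeemScript, metrics);
}
```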

3 Likes

I’m late to the party, but I wanted to share: I brought myself up to date with the CHIP, and notably the 130 kB limit lacks a rationale.

I understand that it preserves the current maximum limit, but it is not stated why we’d want to keep the current maximum.

Thank you in advance, and sorry if this has already been addressed; the thread is way too long to parse in one go!

2 Likes

It has a rationale. Please read the spec section: GitHub - bitjson/bch-vm-limits: By fixing poorly-targeted limits, we can make Bitcoin Cash contracts more powerful (without increasing validation costs) – in particular, expand the “Selection of 130,000 Byte Stack Memory Usage Limit” bullet point.

In summary: 130 KB is the limit we implicitly have now; the new explicit limit just preserves the status quo.

2 Likes

I worry now that with 10 KB pushes, it might be possible for miner-only txns to contain data blobs in the scriptSig. We have a 1,650-byte limit on scriptSig as a relay rule, but what do people think about moving that limit to consensus, perhaps as part of a separate CHIP?

1 Like

Without having done the in-depth research probably required, I am in favour of bringing consensus rules into alignment with relay rules; I’m still not entirely sure why they’re different. Maybe there is a good reason, and Chesterton’s Fence says not to fuck that up, but it seems like an area the BCH community should be looking into anyway. Of course, we’re all unreasonably busy.

2 Likes

My two cents: FWIW I am 10000% in favor of just tightening consensus rules to 100% match relay rules now. The fact that the two differ seems to me like the original Bitcoin devs (before they were captured)… were deprecating some things and intended to remove them. I think it would make life a lot easier on everybody if consensus tightened to precisely match relay: no surprises, and no possibility of perverse incentives. Consensus rules are currently so liberal that they may end up allowing for… shenanigans. Better to plug that hole.

3 Likes

I agree.

I am still waiting for somebody to give a good reason “why not”.

Until I hear some strong arguments against, I will remain in support of this.

3 Likes

Two reasons.

First, it is a pipe dream. You can take all of today’s relay rules and make them consensus rules, but tomorrow someone may introduce new relay rules. You’re back to square one.
Or, in other words: if your intention is to equalize a decentralized system, you’re doing it wrong.

The second reason is that the standard rules (here called relay rules for some reason) are a tightening of the consensus rules, mostly beyond what is reasonable in a proper free-market system. To be clear, we don’t actually have a free-market system yet, which is why those tightened rules make sense today.

A set of properties currently cannot be adjusted by the free market. There isn’t enough volume, for one, but more importantly, no software exists that allows those properties to be taken into account when doing Bitcoiny things (like mining).

Make those tools, then make properties like OP_RETURN size, script type, script size, etc. limitable by the free market, and you don’t need them as standard rules anymore, and certainly not as consensus rules.

I don’t understand; I think you meant the reverse.

Take all the consensus rules and copy them to the relay rules (i.e. source the relay rules from the consensus rules).

Isn’t that it?

Then you achieve 100% coherency in the decentralized system, and you also increase consistency, since the system is more “obvious” and transparent about everything.

You seem to think that “standard” rules (which you call relay rules) are somehow set by a central party rather than being decentralized themselves.

The reality is that any miner or merchant can add new rules to their node on which they reject transactions.

To repeat:

if your intention is to equalize a decentralized system, you’re doing it wrong.

No, you’re not. The entire reason for a blockchain is to have decentralised agreement on an “equal” set of transactions.

Some things can be user adjustable sure. But some things can’t. We’re just discussing what falls into each category.

Maybe today there is quite a lot of flexibility in relay rules. In future, perhaps the “default” will be a much less flexible set, with more of it being consensus.

Does that mean we should or could somehow BAN people doing stuff on their own node? No, of course not. Does it mean we could benefit from having more sensible defaults so that the likelihood of discrepancies is smaller? Yes, absolutely.

Same as ABLA. Do you NEED to use ABLA, could you instead just manually tweak your own blocksize to be above the limit? Yes. Is that what most people are likely to do? No, they’ll just use ABLA & that’s brilliant.

Having better sensible defaults is not the same as attempting to ban or deny some kind of configurability. Having sensible defaults is a really, really good idea (and trying to prevent progress on better defaults by conflating them with an attempt to ban alternatives is not).

2 Likes

Yup, anyone can patch their node to make their own node’s relay rules tighter than consensus.

Can we really expect people to start experimenting with relaxing relay rules on their own? AFAIK the only precedent for this happened with Inscriptions/Ordinals, where some pools created a way to get non-standard TXs mined.

Those are what you call consensus rules. Relay rules are more like: “I won’t bother with this TX for now, even though it is valid. I’ll get it later if it gets mined.”

This. If the default is that relay rules are much stricter than consensus, then some pools relaxing them on their own could put others at a disadvantage, because everyone else would have to download a bunch of “unseen” TXs once the more lax pool announced its block.

Q re. impact on DSPs: if some edge node sees a TX that violates its relay rules and tries to double-spend a TX which was relay-compliant, will it still generate and propagate a DSP?

If the default is that relay rules == consensus, then whoever patches their node to make relay more strict puts themselves at a disadvantage (because they’d have to download a bunch of TXs all at once on block announcement).

You might want to refer to the earlier post instead where this was addressed as “second reason”:

I don’t know; the only reason I pointed this out is because the GOAL of equalizing a decentralized system is not really tenable. It should not be a goal, just like you don’t have as a goal to create a language that all people speak. The moment you think you’ve succeeded, someone WILL change something. So, I repeat: if your GOAL is to equalize a decentralized system (a group of random people), you’re doing it wrong. Instead, embrace the chaos and make sure you’re capable of dealing with failures.
This is like making sure all cars drive the same speed: they won’t, not as long as people are driving them.

Oh, I get your point now.

Sure, what you are saying might happen, but it most probably won’t.

Miners are the most rule-following part of the BTC and BCH ecosystem. That has been the pattern for the last 8 years, and I am not seeing it change.

So, what will happen is that miners will follow whatever rules BCHN sets by default, like they follow the fee inclusion rules right now, unless we break stuff and make life hard for them.

So, historically and psychologically/socially your argument is unfortunately completely incorrect.

It is usually a good decision to increase consistency in a decentralized system; it makes it easier for system participants to stay in sync, assuming they will follow, which they most probably will.

1 Like

:+1:

This is why increasing consistency and transparency is the way. Having an inconsistent or non-transparent decentralised system is a quick way to create chaos and trouble for ourselves in the future, like in the example you gave.

Sourcing the TX relay rules from TX fee inclusion rules is one of the ways to create more transparency and consistency.

A DSProof is generated by a full node that already has one of the conflicting transactions in its mempool when the second, conflicting transaction is received. All nodes will propagate a DSProof as long as at least one of the two transactions is in their mempool.

So, to answer your question: yes, all nodes will propagate the DSProof in your scenario.
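In sketch form (simplified and hypothetical; these are not BCHN’s actual DSProof data structures or API), the relay decision is just:

```cpp
#include <array>
#include <cstdint>
#include <functional>

using TxId = std::array<uint8_t, 32>; // simplified transaction identifier

// Simplified stand-in for a double-spend proof: in reality the proof commits
// to the spent output and the conflicting spenders, but for the relay
// decision all that matters is which transactions it concerns.
struct DoubleSpendProofRef {
    TxId firstSpender;
    TxId secondSpender;
};

// Relay a received DSProof only if at least one of the two conflicting
// transactions is currently in our mempool.
bool ShouldRelayDsProof(const DoubleSpendProofRef& proof,
                        const std::function<bool(const TxId&)>& mempoolContains) {
    return mempoolContains(proof.firstSpender) || mempoolContains(proof.secondSpender);
}
```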

1 Like

On BCHN Slack, I asked what’s the status of this, and Calin said:

i’m still waiting on Jason for the final spec… but we have benchmarked some things, and so far he’s pleased that stuff that is slow for him in libauth is slow in BCHN and stuff that is fast in libauth is fast in BCHN, so he’s confident he can use libauth’s relative performance to come up with the spec.

@bitjson could you give us an update, let’s bring this over the finish line?

3 Likes

I will be doing a podcast episode with @bitjson this Friday to discuss this in detail. I believe he plans to time it somewhat with the release of an updated version, so you can look forward to that.

4 Likes

Hey everyone, thanks for your patience while I was working on this over the past few months.

I’m still working on cleaning up a PR to the CHIP; I’ll hopefully have something up shortly. There was some related discussion on Telegram recently, so I just wanted to post some notes here too:

  • My 2021 benchmarks considered hashing to be much more costly relative to other operations, so I had misjudged the impact of increasing the stack item length limit on the cost of other operations. Hashing is still generally the worst, but there’s another class of operations that realistically needs to be limited too.
  • However: it turns out that simply limiting the cumulative bytes pushed to the stack is a nearly perfect way to measure both computation cost and memory usage. I’m just calling this “operation cost” for now.
  • O(n^2) arithmetic operations (MUL, DIV, MOD) also need to be limited: it works well to simply add a.length * b.length to their “cost”.
  • Hashing I still think is wise to limit individually, but it should also be added to the cost (right now I’m adding the 64-byte chunk size * the number of hash digest iterations).
  • And then, to avoid transactions being able to max out both the operation cost and sigchecks, we’ll want to add to the operation cost appropriately when we increment sigchecks. (I’m still benchmarking this, but simply adding the length of the signing serialization seems very promising, and it even resolves the OP_CODESEPARATOR concerns.)

I’m still working on the precise multiplier(s) for operation cost (the goal is to derive them from just beyond the worst case of what is currently possible, maybe even reining some things in via standardness), but it will simply be “N pushed bytes per spending transaction byte”.

I also now agree with @tom’s earlier ideas that we don’t need the 130 KB limit, or even to track stack usage at all. Limiting the density of pushing to the stack is both simpler and more than sufficient for limiting both maximum overall memory usage and compute usage. A rough sketch of the accounting is below.
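To make that concrete, here’s a rough sketch of what the tracking could look like; every name here (`OperationCostTracker`, `costPerTxByte`, etc.) is hypothetical, and the final multiplier is still being benchmarked, so it’s left as a parameter:

```cpp
#include <cstddef>
#include <cstdint>

class OperationCostTracker {
public:
    // The budget scales with the spending transaction's size: N "cost" units
    // allowed per spending-transaction byte (N still to be determined).
    OperationCostTracker(size_t spendingTxSize, size_t costPerTxByte)
        : budget_(uint64_t{spendingTxSize} * costPerTxByte) {}

    // Every byte pushed to the stack adds to cost; this alone approximates
    // both memory usage and most compute.
    void AddPush(size_t pushedBytes) { cost_ += pushedBytes; }

    // O(n^2) arithmetic (OP_MUL, OP_DIV, OP_MOD): add a.length * b.length.
    void AddBigIntOp(size_t aLen, size_t bLen) { cost_ += uint64_t{aLen} * bLen; }

    // Hashing: 64-byte chunk size times the number of digest iterations
    // (hashing would still also have its own separate limit).
    void AddHashing(size_t digestIterations) { cost_ += 64 * uint64_t{digestIterations}; }

    // Signature checks: add the signing-serialization length, so a transaction
    // cannot max out both sigchecks and the rest of the operation-cost budget.
    void AddSigCheck(size_t signingSerializationLength) { cost_ += signingSerializationLength; }

    bool WithinBudget() const { return cost_ <= budget_; }

private:
    uint64_t cost_ = 0;
    uint64_t budget_;
};
```

During evaluation, each push, big-int operation, hash invocation, and sigcheck would call the matching method, and validation would fail as soon as WithinBudget() returns false.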

I’ll post a link here as soon as I can get a PR to the CHIP open. For now you can review the benchmark work for:

5 Likes