CHIP 2021-05 Targeted Virtual Machine Limits

Nice! :+1:

Since when is this a thing?

Can you point me to the discussion or a commit?

Upgrade spec: https://documentation.cash/protocol/blockchain/transaction-validation/block-level-validation-rules.html#sigchecks

2 Likes

Everything is looking good so far :white_check_mark:. I am at peace.

…but if anything goes seriously wrong, I will be the “I told you so guy” in 5 years.

1 Like

Hi everyone, thanks for all the messages and feedback. I just pushed up two small but substantive corrections:

Thanks again to @cculianu for encouraging me to reconsider basing densities on input length rather than on transaction length. Vastly superior, and now documented in the rationale.

I also want to highlight that the operation cost change is a great example of our defense in depth strategy working as intended. :fire:

Basically: after some benchmarking of bigint ops (up to 10,000 bytes) on Libauth and BCHN, I realized we shouldn’t “discount” something (VM number encoding) that should be very cheap on well-optimized implementations (BCHN) but is a hassle to optimize in other implementations. (If we make it “cheap” because it can be cheap, we’ve forced all future implementations to implement the optimization. So now this CHIP always takes the more conservative approach.)

Anyways, it’s notable that even in the worst case, this correction only slightly improves the alignment between the consensus-measured “operation cost” and real-world runtime performance on a non-optimized VM (worst case: less than 2x the expected validation time).

If our limits were more naive, for some implementations this issue could easily produce an order-of-magnitude difference between cost and real-world performance (and if it were bad enough, maybe an implementation-specific exploit). However, our limits are also designed to catch unknown, implementation-specific issues (I just clarified this further in the rationale):

Because stack-pushed bytes become inputs to other operations, limiting the overall density of pushed bytes is the most comprehensive method of limiting all sub-linear and linear-time operations (bitwise operations, VM number decoding, OP_DUP , OP_EQUAL , OP_REVERSEBYTES , etc.); this approach increases safety by ensuring that any source of linear time complexity in any operation (e.g. due to a defect in a particular VM implementation) cannot create practically-exploitable performance issues (providing defense in depth) and reduces protocol complexity by avoiding special accounting for all but the most expensive operations (arithmetic, hashing, and signature checking).

This was some nice unexpected validation of our approach, so I wanted to highlight it here.
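As a quick illustration of why this works (a sketch only; the 100-unit base instruction cost matches the OP_MUL formula quoted later in this thread, but treat it here as an assumption): an operation like OP_DUP re-pushes the top stack item, so its measured cost automatically scales with the item’s length, with no per-opcode special case needed.

```js
// Illustrative only (not spec text): pushed-byte accounting makes
// linear-time operations self-limiting without per-opcode cost tables.
const opDupCost = (topItemLength, baseInstructionCost = 100) =>
  baseInstructionCost + topItemLength; // cost grows with the bytes pushed

opDupCost(10_000); // => 10100: a 10,000-byte OP_DUP pays for its own copy
```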

3 Likes

This is definitely a possibility, though I think we should try to avoid it if possible: right now we’re in the enviable position of not having to even account for computation in the “pricing” of transactions. That means transaction fee-rate checking can occur at a higher level than the level which validates the VM bytecode in TX inputs. It probably wouldn’t be a huge issue in practice to muddy TX fee-rate checking with VMB evaluation, but avoiding that makes the most of our architectural advantages over systems like EVM (which are architecturally forced to incorporate computation into pricing).

That said, I certainly see a place for a future upgrade which offers a higher per-byte budget based on the actual fee-rate paid, so e.g. fee rates of 2 sat/byte would get a budget slightly higher than 2x the 1 sat/byte budget (to encourage economizing on bandwidth rather than forcing extremely expensive computations to pad their contract length). But I avoided that additional complexity in this CHIP to cut scope. Once people are creating significant contract systems using ZKPs or homomorphic encryption, the topic can be revisited by a future CHIP, ideally with some empirical evidence of how much bandwidth the network would save by offering those users more “compute” for a higher fee-rate. As it stands, authors would simply need to re-arrange cross-input data or pad their contracts with extra unused bytes to afford a higher budget, and we’ll be able to see those behaviors on chain. (Note that it’s also possible a future CHIP will choose simply to increase the base compute budget rate offered for 1 sat/byte transactions, especially if validation performance continues to improve at a faster rate than required by block size scaling.)

Anyways, thanks for this comment @ShadowOfHarbringer, I added a bit more about it to this rationale section:

Finally, this proposal could also increase the operation cost limit proportionally to the per-byte mining fee paid, e.g. for fee rates of 2 satoshis-per-byte, the VM could allow a per-byte budget of 2000 (a 2.5 multiple, incentivizing contracts to pay the higher fee rate rather than simply padding the unlocking bytecode length). However, any allowed increase in per-byte operation cost also equivalently changes the variability of per-byte worst-case transaction validation time; such flexibility would need to be conservatively capped and/or interact with the block size limit to ensure predictability of transaction and block validation requirements. Additionally, future research based on real-world usage may support simpler alternatives, e.g. a one-time increase in the per-byte operation cost budget.
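For concreteness, the quoted idea amounts to something like the following (purely hypothetical and not part of this CHIP; the 800/byte figure is the spec’s standard per-byte budget, mentioned elsewhere in this thread):

```js
// Hypothetical future fee-scaled budget (NOT part of this CHIP): at
// 1 sat/byte the standard per-byte budget is 800; the quoted example
// raises it to 2000 (a 2.5x multiple) at 2 sat/byte, rewarding the
// higher fee rate rather than padded unlocking bytecode.
const perByteBudget = (feeRateSatPerByte) =>
  feeRateSatPerByte >= 2 ? 2000 : 800;
```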

3 Likes

Next Steps

We now have two implementations (of both Limits and BigInt) – BCHN (C++) and Libauth (JS/TS) – with extremely consistent results and relative performance differences in exactly the places we would expect based on complexity analysis (in each case, the spec takes the most conservative approach to assigning operation costs).

We learned a bit more in bouncing the spec between JS and C++. I’m fairly confident now that further changes to the spec will be minor, so implementation work in other node implementations won’t have to go through the same churn. This is also an opportunity to verify empirically that we’ve gotten everything right.

To “preregister” my planned approach to evaluating whether or not we got these CHIPs right: as we get performance benchmark results from each implementation, I’ll rank-order all results by their worst performance-to-opCost ratio for the worst-performing implementation on the benchmark in question. The worst case should be within the same order of magnitude as worst-case 1-of-3 bare multisig (BMS), and ideally it should never exceed 200% 1-of-3 BMS performance-to-opCost.
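In code, that evaluation might look something like this (a sketch; the result fields are assumptions):

```js
// Sketch: for each benchmark, take the worst performance-to-opCost ratio
// across implementations, then compare against the 1-of-3 bare multisig
// (BMS) baseline; ideally nothing exceeds 200% of that baseline.
const worstRatio = (results) =>
  Math.max(...results.map((r) => r.runtimeNs / r.opCost));
const withinTarget = (benchmarks, bmsResults) =>
  benchmarks.every((b) => worstRatio(b.results) <= 2 * worstRatio(bmsResults));
```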

Given the results from Libauth (shared last month) + Calin’s BCHN results from last week, it seems this will be proven empirically, but I’ll wait to get complete results (and any more spec feedback) from all other implementations before claiming we’re done.

So: we need help getting patches and then benchmark results for all other node implementations. I’m reaching out to all the teams today, and I’m happy to hop on calls to help teams ramp up quickly or answer any questions. This is also the perfect time to get wider feedback on whether any spec details can be further simplified to make implementation easier in other languages (JS and C++ gave us a pretty wide view already, but another round of implementation feedback should give us more confidence that we’re not leaving any more protocol simplifications on the table).

In the meantime, I’ll revise all the benchmarks again for this latest iteration (hopefully one last time :crossed_fingers:) and make some pretty graphs that I can keep updated as we get results from other implementations. I’m also hoping to knock out the stretch goal of expanding our new test suite to incorporate the old script_tests.json such that it can be fully deprecated.

When we’re done here, we should have both an expanded set of functional test vectors and an extremely comprehensive set of benchmarking test vectors that can be used to reveal any performance deficiencies in any VM implementation.

5 Likes

I agree; however, that is the situation now.

I was talking about a possible future. If computations become more expensive in the future, but not dangerously expensive, then including a simple algorithm, with some kind of ratio, for calculating the cost a miner bears for a TX plus the TX fee would be a natural thing to do.

1 Like

The problem with scaling the budget based on the fee paid is that the minimum fee is not consensus (and should not be). If we say 2 sat/byte gives you double the budget, that’s only valid for the current state of relay policy, and it sets this arbitrary point as a baseline for consensus.
If relay rules later change to 0.5 sat/byte, then you’d have to pay 4x the fee to get the 2x budget.
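A quick worked example of that baseline drift:

```js
// If “2 sat/byte => 2x budget” were fixed in consensus, the premium paid
// for the extra budget would drift with relay policy:
const feeMultipleForDoubleBudget = (relayMinSatPerByte) => 2 / relayMinSatPerByte;

feeMultipleForDoubleBudget(1);   // => 2: pay 2x the minimum fee today
feeMultipleForDoubleBudget(0.5); // => 4: pay 4x the minimum if relay drops to 0.5 sat/byte
```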

Here I suggested an alternative:

declare some additional “virtual” bytes without actually having to encode them. Then, transaction fees could be negotiated based on raw + virtual bytes, rather than just raw bytes

Maybe it could be worked out to actually work well for us; I would count these virtual bytes against the block size limit and for ABLA, too (a sketch follows below).
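A minimal sketch of the idea (hypothetical field names; this is a proposal, not spec):

```js
// Hypothetical “virtual bytes”: a transaction declares extra bytes it
// never actually encodes, and fee negotiation, compute budget, block
// size accounting, and ABLA all use the combined size.
const effectiveSize = (tx) => tx.rawByteLength + tx.declaredVirtualBytes;
const effectiveFeeRate = (tx) => tx.feeSatoshis / effectiveSize(tx);
// Block assembly would likewise count effectiveSize(tx) against the
// block size limit and feed it into ABLA.
```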

2 Likes

That indeed reverses the way it works (reverses cause and effect).

What CAN work is to add the score to some framework or algorithm that is used to decide the priority of a transaction. Low priority means that the transaction may take several blocks to get mined; higher-priority transactions, well, have priority.
The easy way to get higher priority is to pay more fees. Or to spend coins that have older age, etc. (A sketch follows below.)
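For illustration only (not consensus, and the weights are pure placeholders), such a priority score might look like:

```js
// Hypothetical mempool priority score: well-paying, cheap-to-validate,
// older-coin transactions sort first. Weights are arbitrary.
const priority = (tx) =>
  tx.feeSatoshis / tx.byteLength          // higher fee rate => higher priority
  + 0.001 * tx.inputCoinAgeBlocks         // older coins => small bonus
  - 0.01 * (tx.opCost / tx.byteLength);   // heavier computation => penalty
```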

The end result is the same, except that it is not consensus.

Either way, this little thought experiment is not for THIS chip. I just thought it useful to point out that it may be done as permissionless innovation.

2 Likes

Minor fixup proposal on one of the js math code blocks: https://github.com/bitjson/bch-vm-limits/pull/26

Actually… all the JavaScript examples should switch to using BigInt so that we don’t have to deal with the ambiguity of floats, including both truncation and precision issues. I’ll close this and make a PR for that.

Replaced with this: https://github.com/bitjson/bch-vm-limits/pull/27
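To illustrate the motivation, using the hashing-limit formula discussed further down this thread (BigInt division truncates toward zero, which matches the intended integer semantics):

```js
// With Number, the rounding direction stays implicit:
((41 + 100) * 7) / 2;             // => 493.5 (caller must remember to truncate)
Math.floor(((41 + 100) * 7) / 2); // => 493

// With BigInt, division truncates by definition, so the formula is exact
// as written:
((41n + 100n) * 7n) / 2n;         // => 493n
```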

1 Like

As I understand it, math operations now require the size of the output to determine the cost. Further, I think that implies that full script execution is required to know the cost of the script… which seems odd.

  1. Is my understanding correct?
  2. If so, is this the only place that requires full script execution (not just reading of the script itself) in order to validate?
  3. If so, wouldn’t it be smart to use a conservative heuristic on output.size instead?

edit - actually… knowing the size of the inputs also depends on full execution??? What am I missing, or is this just an acceptable cost?

Can someone define “spending input byte” for me more explicitly?

It may be common knowledge, but I don’t know it, and I don’t see it specified except in the test vectors. Just wanting to make sure it is not an ambiguity in the spec text. If it isn’t, then no worries; it’s just my lack of knowledge.

I see no explanation of expected activation or maintenance costs. These are important to explain clearly even if they are simple. E.g., are there activation costs for:

  • nodes (yes, of course),
  • wallets (probably not for common wallets; in theory yes for any web wallets that create contracts, such as Cauldron, Guru, BCH Bull, …, though it seems not in practice, since the new limits will allow existing contracts),
  • miners (ASICs? probably not),
  • pools (do they even need to be aware beyond consensus validity? any impact on tx cost? I think not, but it should be explicit),
  • exchanges, etc.?

I see that @bitcoincashautist has made a great costs section for the big bigints CHIP. The topic needs to be covered somehow for this CHIP also.

1 Like

The “Hashing Limit” section has a handful of unnamed, bare constants that I didn’t find an explanation for:

  • 7: ((41 + unlocking_bytecode_length) * 7) / 2
  • 1, 8, 64: 1 + ((message_length + 8) / 64) + (is_double ? 1 : 0); (the first 1+, not the conditional)

Each of them deserves a name and/or explanation. In contrast, e.g. the numbers 41, 800 have explanations.

I can guess at or infer the source of them based on other parts of the spec, but I’m not totally confident about my inference. It should be in the spec even if it’s explained here.
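For what it’s worth, here is my best inference, expressed as code (unverified against the spec; treat the constant explanations as guesses):

```js
// Guessed reading of the bare constants (inference, not spec text):
// 64 = SHA-256 block size in bytes; 8 = the message-length field that
// SHA-256 padding appends; the leading 1 covers the minimum padded
// block; the conditional adds one iteration for the second pass of
// double-SHA-256.
const digestIterations = (messageLength, isDouble) =>
  1 + Math.floor((messageLength + 8) / 64) + (isDouble ? 1 : 0);

// 7 / 2 would then be a density of 3.5 digest iterations per spending
// input byte (41 being the per-input overhead the spec does explain):
const hashingLimit = (unlockingBytecodeLength) =>
  Math.floor(((41 + unlockingBytecodeLength) * 7) / 2);
```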

2 Likes

No, in the sense that you don’t need full execution, i.e., you don’t have to run through to the end of the script only to then be able to tell whether it fails or not. You can stop and mark it invalid the moment the budget is exceeded; you need to execute only up until that point. This is important for DoS scenarios: you never exceed your budget even if someone tries to hand you a TX that is excessively over it.

Hmm, @bitjson did specify it like that in his operation cost table, e.g. for OP_MUL (a b -> c) it is specified as 100 + (2 * c.length) + (a.length * b.length), but I wonder why? Can’t we just do (a.length * b.length) * 3?
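For a concrete comparison of the two formulas (just arithmetic on the expressions above):

```js
// Two 10-byte operands, product c up to 20 bytes:
100 + (2 * 20) + (10 * 10); // => 240 (table formula)
(10 * 10) * 3;              // => 300 (proposed alternative)

// One-byte operands, where the base cost dominates:
100 + (2 * 2) + (1 * 1);    // => 105 (table formula)
(1 * 1) * 3;                // => 3 (alternative would make tiny OP_MULs nearly free)
```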

You only need the size of the script to know the execution budget. Then you start evaluating, and you only need to execute up until the point where an operation’s inputs can be determined; if you’re still under budget, you continue to the next one, etc. You linearly execute the script and keep updating the operation cost; the moment it exceeds the budget, you can fail the input without needing to execute the rest.
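As a sketch of that loop, given a precomputed budget (helper names are hypothetical):

```js
// Fail-fast cost accounting: the budget is fixed up front; the cost is
// accumulated during evaluation and checked after every instruction.
const evaluateWithinBudget = (instructions, budget) => {
  let cost = 0n;
  const stack = [];
  for (const instruction of instructions) {
    const result = execute(instruction, stack); // hypothetical interpreter step
    cost += costOf(instruction, result);        // base cost + pushed bytes, digest iterations, ...
    if (cost > budget) return false;            // fail immediately; the rest never runs
  }
  return true;
};
```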

Right. That’s what I was thinking it would be good to avoid, but I think I see that the whole system is built on this basis, where it’s by nature a static + runtime accounting process in many aspects. Makes sense, and avoids needing to make bad (either limitingly tight or dangerously loose) assumptions. It’s probably obvious to those steeped in the validation process, but not to someone who isn’t.

1 Like

Hopefully I can help clarify some things with my chiming in –

  1. There are two concepts here to keep in mind – the cost of execution … and the execution budget (limit).
  2. The cost of execution of a particular script is not known ahead of time… until you execute it. I would say that is an accurate statement, yes, Emergent.
  3. The existing sigCheck mechanism which Mark Lundeberg added in 2018/2019 has the same property – before that there was a heuristic “sigOps” mechanism that just scanned the script for 0xba or whatever and used that to determine if the script would pass/fail ahead of time. This was problematic… because unexecuted branches would add to the cost needlessly.
    • sigCheck limit also has the property that you don’t know the actual cost until you execute
    • vmLimits now has this property as well – given that sigChecks exists and we are ok with it, this isn’t a huge departure from the status quo. It’s just another dynamic cost you only “know” once you execute.
  4. Now… the execution limit is definitely known ahead of time. The execution budget, as it were, is based on “input byte length”. The concept of input bytes is identical to what (old) documentation refers to as scriptSig. It’s just the bytes in the transaction “scriptSig”… whatever they may be. Usually they are just a bunch of push ops – a push of the pubkey and a push of the sig (for p2pkh). The total byte count of that blob is what’s used to determine the execution budget – not the output size!
  5. So… output size doesn’t come into the picture at all here. It could be added, but as I understand it, Jason felt it wasn’t needed – it’s sufficient to just go with scriptSig and use that byte size to determine the execution budget. The current proposal then has the side-effect of incentivizing p2sh over some crazy non-standard “bare” contract…

So the scenario we have is:

  • the execution budget (limit) is determined statically by just looking at the size of scriptSig
  • then, you execute the script (which you need to do anyway) to determine whether, among all the thousand other things it does, it also respects that budget. If it doesn’t, kill it and fail. If it does, great… pass (assuming it returns true). (A sketch follows below.)
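Putting those two bullets into code (the 41-byte per-input overhead and the 800/byte standard density are the spec constants mentioned earlier in this thread; treat the exact formula as my paraphrase):

```js
// Known ahead of time, from the serialized input alone – no execution needed:
const executionBudget = (scriptSigByteLength) =>
  (41n + BigInt(scriptSigByteLength)) * 800n;

executionBudget(107); // e.g. a typical ~107-byte p2pkh scriptSig => 118400n
```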

I hope that helps?

5 Likes

Thanks! After your and BCA’s responses, I’ve filed two issues that would be good to resolve. I could probably write a PR for the static considerations if Jason wants that, but I probably wouldn’t do justice to the other one.

2 Likes

GP has posted this issue. We think handling the two main issues listed is required in some form, and not optional, before our endorsement can be considered for 2025 activation. The third issue listed is more minor, but we still recommend some reasonable handling of it.

2 Likes

Great, thanks to General Protocols for the review! I’ll work on incorporating the feedback on both CHIPs.

Re stakeholder statements: I plan to start asking for final statements beginning in October, after all the active implementations have completed their development and review work. (Stakeholders are free to send PRs with statements before then, but I’ll have to clear any statements if feedback from one of the implementations requires meaningful CHIP language changes. Easier to post those elsewhere for now.)

5 Likes