CHIP 2021-05 Targeted Virtual Machine Limits

The problem with scaling the budget based on the fee paid is that the minimum fee is not consensus (and should not be). If we say 2 sat/byte gives you double the budget, that's only valid for the current state and sets this arbitrary point as the baseline for consensus.
If relay rules later change to 0.5 sat/byte, then you'd have to pay 4x the fee to get the 2x budget.

here I suggested an alternative:

declare some additional “virtual” bytes without actually having to encode them. Then, transaction fees could be negotiated based on raw + virtual bytes, rather than just raw bytes

Maybe it could be worked out to actually work well for us; I would count these virtual bytes against the block size limit and for ABLA, too.
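A rough sketch of how the virtual-bytes idea might be wired up, purely as a thought experiment: the struct and function names below are invented for illustration, and only the 800 per-byte standard budget figure comes from the CHIP.

#include <stdint.h>

/* Hypothetical only: a transaction declares extra "virtual" bytes that are
 * never encoded, and fees, the block size / ABLA accounting, and the VM
 * budget are all computed from raw + virtual bytes. */
typedef struct {
    uint64_t raw_bytes;     /* actual encoded transaction size          */
    uint64_t virtual_bytes; /* declared in the transaction, not encoded */
} tx_size;

static uint64_t effective_size(tx_size s) {
    return s.raw_bytes + s.virtual_bytes; /* used for fee negotiation and block limits */
}

static uint64_t vm_operation_budget(tx_size s) {
    return 800u * effective_size(s); /* 800 is the CHIP's standard per-byte budget */
}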

2 Likes

That indeed reverses the way it works (it reverses cause and effect).

What CAN work is to add the score to some framework or algorithm that is used to decide the priority of a transaction. Low priority means that the transaction may take several blocks to get mined; it means that higher-priority transactions, well, have priority.
The easy way to get a higher priority is to pay more fees, or to spend coins that are older, etc.

The end result is the same, except that it is not consensus.

Either way, this little thought experiment is not for THIS chip. I just thought it useful to point out that it may be done as permissionless innovation.

2 Likes

Minor fixup proposal on one of the js math code blocks: https://github.com/bitjson/bch-vm-limits/pull/26

Actually… all the JavaScript examples should switch to using BigInt so that we don't have to deal with the lack of clarity around floats, including both truncation and precision issues. I'll close this and make a PR for that.

Replaced with this: https://github.com/bitjson/bch-vm-limits/pull/27

1 Like

As I understand it, math operations now require the size of the output to determine the cost. Further, I think that implies that full script execution is required to know the cost of the script… which seems odd.

  1. Is my understanding correct?
  2. If so, is this the only place that requires full script execution (not just reading of the script itself) in order to validate?
  3. If so, wouldn’t it be smart to use a conservative heuristic on output.size instead?

edit - actually… knowing the size of the inputs also depends on full execution??? What am I missing or is this just an acceptable cost?

Can someone define "spending input byte" for me more explicitly?

It may be common knowledge, but I don't know it, and I don't see it specified except in the test vectors. I just want to make sure it is not an ambiguity in the spec text; if not, then no worries, it's just my lack of knowledge.

I see no explanation of expected activation or maintenance costs. These are important to explain clearly even if they are simple. E.g. are there activation costs for:

  • nodes (yes, of course),
  • wallets (probably not for common wallets; yes in theory for any web wallets that create contracts such as Cauldron, Guru, BCH Bull, … though it seems not in practice, since the new limits will allow existing contracts),
  • miners (ASICs? probably not),
  • pools (do they even need to be aware beyond consensus validity? any impact on tx cost? I think not, but it should be explicit),
  • exchanges, etc.?

I see that @bitcoincashautist has made a great costs section for the big big ints chip. The topic needs to be covered somehow for this CHIP also.

1 Like

The “Hashing Limit” section has a handful of unnamed, bare constants that I didn’t find an explanation for:

  • 7: ((41 + unlocking_bytecode_length) * 7) / 2
  • 1, 8, 64: 1 + ((message_length + 8) / 64) + (is_double ? 1 : 0); (the first 1+, not the conditional)

Each of them deserves a name and/or explanation. In contrast, e.g. the numbers 41, 800 have explanations.

I can guess at or infer the source of them based on other parts of the spec, but I’m not totally confident about my inference. It should be in the spec even if it’s explained here.

2 Likes

No, in the sense that you don't need full execution, i.e. you don't have to run through to the end of the script and only then be able to tell whether it fails or not. You can stop and mark it invalid the moment the budget is exceeded; you need to execute only up to that point. That's important for DoS scenarios: you never exceed your budget even if someone hands you a TX that's excessively over it.

Hmm, @bitjson did specify it like that in his operation cost table, e.g. for OP_MUL (a b -> c) it is specified as 100 + (2 * c.length) + (a.length * b.length), but I wonder why. Can't we just do (a.length * b.length) * 3?

You only need the size of the script to know the execution budget. Then you start evaluating: you execute up to the point where an operation's inputs can be determined, and if you're still under budget you continue to the next one, etc. You linearly execute the script and keep updating the operation cost; the moment it exceeds the budget, you can fail the input without needing to execute the rest.

Right. That’s what I was thinking it would be good to avoid, but I think I see that the whole system is made on this basis where it’s by nature a static + runtime accounting process in many aspects. Makes sense and avoids needing to make bad (either limitingly tight or dangerously loose) assumptions. It’s probably obvious to those steeped in validation process, but not to someone who isn’t.

1 Like

Hopefully I can help clarify some things with my chiming in –

  1. There are two concepts here to keep in mind – the cost of execution … and the execution budget (limit).
  2. The cost of execution of a particular script is not known ahead of time… until you execute it. I would say that is an accurate statement, yes, Emergent.
  3. The existing sigCheck mechanism which Mark Lundeberg added in 2018/2019 has the same property – before that there was a heuristic “sigOps” mechanism that just scanned the script for 0xba or whatever and used that to determine if the script would pass/fail ahead of time. This was problematic… because unexecuted branches would add to the cost needlessly.
    • sigCheck limit also has the property that you don’t know the actual cost until you execute
    • vmLimits now also has this property as well – given that sigChecks exists and we are ok with it, this isn’t a huge departure from status quo. It’s just another dynamic cost you only “know” once you execute.
  4. Now… the execution limit is definitely known ahead of time. The execution budget, as it were, is based on “input byte length”. The concept of input bytes is identical to what (old) documentation refers to as scriptSig. It’s just the bytes in the transaction “scriptSig”… whatever they may be. Usually they are just a bunch of push ops – a push of the pubkey and a push of the sig (for P2PKH). The total byte count of that blob is what’s used to determine the execution budget – not the output size!
  5. So… output size doesn’t come into the picture at all here … it can be added — but as I understand it Jason felt it wasn’t needed – it’s sufficient to just go with scriptSig and use that byte size to determine the execution budget. The current proposal then has the side-effect of incentivizing p2sh over some crazy non-standard “bare” contract…

So the scenario we have is:

  • the execution budget (limit) is determined statically by just looking at the size of scriptSig
  • then, you execute the script (which you need to do anyway) to determine if among all the thousand other things it does, it also respects that budget. If it doesn’t, kill it and fail. If it does, great… pass (assuming it returns true).
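A minimal sketch of that two-step flow (my own illustration, not any node's actual code): the only figures taken from the CHIP are the 41-byte minimum input length and the 800 per-byte standard budget quoted elsewhere in this thread; the cost array just stands in for whatever the interpreter computes as it executes.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Budget is static: 800 (standard per-byte budget) times 41 (the minimum
 * encoded input length) plus the unlocking bytecode (scriptSig) length. */
static int64_t standard_budget(int64_t unlocking_bytecode_length) {
    return 800 * (41 + unlocking_bytecode_length);
}

/* Cost is dynamic: op_costs[] stands in for the per-operation costs the
 * interpreter only learns as it executes. Abort as soon as the budget is hit. */
static bool within_budget(const int64_t *op_costs, size_t op_count,
                          int64_t unlocking_bytecode_length) {
    int64_t budget = standard_budget(unlocking_bytecode_length);
    int64_t cost = 0;
    for (size_t i = 0; i < op_count; ++i) {
        cost += op_costs[i];
        if (cost > budget)
            return false; /* fail early; the rest of the script never runs */
    }
    return true; /* budget respected; the usual validity rules still apply */
}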

I hope that helps?

5 Likes

Thanks! After your and BCA’s responses, I’ve filed two issues that would be better to resolve. I’d probably be able to write a PR for the static considerations if Jason wants that, but probably wouldn’t do justice to the other one.

2 Likes

GP has posted this issue. We think handling the two main issues listed is required in some form, not optional, before our endorsement can be considered for 2025 activation. The third issue listed is more minor, but we still recommend some reasonable handling of it.

2 Likes

Great, thanks to General Protocols for the review! I’ll work on incorporating the feedback on both CHIPs.

Re stakeholder statements: I plan to start asking for final statements beginning in October and after all the active implementations have completed their development and review work. (Stakeholders are free to send PRs with statements before then, but I’ll have to clear any statements if feedback from one of the implementations requires meaningful CHIP language changes. Easier to post those elsewhere for now.)

5 Likes

Thanks for the questions, and thanks @bitcoincashautist and @cculianu for the detailed answers, just going to add a few more comments:

Yeah, there’s a naming-clarity issue here. The CHIP says “approximate spending input byte length” when referring to 41 + unlockingBytecode.length (A.K.A. scriptSig length), but it would be good for us to (in the spec) just define a proper term.

Background (for others reading along): all evaluations take place inside transaction inputs. The new density limits are based on the length of those transaction inputs, see the very long Rationale: Use of Input Length-Based Densities.

Anyways, the precise number is more intelligent than “encoded input length” for a few reasons (Rationale: Selection of Input Length Formula), but things would be clearer if we committed to a formal term for it. In Libauth, it’s now Density Control Length, which I think is descriptive and hard to confuse with anything else. I’ll try to revise the CHIP language to use that as formal terminology. Thanks @emergent_reasons!

3 Likes

Thanks, we should make those more explicit.

The 7 is just how the 3.5 density is expressed as an integer formula (7 / 2 = 3.5); without the * 7, the same division by 2 gives the 0.5 standard density:

For standard transactions, the maximum density is approximately 0.5 hash digest iterations per spending input byte; for block validation, the maximum density is approximately 3.5 hash digest iterations per spending input byte. See Rationale: Selection of Hashing Limit and Rationale: Use of Input-Length Based Densities.

Given the spending input’s unlocking bytecode length (A.K.A. scriptSig), hash digest iteration limits may be calculated with the following C functions:

int max_standard_iterations (int unlocking_bytecode_length) {
    // 41 is the minimum encoded input length: 36-byte outpoint + 1-byte
    // unlocking bytecode length prefix + 4-byte sequence number.
    // Dividing by 2 yields the ~0.5 iterations-per-byte standard density.
    return (41 + unlocking_bytecode_length) / 2;
}
int max_consensus_iterations (int unlocking_bytecode_length) {
    // The * 7 / 2 yields the ~3.5 iterations-per-byte consensus density.
    return ((41 + unlocking_bytecode_length) * 7) / 2;
}

And the hash digest iteration formula is (I believe) the most efficient way to calculate what goes on inside all of our hash functions:

Note on 64-Byte Message Block Size

Each VM-supported hashing algorithm – RIPEMD-160, SHA-1, and SHA-256 – uses a Merkle–Damgård construction with a block size of 512 bits (64 bytes), so the number of message blocks/digest iterations required for every message size is equal among all VM-supported hashing functions.

I just double checked it with a few different LLMs, but would appreciate others confirming if that formula is ideal. (Of course, it’s technically possible for implementations to add a counter rather than use this formula, but it’s somewhat less likely they’d want to modify their hashing implementation to add an internal counter.)

We could probably also clarify that note to be more explicit about the 8 and + 1 accounting for padding. Maybe:

Explanation of Digest Iteration Formula

Each VM-supported hashing algorithm – RIPEMD-160, SHA-1, and SHA-256 – uses a Merkle–Damgård construction with a 512-bit (64-byte) message block length, so the number of message blocks/digest iterations required for every message length is equal among all VM-supported hashing functions. The specified formula correctly accounts for padding: hashed messages are padded with a 1 bit, followed by enough 0 bits and a 64-bit (8-byte) message length to produce a padded message with a length that is a multiple of 512 bits. Note that even small messages require at least one hash digest iteration, and an additional hash digest iteration is required for each additional 512-bit message block in the padded message.

How’s that explanation?

2 Likes

Ok better! I’m still a bit unclear on the minimum part. It sounds like it would require a conditional (or min equivalent) to represent what is in the paragraph? I must still be missing something.

A construct you might use is:

# A: <explanation>
# B: <explanation>
# C: <explanation>
asdf = A + (B * C)

This pretty much removes all room for doubt.

(Note @bitcoincashautist and I talked out of band, but will summarize here:)

All operations which create or modify stack items (not just math) must be evaluated to “know their total cost”, but this is precisely because our limits already operate one step ahead of actual execution. Changing any particular operation’s sub-linear cost formula(s) to an operand length-based heuristic would be both technically incorrect (in that total cost would no longer match bytes manipulated) and not improve worst-case performance. On the arithmetic ops specifically, the rationale is described in Arithmetic Operation Cost and Rationale: Inclusion of Numeric Encoding in Operation Costs.
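To make that concrete, here's the OP_MUL entry quoted earlier in the thread written as a function (illustrative only; the function name is mine):

int op_mul_cost (int a_length, int b_length, int c_length) {
    // Base instruction cost + bytes pushed for the result c + the quadratic
    // work of the multiplication itself. c_length is only known once the
    // result exists, which is why the cost trails execution by one step
    // rather than being estimated from the operands alone.
    return 100 + (2 * c_length) + (a_length * b_length);
}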

Re cost determination requiring contract execution:

As @cculianu mentioned, the 2020 SigChecks upgrade was careful to preserve some optimizations discussed here. We’ve done the same thing (Rationale: Continued Availability of Deferred Signature Validation) and taken additional steps to limit hashing, as it cannot be deferred.

With these limits, VM implementations can always abort before expensive operations, and the specific design also allows us to avoid special per-operation logic or heuristics for all sub-linear and linear-time operations:

This proposal limits the density of both memory usage and computation by limiting bytes pushed to the stack to approximately 700 per spending input byte (the per-byte budget of 800 minus the base instruction cost of 100).

Because stack-pushed bytes become inputs to other operations, limiting the overall density of pushed bytes is the most comprehensive method of limiting all sub-linear and linear-time operations (bitwise operations, VM number decoding, OP_DUP, OP_EQUAL, OP_REVERSEBYTES, etc.); this approach increases safety by ensuring that any source of linear time complexity in any operation (e.g. due to a defect in a particular VM implementation) cannot create practically-exploitable performance issues (providing defense in depth) and reduces protocol complexity by avoiding special accounting for all but the most expensive operations (arithmetic, hashing, and signature checking).

Additionally, this proposal’s density-based limit caps the maximum memory and memory bandwidth requirements of validating a large stream of transactions, regardless of the number of parallel validations being performed.¹

Alternatively, this proposal could limit memory usage by continuously tracking total stack usage and enforcing some maximum limit. In addition to increasing implementation complexity (e.g. performant implementations would require a usage-tracking stack), this approach would 1) only implicitly limit memory bandwidth usage and 2) require additional limitations on linear-time operations.

  1. While the 10KB stack item length limit and 1000 item stack depth limit effectively limit maximum memory to 10MB plus overhead per validating thread, without additional limits on writing, memory bandwidth can become a bottleneck. This is particularly critical if future upgrades allow for increased contract or stack item lengths, bounded loops, word/function definition, and/or other features that reduce transaction sizes by increasing their operation density.

I also highlighted an example above where our approach here is measurably safer than any previously proposed for bitcoin (BCH) or bitcoin-like contract systems. :rocket:

1 Like

Right. That’s what I was thinking it would be good to avoid, but I think I see that the whole system is made on this basis where it’s by nature a static + runtime accounting process in many aspects. Makes sense and avoids needing to make bad (either limitingly tight or dangerously loose) assumptions. It’s probably obvious to those steeped in validation process, but not to someone who isn’t.

You know – I was thinking about some optimizations to the VM, and one optimization is to “lazy convert” the ScriptNums to vector data – that is, only convert the result c of a + b = c to a vector when it’s used as a vector, and otherwise keep it as a ScriptNum. This is a lazy-evaluation optimization, and in some usage patterns it can pay off. Say c were then fed into another operation later as a ScriptNum: you’d save on not having to serialize it to a vector only to deserialize it from that vector back into a ScriptNum.
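Something like this, roughly (a sketch of the idea only; the struct and field names are invented, and the encoding follows the usual minimal sign-magnitude VM-number rules):

#include <stddef.h>
#include <stdint.h>

typedef struct {
    int64_t num;       // numeric value, always valid
    uint8_t bytes[9];  // canonical VM-number encoding, filled on demand
    size_t  byte_len;
    int     encoded;   // 0 until bytes/byte_len are populated
} lazy_scriptnum;

// Only materialize the byte encoding when something needs the bytes or their
// length (e.g. the operation cost formula); assumes num is a valid VM number.
static size_t scriptnum_length (lazy_scriptnum *v) {
    if (!v->encoded) {
        uint64_t mag = (uint64_t)(v->num < 0 ? -v->num : v->num);
        size_t n = 0;
        while (mag) { v->bytes[n++] = (uint8_t)(mag & 0xff); mag >>= 8; }
        if (n && (v->bytes[n - 1] & 0x80)) v->bytes[n++] = v->num < 0 ? 0x80 : 0x00;
        else if (n && v->num < 0) v->bytes[n - 1] |= 0x80;
        v->byte_len = n;
        v->encoded = 1;
    }
    return v->byte_len;
}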

Anyway – what I am getting at is you may have a point @emergent_reasons – basically there is some cost to having to calculate the length… (our num encoding in memory for the stack nums differs from a bigint lib like libgmp… so you pay the cost of encoding to stack each time).

I suppose there is no real other way to cost this though… so, it is what it is. Serialization to vector is fast enough but it’s faster to not have to do that… especially for ephemeral temporary values.

Oh well.

2 Likes

I’d like to talk a bit about this one…

There are a couple of assumptions that Core pushed upon us that are really not logical, and they seem to have been used to reach the conclusions you made.

In short, I think the approach should be to base the budget on the combined output and input scripts.

The most important assumption that should be challenged is the concept that fees pay for transactions and for UTXO ‘storage’. The Satoshi client didn’t rely solely on that; it was one of various mechanisms, and Core has over time removed and changed things to make fees the only way to differentiate transactions.
A lot of things went wrong there, from the removal of coin-days as an input to the most harmful change: transaction priorities are no longer a concept decided by miners in a free market (min-relay fee).

If we internalize this, then the idea that a big output script (aka locking script, aka scriptPubKey) does not have any influence on the execution budget fails the sniff test. Someone already paid for that, after all.

Look at this from the practical point of view:

Imagine a setup where the full node remembers the actually consumed processing budget. The miner then uses that as an input to a transaction priority. Picture it like this: when a transaction uses a high amount of CPU (obviously not more than the limit), its priority drops, which may mean it takes 10 blocks to get mined.
The obvious way to counter this is to have a high coin-days-destroyed or, as a last resort, to pay a larger fee. (All this is permissionless innovation that will very likely happen when blocks get full.)
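Purely as a hypothetical illustration of that kind of node policy (nothing here is consensus or part of the CHIP; all names and weights are invented):

typedef struct {
    double fee_per_byte;           // satoshis per byte actually paid
    double coin_days_destroyed;    // age-weighted value of the spent coins
    double budget_used_fraction;   // consumed operation cost / allowed budget, 0..1
} tx_priority_inputs;

// Heavier VM usage lowers priority; higher fees or older coins raise it.
// The 0.1 and 0.5 weights are placeholders a miner would tune for profit.
static double mining_priority (tx_priority_inputs t) {
    return (t.fee_per_byte + 0.1 * t.coin_days_destroyed)
         * (1.0 - 0.5 * t.budget_used_fraction);
}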

Bottom line here is that the fear stated in the rationale about “volatility of worst-case transaction validation performance” is solved in a neat way by simply charging for actual usage (let’s not call it a gas fee, though). Again, in a permissionless way that spreads the load, because that is the most profitable thing to do.

As an aside: the attack has always been based solely on a crappy UTXO implementation; it was never a real problem of CPU consumption, because transactions are already validated well before the block header comes in. So worries about that should not remove normal incentives. End aside.

To get back to the point, whatever is mined in the output script has been paid for in some way or other. Ignoring that already-paid-for data in the VM limits CHIP creates wrong incentives. As Calin wrote above:

The current proposal then has the side-effect of incentivizing p2sh over some crazy non-standard “bare” contract…

Regardless of anyone’s opinion of P2SH or whatever, it should be obvious that VM limits should not have such side-effects. Side effects are bad, mkay?

Edit: I think the simple way to do this is to add together the sizes of the UTXO’s output script and the input script that unlocks it, and use that combined size as the input to the budget.
These scripts are ALREADY going to be combined and passed as one to the VM. As such, it is the natural approach, one that actually accounts for the VM usage.
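As I read that edit, the budget formula would change roughly like this (my sketch, reusing the CHIP's 800 per-byte figure only for comparison; this is not what the CHIP specifies):

#include <stdint.h>

// CHIP as proposed: budget from the input's own bytes only.
static int64_t budget_input_only (int64_t unlocking_bytecode_length) {
    return 800 * (41 + unlocking_bytecode_length);
}

// Alternative suggested above: also count the UTXO's locking script, since
// the VM evaluates both; the prior transaction already paid for those bytes.
static int64_t budget_combined (int64_t locking_bytecode_length,
                                int64_t unlocking_bytecode_length) {
    return 800 * (41 + locking_bytecode_length + unlocking_bytecode_length);
}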

Thanks for the comments @tom! That alternative is discussed in Rationale: Use of Input Length-Based Densities. I agree that the operation cost limit could be further increased based on UTXO size, but there are some serious economic and worst-case performance implications that would need to be worked out before it could be done safely.

The net effect of such a change would be:

  1. A very minimal (constant) increase in the budget available to all P2SH contracts (with slight advantage to P2SH32 over P2SH20), and
  2. A many-orders-of-magnitude increase in the worst-case budget available to currently-nonstandard contracts.

Fortunately, we can reduce the scope of this proposal by basing the budget only on input size for now. If there’s demand for it (considering how excessive these retargeted limits will already be for most usage), future upgrades could always raise limits by increasing budgets based on UTXO size or other TX overhead (like other same-TX outputs), by discounting specific operations or usage patterns (e.g. Karatsuba-optimizable multiplication), etc. For now, this CHIP 1) takes the most conservative possible choice for each of these areas and 2) assumes the worst-case implementation of every system (e.g. even a zero-day bug in some implementation’s OP_SWAP making it inadvertently O(n) cannot be exploited).

That rationale section with the relevant section highlighted:

Use of Input Length-Based Densities

This proposal limits both hashing and operation cost to maximum densities based on the approximate byte length of the input under evaluation (see Rationale: Selection of Input Length Formula).

Alternatively, this proposal could measure densities relative to the byte length of the full containing transaction, sharing a budget across all of the transaction’s inputs. Such a transaction-based approach would provide contracts with the most generous possible computation limits given the transaction fees paid, allowing computationally-expensive inputs to also claim the computing resources purchased via the bytes of transaction overhead, outputs, and of other inputs. Additionally, if a future upgrade were to relax output standardness, transaction-based budgets would also offer non-P2SH (Pay to Script Hash) contracts a greater portion of the computing resources purchased via the transaction’s other bytes, particularly for contracts which rely mainly on introspection for validation and include little or no unlocking bytecode (e.g. the “Merge Threads” script in Jedex).

However, this proposal’s input-based approach is superior in that it: 1) allows contract authors to reliably predict a contract’s available limits regardless of the size and other contents of the spending transaction, 2) ensures that contract evaluations which do not exceed VM limits can be composed together in multi-input transactions without further regard for VM limits, 3) preserves the simplicity of parallelizing input validation without cross-thread communication, 4) more conservatively limits the worst-case validation performance of maliciously-invalid transactions (by failing the malicious transaction earlier), and 5) could be safely expanded into a transaction-based approach by a future upgrade if necessary.

Another alternative to offer greater limits to non-P2SH contracts would be to base densities on the byte length of both locking and unlocking bytecode, i.e. including the Unspent Transaction Output (UTXO) in the input’s budget. However, this alternative approach would increase the volatility of worst-case transaction validation performance: the price of the UTXO’s past bandwidth is paid by the previous transaction’s mining fee, while the price of the UTXO’s storage is paid only by the time-value of associated dust; if compute resources aren’t strictly tied to costs in the present (like the current transaction’s mining fee), the instantaneous computation requirements of transaction validation are not bound (e.g. slow-to-validate UTXOs may be stored up and evaluated in a smaller number of attack transactions). Instead, this proposal bases computation limits only on costs paid in the present, with the 41-byte minimum input length providing a reasonable minimum computation budget.

Finally, this proposal could also increase the operation cost limit proportionally to the per-byte mining fee paid, e.g. for fee rates of 2 satoshis-per-byte, the VM could allow a per-byte budget of 2000 (a 2.5 multiple, incentivizing contracts to pay the higher fee rate rather than simply padding the unlocking bytecode length). However, any allowed increase in per-byte operation cost also equivalently changes the variability of per-byte worst-case transaction validation time; such flexibility would need to be conservatively capped and/or interact with the block size limit to ensure predictability of transaction and block validation requirements. Additionally, future research based on real-world usage may support simpler alternatives, e.g. a one-time increase in the per-byte operation cost budget.

As this proposal attempts to avoid any impact to worst-case validation time – and future upgrades can safely deploy increases in operation cost limits – solutions for increasing operation cost limits are considered out of this proposal’s scope.
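One way to read the fee-rate example in that last quoted paragraph (explicitly out of scope for this CHIP; the linear ramp and cap below are my own placeholders, while the 800 and 2000-at-2-sat/byte figures come from the quote):

#include <stdint.h>

static int64_t fee_scaled_per_byte_budget (int64_t sats_per_byte) {
    if (sats_per_byte <= 1)
        return 800;                                    // today's standard budget
    int64_t budget = 800 + (sats_per_byte - 1) * 1200; // 2 sat/byte -> 2000 (2.5x)
    const int64_t cap = 4000;                          // placeholder for the
                                                       // "conservatively capped" condition
    return budget < cap ? budget : cap;
}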

1 Like