CHIP 2024-07 BigInt: High-Precision Arithmetic for Bitcoin Cash

This proposal increases the maximum length of VM numbers again (following the 2022 increase from 4 to 8 bytes), this time from 8 bytes to 258 bytes, the second limit selected by Satoshi in 2010.

Hi everyone,

I’ve spent the past few months reviewing and benchmarking various solutions for the VM Limits CHIP, and in the course of that research (and previously while working on Jedex) I spent quite a bit of time exploring approaches for contracts to use or emulate higher-precision math.

In short: emulated precision is woefully inefficient in the VM system, and Bitcoin Cash is well designed to offer native support for high precision arithmetic.

Some background

Satoshi originally designed the VM to support arbitrary-precision arithmetic; numbers had no direct limits at all, and the sign-magnitude number format used is typically more efficient/less complex for arbitrary-precision arithmetic implementations than other options (Satoshi originally used OpenSSL’s big number library).
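For anyone unfamiliar with that format, here’s a minimal sketch of the sign-magnitude, little-endian encoding in TypeScript. The function names are illustrative only (not taken from Libauth or any node implementation); consult a real implementation for the authoritative logic, including minimal-encoding enforcement:

// Illustrative sketch of the VM's sign-magnitude, little-endian number
// encoding. Names are hypothetical; real implementations also enforce
// minimal encoding, which is omitted here.

const encodeVmNumber = (value: bigint): Uint8Array => {
  if (value === 0n) return Uint8Array.of(); // zero is the empty stack item
  const negative = value < 0n;
  let magnitude = negative ? -value : value;
  const bytes: number[] = [];
  while (magnitude > 0n) {
    bytes.push(Number(magnitude & 0xffn)); // little-endian magnitude bytes
    magnitude >>= 8n;
  }
  // If the high bit of the most significant byte is already set, append a
  // padding byte so the sign bit has somewhere to live.
  if (bytes[bytes.length - 1] & 0x80) bytes.push(0);
  // The sign is stored in the high bit of the most significant byte.
  if (negative) bytes[bytes.length - 1] |= 0x80;
  return Uint8Array.from(bytes);
};

const decodeVmNumber = (bytes: Uint8Array): bigint => {
  if (bytes.length === 0) return 0n;
  let magnitude = 0n;
  for (let i = 0; i < bytes.length; i++) {
    const byte = i === bytes.length - 1 ? bytes[i] & 0x7f : bytes[i];
    magnitude |= BigInt(byte) << (8n * BigInt(i));
  }
  return bytes[bytes.length - 1] & 0x80 ? -magnitude : magnitude;
};

// encodeVmNumber(-1000n) => 0xe8 0x83 (little-endian 1000 with the sign bit set)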

In July 2010, Satoshi began sneaking emergency patches into his Bitcoin implementation to prevent a variety of possible attacks against the network. In his first big VM patch, he instituted a variety of changes and limits to the VM – among them limiting numbers to 258 bytes. However, in August 2010, he decided to remove the VM’s dependency on OpenSSL entirely, following instability he had previously discovered in the right shift operation, which he now realized would have split the network. In that patch, he limited VM numbers to 4 bytes (now lacking a big number implementation he trusted) and disabled a number of opcodes (many of which Bitcoin Cash has now re-enabled).


Relation to Limits CHIP

VM numbers are limited to 8 bytes because the VM’s current limit system cannot account for the actual cost of expensive arithmetic operations.

The Limits CHIP re-examines this system to replace the existing opcode and stack item length limits with better-targeted equivalent limits, allowing contracts more power/flexibility without increasing transaction processing or memory requirements.

It turns out that to replace these limits, it’s prudent for us to correctly account for expensive arithmetic operations; OP_MUL, OP_DIV, and OP_MOD are O(n^2) in the worst case, while OP_ADD and OP_SUB are only O(n).
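As a rough illustration of the accounting involved (the real cost formula and constants are defined in the Limits CHIP; everything below, including the names, is a toy model rather than the CHIP’s actual math):

// Purely illustrative cost model: add/sub scale linearly with operand
// length, while mul/div/mod scale with the product of operand lengths.

type BinaryArithmeticOp = 'OP_ADD' | 'OP_SUB' | 'OP_MUL' | 'OP_DIV' | 'OP_MOD';

const estimateOperationCost = (
  op: BinaryArithmeticOp,
  aByteLength: number,
  bByteLength: number,
): number =>
  op === 'OP_ADD' || op === 'OP_SUB'
    ? aByteLength + bByteLength // O(n): one pass with carries/borrows
    : aByteLength * bByteLength; // O(n^2): schoolbook multiplication/division

// Under this toy model, a 64-byte * 64-byte OP_MUL costs ~4096 units,
// while a 64-byte + 64-byte OP_ADD costs only ~128.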

Further, it’s critical that we carefully test both the system which limits arithmetic, and high-precision arithmetic itself, together. I.e. we want to discover any potential issues with arithmetic limits before they are activated, and the safest way to do that is to actually use them.

2025 Deployment

I had been considering proposing this CHIP for 2026, but after chatting with other developers, and given that a large part of the implementation must necessarily be activated by the limits CHIP (the arithmetic cost limit), I think we can and should increase the VM number limit at the same time.

Review & Feedback

Despite the close relation to the limits CHIP, I’ve written this as a separate CHIP for ease of discussion and feedback.

You’ll notice the technical specification has only one sentence; the necessary changes will be quite different from implementation to implementation, so I expect that the test vectors will do the majority of the lifting in this CHIP. (In Libauth, for example, the underlying implementation has always used arbitrary-precision arithmetic, so only the constant specifying the limit needs to be changed, from 8 to 258. BCHN, on the other hand, would need to re-introduce big number arithmetic.)

Thanks to @cculianu for implementing Libauth’s benchmarking suite in BCHN and answering my questions over the past few months, and thank you to everyone who joined in the impromptu discussion in various Telegram groups recently!

10 Likes

Podcast with lots of detailed discussion of this proposal available now: https://www.youtube.com/watch?v=oPQ8w0yZ88E

6 Likes

For people who haven’t watched GP’s spaces, some points:

  1. We’re very, very excited about this. It’ll solve so many problems both ours and for the whole ecosystem! That said…
  2. We’re also very, very worried about timing. We’re now in August and this is the first time anyone’s heard about this, not to mention the extremely skeletal shape the CHIP’s in.

Unless there’s an avalanche of updates coming in the next week or two, this is no doubt going to be rushed.

Tests will need to happen very soon, benchmarks will need to happen very soon, there’ll need to be time for public discussions based on the tests and benchmarks, numerous parties and implementations - not just the main beneficiaries - will need to be consulted.

Releasing in November is going to be an extremely tall order, and I’m not sure whether there are any rabbits to be pulled out of hats. If there are, please pull them sooner rather than later, no matter what shape they’re in; public discussion will accelerate fixing any imperfections.

But I suspect there’s a very real possibility that Bigint will not make it for 2025, and it’ll be quite disappointing if the widely anticipated VM Limits CHIP is dragged down by it not making it to 2025 either. From what I can gather, the dilemma is between:

  1. If bigint isn’t activated, contracts will become bigger due to workarounds, making more permissive limits necessary; without bigint, additional limits on opcodes like OP_MUL also become unnecessary.

  2. If bigint is activated, the reverse happens: contracts will become (much) smaller, enabling tighter limits in general; additional limits on opcodes like OP_MUL also become necessary.

Correct me if I’m wrong, but this is the basis of “bigint is tied to the VM Limits CHIP”.

What’s missing from this picture, though, is that even a conservative VM Limits change without bigint is still a big improvement over the status quo; I do not think we should dismiss that possibility. It’s perfectly acceptable to focus our limited resources for now on “what kind of VM Limits will accommodate a future bigint CHIP?” and get that over the bar, instead of trying to push for all-or-nothing. When in doubt, nobody’s gonna blame us for erring on the conservative side.

Again, if there are any rabbits to be pulled, please do so now. We’re cutting awfully close even if there’s a full pen of them.

4 Likes

What kind of updates are you expecting? Specification-wise it’s just changing one number, right? With such a simple change, what’s there to talk about except performance impact and how to deal with it? The number is a limit of arithmetic opcodes. So this is a VM limit increase, too. :slight_smile:

The obstacle to just doing it before, when we changed the number from 32-bit to 64-bit, was uncertainty about performance impact. After all, 64-bit arithmetic ops can be natively executed by the CPU, whereas anything bigger requires splitting the problem into a bunch of 64-bit ops, so the cost of that one opcode becomes N times (add, sub) or N^2 times (mul, div, mod) more expensive.
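To illustrate that decomposition, here’s a toy schoolbook multiplication over 16-bit limbs in TypeScript (real implementations use 32- or 64-bit limbs, and this is not any node’s actual code, but the shape of the work is the same):

// Toy schoolbook multiplication over 16-bit limbs: every limb of one
// operand is multiplied by every limb of the other, so an N-limb by N-limb
// multiply costs on the order of N^2 native operations, while addition is
// a single carry-propagating pass, i.e. O(N).

const LIMB_BASE = 0x10000; // 16-bit limbs, little-endian

const multiplyLimbs = (a: number[], b: number[]): number[] => {
  const result = new Array<number>(a.length + b.length).fill(0);
  for (let i = 0; i < a.length; i++) {
    let carry = 0;
    for (let j = 0; j < b.length; j++) {
      const partial = result[i + j] + a[i] * b[j] + carry;
      result[i + j] = partial % LIMB_BASE; // low 16 bits stay in this limb
      carry = Math.floor(partial / LIMB_BASE); // high bits carry onward
    }
    result[i + b.length] += carry;
  }
  return result;
};

// multiplyLimbs([0xffff, 0xffff], [0xffff, 0xffff]) performs 4 limb
// multiplications; doubling the operand lengths quadruples that count.

Doubling the operand lengths doubles the work of an addition pass but quadruples the number of limb multiplications, which is exactly why mul/div/mod need cost accounting that add/sub barely do.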

The VM limits CHIP will remove the obstacle by creating a framework for limiting cost of any Script, so this CHIP on top of that one becomes a low cost and low risk improvement, while unlocking lots of benefits.

I’d love to see this for '25; then I could play with the chainwork oracle idea without having to twist my brain emulating uint256 ops with int64 math and spending 1500 bytes of Script on what could’ve been 10 bytes if this CHIP were activated.

2 Likes

What kind of updates are you expecting?

  • The fact that you had to summarize all that here is itself a sign of what’s missing from the CHIP.
  • Extensive exploration of costs, whatever they are or are not.
  • Extensive exploration of alternatives. Currently my somewhat flippant summary is “it’s the number satoshi originally chose” which is not the worst rationale, but also not great.
  • Even though I happen to think along the same lines, it needs to steel-man the cost of contract breakage rather than blame hypothetical contract authors for relying on the overflow properties of existing 64-bit numbers. It also needs to investigate any known usage between then and now (this will be a human process, because any such contract by nature doesn’t validate as a transaction).
  • Even if it’s likely, node projects have not agreed to this even in principle yet - BCHN, Verde, Kth, (apparently BCHD now too!)
  • Generally the CHIP is bare bones. You know well how much work and back and forth and new revelations can appear in a period of serious review - which this CHIP has not experienced. Some of that is bikeshedding, but not all of it.
  • That’s off the top of my head

I’d love to see this for '25

So would we. That’s why uname is laying out what is missing and the plethora of rabbits that need to be pulled out of hats.

However, I’d even more like to see the continued growth of confidence in BCH, which builds through the absence of fuckups. Fuckups happen when you rush.

This is not to mention the lindy effect of the CHIP process. We are either following it or not. If not, then… what’s the foundation we are standing on? Can you describe it? Can you convince others of it?

For what it’s worth, back in I think 2021, GP was in a similar situation except much worse than contract authors today. And for similar reasons, we still made the painful decision to oppose inclusion of the CHIPs in that year that would have massively benefited us.

5 Likes

@bitjson just bumping in case :slight_smile:

3 Likes

Hey all,

I just pushed up the latest update to the VM Limits CHIP and summarized it here:

Thank you for the comments and for outlining content we should add. I’m also working on an independent set of test vectors for BigInt now.

The existing benchmarks from the Limits CHIP cover BigInt, but I’ll continue expanding those too.

I agree that there’s a lot of work to be done. Of course, if implementations aren’t ready by November, I would also push for a delay to 2026.

For now, I’m cautiously optimistic that we can get implementations ready and thoroughly tested before November.

8 Likes

From what I understand, it’s perfectly feasible for us to just do VM Limits this year & leave BigInt for next year.

The rub is that the testing for VM Limits has to take account of BigInts, because if we ship VM Limits this year but then afterwards discover the limits are set wrong relative to BigInts, we’re in a mess. And we already know we probably want BigInts if we can get them, so it’s not a case of solving one problem and then looking into the other afterwards; it’s a case of looking at it all now.

And if it’s all sufficiently tested, then they kind of end up being a package deal anyway.

But I understand Jason’s strategy of going hard now to get everyone into gear, because if he doesn’t, it becomes a self-fulfilling prophecy that this is destined for 2026: everyone will figure they have plenty of time later (later being: wait a year, then take another look).

4 Likes

How is progress coming here? All very exciting stuff!

1 Like

I will add an addendum to this, though. Would really like to see this activate in 2025 as well but we need some serious movement happening!

1 Like

I would rather have everything well-tested and especially thoroughly benchmarked before we go with deployment.

A late update is 1000x better than a serious malfunction with massive repercussions and a PR hit X years from now when we reach gigabyte blocks.

It’s already being tested, the test cases are extensive:

I’m a little annoyed with the “oh, maybe we can’t make it” attitude. Don’t make this into a self-fulfilling prophecy; like, what the hell, it’s a simple enough upgrade, there are 2-3 months until “the freeze”, and most of the work has actually been done by Jason already. It’s just a matter of wrapping it in a nice CHIP wrapper and shipping it, and 2-3 months is more than enough time for that.

1 Like

I would rather be too careful than sorry later.

Excellent; being annoyed will raise brain activity, elevate the brain’s processing to a “higher gear”, and make a person think things through twice.

It’s just evolution.

Being too nice can often result in closed-loop thinking and “small circles of love and pleasure” where everybody follows everybody else; such circles are bound to make mistakes, because everybody is afraid to criticize everybody else.

I am not saying “slow down” or “stop”. I am just saying “be veeeeeery careful”.

1 Like

Performance impact can be assessed by “big O” complexity analysis of the opcodes and will be confirmed by benchmarks. I have some preliminary results obtained from Calin today, and they look promising; I’ve extracted the interesting part (ops near the cost limit):

# ID, TxByteLen, Hz, Cost, CostPerByte, Description
trxhzt, 366, 11876.3, 1.000, 1.000000, "[baseline] 2 P2PKH inputs, 2 P2PKH outputs (one Schnorr signature, one ECDSA signature) (nonP2SH)"
6w680g, 8419, 12328.7, 0.963, 0.041878, "[benchmark] OP_MUL 2048-byte number (alternating bits set) and 2048-byte number (alternating bits set) (P2SH32)"
34vzam, 8408, 8772.7, 1.354, 0.058930, "[benchmark] OP_MUL 4096-byte number (alternating bits set) and 4096-byte number (alternating bits set) (nonP2SH)"

So, with the budgeting system introduced by the VM Limits CHIP, a 400-byte TX could be allowed to do one 2kB x 2kB multiplication, or a bunch of 256-bit multiplications, but in any case the CPU cost can’t exceed the cost of a typical P2PKH TX, because using up the budget means an invalid TX.

With the VM Limits CHIP framework you’d get an allowance of how many bytes your TX can operate on, and you can’t exceed it whatever you do because it would result in an invalid TX.
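A rough sketch of that allowance model, with made-up constants (BUDGET_PER_TX_BYTE, the class name, and the per-op costs are placeholders; the real formula comes from the VM Limits CHIP):

// Placeholder sketch of the "operation cost budget" idea: a transaction's
// budget scales with its byte length, each operation adds its cost, and
// exceeding the budget makes the evaluation (and therefore the TX) invalid.

const BUDGET_PER_TX_BYTE = 800; // hypothetical constant

class OperationCostTracker {
  private spent = 0;
  constructor(private readonly budget: number) {}

  static forTransactionByteLength(txByteLength: number): OperationCostTracker {
    return new OperationCostTracker(txByteLength * BUDGET_PER_TX_BYTE);
  }

  // Returns false (evaluation invalid) once cumulative cost exceeds budget.
  charge(operationCost: number): boolean {
    this.spent += operationCost;
    return this.spent <= this.budget;
  }
}

// e.g. a 400-byte TX gets a 320,000-unit budget under these made-up numbers;
// one huge OP_MUL or many small ones both fail once the budget is exceeded.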

The BigInt CHIP is basically the VM Limits CHIP applied to arithmetic opcodes the same way it applies to others (hash ops, stack ops): as soon as the counter would be exceeded, the TX evaluates as invalid. It’s a no-brainer IMO, but Jason did the responsible thing and extracted it as a separate CHIP.

The impact of BigInt on node operating costs will be 0, and we can have volunteers reproduce benchmark results on various hardware as final verification.

BigInt does introduce an additional implementation risk, which should be mentioned in the CHIP’s “risks” section: the correctness of implementing BigInt operations using int64 ops in C++. This risk will be mitigated by node developers’ quality-control processes and an extensive test suite, which can be verified against multiple open-source BigInt libraries.
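One way to do that verification is differential testing against a trusted arbitrary-precision reference; here’s a minimal sketch in TypeScript (vmMul is a hypothetical stand-in for whatever implementation is under test, not a real API, and the iteration counts are arbitrary):

// Differential-testing sketch: compare an implementation-under-test against
// a trusted arbitrary-precision reference (here, JavaScript's native bigint)
// on randomized operands.

type BigIntBinaryOp = (a: bigint, b: bigint) => bigint;

const randomBigInt = (maxBytes: number): bigint => {
  const byteLength = 1 + Math.floor(Math.random() * maxBytes);
  let value = 0n;
  for (let i = 0; i < byteLength; i++) {
    value = (value << 8n) | BigInt(Math.floor(Math.random() * 256));
  }
  return Math.random() < 0.5 ? -value : value;
};

const differentialTest = (
  vmMul: BigIntBinaryOp, // hypothetical hook into the node's OP_MUL
  iterations = 10_000,
  maxOperandBytes = 512,
): void => {
  for (let i = 0; i < iterations; i++) {
    const a = randomBigInt(maxOperandBytes);
    const b = randomBigInt(maxOperandBytes);
    const expected = a * b; // reference result
    const actual = vmMul(a, b);
    if (actual !== expected) {
      throw new Error(`mismatch: ${a} * ${b} => ${actual}, expected ${expected}`);
    }
  }
};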

1 Like

Hi all, I just posted some updates to the Limits topic:

Just want to add for anyone interested, the “mega-bench” branch here includes a combinatorial set of benchmarks over all 23 BigInt-manipulating opcodes at different combinations of lengths [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192] and several types of “fills” (to expose performance differences related to e.g. carry behavior). These are the tests @cculianu got running against BCHN last week.

Since there’s >1GB of tests now, it’s going to take me a bit longer to merge these and continue building out additional test vectors and benchmarks (I’m having to revise how Libauth generates and runs them to support incremental builds and benchmarking, and also going to compress everything by default), but I should have things cleaned up more over the next week or two. Until then, implementers can just grab the .zip files from that branch.

3 Likes

I just merged this, too: Consider dropping the VM number length limit · Issue #1 · bitjson/bch-bigint · GitHub (thanks for opening it, @bitcoincashautist!)

Going to copy the latest rationale:

Removal of Number Length Limit

With the activation of the Arithmetic Operation Cost Limit, the additional limitation on VM number length has no impact on worst-case transaction validation performance (see Tests & Benchmarks). This proposal removes the limit, allowing valid VM numbers to extend to the stack element size limit (A.K.A. MAX_SCRIPT_ELEMENT_SIZE) of 10,000 bytes. See CHIP: Targeted Virtual Machine Limits.

Alternatively, this proposal could raise the limit to a higher constant value like 258, the constant selected by Satoshi Nakamoto in reverted makefile.unix wx-config -- version 0.3.6 (July 29, 2010). However, because the limit is no longer necessary for mitigating worst-case transaction validation cost, selection of any particular constant would be arbitrary.

By fully-removing the limit, overall protocol complexity is reduced, simplifying both future VM implementations and contract development. For VM implementations, eliminating out-of-range cases significantly reduces the combinatorial set of possible inputs and outputs for the numeric operations: OP_1ADD (0x8b), OP_1SUB (0x8c), OP_NEGATE (0x8f), OP_ABS (0x90), OP_NOT (0x91), OP_0NOTEQUAL (0x92), OP_ADD (0x93), OP_SUB (0x94), OP_MUL (0x95), OP_DIV (0x96), OP_MOD (0x97), OP_BOOLAND (0x9a), OP_BOOLOR (0x9b), OP_NUMEQUAL (0x9c), OP_NUMEQUALVERIFY (0x9d), OP_NUMNOTEQUAL (0x9e), OP_LESSTHAN (0x9f), OP_GREATERTHAN (0xa0), OP_LESSTHANOREQUAL (0xa1), OP_GREATERTHANOREQUAL (0xa2), OP_MIN (0xa3), OP_MAX (0xa4), and OP_WITHIN (0xa5). For contract authors, eliminating the possibility of out-of-range errors prevents a class of potential vulnerabilities arising from a contract system’s failure to validate that intermediate arithmetic results never exceed an (uncommonly-encountered) maximum number length limit.

Non-Inclusion of Implementation-Specific Technical Details

This proposal specifies only the necessary change to the consensus protocol: removal of the number length limit.

The software changes required to support this consensus change differ significantly from implementation to implementation. VM implementations which already internally use arbitrary-precision arithmetic for VM number manipulation may only need to disable code enforcing the previous limit, while other implementations may be required to integrate an arbitrary-precision arithmetic library or language primitive. In all cases, implementations should verify their functional behavior and performance characteristics against the Tests & Benchmarks.

Non-Inclusion of VM Number Format or Operation Descriptions

Beyond dropping the unnecessary limit on VM number length, this proposal does not modify any other properties of the VM. Notably, the precise format and behavior of VM numbers across all VM operations – especially OP_1ADD (0x8b), OP_1SUB (0x8c), OP_NEGATE (0x8f), OP_ABS (0x90), OP_NOT (0x91), OP_0NOTEQUAL (0x92), OP_ADD (0x93), OP_SUB (0x94), OP_MUL (0x95), OP_DIV (0x96), OP_MOD (0x97), OP_BOOLAND (0x9a), OP_BOOLOR (0x9b), OP_NUMEQUAL (0x9c), OP_NUMEQUALVERIFY (0x9d), OP_NUMNOTEQUAL (0x9e), OP_LESSTHAN (0x9f), OP_GREATERTHAN (0xa0), OP_LESSTHANOREQUAL (0xa1), OP_GREATERTHANOREQUAL (0xa2), OP_MIN (0xa3), OP_MAX (0xa4), and OP_WITHIN (0xa5) – are part of network consensus and do not otherwise change as a result of this proposal. For the avoidance of doubt, see the Tests & Benchmarks.
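To make the out-of-range-intermediate hazard concrete, here’s a toy example in TypeScript rather than Script (the amounts and names are invented; the point is only that an intermediate can exceed the old 8-byte range even when the final result doesn’t):

// Toy illustration of the out-of-range-intermediate hazard under a fixed
// number-length limit (here, the current 8-byte / 63-bit signed range).
// With the limit removed, the intermediate product is simply a longer VM
// number and the computation succeeds.

const MAX_8_BYTE_VM_NUMBER = 2n ** 63n - 1n;

// e.g. computing a proportional payout: amount * numerator / denominator
const amount = 5_000_000_000_000n; // 50,000 BCH in satoshis (hypothetical)
const numerator = 3_000_000n;
const denominator = 10_000_000n;

const intermediate = amount * numerator; // 1.5e19, exceeds 2^63 - 1
const finalResult = intermediate / denominator; // 1.5e12, well within range

console.log(intermediate > MAX_8_BYTE_VM_NUMBER); // true: old limit trips here
console.log(finalResult <= MAX_8_BYTE_VM_NUMBER); // true: result itself is fine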

2 Likes

I’m now working on this CHIP in parallel with @bitjson, and there are 3 pull requests open:

Still waiting on Jason to review them; if anyone wants to see the CHIP with those merged in the meantime, they can see it in my working repo here:

2 Likes

Amazing, exciting work as always.

Some feedback:

Under
Arithmetic Operations → OP_MOD (0x97)
“e.g. 7/3 will return 1, -7/3 will return -1, 7/-3 will return 1, and -7/3 will return -1.”
The second and fourth examples are the same, so I assume you meant
“and -7/-3 will return -1.” at the end.

Under
Ternary Operations → OP_WITHIN (0xa5)
“The 2nd-to-top item defines the “left” and includive…”
I think that’s supposed to be “inclusive”.
Also, you might want to clarify that the examples are in [evaluated][left boundary][right boundary] order; I was confused and had to reread a few times.
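For reference, here’s how I read the intended semantics, as a sketch rather than the spec: truncated (toward-zero) division for OP_MOD, so the remainder takes the dividend’s sign, and a left-inclusive, right-exclusive range check for OP_WITHIN (opMod and opWithin are just illustrative names):

// Sketch of the semantics discussed above, not the spec itself.

const opMod = (dividend: bigint, divisor: bigint): bigint => dividend % divisor;
// JavaScript's % already truncates toward zero, matching the examples:
// opMod(7n, 3n) === 1n; opMod(-7n, 3n) === -1n;
// opMod(7n, -3n) === 1n; opMod(-7n, -3n) === -1n;

// Operands in [evaluated][left boundary][right boundary] order:
const opWithin = (value: bigint, left: bigint, right: bigint): boolean =>
  left <= value && value < right;
// opWithin(5n, 5n, 10n) === true   (left boundary is inclusive)
// opWithin(10n, 5n, 10n) === false (right boundary is exclusive)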

3 Likes

Under Arithmetic Operations → General requirements
Typo: “With this upgrade, that requirement has been removes”
Should say “removed”.

Under Unary Operations → OP_0NOTEQUAL (0x92)
It says “If value is not 0 change it to 1. If it is 0, leave it unchanged.”
But later says “if executed on aabbdd the result will be an empty stack item (0).”
Shouldn’t the result be 01 according to the description?

2 Likes

Thanks for the comments and feedback; I’ve implemented them all in the repo now (along with some formatting improvements to the examples)!

4 Likes