CHIP 2023-07 Composite Arithmetic Opcodes

MathieuG · July 5, 2023, 5:51pm

Bitcoin Cash has upgraded to 64-bit integers and now with the advent of CashTokens the BCH VM enables complex applications, this poses the need both for higher-order math and for allowing overflows in intermediate calculations.

Now, shortly after the CashTokens upgrade there are already multiple different Constant ProductMarket Maker (CPMM) contracts live on the network which carefully have to consider strategies for multiplication overflows. The composite aritmathic opcode op_muldiv allows the intermediate calculation to overflow and hence simplifies the strategies dealing with overflows, leading to simpler, more efficient and more capable smart contracts.

Moreover, the introduction of op_muldiv and op_mulmod is a big step towards enabling emulation of higher precision math (uint126) in BCH contracts.

This proposal adds two new math opcodes to the Bitcoin Cash Virtual Machine (BCH VM), op_muldiv and op_mulmod .

op_muldiv greatly simplifies contracts using 64-bit integer multiplication in intermediate calculations. These calculations currently fail on overflow so contract authors need to either restrict input ranges or use clever math workarounds which sacrifice precision.

op_muldiv presents a clean alternative to this and together with op_mulmod it paves the way for emulating higher precision math in BCH contracts.

Review and feedback are appreciated!

BigBlockIfTrue · July 5, 2023, 11:30pm

In my professional experience unrelated to bitcoin, a function like OP_MULDIV is indeed very useful if you want to compute stuff while avoiding floating-point arithmetic.

The most obvious alternative to this would be to have actual 128-bit integer support in the VM. But in that case this function could in turn enable working with 256-bit intermediate results.

im_uname · July 6, 2023, 9:08am

What is the worst case cpu load compared to their component opcodes?

emergent_reasons · July 6, 2023, 9:12am

Please add a consideration of OP_MULDIVMOD in alternatives.

bitcoincashautist · July 6, 2023, 10:40am

We’ll need an implementation to benchmark, but since it maps well to native CPU instructions I think it’s safe to say it will be on the cheaper end of opcodes.

What’s that? Is it a 2-byte opcode? If we went for 2-byte we could go for complete set, like OP_COMPOSITE that would calculate all combinations:

a * b / c
a * b % c
(a + b) / c
(a + b) % c
(a - b) / c
(a - b) % c

MathieuG · July 6, 2023, 2:02pm

Added a section in the rationale discussing the choice between added the current two proposed opcodes or the complete set like @bitcoincashautist mentions above

Complete set of composite operations for overflows

Only the composite aritmatic opcodes op_muldiv and op_mulmod are proposed as those are immediately useful. Other composite arithmetic functionality (adddiv, addmod, subdiv and subdiv) is very useful in simplifying uint126 math and in enabling easy overflow checks. However the number of cases where addition or subtraction overflows are much smaller than for multiplication, and it is possible to workaround the overflow without losing precision.

Alternatives include reserving 6 codepoints or starting with a shared codepoint right away for all composite operations.

cculianu · July 6, 2023, 2:44pm

I fully support this idea. Well thought through spec.

As for performance implications: most of the slowness in VM is due to manipulation of heap memory and other shenanigan. An opcode like this can be done with native int128 CPU instructions and its performance cost would be negligible. Everything else the VM does is costly — this would not be.

Anyway the spec as it stands looks good. I’ll leave it to others if they want to define an OP_COMPMATH or whatnot but the current 2 extra opcodes proposed here should be sufficient as a minimal subset to acconplish anything using ~128 bits of precision.

MathieuG · July 7, 2023, 9:42am

Had some more discussion with @bitcoincashautist on the advantages of having a complete set.
The composite subtraction opcodes can easily be done with composite addition by simply flipping a the sign of the second argument. A minimal complete set would include op_adddiv & op_addmod too along side the two originally proposed.

Doing int127 addition & subtraction turned out to be cumbersome so it makes sense to include all four operations at once!

MathieuG · July 7, 2023, 9:53am

Published a major revision of the CHIP.

It now proposes op_adddiv & op_addmod to be included too!
To demonstrate concretely how easy it is to do int127 math with the new arithmetic opcodes an example CashScript library Math.cash is has been added too.

Complete set of composite operations for overflows
The set of four composite opcodes enables the full usecase, there is no need for op_subdiv and a corresponding op_submod as this can easily be done with the composite addition opcodes just by fliping the sign of the second argument. This is illustrated in the Math.cash example library. The full set of composite opcodes is proposed instead of just the immediately useful op_muldiv for the more narrow AMM usecases to enable higher precision math which is useful much more broadly.

Trying to do int127 addition and subtraction without the composite addition opcodes is roundabout and complex, while technically possible without losing precision, it was found to be much cleaner to include it along side the composite multiplication opcodes to complete the set.

Choice of codepoints
In the range of arithmetic opcodes there are 4 open slots but the possibility of re-enabling OP_LSHIFT & OP_RSHIFT makes this non-ideal. The OP_UNKNOWN range presents a clean choice to put the four consecutive opcodes. The range 0xda (218) through 0xdd (221) was chosen to leave a reasonable gap for possible future introspection opcodes, so they could be grouped together.

dagurval · July 10, 2023, 1:58pm

I’m in support of this CHIP. The size of a liquidity pool in the Cauldron DEX contract is currently limited by the size of int64.

This can be worked around today by reducing precision (higher spread between buy and sell). With OP_MULDIV, the precision loss would be significantly less.

MathieuG · July 11, 2023, 1:47pm

Had a discussion with @dagurval about overflows and overflow-workarounds for Caldron Dex.

Cauldron uses x * y = k to calculate the target constant “k” for the pool to check the "effective k " after a user interaction. This overflows on maxint32 * maxint32 which means a pool can hold at a maximum ~2.1 billion sats (21BCH) and ~2.1 billion tokens.

To work around this limit, the most expedient solution for now would be to use (x/c) * (y/c) = k where “c” is a division constant chosen by the pool operator. This sacrifices on precision of the multiplication, take the following example:

An AMM pool holds 888,888 satoshis and 888,888 tokens. If for the design the pool, the “k” should not overflow 1 billion , we can use c=1000.

The pool constant k would be calculated by k = (888,888 / 1000) * (888,888 / 1000)
=> k = 888 * 888 so k = 788,544
or when rounding up after each division k = 889 * 889 = 790,321
The calculation loses precision because the truncation with the division happens before the multiplication.
With a muldiv opcode, which only does the division at the end, the calculation would be
k = muldiv(888,888 ; 888,888 ; 1,000,000) = 790,121 so only truncation after the multiplication.

Another option with op_muldiv & op_mulmod would be to emulate int127 multiplication and the int127 inequaltiy check if full precision is necessary.

MathieuG · July 11, 2023, 2:37pm

The team behind Fex.cash AMM worked around the overflow limitation in their own way as they explain in full in their whitepaper, here the most relevant part:

Customized Multiplication and Division

Since the multiplication in Bitcoin Cash scripts may cause overflow problems, the AMM of Fex.Cash implementation defines a custom multiplication and division method.

Variable Definition
  let TwoExp30 = 1073741824; // 2 ** 30
TwoExp30 is roughly one billion, a large enough number.

Purpose of Calculation

Given r ∈ [0, TwoExp30) .

If we want to calculate a * r / TwoExp30 , the expression a * r is likely to overflow， but >(a/TwoExp30) * r and (a%TwoExp30) * r are not overflow.

If we want to calculate a * TwoExp30 / r ， the expression a * TwoExp30 is likely to overflow, but >(a/r) * TwoExp30 and (a%r) * TwoExp30 are not overflow.

Derivation
 a * (x/y) = (floor(a/y) * y + (a%y)) * (x/y)
            = (floor(a/y) * y * (x/y) + (a%y) * (x/y)
            = (floor(a/y) * x + (a%y) * x / y
If x = r, y = TwoExp30 , then: a * r / TwoExp30 = floor(a/TwoExp30) * r + (a%TwoExp30) * r / TwoExp30

If x = TwoExp30, y = r , then: a * TwoExp30 / r = floor(a/r) * TwoExp30 + (a%r) * TwoExp30 / r

As can be seen these workarounds get quite mathematical, and while this can be abstracted away in libraries over time so developers don’t need to worry about all the math, it makes it hard to reason about the precision of the calculations.

The Fex.cash team actually made a separate version of what the contract code would look like with library imports. These user-libraries would be a good intermediate solution until the next hardfork upgrade May 2024, but hopefully AMMs won’t need these workarounds for the years to come.

bitcoincashautist · July 13, 2023, 7:05am

I misunderstood, this would be e = ((a * b) / c) % d right?

Contracts writers can use OP_MULDIV and OP_MULMOD to easily implement:

int127 = int64 * int64,
int127 = int127 / int64, and
int64 = int127 % int64

operations.

With that, they can also implement e = ((a * b) / c) % d, if a, b, c & d are all within int64 range and both c and d are != 0, and the result e does not overflow int64.

MathieuG · July 27, 2023, 1:23pm

Published a new version of the composite arithmetic opcodes CHIP!
I added Higher script limits to evaluated alternatives and I wrote an example CashScript library to emulate the composite opcodes.

Higher script limits

With a proposal such as CHIP 2021-05 Targeted Virtual Machine Limits, it becomes practically feasable to emulate muldiv,mulmod,adddiv & addmod functionality.
This is demonstrated in the emulatedOpcodes.cash example CashScript library, which only uses the current VM capabilities.
As can be seen from just the file length, it would be impractical to use such functions in smart contracts due to the 201 opcode & 520 bytesize limit.
By lifting or re-engeneering these limits, the same functionality proposed in the CHIP can be achieved at the cost of larger smart contracts.
With good library support in Cashscript the complexity of using such emulated functions can be hidden from developers.
The Int127Math.cash library could work by importing emulatedOpcodes.cash instead of utilizing native muldiv,mulmod,adddiv & addmod functions.

tl121 · July 28, 2023, 4:47pm

Specifications need to be useful, complete, concise and correct. The current draft is concise, but not complete and hence I am unable to determine if it is correct. Additional detail is needed before I can enumerate all the edge conditions that must be analyzed, verified, and corrected if necessary. Ultimately, it comes down to bits N on the stack causing something to happen, i.e. K new bits of state. A specification must define exactly what the output K bits must be for all 2**N possible inputs.

Because the specification must be concise, this must be achieved by references to earlier authoritative specifications. These references need to be cited in the draft, before the CHIP can be adequately reviewed. This is especially important if reviewers such myself are to be able to contribute to this review.

MathieuG · July 31, 2023, 7:13pm

@emergent_reasons clarified that he meant an opcode which returns both the result of muldiv & mulmod for the same 3 parameters. I misundertood it too, the naming is cause of the confusion, maybe OP_MULCOMPOSITES would be clearer.

The reason given for considering OP_MULCOMPOSITES is that no standalone usecase was demonstrated for OP_MULMOD besides along side OP_MULDIV for higher order math emulation, and hence the question if mulmod only ever appears together, should it have a standalone opcode (and a similar reasoning for OP_ADDMOD which could be changed to OP_ADDCOMPOSITES instead).

I think the one opcode saved by returning both items (even if it would not introduce stack management overhead) is minimal benefit. The drawbacks of it are that the VM instructions would be less straightforward (see naming confusion for example) but it also complicates any future usecase of just mulmod, introducing overhead.

emergent_reasons · August 1, 2023, 2:18am

Whatever decision you guys fall on is fine. Regarding the naming, divmod is the name for the composite operation in python, second most popular language in the world. Possibly in others. It is also directly semantic while “composites” does not.

The reasoning for not replacing MULMOD with MULDIVMOD is weak:

Relative cost: 1 drop in the worst case (expected to be rare)
Relative gain: 1 opcode and 3 data pushes (expected to be common)
Naming: non-issue, it’s already a standard name for millions of developers
Compiler: More complex (is it really?) logic needed - this is the only point in favor of MULMOD that I can think of.

bitcoincashautist · August 1, 2023, 2:41am

Agreed. I think MULDIVMOD is a better choice.

You lay it out good:

When you need just MULMOD (when? uncommon) it just costs 1 drop
Significant savings when you need both (common), I suspect also on stack juggling
We can bikeshed the naming some more, but it’s a non-argument IMO, we get functionality and with clear description users shouldn’t be confused about how it works
Few more arithmetic and stack ops hardly qualify as “more complex” IMO. Both proposed opcodes are simple. 3 stack items in, simple arithmetics to produce result or fail, 1 or 2 stack items out.

And I’m not sure if there’s utility in having standalone ADDDIV. Standalone muldiv is very useful when working with fractions of 2x int64, not sure about standalone ADDDIV.

It’s hard to defend standalone MULDIV too even though use case can be demonstrated, it’d be redundant just to save 1 byte, and if we emulate higher ints then even the use for multiplying with a fraction would need the MULDIVMOD. If we go for just the 2 (ADDDIVMOD and MULDIVMOD) then we could slot them into the arithmetics group

emergent_reasons · August 1, 2023, 2:44am

I didn’t even think about going that far… makes sense too whatever you guys work out should be good All options are flavors of “good”.

im_uname · September 11, 2023, 7:29am

While Satoshi himself was notorious in providing multiple opcodes with overlapping function, imo there’s a lot of value in using just one opcode (MULDIVMOD, and ADDDIVMOD). Breaking it down with drop doesn’t seem terribly costly, and there is a nonzero maintenance cost associated with maintaining more opcodes, or introducing them over multiple cycles.

With that said I’ll be happy with a vanilla MULDIV too, it’s just a bit… suboptimal?

I see this as a good candidate for activation May 2025 (next cycle) given its late introduction and slower pace in testing and legwork. Meanwhile it’ll be nice to get the holes filled in:

Evaluation of alternatives need to mention the cost of complexity too. Emulation has a nontrivial cost that can introduce unexpected overflow vulnerabilities, and bumping precision again has a nontrivial maintenance cost (especially for non-saotshi implementations). Please evaluate/mention those.
A comparison between the two-opcode solution (MULDIVMOD / ADDDIVMOD) and four-opcode solution should be added. How many expected usecases of MULMOD alone are expected out there?
It’ll be nice to get concrete code demonstrations from Cauldron and Fex on code simplification, instead of just a high level description.

Looking forward to this otherwise!