The need for additional Math opcodes

I read about the concept of a “logarithmic scoring rule” for automated market makers (AMMs) in the Truthcoin whitepaper. I expect AMM-style DEXes to play a big role in Bitcoin Cash going forward, so I want to start the discussion on whether the May 2024 upgrade would be a good time to add extra math primitives to Bitcoin Cash to enable such AMMs. The math primitives that come to mind are log, exp, root and pow.

I found this article visualizing different AMM curves. It shows how the Balancer protocol on Ethereum uses an exponential curve. To allow maximum experimentation with different AMM equations, it would be good to have these additional math primitives.

@bitjson wrote in his list of “what future opcodes could make sense in the BCH model?”

Given that bounded loops might not be a priority for the next upgrade, getting those new math primitives could enable new designs that are not possible without them.

5 Likes

Good discussion to start.

We would definitely want to base this on a known body of knowledge for integer operations. When mixing integers and exponentials, there are going to be edge cases everywhere. I’m not aware of such a standard, but probably somebody is.

  • I would think that root can be wrapped into pow, since it’s just a specialized version of pow.
  • I would think exp (I assume you mean e^x) could be wrapped into pow as well?
  • Since you mentioned (I think) e, a log with a base argument could also cover the natural logarithm.

All that to say: if we allow a rich set of arguments (which represents a RISC style), then just these two should cover it all?

  • pow
  • log

On the other hand, if we go with a fewer-arguments, more-opcodes CISC style to cover the common cases directly, would these suffice? (A quick sketch of how the variants reduce to the two primitives follows the lists.)

  • pow + epow (+ maybe 2pow)
  • 10log + elog (+ maybe 2log)
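
For illustration only, here is how the variants reduce to a two-argument pow and a two-argument log. This is floating-point Python, not a spec; the actual opcodes would need precise integer or fixed-point semantics:

import math

def op_pow(base, exponent):
    return base ** exponent

def op_log(x, base):
    return math.log(x, base)

def op_root(x, n):               # n-th root is pow with a fractional exponent
    return op_pow(x, 1 / n)

def op_exp(x):                   # epow: e^x is pow with base e
    return op_pow(math.e, x)

def op_log2(x):                  # 2log / 10log / elog are log with a fixed base
    return op_log(x, 2)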
1 Like

I’d be ok with either, but voicing my opinion just in case someone else has a stronger feeling about them:

If we go with the RISC approach and a minimal number of opcodes, I assume that CashScript and similar tools will fill in the gap by offering the other versions and compiling them down to the limited set of raw opcodes.

If we go with the more extensive set, then we will be more consistent with the rest of the opcodes (and there’s some value in consistency), and there might be value in edge-case handling that is context dependent (I don’t know, I’m not a math guy).

2 Likes

We should always take into consideration how much processing power it will take to mine blocks with such opcodes when block sizes reach 1-10 GB, assuming for some reason (think a VERY popular merchant app) 30% or more of transactions are filled with multiple such opcodes.

There should be a simulation done on today’s average PC (4-8 cores, 16 GB RAM) to determine how long such a block would take to process in that scenario.

Not breaking the P2P Cash / Big Blocks goal by accident should continue to be one of our main areas of focus.

1 Like

AFAIK an implementation of integer pow/log is just some arithmetic & bit-shift ops inside bounded loops; it shouldn’t stick out as more expensive than some other opcodes.
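
A minimal sketch of what that looks like (Python, illustrative only; overflow checks and exact rounding rules are omitted):

def ipow(base, exp):
    """Square-and-multiply: at most ~64 iterations for a 64-bit exponent."""
    result = 1
    while exp > 0:
        if exp & 1:
            result *= base        # a real opcode would fail here on int64 overflow
        base *= base
        exp >>= 1
    return result

def ilog2(x):
    """Floor of log2(x) by counting shifts; fails on non-positive input."""
    if x <= 0:
        raise ValueError("log of non-positive number")
    n = 0
    while x > 1:
        x >>= 1
        n += 1
    return n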

2 Likes

I would not underestimate the importance of a precise specification of any new opcodes, which must be specified down to the last bit for all possible inputs, and then of exactly how any implementation can be thoroughly validated under all possible edge conditions.

There is a long history of low quality math library calculations starting with the difference between real numbers, rational numbers and integers, continuing through coding errors in library routines, compiler errors, and even processor errors. Remember the Pentium FDIV bug?

2 Likes

Speed is not an issue. The issue is making sure we don’t have a repeat of block 74638 (the 2010 value-overflow bug).

1 Like

An area of research that belongs in this discussion: what’s the most efficient way to emulate higher-precision math in the BCH VM?

Right now the BCH VM is in a fantastic position: math operations are very flexible, yet not so expensive as to require per-operation accounting.

I think any future math-related upgrades should maintain this performance characteristic. It would be a shame if we had to complicate the VM by e.g. counting the number of executed exponentiation operations or, worse, implementing some form of “gas”.

One possible solution is to have new operations simply error on overflows, so that the ~64-bit VM integer size naturally keeps the required computation in proportion to bytecode length. Contract authors are then properly incentivized to simplify and reduce the need for higher-precision math, as higher-precision calculations require additional bytes to emulate. E.g. for Jedex’s market making algorithm to support tokens with total supply greater than 2^55, we can either 1) emulate ~150-bit unsigned multiplication or 2) use a “Higher-Denomination Token Equivalent (HDTE)” within the market maker.
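
For concreteness, the erroring-on-overflow behaviour described here amounts to something like the following sketch, assuming the VM’s 64-bit signed integer range:

INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1

def checked_mul(a, b):
    result = a * b
    if not INT64_MIN <= result <= INT64_MAX:
        raise OverflowError("script fails: result exceeds the VM integer range")
    return result

Anything wider than that has to be emulated in extra bytecode, which is what keeps computation cost tied to bytecode length.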

For use cases where simply performing the higher-precision math is most advantageous, it would be great if some researcher(s) could also figure out:

  1. The most byte-efficient way to perform those operations, and
  2. What further capabilities would be useful for reducing the bytecode length of emulating much higher precision operations while maintaining the correlation between bytecode length and computation cost (e.g. various bit shifts, SIMD, bounded loops, etc.)

With CashTokens now, it’s already possible to implement essentially unlimited-precision math in contracts, even without changes to the VM limits: you can perform part of the computation in each contract execution, handing the result after each section of the program to the next <201-opcode contract. You really only start hitting limits at the 100KB transaction standardness limit (and consensus allows up to 1MB), and even that can be worked around using multiple transactions. (And with better-targeted VM limits, developers wouldn’t even need these workarounds.)

I started experimenting with building a general macro “library” for higher-precision math into Bitauth IDE last year as part of the Jedex demo, but I haven’t made it very far. I’ll bet other developers could make much faster work of the problem!

So for anyone interested in this topic:

  • I’d love to see some research and user-space libraries that enable >128-bit precision multiplication and division;
  • In the context of new proposals, I’d love to see examples using new proposed math opcodes in higher-precision calculations too, e.g. an example of >128-bit exponentiation using a proposed OP_EXP.

I expect this research is required to evaluate the long term impacts of various implementation details in any new, math-related upgrade proposals: API, performance considerations, overflow behavior, etc.

6 Likes

Still, it’s probably better to check it right now than to do a hard-fork removal of such new opcodes 6 years from now because they somehow turn out to be too slow for mining & relaying 5 GB blocks…

We run a risk of the BCH protocol getting frozen up later in its lifetime, so it would be wiser to double-check and make sure before we go there.

I implemented some uint256 arithmetic using Script VM int64, with the objective of creating a native Bitcoin-compatible header chainwork oracle.
I got as far as calculating a single block’s chainwork from the block’s compressed target (script unlock_verify_chainwork).
The next step would be to instantiate a CashTokens covenant chain that starts from some header, adds up the chainwork, and keeps the height & cumulative chainwork in the NFT’s commitment.
Then, anyone could obtain a copy of the state as an immutable NFT, for use in contracts that want the information.
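
For reference, the arithmetic that such a script has to reproduce is the standard chainwork formula. This sketch assumes a well-formed compact target with exponent >= 3 and ignores the compact format’s sign/overflow flags:

def target_from_nbits(nbits):
    exponent = nbits >> 24
    mantissa = nbits & 0x007FFFFF
    return mantissa << (8 * (exponent - 3))

def block_work(nbits):
    # work = floor(2^256 / (target + 1))
    return (2**256) // (target_from_nbits(nbits) + 1)

# e.g. the genesis nBits 0x1d00ffff give block_work(0x1d00ffff) == 0x100010001.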

What I noticed is this: it takes a lot of bytecode to cast back and forth between uint32 and ScriptNum. I think an opcode for this would be beneficial for anyone trying to do higher-precision math using lower-precision VM primitives.

2 Likes

Someone in BSV has made a library implementing log, exp, root and pow. I’m not sure how representative it is for BCH (given that OP_MUL & 64-bit integers were added). In their article they write:

The exponential function is smallest with only 8kb. The logarithm comes in with 17kb while root and pow require 25kb each as they are using exp and log.

Without new math primitives, developers who want to use AMM equations involving exp and log will have to use similar workarounds. If these numbers are representative, it would mean at the very least 125 UTXOs to use pow (not counting the overhead of extra code required to split over so many UTXOs) without any changes to the VM limits. This seems practically prohibitive; the Targeted Virtual Machine Limits CHIP would be required to realistically allow for this kind of experimentation.

The erroring-on-overflow is how I thought it would work. If there is a large need for such a “general macro library for higher-precision math”, then it seems our choice of 64-bit integers was quite restrictive after all. Re-reading the Bigger Script Integers CHIP from ~2 years ago, it seems it was totally unanticipated that there would be a real need for higher-precision math.

If this becomes a common use, it might make sense to revisit this idea by @markblundeberg

3 Likes

I think one nice general-purpose improvement would be OP_MULDIV, so you could multiply any 64-bit number by a fraction made of any two int64s, and it would preserve full 64-bit precision for the result (or fail on overflow).

It would save contract authors from having to test for overflows when doing something like <3> OP_MUL <1000> OP_DIV; they could just multiply by any fraction less than 1 and be sure it won’t overflow and that they’ll get a precise result.
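
A worked example of why the order of operations hurts today (a sketch, assuming 64-bit signed VM integers):

INT64_MAX = 2**63 - 1
x = 3_100_000_000_000_000_400     # fits in int64

# <3> OP_MUL <1000> OP_DIV overflows the intermediate product:
assert x * 3 > INT64_MAX          # so the multiply-first script would fail

# Dividing first avoids the overflow but loses the remainder:
assert x // 1000 * 3 == 9_300_000_000_000_000
assert x * 3 // 1000 == 9_300_000_000_000_001   # what a wide-intermediate muldiv would give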

2 Likes

Awesome, glad to hear you’re looking into it! :rocket:

The only way to end up with a uint32 “number” within the VM is to pull it in from a source not intended for the BCH VM, right? So this would really only be applicable to the specific case of contracts parsing block headers or signed data blobs containing uint32 numbers from outside the system?

Unless I’m misunderstanding, these “uint32 numbers” are essentially strings regardless of whether or not conversion utilities exist – they still don’t work with any other operations or have any native use within the VM.

Blessing a particular uint32 format with VM opcodes would be similar to e.g. adding “utf8 number” utilities to cast numbers to/from utf8 strings. In fact, the only other number formats that are already inherently “supported” by the VM are the r and s values encoded in signatures and the x and (sometimes) y values encoded in public keys. But here again, there’s little point in us considering them anything but strings from the perspective of the VM.

Interesting, thanks for sharing! I expect those numbers are massively inflated by the lack of some bounded looping mechanism – even very simple algorithms that could be represented in <20 bytes require ~8x the operations simply to re-apply the same operations across each byte, or even ~64x for operations at the bit level. You also have to pay the cost of all IF/ELSE branches within each iteration, even if most are unused. Bytecode length ends up being pretty disconnected from computation costs.

I should have clarified that bullet point more. The next paragraph added more context:

That remains the case – 64-bit numbers are an excellent, conservative choice for our “base numbers” (though we were careful to ensure future upgrades can expand further), and I still think that range supports most practical use cases.

It’s important to distinguish between numbers used for representation of real values (like satoshis, shares, votes, tickets, etc.) and numbers used in intermediate calculations by contracts. Bumping up our base number range to 128 or 256 bits wouldn’t help with those intermediate values; we’d just need to e.g. emulate 512-bit fixed precision multiplication to fully support the new larger number values (or we’re back to this same problem of needing to sanitize inputs by limiting them to some value lower than the maximum range of the base numbers).

You can see this in practice on Ethereum: 2^256 numbers haven’t eliminated the need for big-integer libraries; instead, contracts need to emulate even higher precision to avoid sanitizing inputs of the larger base numbers. E.g. in practice, essentially no tokens actually need 18 decimal places in the real world (network fees alone often cost over half of those decimal places), but the contract ecosystem is forced to pay an ongoing cost for that unused “base” precision.

Anyways, I think the real solution here is to figure out the most byte-efficient “macro” to emulate higher-precision computations in BCH VM bytecode, then optimize that macro to be relatively byte-efficient without fully offloading it to the VM implementation. This keeps bytecode length tied to computation costs, incentivizing contract authors to conserve both (without requiring the overhead of some arbitrary operation-counting or gas system). Any optimizations here also continue to pay dividends even if future VM numbers expand to 128-bit or larger numbers, as it seems that higher-precision math is always useful for some intermediate calculations.

(N.B.: My perspective on distinguishing arithmetic from crypto math is also informed by an expectation that sidechains are significantly more efficient than automata for a wide variety of use cases.)

Awesome! I’d love to see more detail on this idea; this seems like exactly the kind of optimization I’m hoping we can find to simplify practical implementations of higher-precision math libraries, both for existing operations and for proposed operations like exp and log.

4 Likes

Context is my uint256 implementation. I push the uint256 as a 32-byte stack item and split it into 8x 32-bit uints. Then I have to do operations on them to compute the big number, yeah? But because script numbers are signed, I must append 0x00 to each of them, then do the math, then strip any 0x00, and finally concatenate them all to get the uint256 result.
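
A sketch of that round trip in Python (illustrative; real minimal-encoding rules have a few more edge cases):

def uint32_to_scriptnum(limb):
    """4-byte little-endian unsigned limb -> minimally encoded non-negative ScriptNum."""
    num = limb.rstrip(b"\x00")            # minimal encoding: drop trailing zero bytes
    if num and num[-1] & 0x80:            # top bit set would read as negative,
        num += b"\x00"                    # so append an explicit 0x00 sign byte
    return num

def scriptnum_to_uint32(num):
    """Non-negative ScriptNum -> fixed 4-byte little-endian unsigned limb."""
    if num and num[-1] == 0x00:           # strip the sign byte if present
        num = num[:-1]
    return num.ljust(4, b"\x00")          # re-pad to the fixed limb width

Each of those steps has to be spelled out in raw script for every limb, which is where the byte count adds up.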

Thanks, this is the context I was missing :pray:

3 Likes

I think fixed-point arithmetic is the direction we should look into: Fixed-point arithmetic - Wikipedia

IMO it’s a natural extension of our basic integer arithmetic, and we’d have consistent granularity of results. The exp and log ops could be popping 4 stack items (2 fractions of 2x 64-bit ints) and pushing 2 stack items (a fraction of 2x 64-bit ints).

Let’s consider the OP_MULDIV primitive:

Word: OP_MULDIV
Value: ???
Hex: 0x??
Input: a b c
Output: out
Description: a is first multiplied by b, then divided by c. The intermediate product CAN be in the INT128 range, while the final result MUST be in the INT64 range.
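
Illustrative reference semantics for the above (Python, not a spec; rounding for negative operands would need to match OP_DIV’s truncation and is glossed over here):

INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1

def op_muldiv(a, b, c):
    """out = (a * b) / c, with the intermediate product allowed to exceed 64 bits."""
    if c == 0:
        raise ValueError("script fails: division by zero")
    out = (a * b) // c                    # Python ints are unbounded, so the INT128
                                          # intermediate needs no special handling here
    if not INT64_MIN <= out <= INT64_MAX:
        raise OverflowError("script fails: result outside the INT64 range")
    return out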

We can implement an INT32 version as a macro op_muldiv: OP_ROT OP_ROT OP_MUL OP_SWAP OP_DIV, but then we’re limited to INT32 precision for the result. Still, we can use the macro just to demonstrate the utility. Consider this example:

Multiply some x by 0.999998^10 at max precision:

<1000000000> <0xffffff7f> OP_DUP
<999998> <1000000>
OP_2DUP OP_2ROT OP_2ROT op_muldiv OP_2SWAP
OP_2DUP OP_2ROT OP_2ROT op_muldiv OP_2SWAP
OP_2DUP OP_2ROT OP_2ROT op_muldiv OP_2SWAP
OP_2DUP OP_2ROT OP_2ROT op_muldiv OP_2SWAP
OP_2DUP OP_2ROT OP_2ROT op_muldiv OP_2SWAP
OP_2DUP OP_2ROT OP_2ROT op_muldiv OP_2SWAP
OP_2DUP OP_2ROT OP_2ROT op_muldiv OP_2SWAP
OP_2DUP OP_2ROT OP_2ROT op_muldiv OP_2SWAP
OP_2DUP OP_2ROT OP_2ROT op_muldiv OP_2SWAP
op_muldiv
OP_SWAP op_muldiv

Result: 999979999

The above example essentially unrolls an OP_POW by using the naive method, which could be further optimized by the square & multiply algorithm:

<1000000000> <0xffffff7f> OP_DUP
<999998> <1000000> op_muldiv
OP_DUP OP_2 OP_PICK op_muldiv OP_DUP
OP_DUP OP_3 OP_PICK op_muldiv
OP_DUP OP_3 OP_PICK op_muldiv
OP_2 OP_PICK op_muldiv
OP_SWAP op_muldiv

With bigger VM limits, one could use this as a building block to implement other functions using Taylor series.
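
As a toy example of the kind of thing that becomes practical (Python; SCALE and the term count are arbitrary choices here, and for the example call at the bottom each intermediate product stays under 10^18, comfortably within int64):

SCALE = 10**9                             # fixed-point scale for this sketch

def fp_exp(x, terms=12):
    """Approximate exp(x / SCALE) * SCALE with truncating muldiv-style steps."""
    result = SCALE                        # running sum, starts at 1.0
    term = SCALE                          # current Taylor term, starts at 1.0
    for n in range(1, terms):
        term = term * x // (SCALE * n)    # term *= x / n  (one muldiv step)
        result += term
    return result

# fp_exp(SCALE) == 2718281821, within a few parts per billion of e * SCALE (2718281828);
# the small error comes from truncation at each step.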

I found that Ethereum folks are considering it too:

and they already have two similar opcodes that may be useful for other purposes: OP_ADDMOD, OP_MULMOD

3 Likes

Just a historical note about this. It was discussed heavily. 128-bit would obviously be more useful, but it comes with heavy costs, especially at the time when these were some of the first CHIPs and we had fewer resources / eyes / developers than today. 64-bit has standards and standard implementations everywhere that require mostly just a little boundary drawing to fit the VM. 128-bit is a different beast that would require custom implementations and probably new standards. More risk.

I would have loved to have 128, but it wasn’t worth the cost at the time.

If 128 ever really became desired, it’s a relatively straightforward upgrade path from 64 to 128 from a behavior perspective. The only difference would be that some contracts that failed due to overflow would now not fail. As far as I know, that would only matter for an app that specifically uses overflow as a logic mechanism.

2 Likes

Now there are already 2 CashTokens AMMs: Cauldron Swap & Fex.Cash!
Cauldron Swap has already published its whitepaper, and Fex.Cash is preparing to do so soon. Interestingly, Cauldron Swap is written in raw BCH Script while Fex.Cash is written in CashScript.

For these AMMs using x * y = k, the discussed op_muldiv opcode would help address overflows of the 64-bit multiplication. Rosco Kalis explained that the old version of AnyHedge, using 32-bit integers, had to do the same kind of complex math workaround that AMMs now have to do to address overflows.
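
To make that concrete, here is where the overflow-prone product appears in a constant-product swap (a sketch ignoring trading fees; pool sizes and amounts are 64-bit integer satoshi/token counts):

def swap_output(pool_in, pool_out, amount_in):
    # From (pool_in + dx) * (pool_out - dy) = pool_in * pool_out it follows that
    #   dy = pool_out * dx / (pool_in + dx).
    # The product pool_out * amount_in can exceed the int64 range even when dy
    # itself fits, which is exactly the case a muldiv-style opcode handles natively.
    return pool_out * amount_in // (pool_in + amount_in)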

Discussing the future of AMMs on BCH, @im_uname wrote:

xy=k is one of those ideas that sound so dumb when you hear it, but turns out to be so convenient people are willing to use it to the tune of billions despite being inefficient as all hell

Eventually, new math opcodes or the emulation thereof will be important for enabling experimentation with more advanced and efficient AMM functions.

@bitcoincashautist shows above that using op_muldiv it would be reasonable to emulate op_pow, and from there any AMM function could be approximated with Taylor series. I assume being able to emulate op_pow means op_exp & op_log could also be emulated more cheaply?

In conclusion, the new op_muldiv opcode suggested above seems like an important primitive which could see great use right away, both in simplifying constant product market maker DEXes and in emulating the missing math opcodes.

1 Like

Taking a closer look at the EVM arithmetic instructions, it is interesting that, of the operations discussed at the top, they only have exp (exponentiation with any base) and use math libraries for things like root & log.

Searching for an explanation of the op_addmod & op_mulmod use cases, I found this Medium post:

There are also opcodes that help the compiler in detecting overflows and underflows in the code, some of the common ones are ADDMOD, MULMOD and SMOD

It is notable that the op_muldiv EIP discussion linked by @bitcoincashautist above was initiated by Uniswap developers, who say:

In Uniswap V3, we make heavy use of a FullMath#mulDiv, which allows you to multiply two numbers x and y then divide by a third number d, when x * y > type(uint256).max but x * y / d <= type(uint256).max

Which is the point I was making above for AMM DEXes. The rationale for the EIP on ETH continues with:

This function is extremely useful to smart contract developers working with fixed point math. Hari (@_hrkrshnn ) suggested that we write an EIP to add a muldiv opcode.

This is a first important consideration: to enable fixed-point math, the workings of the op_muldiv opcode should be different than described above. Reading the EIP discussion further:

Some initial discussions we had when writing the EIP:
– Support both “checked” (abort on overflow) and “unchecked” (do not abort) case
– Whether to return multiple stack items or not
– Support accessing the hi bits (after division) or not
The current design wanted to avoid multiple opcodes and returning multiple stack items, hence we arrived at having the special case with z=0.

Having a special case like this for division by zero can be inconsistent and even dangerous for contract design, so apart from the number of stack items returned, the denominator-zero case should also be discussed.

Lastly, rereading the Market Making Algorithm of Jedex, it says:

Notably, because Jedex aggregates liquidity for every market tick, slippage for any particular order can be significantly minimized by other orders within the same tick (in a sense, pool sizes grow dynamically based on intra-tick demand); in this context, even the relatively-simple Constant Product function offers better liquidity than much more complex pricing functions used within other decentralized exchange designs.

It would be interesting to hear from @bitjson whether Jedex would benefit from a muldiv opcode, or how it works around the multiplication overflow issue.

1 Like