Research: Block difficulty as a price oracle

I had a hypothesis that difficulty could be used to estimate the price of BCH in MWh, which I spent some time researching, and here I will share my findings.

First, recall that:

  • Block headers encode a 4-byte compressed_target which is a custom scientific notation: {3 byte mantissa}{1 byte exponent} where we obtain the int256 target by doing: mantissa * 2^((exponent - 3) * 8);
  • To be accepted by the network, block header hash must satisfy block_hash <= target;
  • Chainwork contribution is the expected number of hashes given by 2^256 / (target + 1), and cumulative chainwork is used to resolve which is the “longest” chain;
  • max_target is that of genesis block, which had compressed_target = 0xffff001d and is the easiest PoW;
  • Difficulty is defined as max_target / target;
  • Difficulty of 1 is then equivalent to chainwork of 4295032833, which is equal to 4.29 gigahashes.
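
As a sanity check, here is a minimal Python sketch of these definitions (function names are mine; the nBits value is written in integer form, 0x1d00ffff, which is what serializes little-endian as ffff001d):

    MAX_TARGET = 0x00FFFF * 2 ** (8 * (0x1D - 3))  # genesis target, difficulty 1

    def decode_target(nbits: int) -> int:
        """Expand the compact {mantissa}{exponent} encoding into the int256 target."""
        exponent = nbits >> 24          # high byte of the integer form
        mantissa = nbits & 0x007FFFFF   # low 3 bytes, sign bit masked off
        if exponent <= 3:
            return mantissa >> (8 * (3 - exponent))
        return mantissa << (8 * (exponent - 3))

    def chainwork(target: int) -> int:
        """Expected number of hashes needed to find a block at this target."""
        return 2 ** 256 // (target + 1)

    def difficulty(target: int) -> float:
        return MAX_TARGET / target

    genesis_target = decode_target(0x1D00FFFF)
    assert difficulty(genesis_target) == 1.0
    print(chainwork(genesis_target))  # 4295032833, i.e. ~4.29 gigahashes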

We can think of hashes as a commodity extracted by the miners and then sold to blockchain network(s) in exchange for block reward.
We postulate that each block is a price point, where the full block reward (subsidy + fees) is exchanged for the hashes. The network is the buyer, and miners are the sellers.
If miners are over-producing, the DAA raises the difficulty, which slows blocks back down and lowers the price paid per hash; if they’re under-producing, it lowers the difficulty, which speeds blocks back up and raises the price per hash. The DAA functions as an automatic market maker (AMM).

If that is so, we can define the BCHGH price as (difficulty / (subsidy + fees)). Each block header & coinbase TX is then a native price oracle for BCHGH. We don’t need any external data to know this price; the blockchain itself records it.
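
For concreteness, a small sketch of that price, with the difficulty-to-gigahash constant from the list above made explicit (so this is just a unit-scaled version of difficulty / (subsidy + fees)):

    GH_PER_DIFFICULTY = 4_295_032_833 / 1e9  # ~4.295 GH of expected work per unit of difficulty

    def bchgh_price(difficulty: float, subsidy_bch: float, fees_bch: float) -> float:
        """Gigahashes bought per BCH of block reward, read from header + coinbase only."""
        return difficulty * GH_PER_DIFFICULTY / (subsidy_bch + fees_bch)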

But can we somehow link hashes to something in the real world? It is impossible to produce a hash without spending energy, so if we know the average amount of energy used to produce a hash then we could have some estimate. But the amount of energy changes with each new generation of ASICs.

Thankfully, it looks like we can model the efficiency gains using the Logistic function. I fit one such curve to the data (source, fetched on 2024-11-08) and got:

  • asic_efficiency = 700 / (1 + e^(-9.67E-09 * (x - 1931558400))) (unit: GH/J)

where:

  • x is epoch time
  • numerator 700 sets the asymptote, roughly corresponding to 0.5nm tech (further 10x improvement from current 5nm tech)
  • factor of 9.67E-09 chosen so that the curve’s slope on a log scale matches the slope of the data points
  • offset 1931558400 to fit the data points
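
In code form this is just the fitted formula restated; the epoch value in the example is my own conversion of the fetch date:

    import math

    def asic_efficiency(epoch_time: float) -> float:
        """Fitted Logistic curve for ASIC efficiency, in GH/J."""
        return 700 / (1 + math.exp(-9.67e-09 * (epoch_time - 1931558400)))

    print(asic_efficiency(1731024000))  # 2024-11-08 -> roughly 88 GH/J under this fit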

[image: Logistic fit of ASIC efficiency (GH/J) over time]

With this, we can infer BCHMWH price using (difficulty / asic_efficiency) / (subsidy + fees).
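
A hedged sketch of that inference, with the unit conversions spelled out (constants and names are mine; asic_efficiency is the fitted curve above):

    GH_PER_DIFFICULTY = 4_295_032_833 / 1e9
    JOULES_PER_MWH = 3.6e9  # 1 MWh = 3.6e9 J

    def bchmwh_price(difficulty: float, timestamp: int,
                     subsidy_bch: float, fees_bch: float) -> float:
        """Inferred MWh of energy bought per BCH of block reward."""
        hashes_gh = difficulty * GH_PER_DIFFICULTY
        energy_mwh = hashes_gh / asic_efficiency(timestamp) / JOULES_PER_MWH
        return energy_mwh / (subsidy_bch + fees_bch)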

Then, to see whether this would map to external prices, I pulled the FRED global energy price index ($100/MWh in 2016) and obtained an inferred BCHUSD.

Let’s see how the price inferred this way matches real prices (CoinMarketCap), for both BTC and BCH.

[images: inferred vs. actual price, BCHUSD and BTCUSD]

Not perfect, not bad. Before 2015, the mining market and ASIC development were still immature, but the value of the block reward was enough to crowdfund rapid catching up with state-of-the-art chip-making. Since it caught up, we can observe a better correlation between inferred and actual prices.

Use of difficulty price oracle for minimum relay fee algorithm

What I really wanted to see is whether we could use difficulty to automatically set the minimum relay fee when new price highs are reached, in order to minimize the lag between the price making new highs and people coordinating a reduction of the min. fee.

To avoid depending on external data, the idea is to set the min. fee in watt-hours rather than sats or USD, and then use the ATH of BCHMWH to convert it to sats.
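
A minimal sketch of that rule, assuming the inferred BCHMWH price defined above (unit constants and names are mine):

    SATS_PER_BCH = 100_000_000
    WH_PER_MWH = 1_000_000

    def min_fee_sats_per_byte(ath_bchmwh: float, wh_per_byte: float = 0.1) -> float:
        """Fee target fixed in energy terms, converted to sats using the ATH inferred price."""
        wh_per_bch = ath_bchmwh * WH_PER_MWH  # energy one BCH bought at the ATH, in Wh
        return wh_per_byte / wh_per_bch * SATS_PER_BCH

Passing the current inferred BCHMWH instead of the ATH gives the freely floating variant shown further below.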

Here’s how that would look for 0.1 Wh/byte:

[images: min. fee implied by ATH BCHMWH at 0.1 Wh/byte]

As we can see, setting it this way would keep the min. fee under $0.01. BCH would have it lower because its current price is about 10% of the last ATH, while BTC is near its ATH.

If we used the current price instead of the ATH price, then the min. fee would not only go down with new ATHs, it would float freely and would look like this:

[images: min. fee implied by the current (freely floating) BCHMWH]

My first question is about the current DAA: for finding the compressed_target in the block header, the exponent is reduced by 3. I’m assuming this is just so the value can be negative or positive without the need of a sign bit?

Secondly, why did you choose this specific curve to fit the data, and why put the asymptote at 700?

This is not a feature of the current DAA but of the block header format, as decided by Satoshi. An exponent of 0 “erases” whatever is in the 3 mantissa bytes by shifting them right, and then the target is 0.
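
Concretely, plugging exponent 0 into the formula from the first post:

    mantissa, exponent = 0x00FFFF, 0x00
    # mantissa * 2^((exponent - 3) * 8) is a right shift by 24 bits, wiping all 3 mantissa bytes
    print(mantissa >> (8 * (3 - exponent)))  # 0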

Moore’s law is pure exponential, which is why it will eventually be broken: in nature many things start off exponential, but nothing can stay exponential forever, because there are physical limits to everything.

The Logistic function models a process where the exponential growth rate decays with time, like in population growth. See Logistic function - Wikipedia

700 is a nice round number roughly 10x from where we are now with 5nm tech, and just a guess that efficiency gains will almost fully diminish at about 0.5nm (reminder that the diameter of the smallest atom, hydrogen, is 0.1nm, the diameter of Si is about 0.2nm, and you need more than one atom to make a gate).

Btw, look what I found now: David Burg, Jesse H. Ausubel, “Moore’s Law revisited through Intel chip density”

They found that the data fits two Logistic curves:

Sigmoidal trends of processor evolution

The density of transistors was then fit to Eq (2), resulting in a well-defined bi-logistic trend (Fig 4A). Interestingly, both phases have characteristic times (Δt_i) of 9.5 years. Midpoints of these distinct growth curves occurred circa 1979 and 2008, with approximately 30 years separating them.

So, who knows, maybe ASICs will saturate as predicted, but then a new paradigm will be discovered and kick off a new Logistic curve.
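
For reference, a bi-logistic curve of that kind is just the sum of two Logistic waves. The ln(81)/Δt parametrization below (where Δt is the 10%-to-90% growth time) is a common convention for such fits, but I haven’t checked that it matches the paper’s Eq (2) exactly, and the parameters would have to be re-fit to ASIC data:

    import math

    def logistic(t, ceiling, growth_time, midpoint):
        """One Logistic wave: saturates at `ceiling`, 10%-to-90% rise takes `growth_time`."""
        return ceiling / (1 + math.exp(-math.log(81) / growth_time * (t - midpoint)))

    def bi_logistic(t, k1, dt1, tm1, k2, dt2, tm2):
        """Two stacked waves, e.g. the ~1979 and ~2008 midpoints found in the paper."""
        return logistic(t, k1, dt1, tm1) + logistic(t, k2, dt2, tm2)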

Thanks, whoops, I was totally misinterpreting that first part; it’s just how the difficulty number is represented.

Using a logistic function intuitively makes total sense in a situation like this. The method of approximating where the efficiency gains hit a limit is the hard part. It sounds quite reasonable to assume the gains diminish at the atom level.


Slightly off topic, but now I imagine us discovering a blockchain broadcast through space; we measure the hashes and determine that whatever civilization produced that blockchain must have had access to some X amount of energy, setting a lower bound for their location on the Kardashev scale, effectively telling us that we’re in the Dark Forest and should be really, really careful about what we do next -.-

A bit more on-topic: this is interesting. I’m not sure if the slight discrepancy between the USD cost you predict with this and the market price is due to inefficiency in the model or in the market, to be honest. It’s probably a bit of both. Either way, to use the data point in a contract we would need to either support introspection of the blockchain / headers, or a contract that validates the block headers and outputs the price points as tokens. It would be interesting to see if such a contract could also automate the token distribution through an AMM and let the market determine the price of this data.


It needs a guess about the efficiency of their hardware. It could still be a measure of how much more advanced than us they are, but we couldn’t tell whether that’s due to 100x more efficient hardware or 100x more energy spent; likely both.

Of course there’d be discrepancy, because in reality ASICs don’t magically and smoothly get replaced with the latest model at zero cost. Also, miners’ average energy cost is not equal to the world average energy cost.
What’s interesting is that there’s not so much discrepancy from 2018 onwards, possibly because in a developed market the Logistic curve models ASIC saturation, too.

That’s what got me here: I want to have an on-chain header oracle so we can trustlessly speculate on GH/s and on GH/BCH, and have a GH-overcollateralized stablecoin.

I have a work-in-progress oracle design that validates block headers and emits verified state as tokens, but unfortunately I can’t reliably extract total fees, because that would require parsing coinbase transactions too, and if they’re too big they’d be impossible to parse due to VM limits.

Possible solutions:

  • introspection of blockchain headers (& some aggregate data like total fees collected)
  • streaming hash opcodes (so you could perform hash of a big message by splitting the job to multiple chained TXs)
  • different limits for coinbase and other TXs, so that a coinbase can reliably fit inside a non-coinbase TX’s unlocking bytecode
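
For reference, here is a minimal off-chain sketch (Python, not VM bytecode) of the header-validation step such an oracle has to replicate; the fee-extraction problem above is exactly the part this doesn’t cover:

    import hashlib
    import struct

    def sha256d(data: bytes) -> bytes:
        return hashlib.sha256(hashlib.sha256(data).digest()).digest()

    def validate_header(header: bytes, prev_hash_le: bytes) -> dict:
        """Check an 80-byte header extends prev_hash_le and meets its own encoded target."""
        assert len(header) == 80
        assert header[4:36] == prev_hash_le, "does not extend the expected tip"
        timestamp, nbits = struct.unpack_from("<II", header, 68)
        exponent, mantissa = nbits >> 24, nbits & 0x007FFFFF
        target = (mantissa << (8 * (exponent - 3)) if exponent > 3
                  else mantissa >> (8 * (3 - exponent)))
        block_hash = sha256d(header)
        assert int.from_bytes(block_hash, "little") <= target, "insufficient PoW"
        return {
            "hash": block_hash[::-1].hex(),
            "merkle_root": header[36:68][::-1].hex(),
            "time": timestamp,
            "chainwork": 2 ** 256 // (target + 1),
        }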

I once purposely bought some old secondhand ASICs that could never mine profitably in my area due to a high price per kWh. But by placing them in my basement I was able to offset some of my heating bill, because I didn’t have to heat the floor above as much anymore.

Even though the actual price per hash was higher in this case, the fact that they were still online providing hash power was because the heat produced was also perceived as value, and so was part of the equation for the AMM.

Aside from this niche use case I would expect the model to become more accurate over time.
