Scalability

One common concern about Ethereum is the issue of scalability. Like Bitcoin, Ethereum suffers from the flaw that every transaction needs to be processed by every node in the network. With Bitcoin, the size of the current blockchain rests at about 15 GB, growing by about 1 MB per hour. If the Bitcoin network were to process Visa's 2000 transactions per second, it would grow by 1 MB per three seconds (1 GB per hour, 8 TB per year). Ethereum is likely to suffer a similar growth pattern, worsened by the fact that there will be many applications on top of the Ethereum blockchain instead of just a currency as is the case with Bitcoin, but ameliorated by the fact that Ethereum full nodes need to store just the state instead of the entire blockchain history.

The problem with such a large blockchain size is centralization risk. If the blockchain size increases to, say, 100 TB, then the likely scenario would be that only a very small number of large businesses would run full nodes, with all regular users using light SPV nodes. In such a situation, there arises the potential concern that the full nodes could band together and all agree to cheat in some profitable fashion (e.g. change the block reward, give themselves BTC). Light nodes would have no way of detecting this immediately. Of course, at least one honest full node would likely exist, and after a few hours information about the fraud would trickle out through channels like Reddit, but at that point it would be too late: it would be up to the ordinary users to organize an effort to blacklist the given blocks, a massive and likely infeasible coordination problem on a similar scale to that of pulling off a successful 51% attack. In the case of Bitcoin, this is currently a problem, but there exists a blockchain modification suggested by Peter Todd which will alleviate this issue.

In the near term, Ethereum will use two additional strategies to cope with this problem. First, because of the blockchain-based mining algorithms, at least every miner will be forced to be a full node, creating a lower bound on the number of full nodes. Second and more importantly, however, we will include an intermediate state tree root in the blockchain after processing each transaction. Even if block validation is centralized, as long as one honest verifying node exists, the centralization problem can be circumvented via a verification protocol. If a miner publishes an invalid block, that block must either be badly formatted, or the state S[n] is incorrect. Since S[0] is known to be correct, there must be some first state S[i] that is incorrect where S[i-1] is correct. The verifying node would provide the index i, along with a "proof of invalidity" consisting of the subset of Patricia tree nodes needed to process APPLY(S[i-1],TX[i]) -> S[i]. Nodes would be able to use those Patricia nodes to run that part of the computation, and see that the S[i] generated does not match the S[i] provided.
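
The search for the first incorrect state can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the state is a plain dictionary of balances rather than a Patricia tree, and `apply_tx` and `first_invalid_index` are hypothetical names that do not come from any Ethereum client.

```python
from typing import Dict, List, Optional, Tuple

State = Dict[str, int]          # toy state: account name -> balance (assumption)
Tx = Tuple[str, str, int]       # toy transaction: (sender, receiver, amount)


def apply_tx(state: State, tx: Tx) -> State:
    """Toy stand-in for APPLY(S, TX): move `amount` from sender to receiver."""
    sender, receiver, amount = tx
    if state.get(sender, 0) < amount:
        raise ValueError("insufficient balance")
    new_state = dict(state)
    new_state[sender] = new_state.get(sender, 0) - amount
    new_state[receiver] = new_state.get(receiver, 0) + amount
    return new_state


def first_invalid_index(claimed: List[State], txs: List[Tx]) -> Optional[int]:
    """Return the first i where claimed[i] != APPLY(claimed[i-1], TX[i]).

    claimed[0] is the trusted starting state S[0]; TX[i] is stored at
    txs[i-1] because Python lists are 0-indexed. Returns None if every
    intermediate state checks out.
    """
    for i in range(1, len(claimed)):
        try:
            expected = apply_tx(claimed[i - 1], txs[i - 1])
        except ValueError:
            return i
        if expected != claimed[i]:
            return i
    return None


genesis = {"alice": 10, "bob": 0}
txs = [("alice", "bob", 3), ("bob", "alice", 1)]
honest = [genesis, {"alice": 7, "bob": 3}, {"alice": 8, "bob": 2}]
cheating = [genesis, {"alice": 7, "bob": 3}, {"alice": 8, "bob": 1002}]

print(first_invalid_index(honest, txs))
print(first_invalid_index(cheating, txs))
```

In the protocol described above, a verifier would publish only the index i and the Patricia nodes touched by that one transaction, so a light node re-runs a single step instead of the whole block.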

Another, more sophisticated, attack would involve the malicious miners publishing incomplete blocks, so the full information does not even exist to determine whether or not blocks are valid. The solution to this is a challenge-response protocol: verification nodes issue "challenges" in the form of target transaction indices, and upon receiving a challenge a light node treats the block as untrusted until another node, whether the miner or another verifier, provides a subset of Patricia nodes as a proof of validity.


Blockchain Scalability

One of the largest problems facing the cryptocurrency space today is the issue of scalability. It is an often repeated claim that, while mainstream payment networks process something like 2000 transactions per second, in its current form the Bitcoin network can only process seven. On a fundamental level, this is not strictly true; simply by changing the block size limit parameter, Bitcoin can easily be made to support 70 or even 7000 transactions per second. However, if Bitcoin does get to that scale, we run into a problem: it becomes impossible for the average user to run a full node, and full nodes become relegated only to that small collection of businesses that can afford the resources. Because mining only requires the block header, even miners can (and in practice most do) mine without downloading the blockchain.

The main concern with this is trust: if there are only a few entities capable of running full nodes, then those entities can conspire and agree to give themselves a large number of additional bitcoins, and there would be no way for other users to see for themselves that a block is invalid without processing an entire block themselves. Although such a fraud may potentially be discovered after the fact, power dynamics may create a situation where the default action is to simply go along with the fraudulent chain (and authorities can create a climate of fear to support such an action) and there is a coordination problem in switching back. Thus, at the extreme, Bitcoin with 7000 transactions per second has security properties that are essentially similar to a centralized system like PayPal, whereas what we want is a system that handles 7000 TPS with the same levels of decentralization that cryptocurrency originally promised to offer.

Ideally, there would be a blockchain design that has security properties similar to Bitcoin's with regard to 51% attacks, yet functions even if no single node processes more than 1/n of all transactions, where n can be scaled up as high as necessary, although perhaps at the cost of linearly or quadratically growing secondary inefficiencies and convergence concerns. This would allow the blockchain architecture to process an arbitrarily high number of TPS while retaining the level of decentralization that Satoshi envisioned.

Problem: create a blockchain design that maintains Bitcoin-like security guarantees, but where the maximum size of the most powerful node that needs to exist for the network to keep functioning is substantially sublinear in the number of transactions.

Scalability in Bitcoin

VISA handles on average around 2,000 transactions per second (tps), so call it a daily peak rate of 4,000 tps. It has a peak capacity of around 56,000 transactions per second [1] (https://usa.visa.com/dam/VCOM/download/corporate/media/visa-fact-sheet-Jun2015.pdf); however, it never actually uses more than about a third of this, even during peak shopping periods. [2]

PayPal, in contrast, handled around 10 million transactions per day for an average of 115 tps in late 2014. [3]

Let's take 4,000 tps as a starting goal. Obviously if we want Bitcoin to scale to all economic transactions worldwide, including cash, it'd be a lot higher than that, perhaps more in the region of a few hundred thousand tps. And the need to be able to withstand DoS attacks (which VISA does not have to deal with) implies we would want to scale far beyond the standard peak rates. Still, picking a target lets us do some basic calculations even if it's a little arbitrary.

Today the Bitcoin network is restricted to a sustained rate of 7 tps due to the Bitcoin protocol restricting block sizes to 1 MB.


Scalability, Part 1: Building on Top

Ethereum Scalability and Decentralization Updates

How do I compare the "scalability" capabilities of Ethereum and Bitcoin?

Let me try to explain.

Bitcoin Block Size Limit

The Bitcoin side is pretty simple to understand. The Bitcoin blockchain has a hardcoded block size limit of 1 MiB. With an average transaction size of around 600 B and a target block time of 10 minutes, you get

1024 * 1024 B / 600 B = 1747.6 transactions per block,

which translates down to

1747.6 / 600 s = 2.9127 transactions per second.

That works out to around 3 transactions per second in practice. If you reduce the average transaction size, it is possible in theory to reach higher rates, perhaps 7 transactions per second. That said, nothing scales in Bitcoin unless the network reaches consensus on increasing the block size or on some other scalability fix.
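
To make the dependence on transaction size concrete, here is a small Python sketch of the same arithmetic. The function name `max_tps` is made up for this illustration, and the 250 B average transaction size is a hypothetical value chosen only to show how a figure near 7 transactions per second could arise.

```python
def max_tps(block_size_bytes: int, avg_tx_bytes: float, block_time_s: float) -> float:
    """Upper bound on throughput: transactions per block divided by block time."""
    txs_per_block = block_size_bytes / avg_tx_bytes
    return txs_per_block / block_time_s


BLOCK_SIZE = 1024 * 1024   # 1 MiB block size limit, in bytes
BLOCK_TIME = 600           # 10-minute target block time, in seconds

tps_600 = max_tps(BLOCK_SIZE, 600, BLOCK_TIME)   # 600 B average, as in the text
tps_250 = max_tps(BLOCK_SIZE, 250, BLOCK_TIME)   # 250 B average (hypothetical)

print(f"600 B transactions: {tps_600:.2f} tps")
print(f"250 B transactions: {tps_250:.2f} tps")
```

With these inputs the two rates come out at roughly 2.9 and 7.0 transactions per second, so the often-quoted ceiling of 7 corresponds to much smaller transactions than the 600 B average assumed above.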

Ethereum Block Gas Limit

Ethereum introduces a different concept: there is no transaction or block size limit, but there is a gas limit. Gas is a unit that basically measures fee costs. Every transaction, every contract execution and every data storage operation on the blockchain costs gas. Every block has a block gas limit, by default 4,712,388 gas, which caps the total gas that can be spent in that block.

Let's assume an average transaction cost of 21,000 gas, which is what a default value transfer requires, and a target block time of 15 seconds. By default we then have the following:

4712388 / 21000 = 224.4 transactions per block

which translates down to

224.4 / 15 = 14.96 transactions per second.

So, at the current level of block gas limit and block time, there is a default possible throughput of 15 transactions per second. If you increase the required gas per transaction, it's probably a little bit lower.
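
The same estimate can be written as a short Python sketch that counts whole transactions per block. The helper name `whole_txs_per_block` and the 50,000 gas figure for a heavier transaction are assumptions made for illustration only.

```python
BLOCK_GAS_LIMIT = 4_712_388   # default block gas limit quoted above
TRANSFER_GAS = 21_000         # gas for a default value transfer
BLOCK_TIME_S = 15             # target block time in seconds


def whole_txs_per_block(gas_limit: int, gas_per_tx: int) -> int:
    """Number of complete transactions that fit under the block gas limit."""
    return gas_limit // gas_per_tx


transfers = whole_txs_per_block(BLOCK_GAS_LIMIT, TRANSFER_GAS)
transfer_tps = transfers / BLOCK_TIME_S

# Hypothetical heavier transaction, e.g. a contract call costing 50,000 gas.
heavier_tps = whole_txs_per_block(BLOCK_GAS_LIMIT, 50_000) / BLOCK_TIME_S

print(transfers, round(transfer_tps, 2), round(heavier_tps, 2))
```

Rounding down to whole transactions gives 224 transfers per block, and the heavier case shows how quickly throughput drops as the gas per transaction rises.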

But since you asked about scalability, the yellow paper specifies in equations 44-46 how the block gas limit scales:

Which basically means the block gas limit can increase by a factor of up to 1 + 1/1024 each block. With 15-second blocks there are 5760 blocks per day, so:

(1+1/1024)^5760 = 276.51227240329152144804

which is a scaling factor of about 276 per day. In theory, Ethereum scales indefinitely. In practice, the early Olympic testnet was able to stress the network to levels of around 25 transactions per second. And this is only the status quo. See also this post about transaction size.
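
The compounding can be checked with a few lines of Python. This sketch assumes every miner raises the limit by the maximum step in every block and that blocks arrive exactly every 15 seconds; the variable names are chosen for this illustration.

```python
import math

BLOCK_TIME_S = 15
MAX_STEP = 1 + 1 / 1024            # largest per-block growth factor

blocks_per_day = 86_400 // BLOCK_TIME_S
daily_factor = MAX_STEP ** blocks_per_day

# Smallest number of maximally raised blocks needed to double the limit.
blocks_to_double = math.ceil(math.log(2) / math.log(MAX_STEP))

print(blocks_per_day, round(daily_factor, 2), blocks_to_double)
```

Under those assumptions the limit could double in 711 blocks, which is about three hours at a 15-second block time.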

How does Ethereum deal with blockchain scalability?

Ethereum blocks are limited by the block gas limit (currently around 4.7 million gas). Each transaction specifies how much gas it's willing to spend. A block can only fit as much as the block gas limit, so if someone specifies a transaction of 4.7 million gas, a miner cannot fit any more transactions in that block.

So you can see some differences from Bitcoin. Another important one is the dynamic behavior: every time a block is mined, the miner of that block can nudge the block gas limit (BGL) higher or lower than the previous block gas limit by a factor of up to 1/1024. For example, if the current BGL is 1024, the miner of the next block can set the BGL to be as low as 1023, as high as 1025, or somewhere in between.
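
As a rough sketch of that rule in Python, following the description above literally: the function name `gas_limit_bounds` is hypothetical, and the exact bound enforced by a given client may differ at the edges (for example a strict rather than inclusive inequality, or a minimum gas limit).

```python
from typing import Tuple


def gas_limit_bounds(parent_gas_limit: int) -> Tuple[int, int]:
    """Lowest and highest gas limit the next block may declare.

    Assumption: the step is parent_gas_limit // 1024 and both ends are
    inclusive, as in the 1023..1025 example in the text.
    """
    step = parent_gas_limit // 1024
    return parent_gas_limit - step, parent_gas_limit + step


print(gas_limit_bounds(1024))
print(gas_limit_bounds(4_712_388))
```

For the default limit of 4,712,388 the step is 4,601 gas, so a single block can move the limit by less than a quarter of one simple transfer.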

Other scalability challenges:

The above is about on-chain scalability. A complementary approach to scalability is to do things off the blockchain while still being able to use the blockchain when necessary. Examples:

For more current discussions, see the live research and EIP channels. And keep an eye on the Ethereum Improvement Proposals.

EIP 103 (Serenity): Blockchain rent

Ethereum Announces “Unlimited” Scalability Roadmap

How many transactions per second are the devs planning for?

scalability_paper

Toward a 12-second Block Time

Number of transactions per second for payments

7 Transactions Per Second? Really?

Where can I find transactions per second statistics?

How are we comparing 'transactions per second'?

Ethereum difficulty adjustment algorithm

How does the Ethereum Homestead difficulty adjustment algorithm work?

Summary

If the timestamp difference (block_timestamp - parent_timestamp) is:

  • < 10 seconds, the difficulty is adjusted upwards by parent_diff // 2048 * 1
  • 10 to 19 seconds, the difficulty is left unchanged
  • >= 20 seconds, the difficulty is adjusted downwards in proportion to the timestamp difference, from parent_diff // 2048 * -1 to a max downward adjustment of parent_diff // 2048 * -99

This is consistent with the statement from ethdocs.org - Ethereum Homestead - The Homestead Release:

EIP-2/4 eliminates the excess incentive to set the timestamp difference to exactly 1 in order to create a block that has slightly higher difficulty and that will thus be guaranteed to beat out any possible forks. This guarantees to keep block time in the 10-20 range and according to simulations restores the target 15 second blocktime (instead of the current effective 17s).

And from Ethereum Network Status, the average block time currently is 13.86 seconds.


Details

The difficulty adjustment formula:

block_diff = parent_diff + parent_diff // 2048 * 
max(1 - (block_timestamp - parent_timestamp) // 10, -99) + 
int(2**((block.number // 100000) - 2))

where // is the integer division operator, e.g. 6 // 2 = 3, 7 // 2 = 3, 8 // 2 = 4.

can be broken down into the following parts:

Sub-formula B - The difficulty bomb part, which increases the difficulty exponentially every 100,000 blocks.

+ int(2**((block.number // 100000) - 2))

The difficulty bomb won't be discussed here as it is already covered in the following Q&As:

Sub-formula A - The difficulty adjustment part, which increases or decreases the block difficulty depending on the time between the current block timestamp and the parent block timestamp:

+ parent_diff // 2048 * max(1 - (block_timestamp - parent_timestamp) // 10, -99)

Sub-formula A1 - Let's separate out part of Sub-formula A

+ max(1 - (block_timestamp - parent_timestamp) // 10, -99)

and consider what the adjustment effect is due to the timestamp difference between the current block and the parent block:

When (block_timestamp - parent_timestamp) is

  • 0, 1, 2, ..., 8, 9 seconds
    • A1 evaluates to max(1 - 0, -99) = 1
    • A evaluates to +parent_diff // 2048 * 1
  • 10, 11, 12, ..., 18, 19 seconds
    • A1 evaluates to max(1 - 1, -99) = 0
    • A evaluates to +parent_diff // 2048 * 0
  • 20, 21, 22, ..., 28, 29 seconds
    • A1 evaluates to max(1 - 2, -99) = -1
    • A evaluates to +parent_diff // 2048 * -1
  • 30, 31, 32, ..., 38, 39 seconds
    • A1 evaluates to max(1 - 3, -99) = -2
    • A evaluates to +parent_diff // 2048 * -2
  • 1000, 1001, 1002, ..., 1008, 1009 seconds
    • A1 evaluates to max(1 - 100, -99) = -99
    • A evaluates to +parent_diff // 2048 * -99
  • > 1009 seconds
    • A1 evaluates to max(1 - {number greater than 100}, -99) = -99
    • A evaluates to +parent_diff // 2048 * -99

So, if the timestamp difference (block_timestamp - parent_timestamp) is:

  • < 10 seconds, the difficulty is adjusted upwards by parent_diff // 2048 * 1
  • 10 to 19 seconds, the difficulty is left unchanged
  • >= 20 seconds, the difficulty is adjusted downwards in proportion to the timestamp difference, from parent_diff // 2048 * -1 to a max downward adjustment of parent_diff // 2048 * -99
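
The breakdown above translates almost directly into Python, since the formula already uses Python's `//` operator. The function names below are chosen for this illustration. The sketch follows the formula as written and leaves out the minimum-difficulty floor that the client source code applies.

```python
def adjustment_factor(block_timestamp: int, parent_timestamp: int) -> int:
    """Sub-formula A1: +1 for fast blocks, 0 for 10-19 s, negative for slow ones."""
    return max(1 - (block_timestamp - parent_timestamp) // 10, -99)


def homestead_difficulty(parent_diff: int, parent_timestamp: int,
                         block_timestamp: int, block_number: int) -> int:
    """parent_diff plus sub-formula A plus sub-formula B (the difficulty bomb)."""
    part_a = parent_diff // 2048 * adjustment_factor(block_timestamp, parent_timestamp)
    # int() truncates fractional powers, e.g. 2**-1 == 0.5 -> 0, as in the formula.
    part_b = int(2 ** ((block_number // 100000) - 2))
    return parent_diff + part_a + part_b


for delta in (5, 15, 25, 35, 1005, 5000):
    print(delta, adjustment_factor(delta, 0))

print(homestead_difficulty(2_048_000, 0, 5, 150_000))
```

With a parent difficulty of 2,048,000 and a 5-second block, sub-formula A adds 1,000 and the bomb term is still zero at block 150,000, giving 2,049,000.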

The Source Code

From Go Ethereum - core/block_validator.go, lines 264-311:

func calcDifficultyHomestead(time, parentTime uint64, parentNumber, parentDiff *big.Int) *big.Int {
    // https://github.com/ethereum/EIPs/blob/master/EIPS/eip-2.mediawiki
    // algorithm:
    // diff = (parent_diff +
    //         (parent_diff / 2048 * max(1 - (block_timestamp - parent_timestamp) // 10, -99))
    //        ) + 2^(periodCount - 2)
    
    bigTime := new(big.Int).SetUint64(time)
    bigParentTime := new(big.Int).SetUint64(parentTime)
    
    // holds intermediate values to make the algo easier to read & audit
    x := new(big.Int)
    y := new(big.Int)
    
    // 1 - (block_timestamp -parent_timestamp) // 10
    x.Sub(bigTime, bigParentTime)
    x.Div(x, big10)
    x.Sub(common.Big1, x)
    
    // max(1 - (block_timestamp - parent_timestamp) // 10, -99)))
    if x.Cmp(bigMinus99) < 0 {
        x.Set(bigMinus99)
    }
    
    // (parent_diff + parent_diff // 2048 * max(1 - (block_timestamp - parent_timestamp) // 10, -99))
    y.Div(parentDiff, params.DifficultyBoundDivisor)
    x.Mul(y, x)
    x.Add(parentDiff, x)
    
    // minimum difficulty can ever be (before exponential factor)
    if x.Cmp(params.MinimumDifficulty) < 0 {
        x.Set(params.MinimumDifficulty)
    }
    
    // for the exponential factor
    periodCount := new(big.Int).Add(parentNumber, common.Big1)
    periodCount.Div(periodCount, ExpDiffPeriod)
    
    // the exponential factor, commonly referred to as "the bomb"
    // diff = diff + 2^(periodCount - 2)
    if periodCount.Cmp(common.Big1) > 0 {
        y.Sub(periodCount, common.Big2)
        y.Exp(common.Big2, y, nil)
        x.Add(x, y)
    }
    
    return x
}

Change difficulty adjustment to target mean block time including uncles

Specification

Currently, the formula to compute the difficulty of a block includes the following logic:

adj_factor = max(1 - ((timestamp - parent.timestamp) // 10), -99)
child_diff = int(max(parent.difficulty + (parent.difficulty // BLOCK_DIFF_FACTOR) * adj_factor, min(parent.difficulty, MIN_DIFF)))
...

If block.number >= METROPOLIS_FORK_BLKNUM, we change the first line to the following:

adj_factor = max(1 + len(parent.uncles) - ((timestamp - parent.timestamp) // 9), -99)

Specification (1b)

adj_factor = max((2 if len(parent.uncles) else 1) - ((timestamp - parent.timestamp) // 9), -99)

Rationale

This new formula ensures that the difficulty adjustment algorithm targets a constant average rate of blocks produced including uncles, and so ensures a highly predictable issuance rate that cannot be manipulated upward by manipulating the uncle rate. The formula can be fairly easily seen to be (to within a tolerance of ~3/4194304) mathematically equivalent to assuming that a block with k uncles is equivalent to a sequence of k+1 blocks that all appear with the exact same timestamp, and this is likely the simplest possible way to accomplish the desired effect.

Changing the denominator from 10 to 9 ensures that the block time remains roughly the same (in fact, it should decrease by ~3% given the current uncle rate of 7%).

(1b) accomplishes almost the same effect but has the benefit that it depends only on the block header (as you can check the uncle hash against the blank hash) and not the entire block.
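
To compare the three variants side by side, here is a small Python sketch of just the adjustment factor. The function names are made up for this example, and the uncle count is passed in as a plain integer instead of being read from a block object.

```python
def adj_factor_current(dt: int) -> int:
    """Current rule: ignores uncles, 10-second buckets."""
    return max(1 - dt // 10, -99)


def adj_factor_proposed(dt: int, uncle_count: int) -> int:
    """Proposed rule: adds the number of parent uncles, 9-second buckets."""
    return max(1 + uncle_count - dt // 9, -99)


def adj_factor_1b(dt: int, uncle_count: int) -> int:
    """Variant (1b): only asks whether the parent has any uncles at all."""
    return max((2 if uncle_count else 1) - dt // 9, -99)


# dt is timestamp - parent.timestamp in seconds.
for dt, uncles in [(9, 0), (9, 1), (9, 2), (18, 1)]:
    print(dt, uncles,
          adj_factor_current(dt),
          adj_factor_proposed(dt, uncles),
          adj_factor_1b(dt, uncles))
```

The two new variants agree whenever the parent has zero or one uncle and differ only when it has two, which is why (1b) is described as accomplishing almost the same effect.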


Is it possible to change the block target time?

What was the first block mined with Homestead?

How is the Mining Difficulty calculated on Ethereum?

How do I decrease the difficulty on a private testnet?

How to make Ethereum mining difficulty static for a private chain?

eip-2.mediawiki

Genesis block Explanation

mixhash A 256-bit hash which proves, combined with the nonce, that a sufficient amount of computation has been carried out on this block: the Proof-of-Work (PoW). The combination of nonce and mixhash must satisfy a mathematical condition described in the Yellowpaper, 4.3.4. Block Header Validity, (44). It allows verifying that the Block has really been cryptographically mined and thus, from this aspect, is valid.

nonce A 64-bit hash which proves, combined with the mixhash, that a sufficient amount of computation has been carried out on this block: the Proof-of-Work (PoW). The combination of nonce and mixhash must satisfy a mathematical condition described in the Yellowpaper, 4.3.4. Block Header Validity, (44), and allows verifying that the Block has really been cryptographically mined and thus, from this aspect, is valid. The nonce is the cryptographically secure mining proof-of-work that proves beyond reasonable doubt that a particular amount of computation has been expended in the determination of this token value. (Yellowpaper, 11.5. Mining Proof-of-Work).

difficulty A scalar value corresponding to the difficulty level applied during the nonce discovering of this block. It defines the mining Target, which can be calculated from the previous block’s difficulty level and the timestamp. The higher the difficulty, the statistically more calculations a Miner must perform to discover a valid block. This value is used to control the Block generation time of a Blockchain, keeping the Block generation frequency within a target range. On the test network, we keep this value low to avoid waiting during tests, since the discovery of a valid Block is required to execute a transaction on the Blockchain.

alloc Allows defining a list of pre-filled wallets. That’s an Ethereum specific functionality to handle the “Ether pre-sale” period. Since we can mine local Ether quickly, we don’t use this option.

coinbase The 160-bit address to which all rewards (in Ether) collected from the successful mining of this block have been transferred. They are a sum of the mining reward itself and the Contract transaction execution refunds. Often named “beneficiary” in the specifications, sometimes “etherbase” in the online documentation. This can be anything in the Genesis Block since the value is set by the setting of the Miner when a new Block is created.

timestamp A scalar value equal to the reasonable output of Unix time() function at this block inception. This mechanism enforces a homeostasis in terms of the time between blocks. A smaller period between the last two blocks results in an increase in the difficulty level and thus additional computation required to find the next valid block. If the period is too large, the difficulty, and expected time to the next block, is reduced. The timestamp also allows verifying the order of block within the chain (Yellowpaper, 4.3.4. (43)).

parentHash The Keccak 256-bit hash of the entire parent block header (including its nonce and mixhash). Pointer to the parent block, thus effectively building the chain of blocks. In the case of the Genesis block, and only in this case, it’s 0.

extraData An optional free, but max. 32-byte long space to conserve smart things for ethernity. :)

gasLimit A scalar value equal to the current chain-wide limit of Gas expenditure per block. High in our case to avoid being limited by this threshold during tests. Note: this does not indicate that we should not pay attention to the Gas consumption of our Contracts.
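
As an illustration of the field widths listed above, the following Python sketch builds a genesis description and checks the sizes of its hex-encoded fields. Every value here is a placeholder invented for the example, and `check_genesis` is a hypothetical helper, not part of any client. Only the widths (64-bit nonce, 256-bit mixhash and parentHash, 160-bit coinbase, at most 32 bytes of extraData) come from the descriptions above.

```python
import json
from typing import Dict, List

# Expected width in bits of each fixed-size hex field, per the descriptions above.
FIELD_BITS = {"nonce": 64, "mixhash": 256, "coinbase": 160, "parentHash": 256}


def hex_bits(value: str) -> int:
    """Width in bits of a '0x'-prefixed hex string (4 bits per hex digit)."""
    return (len(value) - 2) * 4


def check_genesis(genesis: Dict[str, object]) -> List[str]:
    """Return a list of human-readable problems; empty if the widths look right."""
    problems = []
    for field, bits in FIELD_BITS.items():
        if hex_bits(str(genesis[field])) != bits:
            problems.append(f"{field} should be {bits} bits")
    if hex_bits(str(genesis["extraData"])) > 32 * 8:
        problems.append("extraData should be at most 32 bytes")
    return problems


# Placeholder values for a private test network (assumptions, not real data).
genesis = {
    "nonce": "0x0000000000000042",
    "mixhash": "0x" + "00" * 32,
    "difficulty": "0x400",        # kept low so test blocks are found quickly
    "alloc": {},                  # no pre-filled wallets
    "coinbase": "0x" + "00" * 20,
    "timestamp": "0x00",
    "parentHash": "0x" + "00" * 32,
    "extraData": "0x",
    "gasLimit": "0xffffffff",     # high so tests are not limited by it
}

print(json.dumps(genesis, indent=2))
print(check_genesis(genesis))
```

A check like this catches the most common copy-and-paste mistake in hand-written genesis files, a hash field with the wrong number of hex digits.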

difficulty: QUANTITY - integer of the difficulty for this block.
totalDifficulty: QUANTITY - integer of the total difficulty of the chain until this block.


Blocktime - Investigating Ethereum Blocktime with R

What is the measured distribution of block times since Homestead?

Pending Transactions

Is it normal pending transaction are removed after restart of geth?

How to make miner to mine only when there are Pending Transactions?
