Distributed systems theory for the distributed systems engineer

Gwen Shapira, who at the time was an engineer at Cloudera and now is spreading the Kafka gospel, asked a question on Twitter that got me thinking.

I need to improve my proficiency in distributed systems theory. Where do I start? Any recommended books?
— Gwen (Chen) Shapira (@gwenshap) August 7, 2014

My response of old might have been “well, here’s the FLP paper, and here’s the Paxos paper, and here’s the Byzantine generals paper…”, and I’d have prescribed a laundry list of primary source material which would have taken at least six months to get through if you rushed. But I’ve come to think that recommending a ton of theoretical papers is often precisely the wrong way to go about learning distributed systems theory (unless you are in a PhD program). Papers are usually deep, usually complex, and require both serious study and, usually, significant experience to glean their important contributions and to place them in context. What good is requiring that level of expertise of engineers?

And yet, unfortunately, there’s a paucity of good ‘bridge’ material that summarises, distills and contextualises the important results and ideas in distributed systems theory; particularly material that does so without condescending. Considering that gap led me to another interesting question:

What distributed systems theory should a distributed systems engineer know?

A little theory is, in this case, not such a dangerous thing. So I tried to come up with a list of what I consider the basic concepts that are applicable to my every-day job as a distributed systems engineer. Let me know what you think I missed!

First steps

These four readings do a pretty good job of explaining what about building distributed systems is challenging. Collectively they outline a set of abstract but technical difficulties that the distributed systems engineer has to overcome, and set the stage for the more detailed investigation in later sections.

  • Distributed Systems for Fun and Profit is a short book which tries to cover some of the basic issues in distributed systems including the role of time and different strategies for replication.

  • Notes on distributed systems for young bloods - not theory, but a good practical counterbalance to keep the rest of your reading grounded.

  • A Note on Distributed Systems - a classic paper on why you can’t just pretend all remote interactions are like local objects.

  • The fallacies of distributed computing - 8 fallacies of distributed computing that set the stage for the kinds of things system designers forget.

You should know about safety and liveness properties:

  • Safety properties say that nothing bad will ever happen. For example, the property of never returning an inconsistent value is a safety property, as is never electing two leaders at the same time.

  • Liveness properties say that something good will eventually happen. For example, saying that a system will eventually return a result to every API call is a liveness property, as is guaranteeing that a write to disk always eventually completes.
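The difference between the two kinds of property can be made concrete with a small sketch. It assumes a hypothetical representation in which an execution is a list of steps, and each step is the set of nodes that currently believe they are leader; the function names are illustrative only.

```python
from typing import List, Set

# Hypothetical trace format: one entry per step, holding the set of
# nodes that believe they are the leader at that step.
Trace = List[Set[str]]


def at_most_one_leader(trace: Trace) -> bool:
    """Safety: true only if no step ever has two leaders at once."""
    return all(len(leaders) <= 1 for leaders in trace)


def leader_elected_so_far(trace: Trace) -> bool:
    """Liveness, observed on a finite prefix: has a leader appeared yet?"""
    return any(len(leaders) == 1 for leaders in trace)


good = [set(), {"a"}, {"a"}]
split_brain = [set(), {"a"}, {"a", "b"}]
stalled = [set(), set(), set()]

print(at_most_one_leader(good), leader_elected_so_far(good))
print(at_most_one_leader(split_brain))
print(at_most_one_leader(stalled), leader_elected_so_far(stalled))
```

Note the asymmetry: `split_brain` proves the safety property is broken, because a finite prefix is enough to witness a bad thing. `stalled` does not prove the liveness property is broken, since a leader could still be elected at some later step.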

Failure and Time

Many difficulties that the distributed systems engineer faces can be blamed on two underlying causes:

  1. Processes may fail

  2. There is no good way to tell that they have done so

There is a very deep relationship between what, if anything, processes share about their knowledge of time, what failure scenarios are possible to detect, and what algorithms and primitives may be correctly implemented. Most of the time, we assume that two different nodes have absolutely no shared knowledge of what time it is, or how quickly time passes.
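A minimal sketch of why failure is hard to detect, assuming a heartbeat-and-timeout scheme (a common approach, not one the text prescribes). The class and node names are hypothetical, and timestamps are passed in explicitly so that they stand for the observer's local clock only.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TimeoutDetector:
    """Suspects a node if no heartbeat arrived within `timeout` seconds."""
    timeout: float
    last_seen: Dict[str, float] = field(default_factory=dict)

    def heartbeat(self, node: str, now: float) -> None:
        # Record the observer-local time at which a heartbeat arrived.
        self.last_seen[node] = now

    def suspects(self, node: str, now: float) -> bool:
        seen = self.last_seen.get(node)
        return seen is None or now - seen > self.timeout


detector = TimeoutDetector(timeout=3.0)
detector.heartbeat("n1", now=0.0)
detector.heartbeat("n2", now=0.0)

# Suppose n1 has crashed, while n2 is alive but its heartbeat is delayed.
crashed_suspected = detector.suspects("n1", now=5.0)
slow_suspected = detector.suspects("n2", now=5.0)

# The delayed heartbeat from n2 finally arrives.
detector.heartbeat("n2", now=6.0)
slow_suspected_later = detector.suspects("n2", now=6.5)

print(crashed_suspected, slow_suspected, slow_suspected_later)
```

At time 5.0 the observer cannot distinguish the crashed node from the slow one: both are suspected. Lengthening the timeout reduces false suspicions but delays detection of real crashes.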

You should know:

The basic tension of fault tolerance

A system that tolerates some faults without degrading must be able to act as though those faults had not occurred. This usually means that parts of the system must do work redundantly, but doing more work than is absolutely necessary typically carries a cost both in performance and resource consumption. This is the basic tension of adding fault tolerance to a system.

You should know:

  • The quorum technique for ensuring single-copy serialisability. See Skeen’s original paper, but perhaps better is Wikipedia’s entry.
    (Translator’s note: one member of the quorum is the primary and the rest are secondaries. The order in which the primary receives write requests is authoritative, which is what makes the history serial, and the primary unilaterally requires the secondaries to accept its own sequence of writes. The secondaries do not resist or object: they are assumed to be non-malicious, to follow the global rules, and to be non-Byzantine.)

  • About 2-phase-commit, 3-phase-commit and Paxos, and why they have different fault-tolerance properties.

  • How eventual consistency, and other techniques, seek to avoid this tension at the cost of weaker guarantees about system behaviour. The Dynamo paper is a great place to start, but also Pat Helland’s classic Life Beyond Transactions is a must-read.
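The core idea behind quorums can be illustrated with a small sketch. It assumes the common formulation with `n` replicas, a read quorum of size `r` and a write quorum of size `w` (these names are ours, not the source's), and checks by brute force that every read quorum shares a replica with every write quorum exactly when `r + w > n`. It shows only the intersection property, not a full replication protocol.

```python
from itertools import combinations


def quorums_intersect(n: int, r: int, w: int) -> bool:
    """Brute force: does every size-r read quorum overlap every size-w write quorum?"""
    replicas = range(n)
    return all(
        set(read_q) & set(write_q)
        for read_q in combinations(replicas, r)
        for write_q in combinations(replicas, w)
    )


# Compare the brute-force answer with the rule r + w > n for small clusters.
rule_matches = all(
    quorums_intersect(n, r, w) == (r + w > n)
    for n in range(1, 6)
    for r in range(1, n + 1)
    for w in range(1, n + 1)
)

print(quorums_intersect(5, 3, 3))  # majority reads and majority writes
print(quorums_intersect(5, 2, 3))  # a read can miss the latest write
print(rule_matches)
```

The overlap guarantees that any read contacts at least one replica holding the most recent write. The price is the one described above: each operation must wait for several replicas rather than one.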

Basic primitives

There are few agreed-upon basic building blocks in distributed systems, but more are beginning to emerge. You should know what the following problems are, and where to find a solution for them:

Fundamental Results

Some facts just need to be internalised. There are more than these, naturally, but here’s a flavour:

  • You can’t implement consistent storage and respond to all requests if you might drop messages between processes. This is the CAP theorem.

  • Consensus is impossible to implement in such a way that it both a) is always correct and b) always terminates if even one machine might fail in an asynchronous system with crash-stop failures (the FLP result). The first slides - before the proof gets going - of my Papers We Love SF talk do a reasonable job of explaining the result, I hope. Suggestion: there’s no real need to understand the proof.
    (Translator’s note: in an asynchronous system, assume that a node that crashes stays stopped rather than recovering. Requirement 1 is that the result is always correct; requirement 2 is that every write request returns a result within finite time. The two cannot both be guaranteed: this is the FLP result.)

  • Consensus is impossible to solve in fewer than 2 rounds of messages in general.

  • Atomic broadcast is exactly as hard as consensus - in a precise sense, if you solve atomic broadcast, you solve consensus, and vice versa. Chandra and Toueg prove this, but you just need to know that it’s true.
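One direction of the equivalence between atomic broadcast and consensus is easy to sketch: given atomic broadcast, every process broadcasts its proposal and decides on the first message delivered. The code below assumes a toy, failure-free stand-in for atomic broadcast (a single in-process log, named `SequencerBroadcast` here for illustration); it shows the shape of the reduction, not a fault-tolerant implementation.

```python
from typing import Dict, List


class SequencerBroadcast:
    """Toy atomic broadcast: one shared log fixes a single total order.

    Assumption for illustration only: no failures, no network.
    """

    def __init__(self) -> None:
        self.log: List[str] = []

    def broadcast(self, message: str) -> None:
        self.log.append(message)

    def delivered(self) -> List[str]:
        # Every process sees the same messages in the same order.
        return list(self.log)


def decide(abcast: SequencerBroadcast) -> str:
    """Consensus rule: decide on the first message in the delivery order."""
    return abcast.delivered()[0]


abcast = SequencerBroadcast()
proposals: Dict[str, str] = {"p1": "blue", "p2": "green", "p3": "red"}
for value in proposals.values():
    abcast.broadcast(value)

decisions = {process: decide(abcast) for process in proposals}
print(decisions)
```

Because all processes deliver the same sequence, they all pick the same first element (agreement), and that element was proposed by some process (validity). The reverse direction, building atomic broadcast from repeated consensus, is where the hard work in the proof lies.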

Real systems

The most important exercise to repeat is to read descriptions of new, real systems, and to critique their design decisions. Do this over and over again. Some suggestions:

Google:

Not Google:

Postscript

If you tame all the concepts and techniques on this list, I’d like to talk to you about engineering positions working with the menagerie of distributed systems we curate at Cloudera.
