Cloud Computing Week 5
==========================
I have finally finished the course, so here is a summary based on the final test.
1. Power usage effectiveness (PUE) is a measure of how efficiently a computer data center uses energy; specifically, how much energy is used by the computing equipment (in contrast to cooling and other overhead).
PUE = Total Facility Energy / IT Equipment Energy
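
As a quick illustration of the ratio above, here is a minimal sketch. The function name and the energy figures are made up for the example; they are not measurements from any real facility.

```python
def pue(total_facility_energy_kwh: float, it_equipment_energy_kwh: float) -> float:
    """Power usage effectiveness: total facility energy divided by IT equipment energy."""
    if it_equipment_energy_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_energy_kwh / it_equipment_energy_kwh


# Hypothetical data center: 1500 kWh drawn in total, 1000 kWh of it by IT equipment.
ratio = pue(1500.0, 1000.0)

# Share of the total energy that goes to cooling and other overhead.
overhead_fraction = 1 - 1 / ratio

print(ratio)  # 1.5
```

A value closer to 1.0 means less of the facility's energy is spent on overhead.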
2. One problem with the Hadoop system is that by dividing the tasks across many nodes, it is possible for a few slow nodes to rate-limit the rest of the program. (A single slow node makes the whole job slow. How is this solved? By introducing speculative execution.)
Tasks may be slow for various reasons, including hardware degradation or software misconfiguration, but the causes may be hard to detect since the tasks still complete successfully, albeit after a longer time than expected.
Hadoop doesn’t try to diagnose and fix slow-running tasks; instead, it tries to detect when a task is running slower than expected and launches another, equivalent, task as a backup. This is termed speculative execution of tasks.
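
The detection step can be sketched roughly as follows. This is not Hadoop's scheduler code: the progress-score representation, the function name, and the `lag` threshold of 0.2 are all assumptions chosen for illustration.

```python
from statistics import mean
from typing import Dict, List


def pick_speculative_tasks(progress: Dict[str, float], lag: float = 0.2) -> List[str]:
    """Return IDs of unfinished tasks whose progress trails the average by more than `lag`.

    `progress` maps a task ID to a progress score in [0, 1]. Finished tasks
    (score 1.0) count toward the average but are never picked for a backup.
    """
    if not progress:
        return []
    average = mean(progress.values())
    return sorted(
        task_id
        for task_id, score in progress.items()
        if score < 1.0 and score < average - lag
    )


# Hypothetical job: three tasks are done or nearly done, one is far behind.
progress = {"map-0": 0.9, "map-1": 0.95, "map-2": 0.2, "map-3": 1.0}
stragglers = pick_speculative_tasks(progress)  # only "map-2" is flagged
```

A scheduler would then launch a backup copy of each flagged task on another node and keep whichever copy finishes first.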
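3. Hadoop YARN is a scheduler for distributed systems. It treats each server as a collection of containers, where a container can be thought of as a bundle of some CPU and memory. YARN consists of the following three parts:
- Global Resource Manager (RM): contains a capacity scheduler; its main job is to schedule containers for tasks.
- (Per-server) Node Manager (NM): performs server-level functions, such as reporting back to the RM that a container has finished its task.
- (Per-application) Application Master (AM): has two functions: 1) negotiating containers with the RM and the NMs; 2) detecting task failures.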
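
To make the division of labor concrete, here is a toy model of the three roles. It is only a sketch under simplifying assumptions: the class and method names are invented for this note and are not YARN's API, a container is reduced to a counter, and the Node Manager's report to the RM is collapsed into a direct method call.

```python
class NodeManager:
    """Per-server daemon: tracks how many containers on this server are free."""

    def __init__(self, name, total_containers):
        self.name = name
        self.free = total_containers


class ResourceManager:
    """Global scheduler: hands out free containers across all servers."""

    def __init__(self, node_managers):
        self.nodes = {nm.name: nm for nm in node_managers}

    def allocate(self, wanted):
        """Grant up to `wanted` containers; return the server name for each grant."""
        granted = []
        for nm in self.nodes.values():
            while nm.free > 0 and len(granted) < wanted:
                nm.free -= 1
                granted.append(nm.name)
        return granted

    def container_finished(self, node_name):
        """Stands in for the NM reporting that a container completed its task."""
        self.nodes[node_name].free += 1


class ApplicationMaster:
    """Per-application: negotiates containers and reacts to task failures."""

    def __init__(self, rm, tasks):
        self.rm = rm
        self.pending = list(tasks)
        self.running = {}  # task -> server name

    def negotiate(self):
        """Ask the RM for one container per pending task and start what was granted."""
        for node_name in self.rm.allocate(len(self.pending)):
            self.running[self.pending.pop(0)] = node_name

    def task_done(self, task, failed=False):
        """Release the task's container; queue the task again if it failed."""
        self.rm.container_finished(self.running.pop(task))
        if failed:
            self.pending.append(task)


# Hypothetical cluster: server s1 has 2 containers, server s2 has 1.
rm = ResourceManager([NodeManager("s1", 2), NodeManager("s2", 1)])
am = ApplicationMaster(rm, ["t1", "t2", "t3", "t4"])
am.negotiate()                   # three containers granted; t4 has to wait
am.task_done("t1", failed=True)  # t1's container is freed and t1 is queued for a retry
am.negotiate()                   # t4 takes the freed container on s1; t1 still waits
```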
4. For distributed systems, there are two key concepts:
- Safety: Something bad will never happen.
- Liveness: Something good will eventually happen.
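5. The gossip-style failure detection protocol has an update rule. Suppose the current local time at node p is 123 and p holds the entry (q, 34, 101), where the fields are (address, heartbeat counter, local time). If a new entry (q, 35, 110) arrives, q's heartbeat counter in it is larger than the stored one, so the entry is updated to (q, 35, 123). Note that the local time is set to p's own current time, not the time carried in the incoming entry.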
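
The update rule can be sketched in a few lines. The function name and the dictionary layout are assumptions made for this example, not part of any particular implementation.

```python
def merge_entry(table, address, heartbeat, now):
    """Apply one gossiped entry to the local membership table.

    `table` maps address -> (heartbeat counter, local time of last update).
    The sender's timestamp is deliberately ignored: a fresher heartbeat is
    stamped with the receiver's own clock (`now`).
    Returns True if the table changed.
    """
    known = table.get(address)
    if known is None or heartbeat > known[0]:
        table[address] = (heartbeat, now)
        return True
    return False


# Node p at local time 123, holding the entry from the example above.
table_p = {"q": (34, 101)}
updated = merge_entry(table_p, "q", 35, now=123)  # newer heartbeat: entry becomes (35, 123)
ignored = merge_entry(table_p, "q", 33, now=130)  # stale heartbeat: entry is left alone
```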
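6. CAP: Consistency, Availability, and Partition Tolerance.
- Consistency: consistency refers to the atomicity of data. In classic databases this atomicity is guaranteed by transactions: when a transaction completes, whether it commits or rolls back, the data is left in a consistent state. In a distributed environment, consistency refers to whether the data on multiple nodes agree.
- Availability: availability means the service stays in a usable state: when a user sends a request, the service returns a result within a bounded time.
- Partition Tolerance: partition here means a network partition. One way to understand it: critical data and services are generally located in different IDCs (Internet data centers), so the system must keep operating even when the network between them is split.

The CAP theorem states that a distributed system cannot satisfy consistency, availability, and partition tolerance all at once; at most two of the three can be satisfied at the same time.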
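7. BASE comes from e-commerce practice on the Internet and evolved gradually out of the CAP theorem. Its core idea is that even when strong consistency cannot be achieved, an application can use an approach suited to its own characteristics to reach eventual consistency. BASE is an abbreviation of Basically Available, Soft state, Eventually consistent, and it extends the C and A in CAP. The meaning of BASE:
- Basically Available: the system is basically available.
- Soft state: soft state (flexible transactions), meaning the state may be out of sync for a period of time.
- Eventual consistency: the data becomes consistent eventually.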
Reference
http://www.cnblogs.com/hustcat/archive/2010/09/07/1820970.html