Storage System

Storage hierarchy

Cache, memory -> hard disks, SSD, tape, optical disk
(ordered by read/write speed and cost per byte)

Access time

Time taken before the drive is ready to transfer data
(the time a physical device such as a hard disk or memory needs to locate the target position before the data transfer can begin)
Typically:
Memory: nanoseconds
SSD: microseconds
HDD: milliseconds

Access times.png

Storage device information

  • Characteristics of a storage device:

    • Capacity (bytes)
    • Cost (price per byte of storage)
    • Bandwidth (number of bytes that can be transferred per second; read bandwidth is generally not equal to write bandwidth)
    • Latency (waiting time for response/delivery of data)
  • Basic functions/operations: CRUD (create, read, update, delete)

  • Time to complete an operation depends on both bandwidth and latency
    CompletionTime = Latency + Size / Bandwidth
    Influencing factors:
    technology (HDD or SSD); operation type (read or write); number of operations in the workload; access pattern (sequential or random)
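
The completion-time formula above can be sketched in a few lines of Python. The device figures (10 ms latency, 100 MB/s bandwidth) are hypothetical values chosen only to show when latency dominates and when bandwidth dominates; they are not measurements of any real drive.

```python
def completion_time_ms(size_bytes: float, latency_ms: float,
                       bandwidth_bytes_per_s: float) -> float:
    """CompletionTime = Latency + Size / Bandwidth, in milliseconds."""
    transfer_ms = size_bytes / bandwidth_bytes_per_s * 1000.0
    return latency_ms + transfer_ms


# Hypothetical device: 10 ms latency, 100 MB/s bandwidth (1 MB = 10**6 bytes here).
small = completion_time_ms(4_000, 10.0, 100e6)         # tiny request: latency dominates
large = completion_time_ms(100_000_000, 10.0, 100e6)   # big request: bandwidth dominates
print(f"4 KB request:   {small:.2f} ms")
print(f"100 MB request: {large:.2f} ms")
```

For the small request almost all of the time is latency; for the large one almost all of it is transfer time.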

  • Access pattern:

    1. Sequential: data to be accessed are located next to each other or sequentially on the device
    2. Random: data located randomly on the storage device

Hard Disk Drive

HDD structure.png
  • One or more spinning magnetic platters
    • Typically two surfaces per platter
  • The disk arm positions the head over the radial position (track) where the data are stored
    • It swings across tracks (but does not extend or shrink)
  • Data is read/written by a read/write head as platter spins

Hard disk head movement while copying files between two folders: https://www.youtube.com/watch?v=BlB49F6ExkQ

  • Physical characteristics:
    form factor: 2.5" in laptops, 3.5" common in desktops
    rotational speed: 4,800 / 5,400 / 7,200 / 10,000 RPM (rotations per minute)
    number of platters: 5 to 7
    capacity: up to 10 TB (Western Digital, at the time of writing)

  • Disk organization: platter -> tracks -> sectors
    Each platter consists of a number of tracks;
    Each track is divided into N fixed size sectors (sector size: 4KB)

CHS (cylinder-head-sector)

An early way to address a sector (Logical Block Addressing, LBA, is more common now)


CHS structure.png
Example:
# cylinders: 256
# heads: 16 (i.e., 8 platters, 2 heads per platter)
# sectors per track: 64
sector size: 4 KB

Capacity of the drive:
2^8 * 2^4 * 2^6 * (2^2 * 2^10) = 2^30 bytes = 1 GB
In general: capacity = C * H * S * sector size
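
As a quick check of the arithmetic, the capacity formula can be evaluated directly. This is a minimal sketch; the function name `chs_capacity_bytes` is made up for illustration.

```python
def chs_capacity_bytes(cylinders: int, heads: int,
                       sectors_per_track: int, sector_size_bytes: int) -> int:
    """capacity = C * H * S * sector size"""
    return cylinders * heads * sectors_per_track * sector_size_bytes


# Geometry from the example: 256 cylinders, 16 heads, 64 sectors/track, 4 KB sectors.
capacity = chs_capacity_bytes(256, 16, 64, 4 * 1024)
print(capacity)              # total bytes
print(capacity == 2 ** 30)   # True: exactly 1 GB (binary)
```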

The CHS address tells the drive where the data are; the head must first be positioned there, and only then can the data be transferred. The total access time is therefore:

T = Tseek + Trotation + Ttransfer
Tseek: time to get the disk head onto the right track
Trotation: time to wait for the right sector to rotate under the head
Ttransfer: time to actually transfer the data

  1. Rotational latency: waiting for the right sector to rotate under the head
    On average: about 1/2 of the time of a full rotation


    rotation.png
Example:
Assume 10,000 RPM (rotations per minute)
60,000 ms / 10,000 rotations = 6 ms per rotation, so the average rotational latency is about 3 ms
  2. Seek time (across multiple tracks): waiting for the head to reach the right track
    On average, seek time is about 1/3 of the maximum seek time


    seek the track.png

  3. Transfer time (determined by the transfer bandwidth)

Assume 512 KB of data are transferred at a bandwidth of 128 MB/sec
Transfer time: 512 KB / 128 MB/sec * 1000 ms ≈ 4 ms
  4. Actual bandwidth
    Actual bandwidth = amount of data / actual time
    Actual time = Tseek + Trotation + Ttransfer
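
The three components and the resulting actual bandwidth can be combined in a small sketch. The 10,000 RPM, 512 KB and 128 MB/sec figures come from the examples in this section; the 7 ms seek time is an assumption added only so that the total can be computed, and 1 MB is taken as 1024 KB.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: half of one full rotation."""
    return 60_000.0 / rpm / 2.0


def transfer_time_ms(size_mb: float, bandwidth_mb_per_s: float) -> float:
    """Time to move size_mb at the given transfer bandwidth."""
    return size_mb / bandwidth_mb_per_s * 1000.0


def access_time_ms(seek_ms: float, rpm: float,
                   size_mb: float, bandwidth_mb_per_s: float) -> float:
    """T = Tseek + Trotation + Ttransfer"""
    return (seek_ms
            + avg_rotational_latency_ms(rpm)
            + transfer_time_ms(size_mb, bandwidth_mb_per_s))


SEEK_MS = 7.0          # assumed average seek time (not given in this example)
SIZE_MB = 512 / 1024   # 512 KB expressed in MB

total_ms = access_time_ms(SEEK_MS, 10_000, SIZE_MB, 128)
actual_bw = SIZE_MB / (total_ms / 1000.0)   # MB per second
print(f"rotation: {avg_rotational_latency_ms(10_000):.1f} ms")
print(f"transfer: {transfer_time_ms(SIZE_MB, 128):.2f} ms")
print(f"total:    {total_ms:.2f} ms, actual bandwidth: {actual_bw:.1f} MB/s")
```

Even with a 128 MB/sec transfer bandwidth, the actual bandwidth for a single 512 KB request is far lower, because seek and rotation take more time than the transfer itself.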

Sector vs. Block

  • Block is the smallest unit of the file system
  • Sector is the smallest unit of the hard disk
  • Block has 1 or more sectors

Sequential vs. Random

Sequential operation:

  • May assume all sectors involved are on the same track
    -- Need to seek to the right track and rotate to the first sector once
    -- But no rotation/seeking needed afterward

Random operation: may assume every sector is on a different track, so each access pays its own seek and rotational delay

Example: 7 ms average seek, 10,000 RPM, 50 MB/sec transfer rate, 4 KB per block
Sequential access of 10 MB:
– Completion time = 7 ms + (60 * 1000 / 10000 / 2) ms + (10 / 50 * 1000) ms = 7 + 3 + 200 = 210 ms
– Actual bandwidth = 10 MB / 210 ms = 47.62 MB/s

Random access of 10 MB:
– Number of blocks: 10 * 1000 / 4 = 2500 (assume 1 block = 1 sector)
– Completion time = 2500 * (7 + 3 + 4/50) ms = 25,200 ms = 25.2 s
– Actual bandwidth = 10 MB / 25.2 s = 0.397 MB/s
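
The sequential and random cases can be reproduced with a short sketch. It follows the same simplifying assumptions as the example (one seek and one rotational delay for the sequential case, one of each per block for the random case, 1 MB = 1000 KB); the function names are illustrative only.

```python
def rot_ms(rpm: float) -> float:
    """Average rotational latency in ms (half a rotation)."""
    return 60_000.0 / rpm / 2.0


def sequential_ms(size_mb: float, seek_ms: float, rpm: float, rate_mb_s: float) -> float:
    """One seek + one rotational delay, then a continuous transfer."""
    return seek_ms + rot_ms(rpm) + size_mb / rate_mb_s * 1000.0


def random_ms(size_mb: float, block_kb: float, seek_ms: float,
              rpm: float, rate_mb_s: float) -> float:
    """Every block pays its own seek, rotational delay, and transfer."""
    blocks = size_mb * 1000.0 / block_kb
    per_block = seek_ms + rot_ms(rpm) + (block_kb / 1000.0) / rate_mb_s * 1000.0
    return blocks * per_block


seq = sequential_ms(10, 7, 10_000, 50)
rnd = random_ms(10, 4, 7, 10_000, 50)
print(f"sequential: {seq:.0f} ms, {10 / (seq / 1000):.2f} MB/s")
print(f"random:     {rnd / 1000:.1f} s, {10 / (rnd / 1000):.3f} MB/s")
```

Under these assumptions the random pattern is roughly 120 times slower than the sequential one for the same 10 MB.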

Solid State Drive

SSD.png
  • All electronic, made from flash memory
  • Limited lifetime: cells can only be written a limited number of times
  • Significantly better latency: no seek or rotational delay
  • Much better performance on random access (however, writes have much higher latency than reads)
Speed comparison between read and write.png

Structure of an SSD

  • An SSD contains a number of flash memory chips
    chip -> dies -> planes -> blocks -> pages (rows) -> cells
    • Typically, a chip may have 1, 2, or 4 dies
    • A die may have 1 or 2 planes
    • A plane has a number of blocks
    • A block has a number of pages
    • A page has a number of cells
Die Layout.png
  • Page is the smallest unit of data transfer between SSD and main memory

How data is stored in SSD

  • Cells are made of floating-gate transistors: by applying a high positive or negative voltage to the control gate, electrons can be attracted to or repelled from the floating gate
    • State = 1, if no electrons in the floating gate
    • State = 0, if there are electrons (negative charges)
      – Electrons stuck there even when power is off
      – So state is retained
  • Data in an SSD are stored as bit patterns ('101010...'), where each bit is the charge state of a cell's floating gate
floating-gate transistor.png

Read Operations

  • Electrons on the floating gate affect the threshold voltage for the floating gate transistor to conduct
  • Higher voltage needed when gate has electrons


    Read operation.png
Steps:
    1. Apply Vint (intermediate voltage)
    2. If current is detected, the gate has no electrons => bit = 1
    3. If there is no current, the gate must have electrons => bit = 0
  • Page is the smallest unit that can be read (further details are omitted here)
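
The read steps can be modelled as a simple threshold comparison. This is only a toy model: the voltage values below are invented for illustration and do not correspond to any real flash chip; the only property that matters is that Vint lies between the two threshold voltages.

```python
# Hypothetical voltages, chosen only so that V_INT lies between the two thresholds.
V_TH_EMPTY = 1.0     # threshold voltage when the floating gate has no electrons
V_TH_CHARGED = 5.0   # higher threshold voltage when the floating gate holds electrons
V_INT = 3.0          # intermediate voltage applied during a read


def read_cell(has_electrons: bool) -> int:
    """Return the bit stored in a cell: 1 if it conducts at V_INT, else 0."""
    threshold = V_TH_CHARGED if has_electrons else V_TH_EMPTY
    conducts = V_INT > threshold
    return 1 if conducts else 0


def read_page(page: list[bool]) -> list[int]:
    """A page is read as a whole: every cell in it is sensed."""
    return [read_cell(cell) for cell in page]


print(read_page([False, True, False, True]))  # [1, 0, 1, 0]
```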

Write and erase

  • Write: 1 => 0
    – Apply high positive voltage (>> voltage for read) to the control gate
    – Attract electrons from channel to floating gate (through quantum tunneling)
    – Page is the smallest unit for write

  • Erase: 0 => 1 (remove the electrons)
    – Need to apply a much higher negative voltage to the control gate
    – Get rid of electrons from the floating gate
    – May stress surrounding cells (dangerous to do on individual pages)
    – Block is the smallest unit for erase
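
The asymmetry between writing (per page, 1 => 0 only) and erasing (per block, 0 => 1) can be illustrated with a toy model. The class `FlashBlock` and its sizes are hypothetical and ignore everything a real SSD controller does; the sketch only enforces the two granularity rules above and counts erases.

```python
class FlashBlock:
    """Toy model of one flash block: pages of cells, all erased to 1 initially."""

    def __init__(self, pages: int = 4, cells_per_page: int = 8):
        self.pages = [[1] * cells_per_page for _ in range(pages)]
        self.erase_count = 0  # number of block erases so far

    def program_page(self, page_no: int, bits: list[int]) -> None:
        """Write a whole page. Programming can only change cells from 1 to 0."""
        page = self.pages[page_no]
        if len(bits) != len(page):
            raise ValueError("a page must be written as a whole")
        for old, new in zip(page, bits):
            if old == 0 and new == 1:
                raise ValueError("cannot change 0 -> 1 without erasing the block")
        self.pages[page_no] = list(bits)

    def erase(self) -> None:
        """Erase works on the whole block: every cell goes back to 1."""
        self.pages = [[1] * len(page) for page in self.pages]
        self.erase_count += 1


block = FlashBlock(pages=2, cells_per_page=4)
block.program_page(0, [1, 0, 1, 0])       # fine: only 1 -> 0 transitions
try:
    block.program_page(0, [1, 1, 1, 1])   # would need 0 -> 1: rejected
except ValueError as err:
    print("rejected:", err)
block.erase()                             # resets both pages, not just page 0
print(block.pages, block.erase_count)
```

Updating a single page in place is therefore impossible in this model: the whole block has to be erased first, which is why overwriting data on flash is expensive.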

P/E cycle (1->0->1->0...)

P: program/write;
E: erase

  • What is a P/E cycle?
    Data are written to cells (P): cell value goes from 1 -> 0; then the cells are erased (E): 0 -> 1
  • Why do P/E cycles matter?
    Every write and erase damages the oxide layer surrounding the floating gate to some extent, so a cell survives only a limited number of P/E cycles


    P/E cycle.png

latency: read < write < erase

latency.png

MLC (Multi-level cell)

  • The floating gate can hold different amounts of electrons to represent more than two states

  • SLC (single-level cell) compared with MLC:
    – Less complex
    – Faster
    – More reliable
    – Less storage
    – More costly (per byte of storage)


    MLC example.png
2 bits, 3 intermediate voltages.png
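
Reading a 2-bit cell with three intermediate voltages can be sketched as follows. The reference voltages and the mapping from charge level to bit pattern are assumptions made for illustration; real chips choose their own values and often use a different bit mapping.

```python
# Hypothetical reference (intermediate) voltages, lowest to highest.
REFERENCE_VOLTAGES = [1.0, 2.0, 3.0]

# Illustrative mapping from charge level to bits: level 0 = no electrons (erased).
STATES = ["11", "10", "01", "00"]


def read_mlc_cell(threshold_voltage: float) -> str:
    """Return the 2-bit value of a cell from its threshold voltage.

    The cell fails to conduct at every reference voltage below its threshold,
    so counting those references gives the charge level (0 to 3).
    """
    level = sum(1 for v_ref in REFERENCE_VOLTAGES if threshold_voltage > v_ref)
    return STATES[level]


for v_th in (0.5, 1.5, 2.5, 3.5):
    print(v_th, "->", read_mlc_cell(v_th))
```

Three comparisons separate four charge levels, which is why 2 bits per cell need 3 intermediate voltages; the narrower gaps between levels are one reason MLC is slower and less reliable than SLC.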

An example of writing a page on an SSD

P/E/P.png