Why write stalls are needed
When flush or compaction cannot keep up with the incoming write rate, RocksDB lowers the write rate or even stops writes entirely. What would go wrong without this mechanism? Mainly two things:
- Space amplification grows until disk space is exhausted
- Read amplification grows, which severely degrades read performance
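However, the database is sometimes too sensitive to a sudden burst of writes, or it underestimates what the hardware can handle. In that case the user sees unexpected slowness or even query timeouts.
In general, you can tell whether your database is experiencing write stalls by looking at these places: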
- The LOG file (info log)
- The compaction stats found in the LOG file
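As a rough illustration of checking the LOG file, the sketch below counts slowdown and stop messages. It assumes the messages start with "Stalling writes" and "Stopping writes", as in the sample log lines quoted in this post. The function name `summarize_stalls` and the sample input are hypothetical, not part of RocksDB.

```python
import re
from collections import Counter

# Prefixes assumed from the sample log lines quoted in this post.
STALL_RE = re.compile(r"(Stalling|Stopping) writes because")


def summarize_stalls(log_lines):
    """Count slowdown ("Stalling") and stop ("Stopping") messages."""
    counts = Counter()
    for line in log_lines:
        match = STALL_RE.search(line)
        if match:
            kind = "slowdown" if match.group(1) == "Stalling" else "stop"
            counts[kind] += 1
    return counts


# Hypothetical input: three stall messages and one unrelated line.
sample = [
    "Stalling writes because we have 4 level-0 files",
    "Stopping writes because we have 20 level-0 files",
    "Stalling writes because of estimated pending compaction bytes 500000000",
    "some unrelated info line",
]
print(dict(summarize_stalls(sample)))
```

In practice the list of lines would come from reading the LOG file in the database directory.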
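Conditions that trigger a write stall
- Too many memtables
Slowdown: if max_write_buffer_number is greater than 3 and the number of memtables waiting to be flushed is greater than or equal to max_write_buffer_number - 1, writes are slowed down.
Stop: if the number of memtables waiting to be flushed is greater than or equal to max_write_buffer_number, writes stop completely until a flush finishes.
In these cases the log usually contains lines like: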
Stopping writes because we have 5 immutable memtables (waiting for flush), max_write_buffer_number is set to 5
Stalling writes because we have 4 immutable memtables (waiting for flush), max_write_buffer_number is set to 5
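The memtable rule can be written as a small decision function. This is a Python model of the rule as stated in this post, not RocksDB's actual code; the function and argument names are made up for illustration.

```python
def memtable_stall_state(num_waiting_for_flush, max_write_buffer_number):
    """Return "stop", "slowdown", or "normal" for the memtable rule above."""
    # Stop: as many memtables are waiting for flush as the configured maximum.
    if num_waiting_for_flush >= max_write_buffer_number:
        return "stop"
    # Slowdown: only applies when max_write_buffer_number is greater than 3.
    if (max_write_buffer_number > 3
            and num_waiting_for_flush >= max_write_buffer_number - 1):
        return "slowdown"
    return "normal"


# The two sample log lines above correspond to these inputs.
print(memtable_stall_state(5, 5))
print(memtable_stall_state(4, 5))
```

Note that with max_write_buffer_number set to 3 or less there is no slowdown stage under this rule: writes go straight from normal to stopped.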
- Too many level-0 SST files
Slowdown: if the number of L0 files reaches level0_slowdown_writes_trigger, writes are slowed down.
Stop: if the number of L0 files reaches level0_stop_writes_trigger, writes stop completely until L0->L1 compaction reduces the number of L0 files.
In these two cases the log contains lines like:
Stalling writes because we have 4 level-0 files
Stopping writes because we have 20 level-0 files
- Too many pending compaction bytes
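Slowdown: if the estimated number of bytes pending compaction reaches soft_pending_compaction_bytes_limit, writes are slowed down.
Stop: if that number reaches hard_pending_compaction_bytes_limit, writes stop completely.
In these cases the log contains lines like: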
Stalling writes because of estimated pending compaction bytes 500000000
Stopping writes because of estimated pending compaction bytes 1000000000
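The L0 file count and the pending compaction bytes follow the same two-threshold pattern, so one helper can model both. This is a sketch of the rules described in this post rather than RocksDB's implementation. The helper name and the threshold values in the example are hypothetical, chosen to match the sample log lines, and "reaches" is treated as greater than or equal.

```python
def threshold_state(value, slowdown_at, stop_at):
    """Classify a metric against a slowdown threshold and a stop threshold."""
    if value >= stop_at:
        return "stop"
    if value >= slowdown_at:
        return "slowdown"
    return "normal"


# L0 file count, with hypothetical triggers of 4 (slowdown) and 20 (stop).
print(threshold_state(4, slowdown_at=4, stop_at=20))
print(threshold_state(20, slowdown_at=4, stop_at=20))

# Pending compaction bytes, with hypothetical soft/hard limits.
soft, hard = 500_000_000, 1_000_000_000
print(threshold_state(750_000_000, slowdown_at=soft, stop_at=hard))
```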
write stall vs write stop
When a slowdown occurs, RocksDB reduces the write rate to delayed_write_rate. If the number of bytes pending compaction keeps growing, RocksDB reduces the write rate further, below delayed_write_rate.
Note: the slowdown trigger, the stop trigger, and the pending compaction bytes limit are all evaluated per column family, whereas a write stall applies to the whole DB. In other words, if a single column family triggers a write stall, the entire DB is stalled.
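To make the per-column-family versus whole-DB distinction concrete, here is a minimal sketch. It assumes the DB-wide state is simply the most severe state among all column families, which is a simplification for illustration; the column family names are hypothetical.

```python
# Ordering of states from least to most severe.
SEVERITY = {"normal": 0, "slowdown": 1, "stop": 2}


def db_stall_state(cf_states):
    """cf_states maps column family name -> state; the DB takes the worst one."""
    return max(cf_states.values(), key=SEVERITY.__getitem__, default="normal")


# One column family slowing down is enough to slow down the whole DB.
print(db_stall_state({"default": "normal", "metrics": "slowdown"}))
```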
How to reduce stalls
If the stall is caused by pending flushes, you can adjust these two options:
- Increase max_background_flushes so that more threads are available for flushing
- Increase max_write_buffer_number so that the memtables waiting to be flushed are smaller
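If the stall is caused by too many L0 files or too many pending compaction bytes, compaction is not keeping up with writes. Note that anything that reduces write amplification also reduces the number of bytes compaction has to process. To speed up compaction, you can adjust these options: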
- Increase max_background_compactions so that more compaction threads are available
- Increase write_buffer_size, which reduces write amplification
- Increase min_write_buffer_number_to_merge
You can also raise the stop/slowdown triggers and the pending compaction bytes limits. These changes likewise help avoid write stalls.
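The advice in this section can be summarized as a lookup from the stall message to the options worth reviewing. The sketch below is only a convenience built on the log phrases quoted earlier in this post; the function name and the matching rules are assumptions, not a RocksDB feature.

```python
# Options suggested in this section, grouped by the cause of the stall.
FLUSH_HINTS = ["max_background_flushes", "max_write_buffer_number"]
COMPACTION_HINTS = [
    "max_background_compactions",
    "write_buffer_size",
    "min_write_buffer_number_to_merge",
]


def options_to_review(log_line):
    """Map a stall log line to the options this section suggests increasing."""
    if "immutable memtables" in log_line:
        return FLUSH_HINTS
    if "level-0 files" in log_line or "pending compaction bytes" in log_line:
        return COMPACTION_HINTS
    return []


print(options_to_review("Stalling writes because we have 4 level-0 files"))
```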