I have recently been reading the Compaction-related code in HBase.
With Compaction, the question we care about most is when it actually runs, because that directly affects HBase read and write performance.
Environment
HBase rel-2.1.0
Analysis
HBase runs a compaction in these three situations:
- When a MemStore is flushed
- When the checker thread detects that a compaction is needed
- When it is triggered manually
MemStore flush
For the first case, we can take a quick look at the Javadoc of HRegion.internalFlushcache(WAL wal, long myseqid, Collection<HStore> storesToFlush, MonitoredTask status, boolean writeFlushWalMarker, FlushLifeCycleTracker tracker):
/**
* Flush the memstore. Flushing the memstore is a little tricky. We have a lot of updates in the
* memstore, all of which have also been written to the wal. We need to write those updates in the
* memstore out to disk, while being able to process reads/writes as much as possible during the
* flush operation.
* <p>
* This method may block for some time. Every time you call it, we up the regions sequence id even
* if we don't flush; i.e. the returned region id will be at least one larger than the last edit
* applied to this region. The returned id does not refer to an actual edit. The returned id can
* be used for say installing a bulk loaded file just ahead of the last hfile that was the result
* of this flush, etc.
* @param wal Null if we're NOT to go via wal.
* @param myseqid The seqid to use if <code>wal</code> is null writing out flush file.
* @param storesToFlush The list of stores to flush.
* @return object describing the flush's state
* @throws IOException general io exceptions
* @throws DroppedSnapshotException Thrown when replay of WAL is required.
*/
protected FlushResultImpl internalFlushcache(WAL wal, long myseqid,
Collection<HStore> storesToFlush, MonitoredTask status, boolean writeFlushWalMarker,
FlushLifeCycleTracker tracker) throws IOException {
......
}
The call chain under this method eventually checks whether a compaction is needed.
That check sits at the very end of HStore.updateStorefiles(List<HStoreFile> sfs, long snapshotId):
/**
* Change storeFiles adding into place the Reader produced by this new flush.
* @param sfs Store files
* @param snapshotId
* @throws IOException
* @return Whether compaction is required.
*/
private boolean updateStorefiles(List<HStoreFile> sfs, long snapshotId) throws IOException {
......
return needsCompaction();
}
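To make the return value above more concrete, here is a minimal sketch of the kind of file-count check that needsCompaction() ends up delegating to. It is a simplified model, not HBase source: the class name NeedsCompactionSketch, the use of plain strings for store files, and the hard-coded threshold of 3 (chosen to mirror the default of hbase.hstore.compaction.min) are all assumptions made for illustration.

```java
import java.util.List;

/** Simplified model of a file-count compaction check; not HBase source code. */
final class NeedsCompactionSketch {
    // Assumed threshold, mirroring the default of hbase.hstore.compaction.min.
    static final int MIN_FILES_TO_COMPACT = 3;

    /**
     * @param storeFiles      all store files currently in the store
     * @param filesCompacting files already claimed by a running compaction
     * @return true when enough idle files have piled up to justify a compaction
     */
    static boolean needsCompaction(List<String> storeFiles, List<String> filesCompacting) {
        // Files that are already being compacted do not count as candidates.
        int candidates = storeFiles.size() - filesCompacting.size();
        return candidates >= MIN_FILES_TO_COMPACT;
    }

    public static void main(String[] args) {
        // Two files on disk: below the threshold.
        System.out.println(needsCompaction(List.of("hfile-1", "hfile-2"), List.of()));
        // A flush adds a third file: the threshold is reached.
        System.out.println(needsCompaction(List.of("hfile-1", "hfile-2", "hfile-3"), List.of()));
    }
}
```

The point of the sketch is that every flush adds one more file to the store, so the flush path is a natural place to ask whether the file count has crossed the threshold.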
Checker thread
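The concrete class is HRegionServer.CompactionChecker. It wakes up periodically and checks whether any Region needs a compaction. The default wake-up period is 10s, and we can change it with the hbase.server.thread.wakefrequency option (the value is in milliseconds). Note that a store is not necessarily examined on every wake-up: according to hbase-default.xml, the interval between actual checks is hbase.server.thread.wakefrequency multiplied by hbase.server.compactchecker.interval.multiplier (default 1000).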
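The timing of the checker can be sketched as follows. This is a toy model, not the real CompactionChecker: the class CompactionCheckerSketch and its methods are hypothetical, and it assumes the behaviour described in hbase-default.xml, where the interval between real checks is hbase.server.thread.wakefrequency multiplied by hbase.server.compactchecker.interval.multiplier. Which wake-up comes first in the cycle is also an assumption of the model.

```java
/** Toy model of a periodic compaction checker; not HBase source code. */
final class CompactionCheckerSketch {
    private final long wakeFrequencyMs; // models hbase.server.thread.wakefrequency
    private final long multiplier;      // models hbase.server.compactchecker.interval.multiplier
    private long iteration = 0;         // number of wake-ups seen so far

    CompactionCheckerSketch(long wakeFrequencyMs, long multiplier) {
        if (wakeFrequencyMs <= 0 || multiplier <= 0) {
            throw new IllegalArgumentException("both settings must be positive");
        }
        this.wakeFrequencyMs = wakeFrequencyMs;
        this.multiplier = multiplier;
    }

    /** Time between two wake-ups on which stores are actually examined. */
    long effectiveIntervalMs() {
        return wakeFrequencyMs * multiplier;
    }

    /** Simulates one wake-up; returns true if stores are examined this time. */
    boolean wakeUp() {
        boolean examine = (iteration % multiplier) == 0;
        iteration++;
        return examine;
    }

    public static void main(String[] args) {
        // Assumed defaults: 10 000 ms wake-up period, multiplier 1000.
        CompactionCheckerSketch checker = new CompactionCheckerSketch(10_000L, 1000L);
        System.out.println("effective interval (ms): " + checker.effectiveIntervalMs());

        int examined = 0;
        for (int i = 0; i < 3000; i++) {
            if (checker.wakeUp()) {
                examined++;
            }
        }
        System.out.println("examined wake-ups out of 3000: " + examined);
    }
}
```

With these assumed defaults, a quiet store that receives no flushes would only be re-examined by the checker every few hours, which is why the flush path matters so much for write-heavy tables.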
Manual invocation
There is not much to say about this one. Whether from the hbase shell or through the Admin API, we can trigger a Compaction by hand. In fact, when tuning for performance, the usual recommendation is to run Major Compaction manually rather than leaving it to the automatic schedule.