隨著redis不斷插入或者刪除數(shù)據(jù),dict保存的鍵值對(duì)也會(huì)增多或者減少逾雄,此時(shí)dict也會(huì)進(jìn)行相對(duì)應(yīng)的擴(kuò)容和縮容含懊,這些操作主要通過rehash來完成的。
dict的擴(kuò)容
如果是擴(kuò)容筐摘,首先判斷是否需要進(jìn)行擴(kuò)容
- 1卒茬,如果在重哈希的過程中,則直接返回
- 2咖熟,如果此時(shí)dict的容量為0圃酵,也就是才進(jìn)行了初始化,則將其容量擴(kuò)容從DICT_HT_INITIAL_SIZE馍管,默認(rèn)為4
- 3郭赐,如果dict裝載因子used/size>1.0且此時(shí)可以進(jìn)行擴(kuò)容,或者裝載因子大于dict_force_resize_ratio(設(shè)置為5)時(shí)确沸,此時(shí)強(qiáng)制進(jìn)行擴(kuò)容
- 4捌锭,擴(kuò)容過程中俘陷,容量提升一倍,此時(shí)初始化ht[1]观谦,同時(shí)更新rehashidx為0拉盾,代表要進(jìn)行重哈希過程了
/* Expand the hash table if needed */
static int _dictExpandIfNeeded(dict *d)
{
/* Incremental rehashing already in progress. Return. */
if (dictIsRehashing(d)) return DICT_OK;
/* If the hash table is empty expand it to the initial size. */
if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);
/* If we reached the 1:1 ratio, and we are allowed to resize the hash
* table (global setting) or we should avoid it but the ratio between
* elements/buckets is over the "safe" threshold, we resize doubling
* the number of buckets. */
if (d->ht[0].used >= d->ht[0].size &&
(dict_can_resize ||
d->ht[0].used/d->ht[0].size > dict_force_resize_ratio))
{
return dictExpand(d, d->ht[0].used*2);
}
return DICT_OK;
}
/* Expand or create the hash table */
int dictExpand(dict *d, unsigned long size)
{
dictht n; /* the new hash table */
unsigned long realsize = _dictNextPower(size);
/* the size is invalid if it is smaller than the number of
* elements already inside the hash table */
if (dictIsRehashing(d) || d->ht[0].used > size)
return DICT_ERR;
/* Rehashing to the same table size is not useful. */
if (realsize == d->ht[0].size) return DICT_ERR;
/* Allocate the new hash table and initialize all pointers to NULL */
n.size = realsize;
n.sizemask = realsize-1;
n.table = zcalloc(realsize*sizeof(dictEntry*));
n.used = 0;
/* Is this the first initialization? If so it's not really a rehashing
* we just set the first hash table so that it can accept keys. */
if (d->ht[0].table == NULL) {
d->ht[0] = n;
return DICT_OK;
}
/* Prepare a second hash table for incremental rehashing */
d->ht[1] = n;
d->rehashidx = 0;
return DICT_OK;
}
/* Resize the table to the minimal size that contains all the elements,
* but with the invariant of a USED/BUCKETS ratio near to <= 1 */
int dictResize(dict *d)
{
int minimal;
if (!dict_can_resize || dictIsRehashing(d)) return DICT_ERR;
minimal = d->ht[0].used;
if (minimal < DICT_HT_INITIAL_SIZE)
minimal = DICT_HT_INITIAL_SIZE;
return dictExpand(d, minimal);
}
dict的縮容
dict不光有擴(kuò)容,也有縮容豁状,當(dāng)刪除元素時(shí)捉偏,會(huì)檢測(cè)裝載因子是否小于HASHTABLE_MIN_FILL(默認(rèn)10%),此時(shí)進(jìn)行縮容泻红,通過調(diào)用dictResize函數(shù)將size縮小為最接近used的數(shù)組2的n次方的大小夭禽。
int htNeedsResize(dict *dict) {
long long size, used;
size = dictSlots(dict);
used = dictSize(dict);
return (size > DICT_HT_INITIAL_SIZE &&
(used*100/size < HASHTABLE_MIN_FILL));
}
rehash的過程
dictRehash每次將重哈希至少向前推進(jìn)n步(除非不到n步整個(gè)重哈希就結(jié)束了),
每一步都將ht[0]上某一個(gè)bucket(即一個(gè)dictEntry鏈表)上的所有dictEntry移動(dòng)到ht[1]上谊路,在ht[1]上的新位置根據(jù)ht[1]的sizemask進(jìn)行重新計(jì)算讹躯。
每次移動(dòng)的dictEntry都會(huì)插入ht[1]的鏈表上的第一個(gè)位置
rehashidx記錄了當(dāng)前尚未遷移(有待遷移)的ht[0]的bucket位置。
如果dictRehash被調(diào)用的時(shí)候缠劝,rehashidx指向的bucket里一個(gè)dictEntry也沒有蜀撑,那么它就沒有可遷移的數(shù)據(jù)。這時(shí)它嘗試在ht[0].table數(shù)組中不斷向后遍歷剩彬,直到找到下一個(gè)存有數(shù)據(jù)的bucket位置酷麦。如果一直找不到,則最多走n*10步喉恋,本次重哈希暫告結(jié)束沃饶。
最后,如果ht[0]上的數(shù)據(jù)都遷移到ht[1]上了(即d->ht[0].used == 0)轻黑,那么整個(gè)重哈希結(jié)束糊肤,ht[0]變成ht[1]的內(nèi)容,而ht[1]重置為空氓鄙。
三種情況下會(huì)結(jié)束這個(gè)rehash的過程:
- 1馆揉,移動(dòng)了ht[0]上n個(gè)dictEntry鏈表
- 2,雖然沒有移動(dòng)ht[0]上n個(gè)dictEntry鏈表抖拦,但是遍歷空的ht[0]的bucket已經(jīng)達(dá)到n*10
- 3升酣,整個(gè)ht[0]所有的dictEntry已經(jīng)重哈希到ht[1]
int dictRehash(dict *d, int n) {
int empty_visits = n*10; /* Max number of empty buckets to visit. */
if (!dictIsRehashing(d)) return 0;
while(n-- && d->ht[0].used != 0) {
dictEntry *de, *nextde;
/* Note that rehashidx can't overflow as we are sure there are more
* elements because ht[0].used != 0 */
assert(d->ht[0].size > (unsigned long)d->rehashidx);
while(d->ht[0].table[d->rehashidx] == NULL) {
d->rehashidx++;
if (--empty_visits == 0) return 1;
}
de = d->ht[0].table[d->rehashidx];
/* Move all the keys in this bucket from the old to the new hash HT */
while(de) {
unsigned int h;
nextde = de->next;
/* Get the index in the new hash table */
h = dictHashKey(d, de->key) & d->ht[1].sizemask;
de->next = d->ht[1].table[h];
d->ht[1].table[h] = de;
d->ht[0].used--;
d->ht[1].used++;
de = nextde;
}
d->ht[0].table[d->rehashidx] = NULL;
d->rehashidx++;
}
/* Check if we already rehashed the whole table... */
if (d->ht[0].used == 0) {
zfree(d->ht[0].table);
d->ht[0] = d->ht[1];
_dictReset(&d->ht[1]);
d->rehashidx = -1;
return 0;
}
/* More to rehash... */
return 1;
}
在前面介紹dict的添加,刪除态罪,查找和更新操作時(shí)噩茄,替代rehash的步驟都會(huì)穿插在其中進(jìn)行,因此redis的rehash的過程是一個(gè)漸進(jìn)式的過程复颈,所有元素的rehash均衡分?jǐn)偟礁鱾€(gè)操作中去绩聘,這樣可以避免集中式rehash對(duì)機(jī)器帶來的計(jì)算尖峰以及對(duì)redis功能的影響。
參考:
http://zhangtielei.com/posts/blog-redis-dict.html
《redis設(shè)計(jì)與實(shí)現(xiàn)》第二版