本質(zhì)
哈希表的底層數(shù)據(jù)結(jié)構(gòu)是數(shù)組剩愧,數(shù)組中每一個(gè)元素存放的是鍵值對(duì)篮灼。
核心原理
f(key) = index
將key傳入,通過(guò)某個(gè)算法f幽纷,計(jì)算出索引缘滥,如果索引沖突,再通過(guò)某個(gè)辦法對(duì)索引進(jìn)行+1或-1再算一遍厦画,直到算到不沖突為止。
負(fù)載因子 = 總鍵值對(duì)數(shù) / 哈希表總長(zhǎng)度
負(fù)載因子用來(lái)衡量用來(lái)衡量哈希表的空滿(mǎn)程度,一般鞭衩,當(dāng)負(fù)載因子為0.75~1的值就會(huì)進(jìn)行自動(dòng)擴(kuò)容学搜。
存取過(guò)程(以iOS方法緩存列表底層實(shí)現(xiàn)為例)
example:Class內(nèi)部結(jié)構(gòu)中有個(gè)方法緩存列表,當(dāng)objc_msgSend進(jìn)行到搜索方法緩存列表的步驟時(shí)论衍,為使查找更高效瑞佩,緩存列表的底層就是通過(guò)哈希表實(shí)現(xiàn)對(duì)調(diào)用過(guò)的方法的緩存。
1.存
(1)根據(jù)key計(jì)算出索引值
(2)如果該索引中沒(méi)有元素坯台,就存入鍵值對(duì)
(3)如果該索引中有元素炬丸,索引沖突,就對(duì)索引進(jìn)行+1或-1進(jìn)行存儲(chǔ)
(4)如果當(dāng)前已存在的所有索引都有元素蜒蕾,就進(jìn)行哈希表的擴(kuò)容
(5)設(shè)置新空間是舊空間的2倍
(6)重新分配內(nèi)存
(7)重新設(shè)置掩碼_mask = newCapacity-1
(8)會(huì)將舊內(nèi)存釋放掉稠炬,清空緩存
- objc_cache.mm源碼解析
static void cache_fill_nolock(Class cls, SEL sel, IMP imp, id receiver)
{
cacheUpdateLock.assertLocked();
// Never cache before +initialize is done
if (!cls->isInitialized()) return;
// Make sure the entry wasn't added to the cache by some other thread
// before we grabbed the cacheUpdateLock.
if (cache_getImp(cls, sel)) return;
cache_t *cache = getCache(cls);
cache_key_t key = getKey(sel);
// Use the cache as-is if it is less than 3/4 full
mask_t newOccupied = cache->occupied() + 1;
mask_t capacity = cache->capacity();
if (cache->isConstantEmptyCache()) {
// Cache is read-only. Replace it.
cache->reallocate(capacity, capacity ?: INIT_CACHE_SIZE);
}
else if (newOccupied <= capacity / 4 * 3) {
// Cache is less than 3/4 full. Use it as-is.
}
else {
// Cache is too full. Expand it.
cache->expand();
}
// Scan for the first unused slot and insert there.
// There is guaranteed to be an empty slot because the
// minimum size is 4 and we resized at 3/4 full.
bucket_t *bucket = cache->find(key, receiver);
if (bucket->key() == 0) cache->incrementOccupied();
//設(shè)置bucket中的key和imp
bucket->set(key, imp);
}
//擴(kuò)容
void cache_t::expand()
{
cacheUpdateLock.assertLocked();
uint32_t oldCapacity = capacity();
//新的空間是舊的空間的2倍
uint32_t newCapacity = oldCapacity ? oldCapacity*2 : INIT_CACHE_SIZE;
if ((uint32_t)(mask_t)newCapacity != newCapacity) {
// mask overflow - can't grow further
// fixme this wastes one bit of mask
newCapacity = oldCapacity;
}
//重新分配內(nèi)存
reallocate(oldCapacity, newCapacity);
}
void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity)
{
bool freeOld = canBeFreed();
bucket_t *oldBuckets = buckets();
bucket_t *newBuckets = allocateBuckets(newCapacity);
// Cache's old contents are not propagated.
// This is thought to save cache memory at the cost of extra cache fills.
// fixme re-measure this
assert(newCapacity > 0);
assert((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);
//會(huì)重新設(shè)置最新的_mask = newCapacity - 1;
setBucketsAndMask(newBuckets, newCapacity - 1);
if (freeOld) {
//會(huì)將舊的釋放掉,清空緩存
cache_collect_free(oldBuckets, oldCapacity);
cache_collect(false);
}
}
2.取
(1)通過(guò)key得到索引
(2)do-while循環(huán)咪啡,如果散列表元素的key恰好等于按位與掩碼(&_mask)取出的索引首启,就直接返回
(3)如果不是,就將索引-1繼續(xù)查找
- objc_cache.mm源碼解析
bucket_t * cache_t::find(cache_key_t k, id receiver)
{
assert(k != 0);
//返回buckets散列表
bucket_t *b = buckets();
mask_t m = mask();
//得到索引
mask_t begin = cache_hash(k, m);
mask_t i = begin;
do {
//如果buket_t的索引的恰好等于通過(guò)&_mask取出的索引撤摸,就直接返回
if (b[i].key() == 0 || b[i].key() == k) {
return &b[i];
}
} while ((i = cache_next(i, m)) != begin);
//如果不是毅桃,直接i-1,如果減到0還不行准夷,就直接是_mask(相當(dāng)于再?gòu)淖詈笠晃焕^續(xù)減)
// hack
Class cls = (Class)((uintptr_t)this - offsetof(objc_class, cache));
cache_t::bad_cache(receiver, (SEL)k, cls);
}
static inline mask_t cache_hash(cache_key_t key, mask_t mask)
{
//按位與mask得到索引
return (mask_t)(key & mask);
}
#if __arm__ || __x86_64__ || __i386__
// objc_msgSend has few registers available.
// Cache scan increments and wraps at special end-marking bucket.
#define CACHE_END_MARKER 1
static inline mask_t cache_next(mask_t i, mask_t mask) {
return (i+1) & mask;
}
#elif __arm64__
// objc_msgSend has lots of registers available.
// Cache scan decrements. No end marker needed.
#define CACHE_END_MARKER 0
static inline mask_t cache_next(mask_t i, mask_t mask) {
//arm64架構(gòu)钥飞,索引-1
return i ? i-1 : mask;
}
總結(jié)
Objective-C中的實(shí)現(xiàn),優(yōu)缺點(diǎn)并存衫嵌,但相對(duì)于實(shí)際情況而言读宙,還是優(yōu)點(diǎn)大于缺點(diǎn)。因?yàn)閷?duì)于方法緩存列表楔绞,一般不會(huì)有大量的數(shù)據(jù)结闸,擴(kuò)容或許是少數(shù)情況。此時(shí)酒朵,你可能會(huì)有疑問(wèn)膀估,當(dāng)數(shù)據(jù)量少的時(shí)候,豈不是犧牲了內(nèi)存耻讽?然而察纯,哈希表就是采用了空間換時(shí)間,犧牲內(nèi)存空間针肥,換取存取效率的方法饼记。所以,整體符合需求即可慰枕。
優(yōu)點(diǎn):
整體而言具则,提升了存取效率,時(shí)間復(fù)雜度為O(1)具帮,無(wú)需遍歷博肋。
缺點(diǎn):
當(dāng)哈希表比較大時(shí)低斋,如果擴(kuò)容會(huì)導(dǎo)致瞬間效率降低。
不同編程語(yǔ)言匪凡,哈希表的實(shí)現(xiàn)也各有特點(diǎn)膊畴,但是本質(zhì)和原理不變。
例如:
對(duì)于大量的數(shù)據(jù)或者數(shù)據(jù)庫(kù)中的存取病游,Java和Redis也有自己的優(yōu)化方法唇跨。
1.Java中,當(dāng)哈希函數(shù)不合理導(dǎo)致鏈表過(guò)長(zhǎng)時(shí)衬衬,會(huì)使用紅黑樹(shù)來(lái)保證插入和查找的效率买猖。
2.Redis 使用增量式擴(kuò)容方法優(yōu)化了這個(gè)缺點(diǎn),同時(shí)還有拉鏈法的實(shí)現(xiàn)(放在鏈表頭部頭插滋尉,因?yàn)樾虏迦氲恼{(diào)用概率會(huì)高)玉控。