iOS Class Structure: cache_t Analysis

I. Internal Structure of cache_t

1.1 In the iOS class structure analysis we saw that a class (Class) is essentially a struct, laid out as follows:

typedef struct objc_class *Class;
typedef struct objc_object *id;

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
    class_rw_t *data() const {
        return bits.data();
    }
    ...
}

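Class ISA: points to the associated class; inherited from objc_object. See the article on the underlying structure of isa.
Class superclass: pointer to the superclass; the same article explores in detail where it points.
cache_t cache: the data structure that stores the method cache.
class_data_bits_t bits: stores the class's metadata, such as its properties and methods.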

1.2 In the iOS class structure analysis we also looked at the cache_t struct, which has the following four parts:

struct cache_t {
    struct bucket_t * _buckets; // cache array, i.e. the hash buckets
    mask_t _mask;               // capacity - 1, used as the hash mask
    uint16_t _flags;            // flag bits
    uint16_t _occupied;         // number of methods currently cached
    ... (omitted)
}

_buckets: an array of bucket_t structs. Each bucket_t stores a method selector (SEL) and a function pointer (IMP).

struct bucket_t {
    explicit_atomic<uintptr_t> _imp;
    explicit_atomic<SEL> _sel;
}

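_mask: computed as mask_t m = capacity - 1; and used as a bit mask. The cache capacity starts at INIT_CACHE_SIZE and always grows by doubling (up to MAX_CACHE_SIZE), so it is always a power of two. The mask in binary (000011, 000111, 001111, ...) is therefore exactly what is needed to reduce a hash to a slot index: AND-ing with the mask guarantees the result never exceeds the cache size.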

capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE;  // double the capacity
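To make the mask arithmetic concrete, here is a minimal sketch. The helper names maskFor and slotFor are made up for illustration and are not part of the runtime; the only thing taken from the article is that the mask equals capacity - 1 for a power-of-two capacity.

```cpp
#include <cstdint>

using mask_t = std::uint32_t;

// Mask for a power-of-two capacity: all low bits set (4 -> 0b011, 8 -> 0b111).
constexpr mask_t maskFor(mask_t capacity) { return capacity - 1; }

// Hypothetical stand-in for the hash step: keep only the low bits of the key.
// For a power-of-two capacity this equals key % capacity.
constexpr mask_t slotFor(std::uintptr_t key, mask_t mask) {
    return static_cast<mask_t>(key & mask);
}
```

For example, key 13 lands in slot 1 with capacity 4 (13 % 4) and in slot 5 with capacity 8 (13 % 8), so the index always stays inside the array.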

_flags: flag bits.
_occupied: the number of methods currently cached, i.e. how many slots of the array are in use.

II. Exploring How Method Caching Works

The source code is as follows:

@interface LGPerson : NSObject

- (void)sayHello;

- (void)sayCode;

- (void)sayMaster;

- (void)sayNB;

+ (void)sayHappy;

@end

#import "LGPerson.h"

@implementation LGPerson
- (void)sayHello{
    NSLog(@"LGPerson say : %s",__func__);
}

- (void)sayCode{
    NSLog(@"LGPerson say : %s",__func__);
}

- (void)sayMaster{
    NSLog(@"LGPerson say : %s",__func__);
}

- (void)sayNB{
    NSLog(@"LGPerson say : %s",__func__);
}

+ (void)sayHappy{
    NSLog(@"LGPerson say : %s",__func__);
}
@end

#import <Foundation/Foundation.h>
#import "LGPerson.h"
#import <objc/runtime.h>


// cache_t
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // insert code here...
        LGPerson *p  = [LGPerson alloc];
        Class pClass = [LGPerson class];

        [p sayHello];
        [p sayCode];
        [p sayMaster];
        [p sayNB];

        NSLog(@"%@",pClass);
    }
    return 0;
}

2.1 Set a breakpoint before the sayHello call and inspect the cache_t data with LLDB

In the class struct, cache_t is preceded by the Class ISA pointer and the Class superclass pointer (8 bytes each), so we offset the class address by 16 bytes (0x10).

(lldb) p/x pClass
(Class) $0 = 0x00000001000022a0 LGPerson
(lldb) p (cache_t *)0x00000001000022b0
(cache_t *) $1 = 0x00000001000022b0
(lldb) p *$1
(cache_t) $2 = {
  _buckets = {
    std::__1::atomic<bucket_t *> = 0x000000010032e420 {
      _sel = {
        std::__1::atomic<objc_selector *> = (null)
      }
      _imp = {
        std::__1::atomic<unsigned long> = 0
      }
    }
  }
  _mask = {
    std::__1::atomic<unsigned int> = 0
  }
  _flags = 32804
  _occupied = 0
}
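The 16-byte offset used in this session can be checked with a small mock. MockCache and MockClass are hypothetical stand-ins that only copy the field order from the listings in Part I; they are not the runtime's real types, and the result of 16 assumes a 64-bit target.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in mirroring the four cache_t fields listed in 1.2.
struct MockCache {
    void *buckets;
    std::uint32_t mask;
    std::uint16_t flags;
    std::uint16_t occupied;
};

// Hypothetical stand-in mirroring the field order of objc_class.
struct MockClass {
    void *isa;          // Class ISA (inherited from objc_object)
    void *superclass;   // Class superclass
    MockCache cache;    // cache_t cache
    std::uintptr_t bits;
};

// The cache sits right after two pointers.
constexpr std::size_t cacheOffset() { return offsetof(MockClass, cache); }

// Address arithmetic equivalent to what we typed by hand in LLDB.
constexpr std::uint64_t cacheAddress(std::uint64_t cls) { return cls + cacheOffset(); }
```

With the class address from the session, 0x1000022a0 + 0x10 gives 0x1000022b0, the address we cast to cache_t *.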

2.2 Then step over the sayHello call and inspect the cache_t data in LLDB again

2020-09-17 22:37:33.187060+0800 KCObjc[34953:549295] LGPerson say : -[LGPerson sayHello]
(lldb) p *$1
(cache_t) $3 = {
  _buckets = {
    std::__1::atomic<bucket_t *> = 0x00000001006ad5f0 {
      _sel = {
        std::__1::atomic<objc_selector *> = ""
      }
      _imp = {
        std::__1::atomic<unsigned long> = 11936
      }
    }
  }
  _mask = {
    std::__1::atomic<unsigned int> = 3
  }
  _flags = 32804
  _occupied = 1
}

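2.3 At this point you should notice that _buckets, _mask and _occupied have all changed. _occupied went from 0 to 1, which confirms that once sayHello has run, the number of cached methods increases by 1. Next, let's look at how the hash buckets _buckets changed. Jumping to the definition of the bucket type, struct bucket_t, shows: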

struct bucket_t {
public:
    inline SEL sel() const { return _sel.load(memory_order::memory_order_relaxed); }

    inline IMP imp(Class cls) const {
        uintptr_t imp = _imp.load(memory_order::memory_order_relaxed);
        if (!imp) return nil;
#if CACHE_IMP_ENCODING == CACHE_IMP_ENCODING_PTRAUTH
        SEL sel = _sel.load(memory_order::memory_order_relaxed);
        return (IMP)
            ptrauth_auth_and_resign((const void *)imp,
                                    ptrauth_key_process_dependent_code,
                                    modifierForSEL(sel, cls),
                                    ptrauth_key_function_pointer, 0);
#elif CACHE_IMP_ENCODING == CACHE_IMP_ENCODING_ISA_XOR
        return (IMP)(imp ^ (uintptr_t)cls);
#elif CACHE_IMP_ENCODING == CACHE_IMP_ENCODING_NONE
        return (IMP)imp;
#else
#error Unknown method cache IMP encoding.
#endif
    }
}

With these accessors we can inspect the SEL and IMP stored in _buckets.

(lldb) p $3.buckets()
(bucket_t *) $4 = 0x00000001006ad5f0
(lldb) p *$4
(bucket_t) $5 = {
  _sel = {
    std::__1::atomic<objc_selector *> = ""
  }
  _imp = {
    std::__1::atomic<unsigned long> = 11936
  }
}
(lldb) p $5.sel()
(SEL) $6 = "sayHello"
(lldb) p $5.imp(pClass)
(IMP) $7 = 0x0000000100000c00 (KCObjc`-[LGPerson sayHello])
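The raw _imp value 11936 looks nothing like a function address, but the numbers in this session are consistent with the CACHE_IMP_ENCODING_ISA_XOR branch of imp(Class cls) shown earlier. A minimal sketch of that arithmetic follows; encodeImp and decodeImp are illustrative names, not runtime functions.

```cpp
#include <cstdint>

// ISA_XOR encoding: the stored value is imp ^ cls.
constexpr std::uint64_t encodeImp(std::uint64_t imp, std::uint64_t cls) {
    return imp ^ cls;
}

// XOR-ing with the class address again recovers the original pointer,
// which is what `return (IMP)(imp ^ (uintptr_t)cls);` does.
constexpr std::uint64_t decodeImp(std::uint64_t stored, std::uint64_t cls) {
    return stored ^ cls;
}
```

Plugging in the values above: 11936 is 0x2ea0, the class address is 0x1000022a0, and 0x2ea0 ^ 0x1000022a0 is 0x100000c00, the address LLDB printed for -[LGPerson sayHello].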

We can also open MachOView and look up the IMP address of the sayHello method.

[Image: MachOView]

It matches our LLDB result exactly. Perfect~

2.4 Next we step over the sayCode and sayMaster calls, inspecting the cache_t data in LLDB after each one

2020-09-17 23:12:37.095330+0800 KCObjc[34953:549295] LGPerson say : -[LGPerson sayCode]
(lldb) p *$1
(cache_t) $8 = {
  _buckets = {
    std::__1::atomic<bucket_t *> = 0x00000001006ad5f0 {
      _sel = {
        std::__1::atomic<objc_selector *> = ""
      }
      _imp = {
        std::__1::atomic<unsigned long> = 11936
      }
    }
  }
  _mask = {
    std::__1::atomic<unsigned int> = 3
  }
  _flags = 32804
  _occupied = 2
}
2020-09-17 23:12:59.163825+0800 KCObjc[34953:549295] LGPerson say : -[LGPerson sayMaster]
(lldb) p *$1
(cache_t) $9 = {
  _buckets = {
    std::__1::atomic<bucket_t *> = 0x0000000103b4c7d0 {
      _sel = {
        std::__1::atomic<objc_selector *> = (null)
      }
      _imp = {
        std::__1::atomic<unsigned long> = 0
      }
    }
  }
  _mask = {
    std::__1::atomic<unsigned int> = 7
  }
  _flags = 32804
  _occupied = 1
}

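At this point we notice three things:
Question ①: _occupied went from 2 to 1. Why would the number of cached methods decrease?
Question ②: _mask went from 3 to 7. You can probably guess this one: as noted earlier, _mask follows the cache capacity, which grows by doubling. The capacity went from 4 to 8.
Question ③: the SEL and IMP that were in the first bucket of _buckets have disappeared.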

2.5 Let's get to the bottom of this. In void incrementOccupied(); we find _occupied++;.

void cache_t::incrementOccupied() 
{
    _occupied++;
}

2.6 Next we search the source for the places that call incrementOccupied();. Here is the nice surprise: cache_t::insert() calls it. As the name suggests, this is the method that inserts an entry into the cache.

void cache_t::insert(Class cls, SEL sel, IMP imp, id receiver)
{
#if CONFIG_USE_CACHE_LOCK
    cacheUpdateLock.assertLocked();
#else
    runtimeLock.assertLocked();
#endif

    ASSERT(sel != 0 && cls->isInitialized());

    // Use the cache as-is if it is less than 3/4 full
    mask_t newOccupied = occupied() + 1;
    unsigned oldCapacity = capacity(), capacity = oldCapacity;
    if (slowpath(isConstantEmptyCache())) {
        // Cache is read-only. Replace it.
        if (!capacity) capacity = INIT_CACHE_SIZE;
        reallocate(oldCapacity, capacity, /* freeOld */false);
    }
    else if (fastpath(newOccupied + CACHE_END_MARKER <= capacity / 4 * 3)) { // e.g. capacity 4: limit is 3, counting 1 end-marker bucket
        // Cache is less than 3/4 full. Use it as-is.
    }
    else {
        capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE;  // double the capacity (4 if it was 0)
        if (capacity > MAX_CACHE_SIZE) {
            capacity = MAX_CACHE_SIZE;
        }
        reallocate(oldCapacity, capacity, true);  // allocate the enlarged bucket array
    }

    bucket_t *b = buckets();
    mask_t m = capacity - 1;
    mask_t begin = cache_hash(sel, m);
    mask_t i = begin;

    // Scan for the first unused slot and insert there.
    // There is guaranteed to be an empty slot because the
    // minimum size is 4 and we resized at 3/4 full.
    do {
        if (fastpath(b[i].sel() == 0)) {
            incrementOccupied();
            b[i].set<Atomic, Encoded>(sel, imp, cls);
            return;
        }
        if (b[i].sel() == sel) {
            // The entry was added to the cache by some other thread
            // before we grabbed the cacheUpdateLock.
            return;
        }
    } while (fastpath((i = cache_next(i, m)) != begin));

    cache_t::bad_cache(receiver, (SEL)sel, cls);
}
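The probing loop at the end of insert() can be sketched in isolation. Everything here is a simplified model: TinyCache has a fixed capacity of 4 and never grows, and hashSlot and nextSlot are assumed stand-ins for cache_hash and cache_next, whose bodies are not shown in this article.

```cpp
#include <cstdint>

using mask_t = std::uint32_t;

struct Slot { std::uintptr_t sel = 0; std::uintptr_t imp = 0; };

// Assumed behavior for this sketch only: mask the key, then step forward with wrap-around.
constexpr mask_t hashSlot(std::uintptr_t sel, mask_t mask) { return static_cast<mask_t>(sel & mask); }
constexpr mask_t nextSlot(mask_t i, mask_t mask) { return (i + 1) & mask; }

struct TinyCache {
    static constexpr mask_t kCapacity = 4;
    Slot slots[kCapacity] = {};
    mask_t occupied = 0;

    // Returns the slot index used, or kCapacity if the scan wrapped around without success.
    constexpr mask_t insert(std::uintptr_t sel, std::uintptr_t imp) {
        const mask_t mask = kCapacity - 1;
        const mask_t begin = hashSlot(sel, mask);
        mask_t i = begin;
        do {
            if (slots[i].sel == 0) {           // first unused slot: store and count it
                slots[i].sel = sel;
                slots[i].imp = imp;
                ++occupied;
                return i;
            }
            if (slots[i].sel == sel) return i; // already cached: nothing to do
            i = nextSlot(i, mask);
        } while (i != begin);
        return kCapacity;                      // corresponds to the bad_cache() path
    }
};

// Two keys that hash to the same slot: the second one is pushed to the next slot.
constexpr mask_t collidingInsertIndex() {
    TinyCache c{};
    c.insert(5, 100);         // 5 & 3 == 1, stored in slot 1
    return c.insert(9, 200);  // 9 & 3 == 1 is taken, so it probes to slot 2
}

// Inserting the same key twice must not bump the count.
constexpr mask_t occupiedAfterDuplicate() {
    TinyCache c{};
    c.insert(5, 100);
    c.insert(5, 100);
    return c.occupied;
}
```

The point of the sketch is the order of the two checks: an empty slot ends the scan with an insert, a matching selector ends it without one, and only a collision moves on to the next slot.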

2.6.1 First, the unlikely (slowpath) case: initialization.
If the cache is empty, a cache of INIT_CACHE_SIZE (4) buckets is set up, with reallocate() allocating the space.

enum {
    INIT_CACHE_SIZE_LOG2 = 2,
    INIT_CACHE_SIZE      = (1 << INIT_CACHE_SIZE_LOG2),
    MAX_CACHE_SIZE_LOG2  = 16,
    MAX_CACHE_SIZE       = (1 << MAX_CACHE_SIZE_LOG2),
};

if (slowpath(isConstantEmptyCache())) { // unlikely case -> initialization
    // Cache is read-only. Replace it.
    if (!capacity) capacity = INIT_CACHE_SIZE; // 4 (enum definition: 1 shifted left by 2)
    reallocate(oldCapacity, capacity, /* freeOld */false);
}

reallocate()

Allocates storage for newCapacity buckets.
Calls setBucketsAndMask() to install the new buckets.

void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld)
{
    bucket_t *oldBuckets = buckets();
    bucket_t *newBuckets = allocateBuckets(newCapacity);

    // Cache's old contents are not propagated. 
    // This is thought to save cache memory at the cost of extra cache fills.
    // fixme re-measure this

    ASSERT(newCapacity > 0);
    ASSERT((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);

    setBucketsAndMask(newBuckets, newCapacity - 1);
    
    if (freeOld) {
        cache_collect_free(oldBuckets, oldCapacity);
    }
}

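setBucketsAndMask()

Stores the new bucket pointer and the new mask into the cache; the contents of the old buckets are not copied over.
Sets _occupied = 0. Note what this implies: every call to reallocate() resets _occupied to 0.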

void cache_t::setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask)
{
    // objc_msgSend uses mask and buckets with no locks.
    // It is safe for objc_msgSend to see new buckets but old mask.
    // (It will get a cache miss but not overrun the buckets' bounds).
    // It is unsafe for objc_msgSend to see old buckets and new mask.
    // Therefore we write new buckets, wait a lot, then write new mask.
    // objc_msgSend reads mask first, then buckets.

#ifdef __arm__
    // ensure other threads see buckets contents before buckets pointer
    mega_barrier();

    _buckets.store(newBuckets, memory_order::memory_order_relaxed);
    
    // ensure other threads see new buckets before new mask
    mega_barrier();
    
    _mask.store(newMask, memory_order::memory_order_relaxed);
    _occupied = 0;
#elif __x86_64__ || i386
    // ensure other threads see buckets contents before buckets pointer
    _buckets.store(newBuckets, memory_order::memory_order_release);
    
    // ensure other threads see new buckets before new mask
    _mask.store(newMask, memory_order::memory_order_release);
    _occupied = 0;
#else
#error Don't know how to do setBucketsAndMask on this architecture.
#endif
}
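The ordering described in the comments of setBucketsAndMask() can be modelled with standard atomics. This is only a sketch of the x86_64 branch under simplifying assumptions: PublishedCache and slotFor are made-up names, the reader here is ordinary C++ rather than objc_msgSend, and sel & mask stands in for the real hash.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

using mask_t = std::uint32_t;

struct Bucket { std::uintptr_t sel = 0; std::uintptr_t imp = 0; };

struct PublishedCache {
    std::atomic<Bucket *> buckets{nullptr};
    std::atomic<mask_t> mask{0};
    std::uint16_t occupied = 0;

    // Writer: publish the new buckets first, then the new mask, then reset the count.
    void setBucketsAndMask(Bucket *newBuckets, mask_t newMask) {
        buckets.store(newBuckets, std::memory_order_release);
        mask.store(newMask, std::memory_order_release);
        occupied = 0;
    }

    // Reader: load the mask first, then the buckets. A reader that sees the new mask
    // is guaranteed to see the new buckets as well, so the index cannot overrun them.
    Bucket *slotFor(std::uintptr_t sel) const {
        const mask_t m = mask.load(std::memory_order_acquire);
        Bucket *b = buckets.load(std::memory_order_acquire);
        const mask_t index = static_cast<mask_t>(sel & m);
        assert(index <= m);
        return b + index;
    }
};
```

The unsafe combination named in the source comment, old buckets with new mask, is exactly what this store order rules out; the harmless combination, new buckets with old mask, only costs a cache miss.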

2.6.2 Next is the likely (fastpath) case

If newOccupied + CACHE_END_MARKER (1) <= capacity / 4 * 3, nothing needs to be done.

#define CACHE_END_MARKER 1

else if (fastpath(newOccupied + CACHE_END_MARKER <= capacity / 4 * 3)) { // e.g. capacity 4: limit is 3, counting 1 end-marker bucket
    // Cache is less than 3/4 full. Use it as-is.
}

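2.6.3 Next is the expansion case

When the new entry would push the cache past 3/4 of its total capacity, the cache has to grow (to twice its size).
After growing, reallocate() is still used to allocate the space, and as noted for setBucketsAndMask() in 2.6.1, reallocate() resets _occupied to 0. This finally explains Question ① in 2.4, why _occupied decreased: after the expansion _occupied is reset to 0 and counting starts over.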

else {
    capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE;  // double the capacity (4 if it was 0)
    if (capacity > MAX_CACHE_SIZE) {
        capacity = MAX_CACHE_SIZE;
    }
    reallocate(oldCapacity, capacity, true);  // allocate the enlarged bucket array
}
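To check that the three branches really produce the numbers we saw in LLDB, here is a small model of the capacity and count bookkeeping in insert(). It is a simplification: CacheState and afterInsert are illustrative names, capacity 0 stands for the empty cache, and every call is assumed to insert a selector that is not cached yet.

```cpp
#include <cstdint>

using mask_t = std::uint32_t;

constexpr mask_t kInitCacheSize = 1u << 2;   // INIT_CACHE_SIZE
constexpr mask_t kMaxCacheSize  = 1u << 16;  // MAX_CACHE_SIZE
constexpr mask_t kEndMarker     = 1;         // CACHE_END_MARKER

struct CacheState { mask_t capacity; mask_t occupied; };

// Models one cache_t::insert() of a selector that is not in the cache yet.
constexpr CacheState afterInsert(CacheState s) {
    const mask_t newOccupied = s.occupied + 1;
    if (s.capacity == 0) {
        // Empty cache: allocate the initial buckets; reallocate() resets the count.
        s.capacity = kInitCacheSize;
        s.occupied = 0;
    } else if (newOccupied + kEndMarker <= s.capacity / 4 * 3) {
        // Still within the 3/4 limit: use the cache as-is.
    } else {
        // Over the limit: double (clamped), and reallocate() resets the count.
        mask_t grown = s.capacity * 2;
        if (grown > kMaxCacheSize) grown = kMaxCacheSize;
        s.capacity = grown;
        s.occupied = 0;
    }
    s.occupied += 1;  // incrementOccupied()
    return s;
}

constexpr mask_t maskOf(CacheState s) { return s.capacity - 1; }
```

Starting from an empty cache, three calls give (mask 3, occupied 1), (mask 3, occupied 2) and (mask 7, occupied 1), the same sequence as after sayHello, sayCode and sayMaster. The third insert fails the check because 3 + 1 is greater than 4 / 4 * 3 = 3.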

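2.6.4 reallocate()

reallocate() (listed in 2.6.1) calls setBucketsAndMask() to install a fresh, empty bucket array. Because the expansion re-initializes the buckets and the old entries are not carried over, this is the cause of Question ③ in 2.4.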

2.6.5 Finally, the change in _mask. From 2.6.3 we know the capacity doubles, so the mask is always 2^n - 1 (4 - 1 = 3 becomes 8 - 1 = 7), which settles Question ② in 2.4.

mask_t m = capacity - 1;