W-TinyLFU源碼分析（Caffeine）

Caffeine使用一個(gè)ConcurrencyHashMap來(lái)保存所有數(shù)據(jù)匪补，那它的過(guò)期淘汰策略采用的什么的方式與數(shù)據(jù)結(jié)構(gòu)呢?其中寫過(guò)期是使用writeOrderDeque,這個(gè)比較簡(jiǎn)單無(wú)需多說(shuō)，而讀過(guò)期相對(duì)復(fù)雜很多对室，使用W-TinyLFU的結(jié)構(gòu)與算法参滴。

網(wǎng)絡(luò)上有很多文章介紹W-TinyLFU結(jié)構(gòu)的强岸，大家可以去查一下，這里主要是從源碼來(lái)分析砾赔，總的來(lái)說(shuō)它使用了三個(gè)雙端隊(duì)列：accessOrderEdenDeque,accessOrderProbationDeque,accessOrderProtectedDeque,使用雙端隊(duì)列的原因是支持LRU算法比較方便蝌箍。

accessOrderEdenDeque屬于eden區(qū)，緩存1%的數(shù)據(jù)暴心，其余的99%緩存在main區(qū)妓盲。

accessOrderProbationDeque屬于main區(qū)，緩存main內(nèi)數(shù)據(jù)的20%专普，這部分是屬于冷數(shù)據(jù)悯衬，即將補(bǔ)淘汰。

accessOrderProtectedDeque屬于main區(qū)脆诉，緩存main內(nèi)數(shù)據(jù)的80%甚亭，這部分是屬于熱數(shù)據(jù)贷币，是整個(gè)緩存的主存區(qū)。

我們先看一下淘汰方法入口：

  void evictEntries() {
    if (!evicts()) {
      return;
    }
    //先從edn區(qū)淘汰
    int candidates = evictFromEden();
    //eden淘汰后的數(shù)據(jù)進(jìn)入main區(qū)亏狰，然后再?gòu)膍ain區(qū)淘汰
    evictFromMain(candidates);
  }

accessOrderEdenDeque對(duì)應(yīng)W-TinyLFU的W(window)役纹，這里保存的是最新寫入數(shù)據(jù)的引用，它使用LRU淘汰暇唾，這里面的數(shù)據(jù)主要是應(yīng)對(duì)突發(fā)流量的問(wèn)題促脉，淘汰后的數(shù)據(jù)進(jìn)入
accessOrderProbationDeque.代碼如下：

int evictFromEden() {
    int candidates = 0;
    Node<K, V> node = accessOrderEdenDeque().peek();
    while (edenWeightedSize() > edenMaximum()) {
      // The pending operations will adjust the size to reflect the correct weight
      if (node == null) {
        break;
      }

      Node<K, V> next = node.getNextInAccessOrder();
      if (node.getWeight() != 0) {
        node.makeMainProbation();
        //先從eden區(qū)移除
        accessOrderEdenDeque().remove(node);
        //移除的數(shù)據(jù)加入到main區(qū)的probation隊(duì)列
        accessOrderProbationDeque().add(node);
        candidates++;

        lazySetEdenWeightedSize(edenWeightedSize() - node.getPolicyWeight());
      }
      node = next;
    }

    return candidates;
  }

數(shù)據(jù)進(jìn)入probation隊(duì)列后，繼續(xù)執(zhí)行以下代碼：

 void evictFromMain(int candidates) {
    int victimQueue = PROBATION;
    Node<K, V> victim = accessOrderProbationDeque().peekFirst();
    Node<K, V> candidate = accessOrderProbationDeque().peekLast();
    while (weightedSize() > maximum()) {
      // Stop trying to evict candidates and always prefer the victim
      if (candidates == 0) {
        candidate = null;
      }

      // Try evicting from the protected and eden queues
      if ((candidate == null) && (victim == null)) {
        if (victimQueue == PROBATION) {
          victim = accessOrderProtectedDeque().peekFirst();
          victimQueue = PROTECTED;
          continue;
        } else if (victimQueue == PROTECTED) {
          victim = accessOrderEdenDeque().peekFirst();
          victimQueue = EDEN;
          continue;
        }

        // The pending operations will adjust the size to reflect the correct weight
        break;
      }

      // Skip over entries with zero weight
      if ((victim != null) && (victim.getPolicyWeight() == 0)) {
        victim = victim.getNextInAccessOrder();
        continue;
      } else if ((candidate != null) && (candidate.getPolicyWeight() == 0)) {
        candidate = candidate.getPreviousInAccessOrder();
        candidates--;
        continue;
      }

      // Evict immediately if only one of the entries is present
      if (victim == null) {
        candidates--;
        Node<K, V> evict = candidate;
        candidate = candidate.getPreviousInAccessOrder();
        evictEntry(evict, RemovalCause.SIZE, 0L);
        continue;
      } else if (candidate == null) {
        Node<K, V> evict = victim;
        victim = victim.getNextInAccessOrder();
        evictEntry(evict, RemovalCause.SIZE, 0L);
        continue;
      }

      // Evict immediately if an entry was collected
      K victimKey = victim.getKey();
      K candidateKey = candidate.getKey();
      if (victimKey == null) {
        Node<K, V> evict = victim;
        victim = victim.getNextInAccessOrder();
        evictEntry(evict, RemovalCause.COLLECTED, 0L);
        continue;
      } else if (candidateKey == null) {
        candidates--;
        Node<K, V> evict = candidate;
        candidate = candidate.getPreviousInAccessOrder();
        evictEntry(evict, RemovalCause.COLLECTED, 0L);
        continue;
      }

      // Evict immediately if the candidate's weight exceeds the maximum
      if (candidate.getPolicyWeight() > maximum()) {
        candidates--;
        Node<K, V> evict = candidate;
        candidate = candidate.getPreviousInAccessOrder();
        evictEntry(evict, RemovalCause.SIZE, 0L);
        continue;
      }

      // Evict the entry with the lowest frequency
      candidates--;
      //最核心算法在這里：從probation的頭尾取出兩個(gè)node進(jìn)行比較頻率策州，頻率更小者將被remove
      if (admit(candidateKey, victimKey)) {
        Node<K, V> evict = victim;
        victim = victim.getNextInAccessOrder();
        evictEntry(evict, RemovalCause.SIZE, 0L);
        candidate = candidate.getPreviousInAccessOrder();
      } else {
        Node<K, V> evict = candidate;
        candidate = candidate.getPreviousInAccessOrder();
        evictEntry(evict, RemovalCause.SIZE, 0L);
      }
    }
  }

上面的代碼邏輯是從probation的頭尾取出兩個(gè)node進(jìn)行比較頻率瘸味，頻率更小者將被remove，其中尾部元素就是上一部分從eden中淘汰出來(lái)的元素够挂，如果將兩步邏輯合并起來(lái)講是這樣的：在eden隊(duì)列通過(guò)lru淘汰出來(lái)的”候選者“與probation隊(duì)列通過(guò)lru淘汰出來(lái)的“被驅(qū)逐者“進(jìn)行頻率比較旁仿，失敗者將被從cache中真正移除。下面看一下它的比較邏輯admit：

  boolean admit(K candidateKey, K victimKey) {
    int victimFreq = frequencySketch().frequency(victimKey);
    int candidateFreq = frequencySketch().frequency(candidateKey);
    //如果候選者的頻率高就淘汰被驅(qū)逐者
    if (candidateFreq > victimFreq) {
      return true;
      //如果被驅(qū)逐者比候選者的頻率高孽糖，并且候選者頻率小于等于5則淘汰者
    } else if (candidateFreq <= 5) {
      // The maximum frequency is 15 and halved to 7 after a reset to age the history. An attack
      // exploits that a hot candidate is rejected in favor of a hot victim. The threshold of a warm
      // candidate reduces the number of random acceptances to minimize the impact on the hit rate.
      return false;
    }
    //隨機(jī)淘汰
    int random = ThreadLocalRandom.current().nextInt();
    return ((random & 127) == 0);
  }

從frequencySketch取出候選者與被驅(qū)逐者的頻率枯冈，如果候選者的頻率高就淘汰被驅(qū)逐者，如果被驅(qū)逐者比候選者的頻率高办悟，并且候選者頻率小于等于5則淘汰者尘奏，如果前面兩個(gè)條件都不滿足則隨機(jī)淘汰。

整個(gè)過(guò)程中你是不是發(fā)現(xiàn)protectedDeque并沒(méi)有什么作用病蛉，那它是怎么作為主存區(qū)來(lái)保存大部分?jǐn)?shù)據(jù)的呢?

//onAccess方法觸發(fā)該方法 
void reorderProbation(Node<K, V> node) {
    if (!accessOrderProbationDeque().contains(node)) {
      // Ignore stale accesses for an entry that is no longer present
      return;
    } else if (node.getPolicyWeight() > mainProtectedMaximum()) {
      return;
    }

    long mainProtectedWeightedSize = mainProtectedWeightedSize() + node.getPolicyWeight();
   //先從probation中移除
   accessOrderProbationDeque().remove(node);
  //加入到protected中
    accessOrderProtectedDeque().add(node);
    node.makeMainProtected();

    long mainProtectedMaximum = mainProtectedMaximum();
  //從protected中移除
    while (mainProtectedWeightedSize > mainProtectedMaximum) {
      Node<K, V> demoted = accessOrderProtectedDeque().pollFirst();
      if (demoted == null) {
        break;
      }
      demoted.makeMainProbation();
      //加入到probation中
      accessOrderProbationDeque().add(demoted);
      mainProtectedWeightedSize -= node.getPolicyWeight();
    }

    lazySetMainProtectedWeightedSize(mainProtectedWeightedSize);
  }

當(dāng)數(shù)據(jù)被訪問(wèn)時(shí)并且該數(shù)據(jù)在probation中炫加，這個(gè)數(shù)據(jù)就會(huì)移動(dòng)到protected中去，同時(shí)通過(guò)lru從protected中淘汰一個(gè)數(shù)據(jù)進(jìn)入到probation中铺然。

這樣數(shù)據(jù)流轉(zhuǎn)的邏輯全部通了：新數(shù)據(jù)都會(huì)進(jìn)入到eden中俗孝，通過(guò)lru淘汰到probation,并與probation中通過(guò)lru淘汰的數(shù)據(jù)進(jìn)行使用頻率pk,如果勝利了就繼續(xù)留在probation中，如果失敗了就會(huì)被直接淘汰探熔，當(dāng)這條數(shù)據(jù)被訪問(wèn)了驹针，則移動(dòng)到protected烘挫。當(dāng)其它數(shù)據(jù)被訪問(wèn)了诀艰，則它可能會(huì)從protected中通過(guò)lru淘汰到probation中。

TinyLFU

傳統(tǒng)LFU一般使用key-value形式來(lái)記錄每個(gè)key的頻率饮六，優(yōu)點(diǎn)是數(shù)據(jù)結(jié)構(gòu)非常簡(jiǎn)單其垄，并且能跟緩存本身的數(shù)據(jù)結(jié)構(gòu)復(fù)用，增加一個(gè)屬性記錄頻率就行了卤橄，它的缺點(diǎn)也比較明顯就是頻率這個(gè)屬性會(huì)占用很大的空間绿满，但如果改用壓縮方式存儲(chǔ)頻率呢? 頻率占用空間肯定可以減少，但會(huì)引出另外一個(gè)問(wèn)題：怎么從壓縮后的數(shù)據(jù)里獲得對(duì)應(yīng)key的頻率呢?

TinyLFU的解決方案是類似位圖的方法窟扑，將key取hash值獲得它的位下標(biāo)喇颁，然后用這個(gè)下標(biāo)來(lái)找頻率漏健，但位圖只有0、1兩個(gè)值橘霎，那頻率明顯可能會(huì)非常大蔫浆，這要怎么處理呢? 另外使用位圖需要預(yù)占非常大的空間，這個(gè)問(wèn)題怎么解決呢?

TinyLFU根據(jù)最大數(shù)據(jù)量設(shè)置生成一個(gè)long數(shù)組姐叁，然后將頻率值保存在其中的四個(gè)long的4個(gè)bit位中（4個(gè)bit位不會(huì)大于15）瓦盛，取頻率值時(shí)則取四個(gè)中的最小一個(gè)。

image

Caffeine認(rèn)為頻率大于15已經(jīng)很高了外潜，是屬于熱數(shù)據(jù)原环，所以它只需要4個(gè)bit位來(lái)保存，long有8個(gè)字節(jié)64位处窥，這樣可以保存16個(gè)頻率嘱吗。取hash值的后左移兩位，然后加上hash四次滔驾，這樣可以利用到16個(gè)中的13個(gè)柜与，利用率挺高的，或許有更好的算法能將16個(gè)都利用到嵌灰。

  public void increment(@Nonnull E e) {
    if (isNotInitialized()) {
      return;
    }

    int hash = spread(e.hashCode());
    int start = (hash & 3) << 2;

    // Loop unrolling improves throughput by 5m ops/s
    int index0 = indexOf(hash, 0); //indexOf也是一種hash方法弄匕，不過(guò)會(huì)通過(guò)tableMask來(lái)限制范圍
    int index1 = indexOf(hash, 1);
    int index2 = indexOf(hash, 2);
    int index3 = indexOf(hash, 3);

    boolean added = incrementAt(index0, start);
    added |= incrementAt(index1, start + 1);
    added |= incrementAt(index2, start + 2);
    added |= incrementAt(index3, start + 3);

    //當(dāng)數(shù)據(jù)寫入次數(shù)達(dá)到數(shù)據(jù)長(zhǎng)度時(shí)就重置
    if (added && (++size == sampleSize)) {
      reset();
    }
  }

給對(duì)應(yīng)位置的bit位四位的Int值加1：

  boolean incrementAt(int i, int j) {
    int offset = j << 2;
    long mask = (0xfL << offset);
    //當(dāng)已達(dá)到15時(shí)，次數(shù)不再增加
    if ((table[i] & mask) != mask) {
      table[i] += (1L << offset);
      return true;
    }
    return false;
  }

獲得值的方法也是通過(guò)四次hash來(lái)獲得沽瞭，然后取最小值：

  public int frequency(@Nonnull E e) {
    if (isNotInitialized()) {
      return 0;
    }

    int hash = spread(e.hashCode());
    int start = (hash & 3) << 2;
    int frequency = Integer.MAX_VALUE;
    //四次hash
    for (int i = 0; i < 4; i++) {
      int index = indexOf(hash, i);
      //獲得bit位四位的Int值
      int count = (int) ((table[index] >>> ((start + i) << 2)) & 0xfL);
      //取最小值
      frequency = Math.min(frequency, count);
    }
    return frequency;
  }

當(dāng)數(shù)據(jù)寫入次數(shù)達(dá)到數(shù)據(jù)長(zhǎng)度時(shí)就會(huì)將次數(shù)減半迁匠，一些冷數(shù)據(jù)在這個(gè)過(guò)程中將歸0，這樣會(huì)使hash沖突降低：

  void reset() {
    int count = 0;
    for (int i = 0; i < table.length; i++) {
      count += Long.bitCount(table[i] & ONE_MASK);
      table[i] = (table[i] >>> 1) & RESET_MASK;
    }
    size = (size >>> 1) - (count >>> 2);
  }

關(guān)注個(gè)人微信公眾號(hào)：肌肉碼農(nóng)驹溃，回復(fù)“好友”討論問(wèn)題城丧。

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者

人面猴
序言：七十年代末，一起剝皮案震驚了整個(gè)濱河市豌鹤，隨后出現(xiàn)的幾起案子亡哄，更是在濱河造成了極大的恐慌，老刑警劉巖布疙，帶你破解...
沈念sama閱讀 211,639評(píng)論 6贊 492
死咒
序言：濱河連續(xù)發(fā)生了三起死亡事件蚊惯，死亡現(xiàn)場(chǎng)離奇詭異，居然都是意外死亡灵临，警方通過(guò)查閱死者的電腦和手機(jī)截型，發(fā)現(xiàn)死者居然都...
沈念sama閱讀 90,277評(píng)論 3贊 385
救了他兩次的神仙讓他今天三更去死
文/潘曉璐我一進(jìn)店門，熙熙樓的掌柜王于貴愁眉苦臉地迎上來(lái)儒溉，“玉大人宦焦，你說(shuō)我怎么就攤上這事。” “怎么了波闹？”我有些...
開封第一講書人閱讀 157,221評(píng)論 0贊 348
道士緝兇錄：失蹤的賣姜人
文/不壞的土叔我叫張陵酝豪，是天一觀的道長(zhǎng)。經(jīng)常有香客問(wèn)我精堕，道長(zhǎng)寓调，這世上最難降的妖魔是什么？我笑而不...
開封第一講書人閱讀 56,474評(píng)論 1贊 283
?港島之戀（遺憾婚禮）
正文為了忘掉前任锄码，我火速辦了婚禮夺英，結(jié)果婚禮上，老公的妹妹穿的比我還像新娘滋捶。我一直安慰自己痛悯，他們只是感情好，可當(dāng)我...
茶點(diǎn)故事閱讀 65,570評(píng)論 6贊 386
惡毒庶女頂嫁案：這布局不是一般人想出來(lái)的
文/花漫我一把揭開白布重窟。她就那樣靜靜地躺著载萌，像睡著了一般。火紅的嫁衣襯著肌膚如雪巡扇。梳的紋絲不亂的頭發(fā)上扭仁，一...
開封第一講書人閱讀 49,816評(píng)論 1贊 290
城市分裂傳說(shuō)
那天，我揣著相機(jī)與錄音厅翔，去河邊找鬼乖坠。笑死，一個(gè)胖子當(dāng)著我的面吹牛刀闷，可吹牛的內(nèi)容都是我干的熊泵。我是一名探鬼主播，決...
沈念sama閱讀 38,957評(píng)論 3贊 408
雙鴛鴦連環(huán)套：你想象不到人心有多黑
文/蒼蘭香墨我猛地睜開眼甸昏，長(zhǎng)吁一口氣：“原來(lái)是場(chǎng)噩夢(mèng)啊……” “哼顽分！你這毒婦竟也來(lái)了？” 一聲冷哼從身側(cè)響起施蜜，我...
開封第一講書人閱讀 37,718評(píng)論 0贊 266
萬(wàn)榮殺人案實(shí)錄
序言：老撾萬(wàn)榮一對(duì)情侶失蹤卒蘸，失蹤者是張志新（化名）和其女友劉穎，沒(méi)想到半個(gè)月后翻默，有當(dāng)?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體缸沃，經(jīng)...
沈念sama閱讀 44,176評(píng)論 1贊 303
?護(hù)林員之死
正文獨(dú)居荒郊野嶺守林人離奇死亡，尸身上長(zhǎng)有42處帶血的膿包…… 初始之章·張勛以下內(nèi)容為張勛視角年9月15日...
茶點(diǎn)故事閱讀 36,511評(píng)論 2贊 327
?白月光啟示錄
正文我和宋清朗相戀三年冰蘑，在試婚紗的時(shí)候發(fā)現(xiàn)自己被綠了和泌。大學(xué)時(shí)的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片。...
茶點(diǎn)故事閱讀 38,646評(píng)論 1贊 340
活死人
序言：一個(gè)原本活蹦亂跳的男人離奇死亡祠肥，死狀恐怖，靈堂內(nèi)的尸體忽然破棺而出，到底是詐尸還是另有隱情仇箱，我是刑警寧澤县恕，帶...
沈念sama閱讀 34,322評(píng)論 4贊 330
?日本核電站爆炸內(nèi)幕
正文年R本政府宣布，位于F島的核電站剂桥，受9級(jí)特大地震影響忠烛，放射性物質(zhì)發(fā)生泄漏。R本人自食惡果不足惜权逗，卻給世界環(huán)境...
茶點(diǎn)故事閱讀 39,934評(píng)論 3贊 313
男人毒藥：我在死后第九天來(lái)索命
文/蒙蒙一美尸、第九天我趴在偏房一處隱蔽的房頂上張望。院中可真熱鬧斟薇，春花似錦师坎、人聲如沸。這莊子的主人今日做“春日...
開封第一講書人閱讀 30,755評(píng)論 0贊 21
一樁弒父案胯陋，背后竟有這般陰謀
文/蒼蘭香墨我抬頭看了看天上的太陽(yáng)。三九已至袱箱，卻和暖如春遏乔，著一層夾襖步出監(jiān)牢的瞬間，已是汗流浹背发笔。一陣腳步聲響...
開封第一講書人閱讀 31,987評(píng)論 1贊 266
情欲美人皮
我被黑心中介騙來(lái)泰國(guó)打工盟萨，沒(méi)想到剛下飛機(jī)就差點(diǎn)兒被人妖公主榨干…… 1. 我叫王不留，地道東北人了讨。一個(gè)月前我還...
沈念sama閱讀 46,358評(píng)論 2贊 360
代替公主和親
正文我出身青樓鸯旁，卻偏偏與公主長(zhǎng)得像，于是被迫代替她去往敵國(guó)和親量蕊。傳聞我的和親對(duì)象是個(gè)殘疾皇子铺罢，可洞房花燭夜當(dāng)晚...
茶點(diǎn)故事閱讀 43,514評(píng)論 2贊 348

W-TinyLFU源碼分析（Caffeine）

TinyLFU

推薦閱讀更多精彩內(nèi)容