es 6.2.4版本
logstash跑了一陣子之后不再同步數(shù)據(jù)了,日志信息如下:
[2019-06-19T10:30:28,379][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"})
[2019-06-19T10:30:28,379][INFO ][logstash.outputs.elasticsearch] Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>125}
檢查elasticsearch日志如下:(日志沒貼全,應(yīng)該有超過flood_stage閾值的告警,因為logstash日志里已經(jīng)提示索引只讀了...可是日志被我刪掉了...)
[2019-06-19T10:30:26,746][WARN ][o.e.c.r.a.DiskThresholdMonitor] [node-1] high disk watermark [90%] exceeded on [sPv40vq_RKanFIUuBgJuUQ][node-2][/usr/elasticsearch/data/nodes/0] free: 657.4mb[8%], shards will be relocated away from this node
[2019-06-19T10:30:26,746][INFO ][o.e.c.r.a.DiskThresholdMonitor] [node-1] low disk watermark [85%] exceeded on [DJl3qtK5Twmpi-MGNuujog][node-3][/usr/elasticsearch/data/nodes/0] free: 1gb[13.4%], replicas will not be assigned to this node
[2019-06-19T10:30:56,752][WARN ][o.e.c.r.a.DiskThresholdMonitor] [node-1] high disk watermark [90%] exceeded on [sPv40vq_RKanFIUuBgJuUQ][node-2][/usr/elasticsearch/data/nodes/0] free: 657.4mb[8%], shards will be relocated away from this node
[2019-06-19T10:30:56,752][INFO ][o.e.c.r.a.DiskThresholdMonitor] [node-1] low disk watermark [85%] exceeded on [DJl3qtK5Twmpi-MGNuujog][node-3][/usr/elasticsearch/data/nodes/0] free: 1gb[13.4%], replicas will not be assigned to this node
[2019-06-19T10:30:56,752][INFO ][o.e.c.r.a.DiskThresholdMonitor] [node-1] rerouting shards: [high disk watermark exceeded on one or more nodes]
查看代碼org.elasticsearch.cluster.routing.allocation.DiskThresholdMonitor#onNewInfo
,結(jié)合官網(wǎng)文檔 Disk-based Shard Allocationedit,可以知道es會對磁盤空間進行監(jiān)控,當磁盤空間使用量達到一定的閾值就會做不同的處理直撤。
這里其實是磁盤剩余空間達到了floodstage閾值项秉,導致Elasticsearch對每個索引強制執(zhí)行只讀索引塊愧捕,所以logstash在做數(shù)據(jù)同步的時候就報錯了商叹。
部分源碼如下:
/**
 * Reacts to a fresh {@link ClusterInfo} sample: checks every node's least-available
 * disk usage against the low / high / flood-stage watermarks, logs warnings,
 * triggers a shard reroute when a node crosses (or drops back under) the
 * high/low watermarks, and marks every index hosted on a flood-stage node
 * as read-only via {@code markIndicesReadOnly}.
 */
public void onNewInfo(ClusterInfo info) {
ImmutableOpenMap<String, DiskUsage> usages = info.getNodeLeastAvailableDiskUsages();
if (usages != null) {
boolean reroute = false;
String explanation = "";
// Garbage collect nodes that have been removed from the cluster
// from the map that tracks watermark crossing
ObjectLookupContainer<String> nodes = usages.keys();
for (String node : nodeHasPassedWatermark) {
if (nodes.contains(node) == false) {
nodeHasPassedWatermark.remove(node);
}
}
ClusterState state = clusterStateSupplier.get();
Set<String> indicesToMarkReadOnly = new HashSet<>();
for (ObjectObjectCursor<String, DiskUsage> entry : usages) {
String node = entry.key;
DiskUsage usage = entry.value;
// Emit a warn/info log line if this node's usage has crossed any watermark
warnAboutDiskIfNeeded(usage);
// Flood-stage watermark exceeded: collect every index that has a shard on
// this node so it can be marked read-only below
if (usage.getFreeBytes() < diskThresholdSettings.getFreeBytesThresholdFloodStage().getBytes() ||
usage.getFreeDiskAsPercentage() < diskThresholdSettings.getFreeDiskThresholdFloodStage()) {
RoutingNode routingNode = state.getRoutingNodes().node(node);
if (routingNode != null) { // this might happen if we haven't got the full cluster-state yet?!
for (ShardRouting routing : routingNode) {
indicesToMarkReadOnly.add(routing.index().getName());
}
}
}
// High watermark exceeded: request a reroute (to move shards off this node)
// unless a reroute already ran within the configured reroute interval
else if (usage.getFreeBytes() < diskThresholdSettings.getFreeBytesThresholdHigh().getBytes() ||
usage.getFreeDiskAsPercentage() < diskThresholdSettings.getFreeDiskThresholdHigh()) {
if ((System.nanoTime() - lastRunNS) > diskThresholdSettings.getRerouteInterval().nanos()) {
lastRunNS = System.nanoTime();
reroute = true;
explanation = "high disk watermark exceeded on one or more nodes";
} else {
logger.debug("high disk watermark exceeded on {} but an automatic reroute has occurred " +
"in the last [{}], skipping reroute",
node, diskThresholdSettings.getRerouteInterval());
}
nodeHasPassedWatermark.add(node);
}
// Low watermark exceeded: only remember that this node has passed a watermark
else if (usage.getFreeBytes() < diskThresholdSettings.getFreeBytesThresholdLow().getBytes() ||
usage.getFreeDiskAsPercentage() < diskThresholdSettings.getFreeDiskThresholdLow()) {
nodeHasPassedWatermark.add(node);
}
// Below every watermark: if this node was previously over the high or low
// watermark, reroute so any unassigned shards can now be allocated
else {
if (nodeHasPassedWatermark.contains(node)) {
// The node has previously been over the high or
// low watermark, but is no longer, so we should
// reroute so any unassigned shards can be allocated
// if they are able to be
if ((System.nanoTime() - lastRunNS) > diskThresholdSettings.getRerouteInterval().nanos()) {
lastRunNS = System.nanoTime();
reroute = true;
explanation = "one or more nodes has gone under the high or low watermark";
nodeHasPassedWatermark.remove(node);
} else {
logger.debug("{} has gone below a disk threshold, but an automatic reroute has occurred " +
"in the last [{}], skipping reroute",
node, diskThresholdSettings.getRerouteInterval());
}
}
}
}
if (reroute) {
logger.info("rerouting shards: [{}]", explanation);
reroute();
}
// Drop indices that already carry a write block so the block is not re-submitted
indicesToMarkReadOnly.removeIf(index -> state.getBlocks().indexBlocked(ClusterBlockLevel.WRITE, index));
if (indicesToMarkReadOnly.isEmpty() == false) {
markIndicesReadOnly(indicesToMarkReadOnly);
}
}
}
/**
 * Logs a message when the given node has crossed one of the disk watermarks.
 * Absolute byte thresholds and percentage thresholds are evaluated independently;
 * within each group only the most severe crossed watermark is reported
 * (flood stage > high > low). Flood-stage and high crossings are logged at WARN,
 * low crossings at INFO.
 *
 * Fix: the flood-stage messages previously read "will marked read-only"
 * (missing "be") in both branches.
 *
 * @param usage the latest disk usage snapshot for a single node
 */
private void warnAboutDiskIfNeeded(DiskUsage usage) {
    final long freeBytes = usage.getFreeBytes();
    final double freePercent = usage.getFreeDiskAsPercentage();

    // Check absolute disk values (watermarks configured as byte sizes)
    // free bytes < cluster.routing.allocation.disk.watermark.flood_stage
    if (freeBytes < diskThresholdSettings.getFreeBytesThresholdFloodStage().getBytes()) {
        logger.warn("flood stage disk watermark [{}] exceeded on {}, all indices on this node will be marked read-only",
            diskThresholdSettings.getFreeBytesThresholdFloodStage(), usage);
    }
    // free bytes < cluster.routing.allocation.disk.watermark.high
    else if (freeBytes < diskThresholdSettings.getFreeBytesThresholdHigh().getBytes()) {
        logger.warn("high disk watermark [{}] exceeded on {}, shards will be relocated away from this node",
            diskThresholdSettings.getFreeBytesThresholdHigh(), usage);
    }
    // free bytes < cluster.routing.allocation.disk.watermark.low
    else if (freeBytes < diskThresholdSettings.getFreeBytesThresholdLow().getBytes()) {
        logger.info("low disk watermark [{}] exceeded on {}, replicas will not be assigned to this node",
            diskThresholdSettings.getFreeBytesThresholdLow(), usage);
    }

    // Check percentage disk values (watermarks configured as used-percentage;
    // the settings object stores the complementary free-percentage, hence 100 - x in the log)
    // free % < 100 - flood_stage (e.g. 5 when flood_stage is 95%)
    if (freePercent < diskThresholdSettings.getFreeDiskThresholdFloodStage()) {
        logger.warn("flood stage disk watermark [{}] exceeded on {}, all indices on this node will be marked read-only",
            Strings.format1Decimals(100.0 - diskThresholdSettings.getFreeDiskThresholdFloodStage(), "%"), usage);
    }
    // free % < 100 - high (e.g. 10 when high is 90%)
    else if (freePercent < diskThresholdSettings.getFreeDiskThresholdHigh()) {
        logger.warn("high disk watermark [{}] exceeded on {}, shards will be relocated away from this node",
            Strings.format1Decimals(100.0 - diskThresholdSettings.getFreeDiskThresholdHigh(), "%"), usage);
    }
    // free % < 100 - low (e.g. 15 when low is 85%)
    else if (freePercent < diskThresholdSettings.getFreeDiskThresholdLow()) {
        logger.info("low disk watermark [{}] exceeded on {}, replicas will not be assigned to this node",
            Strings.format1Decimals(100.0 - diskThresholdSettings.getFreeDiskThresholdLow(), "%"), usage);
    }
}
這里涉及幾個配置:
-
cluster.routing.allocation.disk.threshold_enabled
是否開啟基于磁盤的分片分配,默認true
-
cluster.routing.allocation.disk.watermark.low
控制磁盤空間使用的低水位線,默認85%。
es不會再將分片分配給磁盤使用超過這個配置的節點。
這個設置不會影響新創建的索引的主分片,或者是之前從未分配過的任何分片。
-
cluster.routing.allocation.disk.watermark.high
控制磁盤空間使用的高水位線,默認90%。
es會將磁盤使用超過這個配置的節點中的分片重新進行分配。
這個設置將影響所有分片的分配,不管分片之前是否已經被分配過。
-
cluster.routing.allocation.disk.watermark.flood_stage
控制磁盤空間使用的洪水水位線,默認95%。
es會將磁盤使用超過這個配置的節點中的所有索引都標記為只讀。
這是防止節點耗盡磁盤空間的最後手段。一旦有足夠的磁盤空間允許繼續索引操作,需要手動釋放只讀索引塊。
-
cluster.routing.allocation.disk.include_relocations
當計算一個節(jié)點的剩余磁盤空間時,是否考慮正在重新分配到當前節(jié)點的分片容量悴品,默認true
禀综。
這可能導致錯誤的高估一個磁盤的使用率。因為分片重分配可能已經(jīng)完成了90%苔严,檢索到的磁盤使用率包含了這個重新分配的分片總大小以及這已經(jīng)分配了的90%進度的大小定枷。
-
cluster.routing.allocation.disk.reroute_interval
分片重分配間隔,默認60秒
-
cluster.info.update.interval
磁盤使用率檢查間隔,默認30秒
關(guān)于配置的幾點說明:
- 上面幾個配置要么都設置為百分比,要么都設置為具體的字節值,不能混用。
- 可以通過在配置文件
elasticsearch.yml
中配置,也可以通過 cluster-update-settings API 在實時群集上動態更新。直接參考官網文檔即可。
測試
-- 添加文檔衣吠,自動創(chuàng)建索引
curl http://172.16.22.51:9200/idx_luoluocaihong/_doc/1 -X PUT -H 'Content-Type:application/json' -d '{"user":"luoluocaihong","age":"20"}'
{"_index":"idx_luoluocaihong","_type":"_doc","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}
-- 設(shè)置只讀索引塊
curl http://172.16.22.51:9200/idx_luoluocaihong/_settings -X PUT -H 'Content-Type:application/json' -d '{"index.blocks.read_only_allow_delete": true}'
{"acknowledged":true}
-- 查看索引的設(shè)置
curl http://172.16.22.51:9200/idx_luoluocaihong/_settings
{"idx_luoluocaihong":{"settings":{"index":{"number_of_shards":"5","blocks":{"read_only_allow_delete":"true"},"provided_name":"idx_luoluocaihong","creation_date":"1561107195032","number_of_replicas":"1","uuid":"3iS68s1nQMudxhyL-zNnRg","version":{"created":"6020499"}}}}}
-- 添加文檔
curl http://172.16.22.51:9200/idx_luoluocaihong/_doc/2 -X PUT -H 'Content-Type:application/json' -d '{"user":"user_2","age":"20"}'
{"error":{"root_cause":[{"type":"cluster_block_exception","reason":"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"}],"type":"cluster_block_exception","reason":"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"},"status":403}
-- 重置只讀索引塊
curl http://172.16.22.51:9200/idx_luoluocaihong/_settings -X PUT -H 'Content-Type:application/json' -d '{"index.blocks.read_only_allow_delete": null}'
{"acknowledged":true}
-- 查看索引的設(shè)置
curl http://172.16.22.51:9200/idx_luoluocaihong/_settings
{"idx_luoluocaihong":{"settings":{"index":{"creation_date":"1561107195032","number_of_shards":"5","number_of_replicas":"1","uuid":"3iS68s1nQMudxhyL-zNnRg","version":{"created":"6020499"},"provided_name":"idx_luoluocaihong"}}}}
-- 添加文檔
curl http://172.16.22.51:9200/idx_luoluocaihong/_doc/2 -X PUT -H 'Content-Type:application/json' -d '{"user":"user_2","age":"20"}'
{"_index":"idx_luoluocaihong","_type":"_doc","_id":"2","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":0,"_primary_term":1}
-- 搜索
curl http://172.16.22.51:9200/idx_luoluocaihong/_search?q=age:20
{"took":19,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":2,"max_score":0.2876821,"hits":[{"_index":"idx_luoluocaihong","_type":"_doc","_id":"2","_score":0.2876821,"_source":{"user":"user_2","age":"20"}},{"_index":"idx_luoluocaihong","_type":"_doc","_id":"1","_score":0.2876821,"_source":{"user":"luoluocaihong","age":"20"}}]}}
-- 查看集群設(shè)置
curl 172.16.22.51:9200/_cluster/settings
-- 修改集群設(shè)置
curl 172.16.22.51:9200/_cluster/settings -X PUT -H 'Content-Type:application/json' -d '{"transient":{"cluster.routing.allocation.disk.watermark.low":"80%","cluster.routing.allocation.disk.watermark.high":"85%","cluster.routing.allocation.disk.watermark.flood_stage":"90%"}}'
{"acknowledged":true,"persistent":{},"transient":{"cluster":{"routing":{"allocation":{"disk":{"watermark":{"low":"80%","flood_stage":"90%","high":"85%"}}}}}}}
-- 檢查索引狀況
curl http://172.16.22.51:9200/_cat/indices
green open idx_luoluocaihong 3iS68s1nQMudxhyL-zNnRg 5 1 2 0 17.4kb 8.7kb
-- 檢查es集群健康狀況
curl 172.16.22.51:9200/_cluster/health?pretty
{
"cluster_name" : "iot-es",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 15,
"active_shards" : 30,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
-- 刪除索引
curl http://172.16.22.51:9200/idx_luoluocaihong -X DELETE