Background
Elasticsearch's replica mechanism provides reliability: the cluster can lose individual nodes without its external service being affected. Replicas do not protect against catastrophic failure, however, so the cluster data needs a complete backup that allows a fast recovery when such a failure happens. Elasticsearch officially provides Snapshot/Restore for this, with repository plugins including the Azure Repository Plugin, S3 Repository Plugin, Hadoop HDFS Repository Plugin, and Google Cloud Storage Repository Plugin. Here I use the Hadoop HDFS Repository plugin to back up the data in ES to HDFS.
-
Notes
This article is based on Elasticsearch-5.6.0 and hadoop-2.6.0-cdh5.7.0; the plugin used is repository-hdfs-5.6.0.zip. Official documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/5.6/modules-snapshots.html
https://www.elastic.co/guide/en/elasticsearch/plugins/5.6/repository-hdfs.html
ES cluster snapshots have version compatibility constraints, so please note:
A snapshot of an index created in 5.x can be restored to 6.x.
A snapshot of an index created in 2.x can be restored to 5.x.
A snapshot of an index created in 1.x can be restored to 2.x.
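In my case the data was backed up from 5.6.0 and restored to 6.3.2, which is within the supported 5.x to 6.x path, so there was no compatibility problem.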
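The compatibility rule quoted above can be sketched as a small pre-flight check. The function name `can_restore` is hypothetical and not part of Elasticsearch; it only compares major versions, so it does not catch the case of restoring into an older release within the same major version.

```shell
# Hypothetical helper (not an Elasticsearch command): succeeds when a snapshot
# taken on version $1 may be restored on version $2, judged by major version only.
can_restore() {
  from_major="${1%%.*}"   # "5.6.0" -> "5"
  to_major="${2%%.*}"     # "6.3.2" -> "6"
  # Allowed: same major version, or exactly one major version newer.
  [ "$to_major" -eq "$from_major" ] || [ "$to_major" -eq $((from_major + 1)) ]
}

if can_restore 5.6.0 6.3.2; then
  echo "restore supported"
else
  echo "restore not supported"
fi
```

For the versions used in this article the check prints `restore supported`; a 2.x snapshot would first have to go through a 5.x cluster before reaching 6.x.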
-
Steps
1. Install the plugin
Install the repository-hdfs plugin on every node of the cluster.
Online installation: sudo bin/elasticsearch-plugin install repository-hdfs
Offline installation:
First download it: wget https://artifacts.elastic.co/downloads/elasticsearch-plugins/repository-hdfs/repository-hdfs-5.6.0.zip
Then install it: bin/elasticsearch-plugin install file:///data/elastic/repository-hdfs-5.6.0.zip
2. Create the repository and register it with ES
curl -X PUT "172.16.221.105:9400/_snapshot/es_hdfs_repository" -H 'Content-Type: application/json' -d'
{
"type": "hdfs",
"settings": {
"uri": "hdfs://golive-master:8020/",
"path": "elasticsearch/respositories/es_hdfs_repository",
"conf.dfs.client.read.shortcircuit": "true",
"conf.dfs.domain.socket.path": "/var/lib/hadoop-hdfs/dn_socket"
}
}
'
While creating the repository I ran into a Permission denied error. As a temporary workaround I turned off HDFS permission checking by adding the following to hdfs-site.xml on every Hadoop node:
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
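After restarting HDFS, running the create-repository command above again succeeded, and the repository directory then appears in HDFS.
The repository can be inspected with the following command: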
curl -X GET "172.16.221.104:9400/_snapshot/es_hdfs_repository"
The response looks like this:
{
"es_hdfs_repository": {
"type": "hdfs",
"settings": {
"path": "elasticsearch/respositories/es_hdfs_repository",
"uri": "hdfs://golive-master:8020/",
"conf": {
"dfs": {
"client": {
"read": {
"shortcircuit": "true"
}
},
"domain": {
"socket": {
"path": "/var/lib/hadoop-hdfs/dn_socket"
}
}
}
}
}
}
}
3. Create a snapshot
Create a snapshot of all indices:
curl -X PUT "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_1?wait_for_completion=true" -H 'Content-Type: application/json' -d'
{
"indices": "*"
}
'
Usually you will want a snapshot to run as a background process, but sometimes you will want a script to wait until it has finished. Adding the wait_for_completion flag (wait_for_completion=true) does this: the call blocks until the snapshot is complete. Note that a large snapshot can take a long time to return.
https://www.elastic.co/guide/cn/elasticsearch/guide/current/backing-up-your-cluster.html
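When the snapshot is started without wait_for_completion, a script can poll the snapshot's state instead of blocking on one long HTTP call. The following is a minimal sketch under a few assumptions: the helper names `snapshot_state` and `wait_for_snapshot` and the 10-second interval are my own, the host defaults to the address used in this article, and the state is pulled out of the JSON with grep and sed rather than a real JSON parser.

```shell
# Address of any node in the cluster; the default is the one used in this article.
ES_HOST="${ES_HOST:-172.16.221.105:9400}"

# Read a GET _snapshot/<repo>/<snapshot> response on stdin and print the
# first "state" value found in it (for example SUCCESS or IN_PROGRESS).
snapshot_state() {
  grep -o '"state"[[:space:]]*:[[:space:]]*"[A-Z_]*"' \
    | head -n 1 \
    | sed 's/.*"\([A-Z_]*\)"$/\1/'
}

# Poll until the snapshot leaves IN_PROGRESS; return 0 only on SUCCESS.
wait_for_snapshot() {
  repo="$1"
  snap="$2"
  while :; do
    state="$(curl -s -X GET "$ES_HOST/_snapshot/$repo/$snap" | snapshot_state)"
    case "$state" in
      IN_PROGRESS) sleep 10 ;;                      # assumed polling interval
      SUCCESS)     echo "$state"; return 0 ;;
      *)           echo "${state:-UNKNOWN}"; return 1 ;;
    esac
  done
}

# Example (needs a reachable cluster, so it is left commented out):
# curl -X PUT "$ES_HOST/_snapshot/es_hdfs_repository/snapshot_1"
# wait_for_snapshot es_hdfs_repository snapshot_1
```

Polling keeps each request short, which can matter when a proxy or load balancer in front of the cluster would cut off a long-running blocking call.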
4. Restore the snapshot
curl -X POST "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_1/_restore"
As with snapshots, the restore command returns immediately and the restore proceeds in the background. If you would rather have the HTTP call block until the restore is finished, add the wait_for_completion flag:
curl -X POST "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_1/_restore?wait_for_completion=true"
I restored into a new cluster (running 6.3.2). Because the HDFS repository had not been registered in that cluster, the restore failed with an error saying the repository could not be found. I registered it with the same create-repository command, after which the restore command worked. The official documentation says this about it:
All that is required is registering the repository containing the snapshot in the new cluster and starting the restore process.
English documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html
Chinese documentation: https://www.elastic.co/guide/cn/elasticsearch/guide/current/_restoring_from_a_snapshot.html
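A restore does not have to cover every index in the snapshot: the restore request accepts an "indices" body, as in the commands listed in the appendix. The sketch below wraps that in two helpers whose names (`restore_body`, `restore_indices`) are hypothetical. Elasticsearch only restores over an existing index if that index is closed, so on a cluster that already holds indices with the same names they would need a `_close` call first.

```shell
# Address of any node in the target cluster; the default is the one used in this article.
ES_HOST="${ES_HOST:-172.16.221.105:9400}"

# Print the JSON body for a restore limited to the given comma-separated
# index names or wildcard patterns.
restore_body() {
  printf '{\n  "indices": "%s"\n}\n' "$1"
}

# Restore only the matching indices from <repo>/<snapshot>.
# Usage: restore_indices <repo> <snapshot> <index patterns>
restore_indices() {
  curl -s -X POST "$ES_HOST/_snapshot/$1/$2/_restore" \
    -H 'Content-Type: application/json' \
    -d "$(restore_body "$3")"
}

# Show the body that would be sent for the patterns used in the appendix.
restore_body 'a*,l*,m*,u*,i*'

# Example (needs a reachable cluster, so it is left commented out):
# restore_indices es_hdfs_repository snapshot_2 'a*,l*,m*,u*,i*'
```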
5. Get snapshot information and status
To get the complete list of snapshots in a repository, use the _all placeholder in place of a specific snapshot name:
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/_all"
Get the details of a single snapshot:
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_2"
Get more detailed status information for a snapshot:
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_2/_status"
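The _all response is a fairly large JSON document, so a script usually needs only the snapshot names from it. Here is a rough sketch that extracts them with grep and sed; the helper name `snapshot_names` is my own, and a proper JSON parser would be more robust than this pattern match.

```shell
# Read a GET _snapshot/<repo>/_all response on stdin and print one snapshot
# name per line. It matches the "snapshot" key only, not the enclosing "snapshots".
snapshot_names() {
  grep -o '"snapshot"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed 's/.*"\([^"]*\)"$/\1/'
}

# Example (needs a reachable cluster, so it is left commented out):
# curl -s -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/_all" | snapshot_names
```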
Official documentation:
https://www.elastic.co/guide/cn/elasticsearch/guide/current/backing-up-your-cluster.html
https://www.elastic.co/guide/en/elasticsearch/reference/5.6/modules-snapshots.html
-
Appendix:
These are the commands I used when backing up and restoring the data:
wget https://artifacts.elastic.co/downloads/elasticsearch-plugins/repository-hdfs/repository-hdfs-5.6.0.zip
elasticsearch-5.6.0/bin/elasticsearch-plugin install file:///data/elastic/repository-hdfs-5.6.0.zip
curl 172.16.221.104:9400/_cat/indices?v
curl 172.16.221.104:9400/_cat/master?v
curl 172.16.221.104:9400/_cat/master?help
curl -X PUT "172.16.221.105:9400/_snapshot/es_hdfs_repository" -H 'Content-Type: application/json' -d'
{
"type": "hdfs",
"settings": {
"uri": "hdfs://golive-master:8020/",
"path": "elasticsearch/respositories/es_hdfs_repository",
"conf.dfs.client.read.shortcircuit": "true",
"conf.dfs.domain.socket.path": "/var/lib/hadoop-hdfs/dn_socket"
}
}
'
curl -X PUT "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_1?wait_for_completion=true" -H 'Content-Type: application/json' -d'
{
"indices": "*"
}
'
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository"
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_1/_status"
./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.3.2/elasticsearch-analysis-ik-6.3.2.zip
[https://artifacts.elastic.co/downloads/elasticsearch-plugins/repository-hdfs/repository-hdfs-6.3.2.zip](https://artifacts.elastic.co/downloads/elasticsearch-plugins/repository-hdfs/repository-hdfs-6.3.2.zip)
bin/elasticsearch-plugin install file:///data/elastic/repository-hdfs-6.3.2.zip
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_2/_status"
curl 172.16.221.105:9400/_cat/master
curl 172.16.221.12:9400/_cat/nodes
curl -X POST "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_1/_restore"
curl -X POST "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_2/_restore" -H 'Content-Type: application/json' -d'
{
"indices": "a*,l*,m*,u*,i*"
}
'
[https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.3.2.tar.gz](https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.3.2.tar.gz)
curl -X DELETE "172.16.221.105:9400/.kibana-6"
curl -X GET "172.16.221.105:9400/_cat/indices"
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_2/_status"
curl -X POST "172.16.221.105:9400/a*,l*,m*,u*,i*/_close"
curl -X POST "172.16.221.105:9400/a*,l*,m*,u*,i*/_open"
curl -X GET http://172.16.221.105:9400/ad_base?pretty
curl -X GET http://172.16.221.105:9400/_cluster/health?pretty