0. References
Elasticsearch Reference[5.2.2] Snapshot And Restore
HDFS Repository Plugin
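1. Overview
I recently used the es snapshot/restore feature at work to synchronize data across clusters, and this document records the steps.
Note: this document applies only to es 5.2.2 and hadoop 2.7.0. For other versions, refer to the official es documentation.
2. Environment setup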
2.1. elasticsearch-5.2.2
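2.1.1. Installing elasticsearch
Elasticsearch 5.2.2 official download page
Unpack the package and add the following settings to config/elasticsearch.yml so that es can be inspected with head (elasticsearch-head). Leave all other settings unchanged.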
http.cors.enabled: true
http.cors.allow-origin: "*"
- Start es
./bin/elasticsearch -d -p pid # run in the background and write the process ID to a file named pid
- Verify
$ curl localhost:9200
{
"name" : "ODQxF0o",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "eBH2lQZGQnKa5ssUXzUEig",
"version" : {
"number" : "5.2.2",
"build_hash" : "f9d9b74",
"build_date" : "2017-02-24T17:26:45.835Z",
"build_snapshot" : false,
"lucene_version" : "6.4.1"
},
"tagline" : "You Know, for Search"
}
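2.1.2. Installing the repository-hdfs plugin
To use HDFS as the snapshot storage medium, the repository-hdfs plugin must be installed separately (plugin download page).
- Install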
$ ./bin/elasticsearch-plugin install file:///path/to/repository-hdfs-5.2.2.zip
- Verify
List the installed plugins
$ ./bin/elasticsearch-plugin list
2.1.3. Creating test data
To make later verification possible, create an index in es and add some data.
#!/bin/bash
# Create an index named test-index
curl -XPUT 'http://localhost:9200/test-index/'
# Insert documents one request at a time; this is slow, so the bulk API is recommended for large volumes
for i in {1..1001}
do
curl -XPOST 'http://localhost:9200/test-index/doc' -d '{"name":"tom"}'
done
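The bulk alternative mentioned in the script comment can be sketched roughly as follows. This is a minimal illustration, not part of the original procedure: the helper name `make_bulk_body` and the file `/tmp/bulk.ndjson` are assumptions made for the example. It relies only on the bulk request format, in which every document takes one action line followed by one source line.

```shell
# Print a bulk request body that indexes COUNT copies of the sample document.
# Each document takes two lines: an action line and a source line.
# make_bulk_body is a hypothetical helper name chosen for this sketch.
make_bulk_body() {
  count=$1
  i=0
  while [ "$i" -lt "$count" ]; do
    printf '%s\n' '{"index":{}}'
    printf '%s\n' '{"name":"tom"}'
    i=$((i + 1))
  done
}

# Possible usage (needs the es node from section 2.1.1 to be running;
# /tmp/bulk.ndjson is an arbitrary file name):
#   make_bulk_body 1001 > /tmp/bulk.ndjson
#   curl -XPOST 'http://localhost:9200/test-index/doc/_bulk' --data-binary @/tmp/bulk.ndjson
```

Sending the body with `--data-binary` rather than `-d` matters here, because `-d` strips the newlines that separate the bulk lines.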
2.2. hadoop-2.7.0
2.2.1. Download
2.2.2. Installation
For simplicity, this example uses a pseudo-distributed installation. Unpack the package and edit the following configuration files:
- etc/hadoop/hadoop-env.sh
# Set the Java path
export JAVA_HOME=/path/to/java
- etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
- etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/tmp/hadoop/2.7.0/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/tmp/hadoop/2.7.0/data</value>
</property>
</configuration>
2.2.3. Configuring passwordless ssh
If ssh localhost cannot log in to the local machine, run the following:
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
2.2.4. Formatting the file system
$ ./bin/hdfs namenode -format
2.2.5. Starting HDFS
- Start
$ ./sbin/start-dfs.sh
- Verify
- Use the jps command to check that the relevant processes exist
- Open the web UI in a browser at http://localhost:50070 (50070 is the default port in hadoop 2)
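The jps check can also be scripted. The sketch below assumes the pseudo-distributed setup from section 2.2.2, where start-dfs.sh launches a NameNode, a DataNode, and a SecondaryNameNode. The function name `check_hdfs_daemons` is made up for this example.

```shell
# Read `jps` output on stdin and report any HDFS daemon that is missing.
# check_hdfs_daemons is a hypothetical helper name chosen for this sketch.
check_hdfs_daemons() {
  jps_output=$(cat)
  missing=""
  for daemon in NameNode DataNode SecondaryNameNode; do
    # jps prints "<pid> <main class>", so match " <class>" at the end of a line;
    # the leading space keeps NameNode from matching SecondaryNameNode.
    if ! printf '%s\n' "$jps_output" | grep -q " ${daemon}\$"; then
      missing="$missing $daemon"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "all HDFS daemons are running"
}

# Possible usage:
#   jps | check_hdfs_daemons
```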
3. Registering the repository
curl -XPUT 'http://localhost:9200/_snapshot/hdfs_repo' -d \
'{
"type": "hdfs",
"settings": {
"uri": "hdfs://localhost:9000",
"path": "es/hdfs_repo",
"max_restore_bytes_per_sec":"1mb",
"max_snapshot_bytes_per_sec":"1mb"
}
}'
A response like the following indicates success:
{
"acknowledged": true
}
- Check the HDFS directory
Check that the repository directory has been created in HDFS
$ ./bin/hdfs dfs -ls -R /
20/04/29 21:04:45 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
drwxr-xr-x - username supergroup 0 2020-04-29 21:03 /user
drwxr-xr-x - username supergroup 0 2020-04-29 21:03 /user/username
drwxr-xr-x - username supergroup 0 2020-04-29 21:03 /user/username/es
drwxr-xr-x - username supergroup 0 2020-04-29 21:03 /user/username/es/hdfs_repo
- View the repository information
$ curl 'http://localhost:9200/_snapshot/_all?pretty'
{
"hdfs_repo" : {
"type" : "hdfs",
"settings" : {
"path" : "es/hdfs_repo",
"max_restore_bytes_per_sec" : "1mb",
"uri" : "hdfs://localhost:9000",
"max_snapshot_bytes_per_sec" : "1mb"
}
}
}
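- Configuration notes
The settings shown in the example are the most commonly used ones for an HDFS repository: uri and path are required, and the other two are throttling parameters that are also indispensable in production.
For a detailed description of these and other parameters, see the HDFS Repository Plugin documentation listed in section 0.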
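Since the throttling values usually differ between environments, the registration body can be generated rather than typed by hand. The following is a rough sketch: the helper name `hdfs_repo_body` and the choice to use one rate for both limits are assumptions made for illustration, while the setting names are the four used in the example above.

```shell
# Print the registration body for an HDFS repository.
# Arguments: HDFS uri, repository path, throttle rate.
# hdfs_repo_body is a hypothetical helper; applying a single rate to both the
# snapshot and the restore limit is a simplification made for this sketch.
hdfs_repo_body() {
  repo_uri=$1
  repo_path=$2
  rate=$3
  printf '{"type":"hdfs","settings":{"uri":"%s","path":"%s","max_restore_bytes_per_sec":"%s","max_snapshot_bytes_per_sec":"%s"}}\n' \
    "$repo_uri" "$repo_path" "$rate" "$rate"
}

# Possible usage, reproducing the values from the example above:
#   curl -XPUT 'http://localhost:9200/_snapshot/hdfs_repo' \
#     -d "$(hdfs_repo_body hdfs://localhost:9000 es/hdfs_repo 1mb)"
```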
4. Creating a snapshot
curl -XPUT 'http://localhost:9200/_snapshot/hdfs_repo/snapshot_1?wait_for_completion=false' -d \
'{
"ignore_unavailable": true,
"include_global_state": false,
"partial": true
}'
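- Parameters
Parameter | Description |
---|---|
wait_for_completion | Whether the request returns immediately or waits until the snapshot has been created; set it to false when there is a lot of data |
ignore_unavailable | Ignore indices that do not exist when the snapshot is created |
partial | By default, the backup of an index fails if the index has unavailable shards; setting this to true backs up the shards that are available |
... | ... |
For more parameters, see the official documentation.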
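With wait_for_completion=false the request returns before the snapshot is finished, so a script has to poll for completion itself. The sketch below is one possible way to do that: it reads the state field from the snapshot info response and loops while it is IN_PROGRESS. The function names and the 5 second interval are assumptions made for this example, and the sed-based extraction assumes the response describes a single snapshot.

```shell
# Extract the "state" value from a snapshot info response read on stdin.
# Assumes the response describes a single snapshot.
snapshot_state() {
  tr -d ' \n' | sed -n 's/.*"state":"\([A-Z_]*\)".*/\1/p'
}

# Poll the given snapshot URL until the state is no longer IN_PROGRESS,
# then print the final state (empty if the request failed).
# wait_for_snapshot and the 5 second interval are choices made for this sketch.
wait_for_snapshot() {
  url=$1
  while :; do
    state=$(curl -s "$url" | snapshot_state)
    [ "$state" = "IN_PROGRESS" ] || break
    sleep 5
  done
  echo "$state"
}

# Possible usage:
#   wait_for_snapshot 'http://localhost:9200/_snapshot/hdfs_repo/snapshot_1'
```

A final state of SUCCESS means every shard was stored; PARTIAL or FAILED means the failures list in the response should be inspected.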
4.1. access_control_exception
At this point, creating a snapshot fails with the following error:
{
"error": {
"root_cause": [
{
"type": "repository_exception",
"reason": "[hdfs_repo] could not read repository data from index blob"
}
],
"type": "repository_exception",
"reason": "[hdfs_repo] could not read repository data from index blob",
"caused_by": {
"type": "i_o_exception",
"reason": "com.google.protobuf.ServiceException: java.security.AccessControlException: access denied (\"javax.security.auth.PrivateCredentialPermission\" \"org.apache.hadoop.security.Credentials\" \"read\")",
"caused_by": {
"type": "service_exception",
"reason": "java.security.AccessControlException: access denied (\"javax.security.auth.PrivateCredentialPermission\" \"org.apache.hadoop.security.Credentials\" \"read\")",
"caused_by": {
"type": "access_control_exception",
"reason": "access denied (\"javax.security.auth.PrivateCredentialPermission\" \"org.apache.hadoop.security.Credentials\" \"read\")"
}
}
}
},
"status": 500
}
- Solution
Add the following to the Java security policy file of the repository-hdfs plugin:
permission javax.security.auth.PrivateCredentialPermission "org.apache.hadoop.security.Credentials * \"*\"", "read";
Then point to this policy file in the es JVM options file config/jvm.options:
-Djava.security.policy=file:///path/to/plugins/repository-hdfs/plugin-security.policy
4.2. Viewing snapshot information
curl 'http://localhost:9200/_snapshot/hdfs_repo/snapshot_1'
curl 'http://localhost:9200/_snapshot/hdfs_repo/_all'
curl 'http://localhost:9200/_cat/snapshots/hdfs_repo?v'
Note: hdfs_repo is the repository registered earlier and snapshot_1 is the snapshot created earlier. The same names are used below.
4.3. Deleting and stopping a snapshot
curl -XDELETE 'http://localhost:9200/_snapshot/hdfs_repo/snapshot_1'
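5. Restoring a snapshot
This example restores into the same cluster. To restore into a different cluster, install the repository-hdfs plugin on the target cluster and register a repository there that points to the same location as the one on the source cluster.
5.1. Restore
To verify the restore, first delete the original index
- Delete the index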
curl -XDELETE 'http://localhost:9200/test-index'
- Restore the data
curl -XPOST 'localhost:9200/_snapshot/hdfs_repo/snapshot_1/_restore?wait_for_completion=false' -d \
'{
"ignore_unavailable": true,
"include_global_state": false,
"partial": true,
"index_settings":{
"index.number_of_replicas":0
}
}'
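- Parameters
index_settings overrides index settings during the restore, for example the number of replicas. Some settings, such as index.number_of_shards, cannot be changed during a restore. The other parameters have the same meaning as when creating a snapshot.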
5.2. Checking restore progress
curl 'http://localhost:9200/_snapshot/hdfs_repo/snapshot_1' # detailed, but slower
curl 'http://localhost:9200/_snapshot/hdfs_repo/snapshot_1/_status' # faster, but less information
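Once the restore has finished, one simple sanity check is to compare the document count of the restored index with the number of documents inserted in section 2.1.3 (1001 in the example script). The sketch below does this with the _count API. The helper names `doc_count` and `verify_count` are assumptions made for illustration.

```shell
# Extract the "count" field from a _count response read on stdin.
doc_count() {
  tr -d ' \n' | sed -n 's/.*"count":\([0-9]*\).*/\1/p'
}

# Compare the count in a _count response (stdin) with the expected value.
# doc_count and verify_count are hypothetical helper names for this sketch.
verify_count() {
  expected=$1
  actual=$(doc_count)
  if [ "$actual" = "$expected" ]; then
    echo "ok: $actual documents"
  else
    echo "mismatch: expected $expected, got ${actual:-nothing}"
    return 1
  fi
}

# Possible usage:
#   curl -s 'http://localhost:9200/test-index/_count' | verify_count 1001
```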
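5.3. Aborting a restore
curl -XDELETE 'http://localhost:9200/test-index' # a running restore is cancelled by deleting the indices being restored; deleting the snapshot (section 4.3) stops a running snapshot, not a restore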