Part I Elasticsearch
1 Introduction
See the reference material.
2 Installation
2.1 Step 1: Install
##1. Extract the archive
[root@qphone01 software]# tar -zxvf elasticsearch-6.5.3.tar.gz -C /opt/apps/
##2. Configure environment variables
[root@qphone01 elasticsearch-6.5.3]# vi /etc/profile
#environment
export JAVA_HOME=/opt/apps/jdk1.8.0_45
export HADOOP_HOME=/opt/apps/hadoop-2.6.0-cdh5.7.6
export SCALA_HOME=/opt/apps/scala-2.11.8
export SPARK_HOME=/opt/apps/spark-2.2.0
export HIVE_HOME=/opt/apps/hive-1.1.0-cdh5.7.6
export ZOOKEEPER_HOME=/opt/apps/zookeeper-3.4.5-cdh5.7.6
export KAFKA_HOME=/opt/apps/kafka-2.11
export FLUME_HOME=/opt/apps/flume-1.9.0
export REDIS_HOME=/opt/apps/redis-3.2.8
export REDIS_CONF=$REDIS_HOME/conf
export ELASTICSEARCH_HOME=/opt/apps/elasticsearch-6.5.3
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SCALA_HOME/bin:$HIVE_HOME/bin:$REDIS_HOME/bin
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin:$ZOOKEEPER_HOME/bin:$KAFKA_HOME/bin:$FLUME_HOME/bin:$ELASTICSEARCH_HOME/bin
##3. Configure elasticsearch.yml
cluster.name: es-hzbigdata2002
node.name: qphone01
node.master: true
node.data: true
path.data: /opt/apps/elasticsearch-6.5.3/data
path.logs: /opt/apps/elasticsearch-6.5.3/logs
network.host: 0.0.0.0
discovery.zen.ping.unicast.hosts: ["qphone01", "qphone02", "qphone03"]
##4. Create a regular user (Elasticsearch refuses to start as root)
[root@qphone01 config]# useradd qphone01
[root@qphone01 config]# passwd qphone01
Changing password for user qphone01.
##5. Grant sudo privileges
[root@qphone01 config]# vi /etc/sudoers
## Allow root to run any commands anywhere
root ALL=(ALL) ALL
qphone01 ALL=(ALL) ALL
##6. Change ownership of the whole directory
[root@qphone01 apps]# chown -R qphone01:qphone01 elasticsearch-6.5.3/
2.2 Step 2: Fix environment issues
[qphone01@qphone01 bin]$ sudo vi /etc/security/limits.conf
* soft nofile 65536
* hard nofile 131072
* soft nproc 2048
* hard nproc 4096
[qphone01@qphone01 bin]$ sudo vi /etc/security/limits.d/20-nproc.conf
* soft nproc 4096
root soft nproc unlimited
[bigdata@qphone01 limits.d]$ sudo vi /etc/sysctl.conf
vm.max_map_count=262144
tip:
Reboot after making these changes so that they take effect
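After the reboot it is worth confirming that the new limits are actually in effect for the user that starts Elasticsearch. The following is a minimal sketch, assuming bash on Linux; the function names `check_min` and `check_es_limits` are made up for illustration, and the thresholds are simply the values configured in this section.

```shell
# Compare one current value against the minimum configured above.
# $1 = label, $2 = current value, $3 = required minimum
check_min() {
  if [ "$2" = "unlimited" ] || [ "$2" -ge "$3" ]; then
    echo "OK   $1=$2"
  else
    echo "LOW  $1=$2 (need >= $3)"
    return 1
  fi
}

# Check the three settings touched in this section (assumes bash: ulimit -u).
check_es_limits() {
  rc=0
  check_min nofile "$(ulimit -n)" 65536 || rc=1
  check_min nproc "$(ulimit -u)" 4096 || rc=1
  check_min vm.max_map_count "$(cat /proc/sys/vm/max_map_count)" 262144 || rc=1
  return $rc
}
```

Run `check_es_limits` in a fresh login session of the Elasticsearch user. If only `vm.max_map_count` was changed, `sudo sysctl -p` reloads /etc/sysctl.conf without a reboot.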
2.3 Test
http://192.168.49.111:9200/
{
"name" : "qphone01",
"cluster_name" : "es-hzbigdata2002",
"cluster_uuid" : "iUEJ5-BRRsieI0vd7Uooww",
"version" : {
"number" : "6.5.3",
"build_flavor" : "default",
"build_type" : "tar",
"build_hash" : "159a78a",
"build_date" : "2018-12-06T20:11:28.826501Z",
"build_snapshot" : false,
"lucene_version" : "7.5.0",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
2.4 Install the head plugin (Google Chrome)
2.4.1 Download Google Chrome
2.4.2 Install the head plugin
3 Usage
3.1 Introduction to REST
See the reference material.
3.2 curl
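3.2.1 HTTP method types for CRUD in ES
Method | Collection URI, e.g. http://example.com/res/ | Single-resource URI, e.g. http://example.com/res/123 |
---|---|---|
GET | List the URIs of the resources in the collection, optionally with the details of each one | Retrieve the details of the specified resource in a suitable media type (e.g. XML, JSON) |
PUT | Replace the whole collection with the given set of resources | Replace the specified resource, or create it and add it to the collection |
POST | Create/append a new resource in the collection; the operation usually returns the new URL | Treat the specified resource as a collection and create/append a new element under it |
DELETE | Delete the whole collection | Delete the specified element |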
3.2.2 curl
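- Special endpoints
URL | Description |
---|---|
/index/_search | Search the data in the specified index |
/_aliases | Get or modify index aliases |
/index/ | View details of the specified index |
/index/type/ | Create or operate on a type |
/index/_mapping | Create or modify the mapping |
/index/_settings | Create or modify settings (e.g. number_of_replicas; number_of_shards can only be set when the index is created) |
/index/_open | Open a closed index |
/index/_close | Close the specified index |
/index/_refresh | Refresh the index (makes newly added content visible to search; does not guarantee the data is written to disk) |
/index/_flush | Flush the index (triggers a Lucene commit) |
- Basic usage: three main options
-X  the HTTP request method: HEAD, PUT, GET, POST, DELETE
-d  the data to send in the request body
-H  a request header
- Getting-started example: create an index
curl -XPUT 'http://hbase1:9200/bigdata' ## send a PUT request to the ES cluster to create the index bigdata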
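Since every request with a body in this tutorial repeats the same `-H`, `-X` and `-d` options, a small wrapper can cut down on typing. This is only a sketch: the function name `es` and the variables `ES_HOST` and `ES_DRY_RUN` are assumptions introduced here, not part of Elasticsearch or curl.

```shell
# Base URL of any node in the cluster; override it through the environment.
ES_HOST="${ES_HOST:-http://qphone01:9200}"

# Usage: es METHOD PATH [JSON_BODY]
# With ES_DRY_RUN=1 the curl command is printed instead of executed
# (the printed form loses its quoting, so it is for inspection only).
es() {
  if [ -n "${3:-}" ]; then
    # A body is present, so send it with the JSON content type.
    ${ES_DRY_RUN:+echo} curl -s -H 'Content-Type: application/json' \
      -X "$1" "$ES_HOST/$2" -d "$3"
  else
    ${ES_DRY_RUN:+echo} curl -s -X "$1" "$ES_HOST/$2"
  fi
}
```

For example, `es PUT bigdata` creates the index, and `es GET 'bigdata/_search?pretty'` searches it.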
3.3 CRUD operations in ES
3.3.1 PUT
curl -H "Content-Type:application/json" -XPUT 'http://qphone01:9200/bigdata/emp/1' -d '{"name":"lixi", "age":34}'
{"_index":"bigdata","_type":"emp","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":0,"_primary_term":1}
curl -H "Content-Type:application/json" -XPUT 'http://qphone01:9200/bigdata/emp/3' -d '{"name":"蒼老師", "age":40}'
{"_index":"bigdata","_type":"emp","_id":"3","_version":2,"result":"updated","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":1,"_primary_term":1}
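tip:
1. An index can hold only one type; think of index/type as a table
2. The 1 is the doc id, the identifier of a document; in ES each record is a document
3. One index/type can hold many docs
4. The data in a doc must be JSON, and different docs may have different fields
"_index":"bigdata" : the name of the index
"_type":"emp" : the type is emp; think of it as a table named emp in the bigdata database
"_id":"1" : the doc id
"_shards":{"total":2,"successful":1,"failed":0} : the shard copies the write targets (here the primary plus one replica)
By default an index has 5 shards and a replication factor of 1
Status | Description |
---|---|
Green | All primary and replica shards are available |
Yellow | All primary shards are available, but not all replica shards |
Red | Not all primary shards are available |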
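These status colours are what the `_cluster/health` endpoint reports in its `status` field. As a rough sketch, the helpers below pull that field out of a health response and map it to an exit code for use in scripts; `health_status` and `health_exit_code` are hypothetical names, and the parsing relies on sed rather than a real JSON parser.

```shell
# Read a _cluster/health JSON response on stdin and print the status colour.
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Map a colour to an exit code: 0 = green, 1 = yellow, 2 = red or unknown.
health_exit_code() {
  case "$1" in
    green) return 0 ;;
    yellow) return 1 ;;
    *) return 2 ;;
  esac
}
```

Typical use: `health_exit_code "$(curl -s 'http://qphone01:9200/_cluster/health' | health_status)"`.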
3.3.2 POST: create/modify data in an index
curl -H "Content-Type:application/json" -XPOST 'http://qphone01:9200/bigdata/emp/1' -d '{"name":"程志遠", "age":18}'
{"_index":"bigdata","_type":"emp","_id":"1","_version":2,"result":"updated","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":1,"_primary_term":1}
curl -H "Content-Type:application/json" -XPOST 'http://qphone01:9200/bigdata/emp/4' -d '{"name":"李洪良", "age":22}'
{"_index":"bigdata","_type":"emp","_id":"4","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":1,"_primary_term":1}
tip:
Both PUT and POST can add or modify data. POST is also used for other operations, such as partial updates through _update and indexing with an auto-generated id
3.3.3 GET
##1. Fetch a single document
curl -H "Content-Type:application/json" -XGET 'http://qphone01:9200/bigdata/emp/1'
{"_index":"bigdata","_type":"emp","_id":"1","_version":2,"found":true,"_source":{"name":"程志遠", "age":18}}
##2. Fetch and pretty-print the returned JSON
curl -H "Content-Type:application/json" -XGET 'http://qphone01:9200/bigdata/emp/1?pretty'
{
"_index" : "bigdata",
"_type" : "emp",
"_id" : "1",
"_version" : 2,
"found" : true,
"_source" : {
"name" : "程志遠",
"age" : 18
}
}
##3. Query all documents
curl -H "Content-Type:application/json" -XGET 'http://qphone01:9200/bigdata/_search?pretty'
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 4,
"max_score" : 1.0,
"hits" : [
{
"_index" : "bigdata",
"_type" : "emp",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "rock",
"age" : 35
}
},
{
"_index" : "bigdata",
"_type" : "emp",
"_id" : "4",
"_score" : 1.0,
"_source" : {
"name" : "李洪良",
"age" : 22
}
},
{
"_index" : "bigdata",
"_type" : "emp",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"name" : "程志遠",
"age" : 18
}
},
{
"_index" : "bigdata",
"_type" : "emp",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"name" : "蒼老師",
"age" : 40
}
}
]
}
}
##4. Conditional query
curl -XGET 'http://qphone01:9200/bigdata/_search?q=name:rock&pretty'
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 0.87138504,
"hits" : [
{
"_index" : "bigdata",
"_type" : "emp",
"_id" : "2",
"_score" : 0.87138504,
"_source" : {
"name" : "rock",
"age" : 35
}
}
]
}
}
##5. Conditional query that returns only selected fields
curl -XGET 'http://qphone01:9200/bigdata/_search?q=name:rock&_source=name&pretty'
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 0.87138504,
"hits" : [
{
"_index" : "bigdata",
"_type" : "emp",
"_id" : "2",
"_score" : 0.87138504,
"_source" : {
"name" : "rock"
}
}
]
}
}
##6. Pagination
curl -XGET 'http://qphone01:9200/bigdata/_search?from=0&size=2&pretty'
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 4,
"max_score" : 1.0,
"hits" : [
{
"_index" : "bigdata",
"_type" : "emp",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "rock",
"age" : 35
}
},
{
"_index" : "bigdata",
"_type" : "emp",
"_id" : "4",
"_score" : 1.0,
"_source" : {
"name" : "李洪良",
"age" : 22
}
}
]
}
}
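In the pagination request above, `from` is a zero-based offset rather than a page number, so page n with a page size of s starts at (n - 1) * s. A minimal sketch of that arithmetic, using a hypothetical helper named `page_query`:

```shell
# Print the query-string fragment for a 1-based page number.
# $1 = page number (starting at 1), $2 = page size
page_query() {
  echo "from=$(( ($1 - 1) * $2 ))&size=$2"
}
```

For example, `curl -XGET "http://qphone01:9200/bigdata/_search?$(page_query 2 2)&pretty"` would request the third and fourth hits.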
3.3.4 POST: partial update
curl -H "Content-Type:application/json" -XPOST 'http://qphone01:9200/bigdata/emp/4/_update?pretty' -d '{"doc": {"name":"lixi"}}'
{
"_index" : "bigdata",
"_type" : "emp",
"_id" : "4",
"_version" : 2,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 2,
"_primary_term" : 1
}
3.3.5 DELETE
##1. Delete a document by id
curl -H "Content-Type:application/json" -XDELETE 'http://qphone01:9200/bigdata/emp/3?pretty'
##2. Delete the whole index
curl -H "Content-Type:application/json" -XDELETE 'http://qphone01:9200/bigdata?pretty'
3.3.6 Bulk
- Bulk insert
curl -H 'Content-Type:application/json' -i -XPUT 'http://qphone01:9200/qphone/student/_bulk?pretty' \
-d '
{"index":{"_id":"3"}}
{"name":"李洪浪", "sex":"男", "age":32}
{"index":{"_id":"4"}}
{"name":"李洪風", "sex":"男", "age":45}
{"index":{"_id":"5"}}
{"name":"李洪云", "sex":"男", "age":67}
{"index":{"_id":"6"}}
{"name":"李洪雨", "sex":"男", "age":8}
{"index":{"_id":"7"}}
{"name":"李洪雷", "sex":"男", "age":56}
{"index":{"_id":"8"}}
{"name":"李洪火", "sex":"男", "age":15}
'
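The bulk body above alternates an action line with a source line, one JSON object per line, and the last line must end with a newline. Writing that by hand gets tedious, so here is a rough sketch that generates it from tab-separated input; the function name `bulk_body` and the input layout (`id<TAB>json`) are assumptions made for this example.

```shell
TAB="$(printf '\t')"

# Read "id<TAB>json-document" lines on stdin and print a _bulk request body:
# an {"index": ...} action line followed by the document itself.
bulk_body() {
  while IFS="$TAB" read -r id doc || [ -n "$id" ]; do
    printf '{"index":{"_id":"%s"}}\n%s\n' "$id" "$doc"
  done
}
```

Assuming a hypothetical file students.tsv in that layout, the body can be piped straight into curl: `bulk_body < students.tsv | curl -H 'Content-Type: application/x-ndjson' -XPOST 'http://qphone01:9200/qphone/student/_bulk?pretty' --data-binary @-`. Using `--data-binary` matters here because `-d @file` strips newlines, which the bulk format depends on.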
4 ES plugin management: Kibana
4.1 Installation
##1. Extract the archive
[root@qphone01 software]# tar -zxvf kibana-6.5.3-linux-x86_64.tar.gz -C /opt/apps/
##2. Environment variables
export KIBANA_HOME=/opt/apps/kibana-6.5.3
export PATH=$PATH:$KIBANA_HOME/bin
##3. kibana.yml
server.port: 5601
server.host: "192.168.49.111"
server.name: "qphone01"
elasticsearch.url: "http://qphone01:9200"
##4. Start
nohup kibana serve > /dev/null 2>&1 &
4.2 Test result
001.png
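Part II ES Concepts
1 General concepts
1.1 Index library and index
An index is the logical store for logical data in Elasticsearch, so it can be split into smaller parts. You can loosely think of it as the primary-key index over a table's data in an RDBMS.
An index library can be thought of as a DATABASE in an RDBMS. ES can keep an index on one machine or spread it across several servers; each index has one or more shards, and each shard can have multiple replicas.
1.2 Document
The main entity stored in Elasticsearch is called a document. Compared with an RDBMS, a document corresponds to one row in a database table.
A doc is the basic unit of information that can be indexed. Documents are represented as JSON and stored under index/type.
1.2.1 Creating a document
Documents are indexed through the index API, which makes the data storable and searchable. First you have to decide where the document lives: index/type/id together identify it uniquely.
Syntax:
PUT {index}/{type}/{id} -d '{"":""}'
POST {index}/{type} -d '{"":""}'   (the id is generated automatically)
e.g.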
[root@hbase1 kibana-6.5.3]# curl -H 'Content-Type:application/json' -XPUT 'http://hbase1:9200/bigdata/emp/4' -d '{"name":"wyl", "age":18}'
1.2.2 Retrieving a document
1. Basic lookup
Use the same index/type/id, but change the request method to GET
e.g.
curl -H 'Content-Type:application/json' -XGET 'http://hbase1:9200/bigdata/emp/4?pretty'
{
"_index" : "bigdata",
"_type" : "emp",
"_id" : "4",
"_version" : 1,
"found" : true,
"_source" : {
"name" : "wyl",
"age" : 18
}
}
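tip:
pretty : adding the pretty parameter to any query string makes ES pretty-print the output so that the JSON response is easier to read.
_source : holds the document's data as it was submitted (in the 6.5.3 output above it is pretty-printed along with the rest of the response)
"found" : true : the document was found. If we request a document that does not exist, we still get JSON back, with found set to false
2. Lookup that also shows the response status (-i prints the response headers)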
curl -H 'Content-Type:application/json' -i -XGET 'http://hbase1:9200/bigdata/emp/4?pretty'
[root@hbase1 kibana-6.5.3]# curl -H 'Content-Type:application/json' -i -XGET 'http://hbase1:9200/bigdata/emp/4?pretty'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 153
{
"_index" : "bigdata",
"_type" : "emp",
"_id" : "4",
"_version" : 1,
"found" : true,
"_source" : {
"name" : "wyl",
"age" : 18
}
}
3. Retrieving part of a document
[root@hbase1 kibana-6.5.3]# curl -H 'Content-Type:application/json' -i -XGET 'http://hbase1:9200/bigdata/emp/4?_source=name&pretty'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 137
{
"_index" : "bigdata",
"_type" : "emp",
"_id" : "4",
"_version" : 1,
"found" : true,
"_source" : {
"name" : "wyl"
}
}
1.2.3 Updating a document
//1. Modify the data by overwriting the whole document; the version increases by 1
[root@hbase1 kibana-6.5.3]# curl -H 'Content-Type:application/json' -i -XPOST 'http://hbase1:9200/bigdata/emp/4?pretty' \
> -d '{"name":"yl", "age":27}'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 220
{
"_index" : "bigdata",
"_type" : "emp",
"_id" : "4",
"_version" : 2,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 5,
"_primary_term" : 2
}
//2. Partial update
[root@hbase1 kibana-6.5.3]# curl -H 'Content-Type:application/json' -i -XPOST 'http://hbase1:9200/bigdata/emp/4/_update?pretty' \
> -d '{"doc":{"name":"wyl"}}'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 220
{
"_index" : "bigdata",
"_type" : "emp",
"_id" : "4",
"_version" : 3,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 6,
"_primary_term" : 2
}
1.2.4 Deleting a document
[root@hbase1 kibana-6.5.3]# curl -H 'Content-Type:application/json' -i -XDELETE 'http://hbase1:9200/bigdata/emp/4?pretty'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 220
{
"_index" : "bigdata",
"_type" : "emp",
"_id" : "4",
"_version" : 4,
"result" : "deleted",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 7,
"_primary_term" : 2
}
1.2.5 Bulk insert
[root@hbase1 kibana-6.5.3]# curl -H 'Content-Type:application/json' -i -XPOST 'http://hbase1:9200/blog/emp/_bulk?pretty' \
> -d '
> {"index":{"_id":"1"}}
> {"name":"James", "sex":"man", "salary":50000000}
> {"index":{"_id":"2"}}
> {"name":"Kobe", "sex":"man", "salary":60000000}
> '
1.2.6 Retrieving multiple documents
curl -H 'Content-Type:application/json' -i -XGET 'http://qphone01:9200/_mget?pretty' \
-d '{
"docs":[
{
"_index":"qphone",
"_type":"student",
"_id":1,
"_source":"name"
},
{
"_index":"qphone",
"_type":"student",
"_id":2
}
]
}'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 437
{
"docs" : [
{
"_index" : "qphone",
"_type" : "student",
"_id" : "1",
"_version" : 1,
"found" : true,
"_source" : {
"name" : "程志遠"
}
},
{
"_index" : "qphone",
"_type" : "student",
"_id" : "2",
"_version" : 1,
"found" : true,
"_source" : {
"name" : "李洪良",
"sex" : "男",
"age" : 19
}
}
]
}
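When all the documents come from the same index and type, the `docs` array in the request above is repetitive enough to generate. The sketch below does that for a list of ids; `mget_body` is a hypothetical helper name, and it does not escape special characters in its arguments.

```shell
# Print an _mget request body for several ids in one index/type.
# $1 = index, $2 = type, remaining arguments = document ids
mget_body() (
  index=$1
  type=$2
  shift 2
  sep=""
  printf '{"docs":['
  for id in "$@"; do
    printf '%s{"_index":"%s","_type":"%s","_id":"%s"}' "$sep" "$index" "$type" "$id"
    sep=","
  done
  printf ']}\n'
)
```

It could be used as `curl -H 'Content-Type:application/json' -XGET 'http://qphone01:9200/_mget?pretty' -d "$(mget_body qphone student 1 2)"`.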
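1.3 Type
Document type
In ES, a single index can store objects that serve many different purposes. For example, a blog can store both articles and comments. Document types let us easily tell apart the different objects within one index. Each document can have a different structure, but in real deployments separating documents by type helps a great deal operationally. There is one restriction: different document types cannot assign different data types to the same property. For example, a field named title must have the same type in every document type of the same index.
Since ES 6, an index can have only one type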
curl -H "Content-Type:application/json" -XPOST 'http://qphone01:9200/blog/article/1' -d '{"title":"lijieweishenmzhemshuai", "content":"yinweitabenlaijiuhenshuai"}'
curl -H "Content-Type:application/json" -XPOST 'http://qphone01:9200/blog/comment/1' -d '{"title":"lijieweishenmzhemshuai", "content":"yinweitayongpiaorou", "user":"wangyushan"}'
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Rejecting mapping update to [blog] as the final mapping would have more than 1 type: [comment, article]"}],"type":"illegal_argument_exception","reason":"Rejecting mapping update to [blog] as the final mapping would have more than 1 type: [comment, article]"},"status":400}[root@hbase1 elasticsearch-6.5.3]#
1.4 Field (data types)
1.4.1 Basic data types
String: text, keyword
Numeric: long, integer, short, byte, double, float, half_float, scaled_float
Date: date
Boolean: boolean
Binary: binary
Range: integer_range, float_range, long_range, double_range, date_range
1.4.2 Complex data types
Array: array
Object: object
Nested: nested object
1.4.3 Geo data types
geo_point (point), geo_shape (shape)
1.4.4 Specialized types
IP address: ip
Auto-completion: completion
Token count: token_count
1.4.5 Specifying field types manually with a mapping
##1. Run the command below and look at the resulting index information
curl -H "Content-Type:application/json" -XPOST 'http://qphone01:9200/blog/article/2' -d '{"title":"qphoneshizuihaodepeixunjigou", "content":"bigdatashizuihaodexuek", "author":"lixi", "dt":"2020-09-04"}'
##2. Inspect the index information:
002.png
003.png
ES converts each field in the JSON to the corresponding ES field type; this conversion happens automatically
##3. Specify the types manually
curl -XPUT -H "Content-Type:application/json" 'http://qphone01:9200/spark?pretty' -d \
'{
"mappings":{
"sparkcore":{
"properties":{
"scala":{
"type":"double"
}
}
}
}
}'
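tip:
##1) Field types can be specified manually, but this has to be done on a newly created index (the type of an already-mapped field cannot be changed)
##2) They must be specified through mappings
##3) Fields that are not listed can still be mapped automatically
##4) A custom field definition is only a declaration: documents may or may not use it. In production, however, the defined fields act as a schema, and adding fields arbitrarily without approval is generally not allowed.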
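The mapping request in step 3 has a fixed nesting: `mappings`, then the type name, then `properties`, then one entry per field. As a rough illustration of that shape, the sketch below builds the body for a single field; `mapping_body` is a made-up helper name and it handles only one field.

```shell
# Print an index-creation body that maps one field to one data type.
# $1 = type name, $2 = field name, $3 = field data type
mapping_body() {
  printf '{"mappings":{"%s":{"properties":{"%s":{"type":"%s"}}}}}\n' "$1" "$2" "$3"
}
```

The request from step 3 could then be written as `curl -XPUT -H "Content-Type:application/json" 'http://qphone01:9200/spark?pretty' -d "$(mapping_body sparkcore scala double)"`.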
##4. After the command below, the date field is mapped as text rather than date, because no format for recognizing this date has been specified
curl -H "Content-Type:application/json" -XPOST 'http://qphone01:9200/blog/article/5' -d '{"title":"qphoneshizuihaodepeixunjigou", "content":"bigdatashizuihaodexuek", "author":"rock", "dt2":"20200904"}'
##5. Add a date detection format
curl -XPUT -H "Content-Type:application/json" 'http://qphone01:9200/blog2?pretty' -d \
'{
"mappings":{
"article":{
"dynamic_date_formats":["yyyyMMdd"]
}
}}'
curl -H "Content-Type:application/json" -XPOST 'http://qphone01:9200/blog2/article/1?pretty' -d '{"title":"qphoneshizuihaodepeixunjigou", "content":"bigdatashizuihaodexuek", "author":"rock", "dt":"20200904"}'
##6. Disable automatic date detection
curl -XPUT -H "Content-Type:application/json" 'http://qphone01:9200/blog2?pretty' -d \
'{"mappings":{
"article":{
"date_detection":false
}
}}'
##7. Enable detection of all-digit strings as long
curl -XPUT -H "Content-Type:application/json" 'http://qphone01:9200/blog3?pretty' -d \
'{
"mappings":{
"article":{
"numeric_detection":true
}
}}'
curl -H "Content-Type:application/json" -XPOST 'http://qphone01:9200/blog3/article/1?pretty' -d '{"title":"qphoneshizuihaodepeixunjigou", "content":"bigdatashizuihaodexuek", "author":"rock", "dt":"20200904", "num":"111"}'
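1.5 Core concepts
1.5.1 Cluster
A cluster is a group of ES nodes. One of them is the master node, which is chosen by election. The master/non-master distinction only matters inside the cluster. ES is described as decentralized: read literally, that means the cluster has no master node, but this is only how it looks from the outside. Logically the cluster is a single whole, and talking to any one node is equivalent to talking to the entire cluster.
The master node is responsible for managing the state of the whole cluster, including the state of shards and replicas, the discovery of new nodes, and the removal of nodes.
As long as several ES nodes are started within the same network segment, they form a cluster automatically (this auto-discovery worked before ES 2.0, but no longer does from 2.0 onward)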
How to check the cluster status:
[root@hbase1 config]# curl -XGET -H "Content-Type:application/json" 'http://hbase1:9200/_cluster/health?pretty'
{
"cluster_name" : "bigdata-etc",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 6,
"active_shards" : 12,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
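1.5.2 Shards
The number of shards can be specified when an index is created; shards are comparable to partitions in an RDD or in Kafka.
For example:
curl -H "Content-Type:application/json" -XPUT 'ip:port/index' -d '{"settings":{"number_of_shards":3}}'
By default every index has 5 shards
Note that once an index has been created, its number of shards cannot be changed.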
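The reason the shard count is fixed is routing: ES picks the shard for a document by hashing its routing value (the document id by default) and taking the result modulo the number of primary shards, so a different shard count would send lookups for existing documents to the wrong shard. The sketch below only illustrates the modulo step. It is not how ES computes the hash: `cksum` stands in for the Murmur3 hash ES uses, and `shard_for` is a made-up name.

```shell
# Toy illustration of hash(routing) % number_of_primary_shards.
# NOTE: cksum (CRC) is a stand-in; Elasticsearch itself uses Murmur3.
# $1 = routing value (the document id by default), $2 = number of primary shards
shard_for() (
  h=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  echo $(( h % $2 ))
)

# Print where ids 1..6 would land for a given shard count.
show_placement() {
  for id in 1 2 3 4 5 6; do
    echo "id=$id -> shard $(shard_for "$id" "$1")"
  done
}
```

Comparing `show_placement 5` with `show_placement 3` shows the same ids mapping to different shard numbers, which is why changing the count requires creating a new index and reindexing the data.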
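1.5.3 Replicas
Replicas are copies of an index's shards. They provide fault tolerance: when a node goes down, the data can be recovered from a replica.
For example:
curl -H "Content-Type:application/json" -XPUT 'ip:port/index' -d '{"settings":{"number_of_replicas":3}}'
1.5.4 Data redistribution
This is data recovery, also called data redistribution. When a node joins or leaves, ES reallocates index shards according to machine load; data recovery also runs when a failed node starts up again.
1.5.5 Data persistence
This is how ES persists its data. By default ES first keeps the index in memory and writes it to disk when memory fills up. When the cluster is shut down and restarted, it reads the index data back from the gateway
ES supports several types of gateway: the local file system (the default) and distributed file systems such as HDFS, Amazon, and so on.
1.5.6 Auto-discovery
This is the mechanism by which ES discovers nodes automatically. ES is a P2P-based system: it first finds existing nodes by broadcast, then uses a multicast protocol for communication between nodes, and it also supports point-to-point interaction.
Disable auto-discovery:
discovery.zen.ping.multicast.enabled : true/false
List of hosts that a newly started node can discover:
discovery.zen.ping.unicast.hosts: ["hbase1", "hbase2", "hbase3"]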
Part III Java API
1 Import dependencies
<dependencies>
<!-- es -->
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>transport</artifactId>
<version>6.5.3</version>
</dependency>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<version>1.18.8</version>
</dependency>
<!-- json -->
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>1.2.71</version>
</dependency>
</dependencies>
2 Getting started
2.1 elasticSearchUtils
package cn.qphone.es.api;

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

import java.net.InetAddress;

public class ElasticSearchUtils {
    private static TransportClient client;

    static {
        try {
            //1. Settings object; cluster.name must match elasticsearch.yml
            Settings settings = Settings.builder()
                    .put("cluster.name", "es-hzbigdata2002")
                    .build();
            //2. Transport client
            client = new PreBuiltTransportClient(settings);
            //3. Addresses of the ES nodes (9300 is the transport port, not the HTTP port)
            TransportAddress[] trans = {
                    new TransportAddress(InetAddress.getByName("qphone01"), 9300),
                    new TransportAddress(InetAddress.getByName("qphone02"), 9300),
                    new TransportAddress(InetAddress.getByName("qphone03"), 9300)
            };
            //4. Connect to the ES servers
            client.addTransportAddresses(trans);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * Returns the client object connected to ES
     */
    public static TransportClient getClient() {
        return client;
    }
}
2.2 quickstart
package cn.qphone.es.api;

import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.client.transport.TransportClient;

import java.net.UnknownHostException;
import java.util.Map;

public class Demo1_quickstart {
    public static void main(String[] args) throws UnknownHostException {
        //1. Get the core class for working with ES
        TransportClient client = ElasticSearchUtils.getClient();
        //2. Work with ES
        //2.1 Index a document (this also creates the index if it does not exist)
        // curl -XPOST -H 'Content-Type:application/json' 'xxxxxx/index/type' -d '{"name":"lixi"}'
        // String json = "{\"name\":\"wyl\", \"age\":18}";
        // IndexResponse response = client.prepareIndex("hadoop", "hdfs")
        //         .setSource(json, XContentType.JSON)
        //         .get();
        // System.out.println("create json version:" + response.getVersion());
        // System.out.println(response.getIndex());
        // System.out.println(response.getType());
        //2.2 Delete a document
        // DeleteResponse deleteResponse = client.prepareDelete("hadoop", "hdfs", "2")
        //         .get();
        // System.out.println(deleteResponse.getIndex() + "/" + deleteResponse.getType() + "/" + deleteResponse.getId());
        //2.3 get
        GetResponse getResponse = client.prepareGet("hadoop", "hdfs", "ZZFVZnQBTuYsqQgZhqPE")
                .get();
        Map<String, Object> parm = getResponse.getSourceAsMap();
        System.out.println(parm.get("name"));
        System.out.println(parm.get("age"));
    }
}
Part IV Chinese Word Segmentation
1 Testing the default ES analyzer
curl -H 'Content-Type: application/json' -XGET 'http://qphone01:9200/_analyze?&pretty' -d '{
"text":"i am a big big boy"
}'
curl -H 'Content-Type: application/json' -XGET 'http://qphone01:9200/_analyze?&pretty' -d '{
"text":"這里是好記性不如爛筆頭感嘆號的博客們"
}'
2 Chinese analyzer: the IK analyzer
2.1 Installation
##1. Install an unzip tool
yum -y install unzip
##2. Upload the IK analyzer
##3. Move the IK analyzer into the ES plugins directory
mkdir -p /opt/apps/elasticsearch-6.5.3/plugins/ik && mv /opt/software/elasticsearch-analysis-ik-6.5.3.zip /opt/apps/elasticsearch-6.5.3/plugins/ik && cd /opt/apps/elasticsearch-6.5.3/plugins/ik
##4. Unzip
unzip elasticsearch-analysis-ik-6.5.3.zip && rm -f elasticsearch-analysis-ik-6.5.3.zip
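##5. Distribute to the other nodes (run from the plugins directory)
cd /opt/apps/elasticsearch-6.5.3/plugins && scp -r ik qphone02:/opt/apps/elasticsearch-6.5.3/plugins/ && scp -r ik qphone03:/opt/apps/elasticsearch-6.5.3/plugins/
##6. Restart the ES cluster
2.2 Test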
curl -H 'Content-Type: application/json' -XGET 'http://qphone01:9200/_analyze?&pretty' -d \
'{
"analyzer":"ik_max_word",
"text":"這里是好記性不如爛筆頭感嘆號的博客們"
}'
curl -H 'Content-Type: application/json' -XGET 'http://qphone01:9200/_analyze?&pretty' -d \
'{
"analyzer":"ik_max_word",
"text":"i am a big big girl"
}'
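The `_analyze` responses are fairly verbose, with offsets, type and position for every token. To compare analyzers at a glance it helps to print only the tokens. The following sketch does that with sed; the helper name `tokens` is invented here, and it assumes the `?pretty` output format, where each `"token"` field sits on its own line.

```shell
# Read a pretty-printed _analyze response on stdin and print one token per line.
tokens() {
  sed -n 's/.*"token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

For example, appending `| tokens` to either curl command above (with `-s` added to silence the progress meter) lists just the terms the analyzer produced.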
##2. Create the chinese index and specify its analyzer
curl -H 'Content-Type: application/json' -XPUT 'http://qphone01:9200/chinese?pretty' -d \
'
{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1,
"analysis": {
"analyzer": {
"ik": {
"tokenizer": "ik_max_word"
}
}
}
},
"mappings": {
"test1":{
"properties": {
"content": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word"
}
}
}
}
}'
##3. Load data into chinese
curl -H 'Content-Type: application/json' -XPUT 'http://qphone01:9200/chinese/test1/1?pretty' -d \
'
{
"content": "里皮是一位牌面足夠大、支持率足夠高的教練"
}
'
curl -H 'Content-Type: application/json' -XPUT 'http://qphone01:9200/chinese/test1/2?pretty' -d \
'
{
"content": "他不僅在意大利國家隊取得過成功"
}
'
curl -H 'Content-Type: application/json' -XPUT 'http://qphone01:9200/chinese/test1/3?pretty' -d \
'
{
"content": "教練還帶領廣州恒大稱霸中超并首次奪得亞冠聯賽冠軍"
}
'
##4. Search chinese for the keyword 教練
curl -H 'Content-Type: application/json' -XGET 'http://qphone01:9200/chinese/_search?pretty' -d \
'
{
"query": {
"match": {
"content": "教練"
}
}
}
'
3 Full-text search with the Java API
package cn.qphone.es.api;

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;

public class Demo2_Search {
    private static final String INDEX = "chinese";

    public static void main(String[] args) {
        //1. Get the core object
        TransportClient client = ElasticSearchUtils.getClient();
        //2. Query through _search
        /*
         * matchAll --> select * from t
         * matchQuery --> select * from t where name like "%baby%"
         * termQuery --> select * from t where name = baby
         */
        SearchResponse response = client.prepareSearch(INDEX)
                .setSearchType(SearchType.QUERY_THEN_FETCH)
                .setQuery(QueryBuilders.matchQuery("content", "意大利"))
                .get();
        //3. Read the matching records
        SearchHits hits = response.getHits();
        long totalHits = hits.totalHits; // total number of hits
        float maxScore = hits.getMaxScore(); // highest score
        System.out.println("total hits: " + totalHits);
        System.out.println("max score : " + maxScore);
        SearchHit[] searchHits = hits.getHits(); // the individual records
        for (SearchHit hit : searchHits) {
            System.out.println("index : " + hit.getIndex());
            System.out.println("score : " + hit.getScore());
            System.out.println("content : " + hit.getSourceAsString());
        }
    }
}