DSL Search
Data Preparation
Custom dictionary entries:
- 馬可波羅
- 馬可
- 波羅
- 馬
- 可
- 波
- 羅
Create the index demeter_index
Create the mappings manually
POST /demeter_index/_mapping
{
  "properties": {
    "id": { "type": "long" },
    "age": { "type": "integer" },
    "username": { "type": "keyword" },
    "nickname": {
      "type": "text",
      "analyzer": "ik_max_word",
      "fields": { "keyword": { "type": "keyword" } }
    },
    "money": { "type": "float" },
    "desc": { "type": "text", "analyzer": "ik_max_word" },
    "sex": { "type": "byte" },
    "birthday": { "type": "date" },
    "face": { "type": "text", "index": false }
  }
}
Add data
POST /demeter_index/_doc/1001
{ "id": 1001, "age": 18, "username": "demeter", "nickname": "馬可", "money": 88.8, "desc": "我叫馬可波羅,很馬興認識大家", "sex": 0, "birthday": "1992-12-24", "face": "https://www.codedemeter.com/static/img/index/logo.png" }
POST /demeter_index/_doc/1002
{ "id": 1002, "age": 19, "username": "sulliven", "nickname": "波羅", "money": 77.8, "desc": "今天太陽很大,馬路上沒有行人", "sex": 1, "birthday": "1993-01-24", "face": "https://www.codedemeter.com/static/img/index/logo.png" }
POST /demeter_index/_doc/1003
{ "id": 1003, "age": 20, "username": "paul", "nickname": "馬可波羅", "money": 66.8, "desc": "馬可波羅來中國歷險", "sex": 1, "birthday": "1996-01-14", "face": "https://www.codedemeter.com/static/img/index/logo.png" }
POST /demeter_index/_doc/1004
{ "id": 1004, "age": 22, "username": "sky", "nickname": "云中君", "money": 55.8, "desc": "羊吃草,馬兒跑", "sex": 0, "birthday": "1988-02-14", "face": "https://www.codedemeter.com/static/img/index/logo.png" }
POST /demeter_index/_doc/1005
{ "id": 1005, "age": 25, "username": "tiger", "nickname": "裴擒虎", "money": 155.8, "desc": "我今天玩了一局王者榮耀", "sex": 1, "birthday": "1989-03-14", "face": "https://www.codedemeter.com/static/img/index/logo.png" }
POST /demeter_index/_doc/1006
{ "id": 1006, "age": 19, "username": "misscodedemeter", "nickname": "小羅", "money": 156.8, "desc": "我叫羅某某,今年20歲,是一名學生", "sex": 1, "birthday": "1993-04-14", "face": "https://www.codedemeter.com/static/img/index/logo.png" }
POST /demeter_index/_doc/1007
{ "id": 1007, "age": 19, "username": "cat", "nickname": "小小", "money": 1056.8, "desc": "這是我第一天學習elasticsearch", "sex": 1, "birthday": "1985-05-14", "face": "https://www.codedemeter.com/static/img/index/logo.png" }
POST /demeter_index/_doc/1008
{ "id": 1008, "age": 19, "username": "mark", "nickname": "小天", "money": 1056.8, "desc": "大學畢業后,來到一家開發公司工作", "sex": 1, "birthday": "1995-06-14", "face": "https://www.codedemeter.com/static/img/index/logo.png" }
POST /demeter_index/_doc/1009
{ "id": 1009, "age": 22, "username": "tim", "nickname": "大菠蘿", "money": 96.8, "desc": "阿羅在大學畢業后,考研究生去了", "sex": 1, "birthday": "1998-07-14", "face": "https://www.codedemeter.com/static/img/index/logo.png" }
POST /demeter_index/_doc/1010
{ "id": 1010, "age": 30, "username": "gaga", "nickname": "可心", "money": 100.8, "desc": "我在學習kibana", "sex": 1, "birthday": "1988-07-14", "face": "https://www.codedemeter.com/static/img/index/logo.png" }
POST /demeter_index/_doc/1011
{ "id": 1011, "age": 31, "username": "sprder", "nickname": "知事", "money": 180.8, "desc": "能讓我尊重的新聞媒體不多了", "sex": 1, "birthday": "1989-08-14", "face": "https://www.codedemeter.com/static/img/index/logo.png" }
POST /demeter_index/_doc/1012
{ "id": 1012, "age": 31, "username": "super hero", "nickname": "super hero", "money": 188.8, "desc": "BatMan, GreenArrow, SpiderMan, IronMan... are all Super Hero", "sex": 1, "birthday": "1980-08-14", "face": "https://www.codedemeter.com/static/img/index/logo.png" }
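Basic Syntax
Query by request parameter (QueryString)
Find documents whose [field] contains [content]
Comparing searches on text and keyword fields. A keyword field is not analyzed: its whole value is indexed as a single term, so only the exact value matches. Here username is a keyword field and nickname is a text field.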
GET /demeter_index/_doc/_search?q=nickname:馬克
GET /demeter_index/_doc/_search?q=username:meter
GET /demeter_index/_doc/_search?q=username:demeter
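The difference between a keyword field and a text field can be pictured with a toy model. This is only a sketch under stated assumptions: `index_keyword`, `index_text`, and `term_lookup` are hypothetical helper names, and lowercasing plus whitespace splitting merely stands in for the real ik_max_word analyzer. The values are the username and nickname of document 1012.

```python
def index_keyword(value):
    # keyword: the whole value becomes exactly one term, with no analysis.
    return {value}


def index_text(value):
    # text: the value is analyzed into several terms. Lowercasing and
    # whitespace splitting is an assumption standing in for ik_max_word.
    return set(value.lower().split())


def term_lookup(terms, word):
    # A lookup succeeds only if the word equals one of the indexed terms.
    return word in terms


username_terms = index_keyword("super hero")  # username of document 1012
nickname_terms = index_text("super hero")     # nickname of document 1012

print(term_lookup(username_terms, "super"))       # False: only the full value is a term
print(term_lookup(username_terms, "super hero"))  # True
print(term_lookup(nickname_terms, "super"))       # True: "super" is one of the terms
```

The same reasoning explains why a partial value finds nothing on username while the full value does.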
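Basic DSL syntax
QueryString is rarely used: once the parameters get complex the query is hard to build, so most searches are written in the DSL.
- DSL stands for Domain Specific Language
- Queries are written as JSON
- Queries are more flexible, which suits complex searches
DSL format
# Search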
POST /demeter_index/_doc/_search
{
"query":{
"match":{
"desc":"學習"
}
}
}
# Check whether a field exists
POST /demeter_index/_doc/_search
{
"query": {
"exists": {
"field": "desc"
}
}
}
- The request body is a JSON object made of key-value pairs, which can be nested
- A key can be an ES keyword or the name of a field
Queries and Pagination
Match all documents
match_all
POST /demeter_index/_doc/_search
{
"query": {
"match_all": {}
}
}
To return only some fields, set _source
POST /demeter_index/_doc/_search
{
"query": {
"match_all": {}
},
"_source": [
"id",
"nickname",
"age",
"desc"
]
}
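Pagination: a search returns only 10 hits by default. To page through the results, set from (the zero-based offset of the first hit) and size (the number of hits to return).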
POST /demeter_index/_doc/_search
{
"query": {
"match_all": {}
},
"_source": [
"id",
"nickname",
"age",
"desc"
],
"from": 0,
"size": 5
}
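Client code usually thinks in page numbers rather than offsets. A minimal sketch of the conversion, assuming 1-based page numbers; `page_to_from_size` is a hypothetical helper name, not part of any Elasticsearch client:

```python
def page_to_from_size(page, page_size=10):
    """Translate a 1-based page number into the from/size pair of a search body."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    # from is a zero-based offset, so page 1 starts at 0.
    return {"from": (page - 1) * page_size, "size": page_size}


# Build the body for page 3 with 5 hits per page.
body = {"query": {"match_all": {}}, **page_to_from_size(page=3, page_size=5)}
print(body)  # {'query': {'match_all': {}}, 'from': 10, 'size': 5}
```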
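term vs. match
term is an exact search; match is an analyzed search
term means an exact match: the search text is not analyzed before searching, so it must equal one of the terms that were produced when the document field was analyzed.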
POST /demeter_index/_doc/_search
{
"query": {
"term": {
"nickname":"馬可"
}
},
"_source": [
"id",
"nickname",
"desc"
]
}
This returns two documents.
match analyzes the search text first. A document is returned if one or more of the resulting terms appear in it.
POST /demeter_index/_doc/_search
{
"query": {
"match": {
"nickname":"馬可"
}
},
"_source": [
"id",
"nickname",
"desc"
]
}
This returns three documents.
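The two hit counts can be reproduced with a toy model. Everything in it is an assumption made for illustration: the term sets below are a guess at what ik_max_word with the custom dictionary produces for four nickname values, and `term_query` and `match_query` are hypothetical helpers, not Elasticsearch APIs.

```python
# Assumed term sets for the nickname field; the real analyzer output may differ.
NICKNAME_TERMS = {
    1001: {"馬可", "馬", "可"},
    1002: {"波羅", "波", "羅"},
    1003: {"馬可波羅", "馬可", "波羅", "馬", "可", "波", "羅"},
    1010: {"可心", "可", "心"},
}


def term_query(text):
    # term: the search text is used as-is and must equal an indexed term.
    return sorted(doc_id for doc_id, terms in NICKNAME_TERMS.items() if text in terms)


def match_query(query_terms):
    # match: the search text is analyzed first (here the caller passes the
    # assumed terms), and any one shared term is enough.
    wanted = set(query_terms)
    return sorted(doc_id for doc_id, terms in NICKNAME_TERMS.items() if terms & wanted)


print(term_query("馬可"))                 # [1001, 1003]
print(match_query(["馬可", "馬", "可"]))  # [1001, 1003, 1010]
```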
terms: matching several terms
Find documents whose field contains any of several exact terms
POST /demeter_index/_doc/_search
{
"query": {
"terms": {
"nickname":["馬可","波羅"]
}
},
"_source": [
"id",
"nickname",
"desc"
]
}
match_phrase
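match_phrase: phrase matching. match returns a document as soon as any analyzed term matches. match_phrase requires every term of the analyzed search text to appear in the text field, in the same order and in consecutive positions.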
POST /demeter_index/_doc/_search
{
"query": {
"match_phrase": {
"desc":{
"query":"第一天 學習"
}
}
},
"_source": [
"id",
"nickname",
"desc"
]
}
slop: how many positions may be skipped between the terms
POST /demeter_index/_doc/_search
{
"query": {
"match_phrase": {
"desc":{
"query":"我 學習",
"slop": 1
}
}
},
"_source": [
"id",
"nickname",
"desc"
]
}
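The effect of slop can be sketched with a simplified phrase matcher. It is a rough model only: it handles in-order terms, counts skipped positions against the slop budget, and ignores the overlapping tokens a real analyzer such as ik_max_word can emit. `phrase_matches` is a hypothetical helper, and the token list is an assumed tokenization of the desc of document 1010.

```python
def phrase_matches(doc_tokens, query_tokens, slop=0):
    """Return True if query_tokens occur in order with at most `slop` skipped positions."""
    if not query_tokens:
        return False

    def search(q_index, prev_pos, budget):
        # All query tokens placed: the phrase matches.
        if q_index == len(query_tokens):
            return True
        for pos in range(prev_pos + 1, len(doc_tokens)):
            skipped = pos - prev_pos - 1
            if skipped > budget:
                break  # any later position would skip even more
            if doc_tokens[pos] == query_tokens[q_index]:
                if search(q_index + 1, pos, budget - skipped):
                    return True
        return False

    return any(
        token == query_tokens[0] and search(1, start, slop)
        for start, token in enumerate(doc_tokens)
    )


desc_tokens = ["我", "在", "學習", "kibana"]  # assumed tokens of "我在學習kibana"
print(phrase_matches(desc_tokens, ["我", "學習"], slop=0))  # False: "在" sits in between
print(phrase_matches(desc_tokens, ["我", "學習"], slop=1))  # True: one position may be skipped
```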
match(operator)/ids
match option: operator
- or: after the search text is analyzed, a document is returned if any one term matches
- and: after the search text is analyzed, every term must match
POST /demeter_index/_doc/_search
{
"query": {
"match": {
"desc":"我 學習"
}
},
"_source": [
"id",
"nickname",
"desc"
]
}
# Equivalent to
POST /demeter_index/_doc/_search
{
"query": {
"match": {
"desc":{
"query":"我 學習",
"operator":"or"
}
}
},
"_source": [
"id",
"nickname",
"desc"
]
}
# With operator and, every term must match
POST /demeter_index/_doc/_search
{
"query": {
"match": {
"desc":{
"query":"我 學習",
"operator":"and"
}
}
},
"_source": [
"id",
"nickname",
"desc"
]
}
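- minimum_should_match: the minimum match threshold. With a percentage, the required number of matching terms is [number of terms after analysis] x percentage, rounded down. For example, with the value set to 70%, a search text that analyzes to 10 terms gives 10 x 70% = 7, so desc must contain at least 7 of the terms for the document to be returned. With 8 terms, 8 x 70% = 5.6, so at least 5 matching terms are required.
- minimum_should_match can also be a plain number, which is the count of terms that must match
# Inspect how the search text is analyzed
POST /_analyze
{
"analyzer": "ik_max_word",
"text": "我學習了redis和docker"
}
# The analyzer produces 6 terms:
# 我 學習 了 redis 和 docker
# 6 * 40% = 2.4, rounded down: at least 2 terms must match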
POST /demeter_index/_doc/_search
{
"query": {
"match": {
"desc":{
"query":"我學習了redis和docker",
"minimum_should_match":"40%"
}
}
},
"_source": [
"id",
"nickname",
"desc"
]
}
# At least two terms must match; the result is the same as for the previous query
POST /demeter_index/_doc/_search
{
"query": {
"match": {
"desc":{
"query":"我學習了redis和docker",
"minimum_should_match":2
}
}
},
"_source": [
"id",
"nickname",
"desc"
]
}
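The rounding rule can be checked with a few lines of arithmetic. This sketch covers only the two forms used above, a non-negative percentage string and a plain integer; `required_matches` is a hypothetical helper name.

```python
def required_matches(term_count, minimum_should_match):
    """Number of terms that must match, for a percentage string or a plain integer."""
    if isinstance(minimum_should_match, str) and minimum_should_match.endswith("%"):
        percent = int(minimum_should_match[:-1])
        # Integer division rounds the result down, e.g. 8 * 70% = 5.6 becomes 5.
        return term_count * percent // 100
    return int(minimum_should_match)


print(required_matches(10, "70%"))  # 7
print(required_matches(8, "70%"))   # 5
print(required_matches(6, "40%"))   # 2
print(required_matches(6, 2))       # 2
```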
ids: search by document ID
GET /demeter_index/_doc/1001
Fetch several documents by ID
POST /demeter_index/_doc/_search
{
"query": {
"ids": {
"type": "_doc",
"values": ["1001", "1005", "1011"]
}
}
}
multi_match/boost
multi_match: run the same query against several fields
POST /demeter_index/_doc/_search
{
"query": {
"multi_match": {
"query": "小小明愛學習",
"fields": ["desc", "nickname"]
}
}
}
boost: a weight set on a field. The higher the weight, the more matches on that field raise a document's relevance score.
# nickname^10 boosts matches on nickname by a factor of 10
POST /demeter_index/_doc/_search
{
"query": {
"multi_match": {
"query": "小小明愛學習",
"fields": ["desc", "nickname^10"]
}
}
}
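Bool Query
must: returned documents must satisfy every must clause, and these clauses contribute to the score
should: returned documents may satisfy the should clauses. In a bool query with no must or filter clause and one or more should clauses, a document is returned as long as it satisfies at least one of them. The minimum_should_match parameter sets how many should clauses must be satisfied.
must_not: returned documents must not satisfy any must_not clause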
POST /demeter_index/_doc/_search
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "馬可波羅",
"fields": [
"desc",
"nickname"
]
}
},
{
"term": {
"sex": 1
}
},
{
"term": {
"age": 19
}
}
]
}
}
}
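Replacing must with should in the query above makes every clause optional, so a document that satisfies any one of them is returned.
Replacing must with must_not returns only the documents that satisfy none of the clauses.
Combining clauses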
POST /demeter_index/_doc/_search
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "馬",
"fields": [
"desc",
"nickname"
]
}
}
],
"should": [
{
"match": {
"sex": 1
}
}
],
"must_not": [
{
"term": {
"age": 18
}
}
]
}
}
}
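How the three clause types decide which documents come back can be sketched as a plain filter. This toy ignores scoring entirely and models clauses as Python predicates; `bool_hits` and the predicate names are hypothetical, and only the id, age, and sex of the first four sample documents are used.

```python
DOCS = [
    {"id": 1001, "age": 18, "sex": 0},
    {"id": 1002, "age": 19, "sex": 1},
    {"id": 1003, "age": 20, "sex": 1},
    {"id": 1004, "age": 22, "sex": 0},
]


def bool_hits(docs, must=(), should=(), must_not=(), minimum_should_match=None):
    if minimum_should_match is None:
        # With should clauses and no must clause, at least one should must match.
        minimum_should_match = 1 if should and not must else 0
    hits = []
    for doc in docs:
        if not all(clause(doc) for clause in must):
            continue
        if any(clause(doc) for clause in must_not):
            continue
        if sum(1 for clause in should if clause(doc)) < minimum_should_match:
            continue
        hits.append(doc["id"])
    return hits


def sex_is_1(doc):
    return doc["sex"] == 1


def age_is(value):
    return lambda doc: doc["age"] == value


print(bool_hits(DOCS, must=[sex_is_1, age_is(19)]))      # [1002]
print(bool_hits(DOCS, should=[sex_is_1, age_is(18)]))    # [1001, 1002, 1003]
print(bool_hits(DOCS, must_not=[age_is(18)]))            # [1002, 1003, 1004]
print(bool_hits(DOCS, must=[sex_is_1], should=[age_is(20)]))  # [1002, 1003]
```

In the last call the should clause does not remove document 1002. In Elasticsearch it would only raise the score of document 1003.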
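Filters
A filter narrows down the hits that a search has already produced. It does not run another search against ES and does not compute relevance scores, so filtering is relatively cheap. Filters can be combined with full-text search.
post_filter is a top-level element. It only filters the search hits: it does not compute relevance scores and does not sort by score. query does the opposite: it computes scores and sorts by them.
query: retrieves the documents that match the user's search conditions
post_filter: filters the hits after the query has run
- gte: greater than or equal to
- lte: less than or equal to
- gt: greater than
- lt: less than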
POST /demeter_index/_doc/_search
{
"query": {
"multi_match": {
"query": "馬",
"fields": [
"desc"
]
}
},
"post_filter":{
"range":{
"money":{
"gt":60,
"lt":80
}
}
}
}
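The four range operators map directly onto comparison functions. A small sketch of the filtering step alone, applied to the money values of the first four sample documents; `in_range` is a hypothetical helper and no query step is modeled.

```python
import operator

# Range keys and the comparison each one stands for.
RANGE_OPS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def in_range(value, bounds):
    """True if value satisfies every bound, e.g. {"gt": 60, "lt": 80}."""
    return all(RANGE_OPS[name](value, limit) for name, limit in bounds.items())


money_by_id = {1001: 88.8, 1002: 77.8, 1003: 66.8, 1004: 55.8}
kept = [doc_id for doc_id, money in money_by_id.items() if in_range(money, {"gt": 60, "lt": 80})]
print(kept)  # [1002, 1003]
```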
Sorting
desc sorts in descending order, asc in ascending order
POST /demeter_index/_doc/_search
{
"query": {
"match": {
"desc": "馬克"
}
},
"post_filter":{
"range":{
"money":{
"gt":60,
"lt":80
}
}
},
"sort": [
{
"age": "desc"
},
{
"money": "desc"
}
]
}
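A sort array with several entries works like a multi-key sort: later keys only break ties in earlier ones. A sketch with Python's built-in `sorted`, using the age and money of three sample documents picked purely for illustration:

```python
hits = [
    {"id": 1006, "age": 19, "money": 156.8},
    {"id": 1002, "age": 19, "money": 77.8},
    {"id": 1009, "age": 22, "money": 96.8},
]

# Sort by age descending, then by money descending among equal ages.
ordered = sorted(hits, key=lambda hit: (hit["age"], hit["money"]), reverse=True)
print([hit["id"] for hit in ordered])  # [1009, 1006, 1002]
```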
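Sorting on text
Because text fields are analyzed, sorting on them directly usually fails with an error. Add a sub-field of type keyword to the field and sort on that instead.
# Set when creating the mappings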
"nickname": {
"type": "text",
"analyzer": "ik_max_word",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
POST /demeter_index/_doc/_search
{
"query": {
"match": {
"desc": "馬克"
}
},
"post_filter":{
"range":{
"money":{
"gt":60,
"lt":80
}
}
},
"sort": [
{
"nickname.keyword": "desc"
}
]
}
Highlighting (highlight)
POST /demeter_index/_doc/_search
{
"query": {
"match": {
"desc": "馬可"
}
},
"highlight": {
"fields": {
"desc": {}
}
}
}
Custom highlight tags
POST /demeter_index/_doc/_search
{
"query": {
"match": {
"desc": "馬可"
}
},
"highlight": {
"pre_tags": [
"<tag>"
],
"post_tags": [
"</tag>"
],
"fields": {
"desc": {}
}
}
}
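What pre_tags and post_tags do to the returned text can be imitated with a string substitution. This is only an approximation: Elasticsearch highlights the terms the analyzer produced, while this hypothetical `highlight` helper wraps literal substrings. The defaults `<em>` and `</em>` are the tags Elasticsearch uses when none are configured.

```python
import re


def highlight(text, terms, pre_tag="<em>", post_tag="</em>"):
    """Wrap every occurrence of the given terms in the highlight tags."""
    if not terms:
        return text
    # Longest terms first, so a short term never splits a longer one.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = "|".join(re.escape(term) for term in ordered)
    return re.sub(pattern, lambda m: f"{pre_tag}{m.group(0)}{post_tag}", text)


desc = "馬可波羅來中國歷險"  # desc of document 1003
print(highlight(desc, ["馬可"]))                      # <em>馬可</em>波羅來中國歷險
print(highlight(desc, ["馬可"], "<tag>", "</tag>"))   # <tag>馬可</tag>波羅來中國歷險
```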
prefix/fuzzy/wildcard
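prefix: prefix query. A prefix query does no relevance scoring: it simply returns every matching document and gives each hit a score of 1. It behaves more like a filter than a query. The practical difference between a prefix query and a prefix filter is that the filter can be cached and the query cannot.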
POST /demeter_index/_doc/_search
{
"query": {
"prefix": {
"desc": "elas"
}
}
}
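fuzzy: fuzzy search. This is not the pattern matching that SQL calls fuzzy search. It handles typing mistakes in the user's search text: the search engine tolerates the error and still tries to match the data in the index.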
POST /demeter_index/_doc/_search
{
"query": {
"fuzzy": {
"desc": "elasticsearhc"
}
}
}
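fuzziness: the maximum number of character edits allowed when matching the search text against your data. The largest allowed value is 2. A fuzzy query that does not set fuzziness uses AUTO, which allows 0, 1, or 2 edits depending on the length of the term.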
POST /demeter_index/_doc/_search
{
"query": {
"multi_match": {
"fields": [ "desc", "nickname"],
"query": "elasticsearchs",
"fuzziness": "auto"
}
}
}
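The edit counting behind fuzziness can be sketched in plain Python. `auto_fuzziness` follows the documented AUTO thresholds (terms of 1 to 2 characters must match exactly, 3 to 5 allow one edit, longer terms allow two), and `edit_distance` is a textbook optimal-string-alignment distance in which swapping two adjacent characters counts as one edit. Both names are hypothetical helpers, not Elasticsearch APIs.

```python
def auto_fuzziness(term):
    """Edits allowed by fuzziness AUTO for a term of this length."""
    length = len(term)
    if length <= 2:
        return 0
    if length <= 5:
        return 1
    return 2


def edit_distance(a, b):
    """Insertions, deletions, substitutions, and adjacent swaps needed to turn a into b."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            # An adjacent transposition counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


typo = "elasticsearhc"
print(edit_distance(typo, "elasticsearch"))                          # 1
print(auto_fuzziness(typo))                                          # 2
print(edit_distance(typo, "elasticsearch") <= auto_fuzziness(typo))  # True
```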
wildcard: wildcard query
- ?: matches exactly one character
- *: matches zero or more characters
POST /demeter_index/_doc/_search
{
"query": {
"wildcard": {
"desc": "elastic*"
}
}
}
POST /demeter_index/_doc/_search
{
"query": {
"wildcard": {
"desc": "馬?"
}
}
}
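The two wildcard characters behave the same way in Python's `fnmatch` module, which makes it a convenient way to try patterns against individual terms. This is only an analogy: a wildcard query is evaluated against the terms stored in the index, and `fnmatch` additionally understands `[...]` character sets. The terms below are assumed examples.

```python
from fnmatch import fnmatchcase

# Each check is pattern vs. one indexed term, not vs. the whole field text.
print(fnmatchcase("elasticsearch", "elastic*"))  # True
print(fnmatchcase("elastic", "elastic*"))        # True: * also matches zero characters
print(fnmatchcase("馬可", "馬?"))                # True: ? matches exactly one character
print(fnmatchcase("馬", "馬?"))                  # False: nothing left for ? to match
print(fnmatchcase("馬可波羅", "馬?"))            # False: too many characters
```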