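Preface
This article introduces 23 of the most useful Elasticsearch search techniques, with detailed example requests and a corresponding Java API implementation. It is intended as study and hands-on material for Elasticsearch.
Data preparation
To demonstrate the different query types in ES, we will search a collection of book documents with the following fields:
title: the book title
authors: the authors
summary: a summary
publish_date: the publication date
num_reviews: the number of reviews
publisher: the publisher
First, we create a new index and load the data in a batch with the bulk API.
# Index settings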
PUT /bookdb_index
{ "settings": { "number_of_shards": 1 }}
# Load the data with a bulk request
POST /bookdb_index/book/_bulk
{"index":{"_id":1}}
{"title":"Elasticsearch: The Definitive Guide","authors":["clinton gormley","zachary tong"],"summary":"A distibuted real-time search and analytics engine","publish_date":"2015-02-07","num_reviews":20,"publisher":"oreilly"}
{"index":{"_id":2}}
{"title":"Taming Text: How to Find, Organize, and Manipulate It","authors":["grant ingersoll","thomas morton","drew farris"],"summary":"organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization","publish_date":"2013-01-24","num_reviews":12,"publisher":"manning"}
{"index":{"_id":3}}
{"title":"Elasticsearch in Action","authors":["radu gheorge","matthew lee hinman","roy russo"],"summary":"build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms","publish_date":"2015-12-03","num_reviews":18,"publisher":"manning"}
{"index":{"_id":4}}
{"title":"Solr in Action","authors":["trey grainger","timothy potter"],"summary":"Comprehensive guide to implementing a scalable search engine using Apache Solr","publish_date":"2014-04-05","num_reviews":23,"publisher":"manning"}
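Note: the experiments in this article were run on ES 6.3.0.
1. Basic Match Query
1.1 Full-text search
There are two ways to run a full-text search:
1) Use the URI search API, passing the search parameters as part of the URL.
Example: the following runs a full-text search for "guide".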
GET bookdb_index/book/_search?q=guide
[Results]
"hits": {
"total": 2,
"max_score": 1.3278645,
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "4",
"_score": 1.3278645,
"_source": {
"title": "Solr in Action",
"authors": [
"trey grainger",
"timothy potter"
],
"summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
"publish_date": "2014-04-05",
"num_reviews": 23,
"publisher": "manning"
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "1",
"_score": 1.2871116,
"_source": {
"title": "Elasticsearch: The Definitive Guide",
"authors": [
"clinton gormley",
"zachary tong"
],
"summary": "A distibuted real-time search and analytics engine",
"publish_date": "2015-02-07",
"num_reviews": 20,
"publisher": "oreilly"
}
}
]
}
2) Use the full ES DSL, with a JSON body as the request body.
It returns the same results as method 1).
GET bookdb_index/book/_search
{
"query": {
"multi_match": {
"query": "guide",
"fields" : ["_all"]
}
}
}
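Explanation: multi_match is used in place of match as a convenient shorthand for running the same query against multiple fields. The fields property specifies which fields to query; in this case we want to query all fields in the document.
Note: ES 6.x does not enable the _all field by default. If fields is not specified, the query searches all fields by default.
1.2 Searching a specific field
Both APIs also let you specify which field to search.
For example, to search for books with "in action" in the title field:
1) URL search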
GET bookdb_index/book/_search?q=title:in action
[Results]
"hits": {
"total": 2,
"max_score": 1.6323128,
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "3",
"_score": 1.6323128,
"_source": {
"title": "Elasticsearch in Action",
"authors": [
"radu gheorge",
"matthew lee hinman",
"roy russo"
],
"summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
"publish_date": "2015-12-03",
"num_reviews": 18,
"publisher": "manning"
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "4",
"_score": 1.6323128,
"_source": {
"title": "Solr in Action",
"authors": [
"trey grainger",
"timothy potter"
],
"summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
"publish_date": "2014-04-05",
"num_reviews": 23,
"publisher": "manning"
}
}
]
}
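2) DSL search
The full-body DSL, however, gives you more flexibility to build complex queries (as we will see later) and to control what is returned. In the example below, we specify the number of results to return, the offset (useful for pagination), the document fields to return, and highlighting.
Number of results: size
Offset: from
Fields to return: _source
Highlighting: highlight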
GET bookdb_index/book/_search
{
"query": {
"match": {
"title": "in action"
}
},
"size": 2,
"from": 0,
"_source": ["title", "summary", "publish_date"],
"highlight": {
"fields": {
"title": {}
}
}
}
[Results]
"hits": {
"total": 2,
"max_score": 1.6323128,
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "3",
"_score": 1.6323128,
"_source": {
"summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
"title": "Elasticsearch in Action",
"publish_date": "2015-12-03"
},
"highlight": {
"title": [
"Elasticsearch <em>in</em> <em>Action</em>"
]
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "4",
"_score": 1.6323128,
"_source": {
"summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
"title": "Solr in Action",
"publish_date": "2014-04-05"
},
"highlight": {
"title": [
"Solr <em>in</em> <em>Action</em>"
]
}
}
]
}
Note:
- For multi-word queries, the match query lets you use the and operator instead of the default or operator: "operator" : "and"
- You can also set the minimum_should_match option to tune the relevance of the returned results. Details can be found in the Elasticsearch guide.
2. Multi-field Search
As we have already seen, to query more than one document field in a search (for example, the same query string in both title and summary), use the multi_match query.
GET bookdb_index/book/_search
{
"query": {
"multi_match": {
"query": "elasticsearch guide",
"fields": ["title", "summary"]
}
}
}
[Results]
"hits": {
"total": 3,
"max_score": 2.0281231,
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "1",
"_score": 2.0281231,
"_source": {
"title": "Elasticsearch: The Definitive Guide",
"authors": [
"clinton gormley",
"zachary tong"
],
"summary": "A distibuted real-time search and analytics engine",
"publish_date": "2015-02-07",
"num_reviews": 20,
"publisher": "oreilly"
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "4",
"_score": 1.3278645,
"_source": {
"title": "Solr in Action",
"authors": [
"trey grainger",
"timothy potter"
],
"summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
"publish_date": "2014-04-05",
"num_reviews": 23,
"publisher": "manning"
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "3",
"_score": 1.0333893,
"_source": {
"title": "Elasticsearch in Action",
"authors": [
"radu gheorge",
"matthew lee hinman",
"roy russo"
],
"summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
"publish_date": "2015-12-03",
"num_reviews": 18,
"publisher": "manning"
}
}
]
}
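Note: in the results above, document 4 (_id=4) matches because "guide" appears in its summary.
3. Boosting
Since we are searching across multiple fields, we may want to boost the score of one field. In the example below, we boost the score of the summary field by a factor of 3 to increase its importance, which in turn raises the relevance of document 4.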
GET bookdb_index/book/_search
{
"query": {
"multi_match": {
"query": "elasticsearch guide",
"fields": ["title", "summary^3"]
}
},
"_source": ["title", "summary", "publish_date"]
}
[Results]
"hits": {
"total": 3,
"max_score": 3.9835935,
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "4",
"_score": 3.9835935,
"_source": {
"summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
"title": "Solr in Action",
"publish_date": "2014-04-05"
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "3",
"_score": 3.1001682,
"_source": {
"summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
"title": "Elasticsearch in Action",
"publish_date": "2015-12-03"
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "1",
"_score": 2.0281231,
"_source": {
"summary": "A distibuted real-time search and analytics engine",
"title": "Elasticsearch: The Definitive Guide",
"publish_date": "2015-02-07"
}
}
]
}
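Note: boosting does not simply mean that the calculated score is multiplied by the boost factor. The boost value actually applied goes through normalization and some internal optimization. See the Elasticsearch guide for more.
4. Bool Query
The AND / OR / NOT operators can be used to fine-tune a search query so that it returns more relevant or more specific results.
In the search API this is implemented by the bool query.
The bool query accepts a must parameter (equivalent to AND), a must_not parameter (equivalent to NOT), and a should parameter (equivalent to OR).
For example, suppose I want to search for a book with "Elasticsearch" OR "Solr" in the title, AND authored by "clinton gormley", but NOT authored by "radu gheorge":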
GET bookdb_index/book/_search
{
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{"match": {"title": "Elasticsearch"}},
{"match": {"title": "Solr"}}
]
}
},
{
"match": {"authors": "clinton gormely"}
}
],
"must_not": [
{
"match": {"authors": "radu gheorge"}
}
]
}
}
}
[Results]
"hits": {
"total": 1,
"max_score": 2.0749094,
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "1",
"_score": 2.0749094,
"_source": {
"title": "Elasticsearch: The Definitive Guide",
"authors": [
"clinton gormley",
"zachary tong"
],
"summary": "A distibuted real-time search and analytics engine",
"publish_date": "2015-02-07",
"num_reviews": 20,
"publisher": "oreilly"
}
}
]
}
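The should clause of a bool query behaves in one of two ways:
- When a must clause exists at the same level, the should conditions are optional; the more of them a document satisfies, the higher its score.
- When there is no must clause, at least one of the should conditions must be satisfied by default.
Note: as you can see, a bool query can wrap any other query type, including other bool queries, to build arbitrarily complex or deeply nested queries.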
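The two should rules can be sketched in a few lines of Python. This is a toy model, not how Elasticsearch evaluates queries: the helper names (tokens, match, bool_query) are made up for illustration, scoring is ignored, and match is reduced to "the field shares at least one word with the query text". It replays the nested query from this section against the four sample books.

```python
import re

# Only the fields the query touches are reproduced here.
BOOKS = {
    1: {"title": "Elasticsearch: The Definitive Guide",
        "authors": ["clinton gormley", "zachary tong"]},
    2: {"title": "Taming Text: How to Find, Organize, and Manipulate It",
        "authors": ["grant ingersoll", "thomas morton", "drew farris"]},
    3: {"title": "Elasticsearch in Action",
        "authors": ["radu gheorge", "matthew lee hinman", "roy russo"]},
    4: {"title": "Solr in Action",
        "authors": ["trey grainger", "timothy potter"]},
}


def tokens(value):
    """Lowercase word tokens of a string or a list of strings."""
    text = " ".join(value) if isinstance(value, list) else value
    return set(re.findall(r"\w+", text.lower()))


def match(field, text):
    """Toy match query: true when the field shares a token with the text."""
    wanted = tokens(text)
    return lambda doc: bool(tokens(doc[field]) & wanted)


def bool_query(must=(), should=(), must_not=()):
    """Toy bool query that returns a predicate over documents."""
    def matches(doc):
        if any(q(doc) for q in must_not):
            return False
        if not all(q(doc) for q in must):
            return False
        # With no must clause, at least one should clause has to match.
        if should and not must:
            return any(q(doc) for q in should)
        # With a must clause present, should clauses are optional.
        return True
    return matches


query = bool_query(
    must=[
        bool_query(should=[match("title", "Elasticsearch"),
                           match("title", "Solr")]),
        match("authors", "clinton gormley"),
    ],
    must_not=[match("authors", "radu gheorge")],
)
hits = [doc_id for doc_id, doc in BOOKS.items() if query(doc)]
print(hits)  # [1]
```

Only document 1 passes, which agrees with the single hit listed in the results above. In real Elasticsearch the optional should clauses still add to the score; this sketch only models whether a document matches.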
5. Fuzzy Queries
Fuzzy matching can be enabled on match and multi_match queries to catch spelling errors. The degree of fuzziness is specified as the Levenshtein distance from the original term.
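To make the edit-distance idea concrete, here is a small Python sketch. It assumes two things that are not stated in this article: that a swap of two adjacent characters counts as a single edit (to my knowledge this is the Elasticsearch default), and that "AUTO" allows 0 edits for terms of 1-2 characters, 1 edit for 3-5 characters, and 2 edits for longer terms. The function names are made up for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Edit distance where an adjacent swap counts as one edit."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Swap of two adjacent characters ("ae" <-> "ea").
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def auto_fuzziness(term: str) -> int:
    """Assumed AUTO rule: edits allowed as a function of term length."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


for typo, target in [("comprihensiv", "comprehensive"), ("saerch", "search")]:
    distance = edit_distance(typo, target)
    print(typo, target, distance, distance <= auto_fuzziness(typo))
```

Under these assumptions "comprihensiv" is 2 edits away from "comprehensive" and is long enough for AUTO to allow 2 edits, so a fuzzy query for it can still reach documents that contain "comprehensive".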
GET bookdb_index/book/_search
{
"query": {
"multi_match": {
"query": "comprihensiv guide",
"fields": ["title","summary"],
"fuzziness": "AUTO"
}
},
"_source": ["title","summary","publish_date"],
"size": 2
}
[Results]
"hits": {
"total": 2,
"max_score": 2.4344182,
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "4",
"_score": 2.4344182,
"_source": {
"summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
"title": "Solr in Action",
"publish_date": "2014-04-05"
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "1",
"_score": 1.2871116,
"_source": {
"summary": "A distibuted real-time search and analytics engine",
"title": "Elasticsearch: The Definitive Guide",
"publish_date": "2015-02-07"
}
}
]
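}
A fuzziness of "AUTO" is equivalent to specifying a value of 2 when the term is longer than 5 characters. However, about 80% of human misspellings have an edit distance of 1, so setting fuzziness to 1 may improve overall search performance. For more information, see the "Typos and Misspellings" chapter of Elasticsearch: The Definitive Guide.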
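6. Wildcard Query
Wildcard queries let you specify a pattern to match instead of an entire term.
- ? matches any single character
- * matches zero or more characters
For example, to find all records with an author whose name begins with the letter "t":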
GET bookdb_index/book/_search
{
"query": {
"wildcard": {
"authors": {
"value": "t*"
}
}
},
"_source": ["title", "authors"],
"highlight": {
"fields": {
"authors": {}
}
}
}
[Results]
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "1",
"_score": 1,
"_source": {
"title": "Elasticsearch: The Definitive Guide",
"authors": [
"clinton gormley",
"zachary tong"
]
},
"highlight": {
"authors": [
"zachary <em>tong</em>"
]
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "2",
"_score": 1,
"_source": {
"title": "Taming Text: How to Find, Organize, and Manipulate It",
"authors": [
"grant ingersoll",
"thomas morton",
"drew farris"
]
},
"highlight": {
"authors": [
"<em>thomas</em> morton"
]
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "4",
"_score": 1,
"_source": {
"title": "Solr in Action",
"authors": [
"trey grainger",
"timothy potter"
]
},
"highlight": {
"authors": [
"<em>trey</em> grainger",
"<em>timothy</em> potter"
]
}
}
]
}
7. Regexp Query
Regexp queries let you specify more complex patterns than wildcard queries, for example:
POST bookdb_index/book/_search
{
"query": {
"regexp": {
"authors": "t[a-z]*y"
}
},
"_source": ["title", "authors"],
"highlight": {
"fields": {
"authors": {}
}
}
}
[Results]
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "4",
"_score": 1,
"_source": {
"title": "Solr in Action",
"authors": [
"trey grainger",
"timothy potter"
]
},
"highlight": {
"authors": [
"<em>trey</em> grainger",
"<em>timothy</em> potter"
]
}
}
]
}
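Both the wildcard and the regexp query are matched against individual indexed terms, not against the whole field value. The Python sketch below imitates that on the sample authors. It is an approximation under stated assumptions: terms are produced by lowercasing and splitting on whitespace (a stand-in for the analyzer), fnmatchcase stands in for the wildcard syntax, and the regular expression must match the entire term. The helper names are hypothetical.

```python
import re
from fnmatch import fnmatchcase

AUTHORS = {
    1: ["clinton gormley", "zachary tong"],
    2: ["grant ingersoll", "thomas morton", "drew farris"],
    3: ["radu gheorge", "matthew lee hinman", "roy russo"],
    4: ["trey grainger", "timothy potter"],
}


def author_terms(doc_id):
    """Lowercased single-word terms for one document's authors field."""
    return [term for name in AUTHORS[doc_id] for term in name.lower().split()]


def wildcard_hits(pattern):
    """Ids of documents with at least one term matching the wildcard."""
    return [doc_id for doc_id in AUTHORS
            if any(fnmatchcase(term, pattern) for term in author_terms(doc_id))]


def regexp_hits(pattern):
    """Ids of documents with at least one term fully matching the regex."""
    compiled = re.compile(pattern)
    return [doc_id for doc_id in AUTHORS
            if any(compiled.fullmatch(term) for term in author_terms(doc_id))]


print(wildcard_hits("t*"))      # [1, 2, 4]
print(regexp_hits("t[a-z]*y"))  # [4]
```

The wildcard t* picks up "tong", "thomas", "trey" and "timothy", while the regexp t[a-z]*y keeps only "trey" and "timothy", which agrees with the hit lists in sections 6 and 7.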
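8. Match Phrase Query
The match phrase query requires that all the terms in the query string be present in the document, in the order given in the query string, and close to each other.
By default the terms must be exactly adjacent, but you can specify a slop value that indicates how far apart the terms may be while the document is still considered a match.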
GET bookdb_index/book/_search
{
"query": {
"multi_match": {
"query": "search engine",
"fields": ["title", "summary"],
"type": "phrase",
"slop": 3
}
},
"_source": [ "title", "summary", "publish_date" ]
}
[Results]
"hits": {
"total": 2,
"max_score": 0.88067603,
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "4",
"_score": 0.88067603,
"_source": {
"summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
"title": "Solr in Action",
"publish_date": "2014-04-05"
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "1",
"_score": 0.51429313,
"_source": {
"summary": "A distibuted real-time search and analytics engine",
"title": "Elasticsearch: The Definitive Guide",
"publish_date": "2015-02-07"
}
}
]
}
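Note: in the example above, with a non-phrase query, document _id 1 would normally score higher and appear ahead of document _id 4 because its field length is shorter.
As a phrase query, however, the proximity of the terms is taken into account, so document _id 4 scores better.
9. Match Phrase Prefix Query
The match phrase prefix query provides search-as-you-type, or a relatively simple version of autocomplete, at query time without requiring you to prepare your data in any way.
Like the match_phrase query, it accepts a slop parameter that makes word order and relative positions less strict. It also accepts a max_expansions parameter that limits the number of terms matched, to reduce resource usage.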
GET bookdb_index/book/_search
{
"query": {
"match_phrase_prefix": {
"summary": {
"query": "search en",
"slop": 3,
"max_expansions": 10
}
}
},
"_source": ["title","summary","publish_date"]
}
Note: search-as-you-type at query time has a performance cost. A better solution is index-time search-as-you-type. For more, see the Completion Suggester API or Edge-Ngram filters.
10. Query String
The query_string query provides a concise shorthand syntax for running multi_match queries, bool queries, boosting, fuzzy matching, wildcards, regexp, and range queries.
In the example below, we run a fuzzy search for the terms "search algorithm" where one of the book's authors is "grant ingersoll" or "tom morton". We search all fields but apply a boost of 2 to the summary field.
GET bookdb_index/book/_search
{
"query": {
"query_string": {
"query": "(saerch~1 algorithm~1) AND (grant ingersoll) OR (tom morton)",
"fields": ["summary^2","title","authors","publisher"]
}
},
"_source": ["title","summary","authors"],
"highlight": {
"fields": {
"summary": {}
}
}
}
[Results]
"hits": {
"total": 1,
"max_score": 3.571021,
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "2",
"_score": 3.571021,
"_source": {
"summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization",
"title": "Taming Text: How to Find, Organize, and Manipulate It",
"authors": [
"grant ingersoll",
"thomas morton",
"drew farris"
]
},
"highlight": {
"summary": [
"organize text using approaches such as full-text <em>search</em>, proper name recognition, clustering, tagging"
]
}
}
]
}
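11. Simple Query String
The simple_query_string query is a version of the query_string query that is better suited to a single search box exposed to users, because it replaces AND / OR / NOT with + / | / - respectively, and it discards invalid parts of a query instead of throwing an exception when a user makes a mistake.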
GET bookdb_index/book/_search
{
"query": {
"simple_query_string": {
"query": "(saerch~1 algorithm~1) + (grant ingersoll) | (tom morton)",
"fields": ["summary^2","title","authors","publisher"]
}
},
"_source": ["title","summary","authors"],
"highlight": {
"fields": {
"summary": {}
}
}
}
[Results]
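# Same results as above
12. Term/Terms Query (search on a specific field)
The examples in sections 1-11 above are all full-text searches. Sometimes we are more interested in a structured search, where we want to find exact matches and return those results.
In the example below, we search for all books in the index published by Manning Publications, using the term and terms queries.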
GET bookdb_index/book/_search
{
"query": {
"term": {
"publisher": {
"value": "manning"
}
}
},
"_source" : ["title","publish_date","publisher"]
}
[Results]
"hits": {
"total": 3,
"max_score": 0.35667494,
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "2",
"_score": 0.35667494,
"_source": {
"publisher": "manning",
"title": "Taming Text: How to Find, Organize, and Manipulate It",
"publish_date": "2013-01-24"
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "3",
"_score": 0.35667494,
"_source": {
"publisher": "manning",
"title": "Elasticsearch in Action",
"publish_date": "2015-12-03"
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "4",
"_score": 0.35667494,
"_source": {
"publisher": "manning",
"title": "Solr in Action",
"publish_date": "2014-04-05"
}
}
]
}
The terms query accepts multiple terms to search for:
GET bookdb_index/book/_search
{
"query": {
"terms": {
"publisher": ["oreilly", "manning"]
}
}
}
13. Term Query - Sorted
Term query results, like those of any other query, can easily be sorted. Multi-level sorting is also allowed.
GET bookdb_index/book/_search
{
"query": {
"term": {
"publisher": {
"value": "manning"
}
}
},
"_source" : ["title","publish_date","publisher"],
"sort": [{"publisher.keyword": { "order": "desc"}},
{"title.keyword": {"order": "asc"}}]
}
[Results]
"hits": {
"total": 3,
"max_score": null,
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "3",
"_score": null,
"_source": {
"publisher": "manning",
"title": "Elasticsearch in Action",
"publish_date": "2015-12-03"
},
"sort": [
"manning",
"Elasticsearch in Action"
]
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "4",
"_score": null,
"_source": {
"publisher": "manning",
"title": "Solr in Action",
"publish_date": "2014-04-05"
},
"sort": [
"manning",
"Solr in Action"
]
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "2",
"_score": null,
"_source": {
"publisher": "manning",
"title": "Taming Text: How to Find, Organize, and Manipulate It",
"publish_date": "2013-01-24"
},
"sort": [
"manning",
"Taming Text: How to Find, Organize, and Manipulate It"
]
}
]
}
Note: in Elasticsearch 6.x, use text fields for full-text search and non-text fields (such as the keyword sub-fields used above) for sorting.
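The multi-level sort above (publisher descending, then title ascending) can be mimicked in Python to check the order of the hits. This is only an illustration of the ordering rule on the three matching books, not of how Elasticsearch sorts internally.

```python
books = [
    {"_id": 2, "publisher": "manning",
     "title": "Taming Text: How to Find, Organize, and Manipulate It"},
    {"_id": 3, "publisher": "manning", "title": "Elasticsearch in Action"},
    {"_id": 4, "publisher": "manning", "title": "Solr in Action"},
]

# Python's sort is stable, so apply the secondary key (title, ascending)
# first and the primary key (publisher, descending) second.
by_title = sorted(books, key=lambda book: book["title"])
ordered = sorted(by_title, key=lambda book: book["publisher"], reverse=True)

print([book["_id"] for book in ordered])  # [3, 4, 2]
```

Because every hit has the same publisher, the first sort level leaves the order to the second level, and the titles decide the final order.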
14. Range Query
Another example of a structured search is the range query. In the example below, we search for books published in 2015.
GET bookdb_index/book/_search
{
"query": {
"range": {
"publish_date": {
"gte": "2015-01-01",
"lte": "2015-12-31"
}
}
},
"_source" : ["title","publish_date","publisher"]
}
[Results]
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "1",
"_score": 1,
"_source": {
"publisher": "oreilly",
"title": "Elasticsearch: The Definitive Guide",
"publish_date": "2015-02-07"
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "3",
"_score": 1,
"_source": {
"publisher": "manning",
"title": "Elasticsearch in Action",
"publish_date": "2015-12-03"
}
}
]
}
Note: range queries work on date, number, and string fields.
15. Filtered Query
(Removed as of version 5.0; can be skipped.)
The filtered query lets you filter the results of a query. In this example, we query for books with "Elasticsearch" in the title or summary, but filter the results down to those with 20 or more reviews.
POST /bookdb_index/book/_search
{
"query": {
"filtered": {
"query" : {
"multi_match": {
"query": "elasticsearch",
"fields": ["title","summary"]
}
},
"filter": {
"range" : {
"num_reviews": {
"gte": 20
}
}
}
}
},
"_source" : ["title","summary","publisher", "num_reviews"]
}
[Results]
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "1",
"_score": 0.5955761,
"_source": {
"summary": "A distibuted real-time search and analytics engine",
"publisher": "oreilly",
"num_reviews": 20,
"title": "Elasticsearch: The Definitive Guide"
}
}
]
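Note: a filtered query does not require a query to filter. If no query is specified, a match_all query is run, which essentially returns all documents in the index, and these are then filtered.
In practice the filter is run first, reducing the surface area that needs to be queried. In addition, the filter is cached after first use, which makes it very efficient.
Update: filtered queries were removed in Elasticsearch 5.x in favor of the bool query. Here is the same example as above, rewritten to use a bool query. The results returned are exactly the same.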
GET bookdb_index/book/_search
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "elasticsearch",
"fields": ["title","summary"]
}
}
],
"filter": {
"range": {
"num_reviews": {
"gte": 20
}
}
}
}
},
"_source" : ["title","summary","publisher", "num_reviews"]
}
16. Multiple Filters
(No longer supported in 5.x; can be skipped.)
Multiple filters can be combined by using a bool filter.
In the next example, the filter specifies that the returned results must have at least 20 reviews, must not have been published before 2015, and should be published by oreilly.
POST /bookdb_index/book/_search
{
"query": {
"filtered": {
"query" : {
"multi_match": {
"query": "elasticsearch",
"fields": ["title","summary"]
}
},
"filter": {
"bool": {
"must": {
"range" : { "num_reviews": { "gte": 20 } }
},
"must_not": {
"range" : { "publish_date": { "lte": "2014-12-31" } }
},
"should": {
"term": { "publisher": "oreilly" }
}
}
}
}
},
"_source" : ["title","summary","publisher", "num_reviews", "publish_date"]
}
[Results]
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "1",
"_score": 0.5955761,
"_source": {
"summary": "A distibuted real-time search and analytics engine",
"publisher": "oreilly",
"num_reviews": 20,
"title": "Elasticsearch: The Definitive Guide",
"publish_date": "2015-02-07"
}
}
]
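17. Function Score: Field Value Factor
There may be cases where you want to factor the value of a particular document field into the relevance score. A typical scenario is boosting a document's relevance based on its popularity.
In our example, we want to boost the more popular books (as judged by the number of reviews). This can be done with the field_value_factor function score.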
GET bookdb_index/book/_search
{
"query": {
"function_score": {
"query": {
"multi_match": {
"query": "search engine",
"fields": ["title","summary"]
}
},
"field_value_factor": {
"field": "num_reviews",
"modifier": "log1p",
"factor": 2
}
}
},
"_source": ["title", "summary", "publish_date", "num_reviews"]
}
[Results]
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "1",
"_score": 1.5694137,
"_source": {
"summary": "A distibuted real-time search and analytics engine",
"num_reviews": 20,
"title": "Elasticsearch: The Definitive Guide",
"publish_date": "2015-02-07"
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "4",
"_score": 1.4725765,
"_source": {
"summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
"num_reviews": 23,
"title": "Solr in Action",
"publish_date": "2014-04-05"
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "3",
"_score": 0.14181662,
"_source": {
"summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
"num_reviews": 18,
"title": "Elasticsearch in Action",
"publish_date": "2015-12-03"
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "2",
"_score": 0.13297246,
"_source": {
"summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization",
"num_reviews": 12,
"title": "Taming Text: How to Find, Organize, and Manipulate It",
"publish_date": "2013-01-24"
}
}
]
}
Note 1: we could have run a regular multi_match query and sorted by the num_reviews field, but then we would lose the benefit of relevance scoring.
Note 2: a number of additional parameters (such as "modifier", "factor", and "boost_mode") adjust how strongly the original relevance score is boosted.
See the Elasticsearch guide for details.
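As a rough illustration of what factor and the log1p modifier do, the Python sketch below computes the multiplier that field_value_factor would contribute for each book. It rests on assumptions drawn from my understanding of the Elasticsearch documentation rather than from this article: log1p means log10(1 + factor * value), and under the default boost_mode the query score is multiplied by that value. The function name and signature are made up for illustration.

```python
import math


def field_value_factor(value, factor=1.0, modifier="none"):
    """Multiplier for one document (assumed formula, two modifiers only)."""
    scaled = factor * value
    if modifier == "log1p":
        return math.log10(1 + scaled)
    if modifier == "none":
        return scaled
    raise ValueError(f"modifier not covered by this sketch: {modifier}")


# num_reviews of the four sample books, keyed by _id
reviews = {1: 20, 2: 12, 3: 18, 4: 23}
for doc_id, num_reviews in reviews.items():
    multiplier = field_value_factor(num_reviews, factor=2, modifier="log1p")
    print(doc_id, round(multiplier, 4))
```

The logarithm keeps the effect modest: 23 reviews versus 12 changes the multiplier from roughly 1.40 to roughly 1.67, so the text relevance of the base query still dominates the ranking.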
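18. Function Score: Decay Functions
Suppose that, instead of boosting incrementally by the value of a field, you have an ideal value in mind and want the score to decay the further a document's value is from it. Typical cases are price ranges, numeric field ranges, and date ranges. In our example, we search for books on "search engines" published around June 2014.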
GET bookdb_index/book/_search
{
"query": {
"function_score": {
"query": {
"multi_match": {
"query": "search engine",
"fields": ["title", "summary"]
}
},
"functions": [
{
"exp": {
"publish_date": {
"origin": "2014-06-15",
"scale": "30d",
"offset": "7d"
}
}
}
],
"boost_mode": "replace"
}
},
"_source": ["title", "summary", "publish_date", "num_reviews"]
}
[Results]
"hits": {
"total": 4,
"max_score": 0.22793062,
"hits": [
{
"_index": "bookdb_index",
"_type": "book",
"_id": "4",
"_score": 0.22793062,
"_source": {
"summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
"num_reviews": 23,
"title": "Solr in Action",
"publish_date": "2014-04-05"
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "1",
"_score": 0.0049215667,
"_source": {
"summary": "A distibuted real-time search and analytics engine",
"num_reviews": 20,
"title": "Elasticsearch: The Definitive Guide",
"publish_date": "2015-02-07"
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "2",
"_score": 0.000009612435,
"_source": {
"summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization",
"num_reviews": 12,
"title": "Taming Text: How to Find, Organize, and Manipulate It",
"publish_date": "2013-01-24"
}
},
{
"_index": "bookdb_index",
"_type": "book",
"_id": "3",
"_score": 0.0000049185574,
"_source": {
"summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
"num_reviews": 18,
"title": "Elasticsearch in Action",
"publish_date": "2015-12-03"
}
}
]
}
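The scores above can be reproduced by hand. The sketch below assumes the exponential decay formula as I understand it from the Elasticsearch documentation: score = exp(lambda * max(0, |value - origin| - offset)) with lambda = ln(decay) / scale and a default decay of 0.5. Because boost_mode is replace, the decay value is the final score. The function name and signature are illustrative only.

```python
import math
from datetime import date


def exp_decay(value, origin, scale_days, offset_days=0, decay=0.5):
    """Exponential decay score for a date field (assumed formula)."""
    distance = max(0, abs((value - origin).days) - offset_days)
    lam = math.log(decay) / scale_days
    return math.exp(lam * distance)


origin = date(2014, 6, 15)
publish_dates = {
    4: date(2014, 4, 5),
    1: date(2015, 2, 7),
    2: date(2013, 1, 24),
    3: date(2015, 12, 3),
}
for doc_id, published in publish_dates.items():
    score = exp_decay(published, origin, scale_days=30, offset_days=7)
    print(doc_id, score)
```

Document 4 is 71 days from the origin; after the 7-day offset that leaves 64 days, a little over two 30-day scales, so its score lands a little under 0.25, in line with the 0.2279 listed above.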
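19. Function Score: Script Scoring
When the built-in scoring functions do not meet your needs, you can specify a Groovy script to use for scoring.
In our example, we want a script that takes publish_date into account before deciding how much weight to give the number of reviews. Newer books may not have many reviews yet, so they should not be penalized for that.
The scoring script looks like this: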
publish_date = doc['publish_date'].value
num_reviews = doc['num_reviews'].value
if (publish_date > Date.parse('yyyy-MM-dd', threshold).getTime()) {
my_score = Math.log(2.5 + num_reviews)
} else {
my_score = Math.log(1 + num_reviews)
}
return my_score
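For readers who want to try the logic without an Elasticsearch node, here is the same rule as a Python sketch. It is a re-expression for illustration, not something Elasticsearch can run; the function name is made up, and Math.log is taken to be the natural logarithm.

```python
import math
from datetime import date


def script_score(publish_date, num_reviews, threshold):
    """Books newer than the threshold get a small head start on reviews."""
    if publish_date > threshold:
        return math.log(2.5 + num_reviews)
    return math.log(1 + num_reviews)


threshold = date(2015, 7, 30)
print(script_score(date(2015, 12, 3), 18, threshold))  # newer book
print(script_score(date(2015, 2, 7), 20, threshold))   # older book
```

With this threshold only "Elasticsearch in Action" (published 2015-12-03) counts as new, so it is scored with log(2.5 + 18) instead of log(1 + 18).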
To use the scoring script dynamically, we use the script_score parameter:
GET /bookdb_index/book/_search
{
"query": {
"function_score": {
"query": {
"multi_match": {
"query": "search engine",
"fields": ["title","summary"]
}
},
"functions": [
{
"script_score": {
"script": {
"params": {
"threshold": "2015-07-30"
},
"lang": "groovy",
"source": "publish_date = doc['publish_date'].value; num_reviews = doc['num_reviews'].value; if (publish_date > Date.parse('yyyy-MM-dd', threshold).getTime()) { return log(2.5 + num_reviews) }; return log(1 + num_reviews);"
}
}
}
]
}
},
"_source": ["title","summary","publish_date", "num_reviews"]
}
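Note 1: to use dynamic scripting, it must be enabled for your Elasticsearch instance in the config/elasticsearch.yml file. It is also possible to use scripts already stored on the Elasticsearch server. See the Elasticsearch reference docs for more information.
Note 2: JSON cannot contain embedded newline characters, so semicolons are used to separate statements.
Original author: Tim Ojo, Aug. 05, 16 · Big Data Zone
Original article: https://dzone.com/articles/23-useful-elasticsearch-example-queries
Note: how can Groovy scripts be enabled on ES 6.3? The following configuration did not work for me (Groovy scripting was removed in Elasticsearch 6.0, where Painless is the default scripting language, so the script above would need to be rewritten in Painless):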
script.allowed_types: inline & script.allowed_contexts: search, update
Java API implementation
A Java API implementation of the queries above is available at https://github.com/whirlys/elastic-example/tree/master/UsefullESSearchSkill
References:
銘毅天下: [Translation] The 23 Most Useful Elasticsearch Search Techniques You Must Know
English original: 23 Useful Elasticsearch Example Queries
For more, visit my personal blog: http://laijianfeng.org/