URI Search
curl -XGET 'http://10.213.10.30:10920/megacorp/employee/_search?pretty&q=last_name:Smith'
######Response
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 0.2876821,
"hits" : [
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "2",
"_score" : 0.2876821,
"_source" : {
"first_name" : "John",
"last_name" : "Smith",
"age" : 25,
"about" : "I love to go rock climbing",
"interests" : [
"sports",
"music"
]
}
},
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "1",
"_score" : 0.2876821,
"_source" : {
"first_name" : "Jane",
"last_name" : "Smith",
"age" : 32,
"about" : "I like to collect rock albums",
"interests" : [
"music"
]
}
}
]
}
}
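A minimal sketch of how the URI search URL above could be assembled programmatically. The host, index, and type are copied from the curl example; the helper name `uri_search_url` is hypothetical and only builds the string, it does not send a request.

```python
from urllib.parse import urlencode

# Host and port copied from the curl example; adjust for your own cluster.
BASE_URL = "http://10.213.10.30:10920"


def uri_search_url(index, doc_type, field, value, pretty=True):
    """Build a URI-search URL of the form /index/type/_search?q=field:value."""
    params = {"q": f"{field}:{value}"}
    if pretty:
        params["pretty"] = "true"
    # urlencode percent-encodes the ':' in the q parameter.
    return f"{BASE_URL}/{index}/{doc_type}/_search?{urlencode(params)}"


url = uri_search_url("megacorp", "employee", "last_name", "Smith")
print(url)
```

Percent-encoding the query string this way avoids shell-quoting problems when the search value contains spaces or special characters.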
Request Body Search
curl -XGET 'http://10.213.10.30:10920/megacorp/employee/_search?pretty' -d '
{
"query": {
"bool": {
"must": [
{ "match" : { "last_name" : "Smith" } }
],
"filter": [
{ "range" : { "age" : { "gt" : 10} } }
]
}
}
}
'
######Response
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 0.2876821,
"hits" : [
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "2",
"_score" : 0.2876821,
"_source" : {
"first_name" : "John",
"last_name" : "Smith",
"age" : 25,
"about" : "I love to go rock climbing",
"interests" : [
"sports",
"music"
]
}
},
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "1",
"_score" : 0.2876821,
"_source" : {
"first_name" : "Jane",
"last_name" : "Smith",
"age" : 32,
"about" : "I like to collect rock albums",
"interests" : [
"music"
]
}
}
]
}
}
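The request body above can also be built as a plain data structure before being serialized. This is only a sketch: `bool_query` is a hypothetical helper, not part of any Elasticsearch client. Clauses under `filter` restrict the result set without contributing to `_score`, which is consistent with this response showing the same `_score` as the URI search.

```python
import json


def bool_query(must, filters):
    """Wrap scoring clauses (must) and non-scoring clauses (filter) in a bool query."""
    return {"query": {"bool": {"must": must, "filter": filters}}}


body = bool_query(
    must=[{"match": {"last_name": "Smith"}}],      # contributes to _score
    filters=[{"range": {"age": {"gt": 10}}}],      # only includes or excludes documents
)

# This string is what would be passed to curl with -d.
payload = json.dumps(body, indent=2)
print(payload)
```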
>* Pagination
The **from** parameter defines the **offset** from the first result you want to fetch. The **size** parameter allows you to configure the **maximum amount** of hits to be returned.
Though **from** and **size** can be set as request parameters, they can also be set within the search body. **from** defaults to 0, and **size** defaults to 10.
curl -XGET 'http://10.213.10.30:10920/megacorp/employee/_search?pretty' -d '
{
"from" : 0, "size" : 1,
"query": {
"bool": {
"must": [
{ "match" : { "last_name" : "Smith" } }
],
"filter": [
{ "range" : { "age" : { "gt" : 10} } }
]
}
}
}
'
######Response
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 0.2876821,
"hits" : [
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "2",
"_score" : 0.2876821,
"_source" : {
"first_name" : "John",
"last_name" : "Smith",
"age" : 25,
"about" : "I love to go rock climbing",
"interests" : [
"sports",
"music"
]
}
}
]
}
}
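To turn a page number into **from** and **size**, the offset is simply `(page - 1) * page_size`. A small sketch, assuming 1-based page numbers; `page_params` is a hypothetical helper name.

```python
def page_params(page, page_size=10):
    """Translate a 1-based page number into Elasticsearch from/size values."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {"from": (page - 1) * page_size, "size": page_size}


# Page 1 with one hit per page corresponds to the request above.
first = page_params(1, page_size=1)
second = page_params(2, page_size=1)
print(first, second)
```

The returned dictionary can be merged into the search body next to `query`.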
>* Sort
The **sort** parameter lets you add one or more sorts on specific fields, and each sort can be reversed. Sorting is defined per field; the special field name **_score** sorts by relevance score, and **_doc** sorts by index order.
[More usage](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-sort.html)
curl -XGET 'http://10.213.10.30:10920/megacorp/employee/_search?pretty' -d '
{
"sort" : [
{ "age" : {"order" : "asc"}},
"_score"
],
"query": {
"bool": {
"must": [
{ "match" : {
"last_name" : "Smith"
}}
],
"filter": [
{ "range" : {
"age" : { "gt" : 10 }
}
}
]
}
}
}
'
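The original shows no response for this request, so the following is only a local emulation of the ordering it asks for, not real cluster output. It assumes the two Smith documents from the earlier responses and the usual defaults: ascending for `age` as requested, descending for `_score`.

```python
# Ages and scores taken from the earlier responses for the two Smith documents.
hits = [
    {"_id": "1", "_score": 0.2876821, "age": 32},
    {"_id": "2", "_score": 0.2876821, "age": 25},
]

# Primary key: age ascending. Secondary key: _score descending (hence the minus sign).
ordered = sorted(hits, key=lambda hit: (hit["age"], -hit["_score"]))
ordered_ids = [hit["_id"] for hit in ordered]
print(ordered_ids)
```

Under these assumptions John (age 25) sorts ahead of Jane (age 32); `_score` only matters when two documents have the same age.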
##Full-Text Search
Search for all employees who enjoy rock climbing:
>
curl -XGET 'http://10.213.10.30:10920/megacorp/employee/_search?pretty' -d '
{
"query" : {
"match" : {
"about" : "rock climbing"
}
}
}
'
As you can see, we use the same match query as before to search the about field for rock climbing. We get back two matching documents:
######Response
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 0.53484553,
"hits" : [
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "2",
"_score" : 0.53484553,
"_source" : {
"first_name" : "John",
"last_name" : "Smith",
"age" : 25,
"about" : "I love to go rock climbing",
"interests" : [
"sports",
"music"
]
}
},
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "1",
"_score" : 0.26742277,
"_source" : {
"first_name" : "Jane",
"last_name" : "Smith",
"age" : 32,
"about" : "I like to collect rock albums",
"interests" : [
"music"
]
}
}
]
}
}
By default, Elasticsearch sorts matching results by relevance. In the first result, John Smith's about field clearly mentions rock climbing. Jane Smith's about field mentions rock but not climbing, so her _score is lower than John's.
This example shows how Elasticsearch performs full-text search. Relevance is a central concept in Elasticsearch, and it is the biggest difference from how a traditional database returns matching records.
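* Phrase Search
Finding individual words in a field is useful, but sometimes you need to match an exact phrase. For example, we may want only the employees whose about field contains the phrase rock climbing, with the two words next to each other.
To do this, we change the match query to a match_phrase query: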
curl -XGET 'http://10.213.10.30:10920/megacorp/employee/_search?pretty' -d '
{
"query" : {
"match_phrase" : {
"about" : "rock climbing"
}
}
}
'
######Response
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 0.53484553,
"hits" : [
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "2",
"_score" : 0.53484553,
"_source" : {
"first_name" : "John",
"last_name" : "Smith",
"age" : 25,
"about" : "I love to go rock climbing",
"interests" : [
"sports",
"music"
]
}
}
]
}
}
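The difference from a plain match can be sketched locally: a phrase only matches when the query terms appear consecutively and in order. The tokenizer here is a simplifying assumption (lowercase, split on whitespace), not the analyzer Elasticsearch actually uses.

```python
def tokens(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def matches_phrase(text, phrase):
    """Return True if the phrase tokens occur consecutively, in order, in the text."""
    doc, wanted = tokens(text), tokens(phrase)
    return any(
        doc[i:i + len(wanted)] == wanted
        for i in range(len(doc) - len(wanted) + 1)
    )


john = matches_phrase("I love to go rock climbing", "rock climbing")
jane = matches_phrase("I like to collect rock albums", "rock climbing")
print(john, jane)  # True False
```

Only John's document contains both words side by side, which matches the single hit in the response.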
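* Highlighting Our Searches
Many applications like to highlight the matched keywords in each search result so the user can see why the document matched the query. Retrieving highlighted fragments is easy in Elasticsearch.
Let's rerun the previous query, this time adding a highlight parameter: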
curl -XGET 'http://10.213.10.30:10920/megacorp/employee/_search?pretty' -d '
{
"query" : {
"match_phrase" : {
"about" : "rock climbing"
}
},
"highlight": {
"fields" : {
"about" : {}
}
}
}
'
When we run this query, the same hit is returned as before, but we now get a new section in the response called highlight. It contains the text from the about field with the matching words wrapped in <em></em> HTML tags:
######Response
{
"took" : 30,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 0.53484553,
"hits" : [
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "2",
"_score" : 0.53484553,
"_source" : {
"first_name" : "John",
"last_name" : "Smith",
"age" : 25,
"about" : "I love to go rock climbing",
"interests" : [
"sports",
"music"
]
},
"highlight" : {
"about" : [
"I love to go <em>rock</em> <em>climbing</em>"
]
}
}
]
}
}
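The shape of the highlight fragment can be reproduced with a small local sketch. It is a simplified stand-in for the server-side highlighter, assuming whole-word, case-insensitive matching; `highlight` is a hypothetical helper name.

```python
import re


def highlight(text, query, pre="<em>", post="</em>"):
    """Wrap every word of the text that equals a query term in pre/post tags."""
    terms = {term.lower() for term in query.split()}

    def wrap(match):
        word = match.group(0)
        return f"{pre}{word}{post}" if word.lower() in terms else word

    return re.sub(r"\w+", wrap, text)


fragment = highlight("I love to go rock climbing", "rock climbing")
print(fragment)  # I love to go <em>rock</em> <em>climbing</em>
```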
##Analytics
Elasticsearch calls this feature aggregations. It lets you compute sophisticated statistics over your data. It is similar to GROUP BY in SQL, but much more powerful.
For example, let's find the most popular interests among our employees:
curl -XGET 'http://10.213.10.30:10920/megacorp/employee/_search?pretty' -d '
{
"aggs": {
"all_interests": {
"terms": { "field": "interests" }
}
}
}
'
You may get the following error:
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [interests] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
}
],
"type" : "search_phase_execution_exception",
"reason" : "all shards failed",
"phase" : "query",
"grouped" : true,
"failed_shards" : [
{
"shard" : 0,
"index" : "megacorp",
"node" : "qm6aUUoUScO_S16Sod_7Bw",
"reason" : {
"type" : "illegal_argument_exception",
"reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [interests] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
}
}
],
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [interests] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
}
},
"status" : 400
}
To fix this, set fielddata=true on the field; it is disabled by default.
[Solution](https://www.elastic.co/guide/en/elasticsearch/reference/current/fielddata.html#_fielddata_is_disabled_on_literal_text_literal_fields_by_default)
curl -XPUT 'http://10.213.10.30:10920/megacorp/_mapping/employee?pretty' -d'
{
"properties": {
"interests": {
"type": "text",
"fielddata": true
}
}
}
'
######Response
{
"acknowledged" : true
}
Now let's look again for the most popular interests among our employees:
curl -XGET 'http://10.213.10.30:10920/megacorp/employee/_search?pretty' -d '
{
"aggs": {
"all_interests": {
"terms": { "field": "interests" }
}
}
}
'
######Response
{
"took" : 19,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 3,
"max_score" : 1.0,
"hits" : [
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"first_name" : "John",
"last_name" : "Smith",
"age" : 25,
"about" : "I love to go rock climbing",
"interests" : [
"sports",
"music"
]
}
},
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"first_name" : "Jane",
"last_name" : "Smith",
"age" : 32,
"about" : "I like to collect rock albums",
"interests" : [
"music"
]
}
},
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"first_name" : "Douglas",
"last_name" : "Fir",
"age" : 35,
"about" : "I like to build cabinets",
"interests" : [
"forestry"
]
}
}
]
},
"aggregations" : {
"all_interests" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "music",
"doc_count" : 2
},
{
"key" : "forestry",
"doc_count" : 1
},
{
"key" : "sports",
"doc_count" : 1
}
]
}
}
}
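What the terms aggregation computes can be emulated locally with a counter: for each distinct value of the field, count the documents that contain it. This is a sketch over the three employee documents from the response; `terms_buckets` is a hypothetical helper, and the tie-breaking by key is an assumption chosen to mirror the bucket order shown.

```python
from collections import Counter

# interests copied from the three documents in the response.
employees = [
    {"first_name": "John", "interests": ["sports", "music"]},
    {"first_name": "Jane", "interests": ["music"]},
    {"first_name": "Douglas", "interests": ["forestry"]},
]


def terms_buckets(docs, field):
    """Count documents per distinct field value, most frequent first."""
    counts = Counter()
    for doc in docs:
        # set() ensures a document is counted once per value.
        counts.update(set(doc.get(field, [])))
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [{"key": key, "doc_count": count} for key, count in ranked]


buckets = terms_buckets(employees, "interests")
print(buckets)
```

The result has music with 2 documents, followed by forestry and sports with 1 each, the same buckets as in the `aggregations` section of the response.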
References:
http://www.cnblogs.com/muniaofeiyu/p/5616316.html
http://blog.csdn.net/ty_0930/article/details/52266611