Leyou Mall Study Notes 11 - Elasticsearch (Part 3)

title: Leyou Mall Study Notes 11 - Elasticsearch (Part 3)
date: 2019-04-18 09:49:18
tags:
- Leyou Mall
- java
- springboot
- Elasticsearch
categories:
- Leyou Mall

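3.1. Basic queries

Basic syntax

GET /<index_name>/_search
{
    "query":{
        "<query_type>":{
            "<query_field>":"<query_value>"
        }
    }
}

Here, query represents a query object, which can carry different query properties.

Query type:
- for example match_all, match, term, range, and so on
Query condition:
- the syntax varies with the query type and is explained in detail below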

3.1.1 Match all (match_all)

Example:

GET /smallmartial/_search
{
    "query":{
        "match_all": {}
    }
}

query: the query object
match_all: matches all documents

Result:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "smallmartial",
        "_type": "goods",
        "_id": "2",
        "_score": 1,
        "_source": {
          "title": "大米手機",
          "images": "http://image.leyou.com/12479122.jpg",
          "price": 2899
        }
      },
      {
        "_index": "smallmartial",
        "_type": "goods",
        "_id": "vV5xK2oBwnpoSx5Aac1y",
        "_score": 1,
        "_source": {
          "title": "小米手機",
          "images": "http://image.leyou.com/12479122.jpg",
          "price": 2699
        }
      }
    ]
  }
}

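took: time spent on the query, in milliseconds
timed_out: whether the query timed out
_shards: shard information
hits: overview object of the search results
- total: total number of documents found
- max_score: the highest document score among all results
- hits: array of matched documents; each element holds the information of one document
  - _index: the index
  - _type: the document type
  - _id: the document id
  - _score: the document score
  - _source: the document's source data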
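
To make the response layout concrete, here is a minimal Python sketch that walks the structure described above. It assumes the response has already been received as JSON text; the sample is a trimmed copy of the match_all response (only the price field of _source is kept), and the helper name extract_documents is made up for illustration.

```python
import json

# Trimmed copy of the match_all response shown above (assumption: only the
# price field of _source is kept to keep the sample short).
response_text = """
{
  "took": 3,
  "timed_out": false,
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {"_index": "smallmartial", "_type": "goods", "_id": "2",
       "_score": 1, "_source": {"price": 2899}},
      {"_index": "smallmartial", "_type": "goods", "_id": "vV5xK2oBwnpoSx5Aac1y",
       "_score": 1, "_source": {"price": 2699}}
    ]
  }
}
"""


def extract_documents(response):
    """Return (id, score, source) tuples from a parsed search response."""
    return [(hit["_id"], hit["_score"], hit["_source"])
            for hit in response["hits"]["hits"]]


response = json.loads(response_text)
documents = extract_documents(response)
for doc_id, score, source in documents:
    print(doc_id, score, source["price"])
```

The outer hits object carries the summary (total, max_score), while the inner hits array carries one entry per matched document.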

3.1.2 Match query (match)

First, we add one more document for testing:

PUT /smallmartial/goods/3
{
    "title":"小米電視4A",
    "images":"http://image.leyou.com/12479122.jpg",
    "price":3899.00
}

Result:

{
  "_index": "smallmartial",
  "_type": "goods",
  "_id": "3",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 3,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 4,
  "_primary_term": 1
}

The index now contains 2 phones and 1 TV.

or relationship

A match query analyzes the query text into terms and then searches for them; multiple terms are combined with or:

GET /smallmartial/_search
{
    "query":{
        "match":{
            "title":"小米電視"
        }
    }
}

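In the example above, the query finds not only the TV but every document related to 小米, because the terms are combined with or.

and relationship

In some cases we need a more precise search and want the terms combined with and. This can be done as follows:

GET /smallmartial/_search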
{
    "query":{
        "match":{
            "title":{"query":"小米電視","operator":"and"}
        }
    }
}

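In this example, only documents whose title contains both 小米 and 電視 are found.

Between or and and?

Choosing strictly between or and and is too black-and-white. Suppose the user's input is analyzed into 5 query terms and we want to find documents that contain only 4 of them. Setting the operator parameter to and would exclude such documents.

Sometimes that is exactly what we want, but in most full-text search scenarios we want to include documents that may be relevant while excluding those that are less relevant. In other words, we want something in between.

The match query supports the minimum_should_match parameter, which specifies how many terms must match for a document to be considered relevant. It can be set to a specific number, but more commonly it is set to a percentage, because we cannot control how many words the user types: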

GET /smallmartial/_search
{
    "query":{
        "match":{
            "title":{
                "query":"小米曲面電視",
                "minimum_should_match": "75%"
            }
        }
    }
}

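In this example the search text is split into 3 terms. With the and operator, a document must contain all 3 terms to be found. Here we use a minimum match of 75%, meaning a document only needs to match 75% of the total number of terms. 3 × 75% = 2.25, which is rounded down to 2, so a document containing any 2 of the terms satisfies the condition.

Result: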

{
  "took": 32,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.77041245,
    "hits": [
      {
        "_index": "smallmartial",
        "_type": "goods",
        "_id": "3",
        "_score": 0.77041245,
        "_source": {
          "title": "小米電視4A",
          "images": "http://image.leyou.com/12479122.jpg",
          "price": 3899
        }
      }
    ]
  }
}
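
The 75% arithmetic described above can be sketched in Python. This is only an illustration: the term lists are a hypothetical tokenization (the real split depends on the analyzer configured for the field), and required_matches and is_match are made-up helper names, not Elasticsearch APIs.

```python
def required_matches(term_count, minimum_should_match):
    """Terms that must match for a positive percentage such as "75%".

    The computed value is rounded down, so 3 terms at 75% gives 2.
    """
    percent = int(minimum_should_match.rstrip("%"))
    return term_count * percent // 100


def is_match(query_terms, doc_terms, minimum_should_match):
    """True if enough query terms occur in the document's terms."""
    matched = sum(1 for term in query_terms if term in doc_terms)
    return matched >= required_matches(len(query_terms), minimum_should_match)


# Hypothetical tokenization of the query and of two document titles.
query_terms = ["小米", "曲面", "電視"]
tv_terms = {"小米", "電視", "4a"}
phone_terms = {"小米", "手機"}

print(required_matches(3, "75%"))
print(is_match(query_terms, tv_terms, "75%"))
print(is_match(query_terms, phone_terms, "75%"))
```

Under these assumptions only the TV document has two matching terms, which is consistent with the single hit in the result above.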

3.1.3 Multi-field query (multi_match)

multi_match is similar to match, except that it can search across multiple fields

GET /smallmartial/_search
{
    "query":{
        "multi_match": {
            "query":    "小米",
            "fields":   [ "title", "subTitle" ]
        }
    }
}

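In this example, we search for the term 小米 in both the title and subTitle fields.

3.1.4 Term query (term)

The term query is used for exact-value matching. The exact values may be numbers, dates, booleans, or strings that are not analyzed.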

GET /smallmartial/_search
{
    "query":{
        "term":{
            "price":2699.00
        }
    }
}

Result:

{
  "took": 15,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "smallmartial",
        "_type": "goods",
        "_id": "vV5xK2oBwnpoSx5Aac1y",
        "_score": 1,
        "_source": {
          "title": "小米手機",
          "images": "http://image.leyou.com/12479122.jpg",
          "price": 2699
        }
      }
    ]
  }
}

3.1.5 Multi-term exact match (terms)

The terms query works like the term query but lets you specify multiple values. A document matches if the field contains any one of the given values:

GET /smallmartial/_search
{
    "query":{
        "terms":{
            "price":[2699.00,2899.00,3899.00]
        }
    }
}

Result:

{
  "took": 14,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "smallmartial",
        "_type": "goods",
        "_id": "2",
        "_score": 1,
        "_source": {
          "title": "大米手機",
          "images": "http://image.leyou.com/12479122.jpg",
          "price": 2899
        }
      },
      {
        "_index": "smallmartial",
        "_type": "goods",
        "_id": "vV5xK2oBwnpoSx5Aac1y",
        "_score": 1,
        "_source": {
          "title": "小米手機",
          "images": "http://image.leyou.com/12479122.jpg",
          "price": 2699
        }
      },
      {
        "_index": "smallmartial",
        "_type": "goods",
        "_id": "3",
        "_score": 1,
        "_source": {
          "title": "小米電視4A",
          "images": "http://image.leyou.com/12479122.jpg",
          "price": 3899
        }
      }
    ]
  }
}

3.2. Result filtering

By default, Elasticsearch returns every field stored in _source for each document in the search results.

To retrieve only some of the fields, add a _source filter.

3.2.1. Specifying fields directly

Example:

GET /smallmartial/_search
{
  "_source": ["title","price"],
  "query": {
    "term": {
      "price": 2699
    }
  }
}

Returned result:

{
  "took": 13,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "smallmartial",
        "_type": "goods",
        "_id": "vV5xK2oBwnpoSx5Aac1y",
        "_score": 1,
        "_source": {
          "price": 2699,
          "title": "小米手機"
        }
      }
    ]
  }
}

3.2.2. Specifying includes and excludes

We can also use:

includes: the fields to return
excludes: the fields to leave out

Both are optional.

Example:

GET /smallmartial/_search
{
  "_source": {
    "includes":["title","price"]
  },
  "query": {
    "term": {
      "price": 2699
    }
  }
}

The result is the same as before:

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "smallmartial",
        "_type": "goods",
        "_id": "vV5xK2oBwnpoSx5Aac1y",
        "_score": 1,
        "_source": {
          "price": 2699,
          "title": "小米手機"
        }
      }
    ]
  }
}
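
As a rough mental model of includes and excludes, the following Python sketch applies both to a single document. It is a simplification that handles only exact top-level field names; filter_source is a made-up helper and the sample title is a placeholder.

```python
def filter_source(source, includes=None, excludes=None):
    """Keep the fields listed in includes (all fields if omitted),
    then drop any field listed in excludes.

    Simplification: only exact top-level field names are handled.
    """
    kept = {name: value for name, value in source.items()
            if includes is None or name in includes}
    return {name: value for name, value in kept.items()
            if not excludes or name not in excludes}


# Placeholder document with the same fields as the goods documents above.
source = {
    "title": "sample phone",
    "images": "http://image.leyou.com/12479122.jpg",
    "price": 2699,
}

print(filter_source(source, includes=["title", "price"]))
print(filter_source(source, excludes=["images"]))
```

For this document, listing title and price in includes gives the same fields as listing images in excludes.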

3.3 Advanced queries

3.3.1 Boolean combination (bool)

A bool query combines other queries through must (and), must_not (not), and should (or).

GET /smallmartial/_search
{
    "query":{
        "bool":{
            "must":     { "match": { "title": "大米" }},
            "must_not": { "match": { "title":  "電視" }},
            "should":   { "match": { "title": "手機" }}
        }
    }
}

Result:

{
  "took": 22,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.5753642,
    "hits": [
      {
        "_index": "smallmartial",
        "_type": "goods",
        "_id": "2",
        "_score": 0.5753642,
        "_source": {
          "title": "大米手機",
          "images": "http://image.leyou.com/12479122.jpg",
          "price": 2899
        }
      }
    ]
  }
}
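
The way must, must_not, and should combine can be sketched in plain Python. This is a toy model under stated assumptions: contains is a hypothetical stand-in for a match clause (a substring test instead of analyzed terms), and the score simply counts matching clauses, which is not how Elasticsearch computes relevance.

```python
def contains(field, text):
    """Hypothetical stand-in for a match clause: a plain substring test."""
    return lambda doc: text in doc[field]


def bool_query(docs, must=(), must_not=(), should=()):
    """Return (score, doc) pairs, highest score first."""
    results = []
    for doc in docs:
        if not all(clause(doc) for clause in must):
            continue  # every must clause is required
        if any(clause(doc) for clause in must_not):
            continue  # any must_not match excludes the document
        # Toy score: one point per matching must or should clause.
        score = sum(1 for clause in (*must, *should) if clause(doc))
        results.append((score, doc))
    return sorted(results, key=lambda pair: pair[0], reverse=True)


docs = [
    {"_id": "2", "title": "大米手機"},
    {"_id": "vV5xK2oBwnpoSx5Aac1y", "title": "小米手機"},
    {"_id": "3", "title": "小米電視4A"},
]
hits = bool_query(
    docs,
    must=[contains("title", "大米")],
    must_not=[contains("title", "電視")],
    should=[contains("title", "手機")],
)
print([(score, doc["_id"]) for score, doc in hits])
```

When a must clause is present, should clauses do not decide whether a document matches; in this sketch they only raise its score.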

3.3.2 Range query (range)

The range query finds numbers or dates that fall within a specified interval.

GET /smallmartial/_search
{
    "query":{
        "range": {
            "price": {
                "gte":  1000.0,
                "lt":   2800.00
            }
        }
    }
}

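The range query accepts the following operators:

Operator	Description
gt	greater than
gte	greater than or equal to
lt	less than
lte	less than or equal to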
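
The four operators map directly onto ordinary comparisons. The sketch below, with a made-up in_range helper, applies the bounds from the range query above to the three prices in the sample index.

```python
import operator

# Each range operator corresponds to one comparison function.
RANGE_OPERATORS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def in_range(value, bounds):
    """True if value satisfies every bound, e.g. {"gte": 1000, "lt": 2800}."""
    return all(RANGE_OPERATORS[name](value, bound)
               for name, bound in bounds.items())


prices = [2899, 2699, 3899]
bounds = {"gte": 1000.0, "lt": 2800.00}
matching = [price for price in prices if in_range(price, bounds)]
print(matching)
```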

3.3.3 Fuzzy query (fuzzy)

Let's add a new product:

POST /smallmartial/goods/4
{
    "title":"apple手機",
    "images":"http://image.leyou.com/12479122.jpg",
    "price":6899.00
}

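The fuzzy query is the fuzzy equivalent of the term query. It tolerates spelling differences between the search term and the indexed term, as long as the edit distance does not exceed 2: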

GET /smallmartial/_search
{
  "query": {
    "fuzzy": {
      "title": "appla"
    }
  }
}

The query above also finds the apple phone.

The allowed edit distance can be set with fuzziness:

GET /smallmartial/_search
{
  "query": {
    "fuzzy": {
        "title": {
            "value":"appla",
            "fuzziness":1
        }
    }
  }
}

Result:

{
  "took": 37,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.55451775,
    "hits": [
      {
        "_index": "smallmartial",
        "_type": "goods",
        "_id": "4",
        "_score": 0.55451775,
        "_source": {
          "title": "apple手機",
          "images": "http://image.leyou.com/12479122.jpg",
          "price": 6899
        }
      }
    ]
  }
}
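
To see why appla still matches apple with fuzziness set to 1, here is a small Python sketch of edit distance. It uses plain Levenshtein distance (insertions, deletions, substitutions), which is an assumption made for illustration; the exact metric Elasticsearch applies may differ, for example in how it counts swapped adjacent characters.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def fuzzy_match(query_term, indexed_term, fuzziness=2):
    """True if the two terms are within the allowed edit distance."""
    return edit_distance(query_term, indexed_term) <= fuzziness


print(edit_distance("appla", "apple"))
print(fuzzy_match("appla", "apple", fuzziness=1))
```

A single substitution (a to e) turns appla into apple, so the distance is 1 and the document is found even with the stricter setting.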

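3.4 Filtering (filter)

Filtering within a query

Every query clause affects document scoring and ranking. If we need to filter the results without letting the filter condition affect the score, we should not express it as a query clause; use filter instead:

GET /smallmartial/_search
{
    "query":{
        "bool":{
            "must":{ "match": { "title": "小米手機" }},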
            "filter":{
                "range":{"price":{"gt":2000.00,"lt":3800.00}}
            }
        }
    }
}

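Note: a filter can itself contain a bool query to combine filter conditions.

Filtering without a query

If a search consists only of filters, with no query clause and no need for scoring, we can use constant_score in place of a bool query that contains only a filter clause. Performance is identical, but the query becomes simpler and clearer.

GET /smallmartial/_search
{
    "query":{
        "constant_score":   {
            "filter": {
                 "range":{"price":{"gt":2000.00,"lt":3000.00}}
            }
        }
    }
}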

3.5 Sorting

3.5.1 Sorting by a single field

sort lets us order results by different fields, with order specifying the sort direction.

GET /smallmartial/_search
{
  "query": {
    "match": {
      "title": "小米手機"
    }
  },
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    }
  ]
}

3.5.2 Sorting by multiple fields

Suppose we want to sort by both price and _score (relevance), ordering matches first by price and then by relevance score:

GET /smallmartial/_search
{
    "query":{
        "bool":{
            "must":{ "match": { "title": "小米手機" }},
            "filter":{
                "range":{"price":{"gt":200000,"lt":300000}}
            }
        }
    },
    "sort": [
      { "price": { "order": "desc" }},
      { "_score": { "order": "desc" }}
    ]
}
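
The ordering produced by a two-level sort can be illustrated in Python. The hits below are hypothetical values invented for the example, not output from the index above.

```python
# Hypothetical hits: two share a price, so _score decides their order.
hits = [
    {"_id": "a", "_score": 0.5, "price": 2699},
    {"_id": "b", "_score": 0.9, "price": 2699},
    {"_id": "c", "_score": 0.7, "price": 2899},
]

# Sort by price descending first, then by _score descending.
ordered = sorted(hits, key=lambda hit: (-hit["price"], -hit["_score"]))
print([hit["_id"] for hit in ordered])
```

The second sort key only matters when the first one ties, which is why b comes before a while c stays first.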

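4. Aggregations

Aggregations make it very easy to compute statistics and analyze data. For example:

Which phone brand is the most popular?
What are the average, highest, and lowest prices of these phones?
How do these phones sell month by month?

These statistics are much more convenient to implement than with SQL in a database, and queries are very fast, giving near-real-time results.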

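4.1 Basic concepts

Elasticsearch offers many kinds of aggregations. The two most commonly used are buckets and metrics:

Bucket

A bucket groups data in some way; each group is called a bucket in ES. For example, dividing people by nationality yields a China bucket, a UK bucket, a Japan bucket, and so on. We could also divide people by age range: 0-10, 10-20, 20-30, 30-40, etc.

Elasticsearch provides many ways to divide data into buckets:

Date Histogram Aggregation: groups by date intervals; for example, with an interval of one week, each week automatically forms a group
Histogram Aggregation: groups by numeric intervals, similar to the date version
Terms Aggregation: groups by term; documents whose term matches exactly fall into the same group
Range Aggregation: groups by numeric or date ranges; you specify the start and end of each range
...

In short, bucket aggregations only group the data and perform no calculation, so a bucket usually has another kind of aggregation nested inside it: metrics aggregations.

Metrics

Once grouping is done, we usually run an aggregate calculation over the data in each group, such as average, maximum, minimum, or sum. In ES these are called metrics.

Some commonly used metrics aggregations:

Avg Aggregation: average
Max Aggregation: maximum
Min Aggregation: minimum
Percentiles Aggregation: percentiles
Stats Aggregation: returns avg, max, min, sum, count, etc. at once
Sum Aggregation: sum
Top hits Aggregation: the top few documents
Value Count Aggregation: number of values
...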

To try out aggregations, we first bulk-import some data.

Create the index:

PUT /cars
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "transactions": {
      "properties": {
        "color": {
          "type": "keyword"
        },
        "make": {
          "type": "keyword"
        }
      }
    }
  }
}

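Note: in ES, fields used for aggregation, sorting, or filtering are handled in a special way and therefore must not be analyzed. Here the two text fields color and make are mapped as keyword, a type that is not analyzed, so they can take part in aggregations.

Import the data: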

POST /cars/transactions/_bulk
{ "index": {}}
{ "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2014-10-28" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }
{ "index": {}}
{ "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2014-07-02" }
{ "index": {}}
{ "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2014-08-19" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 80000, "color" : "red", "make" : "bmw", "sold" : "2014-01-01" }
{ "index": {}}
{ "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }

4.2 Aggregating into buckets

First, we divide the cars into buckets by color:

GET /cars/_search
{
    "size" : 0,
    "aggs" : { 
        "popular_colors" : { 
            "terms" : { 
              "field" : "color"
            }
        }
    }
}

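size: the number of hits to return; set to 0 here because we only care about the aggregation result, not the matched documents, which improves efficiency
aggs: declares an aggregation; short for aggregations
- popular_colors: a name for this aggregation; any name will do
  - terms: how the buckets are divided, here by term
    - field: the field used to divide the buckets

Result: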

{
  "took": 40,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 8,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "popular_colors": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "red",
          "doc_count": 4
        },
        {
          "key": "blue",
          "doc_count": 2
        },
        {
          "key": "green",
          "doc_count": 2
        }
      ]
    }
  }
}

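hits: the hit list is empty because we set size to 0
aggregations: the aggregation results
popular_colors: the aggregation name we defined
buckets: the buckets found; each distinct value of the color field forms one bucket
- key: the value of the color field for this bucket
- doc_count: the number of documents in this bucket

The aggregation result shows that red cars currently sell best!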
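
Conceptually, the terms aggregation above is a group-and-count over the color field. The Python sketch below reproduces the bucket counts from the imported car data; ordering ties by key is a choice made here so the output lines up with the result shown above.

```python
from collections import Counter

# The color of each of the 8 imported car transactions, in import order.
colors = ["red", "red", "green", "blue", "green", "red", "red", "blue"]

counts = Counter(colors)
# Largest buckets first; ties are ordered by key.
buckets = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
for key, doc_count in buckets:
    print({"key": key, "doc_count": doc_count})
```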

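4.3 Metrics inside buckets

The previous example tells us the number of documents in each bucket, which is useful. But applications usually need more complex metrics. For example, what is the average price of cars of each color?

We therefore need to tell Elasticsearch which field to use and which metric to compute. This information is nested inside the bucket, and the metric is computed over the documents in that bucket.

Now let's add an average-price metric to the previous aggregation: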

GET /cars/_search
{
    "size" : 0,
    "aggs" : { 
        "popular_colors" : { 
            "terms" : { 
              "field" : "color"
            },
            "aggs":{
                "avg_price": { 
                   "avg": {
                      "field": "price" 
                   }
                }
            }
        }
    }
}

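aggs: we add a new aggs inside the previous aggregation (popular_colors); a metric is itself an aggregation
avg_price: the name of the aggregation
avg: the metric type, here the average
field: the field the metric is computed on

Result: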


{
  "took": 35,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 8,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "popular_colors": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "red",
          "doc_count": 4,
          "avg_price": {
            "value": 32500
          }
        },
        {
          "key": "blue",
          "doc_count": 2,
          "avg_price": {
            "value": 20000
          }
        },
        {
          "key": "green",
          "doc_count": 2,
          "avg_price": {
            "value": 21000
          }
        }
      ]
    }
  }
}

As you can see, each bucket has its own avg_price field, which is the result of the metrics aggregation.
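
The per-bucket averages can be checked by hand with a short Python sketch that groups the imported prices by color and averages each group. It only mirrors the arithmetic, not how Elasticsearch executes the aggregation.

```python
from collections import defaultdict

# (price, color) for each of the 8 imported car transactions.
cars = [
    (10000, "red"), (20000, "red"), (30000, "green"), (15000, "blue"),
    (12000, "green"), (20000, "red"), (80000, "red"), (25000, "blue"),
]

# Bucket step: collect the prices of each color.
prices_by_color = defaultdict(list)
for price, color in cars:
    prices_by_color[color].append(price)

# Metric step: average the prices inside each bucket.
avg_price = {color: sum(prices) / len(prices)
             for color, prices in prices_by_color.items()}
print(avg_price)
```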

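4.4 Nesting buckets inside buckets

In the previous example we nested a metric inside a bucket. In fact, a bucket can contain not only metrics but also other buckets; that is, each group can be divided into further groups.

For example, to find out which manufacturers made the cars of each color, we bucket again by the make field: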

GET /cars/_search
{
    "size" : 0,
    "aggs" : { 
        "popular_colors" : { 
            "terms" : { 
              "field" : "color"
            },
            "aggs":{
                "avg_price": { 
                   "avg": {
                      "field": "price" 
                   }
                },
                "maker":{
                    "terms":{
                        "field":"make"
                    }
                }
            }
        }
    }
}

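The original color bucket and avg calculation are unchanged
maker: a new bucket named maker, added under the nested aggs
terms: the bucket type is still terms
field: the buckets are divided by the make field

Partial result: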

...
{"aggregations": {
    "popular_colors": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "red",
          "doc_count": 4,
          "maker": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "honda",
                "doc_count": 3
              },
              {
                "key": "bmw",
                "doc_count": 1
              }
            ]
          },
          "avg_price": {
            "value": 32500
          }
        },
        {
          "key": "blue",
          "doc_count": 2,
          "maker": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "ford",
                "doc_count": 1
              },
              {
                "key": "toyota",
                "doc_count": 1
              }
            ]
          },
          "avg_price": {
            "value": 20000
          }
        },
      {
          "key": "green",
          "doc_count": 2,
          "maker": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "ford",
                "doc_count": 1
              },
              {
                "key": "toyota",
                "doc_count": 1
              }
            ]
          },
          "avg_price": {
            "value": 21000
          }
        }
      ]
    }
  }
}
...

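We can see that the new maker aggregation is nested inside each of the original color buckets.
Within each color, the cars are grouped by the make field.
What we can read from this:
- There are 4 red cars
- The average price of a red car is $32,500
- 3 of them were made by Honda and 1 by BMW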
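
Nested buckets amount to grouping twice. As a sketch, the Python below groups the imported cars by color and then counts manufacturers inside each color, reproducing the counts read off above.

```python
from collections import Counter, defaultdict

# (color, make) for each of the 8 imported car transactions.
cars = [
    ("red", "honda"), ("red", "honda"), ("green", "ford"), ("blue", "toyota"),
    ("green", "toyota"), ("red", "honda"), ("red", "bmw"), ("blue", "ford"),
]

# Outer bucket: color. Inner bucket: make, counted per color.
makers_by_color = defaultdict(Counter)
for color, make in cars:
    makers_by_color[color][make] += 1

for color, makers in makers_by_color.items():
    print(color, dict(makers))
```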

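4.5. Other ways to divide buckets

As mentioned earlier, there are many ways to divide buckets, for example:

Date Histogram Aggregation: groups by date intervals; for example, with an interval of one week, each week automatically forms a group
Histogram Aggregation: groups by numeric intervals, similar to the date version
Terms Aggregation: groups by term; documents whose term matches exactly fall into the same group
Range Aggregation: groups by numeric or date ranges; you specify the start and end of each range

The examples so far used the Terms Aggregation, which divides buckets by term.

Next, let's look at a few other practical ones:

4.5.1. Interval buckets with Histogram

How it works:

histogram groups a numeric field by a fixed step size. You specify an interval value that sets the size of each step.

Example:

Say you have a price field. If you set interval to 200, the steps are:

0, 200, 400, 600, ...

These are the keys of the steps, which are also the starting points of the intervals.

If a product costs 450, which step does it fall into? The formula is:

bucket_key = Math.floor((value - offset) / interval) * interval + offset

value: the value of the current document, 450 in this example

offset: the starting offset, 0 by default

interval: the step size, e.g. 200

So the resulting key = Math.floor((450 - 0) / 200) * 200 + 0 = 400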
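
The bucket key formula above translates directly into Python. The function name bucket_key is taken from the formula; the extra sample values are illustrative.

```python
import math


def bucket_key(value, interval, offset=0):
    """Key (lower bound) of the histogram bucket that value falls into."""
    return math.floor((value - offset) / interval) * interval + offset


print(bucket_key(450, 200))
print(bucket_key(12000, 5000))
print(bucket_key(450, 200, offset=100))
```

With an offset of 100 the steps become 100, 300, 500, ..., so 450 falls into the bucket that starts at 300.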

Let's try it:

For example, we group the cars by price with an interval of 5000:

GET /cars/_search
{
  "size":0,
  "aggs":{
    "price":{
      "histogram": {
        "field": "price",
        "interval": 5000
      }
    }
  }
}

Result:

{
  "took": 21,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 8,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "price": {
      "buckets": [
        {
          "key": 10000,
          "doc_count": 2
        },
        {
          "key": 15000,
          "doc_count": 1
        },
        {
          "key": 20000,
          "doc_count": 2
        },
        {
          "key": 25000,
          "doc_count": 1
        },
        {
          "key": 30000,
          "doc_count": 1
        },
        {
          "key": 35000,
          "doc_count": 0
        },
        {
          "key": 40000,
          "doc_count": 0
        },
        {
          "key": 45000,
          "doc_count": 0
        },
        {
          "key": 50000,
          "doc_count": 0
        },
        {
          "key": 55000,
          "doc_count": 0
        },
        {
          "key": 60000,
          "doc_count": 0
        },
        {
          "key": 65000,
          "doc_count": 0
        },
        {
          "key": 70000,
          "doc_count": 0
        },
        {
          "key": 75000,
          "doc_count": 0
        },
        {
          "key": 80000,
          "doc_count": 1
        }
      ]
    }
  }
}

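You will notice many buckets in the middle with a document count of 0, which looks ugly.

We can add the parameter min_doc_count with a value of 1 to require at least 1 document per bucket, so that buckets with 0 documents are filtered out.

Example: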

GET /cars/_search
{
  "size":0,
  "aggs":{
    "price":{
      "histogram": {
        "field": "price",
        "interval": 5000,
        "min_doc_count": 1
      }
    }
  }
}

Result:

{
  "took": 15,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 8,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "price": {
      "buckets": [
        {
          "key": 10000,
          "doc_count": 2
        },
        {
          "key": 15000,
          "doc_count": 1
        },
        {
          "key": 20000,
          "doc_count": 2
        },
        {
          "key": 25000,
          "doc_count": 1
        },
        {
          "key": 30000,
          "doc_count": 1
        },
        {
          "key": 80000,
          "doc_count": 1
        }
      ]
    }
  }
}
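
The effect of min_doc_count can be reproduced with a small Python sketch over the imported car prices. It assumes integer values and an offset of 0; the histogram function is a made-up helper that fills the gaps between the lowest and highest bucket and then filters by document count.

```python
from collections import Counter

# Prices of the 8 imported car transactions.
prices = [10000, 20000, 30000, 15000, 12000, 20000, 80000, 25000]


def histogram(values, interval, min_doc_count=0):
    """Return (key, doc_count) pairs from the lowest to the highest bucket.

    Empty buckets in between are included unless min_doc_count excludes them.
    Assumes integer values and an offset of 0.
    """
    counts = Counter((value // interval) * interval for value in values)
    keys = range(min(counts), max(counts) + interval, interval)
    return [(key, counts[key]) for key in keys if counts[key] >= min_doc_count]


print(len(histogram(prices, 5000)))
print(histogram(prices, 5000, min_doc_count=1))
```

Without the filter there are 15 buckets from 10000 to 80000, nine of them empty; with min_doc_count set to 1 only the six non-empty buckets remain.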

Perfect!

Rendered as a bar chart in Kibana, the result looks even better:

[Figure: Kibana bar chart of the price histogram; image not available]

[Figure: pie chart; image not available]

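4.5.2. Range buckets with range

Range buckets are similar to interval buckets in that they also group numbers into segments, except that with range you specify the start and end of each group yourself.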
