23 Most Useful Elasticsearch Search Techniques

Preface

This article introduces the 23 most useful Elasticsearch search techniques, with detailed worked examples and a matching Java API implementation, making it practical material for learning and using Elasticsearch.

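Data Preparation

To illustrate the different types of ES searches, we will search a collection of documents with the following fields:

title               book title
authors             authors
summary             summary
publish_date        publication date
num_reviews         number of reviews
publisher           publisher

First, we create a new index and then load the data in one batch with the bulk API.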

# Index settings
PUT /bookdb_index
{ "settings": { "number_of_shards": 1 }}

# Submit the data with bulk
POST /bookdb_index/book/_bulk
{"index":{"_id":1}}
{"title":"Elasticsearch: The Definitive Guide","authors":["clinton gormley","zachary tong"],"summary":"A distibuted real-time search and analytics engine","publish_date":"2015-02-07","num_reviews":20,"publisher":"oreilly"}
{"index":{"_id":2}}
{"title":"Taming Text: How to Find, Organize, and Manipulate It","authors":["grant ingersoll","thomas morton","drew farris"],"summary":"organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization","publish_date":"2013-01-24","num_reviews":12,"publisher":"manning"}
{"index":{"_id":3}}
{"title":"Elasticsearch in Action","authors":["radu gheorge","matthew lee hinman","roy russo"],"summary":"build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms","publish_date":"2015-12-03","num_reviews":18,"publisher":"manning"}
{"index":{"_id":4}}
{"title":"Solr in Action","authors":["trey grainger","timothy potter"],"summary":"Comprehensive guide to implementing a scalable search engine using Apache Solr","publish_date":"2014-04-05","num_reviews":23,"publisher":"manning"}

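Note: the experiments in this article use ES 6.3.0.

1. Basic Match Query

1.1 Full-text search

There are two ways to run a full-text search:

1) Use the search API with the parameters passed as part of the URL

Example: the following runs a full-text search for "guide"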

GET bookdb_index/book/_search?q=guide

[Results]
  "hits": {
    "total": 2,
    "max_score": 1.3278645,
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 1.3278645,
        "_source": {
          "title": "Solr in Action",
          "authors": [
            "trey grainger",
            "timothy potter"
          ],
          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
          "publish_date": "2014-04-05",
          "num_reviews": 23,
          "publisher": "manning"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 1.2871116,
        "_source": {
          "title": "Elasticsearch: The Definitive Guide",
          "authors": [
            "clinton gormley",
            "zachary tong"
          ],
          "summary": "A distibuted real-time search and analytics engine",
          "publish_date": "2015-02-07",
          "num_reviews": 20,
          "publisher": "oreilly"
        }
      }
    ]
  }

2) Use the full ES DSL, with a JSON body as the request body
The result is the same as for method 1).

GET bookdb_index/book/_search
{
  "query": {
    "multi_match": {
      "query": "guide",
      "fields" : ["_all"]
    }
  }
}

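Explanation: the multi_match keyword is used in place of match as a convenient shorthand for running the same query against multiple fields. The fields property specifies which fields to query; in this case we want to query all fields in the document.

Note: ES 6.x does not enable the _all field by default. If fields is not specified, all fields are searched by default, so on 6.x the fields line can be omitted.

1.2 Searching a specific field

Both APIs also let you specify which fields to search.
For example, to search for books with "in action" in the title field (title):

1) URL search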

GET bookdb_index/book/_search?q=title:in action

[Results]
  "hits": {
    "total": 2,
    "max_score": 1.6323128,
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "3",
        "_score": 1.6323128,
        "_source": {
          "title": "Elasticsearch in Action",
          "authors": [
            "radu gheorge",
            "matthew lee hinman",
            "roy russo"
          ],
          "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
          "publish_date": "2015-12-03",
          "num_reviews": 18,
          "publisher": "manning"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 1.6323128,
        "_source": {
          "title": "Solr in Action",
          "authors": [
            "trey grainger",
            "timothy potter"
          ],
          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
          "publish_date": "2014-04-05",
          "num_reviews": 23,
          "publisher": "manning"
        }
      }
    ]
  }

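2) DSL search
The full-body DSL, however, gives you more flexibility for building complex queries (as we will see later) and for specifying how results are returned. In the example below we specify the number of results, the offset (useful for pagination), the document fields to return, and highlighting.

Number of results: size
Offset: from
Fields to return: _source
Highlighting: highlight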

GET bookdb_index/book/_search
{
  "query": {
    "match": {
      "title": "in action"
    }
  },
  "size": 2,
  "from": 0,
  "_source": ["title", "summary", "publish_date"],
  "highlight": {
    "fields": {
      "title": {}
    }
  }
}

[Results]
  "hits": {
    "total": 2,
    "max_score": 1.6323128,
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "3",
        "_score": 1.6323128,
        "_source": {
          "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
          "title": "Elasticsearch in Action",
          "publish_date": "2015-12-03"
        },
        "highlight": {
          "title": [
            "Elasticsearch <em>in</em> <em>Action</em>"
          ]
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 1.6323128,
        "_source": {
          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
          "title": "Solr in Action",
          "publish_date": "2014-04-05"
        },
        "highlight": {
          "title": [
            "Solr <em>in</em> <em>Action</em>"
          ]
        }
      }
    ]
  }
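
As a minimal sketch of how size and from relate to page-based navigation, the helper below builds the same kind of request body for an arbitrary page. Python is used only for illustration; the function name build_page_query and its parameters are hypothetical and are not part of Elasticsearch or of the original article.

```python
import json

def build_page_query(field, text, page, page_size, source_fields):
    """Build a match query body for a 1-based page number (hypothetical helper)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {
        "query": {"match": {field: text}},
        "size": page_size,                      # number of hits per page
        "from": (page - 1) * page_size,         # offset of the first hit on this page
        "_source": source_fields,               # document fields to return
        "highlight": {"fields": {field: {}}},   # highlight matches in the searched field
    }

body = build_page_query("title", "in action", page=1, page_size=2,
                        source_fields=["title", "summary", "publish_date"])
print(json.dumps(body, indent=2))
```

For page 1 this yields the same size, from, _source, and highlight values as the request above; later pages only change from.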

Note:

  1. For multi-word searches, the match query lets you use the and operator
    instead of the default or operator ---> "operator" : "and"
  2. You can also specify the minimum_should_match option to tune the relevance of the returned results. See the Elasticsearch guide for details.
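
The difference between the two operators and minimum_should_match can be sketched in a few lines of Python. This is a simplified model under stated assumptions: terms are compared as plain lowercase strings, minimum_should_match is only an integer count (Elasticsearch also accepts other forms such as percentages), and the function name match_terms is hypothetical.

```python
def match_terms(query_terms, doc_terms, operator="or", minimum_should_match=1):
    """Decide whether a document matches, given its indexed terms (simplified)."""
    wanted = set(query_terms)
    hits = len(wanted & set(doc_terms))
    if operator == "and":
        return hits == len(wanted)          # every query term must be present
    return hits >= minimum_should_match     # "or": enough terms must be present

title_terms = ["elasticsearch", "in", "action"]   # title of document 3, lowercased
query_terms = ["solr", "in", "action"]

print(match_terms(query_terms, title_terms))                          # default "or"
print(match_terms(query_terms, title_terms, operator="and"))
print(match_terms(query_terms, title_terms, minimum_should_match=2))
```

With the default or operator a single shared term is enough, while and rejects the document because "solr" is missing from its title.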

2. Multi-field Search

As we have already seen, to query more than one document field in a search (for example, the same query string in both the title and the summary), use the multi_match query.

GET bookdb_index/book/_search
{
  "query": {
    "multi_match": {
      "query": "elasticsearch guide",
      "fields": ["title", "summary"]
    }
  }
}

[Results]
  "hits": {
    "total": 3,
    "max_score": 2.0281231,
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 2.0281231,
        "_source": {
          "title": "Elasticsearch: The Definitive Guide",
          "authors": [
            "clinton gormley",
            "zachary tong"
          ],
          "summary": "A distibuted real-time search and analytics engine",
          "publish_date": "2015-02-07",
          "num_reviews": 20,
          "publisher": "oreilly"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 1.3278645,
        "_source": {
          "title": "Solr in Action",
          "authors": [
            "trey grainger",
            "timothy potter"
          ],
          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
          "publish_date": "2014-04-05",
          "num_reviews": 23,
          "publisher": "manning"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "3",
        "_score": 1.0333893,
        "_source": {
          "title": "Elasticsearch in Action",
          "authors": [
            "radu gheorge",
            "matthew lee hinman",
            "roy russo"
          ],
          "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
          "publish_date": "2015-12-03",
          "num_reviews": 18,
          "publisher": "manning"
        }
      }
    ]
  }

Note: in the results above, document 4 (_id=4) matches because "guide" appears in its summary.
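
By default multi_match uses the best_fields type, which combines the per-field scores like a dis_max query: the best-matching field decides the score, and the other fields contribute only through the optional tie_breaker. The sketch below shows that combination rule; the function name best_fields_score and the per-field numbers are illustrative assumptions, not values taken from a real index.

```python
def best_fields_score(field_scores, tie_breaker=0.0):
    """Combine per-field scores the way a dis_max query does."""
    scores = [s for s in field_scores.values() if s > 0]
    if not scores:
        return 0.0
    best = max(scores)
    # The other matching fields are scaled by tie_breaker and added to the best one.
    return best + tie_breaker * (sum(scores) - best)

# Illustrative per-field scores (made-up numbers)
scores = {"title": 0.9, "summary": 1.3}
print(best_fields_score(scores))                    # only the best field counts
print(best_fields_score(scores, tie_breaker=0.3))   # other fields add a little
```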

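3. Boosting

Since we are searching across multiple fields, we may want to boost the score of a particular field. In the example below, we boost the summary field by a factor of 3 to increase its importance, which in turn raises the relevance of document 4.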

GET bookdb_index/book/_search
{
  "query": {
    "multi_match": {
      "query": "elasticsearch guide", 
      "fields": ["title", "summary^3"]
    }
  },
  "_source": ["title", "summary", "publish_date"]
}

[Results]
  "hits": {
    "total": 3,
    "max_score": 3.9835935,
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 3.9835935,
        "_source": {
          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
          "title": "Solr in Action",
          "publish_date": "2014-04-05"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "3",
        "_score": 3.1001682,
        "_source": {
          "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
          "title": "Elasticsearch in Action",
          "publish_date": "2015-12-03"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 2.0281231,
        "_source": {
          "summary": "A distibuted real-time search and analytics engine",
          "title": "Elasticsearch: The Definitive Guide",
          "publish_date": "2015-02-07"
        }
      }
    ]
  }

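Note: boosting does not necessarily mean that the computed score is simply multiplied by the boost factor. The boost value that is actually applied may go through normalization and some internal optimization. See the Elasticsearch guide for more information.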
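
As a quick check on this particular data set, the snippet below compares the scores reported in the result listings of sections 2 and 3 for the two documents whose best-matching field is summary. It only replays arithmetic on numbers copied from those listings; it does not show how Elasticsearch computes scores internally, and other versions or similarity settings may behave differently.

```python
BOOST = 3

# (score in section 2 without boost, score in section 3 with summary^3),
# copied from the result listings for documents 4 and 3
reported = {
    "4": (1.3278645, 3.9835935),
    "3": (1.0333893, 3.1001682),
}

ratios = {doc_id: boosted / plain for doc_id, (plain, boosted) in reported.items()}
for doc_id, ratio in ratios.items():
    print(doc_id, round(ratio, 4))
```

In these listings the ratio is the boost factor up to float rounding, while document 1, which matches on the unboosted title field, keeps the same score in both sections.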

4. Bool Query

We can use the AND / OR / NOT operators to fine-tune our search queries and return more relevant or more specific results.

In the search API this is done with the bool query.
The bool query accepts a must parameter (equivalent to AND), a must_not parameter (equivalent to NOT), and a should parameter (equivalent to OR).

For example, suppose I want to search for a book with "Elasticsearch" or "Solr" in the title, AND authored by "clinton gormley", but NOT authored by "radu gheorge":

GET bookdb_index/book/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              {"match": {"title": "Elasticsearch"}},
              {"match": {"title": "Solr"}}
            ]
          }
        },
        {
          "match": {"authors": "clinton gormely"}
        }
      ],
      "must_not": [
        {
          "match": {"authors": "radu gheorge"}
        }
      ]
    }
  }
}

[Results]
  "hits": {
    "total": 1,
    "max_score": 2.0749094,
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 2.0749094,
        "_source": {
          "title": "Elasticsearch: The Definitive Guide",
          "authors": [
            "clinton gormley",
            "zachary tong"
          ],
          "summary": "A distibuted real-time search and analytics engine",
          "publish_date": "2015-02-07",
          "num_reviews": 20,
          "publisher": "oreilly"
        }
      }
    ]
  }

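There are two cases for should in a bool query:

  • When a must clause exists at the same level as should, the should conditions may or may not be satisfied; the more of them are satisfied, the higher the score
  • When there is no must clause, by default at least one of the should conditions has to be satisfied

Note: as you can see, a bool query can wrap any other query type, including other bool queries, to create arbitrarily complex or deeply nested queries.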
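
The two should cases can be written down as a small rule. This is a simplified sketch under stated assumptions: it treats filter like must, ignores an explicitly configured minimum_should_match, and ignores any version-specific special cases; the function name default_minimum_should_match is hypothetical.

```python
def default_minimum_should_match(bool_query):
    """Default number of should clauses that must match (simplified rule)."""
    has_required = bool(bool_query.get("must")) or bool(bool_query.get("filter"))
    has_should = bool(bool_query.get("should"))
    if has_should and not has_required:
        return 1   # should clauses alone: at least one has to match
    return 0       # should clauses only influence the score

# The inner and outer bool queries from the example in this section
inner = {"should": [{"match": {"title": "Elasticsearch"}},
                    {"match": {"title": "Solr"}}]}
outer = {"must": [{"bool": inner}, {"match": {"authors": "clinton gormely"}}],
         "must_not": [{"match": {"authors": "radu gheorge"}}]}

print(default_minimum_should_match(inner), default_minimum_should_match(outer))
```

This is why the title condition in the example is wrapped in its own bool query inside must: on its own, the inner should requires at least one of the two titles to match.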

5. Fuzzy Queries

Fuzzy matching can be enabled on match and multi_match queries to catch spelling errors. The degree of fuzziness is specified as the Levenshtein distance from the original word.
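
To make the notion of edit distance concrete, here is a minimal sketch of the classic Levenshtein distance together with the length thresholds that "AUTO" fuzziness uses by default (terms of 1-2 characters must match exactly, 3-5 characters allow one edit, longer terms allow two). The function names are hypothetical, and this plain Levenshtein version does not model the fact that Elasticsearch by default also counts a swap of two adjacent characters as a single edit.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def auto_fuzziness(term):
    """Maximum edits allowed by fuzziness AUTO with its default thresholds."""
    n = len(term)
    if n <= 2:
        return 0
    if n <= 5:
        return 1
    return 2

typo, target = "comprihensiv", "comprehensive"
print(levenshtein(typo, target), auto_fuzziness(typo))
```

The misspelled term used in the request of this section is two edits away from "comprehensive" (one substitution and one missing letter), which is within the two edits that AUTO allows for a term of that length.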

GET bookdb_index/book/_search
{
  "query": {
    "multi_match": {
      "query": "comprihensiv guide",
      "fields": ["title","summary"],
      "fuzziness": "AUTO"
    }
  },
  "_source": ["title","summary","publish_date"],
  "size": 2
}

[Results]
  "hits": {
    "total": 2,
    "max_score": 2.4344182,
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 2.4344182,
        "_source": {
          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
          "title": "Solr in Action",
          "publish_date": "2014-04-05"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 1.2871116,
        "_source": {
          "summary": "A distibuted real-time search and analytics engine",
          "title": "Elasticsearch: The Definitive Guide",
          "publish_date": "2015-02-07"
        }
      }
    ]
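  }

"AUTO" fuzziness is equivalent to specifying a value of 2 when the term is longer than 5 characters. However, since roughly 80% of human misspellings have an edit distance of 1, setting fuzziness to 1 may improve overall search performance. For more information, see the "Typos and Misspellings" chapter of the Elasticsearch guide.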

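6. Wildcard Query

Wildcard queries let you specify a pattern to match instead of an entire term.

  • ? matches any single character
  • * matches zero or more characters

For example, to find all records with an author whose name begins with the letter "t":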

GET bookdb_index/book/_search
{
  "query": {
    "wildcard": {
      "authors": {
        "value": "t*"
      }
    }
  },
  "_source": ["title", "authors"],
  "highlight": {
    "fields": {
      "authors": {}
    }
  }
}

[Results]
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 1,
        "_source": {
          "title": "Elasticsearch: The Definitive Guide",
          "authors": [
            "clinton gormley",
            "zachary tong"
          ]
        },
        "highlight": {
          "authors": [
            "zachary <em>tong</em>"
          ]
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "2",
        "_score": 1,
        "_source": {
          "title": "Taming Text: How to Find, Organize, and Manipulate It",
          "authors": [
            "grant ingersoll",
            "thomas morton",
            "drew farris"
          ]
        },
        "highlight": {
          "authors": [
            "<em>thomas</em> morton"
          ]
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 1,
        "_source": {
          "title": "Solr in Action",
          "authors": [
            "trey grainger",
            "timothy potter"
          ]
        },
        "highlight": {
          "authors": [
            "<em>trey</em> grainger",
            "<em>timothy</em> potter"
          ]
        }
      }
    ]
  }
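
The highlights above show that the pattern is applied to individual indexed terms rather than to the whole author string, which is why "zachary tong" matches through "tong". The sketch below imitates this under a crude assumption: analysis is approximated by lowercasing and splitting on whitespace, and the wildcard is translated to a Python regular expression. The function names are hypothetical.

```python
import re

def wildcard_to_regex(pattern):
    """Translate ? (one character) and * (zero or more) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

def matching_terms(pattern, authors):
    """Return the terms of the given author names that match the wildcard."""
    regex = wildcard_to_regex(pattern)
    # Crude stand-in for analysis: lowercase and split on whitespace
    terms = [t for name in authors for t in name.lower().split()]
    return [t for t in terms if regex.fullmatch(t)]

print(matching_terms("t*", ["clinton gormley", "zachary tong"]))
print(matching_terms("t*", ["trey grainger", "timothy potter"]))
```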

7. Regexp Query

Regular expressions let you specify more complex patterns than wildcard queries, for example:

POST bookdb_index/book/_search
{
  "query": {
    "regexp": {
      "authors": "t[a-z]*y"
    }
  },
  "_source": ["title", "authors"],
  "highlight": {
    "fields": {
      "authors": {}
    }
  }
}

[Results]
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 1,
        "_source": {
          "title": "Solr in Action",
          "authors": [
            "trey grainger",
            "timothy potter"
          ]
        },
        "highlight": {
          "authors": [
            "<em>trey</em> grainger",
            "<em>timothy</em> potter"
          ]
        }
      }
    ]
  }
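
A regexp query has to match an entire indexed term, not just part of it. The sketch below replays the pattern from the request over the authors in the sample data, using Python's re.fullmatch as a stand-in; Python's regular expression syntax is not identical to the one Elasticsearch uses, and the lowercase-and-split tokenization is an assumption that only approximates analysis.

```python
import re

# Authors from the bulk data at the top of the article, keyed by _id
AUTHORS = {
    "1": ["clinton gormley", "zachary tong"],
    "2": ["grant ingersoll", "thomas morton", "drew farris"],
    "3": ["radu gheorge", "matthew lee hinman", "roy russo"],
    "4": ["trey grainger", "timothy potter"],
}

def regexp_hits(pattern, docs):
    """Return {doc_id: [matching terms]} for terms fully matched by pattern."""
    regex = re.compile(pattern)
    hits = {}
    for doc_id, authors in docs.items():
        terms = [t for name in authors for t in name.lower().split()]
        matched = [t for t in terms if regex.fullmatch(t)]
        if matched:
            hits[doc_id] = matched
    return hits

print(regexp_hits("t[a-z]*y", AUTHORS))
```

Only document 4 has terms that start with "t" and end with "y", which agrees with the single hit in the results above.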

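8. Match Phrase Query

A match phrase query requires that all the terms in the query string be present in the document, in the order given in the query string, and close to each other.

By default the terms must be exactly adjacent, but you can specify a slop value, which indicates how far apart the terms may be while the document is still considered a match.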

GET bookdb_index/book/_search
{
  "query": {
    "multi_match": {
      "query": "search engine",
      "fields": ["title", "summary"],
      "type": "phrase",
      "slop": 3
    }
  },
  "_source": [ "title", "summary", "publish_date" ]
}

[Results]
  "hits": {
    "total": 2,
    "max_score": 0.88067603,
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 0.88067603,
        "_source": {
          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
          "title": "Solr in Action",
          "publish_date": "2014-04-05"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 0.51429313,
        "_source": {
          "summary": "A distibuted real-time search and analytics engine",
          "title": "Elasticsearch: The Definitive Guide",
          "publish_date": "2015-02-07"
        }
      }
    ]
  }

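Note: in the example above, for a non-phrase query, document _id 1 would normally score higher and appear ahead of document _id 4 because its field length is shorter.

As a phrase query, however, the proximity of the terms is taken into account, so document _id 4 scores better.

9. Match Phrase Prefix Query

Match phrase prefix queries provide search-as-you-type, or a "poor man's" version of autocomplete, at query time without the need to prepare your data in any way.

Like the match_phrase query, it accepts a slop parameter to make the word order and relative positions less "strict". It also accepts the max_expansions parameter to limit the number of terms matched, in order to reduce resource usage.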

GET bookdb_index/book/_search
{
  "query": {
    "match_phrase_prefix": {
      "summary": {
        "query": "search en",
        "slop": 3,
        "max_expansions": 10
      }
    }
  },
  "_source": ["title","summary","publish_date"]
}

Note: query-time search-as-you-type has a performance cost. A better solution is index-time search-as-you-type. See the Completion Suggester API or Edge-Ngram filters for more information.
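
The idea behind the index-time approach with edge n-grams can be sketched as follows: every prefix of a term is stored at index time, so a partially typed word such as "en" becomes an ordinary exact term lookup. The function below is an illustration only; its name and default lengths are hypothetical and are not the defaults of the Elasticsearch edge n-gram filter.

```python
def edge_ngrams(term, min_gram=1, max_gram=10):
    """Return the prefixes of term with lengths from min_gram to max_gram."""
    upper = min(len(term), max_gram)
    return [term[:n] for n in range(min_gram, upper + 1)]

indexed = edge_ngrams("engine", min_gram=2)
print(indexed)
print("en" in indexed)   # the partial input is now an exact term match
```

The trade-off is a larger index in exchange for cheaper queries, since the prefix expansion is done once at index time rather than on every search.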

10. Query String

The query_string query provides a way to execute multi_match queries, bool queries, boosting, fuzzy matching, wildcards, regexp, and range queries in a concise shorthand syntax.

In the following example, we run a fuzzy search for the terms "search algorithm" where one of the book authors is "grant ingersoll" or "tom morton". We search the summary, title, authors, and publisher fields, and apply a boost of 2 to the summary field.

GET bookdb_index/book/_search
{
  "query": {
    "query_string": {
      "query": "(saerch~1 algorithm~1) AND (grant ingersoll)  OR (tom morton)",
      "fields": ["summary^2","title","authors","publisher"]
    }
  },
  "_source": ["title","summary","authors"],
  "highlight": {
    "fields": {
      "summary": {}
    }
  }
}

[Results]
  "hits": {
    "total": 1,
    "max_score": 3.571021,
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "2",
        "_score": 3.571021,
        "_source": {
          "summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization",
          "title": "Taming Text: How to Find, Organize, and Manipulate It",
          "authors": [
            "grant ingersoll",
            "thomas morton",
            "drew farris"
          ]
        },
        "highlight": {
          "summary": [
            "organize text using approaches such as full-text <em>search</em>, proper name recognition, clustering, tagging"
          ]
        }
      }
    ]
  }

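11. Simple Query String

The simple_query_string query is a version of the query_string query that is better suited to a single search box exposed to users,
because it replaces AND / OR / NOT with + / | / - respectively, and it discards the invalid parts of a query instead of throwing an exception when the user makes a mistake.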

GET bookdb_index/book/_search
{
  "query": {
    "simple_query_string": {
      "query": "(saerch~1 algorithm~1) + (grant ingersoll)  | (tom morton)",
      "fields": ["summary^2","title","authors","publisher"]
    }
  },
  "_source": ["title","summary","authors"],
  "highlight": {
    "fields": {
      "summary": {}
    }
  }
}

[Results]
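# Same results as above

12. Term/Terms Query (searching a specific field)

The examples in sections 1-11 above are full-text search examples. Sometimes we are more interested in structured search, where we want to find exact matches and return the results.

In the example below, we search for all books in the index published by Manning Publications (using the term and terms queries).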

GET bookdb_index/book/_search
{
  "query": {
    "term": {
      "publisher": {
        "value": "manning"
      }
    }
  },
  "_source" : ["title","publish_date","publisher"]
}

[Results]
  "hits": {
    "total": 3,
    "max_score": 0.35667494,
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "2",
        "_score": 0.35667494,
        "_source": {
          "publisher": "manning",
          "title": "Taming Text: How to Find, Organize, and Manipulate It",
          "publish_date": "2013-01-24"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "3",
        "_score": 0.35667494,
        "_source": {
          "publisher": "manning",
          "title": "Elasticsearch in Action",
          "publish_date": "2015-12-03"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 0.35667494,
        "_source": {
          "publisher": "manning",
          "title": "Solr in Action",
          "publish_date": "2014-04-05"
        }
      }
    ]
  }

Multiple terms can be specified by using the terms query instead

GET bookdb_index/book/_search
{
  "query": {
    "terms": {
      "publisher": ["oreilly", "manning"]
    }
  }
}

13. Term Query - Sorted

Term query results, like those of any other query, can easily be sorted. Multi-level sorting is also allowed.

GET bookdb_index/book/_search
{
  "query": {
    "term": {
      "publisher": {
        "value": "manning"
      }
    }
  },
  "_source" : ["title","publish_date","publisher"],
  "sort": [{"publisher.keyword": { "order": "desc"}},
    {"title.keyword": {"order": "asc"}}]
}

[Results]
  "hits": {
    "total": 3,
    "max_score": null,
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "3",
        "_score": null,
        "_source": {
          "publisher": "manning",
          "title": "Elasticsearch in Action",
          "publish_date": "2015-12-03"
        },
        "sort": [
          "manning",
          "Elasticsearch in Action"
        ]
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": null,
        "_source": {
          "publisher": "manning",
          "title": "Solr in Action",
          "publish_date": "2014-04-05"
        },
        "sort": [
          "manning",
          "Solr in Action"
        ]
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "2",
        "_score": null,
        "_source": {
          "publisher": "manning",
          "title": "Taming Text: How to Find, Organize, and Manipulate It",
          "publish_date": "2013-01-24"
        },
        "sort": [
          "manning",
          "Taming Text: How to Find, Organize, and Manipulate It"
        ]
      }
    ]
  }

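Note: in Elasticsearch 6.x, use text fields for full-text search and non-text fields (such as the keyword sub-fields used above) for sorting.

14. Range Query

Another structured search example is the range query. In the example below, we search for books published in 2015.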

GET bookdb_index/book/_search
{
  "query": {
    "range": {
      "publish_date": {
        "gte": "2015-01-01",
        "lte": "2015-12-31"
      }
    }
  },
  "_source" : ["title","publish_date","publisher"]
}

[Results]
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 1,
        "_source": {
          "publisher": "oreilly",
          "title": "Elasticsearch: The Definitive Guide",
          "publish_date": "2015-02-07"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "3",
        "_score": 1,
        "_source": {
          "publisher": "manning",
          "title": "Elasticsearch in Action",
          "publish_date": "2015-12-03"
        }
      }
    ]
  }

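Note: range queries work on date, number, and string type fields.

15. Filtered Query

(No longer exists as of version 5.0; can be skipped)

Filtered queries let you filter the results of a query. In the example below, we query for books with "Elasticsearch" in the title or summary, but we want to restrict the results to those with 20 or more reviews.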

POST /bookdb_index/book/_search
{
    "query": {
        "filtered": {
            "query" : {
                "multi_match": {
                    "query": "elasticsearch",
                    "fields": ["title","summary"]
                }
            },
            "filter": {
                "range" : {
                    "num_reviews": {
                        "gte": 20
                    }
                }
            }
        }
    },
    "_source" : ["title","summary","publisher", "num_reviews"]
}


[Results]
"hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 0.5955761,
        "_source": {
          "summary": "A distibuted real-time search and analytics engine",
          "publisher": "oreilly",
          "num_reviews": 20,
          "title": "Elasticsearch: The Definitive Guide"
        }
      }
    ]

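Note: a filtered query does not require a query to filter on. If no query is specified, the match_all query is run, which basically returns all documents in the index, and these are then filtered.
In practice, the filter is run first, reducing the surface area that needs to be queried. In addition, the filter is cached after its first use, which makes it very efficient.

Update: filtered queries have been removed in Elasticsearch 5.x in favor of the bool query. Here is the same example as above, rewritten to use the bool query. The returned results are exactly the same.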

GET bookdb_index/book/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "elasticsearch",
            "fields": ["title","summary"]
          }
        }
      ],
      "filter": {
        "range": {
          "num_reviews": {
            "gte": 20
          }
        }
      }
    }
  },
  "_source" : ["title","summary","publisher", "num_reviews"]
}

16. Multiple Filters

(No longer supported in 5.x; can be skipped)
Multiple filters can be combined by using a bool filter.

In the next example, the filter determines that the returned results must have at least 20 reviews, must not have been published before 2015, and should be published by oreilly.

POST /bookdb_index/book/_search
{
    "query": {
        "filtered": {
            "query" : {
                "multi_match": {
                    "query": "elasticsearch",
                    "fields": ["title","summary"]
                }
            },
            "filter": {
                "bool": {
                    "must": {
                        "range" : { "num_reviews": { "gte": 20 } }
                    },
                    "must_not": {
                        "range" : { "publish_date": { "lte": "2014-12-31" } }
                    },
                    "should": {
                        "term": { "publisher": "oreilly" }
                    }
                }
            }
        }
    },
    "_source" : ["title","summary","publisher", "num_reviews", "publish_date"]
}


[Results]
"hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 0.5955761,
        "_source": {
          "summary": "A distibuted real-time search and analytics engine",
          "publisher": "oreilly",
          "num_reviews": 20,
          "title": "Elasticsearch: The Definitive Guide",
          "publish_date": "2015-02-07"
        }
      }
    ]

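17. Function Score: Field Value Factor

There may be cases where you want to factor the value of a specific field in a document into the calculation of the relevance score. This is typical in scenarios where you want to boost the relevance of a document based on its popularity.

In our example, we would like the more popular books (as judged by the number of reviews) to be boosted. This is possible using the field_value_factor function score.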

GET bookdb_index/book/_search
{
  "query": {
    "function_score": {
      "query": {
        "multi_match": {
          "query": "search engine",
          "fields": ["title","summary"]
        }
      },
      "field_value_factor": {
        "field": "num_reviews",
        "modifier": "log1p",
        "factor": 2
      }
    }
  },
  "_source": ["title", "summary", "publish_date", "num_reviews"]
}

[Results]
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 1.5694137,
        "_source": {
          "summary": "A distibuted real-time search and analytics engine",
          "num_reviews": 20,
          "title": "Elasticsearch: The Definitive Guide",
          "publish_date": "2015-02-07"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 1.4725765,
        "_source": {
          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
          "num_reviews": 23,
          "title": "Solr in Action",
          "publish_date": "2014-04-05"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "3",
        "_score": 0.14181662,
        "_source": {
          "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
          "num_reviews": 18,
          "title": "Elasticsearch in Action",
          "publish_date": "2015-12-03"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "2",
        "_score": 0.13297246,
        "_source": {
          "summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization",
          "num_reviews": 12,
          "title": "Taming Text: How to Find, Organize, and Manipulate It",
          "publish_date": "2013-01-24"
        }
      }
    ]
  }

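Note 1: we could have run a regular multi_match query and sorted by the num_reviews field, but then we would lose the benefits of relevance scoring.
Note 2: there are a number of additional parameters
(such as "modifier", "factor", "boost_mode", etc.) that tune the extent of the boosting effect on the original relevance score.
See the Elasticsearch guide for details.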
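
As a rough sketch of what the request above computes: the field value is multiplied by factor, passed through the modifier (log1p adds 1 and takes the base-10 logarithm), and the result is combined with the query score, by multiplication under the default boost_mode. The function names below are hypothetical, only a few modifiers are modeled, and the query score passed in is a made-up number rather than one taken from the listing.

```python
import math

def field_value_factor(value, factor=1.0, modifier="none"):
    """Apply factor and modifier to a field value (subset of modifiers only)."""
    scaled = factor * value
    modifiers = {
        "none": lambda x: x,
        "log1p": lambda x: math.log10(x + 1),
        "sqrt": math.sqrt,
    }
    return modifiers[modifier](scaled)

def boosted_score(query_score, num_reviews):
    """Combine the query score with the function value (boost_mode multiply)."""
    return query_score * field_value_factor(num_reviews, factor=2, modifier="log1p")

for reviews in (12, 18, 20, 23):
    print(reviews, round(field_value_factor(reviews, 2, "log1p"), 4))
print(boosted_score(1.0, 20))   # 1.0 is an illustrative query score
```

Because of the logarithm, the multiplier grows slowly with the number of reviews, so popularity nudges the ranking without overwhelming the text relevance.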

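18. Function Score: Decay Functions

Suppose that, instead of boosting the score incrementally by the value of a field, we have an ideal value in mind and want the score to decay the further a document is from it. This is useful for things such as price ranges, numeric field ranges, and date ranges. In our example, we are searching for books on "search engines" published around June 2014.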

GET bookdb_index/book/_search
{
  "query": {
    "function_score": {
      "query": {
        "multi_match": {
          "query": "search engine",
          "fields": ["title", "summary"]
        }
      },
      "functions": [
        {
          "exp": {
            "publish_date": {
              "origin": "2014-06-15",
              "scale": "30d",
              "offset": "7d"
            }
          }
        }
      ],
      "boost_mode": "replace"
    }
  },
  "_source": ["title", "summary", "publish_date", "num_reviews"]
}

[Results]
  "hits": {
    "total": 4,
    "max_score": 0.22793062,
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 0.22793062,
        "_source": {
          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
          "num_reviews": 23,
          "title": "Solr in Action",
          "publish_date": "2014-04-05"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 0.0049215667,
        "_source": {
          "summary": "A distibuted real-time search and analytics engine",
          "num_reviews": 20,
          "title": "Elasticsearch: The Definitive Guide",
          "publish_date": "2015-02-07"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "2",
        "_score": 0.000009612435,
        "_source": {
          "summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization",
          "num_reviews": 12,
          "title": "Taming Text: How to Find, Organize, and Manipulate It",
          "publish_date": "2013-01-24"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "3",
        "_score": 0.0000049185574,
        "_source": {
          "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
          "num_reviews": 18,
          "title": "Elasticsearch in Action",
          "publish_date": "2015-12-03"
        }
      }
    ]
  }
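
The scores in these results can be checked by hand. Because boost_mode is replace, each _score is simply the value of the decay function, which the Elasticsearch reference defines for exp as exp(lambda * max(0, |value - origin| - offset)) with lambda = ln(decay) / scale. The Python sketch below assumes decay was left at its default of 0.5 and that dates are compared in whole days; the helper name exp_decay is hypothetical and not part of any client API.

```python
import math
from datetime import date


def exp_decay(value: date, origin: date, scale_days: float,
              offset_days: float = 0.0, decay: float = 0.5) -> float:
    """Exponential decay: exp(lambda * max(0, |value - origin| - offset))."""
    lam = math.log(decay) / scale_days          # lambda = ln(decay) / scale
    distance = abs((value - origin).days)       # distance from origin in days
    return math.exp(lam * max(0.0, distance - offset_days))


# publish_date and reported _score per document id, copied from the results above
hits = {
    "4": (date(2014, 4, 5), 0.22793062),
    "1": (date(2015, 2, 7), 0.0049215667),
    "2": (date(2013, 1, 24), 0.000009612435),
    "3": (date(2015, 12, 3), 0.0000049185574),
}

origin = date(2014, 6, 15)                      # "origin": "2014-06-15"
computed = {
    doc_id: exp_decay(published, origin, scale_days=30, offset_days=7)
    for doc_id, (published, _) in hits.items()
}

for doc_id, (published, reported) in hits.items():
    print(doc_id, published, computed[doc_id], reported)
```

The computed values line up with the reported scores, which shows why "Solr in Action" ranks first: its publish date is closest to the origin once the 7-day offset is subtracted.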

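19. Function Score: Script Scoring

When the built-in scoring functions do not meet your needs, you can supply a Groovy script to compute the score.

In our example, we want a script that looks at publish_date before deciding how much weight to give the number of reviews. Newer books may not have many reviews yet, so they should not be penalized for that.

The scoring script looks like this: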

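// Read the two fields used for scoring from the document
publish_date = doc['publish_date'].value
num_reviews = doc['num_reviews'].value

// Books published after the threshold get a larger constant inside the log,
// so a low review count costs them less
if (publish_date > Date.parse('yyyy-MM-dd', threshold).getTime()) {
  my_score = Math.log(2.5 + num_reviews)
} else {
  my_score = Math.log(1 + num_reviews)
}
return my_score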

To apply the scoring script dynamically, we use the script_score parameter:

GET /bookdb_index/book/_search
{
  "query": {
    "function_score": {
      "query": {
        "multi_match": {
          "query": "search engine",
          "fields": ["title","summary"]
        }
      },
      "functions": [
        {
          "script_score": {
            "script": {
              "params": {
                "threshold": "2015-07-30"
              },
              "lang": "groovy",
              "source": "publish_date = doc['publish_date'].value; num_reviews = doc['num_reviews'].value; if (publish_date > Date.parse('yyyy-MM-dd', threshold).getTime()) { return log(2.5 + num_reviews) }; return log(1 + num_reviews);"
            }
          }
        }
      ]
    }
  },
  "_source": ["title","summary","publish_date", "num_reviews"]
}
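
To see what the script contributes for each sample book, the same rule can be mirrored in plain Python. This is only an illustration of the logic, not how Elasticsearch evaluates it: the function name script_score_value is hypothetical, the book data is copied from the bulk request at the start of the article, and the final _score would also depend on the multi_match relevance score, since function_score multiplies the two by default.

```python
import math
from datetime import date


def script_score_value(publish_date: date, num_reviews: int, threshold: date) -> float:
    """Mirror of the scoring rule: recent books get a larger constant inside the log."""
    if publish_date > threshold:
        return math.log(2.5 + num_reviews)
    return math.log(1 + num_reviews)


# title -> (publish_date, num_reviews), taken from the sample data set
books = {
    "Elasticsearch: The Definitive Guide": (date(2015, 2, 7), 20),
    "Taming Text: How to Find, Organize, and Manipulate It": (date(2013, 1, 24), 12),
    "Elasticsearch in Action": (date(2015, 12, 3), 18),
    "Solr in Action": (date(2014, 4, 5), 23),
}

threshold = date(2015, 7, 30)   # "threshold": "2015-07-30" from the query params

scores = {
    title: script_score_value(published, reviews, threshold)
    for title, (published, reviews) in books.items()
}

# Print the books from highest to lowest script value
for title, value in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{value:.4f}  {title}")
```

Only "Elasticsearch in Action" was published after the threshold, so it is the only book that takes the log(2.5 + num_reviews) branch.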

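Note 1: To use dynamic scripting, it must be enabled for your Elasticsearch instance in the config/elasticsearch.yml file. It is also possible to use scripts already stored on the Elasticsearch server. See the Elasticsearch reference docs for more information.
Note 2: JSON cannot contain embedded newline characters, so semicolons are used to separate statements.
Original author: Tim Ojo, Aug. 05, 16 · Big Data Zone
Original article: https://dzone.com/articles/23-useful-elasticsearch-example-queries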

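Note: how do you enable Groovy scripts on ES 6.3? Configuring the settings below did not work. Groovy scripting was removed in Elasticsearch 6.0, so no setting can bring it back; on 6.x the script has to be rewritten in Painless, the default scripting language.
script.allowed_types: inline & script.allowed_contexts: search, update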
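
For ES 6.x, where Painless is the default scripting language, the same scoring rule might be expressed as follows. This is a minimal sketch that has not been run against a live cluster. It assumes that date doc values expose epoch milliseconds as `.value.millis` in 6.x Painless, and it passes the threshold as a hypothetical `threshold_millis` parameter computed at midnight UTC (the Groovy version parsed the date in the server's local time zone). The Python code only builds and prints the request body for `GET /bookdb_index/book/_search`.

```python
import json
from datetime import datetime, timezone

# Painless version of the scoring rule (assumption: `.value.millis` is
# available on date doc values in ES 6.x).
PAINLESS_SOURCE = (
    "if (doc['publish_date'].value.millis > params.threshold_millis) { "
    "return Math.log(2.5 + doc['num_reviews'].value); } "
    "return Math.log(1 + doc['num_reviews'].value);"
)

# Threshold date from the original query, converted to epoch milliseconds (UTC)
threshold = datetime(2015, 7, 30, tzinfo=timezone.utc)
threshold_millis = int(threshold.timestamp() * 1000)

body = {
    "query": {
        "function_score": {
            "query": {
                "multi_match": {
                    "query": "search engine",
                    "fields": ["title", "summary"],
                }
            },
            "functions": [
                {
                    "script_score": {
                        "script": {
                            "lang": "painless",
                            "params": {"threshold_millis": threshold_millis},
                            "source": PAINLESS_SOURCE,
                        }
                    }
                }
            ],
        }
    },
    "_source": ["title", "summary", "publish_date", "num_reviews"],
}

# Print the JSON body, ready to paste into the Kibana console
print(json.dumps(body, indent=2))
```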

Java API implementation

For a Java API implementation of the queries above, see https://github.com/whirlys/elastic-example/tree/master/UsefullESSearchSkill

References:
銘毅天下: [Translation] The 23 Most Useful Elasticsearch Search Techniques You Must Know
English original: 23 Useful Elasticsearch Example Queries


For more content, visit my personal blog: http://laijianfeng.org/
