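Synonym Matching in Elasticsearch

Synonym search in Elasticsearch (ES) requires the user to supply a synonym list in the expected format and to place that list in the index settings when the index is created.

The synonym list can either be written directly into the settings as strings or stored in a text file that ES reads.

Synonym list format

The synonym list must follow one of the formats below:

A => B,C format
- With this format, the search term A is replaced by B and C at search time, and B and C are not treated as synonyms of each other.
A,B,C,D format

This format has two cases:

When expand == true, it is equivalent to A,B,C,D => A,B,C,D, i.e. A, B, C, and D are all synonyms of one another.
When expand == false, it is equivalent to A,B,C,D => A, i.e. all four terms are replaced by A at search time.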
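
The two formats and the effect of expand can be illustrated with a small Python sketch. It is only a model of the rules described above, not the parser ES uses; the function name parse_synonym_rules and the sample terms W, X, Y, Z are made up for illustration.

```python
def parse_synonym_rules(lines, expand=True):
    """Turn synonym lines into a {term: [replacement, ...]} mapping."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=>" in line:
            # Explicit mapping: every term on the left is replaced by the right side.
            left, right = line.split("=>", 1)
            sources = [t.strip() for t in left.split(",") if t.strip()]
            targets = [t.strip() for t in right.split(",") if t.strip()]
        else:
            # Equivalent terms: expand decides whether all terms or only the first survive.
            sources = [t.strip() for t in line.split(",") if t.strip()]
            targets = sources if expand else sources[:1]
        for source in sources:
            bucket = mapping.setdefault(source, [])
            for target in targets:
                if target not in bucket:
                    bucket.append(target)
    return mapping


rules = ["A => B,C", "W,X,Y,Z"]
expanded = parse_synonym_rules(rules, expand=True)
collapsed = parse_synonym_rules(rules, expand=False)
print(expanded["A"])   # ['B', 'C']
print(expanded["X"])   # ['W', 'X', 'Y', 'Z']
print(collapsed["X"])  # ['W']
```

Note that B has no entry of its own in the result, which matches the statement that B and C are not synonyms of each other under the A => B,C format.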

How to query with a synonym list

Creating the index

PUT /fond_goods
{
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 1,
    "analysis": {
      "analyzer": {
        "my_whitespace":{
          "tokenizer":"whitespace",
          "filter": ["synonymous_filter"]
        }
      },
      "filter": {
        "synonymous_filter":{
          "type": "synonym",
          "expand": true,
          "synonyms": [
            "A, B, C, D"
            ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "code":{
        "type": "keyword"
      },
      "context":{
        "type": "text",
        "analyzer": "my_whitespace"
      },
      "color":{
        "type": "text",
        "analyzer": "my_whitespace"
      }
    }
  }
}

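Parameter reference

expand: defaults to true.
lenient: defaults to false. When lenient is true, ES ignores errors raised while parsing the synonym configuration. Note that only exceptions caused by synonym rules that cannot be parsed are ignored; see the official documentation for concrete examples [ https://www.elastic.co/guide/en/elasticsearch/reference/7.16/analysis-synonym-tokenfilter.html ].
synonyms: the synonym list, i.e. the list described at the beginning, which must be written in the required format.
synonyms_path: can be used in place of synonyms, in which case you supply the path to an external file. For the built-in synonym filter this is a file on the node's file system (the usage example below uses a path relative to the config folder); loading the list from a remote URL is a feature of the dynamic-synonym plugin covered later, not of the built-in filter.
format: when set to wordnet, the rules are read in WordNet format, so synonyms from the WordNet English lexical database can be used.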
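
To see how these parameters fit together, here is a small Python sketch that assembles the body of a synonym filter definition as a dictionary. The helper build_synonym_filter is hypothetical, and its check that exactly one of synonyms and synonyms_path is given is a choice made for this sketch rather than documented ES behaviour.

```python
import json


def build_synonym_filter(synonyms=None, synonyms_path=None, expand=True, lenient=False):
    """Build the body of a "synonym" token filter definition as a dict."""
    # Sketch-level validation (an assumption, not an ES rule): inline list or file, not both.
    if (synonyms is None) == (synonyms_path is None):
        raise ValueError("provide exactly one of synonyms or synonyms_path")
    body = {"type": "synonym", "expand": expand, "lenient": lenient}
    if synonyms is not None:
        body["synonyms"] = list(synonyms)
    else:
        body["synonyms_path"] = synonyms_path
    return body


inline_filter = build_synonym_filter(synonyms=["A, B, C, D"])
file_filter = build_synonym_filter(synonyms_path="synonym.txt", lenient=True)

# Print the fragment that would sit under settings.analysis in the index request.
print(json.dumps({"filter": {"synonymous_filter": inline_filter}}, indent=2))
```

The printed fragment has the same shape as the filter section of the index creation request shown earlier, with the default values of expand and lenient written out explicitly.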

Usage example

Building the index

PUT /fond_goods
{
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 1,
    "analysis": {
      "analyzer": {
        "my_whitespace":{................................................................ I
          "tokenizer":"whitespace",
          "filter": ["synonymous_filter"]
        }
      },
      "filter": {
        "synonymous_filter":{
          "type": "synonym",
          "synonyms_path": "synonym.txt"................................................. II
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "code":{
        "type": "keyword"
      },
      "context":{
        "type": "text",
        "analyzer": "my_whitespace"
      },
      "color":{
        "type": "text",
        "analyzer": "my_whitespace"
      }
    }
  }
}

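Notes (the dotted markers I and II in the request above are annotations, not part of the JSON):

  I: `my_whitespace` is a custom analyzer.

  II: the synonyms_path given here is a path relative to the config folder inside the ES installation folder.

Save the synonym file at that path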

Women,women,girl,girls
yellow,orange,wheat
blue,skyblue
white,snow,silver
dress,dresses,skirt,skirts
autumn,fall
shirt,shirts
A,B,C

Load the test data

POST _bulk
{"index" : {"_index" : "fond_goods", "_id":1}}
{"code" : 1,"context" : "ruffled shirt for women 2021 fall slim fit pure color all matching off-neck lantern long sleeve slim women short shirt", "color": "red"}
{"index" : {"_index" : "fond_goods", "_id":2}}
{"code" : 2,"context" : "2021 warmth pullover sweater fall", "color": "blue"}
{"index" : {"_index" : "fond_goods", "_id":3}}
{"code" : 3,"context" : "early autumn elegant dress women dress 2021 autumn new long sleeve", "color": "yellow"}
{"index" : {"_index" : "fond_goods", "_id":4}}
{"code" : 4,"context" : "2021 autumn new  sweater yama autumn and winter female  autumn and winter dot cardigan knitted coat", "color": "snow"}
{"index" : {"_index" : "fond_goods", "_id":5}}
{"code" : 5,"context" : "za satin party dinner skirts suits woemn sexy bandage shirts and high split skirt elegant luxurious female dinner sets", "color": "white"}
{"index" : {"_index" : "fond_goods", "_id":6}}
{"code" : 6,"context" : "big bow tie sweet puff sleeve shirt dress long sleeve shirt skirt solid color shirt dress short skirt ", "color": "moss green"}
{"index" : {"_index" : "fond_goods", "_id":7}}
{"code" : 7,"context" : "casual button plaid short skirts women streetwear a-line summer skirts female high waist yellow autumn short skirts", "color": "skyblue "}
{"index" : {"_index" : "fond_goods", "_id":8}}
{"code" : 8,"context" : "muslim middle east women fashion dress abaya long dress muslim dress arab dress dres", "color": "orange"}
{"index" : {"_index" : "fond_goods", "_id":9}}
{"code" : 9,"context" : "sexy white party dresses autumn winter sexy mini dresses women fashion solid color off shoulder short", "color": "wheat"}
{"index" : {"_index" : "fond_goods", "_id":10}}
{"code" : 10,"context" : "women green patchwork buttons bodycon mini dresses all-match office ladies long shirt dresses autumn party vestidos new", "color": "silver"}
{"index" : {"_index" : "fond_goods", "_id":11}}
{"code" : 11,"context" : "A", "color": "silver"}
{"index" : {"_index" : "fond_goods", "_id":12}}
{"code" : 12,"context" : "B", "color": "silver"}
{"index" : {"_index" : "fond_goods", "_id":13}}
{"code" : 13,"context" : "C", "color": "silver"}

Basic usage

A quick try at querying with the synonym list

Query

GET fond_goods/_search
{
  "query": {
    "match": {
      "context": "A"
    }
  }
}

Query result

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 2.7354302,
    "hits" : [
      {
        "_index" : "fond_goods",
        "_type" : "_doc",
        "_id" : "11",
        "_score" : 2.7354302,
        "_source" : {
          "code" : 11,
          "context" : "A",
          "color" : "silver"
        }
      },
      {
        "_index" : "fond_goods",
        "_type" : "_doc",
        "_id" : "12",
        "_score" : 2.7354302,
        "_source" : {
          "code" : 12,
          "context" : "B",
          "color" : "silver"
        }
      },
      {
        "_index" : "fond_goods",
        "_type" : "_doc",
        "_id" : "13",
        "_score" : 2.7354302,
        "_source" : {
          "code" : 13,
          "context" : "C",
          "color" : "silver"
        }
      }
    ]
  }
}

Deleting data

Delete request

POST fond_goods/_delete_by_query
{
  "query": {
    "match": {
      "context": "A"
    }
  }
}

Delete result

{
  "took" : 5,
  "timed_out" : false,
  "total" : 3,
  "deleted" : 3,
  "batches" : 1,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}

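We inserted three documents for the synonym group A, B, C, and three documents were deleted. In other words, the delete also removed the documents that match A's synonyms B and C.

Conclusions

We queried for A, yet the results contain the B and C documents, so the synonym query works.
We queried for A, and in the relevance scores B and C score exactly the same as A. At query time A, B, and C are therefore fully equivalent, and ES relevance scoring cannot tell them apart.
When deleting by query, the documents that match the synonyms are deleted as well.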
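
These three observations can be reproduced with a toy inverted index in Python. This is a simplified model rather than how ES stores or scores documents; it assumes, as in the mapping above, that the same synonym-expanding analyzer runs both when documents are indexed and when the query is analyzed. All function names are invented for the sketch.

```python
SYNONYM_GROUPS = [["A", "B", "C"]]  # mirrors the "A,B,C" line in the synonym file


def expand(token):
    """Return the token plus every member of its synonym group (expand == true)."""
    for group in SYNONYM_GROUPS:
        if token in group:
            return set(group)
    return {token}


def analyze(text):
    """Whitespace-tokenize, then expand each token with its synonyms."""
    terms = set()
    for token in text.split():
        terms |= expand(token)
    return terms


def search(index, query):
    """Return ids of documents sharing at least one term with the analyzed query."""
    wanted = analyze(query)
    return sorted(doc_id for doc_id, terms in index.items() if terms & wanted)


def delete_by_query(index, query):
    """Delete every document the query matches and return how many were removed."""
    hits = search(index, query)
    for doc_id in hits:
        del index[doc_id]
    return len(hits)


docs = {11: "A", 12: "B", 13: "C", 2: "2021 warmth pullover sweater fall"}
index = {doc_id: analyze(text) for doc_id, text in docs.items()}

hits = search(index, "A")
same_terms = index[11] == index[12] == index[13]
deleted = delete_by_query(index, "A")
remaining = sorted(index)

print(hits)        # [11, 12, 13]
print(same_terms)  # True
print(deleted)     # 3
print(remaining)   # [2]
```

In this model the documents A, B, and C end up with identical term sets after analysis, which is a plausible reason why a real index gives them identical scores, and why a delete by query on A removes all three.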

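Updating the synonym file dynamically

The synonym feature that ES provides out of the box reads the synonym file at startup, and every time the file is updated a restart is needed before it is read again. This is very inconvenient for our project.

An ES plugin called elasticsearch-analysis-dynamic-synonym can be used to read the synonym file dynamically.

Plugin repository

https://github.com/bells/elasticsearch-analysis-dynamic-synonym

How to use the plugin

The project itself documents the usage in detail; here is a brief summary.

1. Clone the project to your machine
2. Build the package
3. Create a dynamic-synonym folder inside the ES plugins/ folder
4. Unzip target/releases/elasticsearch-analysis-dynamic-synonym-{version}.zip into dynamic-synonym
5. When creating the ES index, change "type": "synonym" in the synonym configuration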

      "filter": {
        "synonymous_filter":{
          "type": "synonym",
          "synonyms_path": "synonym.txt"
        }
      }

to "type": "dynamic_synonym"

      "filter": {
        "synonymous_filter":{
          "type": "dynamic_synonym",
          "synonyms_path": "synonym.txt"
        }
      }

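Note: the plugin also provides an optional interval parameter, the interval at which the synonym file is checked for changes; the default is 60s.

Everything else works as before. From now on, every 60s ES checks the synonym file's modification time and reloads the file if it has changed.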
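
The check described above (compare the file's modification time, reload on change) can be sketched in Python as follows. This is not the plugin's code; the class ReloadingSynonyms is hypothetical, and a real setup would call check() on a timer every interval seconds instead of calling it by hand.

```python
import os
import tempfile


class ReloadingSynonyms:
    """Reload a synonym file whenever its modification time changes."""

    def __init__(self, path):
        self.path = path
        self.last_mtime = None
        self.rules = []
        self.check()  # initial load

    def check(self):
        """Reload if the file changed since the last check; return True on reload."""
        mtime = os.stat(self.path).st_mtime_ns
        if mtime == self.last_mtime:
            return False
        with open(self.path, encoding="utf-8") as handle:
            self.rules = [line.strip() for line in handle if line.strip()]
        self.last_mtime = mtime
        return True


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "synonym.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("A,B,C\n")
    synonyms = ReloadingSynonyms(path)
    unchanged = synonyms.check()  # nothing was edited, so nothing is reloaded

    with open(path, "a", encoding="utf-8") as handle:
        handle.write("autumn,fall\n")
    stat = os.stat(path)
    # Push the modification time forward so the change is visible even on coarse clocks.
    os.utime(path, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))
    reloaded = synonyms.check()
    final_rules = list(synonyms.rules)

print(unchanged, reloaded, final_rules)
```

Comparing modification times keeps each periodic check cheap, because the file is only read again when it has actually been touched.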

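How synonym queries work

Analysis

To understand how synonym queries work, you first need to understand how ES breaks text into terms (Term). Analysis in ES means splitting a piece of text into a series of words; it is also called text analysis. In ES, an analyzer (Analyzer) is responsible for this whole process.

Analysis demonstration

An ES analyzer is made up of character filters (Character Filter), a tokenizer (Tokenizer), and token filters (Token Filter).

Character filter (Character Filter)
1. Receives the text as a stream of characters and can transform it by adding, removing, or changing characters.
2. An analyzer can have zero or more character filters.
Tokenizer (Tokenizer)
1. Splits the text that has passed through the character filters into tokens according to a set of rules. An analyzer has exactly one tokenizer.
Token filter (Token Filter)
1. Processes the tokens produced by the tokenizer and can add, remove, or modify tokens. An analyzer can have multiple token filters.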
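
The three stages can be chained in a few lines of Python to show the order in which they run. The filters below (html_strip, lowercase, drop_stopwords) are toy stand-ins written for this sketch, not the ES implementations of the same ideas.

```python
import re


def html_strip(text):
    """Character filter: drop anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", "", text)


def whitespace_tokenizer(text):
    """Tokenizer: split on runs of whitespace."""
    return text.split()


def lowercase(tokens):
    """Token filter: modify tokens."""
    return [token.lower() for token in tokens]


def drop_stopwords(tokens, stopwords=frozenset({"for", "and"})):
    """Token filter: remove tokens."""
    return [token for token in tokens if token not in stopwords]


def make_analyzer(char_filters, tokenizer, token_filters):
    """Compose zero or more char filters, one tokenizer, zero or more token filters."""
    def analyze(text):
        for char_filter in char_filters:
            text = char_filter(text)
        tokens = tokenizer(text)
        for token_filter in token_filters:
            tokens = token_filter(tokens)
        return tokens
    return analyze


analyzer = make_analyzer([html_strip], whitespace_tokenizer, [lowercase, drop_stopwords])
result = analyzer("<b>Ruffled</b> Shirt for Women")
print(result)  # ['ruffled', 'shirt', 'women']
```

The signature of make_analyzer reflects the constraints listed above: lists for the character filters and token filters, and a single tokenizer in between.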

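The synonym filter

The key to synonym queries is a custom token filter. When this filter receives data from the tokenizer (I will call it token data for now), it compares that data against the rules in the synonym file the user provided. When a token has synonyms, the filter selects the group of terms to search for according to those rules, and the search then runs against the whole group.

We can run an experiment with the index created earlier. The index uses the custom analyzer my_whitespace, whose tokenizer is the whitespace tokenizer and whose token filter is the custom synonym filter. It follows that the only difference between our custom analyzer and the built-in whitespace analyzer is the token filter.

Let's use the built-in `whitespace` analyzer to see how the text is analyzed: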

GET fond_goods/_analyze
{
  "analyzer": "whitespace",
  "field":"context", 
  "text": "A"
}

Result

{
  "tokens" : [
    {
      "token" : "A",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "word",
      "position" : 0
    }
  ]
}

After passing through the analyzer, the string A yields the single token "A".

Now let's try a longer string

GET fond_goods/_analyze
{
  "analyzer": "whitespace",
  "field":"context", 
  "text": "ruffled shirt for women 2021 fall slim fit pure color all matching off-neck lantern long sleeve slim women short shirt"
}

Result

{
  "tokens" : [
    {
      "token" : "ruffled",
      "start_offset" : 0,
      "end_offset" : 7,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "shirt",
      "start_offset" : 8,
      "end_offset" : 13,
      "type" : "word",
      "position" : 1
    },
    {
      "token" : "for",
      "start_offset" : 14,
      "end_offset" : 17,
      "type" : "word",
      "position" : 2
    },
    {
      "token" : "women",
      "start_offset" : 18,
      "end_offset" : 23,
      "type" : "word",
      "position" : 3
    },
    {
      "token" : "2021",
      "start_offset" : 24,
      "end_offset" : 28,
      "type" : "word",
      "position" : 4
    },
    {
      "token" : "fall",
      "start_offset" : 29,
      "end_offset" : 33,
      "type" : "word",
      "position" : 5
    },
    {
      "token" : "slim",
      "start_offset" : 34,
      "end_offset" : 38,
      "type" : "word",
      "position" : 6
    },
    {
      "token" : "fit",
      "start_offset" : 39,
      "end_offset" : 42,
      "type" : "word",
      "position" : 7
    },
    {
      "token" : "pure",
      "start_offset" : 43,
      "end_offset" : 47,
      "type" : "word",
      "position" : 8
    },
    {
      "token" : "color",
      "start_offset" : 48,
      "end_offset" : 53,
      "type" : "word",
      "position" : 9
    },
    {
      "token" : "all",
      "start_offset" : 54,
      "end_offset" : 57,
      "type" : "word",
      "position" : 10
    },
    {
      "token" : "matching",
      "start_offset" : 58,
      "end_offset" : 66,
      "type" : "word",
      "position" : 11
    },
    {
      "token" : "off-neck",
      "start_offset" : 67,
      "end_offset" : 75,
      "type" : "word",
      "position" : 12
    },
    {
      "token" : "lantern",
      "start_offset" : 76,
      "end_offset" : 83,
      "type" : "word",
      "position" : 13
    },
    {
      "token" : "long",
      "start_offset" : 84,
      "end_offset" : 88,
      "type" : "word",
      "position" : 14
    },
    {
      "token" : "sleeve",
      "start_offset" : 89,
      "end_offset" : 95,
      "type" : "word",
      "position" : 15
    },
    {
      "token" : "slim",
      "start_offset" : 96,
      "end_offset" : 100,
      "type" : "word",
      "position" : 16
    },
    {
      "token" : "women",
      "start_offset" : 101,
      "end_offset" : 106,
      "type" : "word",
      "position" : 17
    },
    {
      "token" : "short",
      "start_offset" : 107,
      "end_offset" : 112,
      "type" : "word",
      "position" : 18
    },
    {
      "token" : "shirt",
      "start_offset" : 113,
      "end_offset" : 118,
      "type" : "word",
      "position" : 19
    }
  ]
}

As you can see, the whitespace analyzer splits the input string on spaces, producing the tokens above.

Now let's try the custom analyzer

GET fond_goods/_analyze
{
  "analyzer": "my_whitespace",
  "field":"context", 
  "text": "A"
}

Result

{
  "tokens" : [
    {
      "token" : "A",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "B",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "SYNONYM",
      "position" : 0
    },
    {
      "token" : "C",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "SYNONYM",
      "position" : 0
    }
  ]
}

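After passing through the analyzer, the string A yields three tokens, A, B, and C, which are distinguished by the type field: A is marked as word, while B and C are marked as SYNONYM.

Let's try the long string again (note: in the synonym file we defined shirt,shirts as one synonym group and Women,women,girl,girls as another)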

GET fond_goods/_analyze
{
  "analyzer": "my_whitespace",
  "field":"context", 
  "text": "ruffled shirt for women 2021 fall slim fit pure color all matching off-neck lantern long sleeve slim women short shirt"
}

Result

{
  "tokens" : [
    {
      "token" : "ruffled",
      "start_offset" : 0,
      "end_offset" : 7,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "shirt",
      "start_offset" : 8,
      "end_offset" : 13,
      "type" : "word",
      "position" : 1
    },
    {
      "token" : "shirts",
      "start_offset" : 8,
      "end_offset" : 13,
      "type" : "SYNONYM",
      "position" : 1
    },
    {
      "token" : "for",
      "start_offset" : 14,
      "end_offset" : 17,
      "type" : "word",
      "position" : 2
    },
    {
      "token" : "women",
      "start_offset" : 18,
      "end_offset" : 23,
      "type" : "word",
      "position" : 3
    },
    {
      "token" : "Women",
      "start_offset" : 18,
      "end_offset" : 23,
      "type" : "SYNONYM",
      "position" : 3
    },
    {
      "token" : "girl",
      "start_offset" : 18,
      "end_offset" : 23,
      "type" : "SYNONYM",
      "position" : 3
    },
    {
      "token" : "girls",
      "start_offset" : 18,
      "end_offset" : 23,
      "type" : "SYNONYM",
      "position" : 3
    },
    {
      "token" : "2021",
      "start_offset" : 24,
      "end_offset" : 28,
      "type" : "word",
      "position" : 4
    },
    {
      "token" : "fall",
      "start_offset" : 29,
      "end_offset" : 33,
      "type" : "word",
      "position" : 5
    },
    {
      "token" : "autumn",
      "start_offset" : 29,
      "end_offset" : 33,
      "type" : "SYNONYM",
      "position" : 5
    },
    {
      "token" : "slim",
      "start_offset" : 34,
      "end_offset" : 38,
      "type" : "word",
      "position" : 6
    },
    {
      "token" : "fit",
      "start_offset" : 39,
      "end_offset" : 42,
      "type" : "word",
      "position" : 7
    },
    {
      "token" : "pure",
      "start_offset" : 43,
      "end_offset" : 47,
      "type" : "word",
      "position" : 8
    },
    {
      "token" : "color",
      "start_offset" : 48,
      "end_offset" : 53,
      "type" : "word",
      "position" : 9
    },
    {
      "token" : "all",
      "start_offset" : 54,
      "end_offset" : 57,
      "type" : "word",
      "position" : 10
    },
    {
      "token" : "matching",
      "start_offset" : 58,
      "end_offset" : 66,
      "type" : "word",
      "position" : 11
    },
    {
      "token" : "off-neck",
      "start_offset" : 67,
      "end_offset" : 75,
      "type" : "word",
      "position" : 12
    },
    {
      "token" : "lantern",
      "start_offset" : 76,
      "end_offset" : 83,
      "type" : "word",
      "position" : 13
    },
    {
      "token" : "long",
      "start_offset" : 84,
      "end_offset" : 88,
      "type" : "word",
      "position" : 14
    },
    {
      "token" : "sleeve",
      "start_offset" : 89,
      "end_offset" : 95,
      "type" : "word",
      "position" : 15
    },
    {
      "token" : "slim",
      "start_offset" : 96,
      "end_offset" : 100,
      "type" : "word",
      "position" : 16
    },
    {
      "token" : "women",
      "start_offset" : 101,
      "end_offset" : 106,
      "type" : "word",
      "position" : 17
    },
    {
      "token" : "Women",
      "start_offset" : 101,
      "end_offset" : 106,
      "type" : "SYNONYM",
      "position" : 17
    },
    {
      "token" : "girl",
      "start_offset" : 101,
      "end_offset" : 106,
      "type" : "SYNONYM",
      "position" : 17
    },
    {
      "token" : "girls",
      "start_offset" : 101,
      "end_offset" : 106,
      "type" : "SYNONYM",
      "position" : 17
    },
    {
      "token" : "short",
      "start_offset" : 107,
      "end_offset" : 112,
      "type" : "word",
      "position" : 18
    },
    {
      "token" : "shirt",
      "start_offset" : 113,
      "end_offset" : 118,
      "type" : "word",
      "position" : 19
    },
    {
      "token" : "shirts",
      "start_offset" : 113,
      "end_offset" : 118,
      "type" : "SYNONYM",
      "position" : 19
    }
  ]
}

As you can see, after analysis shirt and women each expand into a group of tokens, namely shirt, shirts and women, Women, girl, girls, and fall likewise gains autumn. Each added token is labeled SYNONYM.
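
The behaviour seen in these results can be imitated with a short Python sketch: tokenize on whitespace, then add each token's synonyms at the same position and offsets with type SYNONYM. It models the output shape only and assumes expand == true; the names load_groups and synonym_analyze are invented here, and the order in which real ES emits synonyms may differ.

```python
SYNONYM_FILE = """\
Women,women,girl,girls
autumn,fall
shirt,shirts
A,B,C
"""


def load_groups(text):
    """Parse comma-separated synonym groups, one group per non-empty line."""
    return [[t.strip() for t in line.split(",")] for line in text.splitlines() if line.strip()]


def synonym_analyze(text, groups):
    """Whitespace-tokenize text, then add each token's synonyms at the same position."""
    tokens = []
    offset = 0
    for position, word in enumerate(text.split()):
        start = text.index(word, offset)
        end = start + len(word)
        offset = end
        base = {"start_offset": start, "end_offset": end, "position": position}
        tokens.append({"token": word, "type": "word", **base})
        for group in groups:
            if word in group:
                # Synonyms share the original token's offsets and position.
                for synonym in group:
                    if synonym != word:
                        tokens.append({"token": synonym, "type": "SYNONYM", **base})
    return tokens


groups = load_groups(SYNONYM_FILE)
single = [(t["token"], t["type"], t["position"]) for t in synonym_analyze("A", groups)]
phrase = [(t["token"], t["position"]) for t in synonym_analyze("shirt for women", groups)]
print(single)
print(phrase)
```

Because every synonym shares the position of the token it came from, a query term and its synonyms occupy the same slot in the token stream, which is what lets a search for one of them match any of the others.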

References

The section on how synonym search works draws on

https://blog.csdn.net/woshixubo123/article/details/121774972

and

https://blog.csdn.net/woshixubo123/article/details/121898514

Everything else comes from the official documentation or from my own examples.
