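After getting a rough grasp of Elasticsearch, I started wondering whether it could be used for hot-word analysis. That would show which words cluster in a particular business area, such as player feedback, in-game world chat, or error logs, and that tells us a great deal. So how do you do hot-word analysis with Elasticsearch? For English text it is very simple: use the default analyzer plus a terms aggregation.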
<pre>
curl localhost:9200/top-terms/_search?pretty -d '{
"aggs": {
"top-terms-aggregation": {
"terms": { "field" : "text","size":5 }
}
}
}'
</pre>
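Since the same query shape applies to any text field (feedback, chat, error logs), it may help to generate the request body rather than retype it. The following is only a sketch: `terms_query` is a hypothetical helper name, not part of Elasticsearch or curl, and it assumes the field name contains no characters that need JSON escaping.

```shell
#!/bin/sh
# Hypothetical helper: print a terms-aggregation request body
# for the given field ($1) and number of buckets ($2, default 5).
terms_query() {
    field="$1"
    size="${2:-5}"
    printf '{\n  "aggs": {\n    "top-terms-aggregation": {\n      "terms": { "field": "%s", "size": %d }\n    }\n  }\n}\n' "$field" "$size"
}

# Print the body for the "text" field used in this post.
terms_query text 5

# To send it (assumes a node listening on localhost:9200, as in the post):
# curl "localhost:9200/top-terms/_search?pretty" -d "$(terms_query text 5)"
```

Quoting the URL keeps the shell from trying to interpret the `?` as a glob character.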
The test script is as follows:
<pre>
#!/bin/sh
# Single quotes keep the double quotes inside the JSON intact
test_document='{
"text": "a this is pen dst is a apple"
}'
test_document1='{
"text": "a blue is always pen dst apple"
}'
test_document2='{
"text": "a hello world "
}'
if curl -fs -I -o /dev/null localhost:9200/top-terms; then
echo "Clear the old test index"
curl -X DELETE localhost:9200/top-terms; echo "\n"
fi
echo "Create our first test index"
curl -X POST localhost:9200/top-terms; echo "\n"
echo "Index our test document"
curl -X POST localhost:9200/top-terms/test/1?refresh=true -d "${test_document}"; echo "\n"
curl -X POST localhost:9200/top-terms/test/2?refresh=true -d "${test_document1}"; echo "\n"
curl -X POST localhost:9200/top-terms/test/3?refresh=true -d "${test_document2}"; echo "\n"
echo "Our first test, aggregations, only counts the number of documents that a term matches."
curl localhost:9200/top-terms/_search?pretty -d '{
"aggs": {
"top-terms-aggregation": {
"terms": { "field" : "text","size":5 }
}
}
}'
echo
</pre>
Note:
<pre>
"terms": { "field" : "text","size":5 }  selects the top 5 terms
</pre>
The result of running it is as follows:
<pre>
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 3,
"max_score" : 1.0,
"hits" : [ {
"_index" : "top-terms",
"_type" : "test",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"text" : "a blue is always pen dst apple"
}
}, {
"_index" : "top-terms",
"_type" : "test",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"text" : "a this is pen dst is a apple"
}
}, {
"_index" : "top-terms",
"_type" : "test",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"text" : "a hello world "
}
} ]
},
"aggregations" : {
"top-terms-aggregation" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 5,
"buckets" : [ {
"key" : "a",
"doc_count" : 3
}, {
"key" : "apple",
"doc_count" : 2
}, {
"key" : "dst",
"doc_count" : 2
}, {
"key" : "is",
"doc_count" : 2
}, {
"key" : "pen",
"doc_count" : 2
} ]
}
}
}
</pre>
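Note that the buckets array is exactly the result we need: each key is a term, and doc_count is the number of documents that contain it, not the number of times it occurs.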
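To see why "is" gets a doc_count of 2 even though it appears three times across the documents, the counting can be approximated locally. This is a rough sketch, not how Elasticsearch works internally: `top_terms` is a hypothetical function, it treats each input line as one document, and it only lowercases and splits on whitespace, whereas the real analyzer also handles punctuation.

```shell
#!/bin/sh
# Approximate a terms aggregation: for each term, count the number of
# input lines (documents) that contain it at least once.
top_terms() {
    size="$1"
    tr '[:upper:]' '[:lower:]' |
    awk '{
        split("", seen)                  # clear the per-document set
        for (i = 1; i <= NF; i++) {
            if (!($i in seen)) {         # count each term once per document
                seen[$i] = 1
                count[$i]++
            }
        }
    }
    END { for (t in count) print count[t], t }' |
    sort -k1,1nr -k2,2 |                 # highest count first, ties by term
    head -n "$size"
}

# The three test documents from the script above.
printf '%s\n' \
    'a this is pen dst is a apple' \
    'a blue is always pen dst apple' \
    'a hello world ' | top_terms 5
```

On the three test documents this prints the same five terms and counts as the buckets above (`3 a`, then `apple`, `dst`, `is`, `pen` with 2 each). The five terms left out each occur in one document, which is consistent with the sum_other_doc_count of 5.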