數(shù)據(jù)的關(guān)聯(lián)關(guān)系
-
真實(shí)世界中有很多重要的關(guān)聯(lián)關(guān)系
博客 / 作者 / 評(píng)論
銀?賬戶有多次交易記錄
客戶有多個(gè)銀?賬戶
?錄?件有多個(gè)?件和??錄
關(guān)系型數(shù)據(jù)庫(kù)的范式化設(shè)計(jì)
關(guān)系型數(shù)據(jù)庫(kù)的范式化設(shè)計(jì) |
---|
1NF – 消除?主屬性對(duì)鍵的部分函數(shù)依賴 |
2NF – 消除?主要屬性對(duì)鍵的傳遞函數(shù)依賴 |
3NF – 消除主屬性對(duì)鍵的傳遞函數(shù)依賴 |
BCNF –主屬性不依賴于主屬性 |
范式化設(shè)計(jì)(Normalization)的主要?標(biāo)是“減少不必要 的更新”
副作?:?個(gè)完全范式化設(shè)計(jì)的數(shù)據(jù)庫(kù)會(huì)經(jīng)常?臨 “查詢緩慢”的問題
數(shù)據(jù)庫(kù)越范式化,就需要 Join 越多的表
范式化節(jié)省了存儲(chǔ)空間,但是存儲(chǔ)空間卻越來(lái)越便宜
范式化簡(jiǎn)化了更新,但是數(shù)據(jù)“讀”取操作可能更多
Denormalization
-
反范式化設(shè)計(jì)
- 數(shù)據(jù) “Flattening”,不使?關(guān)聯(lián)關(guān)系胸哥,?是在?檔中保存冗余的數(shù)據(jù)拷?
-
優(yōu)點(diǎn):?需處理 Joins 操作,數(shù)據(jù)讀取性能好
- Elasticsearch 通過壓縮 _source 字段,減少磁盤空間的開銷
-
缺點(diǎn):不適合在數(shù)據(jù)頻繁修改的場(chǎng)景
- ?條數(shù)據(jù)(?戶名)的改動(dòng)火邓,可能會(huì)引起很多數(shù)據(jù)的更新
在 Elasticsearch 中處理關(guān)聯(lián)關(guān)系
-
關(guān)系型數(shù)據(jù)庫(kù),?般會(huì)考慮 Normalize 數(shù)據(jù)德撬;在 Elasticsearch铲咨,往往考慮 Denormalize 數(shù)據(jù)
- Denormalize 的好處:讀的速度變快 / ?需表連接 / ?需?鎖
-
Elasticsearch 并不擅?處理關(guān)聯(lián)關(guān)系。我們?般采?以下四種?法處理關(guān)聯(lián)
對(duì)象類型
嵌套對(duì)象(Nested Object)
??關(guān)聯(lián)關(guān)系(Parent / Child )
應(yīng)?端關(guān)聯(lián)
案例 1:博客和其作者信息
-
對(duì)象類型
在每?博客的?檔中都保留作者的信息
如果作者信息發(fā)?變化蜓洪,需要修改相關(guān)的 博客?檔
# 插入一條 Blog 信息
PUT blog/_doc/1
{
"content":"I like Elasticsearch",
"time":"2019-01-01T00:00:00",
"user":{
"userid":1,
"username":"Jack",
"city":"Shanghai"
}
}
- 通過?條查詢即可獲取到博客和作者信息
# 查詢 Blog 信息
POST blog/_search
{
"query": {
"bool": {
"must": [
{"match": {"content": "Elasticsearch"}},
{"match": {"user.username": "Jack"}}
]
}
}
}
res:
"max_score" : 0.5753642,
"hits" : [
{
"_index" : "blog",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.5753642,
"_source" : {
"content" : "I like Elasticsearch",
"time" : "2019-01-01T00:00:00",
"user" : {
"userid" : 1,
"username" : "Jack",
"city" : "Shanghai"
}
}
}
]
案例 2:包含對(duì)象數(shù)組的?檔
DELETE my_movies
# 電影的Mapping信息
PUT my_movies
{
"mappings" : {
"properties" : {
"actors" : {
"properties" : {
"first_name" : {
"type" : "keyword"
},
"last_name" : {
"type" : "keyword"
}
}
},
"title" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
# 寫入一條電影信息
POST my_movies/_doc/1
{
"title":"Speed",
"actors":[
{
"first_name":"Keanu",
"last_name":"Reeves"
},
{
"first_name":"Dennis",
"last_name":"Hopper"
}
]
}
# 查詢電影信息
POST my_movies/_search
{
"query": {
"bool": {
"must": [
{"match": {"actors.first_name": "Keanu"}},
{"match": {"actors.last_name": "Hopper"}}
]
}
}
}
res:
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.723315,
"hits" : [
{
"_index" : "my_movies",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.723315,
"_source" : {
"title" : "Speed",
"actors" : [
{
"first_name" : "Keanu",
"last_name" : "Reeves"
},
{
"first_name" : "Dennis",
"last_name" : "Hopper"
}
]
}
}
]
}
為什么會(huì)搜到不需要的結(jié)果纤勒?
存儲(chǔ)時(shí),內(nèi)部對(duì)象的邊界并沒有考慮在內(nèi)蝠咆,JSON 格式被處理成扁平式鍵值對(duì)的結(jié)構(gòu)
當(dāng)對(duì)多個(gè)字段進(jìn)?查詢時(shí)踊东,導(dǎo)致了意外的搜索結(jié)果
可以? Nested Data Type 解決這個(gè)問題
什么是 Nested Data Type
Nested 數(shù)據(jù)類型:允許對(duì)象數(shù)組中的 對(duì)象被獨(dú)?索引
使? nested 和 properties 關(guān)鍵字,將所有 actors 索引到多個(gè)分隔的?檔
在內(nèi)部刚操, Nested ?檔會(huì)被保存在兩個(gè) Lucene ?檔中闸翅,在查詢時(shí)做 Join 處理
DELETE my_movies
# 創(chuàng)建 Nested 對(duì)象 Mapping
PUT my_movies
{
"mappings" : {
"properties" : {
"actors" : {
"type": "nested",
"properties" : {
"first_name" : {"type" : "keyword"},
"last_name" : {"type" : "keyword"}
}},
"title" : {
"type" : "text",
"fields" : {"keyword":{"type":"keyword","ignore_above":256}}
}
}
}
}
嵌套查詢
- 在內(nèi)部, Nested ?檔會(huì)被保存在兩個(gè) Lucene ?檔中菊霜,會(huì)在查詢時(shí)做 Join 處理
# Nested 查詢
POST my_movies/_search
{
"query": {
"bool": {
"must": [
{"match": {"title": "Speed"}},
{
"nested": {
"path": "actors",
"query": {
"bool": {
"must": [
{"match": {
"actors.first_name": "Keanu"
}},
{"match": {
"actors.last_name": "Hopper"
}}
]
}
}
}
}
]
}
}
}
嵌套聚合
# 普通 aggregation不工作
POST my_movies/_search
{
"size": 0,
"aggs": {
"NAME": {
"terms": {
"field": "actors.first_name",
"size": 10
}
}
}
}
# Nested Aggregation
POST my_movies/_search
{
"size": 0,
"aggs": {
"actors": {
"nested": {
"path": "actors"
},
"aggs": {
"actor_name": {
"terms": {
"field": "actors.first_name",
"size": 10
}
}
}
}
}
}
res:
"aggregations" : {
"actors" : {
"doc_count" : 2,
"actor_name" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "Dennis",
"doc_count" : 1
},
{
"key" : "Keanu",
"doc_count" : 1
}
]
}
}
}
本節(jié)知識(shí)點(diǎn)
-
在 Elasticsearch 中坚冀,往往會(huì) Denormalize 數(shù)據(jù)的?式建模(使?對(duì)象的?式)
- 好處是:讀寫的速度變快 / ?需表連接 / ?需?鎖
如果?檔的更新并不頻繁,可以在?檔中使?對(duì)象
-
當(dāng)對(duì)象包含了多值對(duì)象時(shí)
- 可以使?嵌套對(duì)象(Nested Object)解決查詢正確性的問題
課程demos
DELETE blog
# 設(shè)置blog的 Mapping
PUT /blog
{
"mappings": {
"properties": {
"content": {
"type": "text"
},
"time": {
"type": "date"
},
"user": {
"properties": {
"city": {
"type": "text"
},
"userid": {
"type": "long"
},
"username": {
"type": "keyword"
}
}
}
}
}
}
# 插入一條 Blog 信息
PUT blog/_doc/1
{
"content":"I like Elasticsearch",
"time":"2019-01-01T00:00:00",
"user":{
"userid":1,
"username":"Jack",
"city":"Shanghai"
}
}
# 查詢 Blog 信息
POST blog/_search
{
"query": {
"bool": {
"must": [
{"match": {"content": "Elasticsearch"}},
{"match": {"user.username": "Jack"}}
]
}
}
}
DELETE my_movies
# 電影的Mapping信息
PUT my_movies
{
"mappings" : {
"properties" : {
"actors" : {
"properties" : {
"first_name" : {
"type" : "keyword"
},
"last_name" : {
"type" : "keyword"
}
}
},
"title" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
# 寫入一條電影信息
POST my_movies/_doc/1
{
"title":"Speed",
"actors":[
{
"first_name":"Keanu",
"last_name":"Reeves"
},
{
"first_name":"Dennis",
"last_name":"Hopper"
}
]
}
# 查詢電影信息
POST my_movies/_search
{
"query": {
"bool": {
"must": [
{"match": {"actors.first_name": "Keanu"}},
{"match": {"actors.last_name": "Hopper"}}
]
}
}
}
DELETE my_movies
# 創(chuàng)建 Nested 對(duì)象 Mapping
PUT my_movies
{
"mappings" : {
"properties" : {
"actors" : {
"type": "nested",
"properties" : {
"first_name" : {"type" : "keyword"},
"last_name" : {"type" : "keyword"}
}},
"title" : {
"type" : "text",
"fields" : {"keyword":{"type":"keyword","ignore_above":256}}
}
}
}
}
POST my_movies/_doc/1
{
"title":"Speed",
"actors":[
{
"first_name":"Keanu",
"last_name":"Reeves"
},
{
"first_name":"Dennis",
"last_name":"Hopper"
}
]
}
# Nested 查詢
POST my_movies/_search
{
"query": {
"bool": {
"must": [
{"match": {"title": "Speed"}},
{
"nested": {
"path": "actors",
"query": {
"bool": {
"must": [
{"match": {
"actors.first_name": "Keanu"
}},
{"match": {
"actors.last_name": "Hopper"
}}
]
}
}
}
}
]
}
}
}
# Nested Aggregation
POST my_movies/_search
{
"size": 0,
"aggs": {
"actors": {
"nested": {
"path": "actors"
},
"aggs": {
"actor_name": {
"terms": {
"field": "actors.first_name",
"size": 10
}
}
}
}
}
}
# 普通 aggregation不工作
POST my_movies/_search
{
"size": 0,
"aggs": {
"NAME": {
"terms": {
"field": "actors.first_name",
"size": 10
}
}
}
}
相關(guān)閱讀