Hive一行拆多行、多行拼一行

一行拆多行結(jié)合使用了lateral viewexplode焕窝。
多行拼一行結(jié)合使用了concat_wscollect_set棒呛,某些hive版本支持group_concat可以代替前者。

舉例說明:
有兩張表sen_tbl(敏感字段定義表)告唆、msg_tbl(消息表)莫秆。其中消息表的消息是一個json字符串,最多有兩層json嵌套悔详。而json消息中的某些value是含有敏感信息的,key-value是否敏感的定義存放在sen_tbl中惹挟。

現(xiàn)在要做的是將json消息中的敏感value加密茄螃,然后分成不含敏感的json,和只含敏感的json连锯。所以需要分為兩步:

  1. 拆分json归苍,并加密敏感value
  2. 分別合并敏感和非敏感鍵值對用狱,合并成2個json,放在不同的列
1. 建表拼弃,插入數(shù)據(jù):
create table sen_tbl (
    col_nam string,  --字段(key)名稱
    is_sen string  --是否敏感:1敏感夏伊,2不敏感
);

insert into sen_tbl values('CRDACCPTNMELCT','1');  --敏感
insert into sen_tbl values('TXNMERCHNO','1');  --敏感
insert into sen_tbl values('ACCTNBR','0');
insert into sen_tbl values('CRSERVICECODE','0');
insert into sen_tbl values('modelFileId','0');
insert into sen_tbl values('trs_ist','0');


drop table if exists msg_tbl;
create table msg_tbl (
    msg_id string,  --消息id
    json_msg string  --消息主體
);


insert into msg_tbl values(
'1',
'{
    "sceneParams": {
        "ACCTNBR": "0100020100001001",
        "CRDACCPTNMELCT": "如家和美酒店管理有限公司如家快捷天壇店",
        "CRSERVICECODE": "340",
        "TXNMERCHNO": "303605280000102"
    },
    "modelFileId": 2,
    "modelFileThreshold": {
        "trs_ist": 0.4
    }
}'
);

insert into msg_tbl values(
'2',
'{
    "sceneParams": {
        "ACCTNBR": "0100020100001222",
        "CRDACCPTNMELCT": "招商銀行",
        "CRSERVICECODE": "340",
        "TXNMERCHNO": "303605285555555"
    }
}'
);
2. 將json消息的第一層拆分成多行數(shù)據(jù):
create table explode_msg as
select msg_id,   --消息id
       '1' as json_lv,   --key在json中的層級
       tbl1.key, 
       tbl1.value
  from msg_tbl
  lateral view explode(default.json_to_map(json_msg)) tbl1 as key,value
;

SQL執(zhí)行結(jié)果數(shù)據(jù)在簡書中沒有對齊,在其他編輯器入notepad++中是可以對齊的

+---------------------+----------------------+---------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
| explode_msg.msg_id  | explode_msg.json_lv  |   explode_msg.key   |                                                               explode_msg.value                                                                |
+---------------------+----------------------+---------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
| 1                   | 1                    | modelFileId         | 2                                                                                                                                              |
| 1                   | 1                    | sceneParams         | {"CRDACCPTNMELCT":"如家和美酒店管理有限公司如家快捷天壇店","CRSERVICECODE":"340","ACCTNBR":"0100020100001001","TXNMERCHNO":"303605280000102"}  |
| 1                   | 1                    | modelFileThreshold  | {"trs_ist":0.4}                                                                                                                                |
| 2                   | 1                    | sceneParams         | {"CRDACCPTNMELCT":"招商銀行","CRSERVICECODE":"340","ACCTNBR":"0100020100001222","TXNMERCHNO":"303605285555555"}                                |
+---------------------+----------------------+---------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
3. 將json消息第二層拆分成多行數(shù)據(jù):
drop table if exists explode_msg_2;
create table explode_msg_2 as
select msg_id, 
       '2' as json_lv, 
       key, 
       tbl1.key2, 
       tbl1.value2
  from explode_msg
  lateral view explode(default.json_to_map(value)) tbl1 as key2,value2
;
+-----------------------+------------------------+---------------------+---------------------+-----------------------------------------+
| explode_msg_2.msg_id  | explode_msg_2.json_lv  |  explode_msg_2.key  | explode_msg_2.key2  |          explode_msg_2.value2           |
+-----------------------+------------------------+---------------------+---------------------+-----------------------------------------+
| 1                     | 2                      | sceneParams         | CRDACCPTNMELCT      | 如家和美酒店管理有限公司如家快捷天壇店  |
| 1                     | 2                      | sceneParams         | CRSERVICECODE       | 340                                     |
| 1                     | 2                      | sceneParams         | ACCTNBR             | 0100020100001001                        |
| 1                     | 2                      | sceneParams         | TXNMERCHNO          | 303605280000102                         |
| 1                     | 2                      | modelFileThreshold  | trs_ist             | 0.4                                     |
| 2                     | 2                      | sceneParams         | CRDACCPTNMELCT      | 招商銀行                                |
| 2                     | 2                      | sceneParams         | CRSERVICECODE       | 340                                     |
| 2                     | 2                      | sceneParams         | ACCTNBR             | 0100020100001222                        |
| 2                     | 2                      | sceneParams         | TXNMERCHNO          | 303605285555555                         |
+-----------------------+------------------------+---------------------+---------------------+-----------------------------------------+
4. 合并以上兩步的數(shù)據(jù)吻氧,并對敏感value加密:
drop table if exists explode_msg_fnl;
create table explode_msg_fnl as
select msg_id,
       json_lv,
       key as key1,
       key2,
       value2 as value,
       case when b.is_sen = '1' then md5(a.value2) else a.value2 end as value_sen
  from explode_msg_2 a
  join sen_tbl b
    on a.key2 = b.col_nam
union all
select msg_id,
       json_lv,
       key as key1,
       null,
       value,
       case when b.is_sen = '1' then md5(a.value) else a.value end as value_sen
  from explode_msg a
  join sen_tbl b
    on a.key = b.col_nam
 where value not like '{"%'
;
+-------------------------+--------------------------+-----------------------+-----------------------+-----------------------------------------+-----------------------------------+
| explode_msg_fnl.msg_id  | explode_msg_fnl.json_lv  | explode_msg_fnl.key1  | explode_msg_fnl.key2  |          explode_msg_fnl.value          |     explode_msg_fnl.value_sen     |
+-------------------------+--------------------------+-----------------------+-----------------------+-----------------------------------------+-----------------------------------+
| 1                       | 1                        | modelFileId           | NULL                  | 2                                       | 2                                 |
| 1                       | 2                        | sceneParams           | ACCTNBR               | 0100020100001001                        | 0100020100001001                  |
| 1                       | 2                        | modelFileThreshold    | trs_ist               | 0.4                                     | 0.4                               |
| 1                       | 2                        | sceneParams           | CRSERVICECODE         | 340                                     | 340                               |
| 1                       | 2                        | sceneParams           | TXNMERCHNO            | 303605280000102                         | ecb0166601a8264180164810a2df4ee9  |
| 1                       | 2                        | sceneParams           | CRDACCPTNMELCT        | 如家和美酒店管理有限公司如家快捷天壇店  | d531cfc939890cfbb3127f59bc30060a  |
| 2                       | 2                        | sceneParams           | ACCTNBR               | 0100020100001222                        | 0100020100001222                  |
| 2                       | 2                        | sceneParams           | CRSERVICECODE         | 340                                     | 340                               |
| 2                       | 2                        | sceneParams           | TXNMERCHNO            | 303605285555555                         | 0ab5cdd213a8313d69b3d8e1b5e1eadb  |
| 2                       | 2                        | sceneParams           | CRDACCPTNMELCT        | 招商銀行                                | e0f88f4dbec781d1ab8402e53f0e25c3  |
+-------------------------+--------------------------+-----------------------+-----------------------+-----------------------------------------+-----------------------------------+
5. 重新合并json的第二層的key溺忧、value,多行合并為一行:
drop table if exists result_msg;
create table result_msg as
select msg_id, key1, is_sen, concat('"',value,'"') as key1_val
  from explode_msg_fnl
 where json_lv = 1

union all
select msg_id, 
       key1,
       is_sen,
       concat('{', concat_ws(',', collect_set(map_val)), '}') as key1_val
  from (select msg_id, 
               key1, 
               concat('"',key2,'":"',value,'"') as map_val,
               is_sen
          from explode_msg_fnl
         where json_lv = 2
       ) t
 group by msg_id, key1, is_sen
;
+--------------------+---------------------+--------------------+--------------------------------------------------------------------------------------------------------+
| result_msg.msg_id  |   result_msg.key1   | result_msg.is_sen  |                                          result_msg.key1_val                                           |
+--------------------+---------------------+--------------------+--------------------------------------------------------------------------------------------------------+
| 1                  | modelFileThreshold  | 0                  | {"trs_ist":"0.4"}                                                                                      |
| 1                  | sceneParams         | 0                  | {"CRSERVICECODE":"340","ACCTNBR":"0100020100001001"}                                                   |
| 1                  | sceneParams         | 1                  | {"CRDACCPTNMELCT":"d531cfc939890cfbb3127f59bc30060a","TXNMERCHNO":"ecb0166601a8264180164810a2df4ee9"}  |
| 2                  | sceneParams         | 0                  | {"CRSERVICECODE":"340","ACCTNBR":"0100020100001222"}                                                   |
| 2                  | sceneParams         | 1                  | {"CRDACCPTNMELCT":"e0f88f4dbec781d1ab8402e53f0e25c3","TXNMERCHNO":"0ab5cdd213a8313d69b3d8e1b5e1eadb"}  |
| 1                  | modelFileId         | 0                  | "2"                                                                                                    |
+--------------------+---------------------+--------------------+--------------------------------------------------------------------------------------------------------+
6. 合并json第一層的key盯孙、value:
drop table if exists result_msg_2;
create table result_msg_2 as
select msg_id, 
       is_sen,
       concat('{', concat_ws(',', collect_set(json_val)), '}') as json_val
  from (select msg_id,
               is_sen,
               concat('"',key1,'": ',key1_val) as json_val
          from result_msg
       ) t
 group by msg_id, is_sen
;
+----------------------+----------------------+-----------------------------------------------------------------------------------------------------------------------------------+
| result_msg_2.msg_id  | result_msg_2.is_sen  |                                                       result_msg_2.json_val                                                       |
+----------------------+----------------------+-----------------------------------------------------------------------------------------------------------------------------------+
| 1                    | 0                    | {"modelFileThreshold": {"trs_ist":"0.4"},"sceneParams": {"CRSERVICECODE":"340","ACCTNBR":"0100020100001001"},"modelFileId": "2"}  |
| 1                    | 1                    | {"sceneParams": {"CRDACCPTNMELCT":"d531cfc939890cfbb3127f59bc30060a","TXNMERCHNO":"ecb0166601a8264180164810a2df4ee9"}}            |
| 2                    | 0                    | {"sceneParams": {"CRSERVICECODE":"340","ACCTNBR":"0100020100001222"}}                                                             |
| 2                    | 1                    | {"sceneParams": {"CRDACCPTNMELCT":"e0f88f4dbec781d1ab8402e53f0e25c3","TXNMERCHNO":"0ab5cdd213a8313d69b3d8e1b5e1eadb"}}            |
+----------------------+----------------------+-----------------------------------------------------------------------------------------------------------------------------------+
7. 行轉(zhuǎn)列鲁森,一列為不包含敏感的json,另一列為包含敏感的json:
create table result_msg_fnl as
select msg_id,
       max(case when is_sen = 0 then json_val end) as json_val,
       max(case when is_sen = 1 then json_val end) as json_val_sen
  from result_msg_2
 group by msg_id
;
+------------------------+-----------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------+
| result_msg_fnl.msg_id  |                                                      result_msg_fnl.json_val                                                      |                                               result_msg_fnl.json_val_sen                                               |
+------------------------+-----------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------+
| 1                      | {"modelFileThreshold": {"trs_ist":"0.4"},"sceneParams": {"CRSERVICECODE":"340","ACCTNBR":"0100020100001001"},"modelFileId": "2"}  | {"sceneParams": {"CRDACCPTNMELCT":"d531cfc939890cfbb3127f59bc30060a","TXNMERCHNO":"ecb0166601a8264180164810a2df4ee9"}}  |
| 2                      | {"sceneParams": {"CRSERVICECODE":"340","ACCTNBR":"0100020100001222"}}                                                             | {"sceneParams": {"CRDACCPTNMELCT":"e0f88f4dbec781d1ab8402e53f0e25c3","TXNMERCHNO":"0ab5cdd213a8313d69b3d8e1b5e1eadb"}}  |
+------------------------+-----------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------+
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者
  • 序言:七十年代末振惰,一起剝皮案震驚了整個濱河市歌溉,隨后出現(xiàn)的幾起案子,更是在濱河造成了極大的恐慌骑晶,老刑警劉巖痛垛,帶你破解...
    沈念sama閱讀 217,406評論 6 503
  • 序言:濱河連續(xù)發(fā)生了三起死亡事件,死亡現(xiàn)場離奇詭異桶蛔,居然都是意外死亡匙头,警方通過查閱死者的電腦和手機,發(fā)現(xiàn)死者居然都...
    沈念sama閱讀 92,732評論 3 393
  • 文/潘曉璐 我一進(jìn)店門羽圃,熙熙樓的掌柜王于貴愁眉苦臉地迎上來乾胶,“玉大人,你說我怎么就攤上這事朽寞∈读” “怎么了?”我有些...
    開封第一講書人閱讀 163,711評論 0 353
  • 文/不壞的土叔 我叫張陵脑融,是天一觀的道長喻频。 經(jīng)常有香客問我,道長肘迎,這世上最難降的妖魔是什么甥温? 我笑而不...
    開封第一講書人閱讀 58,380評論 1 293
  • 正文 為了忘掉前任,我火速辦了婚禮妓布,結(jié)果婚禮上姻蚓,老公的妹妹穿的比我還像新娘。我一直安慰自己匣沼,他們只是感情好狰挡,可當(dāng)我...
    茶點故事閱讀 67,432評論 6 392
  • 文/花漫 我一把揭開白布。 她就那樣靜靜地躺著,像睡著了一般加叁。 火紅的嫁衣襯著肌膚如雪倦沧。 梳的紋絲不亂的頭發(fā)上,一...
    開封第一講書人閱讀 51,301評論 1 301
  • 那天它匕,我揣著相機與錄音展融,去河邊找鬼。 笑死豫柬,一個胖子當(dāng)著我的面吹牛告希,可吹牛的內(nèi)容都是我干的。 我是一名探鬼主播轮傍,決...
    沈念sama閱讀 40,145評論 3 418
  • 文/蒼蘭香墨 我猛地睜開眼暂雹,長吁一口氣:“原來是場噩夢啊……” “哼!你這毒婦竟也來了创夜?” 一聲冷哼從身側(cè)響起杭跪,我...
    開封第一講書人閱讀 39,008評論 0 276
  • 序言:老撾萬榮一對情侶失蹤,失蹤者是張志新(化名)和其女友劉穎驰吓,沒想到半個月后涧尿,有當(dāng)?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體,經(jīng)...
    沈念sama閱讀 45,443評論 1 314
  • 正文 獨居荒郊野嶺守林人離奇死亡檬贰,尸身上長有42處帶血的膿包…… 初始之章·張勛 以下內(nèi)容為張勛視角 年9月15日...
    茶點故事閱讀 37,649評論 3 334
  • 正文 我和宋清朗相戀三年姑廉,在試婚紗的時候發(fā)現(xiàn)自己被綠了。 大學(xué)時的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片翁涤。...
    茶點故事閱讀 39,795評論 1 347
  • 序言:一個原本活蹦亂跳的男人離奇死亡桥言,死狀恐怖,靈堂內(nèi)的尸體忽然破棺而出葵礼,到底是詐尸還是另有隱情号阿,我是刑警寧澤,帶...
    沈念sama閱讀 35,501評論 5 345
  • 正文 年R本政府宣布鸳粉,位于F島的核電站扔涧,受9級特大地震影響,放射性物質(zhì)發(fā)生泄漏届谈。R本人自食惡果不足惜枯夜,卻給世界環(huán)境...
    茶點故事閱讀 41,119評論 3 328
  • 文/蒙蒙 一、第九天 我趴在偏房一處隱蔽的房頂上張望艰山。 院中可真熱鬧湖雹,春花似錦、人聲如沸曙搬。這莊子的主人今日做“春日...
    開封第一講書人閱讀 31,731評論 0 22
  • 文/蒼蘭香墨 我抬頭看了看天上的太陽。三九已至舔腾,卻和暖如春,著一層夾襖步出監(jiān)牢的瞬間搂擦,已是汗流浹背稳诚。 一陣腳步聲響...
    開封第一講書人閱讀 32,865評論 1 269
  • 我被黑心中介騙來泰國打工, 沒想到剛下飛機就差點兒被人妖公主榨干…… 1. 我叫王不留瀑踢,地道東北人扳还。 一個月前我還...
    沈念sama閱讀 47,899評論 2 370
  • 正文 我出身青樓,卻偏偏與公主長得像橱夭,于是被迫代替她去往敵國和親氨距。 傳聞我的和親對象是個殘疾皇子,可洞房花燭夜當(dāng)晚...
    茶點故事閱讀 44,724評論 2 354