hive建表語句:
1.建內(nèi)部表
hive (badou)> create table udata(userid int,item_id int,rating int,`timestamp` timestamp)row format delimited fields terminated by ' ';
OK
Time taken: 2.254 seconds
hive (badou)> show tables;
OK
udata
create table if not exists inner_test (
aisle_id string,
aisle_name string
)
row format delimited fields terminated by ',' lines terminated by '\n'
stored as textfile
location '/data/inner';
2.建外部表
create external table if not exists ext_test (
aisle_id string,
aisle_name string
)
row format delimited fields terminated by ',' lines terminated by '\n'
stored as textfile
location '/data/ext';
3.建分區(qū)表
create table partition_test(
order_id string,
user_id string,
eval_set string,
order_number string,
order_hour_of_day string,
days_since_prior_order string
)partitioned by(order_dow string)
row format delimited fields terminated by '\t';
hive導(dǎo)入數(shù)據(jù):
1.從文件導(dǎo)入數(shù)據(jù)到表
hive (badou)> load data local inpath '/home/dongdong/hive/u.data' overwrite into table udata;
Loading data to table badou.udata
OK
Time taken: 2.335 seconds
2.給分區(qū)表插入數(shù)據(jù)
insert overwrite table partition_test partition (order_dow='1')
select order_id,user_id,eval_set,order_number,order_hour_of_day,days_since_prior_order from orders where order_dow='1' limit 10;
Hive刪除表字段
CREATE TABLE test (
creatingTs BIGINT,
a STRING,
b BIGINT,
c STRING,
d STRING,
e BIGINT,
f BIGINT
);
如果需要刪除 column f 列包个,可以使用以下語句:
ALTER TABLE test REPLACE COLUMNS (
creatingTs BIGINT,
a STRING,
b BIGINT,
c STRING,
d STRING,
e BIGINT
);
Hive添加表字段
創(chuàng)建測試表:
use mart_flow_test;
create table if not exists mart_flow_test.detail_flow_test
(
union_id string comment '設(shè)備唯一標識'
) comment '測試表'
partitioned by (
partition_date string comment '日志生成日期'
) stored as orc;
(2)新增字段:use mart_flow_test;
alter table detail_flow_test add columns(original_union_id string);
(3)修改注釋:use mart_flow_test;
alter table detail_flow_conversion_base_raw change column original_union_id original_union_id string COMMENT'原始設(shè)備唯一性標識’;
Hive刪除表中數(shù)據(jù)
方式一:僅刪除表中數(shù)據(jù)比吭,保留表結(jié)構(gòu)
truncate table 表名;
(truncate用于刪除所有的行,這個行為在hive元存儲刪除數(shù)據(jù)是不可逆的)
或 delete from 表名 where 1 = 1 ;
(delete用于刪除特定條件下的行掘托,使用where 1=1 刪除所有行 SQL中where 1 = 1 的使用)
truncate 不能刪除外部表练俐!因為外部表里的數(shù)據(jù)并不是存放在Hive Meta store中
方式二:刪除整個表
drop table 表名;
如果要永久性刪除饮潦,不準備再恢復(fù):
drop table 表名 purge;
不用終端的方式執(zhí)行sql:
hive -f create_table.sql
hive實現(xiàn)wordcount的sql語句
select word,count(*)
from (
select
explode(split(sentence,''))
as word
from article
)t
group by word
hive的udf爆袍,udtf內(nèi)置函數(shù)
udf實踐:
1.java代碼并打包成jar:
2.將jar添加到hive中,并創(chuàng)建臨時函數(shù):
3.使用函數(shù)及結(jié)果:
udtf實踐:
1.編寫java代碼并打包成jar:
2.添加到hive践宴,并創(chuàng)建臨時函數(shù):
3.載入數(shù)據(jù)并創(chuàng)建hive表:
4.使用函數(shù)及結(jié)果:
transform實踐(可用shell語言,python語言開發(fā))
1.shell實踐:
1.編寫awk文件:
2.加入hive
3.使用awk腳本爷怀,及結(jié)果:
2.python實踐:
使用1:
把py文件添加到hive中:
使用及結(jié)果:
wordcount實踐:
1.創(chuàng)建hive表
2.導(dǎo)入數(shù)據(jù):
- 創(chuàng)建另一張表(存儲):
- 新建map.py阻肩,red.py并add到hive:
- 使用及結(jié)果:
把wordcount結(jié)果插入到指定表中: