1、hive的分組和組內(nèi)排序---語法
語法:
row_number() over (partition by 字段a order by 計算項b desc ) rank
- rank是排序的別名
- partition by:類似hive的建表鳖昌,分區(qū)的意思;
- order by :排序柬姚,默認是升序傅事,加desc降序;
- 這里按字段a分區(qū)蝶押,對計算項b進行降序排序
2浦楣、hive的分組和組內(nèi)排序 --- 實例
要取top10品牌袖肥,各品牌的top10渠道,各品牌的top10渠道中各渠道的top10檔期
1椒振、取top10品牌
select “品牌” , sum/count/其他() as num from "table_name" order by num desc limit 10;
2昭伸、取top10品牌下各品牌的top10渠道
select a.* from (select "品牌","渠道",sum/count() as num, row_number () over (partition by "品牌" order by num desc) rank from “table_name” where 品牌限制條件 group by “品牌”,“渠道” ) a having a.rank <= 10;
3、 取top10品牌下各品牌的top10渠道中各渠道的top10檔期
select a.* from (select "品牌","渠道","檔期",sum/count/其他() as num row_number() over (partition by "檔期" order by num desc) rank from "table_name" where 品牌限制條件 group by “品牌”,“渠道) a Having a.rank <= 10;
row_number的使用在hive和spark的實時計算中常常會用到計算分區(qū)中的排序問題澎迎,所以使用好row_number是很重要的庐杨。