HQL聚集函數(shù)可以使用GROUPING SETS, CUBE, 和ROLLUP等關(guān)鍵詞雏亚。
- GROUPING SETS
該子句等同于GROUP BY子句和UNION ALL子句一起組合使用宿崭。另外該子句是在單一階段一次性完成相關(guān)處理,效率相對更高爷辙。GROUPING SETS這個子句后是空集合的話沪斟,會計算整體聚集静檬。GROUPING SETS這個子句后()之外的部分帘营,用于確定 UNION ALL的執(zhí)行方式和個數(shù);()之內(nèi)的部分翩活,用于確定GROUP BY的執(zhí)行方式阱洪。
例1.1 一個元素:一個兩列組合
SELECT
name, start_date, count(sin_number) as sin_cnt
FROM employee_hr
GROUP BY name, start_date
GROUPING SETS((name, start_date));
--||-- equals to
SELECT
name, start_date, count(sin_number) as sin_cnt
FROM employee_hr
GROUP BY name, start_date;
+---------+------------+---------+
| name | start_date | sin_cnt |
+---------+------------+---------+
| Lucy | 2010-01-03 | 1 |
| Michael | 2014-01-29 | 1 |
| Steven | 2012-11-03 | 1 |
| Will | 2013-10-02 | 1 |
+---------+------------+---------+
4 rows selected (26.3 seconds)
例1.2 兩個元素:兩個列
SELECT
name, start_date, count(sin_number) as sin_cnt
FROM employee_hr
GROUP BY name, start_date
GROUPING SETS(name, start_date);
--||-- equals to
SELECT
name, null as start_date, count(sin_number) as sin_cnt
FROM employee_hr
GROUP BY name
UNION ALL
SELECT
null as name, start_date, count(sin_number) as sin_cnt
FROM employee_hr
GROUP BY start_date;
----------+------------+---------+
| name | start_date | sin_cnt |
+---------+------------+---------+
| NULL | 2010-01-03 | 1 |
| NULL | 2012-11-03 | 1 |
| NULL | 2013-10-02 | 1 |
| NULL | 2014-01-29 | 1 |
| Lucy | NULL | 1 |
| Michael | NULL | 1 |
| Steven | NULL | 1 |
| Will | NULL | 1 |
+---------+------------+---------+
8 rows selected (22.658 seconds)
例1.3 兩個元素:一個兩列組合便贵,一個列
SELECT
name, start_date, count(sin_number) as sin_cnt
FROM employee_hr
GROUP BY name, start_date
GROUPING SETS((name, start_date), name);
--||-- equals to
SELECT
name, start_date, count(sin_number) as sin_cnt
FROM employee_hr
GROUP BY name, start_date
UNION ALL
SELECT
name, null as start_date, count(sin_number) as sin_cnt
FROM employee_hr
GROUP BY name;
+---------+------------+---------+
| name | start_date | sin_cnt |
+---------+------------+---------+
| Lucy | NULL | 1 |
| Lucy | 2010-01-03 | 1 |
| Michael | NULL | 1 |
| Michael | 2014-01-29 | 1 |
| Steven | NULL | 1 |
| Steven | 2012-11-03 | 1 |
| Will | NULL | 1 |
| Will | 2013-10-02 | 1 |
+---------+------------+---------+
8 rows selected (22.503 seconds)
例1.4 四個元素:兩列的所有排列組合
SELECT
name, start_date, count(sin_number) as sin_cnt
FROM employee_hr
GROUP BY name, start_date
GROUPING SETS((name, start_date), name, start_date, ());
--||-- equals to
SELECT
name, start_date, count(sin_number) as sin_cnt
FROM employee_hr
GROUP BY name, start_date
UNION ALL
SELECT
name, null as start_date, count(sin_number) as sin_cnt
FROM employee_hr
GROUP BY name
UNION ALL
SELECT
null as name, start_date, count(sin_number) as sin_cnt
FROM employee_hr
GROUP BY start_date
UNION ALL
SELECT
null as name, null as start_date, count(sin_number) as sin_cnt
FROM employee_hr
+---------+------------+---------+
| name | start_date | sin_cnt |
+---------+------------+---------+
| NULL | NULL | 4 |
| NULL | 2010-01-03 | 1 |
| NULL | 2012-11-03 | 1 |
| NULL | 2013-10-02 | 1 |
| NULL | 2014-01-29 | 1 |
| Lucy | NULL | 1 |
| Lucy | 2010-01-03 | 1 |
| Michael | NULL | 1 |
| Michael | 2014-01-29 | 1 |
| Steven | NULL | 1 |
| Steven | 2012-11-03 | 1 |
| Will | NULL | 1 |
| Will | 2013-10-02 | 1 |
+---------+------------+---------+
13 rows selected (24.916 seconds)
- ROLLUP
提供n+1層級的聚集計算,這里n為參與分組的列的個數(shù)冗荸。例如GROUP BY a,b,c WITH ROLLUP 等效于 GROUP BY a,b,c GROUPING SETS ((a,b,c),(a,b),(a),()) - CUBE
提供2的n次方個層級的聚集計算嫉沽,這里n為參與分組的列的個數(shù),這個層級數(shù)為n個元素所有組合數(shù)俏竞。例如GROUP BY a,b,c WITH CUBE等效于 GROUP BY a,b,c GROUPING SETS ((a,b,c),(a,b),(b,c),(a,c),(a),(b),(c),()) - GROUPING__ID 和 GROUPING 函數(shù)
GROUPING__ID函數(shù)堂竟,無需輸入?yún)?shù)魂毁,返回值用來標(biāo)識用于聚集計算的層次,這個值是GROUP BY后具體列組合的位向量的數(shù)字值出嘹。具有相同GROUP BY后具體列組合的行席楚,該函數(shù)返回相同的數(shù)字ID。
GROUPING函數(shù)用于判斷某列是否包含在當(dāng)前行的聚集計算(也即是否包含在該行的GROUP BY之后)税稼。0烦秩,指不包含在GROUP BY之后的列中;1郎仆,指包含在GROUP BY之后的列中只祠。請看以下示例,
SELECT
name, start_date, count(employee_id) as emp_id_cnt,
GROUPING__ID,
grouping(name) as gp_name,
grouping(start_date) as gp_sd
FROM employee_hr
GROUP BY name, start_date
WITH CUBE ORDER BY name, start_date;
+---------+------------+------------+-----+---------+-------+
| name | start_date | emp_id_cnt | gid | gp_name | gp_sd |
+---------+------------+------------+-----+---------+-------+
| NULL | NULL | 4 | 3 | 1 | 1 |
| NULL | 2010-01-03 | 1 | 2 | 1 | 0 |
| NULL | 2012-11-03 | 1 | 2 | 1 | 0 |
| NULL | 2013-10-02 | 1 | 2 | 1 | 0 |
| NULL | 2014-01-29 | 1 | 2 | 1 | 0 |
| Lucy | NULL | 1 | 1 | 0 | 1 |
| Lucy | 2010-01-03 | 1 | 0 | 0 | 0 |
| Michael | NULL | 1 | 1 | 0 | 1 |
| Michael | 2014-01-29 | 1 | 0 | 0 | 0 |
| Steven | NULL | 1 | 1 | 0 | 1 |
| Steven | 2012-11-03 | 1 | 0 | 0 | 0 |
| Will | NULL | 1 | 1 | 0 | 1 |
| Will | 2013-10-02 | 1 | 0 | 0 | 0 |
+---------+------------+------------+-----+---------+-------+
13 rows selected (55.507 seconds)