oracle學(xué)習筆記７: 高級分組

group by 用來在原始數(shù)據(jù)上創(chuàng)建聚合來將數(shù)據(jù)轉(zhuǎn)化為有用的信息葡公。

基本的group by 列出個個部門的名稱鸟雏，員工總數(shù)

select d.dname, count(empno) empcount
  from scott.dept d
  left outer join scott.emp e
    on d.deptno = e.deptno
 group by d.dname
 order by d.dname;

select列表中的每一列必須包含在group by子句中裆针。如果沒有則會導(dǎo)致錯誤咱枉。如：


SQL> select d.dname, d.loc, count(empno) empcount
  2    from scott.emp e
  3     join scott.dept d
  4      on d.deptno = e.deptno
  5   group by d.dname;
select d.dname, d.loc, count(empno) empcount
                *
第 1 行出現(xiàn)錯誤:
ORA-00979: 不是 GROUP BY 表達式

盡管包含group by子句的select語句輸出看上去是按順序列出的，你不能期待group by每次都排好序地返回數(shù)據(jù)，如果輸出結(jié)果必須按照一定的順序排列陷嘴，則必須使用order by子句乎赴。

--沒有排序的group by 
select deptno,count(*)
from emp
group by deptno;

--復(fù)雜的sql
set serveroutput off;
--復(fù)雜的sql
select /* lst7-4 */
distinct dname,
         decode(d.deptno,
                1,
                (select count(*) from emp where deptno = 1),
                2,
                (select count(*) from emp where deptno = 2),
                3,
                (select count(*) from emp where deptno = 3),
                (select count(*) from emp where deptno not in (1, 2, 3))) dept_count
  from (select distinct deptno from emp) d
  join dept d2
    on d.deptno = d.deptno;

@E:\bjc2016\study\pln lst7-4

上面的寫法，會使SQL語句更加復(fù)雜難以理解并且難以維護真慢。group by子句極大的簡化必須寫的sql語句以外毅臊，還消除了數(shù)據(jù)庫不必要的IO。

set serveroutput off;
--復(fù)雜的sql
select /* lst7-5 */
distinct dname,
         count(empno) empcount
  from  emp e
  join dept d
    on d.deptno = d.deptno
    group by d.dname
    order by d.dname;

@E:\bjc2016\study\pln lst7-5

group by 優(yōu)點：

使sql語句更具有可讀性
書寫起來比使用很多相關(guān)子查詢更簡單
減少了重復(fù)訪問同一個數(shù)據(jù)塊的次數(shù)黑界，從而得到更好的性能管嬉。

在分組之后，還在數(shù)據(jù)集上應(yīng)用了having子句朗鸠。另一方面蚯撩，在獲取數(shù)據(jù)行之后，進行分組之前烛占，應(yīng)用了where子句胎挎。having 子句中可以使用運算，函數(shù)及子查詢忆家。


SQL> --having子句
SQL> select /* lst7-6 */
  2   d.dname, trunc(e.hiredate, 'yyyy') hiredate, count(empno) empcount
  3    from emp e
  4    join dept d
  5      on e.deptno = e.deptno
  6   group by d.dname, trunc(e.hiredate, 'yyyy')
  7  having count(empno) <= 5 and trunc(e.hiredate, 'yyyy') between (select min(hiredate)
  8                                                                    from scott.emp) and (select max(hiredate)
  9                                                                                           from scott.emp)
 10   order by d.dname;

未選定行

SQL> @E:\bjc2016\study\pln lst7-6
原值    8:        WHERE UPPER(SQL_TEXT) LIKE '%&1%'
新值    8:        WHERE UPPER(SQL_TEXT) LIKE '%lst7-6%'

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  0wcfknkztdxqt, child number 0
-------------------------------------
select /* lst7-6 */  d.dname, trunc(e.hiredate, 'yyyy') hiredate,
count(empno) empcount   from emp e   join dept d     on e.deptno =
e.deptno  group by d.dname, trunc(e.hiredate, 'yyyy') having
count(empno) <= 5 and trunc(e.hiredate, 'yyyy') between (select
min(hiredate)
        from scott.emp) and (select max(hiredate)
                                                                   from
scott.emp)  order by d.dname


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 239717969

---------------------------------------------------------------------------
| Id  | Operation              | Name | E-Rows |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |        |       |       |          |
|*  1 |  FILTER                |      |        |       |       |          |
|   2 |   SORT GROUP BY        |      |      1 |  2048 |  2048 | 2048  (0)|
|   3 |    MERGE JOIN CARTESIAN|      |     40 |       |       |          |
|   4 |     TABLE ACCESS FULL  | DEPT |      4 |       |       |          |
|   5 |     BUFFER SORT        |      |     10 |  2048 |  2048 | 2048  (0)|

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
|   6 |      TABLE ACCESS FULL | EMP  |     10 |       |       |          |
|   7 |   SORT AGGREGATE       |      |      1 |       |       |          |
|   8 |    TABLE ACCESS FULL   | EMP  |     10 |       |       |          |
|   9 |   SORT AGGREGATE       |      |      1 |       |       |          |
|  10 |    TABLE ACCESS FULL   | EMP  |     10 |       |       |          |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter((COUNT(*)<=5 AND TRUNC(INTERNAL_FUNCTION("E"."HIREDATE"),'

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
              fmyyyy')>= AND TRUNC(INTERNAL_FUNCTION("E"."HIREDATE"),'fmyyyy')<=))


Note
-----
   - Warning: basic plan statistics not available. These are only collected when
:

       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system leve

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
l



已選擇41行。

SQL>

group by的cube擴展

當與group by子句一起使用時弦赖，將會使得對每一行都要考慮包含在cube的參數(shù)中的所有可能的元素組合。這個運算將會生成比表中實際存在的行數(shù)更多的數(shù)據(jù)行沼沈。

-- hr.emplyees表的cube運算
select last_name, first_name
  from hr.employees
 group by first_name, last_name;

set autotrace off;
set autotrace on statistics;
with emps as
 (select /* lst-7 */
   last_name, first_name
    from hr.employees
   group by cube(last_name, first_name))
select rownum, last_name, first_name from emps;

對于每一對last_name,first_name币厕，cube將會按順序為每個元素替換為null值。cube生成的數(shù)據(jù)行在Oracle文檔中稱為超級聚合行旦装，可以在運算列中加入null值來識別页衙。


SQL> set autotrace off;
SQL> --預(yù)測cube返回行數(shù)
SQL> with counts as
  2   (select count(distinct first_name) first_name_count,
  3           count(distinct last_name) last_name_count,
  4           count(distinct(first_name || last_name)) full_name_count
  5      from hr.employees)
  6  select first_name_count,
  7         last_name_count,
  8         full_name_count,
  9         first_name_count + last_name_count + full_name_count + 1 total_count
 10    from counts;

FIRST_NAME_COUNT LAST_NAME_COUNT FULL_NAME_COUNT TOTAL_COUNT
---------------- --------------- --------------- -----------
              91             102             107         301

下面用ＳＱＬ語句模擬cube，可以看出cube為我們節(jié)省了許多力氣。

--用union all生成cube數(shù)據(jù)行
with emps as (
 select last_name,first_name from hr.employees
),
mycube as (
       select last_name,first_name from emps
       union all
       select last_name,null first_name from emps
       union all
       select null last_name,first_name from emps
       union all
       select null last_name,null first_name from emps
)
select /*+ gather_plan_statistics */ * 
from mycube group by last_name,first_name;

cube實際應(yīng)用

sales_history模式中包含1998~2001年的銷售數(shù)據(jù)店乐。
下面的ＳＱＬ展示2001年的所有銷售數(shù)據(jù)艰躺。并想要查看各個產(chǎn)品種類的銷售情況匯總，包含基于10年消費者年齡段眨八，收入水平的聚合腺兴；按照收入水平而不考慮年齡的匯總；以及按年齡而不考慮收入水平的聚合廉侧。

--銷售數(shù)據(jù)的union all查詢
with tsales as
 (select /* lst7-10 */
   s.quantity_sold,
   s.amount_sold,
   to_char(mod(c.cust_year_of_birth, 10) * 10) || '_' ||
   to_char((mod(c.cust_year_of_birth, 10) * 10) + 10) age_range,
   nvl(c.cust_income_level, 'A: below 30,000') cust_income_level,
   p.prod_name,
   p.prod_desc,
   p.prod_category,
   (pf.unit_cost * s.quantity_sold) total_cost,
   s.amount_sold - (pf.unit_cost * s.quantity_sold) profit
    from sh.sales s
    join sh.customers c
      on c.cust_id = s.cust_id
    join sh.products p
      on p.prod_id = s.prod_id
    join sh.times t
      on t.time_id = s.time_id
    join sh.costs pf
      on pf.channel_id = s.channel_id
     and pf.prod_id = s.prod_id
     and pf.promo_id = s.promo_id
     and pf.time_id = s.time_id
   where (t.fiscal_year = 2001)),
gb as
 (select --Q1 - 所有分類通過收入和年齡范圍
   'Q1' query_tag,
   prod_category,
   cust_income_level,
   age_range,
   sum(profit) profit
    from tsales
   group by prod_category, cust_income_level, age_range
  union all
  select --Q2 - 所有分類通過年齡范圍
   'Q2' query_tag,
   prod_category,
   'ALL INCOME' cust_income_level,
   age_range,
   sum(profit) profit
    from tsales
   group by prod_category, 'ALL INCOME', age_range
  union all
  select --Q3 - 所有分類通過收入
   'Q3' query_tag,
   prod_category,
   cust_income_level,
   'ALL AGE' age_range,
   sum(profit) profit
    from tsales
   group by prod_category, cust_income_level, 'ALL AGE'
  union all
  select --Q4 - 所有分類
   'Q4' query_tag,
   prod_category,
   'ALL INCOME' cust_income_level,
   'ALL AGE' age_range,
   sum(profit) profit
    from tsales
   group by prod_category, 'ALL INCOME', 'ALL AGE'
  )
select * from gb order by prod_category, profit;

【語法】NVL (expr1, expr2)
【功能】若expr1為NULL页响，返回expr2；expr1不為NULL段誊，返回expr1闰蚕。
注意兩者的類型要一致
nvl(c.cust_income_level, 'A: below 30,000') cust_income_level
若c.cust_income_level為null,則返回'A: below 30,000'
mod(x,y)
【功能】返回x除以y的余數(shù)
【參數(shù)】x,y，數(shù)字型表達式
【返回】數(shù)字
【示例】
select mod(23,8),mod(24,8) from dual;
返回：7,0
to_char(mod(c.cust_year_of_birth, 10) * 10) || '_' || to_char((mod(c.cust_year_of_birth, 10) * 10) + 10) age_range
上面是求年齡段连舍，如果56没陡，則求出的范圍為50_60

--用cube代替union all
with tsales as
 (select /* lst7-11 */
   s.quantity_sold,
   s.amount_sold,
   to_char(mod(c.cust_year_of_birth, 10) * 10) || '_' ||
   to_char((mod(c.cust_year_of_birth, 10) * 10) + 10) age_range,
   nvl(c.cust_income_level, 'A: below 30,000') cust_income_level,
   p.prod_name,
   p.prod_desc,
   p.prod_category,
   (pf.unit_cost * s.quantity_sold) total_cost,
   s.amount_sold - (pf.unit_cost * s.quantity_sold) profit
    from sh.sales s
    join sh.customers c
      on c.cust_id = s.cust_id
    join sh.products p
      on p.prod_id = s.prod_id
    join sh.times t
      on t.time_id = s.time_id
    join sh.costs pf
      on pf.channel_id = s.channel_id
     and pf.prod_id = s.prod_id
     and pf.promo_id = s.promo_id
     and pf.time_id = s.time_id
   where (t.fiscal_year = 2001))
select 'Q' || decode(cust_income_level,
                     null,
                     decode(age_range, null, 4, 3),
                     decode(age_range, null, 2, 1)) query_tag,
       prod_category,
       cust_income_level,
       age_range,
       sum(profit) profit
  from tsales
 group by prod_category, cube(cust_income_level, age_range)
 order by prod_category, profit;

decode(條件,值1,翻譯值1,值2,翻譯值2,...值n,翻譯值n,缺省值)
【功能】根據(jù)條件返回相應(yīng)值
【參數(shù)】c1, c2, ...,cn,字符型/數(shù)值型/日期型，必須類型相同或null
注：值1……n 不能為條件表達式,這種情況只能用case when then end解決
·含義解釋：　　
　　decode(條件,值1,翻譯值1,值2,翻譯值2,...值n,翻譯值n,缺省值)　　
　　該函數(shù)的含義如下：　　
　　IF 條件=值1 THEN
　　RETURN(翻譯值1)
　　ELSIF 條件=值2 THEN
　　RETURN(翻譯值2)
　　......
　　ELSIF 條件=值n THEN
　　RETURN(翻譯值n)　　
　　ELSE
　　RETURN(缺省值)
　　END IF
　　
或：
　　when case 條件=值1 THEN
　　RETURN(翻譯值1)
　　ElseCase 條件=值2 THEN
　　RETURN(翻譯值2)
　　......
　　ElseCase 條件=值n THEN
　　RETURN(翻譯值n)　　
　　ELSE

RETURN(缺省值)
　　END
'Q' || decode(cust_income_level, null,decode(age_range, null, 4, 3),decode(age_range, null, 2, 1)) query_tag
是返回查詢分類標識cust_income_level為null返返回decode(age_range, null, 4, 3)否則返回decode(age_range, null, 2, 1)
cust_income_level==null and age_range==null,query_tag=4
cust_income_level==null and age_range!=null,query_tag=3
cust_income_level!=null and age_range==null,query_tag=2
cust_income_level!=null and age_range!=null,query_tag=1

用grouping()函數(shù)排除空值

上面的ＳＱＬ有個問題烟瞧，盡管總行數(shù)與之前使用union all運算符所得到的相一致诗鸭，一些數(shù)據(jù)行中的cust_income_level和age_range具有空值，并且有一行的這兩列都為空值参滴。當cube的參數(shù)中包含生成列的所有可能組合時强岸，每一列都有會產(chǎn)生n-1個空值，n為列表中的數(shù)目砾赔。在查詢的例子中有兩個例暴心，因此對于每個唯一的age_range值都會在cust_income_level列上產(chǎn)生空值专普。對于age_range列來說也適用同樣的規(guī)則檀夹。如果這兩列中的數(shù)據(jù)在某些行上原本就有空值炸渡，這些空值就可能出問題买决。如何辨別數(shù)據(jù)中原有的空值和cube擴展所插入的值呢？在oracle 8i中引入了grouping()函數(shù)嘁灯，可以用來識別這些超聚合行旁仿。被用來作為grouping()函數(shù)參數(shù)的表達式必須與出現(xiàn)在group by子句中的表達式相匹配。例如
decode(grouping(age_range),1,'ALL AGE',age_range) age_range
age_range檢測age_range是否有一行由cube產(chǎn)生的空值办悟，或者是否其在數(shù)據(jù)庫中本身就是空值病蛉。如果當前行是由cube生成的超聚合行則返回值為１铺然，對于其它所有情況返回值都為０魄健。
當與case()表達式或decode()函數(shù)組合時沽瘦，超聚合行中的空值可以用一個報告中有用的值替換析恋。這種情況下助隧，decode()看上去是更好的選擇并村，因為它更簡便并且grouping()函數(shù)僅有兩種可能的返回值橘霎。

--grouping()函數(shù)
--無grouping
with tsales as
 (select /* lst7-11 */
   s.quantity_sold,
   s.amount_sold,
   to_char(mod(c.cust_year_of_birth, 10) * 10) || '_' ||
   to_char((mod(c.cust_year_of_birth, 10) * 10) + 10) age_range,
   nvl(c.cust_income_level, 'A: below 30,000') cust_income_level,
   p.prod_name,
   p.prod_desc,
   p.prod_category,
   (pf.unit_cost * s.quantity_sold) total_cost,
   s.amount_sold - (pf.unit_cost * s.quantity_sold) profit
    from sh.sales s
    join sh.customers c
      on c.cust_id = s.cust_id
    join sh.products p
      on p.prod_id = s.prod_id
    join sh.times t
      on t.time_id = s.time_id
    join sh.costs pf
      on pf.channel_id = s.channel_id
     and pf.prod_id = s.prod_id
     and pf.promo_id = s.promo_id
     and pf.time_id = s.time_id
   where (t.fiscal_year = 2001))
select 'Q' || decode(cust_income_level,
                     null,
                     decode(age_range, null, 4, 3),
                     decode(age_range, null, 2, 1)) query_tag,
       prod_category,
       cust_income_level,
       age_range,
       sum(profit) profit
  from tsales
 group by prod_category, cube(cust_income_level, age_range)
 order by prod_category, profit;

--有g(shù)rouping
--case和decode都可以工作，我更喜歡用decode
with tsales as
 (select /* lst7-12 */
   s.quantity_sold,
   s.amount_sold,
   to_char(mod(c.cust_year_of_birth, 10) * 10) || '_' ||
   to_char((mod(c.cust_year_of_birth, 10) * 10) + 10) age_range,
   nvl(c.cust_income_level, 'A: below 30,000') cust_income_level,
   p.prod_name,
   p.prod_desc,
   p.prod_category,
   (pf.unit_cost * s.quantity_sold) total_cost,
   s.amount_sold - (pf.unit_cost * s.quantity_sold) profit
    from sh.sales s
    join sh.customers c
      on c.cust_id = s.cust_id
    join sh.products p
      on p.prod_id = s.prod_id
    join sh.times t
      on t.time_id = s.time_id
    join sh.costs pf
      on pf.channel_id = s.channel_id
     and pf.prod_id = s.prod_id
     and pf.promo_id = s.promo_id
     and pf.time_id = s.time_id
   where (t.fiscal_year = 2001))
select 'Q' || decode(cust_income_level,
                     null,
                     decode(age_range, null, 4, 3),
                     decode(age_range, null, 2, 1)) query_tag,
       prod_category,
       case grouping(cust_income_level)
         when 1 then
          'ALL INCOME'
         else
          cust_income_level
       end cust_income_level,
       decode(grouping(age_range), 1, 'ALL AGE', age_range) age_range,
       sum(profit) profit
  from tsales
 group by prod_category, cube(cust_income_level, age_range)
 order by prod_category, profit;

用grouping()擴展報告

另一種使用grouping()的方法是放在having子句中原环，用來控制在輸出中顯示哪個層級的聚合嘱吗。
使用grouping()函數(shù)可以被濃縮為對cube擴展中的各行或所有行進行滾動小計谒麦。

--在having子句中進行g(shù)rouping()
with tsales as
 (select /* lst7-13 */
   s.quantity_sold,
   s.amount_sold,
   to_char(mod(c.cust_year_of_birth, 10) * 10) || '_' ||
   to_char((mod(c.cust_year_of_birth, 10) * 10) + 10) age_range,
   nvl(c.cust_income_level, 'A: below 30,000') cust_income_level,
   p.prod_name,
   p.prod_desc,
   p.prod_category,
   (pf.unit_cost * s.quantity_sold) total_cost,
   s.amount_sold - (pf.unit_cost * s.quantity_sold) profit
    from sh.sales s
    join sh.customers c
      on c.cust_id = s.cust_id
    join sh.products p
      on p.prod_id = s.prod_id
    join sh.times t
      on t.time_id = s.time_id
    join sh.costs pf
      on pf.channel_id = s.channel_id
     and pf.prod_id = s.prod_id
     and pf.promo_id = s.promo_id
     and pf.time_id = s.time_id
   where (t.fiscal_year = 2001))
select 'Q' || decode(cust_income_level,
                     null,
                     decode(age_range, null, 4, 3),
                     decode(age_range, null, 2, 1)) query_tag,
       prod_category,
       case grouping(cust_income_level)
         when 1 then
          'ALL INCOME'
         else
          cust_income_level
       end cust_income_level,
       decode(grouping(age_range), 1, 'ALL AGE', age_range) age_range,
       sum(profit) profit
  from tsales
 group by prod_category, cube(cust_income_level, age_range)
 --having grouping(cust_income_level)=1
 --having grouping(age_range)=1
 having grouping(cust_income_level)=1 and grouping(age_range)=1
 order by prod_category, profit;

上面sql的數(shù)據(jù)可以看到將grouping()應(yīng)用到cust_income_level列對所有age_range值跨各個收入層次創(chuàng)建聚合。對age_range列進行這樣的操作會得到類似的效果耻蛇，對所有cust_income_level值進行聚合而不考慮age_range的值臣咖。將cube擴展中的所有列作為grouping()函數(shù)的參數(shù)將會導(dǎo)致聚合被濃縮為一行類似sum(profit)和group by prod_category所實現(xiàn)的功能夺蛇。但是蚊惯，使用cube擴展簡單修改having子句就可以創(chuàng)建幾份不同的報告截型。

用grouping_id()擴展報告

grouping_id()函數(shù)相對grouping()函數(shù)來說是相對較新的宦焦，在oracle 9i中引入波闹，與grouping()函數(shù)在某種程度上是類似的精堕。不同的是grouping()計算一個表達式并返回０或１歹篓，而grouping_id()計算一個表達式庄撮，確定其參數(shù)中的哪一行（如果有的話）用來生成超聚合行洞斯，然后創(chuàng)建一個位矢量烙如，并將該值作為整形值返回厅翔。

--group_id()位矢量
with rowgen as (
     select 1 bit_1,0 bit_0
     from dual
), cubed as (
 select 
 grouping_id(bit_1,bit_0) gid,
 to_char(grouping(bit_1)) bv_1,
 to_char(grouping(bit_0)) bv_0,
 decode(grouping(bit_1),1,'GRP BIT 1') gb_1,
 decode(grouping(bit_0),1,'GRP BIT 0') gb_0
 from rowgen
 group by cube(bit_1,bit_0)
)
select gid,bv_1 || bv_0 bit_vector,
gb_1,
gb_0
from cubed
order by gid;

group_id()位矢量運行結(jié)果

我們己經(jīng)知道如何使用grouping()通過having子句來控制輸出刀闷，但考慮數(shù)據(jù)庫效率時甸昏，單獨的grouping_id()調(diào)用可以用來取代所有不同的having grouping()子句施蜜。grouping()函數(shù)的功能僅僅用來辨別數(shù)據(jù)行翻默，因為它僅能返回０或１。由于grouping_id()函數(shù)返回一個基于位矢量的數(shù)值检盼，它可以輕易被用來進行各種不同的比較而不用修改sql語句吨枉。
為什么要關(guān)注不改變SQL語句就能改變比較呢柬唯？如上面基于銷售歷史的例子中圃庭，用戶可能會被給出４個輸出選項，任意一個或多個可能會被選中师坎。用戶的選擇可以用來作為使用having grouping_id()函數(shù)的一個單獨的sql語句胯陋，而不是基于having grouping()的不同組全的多個sql語句的輸入遏乔，因此需要數(shù)據(jù)庫解析sql語句的次數(shù)也比就較少盟萨。同時這也會使得需要執(zhí)行的sql語句更少，使用更小的IO,以及更少的內(nèi)存前计。
正如使用cube來避免通過union all將多個sql語句結(jié)合起來一樣丈屹，grouping_id()能夠避免在應(yīng)用中使用多個sql語句伶棒。

--顯示所有收入層次和年齡段的聚合
variable N_ALL_DATA number 
--顯示所有年齡段的聚合
variable N_AGE_RANGE number 
--顯示所有收入層次的聚合
variable N_INCOME_LEVEL number 
--只給出匯總
variable N_SUMMAY number 
begin
 :N_ALL_DATA      := 0;      -- 1 生效
 :N_AGE_RANGE     := 2;      -- 2 生效
 :N_INCOME_LEVEL  := 0;      -- 3 生效
 :N_SUMMAY        := 4;      -- 4 生效
end;
/
with tsales as
 (select /* lst7-15 */
   s.quantity_sold,
   s.amount_sold,
   to_char(mod(c.cust_year_of_birth, 10) * 10) || '_' ||
   to_char((mod(c.cust_year_of_birth, 10) * 10) + 10) age_range,
   nvl(c.cust_income_level, 'A: below 30,000') cust_income_level,
   p.prod_name,
   p.prod_desc,
   p.prod_category,
   (pf.unit_cost * s.quantity_sold) total_cost,
   s.amount_sold - (pf.unit_cost * s.quantity_sold) profit
    from sh.sales s
    join sh.customers c
      on c.cust_id = s.cust_id
    join sh.products p
      on p.prod_id = s.prod_id
    join sh.times t
      on t.time_id = s.time_id
    join sh.costs pf
      on pf.channel_id = s.channel_id
     and pf.prod_id = s.prod_id
     and pf.promo_id = s.promo_id
     and pf.time_id = s.time_id
   where (t.fiscal_year = 2001))
select 'Q' || to_char(grouping_id(cust_income_level,age_range)+1) query_tag,
prod_category,
decode(grouping(cust_income_level),1,'ALL INCOME',cust_income_level) cust_income_level,
decode(grouping(age_range),1,'ALL AGE',age_range) age_range, 
sum(profit) profit
from tsales
group by prod_category,cube(cust_income_level,age_range)
having grouping_id(cust_income_level,age_range)+1 in(:N_ALL_DATA,:N_AGE_RANGE,:N_INCOME_LEVEL,:N_SUMMAY)
order by prod_category,profit;

使用grouping函數(shù)也可以實現(xiàn)同的結(jié)果，但需要在having子句中進行一些測試舅锄。示例銷售歷史數(shù)據(jù)查詢在cube參數(shù)中只包含兩列皇忿。在having子句中總共需要進行４次測試鳍烁，因為grouping子句將會返回１或者０幔荒，每一列有兩個可能的值。從而需要４次測試提澎。如果３列盼忌，則需要８次谦纱，所需的測試次數(shù)將會是２的n次方跨嘉，其中n為cube中參數(shù)列或表達式的個數(shù)。

用grouping()代替grouping_id()的having子句的例子

--顯示所有收入層次和年齡段的聚合
variable N_ALL_DATA number 
--顯示所有年齡段的聚合
variable N_AGE_RANGE number 
--顯示所有收入層次的聚合
variable N_INCOME_LEVEL number 
--只給出匯總
variable N_SUMMAY number 
begin
 :N_ALL_DATA      := 0;      -- 1 生效
 :N_AGE_RANGE     := 2;      -- 2 生效
 :N_INCOME_LEVEL  := 0;      -- 3 生效
 :N_SUMMAY        := 4;      -- 4 生效
end;
/
with tsales as
 (select /* lst7-16 */
   s.quantity_sold,
   s.amount_sold,
   to_char(mod(c.cust_year_of_birth, 10) * 10) || '_' ||
   to_char((mod(c.cust_year_of_birth, 10) * 10) + 10) age_range,
   nvl(c.cust_income_level, 'A: below 30,000') cust_income_level,
   p.prod_name,
   p.prod_desc,
   p.prod_category,
   (pf.unit_cost * s.quantity_sold) total_cost,
   s.amount_sold - (pf.unit_cost * s.quantity_sold) profit
    from sh.sales s
    join sh.customers c
      on c.cust_id = s.cust_id
    join sh.products p
      on p.prod_id = s.prod_id
    join sh.times t
      on t.time_id = s.time_id
    join sh.costs pf
      on pf.channel_id = s.channel_id
     and pf.prod_id = s.prod_id
     and pf.promo_id = s.promo_id
     and pf.time_id = s.time_id
   where (t.fiscal_year = 2001))
select 'Q' || to_char(grouping_id(cust_income_level,age_range)+1) query_tag,
prod_category,
decode(grouping(cust_income_level),1,'ALL INCOME',cust_income_level) cust_income_level,
decode(grouping(age_range),1,'ALL AGE',age_range) age_range, 
sum(profit) profit
from tsales
group by prod_category,cube(cust_income_level,age_range)
having 
 (bin_to_num(grouping(cust_income_level),grouping(age_range))+1 = :N_ALL_DATA)
 or (bin_to_num(grouping(cust_income_level),grouping(age_range))+1 = :N_AGE_RANGE)
 or (bin_to_num(grouping(cust_income_level),grouping(age_range))+1 = :N_INCOME_LEVEL)
 or (bin_to_num(grouping(cust_income_level),grouping(age_range))+1 = :N_SUMMAY)
order by prod_category,profit;

１.　使用grouping可以判斷該行是數(shù)據(jù)庫中本來的行，還是有統(tǒng)計產(chǎn)生的行
grouping值為0時說明這個值是數(shù)據(jù)庫中本來的值，為1說明是統(tǒng)計的結(jié)果(也可以說該列為空時是1力崇，不為空時是0)
2.　GROUPING_ID()函數(shù)可以接受一列或多列亮靴，返回GROUPING位向量的十進制值茧吊。GROUPING位向量的計算方法是將按照順序?qū)γ恳涣姓{(diào)用GROUPING函數(shù)的結(jié)果組合起來，所以說01和10的值不一樣的
3.　group_id的使用當group by子句中重復(fù)使用一個列時,通過group_id來去除重復(fù)值

grouping sets與rollup()

group by的grouping sets()擴展在oracle 9i中初次登場话速，前面的例子中的整個group by...having子句可以用group by grouping sets()替換乳讥。


with tsales as
 (select /* lst7-17 */
   s.quantity_sold,
   s.amount_sold,
   to_char(mod(c.cust_year_of_birth, 10) * 10) || '_' ||
   to_char((mod(c.cust_year_of_birth, 10) * 10) + 10) age_range,
   nvl(c.cust_income_level, 'A: below 30,000') cust_income_level,
   p.prod_name,
   p.prod_desc,
   p.prod_category,
   (pf.unit_cost * s.quantity_sold) total_cost,
   s.amount_sold - (pf.unit_cost * s.quantity_sold) profit
    from sh.sales s
    join sh.customers c
      on c.cust_id = s.cust_id
    join sh.products p
      on p.prod_id = s.prod_id
    join sh.times t
      on t.time_id = s.time_id
    join sh.costs pf
      on pf.channel_id = s.channel_id
     and pf.prod_id = s.prod_id
     and pf.promo_id = s.promo_id
     and pf.time_id = s.time_id
   where (t.fiscal_year = 2001))
select 'Q' || to_char(grouping_id(cust_income_level,age_range)+1) query_tag,
prod_category,
decode(grouping(cust_income_level),1,'ALL INCOME',cust_income_level) cust_income_level,
decode(grouping(age_range),1,'ALL AGE',age_range) age_range, 
sum(profit) profit
from tsales
group by prod_category,grouping sets(
rollup(prod_category), --產(chǎn)品分類小計
(cust_income_level),--產(chǎn)品分類和收入層次
(age_range),　--產(chǎn)品分類和年齡范圍
(cust_income_level,age_range)　--產(chǎn)品分類云石，年齡范圍和收入層次
)
--having group_id() < 1
order by prod_category,profit;

group by cube having grouping_id()與group by grouping sets一個主要的區(qū)別是汹忠，前者能將變量設(shè)定為正確的值來簡便修改輸出错维，而后者的輸出不能修改参歹，除非修改或動態(tài)生成sql語句犬庇。修改sql語句意味著需要維護更多的代碼并且占用更多的數(shù)據(jù)庫資源臭挽。最好盡量避免使用動態(tài)生成sql語句，因為它會消耗的數(shù)據(jù)庫資源更多涨共，并且在出現(xiàn)問題時難以檢修纽帖。
某些時候grouping_sets()擴展會導(dǎo)致輸出中出現(xiàn)重復(fù)。重復(fù)是由rollup(prod_category)產(chǎn)生的举反“弥保可以通過去掉rollup()然后重新運行得到驗證，重復(fù)的行將不復(fù)存在火鼻。但是室囊，每種產(chǎn)品種類的總計也不存在了。解決的辦法就是使用group_id()函數(shù)標記重復(fù)的行魁索，并將其插入到having子句中。
在上面的sql中將--having group_id() < 1 改成　having group_id() < 1
這樣，輸出結(jié)果就如預(yù)期的那樣不包含重復(fù)的行了趁窃。有趣的是如果將rollup(prod_category)改成null,去掉having子句裆针，同時還能得到預(yù)期的輸出耘婚。代碼如下：


with tsales as
 (select /* lst7-17-1 */
   s.quantity_sold,
   s.amount_sold,
   to_char(mod(c.cust_year_of_birth, 10) * 10) || '_' ||
   to_char((mod(c.cust_year_of_birth, 10) * 10) + 10) age_range,
   nvl(c.cust_income_level, 'A: below 30,000') cust_income_level,
   p.prod_name,
   p.prod_desc,
   p.prod_category,
   (pf.unit_cost * s.quantity_sold) total_cost,
   s.amount_sold - (pf.unit_cost * s.quantity_sold) profit
    from sh.sales s
    join sh.customers c
      on c.cust_id = s.cust_id
    join sh.products p
      on p.prod_id = s.prod_id
    join sh.times t
      on t.time_id = s.time_id
    join sh.costs pf
      on pf.channel_id = s.channel_id
     and pf.prod_id = s.prod_id
     and pf.promo_id = s.promo_id
     and pf.time_id = s.time_id
   where (t.fiscal_year = 2001))
select 'Q' || to_char(grouping_id(cust_income_level,age_range)+1) query_tag,
prod_category,
decode(grouping(cust_income_level),1,'ALL INCOME',cust_income_level) cust_income_level,
decode(grouping(age_range),1,'ALL AGE',age_range) age_range, 
sum(profit) profit
from tsales
group by prod_category,grouping sets(
null,
--rollup(prod_category), --產(chǎn)品分類小計
(cust_income_level),--產(chǎn)品分類和收入層次
(age_range),　--產(chǎn)品分類和年齡范圍
(cust_income_level,age_range)　--產(chǎn)品分類胞锰，年齡范圍和收入層次
)
--having group_id() < 1
order by prod_category,profit

group by的rollup()擴展也可以單獨用來計算否則將會需要由union all結(jié)合起來的多個查詢完成的小計誊册。
例如：創(chuàng)建顯示器所有名字以Sul開頭的消費者各自的購買總額報告，并且要求對每個消費者分別按年，產(chǎn)品分類進行小計，還要有所有消費的總計料祠。這種類型的任務(wù)可以使用rollup()完成顺呕。

--rollup()小計
with mysales as (
    select c.cust_last_name || ',' || c.cust_first_name cust_name,
    p.prod_category,
    to_char(trunc(time_id,'YYYY'),'YYYY') sale_year,
    p.prod_name,
    s.amount_sold
    from sh.sales s
    join sh.products p on p.prod_id=s.prod_id
    join sh.customers c on c.cust_id=s.cust_id
    where c.cust_last_name like 'Sul%'
)
select 
    decode(grouping(m.cust_name),1,'GRAND TOTAL',m.cust_name) cust_name,
    decode(grouping(m.sale_year),1,'TOTAL BY YEAR',m.sale_year) sale_year,
    decode(grouping(m.prod_category),1,'TOTAL BY CATEGORY',m.prod_category) prod_category,
    sum(m.amount_sold) amount_sold
from mysales m
group by rollup(m.cust_name,m.prod_category,m.sale_year)
order by grouping(m.cust_name), 1,2,3;

注意decode()和grouping()函數(shù)再一次被用來表示小計行启盛。使用grouping(m.cust_name)將總計顯示在報告的最后棍厂。由于這個值>0的唯一情況就是當計算所有消費者總計時张漂，這個總計值只會出現(xiàn)在報告的最后漠畜。

group by的局限性

LOB列瘾敢，嵌套表或數(shù)組不能用做group by表達式的一部分

SQL> with lobtest as (
  2   select to_clob(d.dname) dname
  3   from scott.emp e
  4   join scott.dept d on d.deptno=e.deptno
  5  )
  6  select l.dname
  7  from lobtest l
  8  group by l.dname;
group by l.dname
         *
第 8 行出現(xiàn)錯誤:
ORA-00932: 數(shù)據(jù)類型不一致: 應(yīng)為 -, 但卻獲得 CLOB

不允許使用標量子查詢表達式


SQL> select d.dname,count(empno) empcount
  2  from scott.emp e
  3  join scott.dept d on d.deptno=e.deptno
  4  group by (select dname from scott.dept d2 where d2.dname = d.dname )
  5  order by d.dname;
group by (select dname from scott.dept d2 where d2.dname = d.dname )
          *
第 4 行出現(xiàn)錯誤:
ORA-22818: 這里不允許出現(xiàn)子查詢表達式

如果group by子句引用任何對象類型的列則查詢不能并行化

create type dept_location_type as object
(
       street_address varchar2(40),
       postal_code varchar2(10),
       city varchar2(30),
       state_province varchar2(10),
       country_id char(2),
       order member function match (e dept_location_type) return integer
);
/
create or replace type body  dept_location_type
as order member function match (e dept_location_type) return integer
is 
   begin
         if city <e.city then
                 return -1;
         elsif city > e.city then
                 return 1;
         else
                 return 0;        
         end if;
   end;
end;
/      

create table deptobj
as 
select d.deptno,d.dname
from scott.dept d;     

alter table  deptobj add (dept_location dept_location_type);   

select * from deptobj;

update deptobj set dept_location=dept_location_type('1234 fenmenao st','453076','ShenZhen','GuangDong','GD') where deptno=1;
update deptobj set dept_location=dept_location_type('345 Leshan st','123456','LeShan','SiCuan','SC') where deptno=2;
update deptobj set dept_location=dept_location_type('345 ChongQing st','123456','ChongQing','ChongQing','CQ') where deptno=3;
update deptobj set dept_location=dept_location_type('345 ChangChun st','123456','ChangChun','GuiYang','GY') where deptno=4;
--對象列的并行g(shù)roup by
select /*+ gather_plan_statictics parallel(e 2)*/
d.dept_location,count(e.ename) ecount
from scott.emp e,deptobj d
where e.deptno=d.deptno
group by d.dept_location
order by d.dept_location;

執(zhí)行結(jié)果

對象列的并行g(shù)roup by

dept_location類型體中的成員函數(shù)匹配用來進行城市值的比較郑临，然后使用group by將雇員按城市分組典奉。最后一個列出的局限性在后期的版本是可以工作的陕靠。

總結(jié)

Ｏracle以group by子句擴展的形式為SQL開發(fā)者提供了一些極佳的工具税肪，幫助我們不僅能夠減少代碼量，并且能提高數(shù)據(jù)庫效率。大多數(shù)的特性也要與其它不同的功能進行組合滔以。

group by總結(jié)

最后編輯于：2017.12.03 10:24:09

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者

人面猴
序言：七十年代末敦迄，一起剝皮案震驚了整個濱河市，隨后出現(xiàn)的幾起案子瞧柔，更是在濱河造成了極大的恐慌，老刑警劉巖，帶你破解...
沈念sama閱讀 219,427評論 6贊 508
死咒
序言：濱河連續(xù)發(fā)生了三起死亡事件雹拄，死亡現(xiàn)場離奇詭異势篡，居然都是意外死亡绷蹲，警方通過查閱死者的電腦和手機疤估，發(fā)現(xiàn)死者居然都...
沈念sama閱讀 93,551評論 3贊 395
救了他兩次的神仙讓他今天三更去死
文/潘曉璐我一進店門雕什，熙熙樓的掌柜王于貴愁眉苦臉地迎上來磷雇，“玉大人，你說我怎么就攤上這事抄瑟⊥什猓” “怎么了？”我有些...
開封第一講書人閱讀 165,747評論 0贊 356
道士緝兇錄：失蹤的賣姜人
文/不壞的土叔我叫張陵汛兜，是天一觀的道長巴粪。經(jīng)常有香客問我，道長粥谬，這世上最難降的妖魔是什么肛根？我笑而不...
開封第一講書人閱讀 58,939評論 1贊 295
?港島之戀（遺憾婚禮）
正文為了忘掉前任，我火速辦了婚禮漏策，結(jié)果婚禮上派哲，老公的妹妹穿的比我還像新娘。我一直安慰自己掺喻，他們只是感情好芭届，可當我...
茶點故事閱讀 67,955評論 6贊 392
惡毒庶女頂嫁案：這布局不是一般人想出來的
文/花漫我一把揭開白布储矩。她就那樣靜靜地躺著，像睡著了一般褂乍。火紅的嫁衣襯著肌膚如雪持隧。梳的紋絲不亂的頭發(fā)上，一...
開封第一講書人閱讀 51,737評論 1贊 305
城市分裂傳說
那天逃片，我揣著相機與錄音屡拨，去河邊找鬼。笑死褥实，一個胖子當著我的面吹牛呀狼，可吹牛的內(nèi)容都是我干的。我是一名探鬼主播损离，決...
沈念sama閱讀 40,448評論 3贊 420
雙鴛鴦連環(huán)套：你想象不到人心有多黑
文/蒼蘭香墨我猛地睜開眼哥艇，長吁一口氣：“原來是場噩夢啊……” “哼！你這毒婦竟也來了僻澎？” 一聲冷哼從身側(cè)響起貌踏，我...
開封第一講書人閱讀 39,352評論 0贊 276
萬榮殺人案實錄
序言：老撾萬榮一對情侶失蹤，失蹤者是張志新（化名）和其女友劉穎怎棱，沒想到半個月后哩俭，有當?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體，經(jīng)...
沈念sama閱讀 45,834評論 1贊 317
?護林員之死
正文獨居荒郊野嶺守林人離奇死亡拳恋，尸身上長有42處帶血的膿包…… 初始之章·張勛以下內(nèi)容為張勛視角年9月15日...
茶點故事閱讀 37,992評論 3贊 338
?白月光啟示錄
正文我和宋清朗相戀三年凡资，在試婚紗的時候發(fā)現(xiàn)自己被綠了。大學(xué)時的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片谬运。...
茶點故事閱讀 40,133評論 1贊 351
活死人
序言：一個原本活蹦亂跳的男人離奇死亡隙赁，死狀恐怖，靈堂內(nèi)的尸體忽然破棺而出梆暖，到底是詐尸還是另有隱情伞访，我是刑警寧澤，帶...
沈念sama閱讀 35,815評論 5贊 346
?日本核電站爆炸內(nèi)幕
正文年R本政府宣布轰驳，位于F島的核電站厚掷，受9級特大地震影響，放射性物質(zhì)發(fā)生泄漏级解。R本人自食惡果不足惜冒黑，卻給世界環(huán)境...
茶點故事閱讀 41,477評論 3贊 331
男人毒藥：我在死后第九天來索命
文/蒙蒙一、第九天我趴在偏房一處隱蔽的房頂上張望勤哗。院中可真熱鬧抡爹，春花似錦、人聲如沸芒划。這莊子的主人今日做“春日...
開封第一講書人閱讀 32,022評論 0贊 22
一樁弒父案欧穴，背后竟有這般陰謀
文/蒼蘭香墨我抬頭看了看天上的太陽。三九已至泵殴，卻和暖如春涮帘，著一層夾襖步出監(jiān)牢的瞬間，已是汗流浹背袋狞。一陣腳步聲響...
開封第一講書人閱讀 33,147評論 1贊 272
情欲美人皮
我被黑心中介騙來泰國打工焚辅，沒想到剛下飛機就差點兒被人妖公主榨干…… 1. 我叫王不留，地道東北人苟鸯。一個月前我還...
沈念sama閱讀 48,398評論 3贊 373
代替公主和親
正文我出身青樓，卻偏偏與公主長得像棚点，于是被迫代替她去往敵國和親早处。傳聞我的和親對象是個殘疾皇子，可洞房花燭夜當晚...
茶點故事閱讀 45,077評論 2贊 355