行列互轉問題
行轉列
從數據的行拆出數據的列露氮,即把觀測的屬性(行)拆成一個變量(列)峡碉。
附注
1.其實我一開始覺得這是列轉行瞎惫,因為觀測的屬性由列排布變成了行,但一般都稱這種問題為行轉列胎源。
2.我對概念的理解可能有問題棉钧。
示例數據
id | revenue | month |
---|---|---|
1 | 8000 | Jan |
2 | 9000 | Jan |
3 | 1000 | Feb |
1 | 7000 | Feb |
1 | 6000 | Mar |
輸出每個部門每個月的薪資,即結果表格結構為:
id | Jan_Revenue | ... | Dec_Revenue |
---|
允許空值涕蚤。
錯誤解答
select id,case when month='Jan' then revenue end as Jan_Revenue,case when month='Feb' then revenue end as Feb_Revenue,case when month='Mar' then revenue end as Mar_Revenue,case when month='Apr' then revenue end as Apr_Revenue,case when month='May' then Revenue end as May_Revenue,case when month='Jun' then revenue end as Jun_Revenue,case when month='Jul' then revenue end as Jul_Revenue,case when month='Aug' then revenue end as Aug_Revenue,case when month='Sep' then revenue end as Sep_Revenue,case when month='Oct' then revenue end as Oct_Revenue,case when month='Nov' then revenue end as Nov_Revenue,case when month='Dec' then revenue end as Dec_Revenue
from department;
運行結果(部分):
id | Jan_Revenue | Feb_Revenue | ... | Dec_Revenue |
---|---|---|---|---|
1 | 8000 | NULL | ... | NULL |
2 | 9000 | NULL | ... | NULL |
3 | NULL | 10000 | ... | NULL |
1 | NULL | 7000 | ... | NULL |
1 | NULL | NULL | ... | NULL |
對于每個用戶分別輸出結果宪卿,不符合要求。
考慮合并結果万栅,一個自然的想法是聚合函數(注意:sum(NULL)=0):
select id,sum(case when month='Jan' then revenue end) as Jan_Revenue,
sum(case when month='Feb' then revenue end )as Feb_Revenue,
sum(case when month='Mar' then revenue end) as Mar_Revenue,
sum(case when month='Apr' then revenue end) as Apr_Revenue,
sum(case when month='May' then Revenue end) as May_Revenue,
sum(case when month='Jun' then revenue end) as Jun_Revenue,
sum(case when month='Jul' then revenue end) as Jul_Revenue,
sum(case when month='Aug' then revenue end) as Aug_Revenue,
sum(case when month='Sep' then revenue end) as Sep_Revenue,
sum(case when month='Oct' then revenue end) as Oct_Revenue,
sum(case when month='Nov' then revenue end) as Nov_Revenue,
sum(case when month='Dec' then revenue end) as Dec_Revenue
from department
group by id;
運行結果:
id | Jan_Revenue | Feb_Revenue | ... | Dec_Revenue |
---|---|---|---|---|
1 | 8000 | 7000 | ... | NULL |
2 | 9000 | NULL | ... | NULL |
3 | NULL | 10000 | ... | NULL |
這一結果符合要求佑钾。
另外的寫法:
select id,sum(if(month='Jan',revenue,null)) as Jan_Revenue,
sum(if(month='Feb',revenue,null))as Feb_Revenue,
sum(if(month='Mar',revenue,null)) as Mar_Revenue,
sum(if(month='Apr',revenue,null)) as Apr_Revenue,
sum(if(month='May',revenue,null)) as May_Revenue,
sum(if(month='Jun',revenue,null)) as Jun_Revenue,
sum(if(month='Jul',revenue,null)) as Jul_Revenue,
sum(if(month='Aug',revenue,null)) as Aug_Revenue,
sum(if(month='Sep',revenue,null)) as Sep_Revenue,
sum(if(month='Oct',revenue,null)) as Oct_Revenue,
sum(if(month='Nov',revenue,null)) as Nov_Revenue,
sum(if(month='Dec',revenue,null)) as Dec_Revenue
from department
group by id;
與case when的寫法思路相同,稍微簡潔一些烦粒。
思路總結
Step1. 用case when或if篩選出將要轉為變量的屬性休溶。
Step2.利用分組聚合函數sum匯總行。
語句框架:
select xx,sum(case when yy='屬性1'then zz end),...
from tb
group by xx;
select xx,sum(if(yy='屬性1',zz,null)),...
from tb
group by xx;
列轉行
列轉行是行轉列的逆過程扰她。
把上述結果再轉回原表格:
#為了方便兽掰,把行專列的結果存成一個臨時表
with tb as(
select id,sum(if(month='Jan',revenue,null)) as Jan_Revenue,
sum(if(month='Feb',revenue,null))as Feb_Revenue,
sum(if(month='Mar',revenue,null)) as Mar_Revenue,
sum(if(month='Apr',revenue,null)) as Apr_Revenue,
sum(if(month='May',revenue,null)) as May_Revenue,
sum(if(month='Jun',revenue,null)) as Jun_Revenue,
sum(if(month='Jul',revenue,null)) as Jul_Revenue,
sum(if(month='Aug',revenue,null)) as Aug_Revenue,
sum(if(month='Sep',revenue,null)) as Sep_Revenue,
sum(if(month='Oct',revenue,null)) as Oct_Revenue,
sum(if(month='Nov',revenue,null)) as Nov_Revenue,
sum(if(month='Dec',revenue,null)) as Dec_Revenue
from department
group by id)
#列轉行
select id,Jan_Revenue as Revenue,'Jan' as month
from tb
union all
select id,Feb_Revenue as Revenue,'Feb' as month
from tb
union all
select id,Mar_Revenue as Revenue,'Mar' as month
from tb
union all
select id,Apr_Revenue as Revenue,'Apr' as month
from tb
union all
select id,May_Revenue as Revenue,'May' as month
from tb
union all
select id,Jun_Revenue as Revenue,'Jun' as month
from tb
union all
select id,Jul_Revenue as Revenue,'Jul' as month
from tb
union all
select id,Aug_Revenue as Revenue,'Aug' as month
from tb
union all
select id,Sep_Revenue as Revenue,'Sep' as month
from tb
union all
select id,Oct_Revenue as Revenue,'Oct' as month
from tb
union all
select id,Nov_Revenue as Revenue,'Nov' as month
from tb
union all
select id,Dec_Revenue as Revenue,'Dec' as month
from tb;
運行結果:
id | revenue | month |
---|---|---|
1 | 8000 | Jan |
2 | 9000 | Jan |
3 | NULL | Jan |
1 | 7000 | Feb |
2 | NULL | Feb |
3 | 10000 | Feb |
.. | .. | .. |
其中含有很多NULL,不是原來的表徒役,考慮去除孽尽。
with tb as(
select id,sum(if(month='Jan',revenue,null)) as Jan_Revenue,
sum(if(month='Feb',revenue,null))as Feb_Revenue,
sum(if(month='Mar',revenue,null)) as Mar_Revenue,
sum(if(month='Apr',revenue,null)) as Apr_Revenue,
sum(if(month='May',revenue,null)) as May_Revenue,
sum(if(month='Jun',revenue,null)) as Jun_Revenue,
sum(if(month='Jul',revenue,null)) as Jul_Revenue,
sum(if(month='Aug',revenue,null)) as Aug_Revenue,
sum(if(month='Sep',revenue,null)) as Sep_Revenue,
sum(if(month='Oct',revenue,null)) as Oct_Revenue,
sum(if(month='Nov',revenue,null)) as Nov_Revenue,
sum(if(month='Dec',revenue,null)) as Dec_Revenue
from department
group by id)
#列轉行
select *
from(
select id,Jan_Revenue as Revenue,'Jan' as month
from tb
union all
select id,Feb_Revenue as Revenue,'Feb' as month
from tb
union all
select id,Mar_Revenue as Revenue,'Mar' as month
from tb
union all
select id,Apr_Revenue as Revenue,'Apr' as month
from tb
union all
select id,May_Revenue as Revenue,'May' as month
from tb
union all
select id,Jun_Revenue as Revenue,'Jun' as month
from tb
union all
select id,Jul_Revenue as Revenue,'Jul' as month
from tb
union all
select id,Aug_Revenue as Revenue,'Aug' as month
from tb
union all
select id,Sep_Revenue as Revenue,'Sep' as month
from tb
union all
select id,Oct_Revenue as Revenue,'Oct' as month
from tb
union all
select id,Nov_Revenue as Revenue,'Nov' as month
from tb
union all
select id,Dec_Revenue as Revenue,'Dec' as month
from tb
)tb2
where revenue is not null;
運行結果:
id | revenue | month |
---|---|---|
1 | 8000 | Jan |
2 | 9000 | Jan |
1 | 7000 | Feb |
3 | 10000 | Feb |
1 | 6000 | Mar |
返回了原表。
思路總結
Step1 分別取出各觀測各屬性的數據忧勿,生成屬性外其它列與屬性列(示例中的month)泻云。
Step1 用union all合并觀測艇拍。
語句框架:
select xx,yy as 'name1',屬性1 as 'name2'
from tb
union all
select xx,yy as 'name1',屬性2 as 'name2'
from tb
union all
...