一、創(chuàng)建表,并將數(shù)據(jù)導(dǎo)入userinfo和orderinfo表中
1.導(dǎo)入數(shù)據(jù):
第一步:建表
create table orderinfo(
orderid? ? int? primary key not null ,
userid? ? ? int,
isPaid? ? ? varchar(10),
price? ? ? ? float,
paidTime? varchar(30));
create table userinfo(
? ? userid? int primary key,
? ? sex? ? varchar(10),
? ? birth? ? date);
第二步:導(dǎo)數(shù)
load data local? infile 'C:/Users/lan/Desktop/123/order_info_utf.csv' into table orderinfo? fields terminated by ',';
load data local? infile 'C:/Users/lan/Desktop/123/user_info_utf.csv' into table userinfo? fields terminated by ',';
二值骇、mysql數(shù)據(jù)分析案例
1.首先了解兩個(gè)表的信息:
userinfo 用戶信息表:
userid:主鍵罢屈,用戶id
sex:用戶性別
birth:用戶生日
orderinfo 訂單信息表
orderid:訂單id
userid:用戶id
ispaid:支付狀態(tài)
price:支付價(jià)格
paidtime:支付時(shí)間
2.統(tǒng)計(jì)不同月份的下單人數(shù)
分析要點(diǎn):不同月份奖磁,使用month函數(shù)獲取訂單日期中的月份亏吝,下單狀態(tài)為已支付狀態(tài)
select month(paidtime),count(distinct userid) from orderinfo where ispaid='已支付' group by month(paidtime);
3.統(tǒng)計(jì)用戶三月份和四月份回購率和復(fù)購率
復(fù)購率:本月消費(fèi)的人數(shù)有多少是重復(fù)購買的,即購買次數(shù)大于1次;
分步計(jì)算:
首先求出三月份存在消費(fèi)的用戶球拦,以及每個(gè)用戶的消費(fèi)次數(shù);(由于運(yùn)行數(shù)據(jù)較大限制顯示行數(shù))
select userid,count(userid) as ct from orderinfo where ispaid='已支付' and month(paidtime)=3 group by userid limit 10;
如上表所示userid列為本月存在消費(fèi)的用戶帐我,ct列為消費(fèi)次數(shù)坎炼,統(tǒng)計(jì)ct列,count(ct)為計(jì)算本月存在消費(fèi)總?cè)藬?shù)拦键,通過if函數(shù)計(jì)算出消費(fèi)次數(shù)大于1的消費(fèi)者數(shù)量谣光,即復(fù)購的消費(fèi)者數(shù),相除即為復(fù)購率芬为;
select count(ct),count(if(ct>1,1,null)) as nt from
(select userid,count(userid) as ct from orderinfo where ispaid='已支付' and month(paidtime)=3 group by userid) t;
也可使用case when方法萄金,結(jié)果相同;
select count(ct),count(case when ct>1 then 1 else null end) as nt from
(select userid,count(userid) as ct from orderinfo where ispaid='已支付' and month(paidtime)=3 group by userid) t;
回購率:三月份購買的人數(shù)媚朦,四月份依舊購買氧敢;
首先用userid和月份進(jìn)行分組,計(jì)算用戶在當(dāng)月是否消費(fèi)過询张;(由于運(yùn)行數(shù)據(jù)較大限制顯示行數(shù))
select userid,date_format(paidtime,'%Y-%m-01') from orderinfo where ispaid='已支付' group by userid,date_format(paidtime,'%Y-%m-01') limit 10;
通過left join對(duì)表進(jìn)行關(guān)聯(lián)孙乖,關(guān)聯(lián)出三月、四月都存在消費(fèi)情況(由于運(yùn)行數(shù)據(jù)較大限制顯示行數(shù))
如圖,userid5 在3月和4月都存在消費(fèi)的圆,
select * from
(select userid,date_format(paidtime,'%Y-%m-01') as m from orderinfo where ispaid='已支付' group by userid,date_format(paidtime,'%Y-%m-01')) t1
left join
(select userid,date_format(paidtime,'%Y-%m-01') as m from orderinfo where ispaid='已支付' group by userid,date_format(paidtime,'%Y-%m-01'))t2
on t1.userid=t2.userid and t1.m=date_sub(t2.m,interval 1 month)? limit 20;
對(duì)上表進(jìn)行計(jì)數(shù)鼓拧,計(jì)算出當(dāng)月存在消費(fèi)的情況和下月依然存在消費(fèi)的情況,相除即為復(fù)購率越妈;
select t1.m,count(t1.m),count(t2.m) from
(select userid,date_format(paidtime,'%Y-%m-01') as m from orderinfo where ispaid='已支付' group by userid,date_format(paidtime,'%Y-%m-01')) t1
left join
(select userid,date_format(paidtime,'%Y-%m-01') as m from orderinfo where ispaid='已支付' group by userid,date_format(paidtime,'%Y-%m-01'))t2
on t1.userid=t2.userid and t1.m=date_sub(t2.m,interval 1 month) group by t1.m;
3.統(tǒng)計(jì)男女用戶的消費(fèi)頻次是否有差異季俩;
首先性別字段存在空值,先把空值過濾掉梅掠;
select * from userinfo where sex <>' ' limit 10;
將訂單表與用戶表相關(guān)聯(lián)酌住;
select * from orderinfo o
inner join
(select * from userinfo where sex<>' ') t
on o.userid=t.userid limit 20;
統(tǒng)計(jì)男、女的消費(fèi)總次數(shù)阎抒;
select o.userid,sex,count(1) from orderinfo o
inner join
(select * from userinfo where sex<>' ') t
on o.userid=t.userid
group by o.userid,sex limit 20;
統(tǒng)計(jì)男女消費(fèi)頻次差異酪我;
select sex,avg(ct) from (
select o.userid,sex,count(1) as ct from orderinfo o
inner join
(select * from userinfo where sex<>' ') t
on o.userid=t.userid
group by o.userid,sex) t2
group by sex;
4.統(tǒng)計(jì)多次消費(fèi)的用戶,第一次消費(fèi)和最后一次消費(fèi)的時(shí)間間隔
首先求出多次消費(fèi)的用戶且叁;
select * from orderinfo where ispaid='已支付' group by userid having count(1)>1;
求出最大時(shí)間都哭,和最小時(shí)間,時(shí)間間隔(求分組后的組內(nèi)的最大值和最小值,按照userid 分組已經(jīng)過濾掉了消費(fèi)次數(shù)小于1的用戶逞带,所以至少存在兩次消費(fèi)情況欺矫,課選出最大時(shí)間,最小時(shí)間)
select userid,max(paidtime),min(paidtime) ,datediff(max(paidtime),min(paidtime)) as mt from orderinfo where ispaid='已支付' group by userid having count(1)>1 limit 20;
5.統(tǒng)計(jì)不同年齡段展氓,消費(fèi)是否有差異:
首先統(tǒng)計(jì)用戶的年齡段穆趴;注意年齡計(jì)算函數(shù),過濾無效數(shù)據(jù)遇汞;
select userid,ceil((year(now())-year(birth))/10) as ct from userinfo where birth>'1901-00-00' limit 10;
訂單表和用戶表關(guān)聯(lián)未妹,查看不同年齡段消費(fèi)情況;
select * from orderinfo o
inner join
(select userid,ceil((year(now())-year(birth))/10) as ct from userinfo where birth>'1901-00-00') t
on o.userid=t.userid limit 10;
統(tǒng)計(jì)不同年齡段消費(fèi)頻次空入;
select o.userid,t.ct,count(t.ct) as mt from orderinfo o
inner join
(select userid,ceil((year(now())-year(birth))/10) as ct from userinfo where birth>'1901-00-00') t
on o.userid=t.userid group by o.userid,t.ct limit 10;
6.統(tǒng)計(jì)消費(fèi)二八法則络它,消費(fèi)20%的用戶,貢獻(xiàn)了多少額度歪赢;
首先計(jì)算用戶的消費(fèi)總金額酪耕;
select userid,sum(price) from orderinfo where ispaid='已支付' group by userid limit 10;
select userid,sum(price) as total from orderinfo where ispaid='已支付' group by userid limit 10;
計(jì)算出前20%的用戶;
select count(userid) *0.2,sum(total) from (
select userid,sum(price) as total from orderinfo where ispaid='已支付' group by userid order by total) t;
由于這里計(jì)算出的結(jié)果是前17000行轨淌,所以將內(nèi)層查詢條件限制為17000;
select count(userid) ,sum(total) from (
select userid,sum(price) as total from orderinfo where ispaid='已支付' group by userid order by total desc limit 17000) t;