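Background
App analysis often uses the AARRR model (also called the pirate metrics model) to assess the current state of an app. One of its key stages is improving retention (Retention), and retention rate is arguably the core metric for that stage. So how do you calculate retention rate with SQL?
How retention rate is calculated
Suppose 100 new users sign up today. If 50 of them log in the next day, the next-day (day-1) retention rate is 50/100 = 50%. If 30 of them log in on the third day, the day-2 retention rate is 30/100 = 30%, and so on.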
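The same arithmetic can be written as a formula. The symbols are introduced here only for illustration: N_0 is the number of users whose first login falls on a given day, and N_n is the number of those same users who log in n days later.

```latex
R_n = \frac{N_n}{N_0}, \qquad
R_1 = \frac{50}{100} = 50\%, \qquad
R_2 = \frac{30}{100} = 30\%
```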
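The SQL approach
- Use SQL to pull user_id and login_time, giving a table of user login dates.
- From user_id and login_time, add a first_day column that holds each user's earliest login date.
- With the earliest login date and every login date available, add a by_day column computed as login_time - first_day. It yields 0, 1, 2, 3, 4, 5, ..., that is, how many days a given login is from the user's first login.
- Then, for each first_day, count how many rows have by_day equal to 0, how many equal to 1, and so on up to 7 or more.
- From that table it is easy to calculate the retention rate for each day's cohort of new users.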
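The first three steps above can also be sketched in a single statement. This is a minimal sketch rather than the method used in the walkthrough that follows: it assumes MySQL 8.0 or later (common table expressions and window functions are not available in 5.7) and the same user_info table, with login_time stored as text in 'YYYY/MM/DD' form. The names daily_logins and login_date are made up for illustration.

```sql
WITH daily_logins AS (
    -- One row per user per calendar day
    SELECT DISTINCT
        user_id,
        STR_TO_DATE(login_time, '%Y/%m/%d') AS login_date
    FROM user_info
)
SELECT
    user_id,
    login_date,
    -- Earliest login date of this user, repeated on every row of the user
    MIN(login_date) OVER (PARTITION BY user_id) AS first_day,
    -- Days elapsed between this login and the user's first login
    DATEDIFF(login_date, MIN(login_date) OVER (PARTITION BY user_id)) AS by_day
FROM daily_logins
ORDER BY user_id, login_date;
```

The window function replaces the self-join on a MIN() subquery, so the de-duplicating subquery only has to be written once.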
Hands-on
Data: sample data I mocked up in Excel; it does not reflect real-world behavior.
Database: MySQL
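To follow along without the original spreadsheet, a small stand-in table is enough. The schema and rows below are assumptions made for illustration: the article does not show the real table definition or data, and the only things taken from it are the table name user_info, the columns user_id and login_time, and the 'YYYY/MM/DD' text format implied by the str_to_date calls.

```sql
-- Hypothetical schema: the real table definition is not shown in the article.
CREATE TABLE user_info (
    user_id    INT         NOT NULL,
    login_time VARCHAR(10) NOT NULL  -- text date such as '2019/01/01'
);

-- Made-up rows, for illustration only
INSERT INTO user_info (user_id, login_time) VALUES
    (1, '2019/01/01'),
    (1, '2019/01/02'),
    (1, '2019/01/04'),
    (2, '2019/01/01'),
    (2, '2019/01/03'),
    (3, '2019/01/02'),
    (3, '2019/01/03');
```

With these rows, the final-step query of this walkthrough should return two cohorts: 2019-01-01 with day_0 = 2, day_1 = 1, day_2 = 1, day_3 = 1, and 2019-01-02 with day_0 = 1, day_1 = 1. Every other column is 0.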
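Step 1: Extract user_id and login_time from the database, de-duplicated and sorted
SELECT
    user_id,
    STR_TO_DATE(login_time, '%Y/%m/%d') AS login_time  -- parse the text date into a DATE
FROM user_info
GROUP BY 1, 2   -- keep one row per user per day
ORDER BY 1, 2;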
Step 2: Add a first_day column that stores each user ID's earliest login date
SELECT
    b.user_id,
    b.login_time,
    c.first_day
FROM
    (SELECT
        user_id,
        STR_TO_DATE(login_time, '%Y/%m/%d') AS login_time
     FROM user_info
     GROUP BY 1, 2) b
LEFT JOIN
    -- Find the earliest login date for each user_id, then match it to every login row of that user_id
    (SELECT
        user_id,
        MIN(login_time) AS first_day
     FROM
        (SELECT
            user_id,
            STR_TO_DATE(login_time, '%Y/%m/%d') AS login_time
         FROM user_info
         GROUP BY 1, 2) a
     GROUP BY 1) c
ON b.user_id = c.user_id
ORDER BY 1, 2;
Step 3: Subtract the earliest login date from the login date to get a by_day column
SELECT
    user_id,
    login_time,
    first_day,
    DATEDIFF(login_time, first_day) AS by_day  -- days since the user's first login
FROM
    (SELECT
        b.user_id,
        b.login_time,
        c.first_day
     FROM
        (SELECT
            user_id,
            STR_TO_DATE(login_time, '%Y/%m/%d') AS login_time
         FROM user_info
         GROUP BY 1, 2) b
     LEFT JOIN
        (SELECT
            user_id,
            MIN(login_time) AS first_day
         FROM
            (SELECT
                user_id,
                STR_TO_DATE(login_time, '%Y/%m/%d') AS login_time
             FROM user_info
             GROUP BY 1, 2) a
         GROUP BY 1) c
     ON b.user_id = c.user_id) e
ORDER BY 1, 2;
Final step: Pivot the by_day values into columns
SELECT
    first_day,
    SUM(CASE WHEN by_day = 0 THEN 1 ELSE 0 END) AS day_0,
    SUM(CASE WHEN by_day = 1 THEN 1 ELSE 0 END) AS day_1,
    SUM(CASE WHEN by_day = 2 THEN 1 ELSE 0 END) AS day_2,
    SUM(CASE WHEN by_day = 3 THEN 1 ELSE 0 END) AS day_3,
    SUM(CASE WHEN by_day = 4 THEN 1 ELSE 0 END) AS day_4,
    SUM(CASE WHEN by_day = 5 THEN 1 ELSE 0 END) AS day_5,
    SUM(CASE WHEN by_day = 6 THEN 1 ELSE 0 END) AS day_6,
    SUM(CASE WHEN by_day >= 7 THEN 1 ELSE 0 END) AS day_7plus
FROM
    (SELECT
        user_id,
        login_time,
        first_day,
        DATEDIFF(login_time, first_day) AS by_day
     FROM
        (SELECT
            b.user_id,
            b.login_time,
            c.first_day
         FROM
            (SELECT
                user_id,
                STR_TO_DATE(login_time, '%Y/%m/%d') AS login_time
             FROM user_info
             GROUP BY 1, 2) b
         LEFT JOIN
            (SELECT
                user_id,
                MIN(login_time) AS first_day
             FROM
                (SELECT
                    user_id,
                    STR_TO_DATE(login_time, '%Y/%m/%d') AS login_time
                 FROM user_info
                 GROUP BY 1, 2) a
             GROUP BY 1) c
         ON b.user_id = c.user_id) e
    ) f
GROUP BY 1
ORDER BY 1;
Conclusion
With the resulting data, a simple division, or one more SQL statement, gives the retention rate. The analysis after that is up to you.
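The "one more SQL statement" could look like the following. This is a minimal sketch under one assumption: the result of the final-step query has been saved as a table or view, here given the hypothetical name retention_counts, with the columns first_day and day_0 through day_6 produced above. The output column names are also made up for illustration.

```sql
-- retention_counts is a hypothetical table or view holding the final-step result
SELECT
    first_day,
    day_0                          AS new_users,
    ROUND(day_1 / day_0 * 100, 2)  AS day_1_rate_pct,
    ROUND(day_2 / day_0 * 100, 2)  AS day_2_rate_pct,
    ROUND(day_3 / day_0 * 100, 2)  AS day_3_rate_pct,
    ROUND(day_4 / day_0 * 100, 2)  AS day_4_rate_pct,
    ROUND(day_5 / day_0 * 100, 2)  AS day_5_rate_pct,
    ROUND(day_6 / day_0 * 100, 2)  AS day_6_rate_pct
FROM retention_counts
ORDER BY first_day;
```

day_0 is never zero, because every cohort contains the first-day logins that define it. The day_7plus column is left out on purpose: it counts login days rather than distinct users, since one user can log in on several days after day 7, so dividing it by day_0 would not give a rate.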
Author: 成鵬9
Link: http://www.reibang.com/p/be2cb8880df6
Source: Jianshu
Copyright belongs to the author. For commercial reuse, contact the author for permission; for non-commercial reuse, credit the source.