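In data analysis we often run statistical tests on our data, but the results usually come back as long floating-point numbers that are hard to read at a glance. This section explains how to use the lucid function to reformat the output so that P values are easier to read.
Original post: Elegantly converting P values in R
Install and load the R packages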
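# Packages used in this section
package.list <- c("tidyverse", "lucid", "broom")
for (package in package.list) {
  # Install a package only if it cannot be loaded, then load it
  if (!require(package, character.only = TRUE, quietly = TRUE)) {
    install.packages(package)
    library(package, character.only = TRUE)
  }
}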
Showing the data
Orange %>% group_by(Tree) %>%
do(tidy(lm(circumference ~ age, data=.))) %>% as.data.frame
As you can see, the returned P values are printed in scientific notation, which is hard to read
   Tree        term    estimate    std.error statistic      p.value
1     3 (Intercept) 19.20353638  5.863410215  3.275148 2.207255e-02
2     3         age  0.08111158  0.005628105 14.411881 2.901046e-05
3     1 (Intercept) 24.43784664  6.543311039  3.734783 1.350409e-02
4     1         age  0.08147716  0.006280721 12.972581 4.851902e-05
5     5 (Intercept)  8.75834459  8.176436207  1.071169 3.330518e-01
6     5         age  0.11102891  0.007848307 14.146861 3.177093e-05
7     2 (Intercept) 19.96090337  9.352361105  2.134317 8.593318e-02
8     2         age  0.12506176  0.008977041 13.931291 3.425041e-05
9     4 (Intercept) 14.63762022 11.233762751  1.303002 2.493507e-01
10    4         age  0.13517222  0.010782940 12.535748 5.733090e-05
Reformatting with lucid
Orange %>% group_by(Tree) %>%
do(tidy(lm(circumference ~ age, data=.))) %>% as.data.frame %>% lucid
   Tree  term        estimate  std.error  statistic p.value
   <ord> <chr>       <chr>     <chr>      <chr>     <chr>
 1 3     (Intercept) "19.2 "   " 5.86 "   " 3.28"   "0.0221 "
 2 3     age         " 0.0811" " 0.00563" "14.4 "   "0.000029 "
 3 1     (Intercept) "24.4 "   " 6.54 "   " 3.73"   "0.0135 "
 4 1     age         " 0.0815" " 0.00628" "13 "     "0.0000485"
 5 5     (Intercept) " 8.76 "  " 8.18 "   " 1.07"   "0.333 "
 6 5     age         " 0.111 " " 0.00785" "14.1 "   "0.0000318"
 7 2     (Intercept) "20 "     " 9.35 "   " 2.13"   "0.0859 "
 8 2     age         " 0.125 " " 0.00898" "13.9 "   "0.0000343"
 9 4     (Intercept) "14.6 "   "11.2 "    " 1.3 "   "0.249 "
10 4     age         " 0.135 " " 0.0108 " "12.5 "   "0.0000573"
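After processing with lucid the numbers are much easier for a human to read. Note, however, that the columns have been converted to character type, so we need to convert them back to numeric before doing any further computation.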
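The conversion back to numeric can be sketched in base R alone. The strings below are copied from the p.value column of the lucid output above, and the variable names p_chr and p_num are only illustrative. Note that the result is the rounded value lucid displayed, not the original estimate.

```r
# Character values as lucid returns them (copied from the output above)
p_chr <- c("0.0221 ", "0.000029 ", "0.333 ")

# Strip the padding and convert back to numeric
p_num <- as.numeric(trimws(p_chr))

# The values can now be used in numeric comparisons again
print(is.numeric(p_num))
print(p_num < 0.05)
```

Inside a dplyr pipeline the same call would go into mutate(), as the examples in the next section do.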
Converting P values
Use the symnum function to convert P values into significance stars (*)
Orange %>% group_by(Tree) %>%
  do(tidy(lm(circumference ~ age, data = .))) %>% as.data.frame %>%
  mutate(p.value = as.numeric(p.value)) %>%
  lucid %>%
  # lucid returns character columns, so make a numeric copy for symnum
  mutate(pvalue = as.numeric(p.value),
         p_signif = symnum(pvalue,
                           cutpoints = c(0, 0.001, 0.01, 0.05, 1),
                           symbols = c("***", "**", "*", " "))) %>%
  # drop the numeric helper column
  select(-pvalue)
   Tree        term estimate std.error statistic   p.value   pvalue signif
1     3 (Intercept)     19.2      5.86      3.28    0.0221 2.21e-02      *
2     3         age   0.0811   0.00563      14.4  0.000029 2.90e-05    ***
3     1 (Intercept)     24.4      6.54      3.73    0.0135 1.35e-02      *
4     1         age   0.0815   0.00628        13 0.0000485 4.85e-05    ***
5     5 (Intercept)     8.76      8.18      1.07     0.333 3.33e-01
6     5         age    0.111   0.00785      14.1 0.0000318 3.18e-05    ***
7     2 (Intercept)       20      9.35      2.13    0.0859 8.59e-02
8     2         age    0.125   0.00898      13.9 0.0000343 3.43e-05    ***
9     4 (Intercept)     14.6      11.2       1.3     0.249 2.49e-01
10    4         age    0.135    0.0108      12.5 0.0000573 5.73e-05    ***
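To see what symnum does on its own, here is a minimal base R sketch that uses the same cutpoints and symbols as the pipeline above. The three P values are taken from the first output table, and the variable names p, stars and stars_chr are only illustrative.

```r
# P values taken from the p.value column of the first table
p <- c(0.02207255, 2.901046e-05, 0.3330518)

# Each interval between consecutive cutpoints maps to one symbol
stars <- symnum(p,
                cutpoints = c(0, 0.001, 0.01, 0.05, 1),
                symbols = c("***", "**", "*", " "))

# symnum returns a "noquote" object carrying a legend attribute;
# as.character() reduces it to a plain character vector
stars_chr <- as.character(stars)
print(stars_chr)
```

Wrapping the result in as.character() is optional, but it may be convenient when the stars are stored as an ordinary data frame column.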
Converting P values with a custom function and sapply
# Map a single numeric P value to significance stars
myfun <- function(pval) {
  stars <- ""
  if (pval <= 0.001)
    stars <- "***"
  if (pval > 0.001 && pval <= 0.01)
    stars <- "**"
  if (pval > 0.01 && pval <= 0.05)
    stars <- "*"
  # Values above 0.05 keep the empty default
  stars
}
Orange %>% group_by(Tree) %>%
  do(tidy(lm(circumference ~ age, data = .))) %>% as.data.frame %>%
  lucid %>%
  # lucid returns character columns, so keep a numeric copy for the comparisons
  mutate(pvalue = as.numeric(p.value)) %>%
  # apply myfun to the numeric column, not to the character p.value
  mutate(signif = sapply(pvalue, myfun))
   Tree        term estimate std.error statistic   p.value   pvalue signif
1     3 (Intercept)     19.2      5.86      3.28    0.0221 2.21e-02      *
2     3         age   0.0811   0.00563      14.4  0.000029 2.90e-05    ***
3     1 (Intercept)     24.4      6.54      3.73    0.0135 1.35e-02      *
4     1         age   0.0815   0.00628        13 0.0000485 4.85e-05    ***
5     5 (Intercept)     8.76      8.18      1.07     0.333 3.33e-01
6     5         age    0.111   0.00785      14.1 0.0000318 3.18e-05    ***
7     2 (Intercept)       20      9.35      2.13    0.0859 8.59e-02
8     2         age    0.125   0.00898      13.9 0.0000343 3.43e-05    ***
9     4 (Intercept)     14.6      11.2       1.3     0.249 2.49e-01
10    4         age    0.135    0.0108      12.5 0.0000573 5.73e-05    ***
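As an alternative to calling a scalar function through sapply, the same thresholds can be applied to a whole vector at once. The sketch below swaps in base R's findInterval for the chain of if statements; the function name p_to_stars is hypothetical and does not appear in the original post.

```r
# Vectorised mapping from P values to stars (hypothetical helper)
p_to_stars <- function(pval) {
  # With left.open = TRUE the intervals are (lower, upper], so the index is
  # 0 for p <= 0.001, 1 for (0.001, 0.01], 2 for (0.01, 0.05], 3 above 0.05
  idx <- findInterval(pval, c(0.001, 0.01, 0.05), left.open = TRUE)
  c("***", "**", "*", "")[idx + 1]
}

# P values as rounded in the pvalue column above
result <- p_to_stars(c(2.21e-02, 2.90e-05, 3.33e-01, 8.59e-02))
print(result)
```

Because the function accepts a vector, it could be used directly inside mutate() without sapply, for example mutate(signif = p_to_stars(pvalue)).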