本文首發(fā)于公眾號:醫(yī)學(xué)和生信筆記痴柔,完美觀看體驗(yàn)請至公眾號查看本文诀拭。
醫(yī)學(xué)和生信筆記,專注R語言在臨床醫(yī)學(xué)中的使用众辨,R語言數(shù)據(jù)分析和可視化端三。
這是R語言和醫(yī)學(xué)統(tǒng)計(jì)學(xué)的第2篇內(nèi)容。
主要是用R語言復(fù)現(xiàn)課本中的例子鹃彻。我使用的課本是孫振球主編的《醫(yī)學(xué)統(tǒng)計(jì)學(xué)》第4版郊闯,封面如下:
使用課本例4-2的數(shù)據(jù)。
首先是構(gòu)造數(shù)據(jù)浮声,本次數(shù)據(jù)自己從書上摘錄虚婿。旋奢。
trt<-c(rep("group1",30),rep("group2",30),rep("group3",30),rep("group4",30))
weight<-c(3.53,4.59,4.34,2.66,3.59,3.13,3.30,4.04,3.53,3.56,3.85,4.07,1.37,
3.93,2.33,2.98,4.00,3.55,2.64,2.56,3.50,3.25,2.96,4.30,3.52,3.93,
4.19,2.96,4.16,2.59,2.42,3.36,4.32,2.34,2.68,2.95,2.36,2.56,2.52,
2.27,2.98,3.72,2.65,2.22,2.90,1.98,2.63,2.86,2.93,2.17,2.72,1.56,
3.11,1.81,1.77,2.80,3.57,2.97,4.02,2.31,2.86,2.28,2.39,2.28,2.48,
2.28,3.48,2.42,2.41,2.66,3.29,2.70,2.66,3.68,2.65,2.66,2.32,2.61,
3.64,2.58,3.65,3.21,2.23,2.32,2.68,3.04,2.81,3.02,1.97,1.68,0.89,
1.06,1.08,1.27,1.63,1.89,1.31,2.51,1.88,1.41,3.19,1.92,0.94,2.11,
2.81,1.98,1.74,2.16,3.37,2.97,1.69,1.19,2.17,2.28,1.72,2.47,1.02,
2.52,2.10,3.71)
data1<-data.frame(trt,weight)
head(data1)
## trt weight
## 1 group1 3.53
## 2 group1 4.59
## 3 group1 4.34
## 4 group1 2.66
## 5 group1 3.59
## 6 group1 3.13
數(shù)據(jù)一共兩列泳挥,第一列是分組(一共四組),第二列是低密度脂蛋白測量值
先簡單看下數(shù)據(jù)分布
boxplot(weight ~ trt, data = data1)
進(jìn)行完全隨機(jī)設(shè)計(jì)資料的方差分析:
fit <- aov(weight ~ trt, data = data1)
summary(fit)
## Df Sum Sq Mean Sq F value Pr(>F)
## trt 3 32.16 10.719 24.88 1.67e-12 ***
## Residuals 116 49.97 0.431
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
結(jié)果顯示組間自由度為3至朗,組內(nèi)自由度為116屉符,組間離均差平方和為32.16,組內(nèi)離均差平方和為49.97锹引,組間均方為10.719矗钟,組內(nèi)均方為0.431,F(xiàn)值=24.88嫌变,p=1.67e-12吨艇,和課本一致。
再簡單介紹一下可視化的平均數(shù)和可信區(qū)間的方法:
library(gplots)
##
## 載入程輯包:'gplots'
## The following object is masked from 'package:stats':
##
## lowess
plotmeans(weight~trt,xlab = "treatment",ylab = "weight",
main="mean plot\nwith95% CI")
隨機(jī)區(qū)組設(shè)計(jì)資料的方差分析
使用例4-3的數(shù)據(jù)腾啥。
首先是構(gòu)造數(shù)據(jù)东涡,本次數(shù)據(jù)自己從書上摘錄冯吓。。
weight <- c(0.82,0.65,0.51,0.73,0.54,0.23,0.43,0.34,0.28,0.41,0.21,
0.31,0.68,0.43,0.24)
block <- c(rep(c("1","2","3","4","5"),each=3))
group <- c(rep(c("A","B","C"),5))
data4_4 <- data.frame(weight,block,group)
head(data4_4)
## weight block group
## 1 0.82 1 A
## 2 0.65 1 B
## 3 0.51 1 C
## 4 0.73 2 A
## 5 0.54 2 B
## 6 0.23 2 C
數(shù)據(jù)一共3列疮跑,第一列是小白鼠肉瘤重量组贺,第二列是區(qū)組因素(5個區(qū)組),第三列是分組(一共3組)
進(jìn)行完全隨機(jī)設(shè)計(jì)資料的方差分析:
fit <- aov(weight ~ block + group,data = data4_4) #隨機(jī)區(qū)組設(shè)計(jì)方差分析祖娘,注意順序
summary(fit)
## Df Sum Sq Mean Sq F value Pr(>F)
## block 4 0.2284 0.05709 5.978 0.01579 *
## group 2 0.2280 0.11400 11.937 0.00397 **
## Residuals 8 0.0764 0.00955
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
結(jié)果顯示區(qū)組間自由度為4失尖,分組間自由度為2,組內(nèi)自由度為8渐苏,區(qū)組間離均差平方和為0.2284掀潮,分組間離均差平方和為0.2280,組內(nèi)離均差平方和為0.0764琼富,區(qū)組間均方為0.05709胧辽,分組間均方為0.1140,組內(nèi)均方為0.00955公黑,區(qū)組間F值=5.798邑商,p=0.01579,分組間F值=11.937凡蚜,p=0.00397人断,和課本一致。
拉丁方設(shè)計(jì)方差分析
使用課本例4-5的數(shù)據(jù)朝蜘。首先是構(gòu)造數(shù)據(jù)恶迈,本次數(shù)據(jù)自己從書上摘錄。谱醇。
psize <- c(87,75,81,75,84,66,73,81,87,85,64,79,73,73,74,78,73,77,77,68,69,74,76,73,
64,64,72,76,70,81,75,77,82,61,82,61)
drug <- c("C","B","E","D","A","F","B","A","D","C","F","E","F","E","B","A","D","C",
"A","F","C","B","E","D","D","C","F","E","B","A","E","D","A","F","C","B")
col_block <- c(rep(1:6,6))
row_block <- c(rep(1:6,each=6))
mydata <- data.frame(psize,drug,col_block,row_block)
mydata$col_block <- factor(mydata$col_block)
mydata$row_block <- factor(mydata$row_block)
str(mydata)
## 'data.frame': 36 obs. of 4 variables:
## $ psize : num 87 75 81 75 84 66 73 81 87 85 ...
## $ drug : chr "C" "B" "E" "D" ...
## $ col_block: Factor w/ 6 levels "1","2","3","4",..: 1 2 3 4 5 6 1 2 3 4 ...
## $ row_block: Factor w/ 6 levels "1","2","3","4",..: 1 1 1 1 1 1 2 2 2 2 ...
數(shù)據(jù)一共4列暇仲,第一列是皮膚皰疹大小,第二列是不同藥物(處理因素副渴,共5種)奈附,第三列是列區(qū)組因素,第四列是行區(qū)組因素煮剧。
進(jìn)行拉丁方設(shè)計(jì)的方差分析:
fit <- aov(psize ~ drug + row_block + col_block, data = mydata)
summary(fit)
## Df Sum Sq Mean Sq F value Pr(>F)
## drug 5 667.1 133.43 3.906 0.0124 *
## row_block 5 250.5 50.09 1.466 0.2447
## col_block 5 85.5 17.09 0.500 0.7723
## Residuals 20 683.2 34.16
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
結(jié)果顯示行區(qū)組間自由度為5斥滤,列區(qū)組間自由度為5,分組(處理因素)間自由度為5勉盅,組內(nèi)自由度為20佑颇;
行區(qū)組間離均差平方和為250.5,列區(qū)組間離均差平方和為85.5草娜,分組間離均差平方和為667.1挑胸,組內(nèi)離均差平方和為0.0683.2;
行區(qū)組間均方為50.09宰闰,列區(qū)組間均方為17.09茬贵,分組間均方為133.43凸克,組內(nèi)均方為34.16,
行區(qū)組間F值=1.466闷沥,p=0.2447萎战,列區(qū)組間F值=0.5,p=0.7723舆逃,分組間F值=3.906蚂维,p=0.0124,和課本一致路狮。
兩階段交叉設(shè)計(jì)資料方差分析
使用課本例4-6的數(shù)據(jù)虫啥。首先是構(gòu)造數(shù)據(jù),本次數(shù)據(jù)自己從書上摘錄奄妨。涂籽。
contain <- c(760,770,860,855,568,602,780,800,960,958,940,952,635,650,440,450,
528,530,800,803)
phase <- rep(c("phase_1","phase_2"),10)
type <- c("A","B","B","A","A","B","A","B","B","A","B","A","A","B","B","A",
"A","B","B","A")
testid <- rep(1:10,each=2)
mydata <- data.frame(testid,phase,type,contain)
str(mydata)
## 'data.frame': 20 obs. of 4 variables:
## $ testid : int 1 1 2 2 3 3 4 4 5 5 ...
## $ phase : chr "phase_1" "phase_2" "phase_1" "phase_2" ...
## $ type : chr "A" "B" "B" "A" ...
## $ contain: num 760 770 860 855 568 602 780 800 960 958 ...
mydata$testid <- factor(mydata$testid)
數(shù)據(jù)一共4列,第一列是受試者id砸抛,第二列是不同階段评雌,第三列是測定方法,第四列是測量值直焙。
簡單看下2個階段情況:
table(mydata$phase,mydata$type)
##
## A B
## phase_1 5 5
## phase_2 5 5
進(jìn)行兩階段交叉設(shè)計(jì)資料方差分析:
fit <- aov(contain~phase+type+testid,mydata)
summary(fit)
## Df Sum Sq Mean Sq F value Pr(>F)
## phase 1 490 490 9.925 0.0136 *
## type 1 198 198 4.019 0.0799 .
## testid 9 551111 61235 1240.195 1.32e-11 ***
## Residuals 8 395 49
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
結(jié)果和課本一致景东!
本文首發(fā)于公眾號:醫(yī)學(xué)和生信筆記,完美觀看體驗(yàn)請至公眾號查看本文奔誓。
醫(yī)學(xué)和生信筆記斤吐,專注R語言在臨床醫(yī)學(xué)中的使用,R語言數(shù)據(jù)分析和可視化厨喂。