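Box plots are among the most important and basic graphics in statistical analysis, and drawing them in R is an essential part of data analysis. Here I summarize my experience drawing box plots in different situations.
This post only covers drawing box plots with the base graphics function boxplot.
● Given one column of data, draw a box plot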
library(RColorBrewer)
Value=round(runif(50,min=10,max=50),2)
colors = brewer.pal(8,"Accent")
boxplot(Value,col = "white",cex=0.5, pch=20,outcol=colors[6],
main = "Example",ylab="Value",xlab="Feature")
stripchart(list(Value),vertical = T,method = "jitter",cex = 0.8,
pch = 20,col = colors[1],add=T)
dev.off()
Example
Here we used the stripchart function to add the data points and the RColorBrewer package to adjust the colors.
● Given two columns of data, draw a box plot
Value1=round(runif(50,min=10,max=50),2)
Value2=round(runif(50,min=6,max=45),2)
colors = brewer.pal(8,"Accent")
boxplot(Value1,Value2,col = "white",cex=0.5, pch=20,outcol=colors[6],
main = "Example",ylab="Value",names =c("Feature1","Feature2"))
stripchart(list(Value1,Value2),vertical = T,method = "jitter",cex = 0.8,
pch = 20,col = colors[1:2],add=T)
dev.off()
Example2
Note that here you should use the names argument built into boxplot to set the feature name of each box, and combine the data passed to stripchart into a list.
● Given a data frame with many features, split by a categorical variable, draw a box plot
library(dplyr)
# Four classes with 25 observations each, matching the four groups split below
class = factor(rep(1:4, each = 25))
Value1 = round(runif(100, min = 10, max = 50), 2)
Value2 = round(runif(100, min = 6, max = 45), 2)
Dat = data.frame(class, Value1, Value2)
colors = brewer.pal(8, "Accent")
boxplot(Value1 ~ class, data = Dat, col = "white", cex = 0.5, pch = 20, outcol = colors[6], main = "Example", ylab = "Value")
# Split the data by class so that each group's points can be overlaid
Dat1 = Dat %>% filter(class == 1)
Dat2 = Dat %>% filter(class == 2)
Dat3 = Dat %>% filter(class == 3)
Dat4 = Dat %>% filter(class == 4)
# The palette holds only 8 colors, so the four groups use colors[5:8]
stripchart(list(Dat1[,"Value1"], Dat2[,"Value1"], Dat3[,"Value1"], Dat4[,"Value1"]), vertical = TRUE, method = "jitter", cex = 0.8, pch = 20, col = colors[5:8], add = TRUE)
dev.off()
Example3
Here we used the dplyr package to split the data by class (1, 2, 3, 4) into new data frames. Also, unlike the approaches above, boxplot receives its data as a formula, in the form "feature ~ class, data".
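The formula form described above is also accepted by stripchart, so the manual split by class may not be needed. The following is a minimal sketch under a few assumptions: the seed, the point colors in point_cols, and the use of pdf(NULL) as a null device are illustrative choices, not part of the original example.

```r
# Sketch: overlay points with stripchart's formula interface (no manual split).
set.seed(1)  # assumption: fixed seed so the simulated data are reproducible
Dat <- data.frame(
  class  = factor(rep(1:4, each = 25)),
  Value1 = round(runif(100, min = 10, max = 50), 2)
)
point_cols <- c("steelblue", "tomato", "seagreen", "orange")  # hypothetical colors

pdf(NULL)  # null device, so the sketch also runs non-interactively
bp <- boxplot(Value1 ~ class, data = Dat, col = "white",
              main = "Example", ylab = "Value")
# Same formula and data: one strip of jittered points per class level
stripchart(Value1 ~ class, data = Dat, vertical = TRUE, method = "jitter",
           cex = 0.8, pch = 20, col = point_cols, add = TRUE)
dev.off()
```

In an interactive session, drop the pdf(NULL) and dev.off() lines to see the plot on screen. The value returned by boxplot (bp here) holds the per-group statistics, which is handy for checking that every class was drawn.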
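Polishing box plots
1. When two box plots are drawn together, the boxes come out too wide and the plot looks unattractive, as in Example2. In this case we can specify positions in the boxplot function so that the boxes are drawn at 1 and 3 on the axis, which narrows them. The same applies when adding the points.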
Value1=round(runif(50,min=10,max=50),2)
Value2=round(runif(50,min=6,max=45),2)
colors = brewer.pal(8,"Accent")
boxplot(Value1,Value2,col = "white",cex=0.5, pch=20,outcol=colors[6],
main = "Example",ylab="Value",names =c("Feature1","Feature2"),
at=c(1, 3), xlim=c(0, 4))
stripchart(list(Value1,Value2),vertical = T,method = "jitter",cex = 0.8,
pch = 20,col = colors[1:2],add=T,at=c(1, 3))
dev.off()
Example4
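As an alternative to repositioning the boxes, the box width can also be scaled directly with boxplot's boxwex argument. This is a minimal sketch, not the approach used above: the value 0.4, the seed, the point colors, and the pdf(NULL) null device are assumptions chosen for illustration.

```r
# Sketch: narrow the boxes with boxwex instead of at/xlim.
set.seed(1)  # assumption: fixed seed for reproducible simulated data
Value1 <- round(runif(50, min = 10, max = 50), 2)
Value2 <- round(runif(50, min = 6, max = 45), 2)

pdf(NULL)  # null device, so the sketch also runs non-interactively
# boxwex scales the width of every box; 0.4 is an assumed value to tune by eye
bp <- boxplot(Value1, Value2, col = "white", boxwex = 0.4,
              main = "Example", ylab = "Value",
              names = c("Feature1", "Feature2"))
# The boxes stay at positions 1 and 2, so stripchart needs no 'at' argument
stripchart(list(Value1, Value2), vertical = TRUE, method = "jitter",
           cex = 0.8, pch = 20, col = c("steelblue", "tomato"), add = TRUE)
dev.off()
```

With boxwex the boxes keep their default positions, so the points line up without passing at to stripchart. The at/xlim approach gives more control over spacing when that matters.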
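2. When there are too many features, the box labels tend to overlap, so the labels need to be rotated to an angle. This cannot be set through the arguments of the boxplot function, so we suppress the x-axis labels and place them on the canvas with the text function instead.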
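class = factor(c(rep(1,25),rep(2,25),rep(3,25),rep(4,25)))
# One value per row: 100 observations to match the 100 class labels
Value1 = round(runif(100, min = 10, max = 50), 2)
Value2 = round(runif(100, min = 6, max = 45), 2)
Dat = data.frame(class, Value1, Value2)
colors = brewer.pal(8, "Accent")
# xaxt = "n" suppresses the default x-axis so the labels can be drawn by hand
boxplot(Value1 ~ class, data = Dat, col = "white", cex = 0.5, pch = 20, outcol = colors[6],
main = "Example", ylab = "Value", xlab = "", xaxt = "n")
# Draw tick marks only, then place the labels rotated by 30 degrees below the axis
axis(side = 1, at = 1:4, labels = FALSE)
text(x = 1:4, y = 6, labels = levels(Dat$class), xpd = TRUE, srt = 30)
Dat1 = Dat %>% filter(class == 1)
Dat2 = Dat %>% filter(class == 2)
Dat3 = Dat %>% filter(class == 3)
Dat4 = Dat %>% filter(class == 4)
# The palette holds only 8 colors, so the four groups use colors[5:8]
stripchart(list(Dat1[,"Value1"], Dat2[,"Value1"], Dat3[,"Value1"], Dat4[,"Value1"]),
vertical = TRUE, method = "jitter", cex = 0.8,
pch = 20, col = colors[5:8], add = TRUE)
dev.off()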
Example5
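The hard-coded y = 6 in the text call only works for this data range. One possible way to make the label position follow the data is to derive it from the plot region reported by par("usr"). This is a minimal sketch: the 6% offset, the adj = 1 alignment, the seed, and the pdf(NULL) null device are assumptions, not part of the original example.

```r
# Sketch: place rotated x-axis labels relative to the plot region.
set.seed(1)  # assumption: fixed seed for reproducible simulated data
Dat <- data.frame(
  class  = factor(rep(1:4, each = 25)),
  Value1 = round(runif(100, min = 10, max = 50), 2)
)

pdf(NULL)  # null device, so the sketch also runs non-interactively
boxplot(Value1 ~ class, data = Dat, col = "white",
        main = "Example", ylab = "Value", xlab = "", xaxt = "n")
axis(side = 1, at = 1:4, labels = FALSE)  # tick marks only

usr <- par("usr")  # plot region limits: c(x_min, x_max, y_min, y_max)
# Put the labels 6% of the y-range below the bottom edge (assumed offset)
label_y <- usr[3] - 0.06 * (usr[4] - usr[3])
text(x = 1:4, y = label_y, labels = levels(Dat$class),
     xpd = TRUE, srt = 30, adj = 1)
dev.off()
```

Because label_y is computed from the current axis limits, the same code should keep the labels just under the axis when the data range changes.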