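With mid-year reviews under way, let's go over some basics. The reviews wrap up in a few days, and I'm looking forward to FFPE-based spatial transcriptomics making a splash.
Installing and loading the ggpubr package
There are two ways to install it:
- Install directly from CRAN: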
install.packages("ggpubr")
- Install the latest version from GitHub:
if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/ggpubr")
Once it is installed, just load it:
library(ggpubr)
Plots that ggpubr can draw:
ggpubr can draw most of the plots we commonly use; they are introduced one by one below.
Distribution plots
# Build the dataset
set.seed(1234)
df <- data.frame( sex=factor(rep(c("f", "M"), each=200)),
weight=c(rnorm(200, 55), rnorm(200, 58)))
head(df)
## sex weight
## 1 f 53.79293
## 2 f 55.27743
## 3 f 56.08444
## 4 f 52.65430
## 5 f 55.42912
## 6 f 55.50606
Density plot with a marginal rug and mean lines
ggdensity(df, x="weight", add = "mean", rug = TRUE, color = "sex", fill = "sex",
palette = c("#00AFBB", "#E7B800"))
Histogram with mean lines and a marginal rug
gghistogram(df, x="weight", add = "mean", rug = TRUE, color = "sex", fill = "sex",
palette = c("#00AFBB", "#E7B800"))
Box plots and violin plots
# Load the ToothGrowth dataset
data("ToothGrowth")
df1 <- ToothGrowth
head(df1)
## len supp dose
## 1 4.2 VC 0.5
## 2 11.5 VC 0.5
## 3 7.3 VC 0.5
## 4 5.8 VC 0.5
## 5 6.4 VC 0.5
## 6 10.0 VC 0.5
p <- ggboxplot(df1, x="dose", y="len", color = "dose",
palette = c("#00AFBB", "#E7B800", "#FC4E07"),
add = "jitter", shape="dose") # add jittered points; point shape is mapped to dose
p
Add p-values for comparisons between groups; the pairs to annotate can be specified explicitly
my_comparisons <- list(c("0.5", "1"), c("1", "2"), c("0.5", "2"))
p+stat_compare_means(comparisons = my_comparisons)+ # pairwise comparisons between groups
stat_compare_means(label.y = 50) # global p-value, placed at y = 50
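The p-values on the plot can be cross-checked in base R. The sketch below assumes that `stat_compare_means()` is running with its documented defaults (a Wilcoxon rank-sum test for each pairwise comparison and a Kruskal-Wallis test for the global p-value); the names `tg`, `pairwise_p`, and `global_test` are introduced here only for illustration.

```r
# Cross-check the p-values with base R (assumes the default methods of
# stat_compare_means(): Wilcoxon for pairs, Kruskal-Wallis globally).
data("ToothGrowth")
tg <- ToothGrowth
tg$dose <- factor(tg$dose)

my_comparisons <- list(c("0.5", "1"), c("1", "2"), c("0.5", "2"))

# One Wilcoxon rank-sum test per pair of dose levels
pairwise_p <- sapply(my_comparisons, function(pair) {
  sub <- droplevels(tg[tg$dose %in% pair, ])
  # exact = FALSE skips the exact test, which is not available with ties in len
  wilcox.test(len ~ dose, data = sub, exact = FALSE)$p.value
})
names(pairwise_p) <- sapply(my_comparisons, paste, collapse = " vs ")

# Global test across all three dose levels
global_test <- kruskal.test(len ~ dose, data = tg)

print(pairwise_p)
print(global_test$p.value)
```

If the values differ from those on the plot, the most likely cause is that a different `method` was passed to `stat_compare_means()`.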
Violin plot with box plots inside
ggviolin(df1, x="dose", y="len", fill = "dose",
palette = c("#00AFBB", "#E7B800", "#FC4E07"),
add = "boxplot", add.params = list(fill="white"))+
stat_compare_means(comparisons = my_comparisons, label = "p.signif")+ # label = "p.signif" shows significance marks (asterisks)
stat_compare_means(label.y = 50)
Bar plots
data("mtcars")
df2 <- mtcars
df2$cyl <- factor(df2$cyl)
df2$name <- rownames(df2) # add a name column
head(df2[, c("name", "wt", "mpg", "cyl")])
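Bar plot sorted in descending order (sorted globally, not by group)
ggbarplot(df2, x="name", y="mpg", fill = "cyl", color = "white",
palette = "jco", # color palette of the journal JCO (Journal of Clinical Oncology)
sort.val = "desc", # sort in descending order
sort.by.groups=FALSE, # do not sort within groups
x.text.angle=60)
Bar plot sorted within groups
ggbarplot(df2, x="name", y="mpg", fill = "cyl", color = "white",
palette = "jco", # color palette of the journal JCO (Journal of Clinical Oncology)
sort.val = "asc", # sort in ascending order, as opposed to "desc"; compare the two plots
sort.by.groups=TRUE, # sort within groups
x.text.angle=90)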
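Deviation plots
A deviation plot shows how far each value deviates from a reference value; here the reference is the mean of mpg, and the deviation is expressed as a z-score.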
df2$mpg_z <- (df2$mpg-mean(df2$mpg))/sd(df2$mpg)
df2$mpg_grp <- factor(ifelse(df2$mpg_z<0, "low", "high"), levels = c("low", "high"))
head(df2[, c("name", "wt", "mpg", "mpg_grp", "cyl")])
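The z-score step can be sanity-checked with base R's `scale()`, which centers a vector on its mean and divides by its standard deviation. This is a small sketch; the names `mpg_z` and `mpg_z_check` are introduced here only for illustration.

```r
# Recompute the z-scores with scale() and compare with the hand-written formula
data("mtcars")
mpg <- mtcars$mpg
mpg_z <- (mpg - mean(mpg)) / sd(mpg)
mpg_z_check <- as.numeric(scale(mpg))  # scale() returns a one-column matrix

print(all.equal(mpg_z, mpg_z_check))   # both computations should agree
print(round(mean(mpg_z), 10))          # z-scores are centered on 0
print(sd(mpg_z))                       # and have unit standard deviation

# Group sizes that drive the fill color in the deviation plot
print(table(ifelse(mpg_z < 0, "low", "high")))
```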
Draw the sorted bar plot
ggbarplot(df2, x="name", y="mpg_z", fill = "mpg_grp", color = "white",
palette = "jco", sort.val = "asc", sort.by.groups = FALSE, x.text.angle=60,
ylab = "MPG z-score", xlab = FALSE, legend.title="MPG Group")
Rotating the axes
ggbarplot(df2, x="name", y="mpg_z", fill = "mpg_grp", color = "white",
palette = "jco", sort.val = "desc", sort.by.groups = FALSE,
x.text.angle=90, ylab = "MPG z-score", xlab = FALSE,
legend.title="MPG Group", rotate=TRUE, ggtheme = theme_minimal())
Dot charts
Lollipop chart
A lollipop chart can be used in place of a bar plot to display the data
ggdotchart(df2, x="name", y="mpg", color = "cyl",
palette = c("#00AFBB", "#E7B800", "#FC4E07"), sorting = "ascending",
add = "segments", ggtheme = theme_pubr())
Many of the parameters can be customized
ggdotchart(df2, x="name", y="mpg", color = "cyl",
palette = c("#00AFBB", "#E7B800", "#FC4E07"), sorting = "descending",
add = "segments", rotate = TRUE, group = "cyl", dot.size = 6,
label = round(df2$mpg), font.label = list(color="white", size=9, vjust=0.5),
ggtheme = theme_pubr())
Deviation plot
ggdotchart(df2, x="name", y="mpg_z", color = "cyl",
palette = c("#00AFBB", "#E7B800", "#FC4E07"), sorting = "descending",
add = "segments", add.params = list(color="lightgray", size=2), # segments from y = 0 to the dots
group = "cyl", dot.size = 6, label = round(df2$mpg_z, 1),
font.label = list(color="white", size=9, vjust=0.5), ggtheme = theme_pubr())+
geom_hline(yintercept=0, linetype=2, color="lightgray") # dashed reference line at z = 0
Cleveland dot plot
ggdotchart(df2, x="name", y="mpg", color = "cyl",
palette = c("#00AFBB", "#E7B800", "#FC4E07"), sorting = "descending",
rotate = TRUE, dot.size = 2, y.text.col=TRUE, ggtheme = theme_pubr())+
theme_cleveland()
Of course, there are many other plotting functions
3. More
- ggscatter(): scatter plot
- stat_cor(): add correlation coefficients with p-values to a scatter plot
- stat_stars(): add stars to a scatter plot
- ggscatterhist(): scatter plot with marginal histograms
- ggpaired(): plot paired data
- ggmaplot(): make an MA plot, which is a scatter plot of log2 fold changes (on the y-axis) versus the mean expression signal (on the x-axis).
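The MA plot shows the relationship between gene abundance and expression change. Points further to the right are genes with higher abundance, and points further from the horizontal zero line (up or down) are genes with a larger change, so points toward the upper right or lower right are abundant genes with large changes. Of course, the MA plot drops statistics such as the FDR. A two-dimensional plot has only two axes to work with; at most, color can be used as a pseudo third dimension.
For novice end users, though, if the same targets show up in both the volcano plot and the MA plot (in practice there is quite a lot of overlap), go ahead and take them to the bench.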
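To make the two axes concrete, here is a minimal base-R sketch that computes M (the log2 fold change) and A (the mean log2 expression) for a tiny made-up expression table. The gene names, the numbers, and the pseudocount of 1 are illustrative assumptions, not data from this post, and the classic definition of A as the average of the two log2 values is assumed.

```r
# Toy expression values for two conditions (hypothetical numbers)
expr <- data.frame(
  gene    = c("g1", "g2", "g3", "g4"),
  control = c(7, 127, 1023, 15),
  treated = c(31, 127, 255, 15)
)

# A pseudocount of 1 avoids log2(0); this choice is an assumption
lc <- log2(expr$control + 1)
lt <- log2(expr$treated + 1)

expr$M <- lt - lc          # log2 fold change (y-axis of the MA plot)
expr$A <- (lt + lc) / 2    # mean log2 expression (x-axis of the MA plot)

print(expr)
```

In this toy table, g3 has the highest A and a negative M, so it would sit toward the lower right of the plot, while g2 and g4 have M = 0 and would lie on the zero line.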
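- Gene abundance: the copy number of a gene in the genome.
- Gene expression abundance: the amount of mRNA transcribed from a gene. It can be measured by RT-qPCR.
- Expression change (fold change): the ratio between two expression values. If gene A has an expression value of 1 and gene B has a value of 3, then B's expression is 3 times that of A. Expression levels are usually measured as counts, TPM, or FPKM, so they are non-negative, and for positive values the fold change lies in (0, +∞).
- Significance of the difference: measured by the P-value. A hypothesis test starts from a hypothesis: we assume that the expression of A and B does not differ (H0, the null hypothesis). Under that assumption, a t test (taking RT-PCR data as an example) gives the probability of observing a difference at least as large as the one we measured between A and B; that probability is the P-value. If the P-value is below 0.05, an unlikely event has occurred under H0, so we reject the null hypothesis and conclude that the expression of A and B differs significantly.
These are the basics; keep learning.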
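The last two definitions can be illustrated with a short base-R sketch. The replicate values below are invented for illustration, and a Welch two-sample t test (the default of `t.test()`) is assumed; it is one reasonable choice, not necessarily the test used in any particular pipeline.

```r
# Fold change: B is 3 times A, so the log2 fold change is log2(3)
a <- 1
b <- 3
fold_change <- b / a
log2_fc <- log2(fold_change)
print(c(fold_change = fold_change, log2_fc = log2_fc))

# Hypothetical replicate measurements for one gene in two groups
group_a <- c(1.0, 1.2, 0.9, 1.1, 1.0)
group_b <- c(2.9, 3.1, 3.3, 2.8, 3.0)

# H0: the mean expression is the same in both groups
res <- t.test(group_b, group_a)
print(res$p.value)

# Reject H0 at the 0.05 level when the p-value is below the threshold
reject_h0 <- res$p.value < 0.05
print(reject_h0)
```

Taking log2 of the fold change maps the range (0, +∞) onto a scale that is symmetric around 0, which is why the MA plot uses log2 fold changes on its y-axis.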