火山圖是測(cè)序分析報(bào)告中最為核心的圖片之一踊谋。繪制火山圖的方法有許多蝉仇,Excel和第三方軟件等,本文主要運(yùn)用ggplot2和ggrepel兩個(gè)R包演示[1][2]殖蚕。
橫坐標(biāo)為Fold change(倍數(shù))轿衔,越偏離中心差異倍數(shù)越大;縱坐標(biāo)為P value(P值)睦疫,值越大差異越顯著害驹。
(一):繪圖分析:
基本要素: 區(qū)分上調(diào)、下調(diào)的散點(diǎn)蛤育、輔助線宛官、坐標(biāo)軸、圖例瓦糕、標(biāo)簽
運(yùn)用參數(shù): geom_point()底洗、geom_vline and geom_hline、labs()咕娄、theme()亥揖、geom_text_repel()
注:區(qū)分上、下調(diào)基因圣勒,就要對(duì)差異基因進(jìn)行統(tǒng)計(jì)學(xué)分析费变,主要包括統(tǒng)計(jì)顯著性和生物學(xué)差異顯著性;需要用到ifelse函數(shù) [3]
(二):數(shù)據(jù)處理
rm(list = ls())
library(ggplot2)
# 數(shù)據(jù)來自http://www.reibang.com/p/a973c29c9103 [參考資料2]
dataset <- read.table(file = "C:/Users/Administrator/Documents/dataset_volcano.txt",
header = TRUE, sep = "")
# 設(shè)置p_value和logFC的閾值
cut_off_pvalue = 0.0000001 #統(tǒng)計(jì)顯著性
cut_off_logFC = 1 #差異倍數(shù)值
# 根據(jù)閾值參數(shù)圣贸,上調(diào)基因設(shè)置為‘up’挚歧,下調(diào)基因設(shè)置為‘Down’,無差異設(shè)置為‘Stable’旁趟,并保存到change列中
dataset$change = ifelse(dataset$P.Value < cut_off_pvalue & abs(dataset$logFC) >= cut_off_logFC,
ifelse(dataset$logFC> cut_off_logFC ,'Up','Down'),
'Stable')
head(dataset)
gene logFC P.Value change
1 LOC100909539 -4.514013 4.33e-12 Down
2 Steap2 -3.678175 1.12e-11 Down
3 Trpt1 2.594115 1.21e-11 Up
4 Cers6 -3.630773 1.45e-11 Down
5 Hs3st1 -3.129658 1.74e-11 Down
6 Lama5 -2.772380 2.51e-11 Down
先從整體判斷條件的真假:
1. 不滿足昼激,記作"stable"庇绽;
2. 滿足锡搜;
logFC大于1,記作"up"瞧掺;
logFC小于1耕餐,記作"down";
(三):繪圖
p <- ggplot(
# 數(shù)據(jù)辟狈、映射肠缔、顏色
dataset, aes(x = logFC, y = -log10(P.Value), colour=change)) +
geom_point(alpha=0.4, size=3.5) +
scale_color_manual(values=c("#546de5", "#d2dae2","#ff4757"))+
# 輔助線
geom_vline(xintercept=c(-1,1),lty=4,col="black",lwd=0.8) +
geom_hline(yintercept = -log10(cut_off_pvalue),lty=4,col="black",lwd=0.8) +
# 坐標(biāo)軸
labs(x="log2(fold change)",
y="-log10 (p-value)")+
theme_bw()+
# 圖例
theme(plot.title = element_text(hjust = 0.5),
legend.position="right",
legend.title = element_blank())
p
(四):添加標(biāo)簽
# 將需要標(biāo)記的基因放置在label列(logFC >= 5)
library(ggrepel)
dataset$label <- ifelse(dataset$P.Value > cut_off_pvalue & abs(dataset$logFC) >= 5,
as.character(dataset$gene), "")
dataset[2600:2606, ]
gene logFC P.Value change label
2600 Ano5 0.1883082 1.20e-07 Stable
2601 Rab3a 0.8501624 1.21e-07 Stable
2602 Pank1 5.2116194 1.21e-07 Stable Pank1
2603 Ghitm 2.4541362 1.21e-07 Stable
2604 Ndufa8 1.4141038 1.21e-07 Stable
2605 Zc3h7a -0.9070267 1.21e-07 Stable
2606 Trpm3 0.1879326 1.21e-07 Stable
p + geom_label_repel(data = dataset, aes(x = dataset$logFC,
y = -log10(dataset$P.Value),
label = label),
size = 3, box.padding = unit(0.5, "lines"),
point.padding = unit(0.8, "lines"),
segment.color = "black",
show.legend = FALSE)
關(guān)于ggrepel包夏跷,后期會(huì)寫參數(shù)使用相關(guān)教程
(五):Volcano plot[4]
# 火山圖應(yīng)用
genes <- read.table(file = "C:/Users/Administrator/Documents/volcano.txt", header = TRUE)
genes$Significant <- ifelse(genes$padj < 0.05 & abs(genes$log2FoldChange) >= 1,
ifelse(genes$log2FoldChange > 1, "Up", "Down"), "Stable")
ggplot(
# 數(shù)據(jù)、映射明未、顏色
genes, aes(x = log2FoldChange, y = -log10(pvalue))) +
geom_point(aes(color = Significant), size=2) +
scale_color_manual(values = c("blue","grey", "red")) +
# 注釋
geom_text_repel(
data = subset(genes, padj < 0.05 & abs(genes$log2FoldChange) >= 1),
aes(label = Gene),
size = 5,
box.padding = unit(0.35, "lines"),
point.padding = unit(0.3, "lines")) +
# 輔助線
geom_vline(xintercept=c(-1,1),lty=4,col="black",lwd=0.8) +
geom_hline(yintercept = 1,lty=4,col="black",lwd=0.8) +
# 坐標(biāo)軸
labs(x="log2(fold change)",
y="-log10 (p-value)") +
# 圖例
theme(legend.position = "bottom")
進(jìn)一步修飾
# 進(jìn)一步修飾
ggplot(
# 數(shù)據(jù)槽华、映射、顏色
genes, aes(x = log2FoldChange, y = -log10(pvalue))) +
geom_point(aes(color = Significant), size=2) +
scale_color_manual(values = c("blue","grey", "red")) +
# 注釋
geom_label_repel(
data = subset(genes, padj < 0.05 & abs(genes$log2FoldChange) >= 1),
aes(label = Gene),
size = 5,fill = "darkred", color = "white",
box.padding = unit(0.35, "lines"),
point.padding = unit(0.3, "lines")) +
# 輔助線
geom_vline(xintercept=c(-0.5, 0.5),lty=4,col="#666666",lwd=0.8) +
geom_hline(yintercept = 1,lty=4,col="#666666",lwd=0.8) +
# 坐標(biāo)軸
labs(x="log2(fold change)",
y="-log10 (p-value)") +
# 圖例
theme(legend.position = "bottom")
參考資料: