繪圖場景:在高通量檢測數(shù)據(jù)中,我們在數(shù)據(jù)展示上往往都喜歡將感興趣的功能或者通路基因繪制成熱圖形式,能夠讓讀者一目了然的發(fā)現(xiàn)某功能是增強了?還是減弱了岭接?在實際過程中富拗,表型實驗上已經(jīng)證明了炎癥反應(yīng)的走向,從初期增強到后期減弱鸣戴。但在對應(yīng)的高通量數(shù)據(jù)中卻不是如我們所想啃沪,趨勢很凌亂,很難說炎癥這個過程的增強減弱葵擎。因此在拿到數(shù)據(jù)后谅阿,需要有選擇性的篩選和展示。今天就以項目為例酬滤,記錄整個繪制過程签餐。
1. 對促炎數(shù)據(jù)做聚類分析并提取所需信息
首先清除環(huán)境,安裝并加載所需要的R包
rm(list = ls()) #清除環(huán)境內(nèi)存
#install.packages("pheatmap") #安裝pheatmap包
#install.packages("readxl") #安裝readxl包
#install.packages("ggplot2") #安裝ggplot2包
#install.packages(c('officer', 'rvg', 'flextable', 'stargazer', 'broom' ))#export的依賴包
#下載export盯串,鏈接https://cran.r-project.org/src/contrib/Archive/export/export_0.2.2.tar.gz
#install.packages("D:/R-3.6.2/library/export_0.2.2.tar.gz", repos = NULL)
library(ggplot2)
library(pheatmap) #加載pheatmap包
library(readxl) #加載readxl包
library(export) #用于輸出不同格式的結(jié)果
讀入促炎數(shù)據(jù)并對數(shù)據(jù)做簡單處理
data_pos<-read_excel("demo3.xlsx",1) #讀入excel數(shù)據(jù)
data_pos<-as.data.frame(data_pos) # 將data轉(zhuǎn)換為data.frame格式
rownames(data_pos)<-data_pos[,1] #將第一列數(shù)據(jù)設(shè)置為行名
data_pos<-data_pos[,-1] #去除重復的第一列
colnames(data_pos) = gsub("_expression","",colnames(data_pos))
head(data_pos)
## 12h 24h 3d 7d
## Il1rl1 1.0475045 0.9948235 1.0268230 1.010622
## Tnfsf4 0.9327795 1.0742431 0.9676329 1.010017
## Tnfsf18 0.8085097 0.8327313 1.0099118 1.020472
## Hspd1 0.9905324 0.9932249 0.9839949 1.002263
## Pla2g4a 1.0225494 0.9872852 1.0146842 1.002581
## Xcl1 0.9028666 0.9687647 0.9989017 1.065237
繪圖查看基因聚類情況
A<-pheatmap(data_pos,fontsize_col=12,fontsize_row=8,
show_rownames=F,show_colnames = T,
cluster_row=T,cluster_col=F,scale="row",
treeheight_row=25,treeheight_col=25,
border_color=NA,main="Heatmap",
color = colorRampPalette(c("blue", "white", "red"))(50),
cutree_rows=3) # 畫圖
ggsave(A,filename = "demo3_pos.pdf",width=3,height=5)
ggsave(A,filename = "demo3_pos.png",width=3,height=5)
從上圖發(fā)現(xiàn)氯檐,基因按照表達模式可分為3大類,其中1和2類是想要的數(shù)據(jù)体捏。
那如何提取這部分數(shù)據(jù)呢冠摄?
首先將表達數(shù)據(jù)與聚類分類信息合并
clus_a=cutree(A$tree_row,3) #獲取按照3塊,得到聚類模塊信息
table(clus_a)# 每個聚類模塊的數(shù)量
## clus_a
## 1 2 3
## 12 17 31
clus_a<-as.data.frame(clus_a)#轉(zhuǎn)化為data.frame格式
data_pos= cbind(data_pos, clus_a)# 按列合并數(shù)據(jù)
data_pos #查看合并后數(shù)據(jù)
## 12h 24h 3d 7d clus_a
## Il1rl1 1.0475045 0.9948235 1.0268230 1.0106219 1
## Tnfsf4 0.9327795 1.0742431 0.9676329 1.0100173 2
## Tnfsf18 0.8085097 0.8327313 1.0099118 1.0204721 3
## Hspd1 0.9905324 0.9932249 0.9839949 1.0022628 3
## Pla2g4a 1.0225494 0.9872852 1.0146842 1.0025812 1
## Xcl1 0.9028666 0.9687647 0.9989017 1.0652369 3
## Tgm2 0.9719927 0.9908516 0.9937439 1.0165441 3
## S100a8 1.1048567 0.9885955 1.1913848 0.9661311 1
## Ctss 1.0075379 0.9725514 1.0238447 1.0097870 3
## Adora3 0.8768417 0.9808549 0.9766739 1.0384855 3
## Fabp4 1.0056085 1.0287909 0.9618795 0.9613461 2
## Tlr2 0.9925440 1.0575861 0.9961778 0.9766016 2
## S100a9 0.9364502 0.9293324 0.9946467 1.0106743 3
## Tlr4 1.0503438 1.0367557 1.0205949 0.9283988 1
## Zp3 0.8793431 0.9968576 1.0090100 1.0340921 3
## Clock 1.0138973 1.0231884 1.0113094 0.9766799 2
## Trpv4 0.9513371 0.9818378 1.0161012 1.0326306 3
## Ccl26 0.9786242 0.9316714 1.0050418 1.0356912 3
## Ccl24 0.9361875 0.9389519 1.0271715 1.0666504 3
## Serpine1 0.9992854 1.0101892 0.9987391 0.9768985 2
## Il17ra 1.0549646 1.0201907 0.9829150 0.9061829 1
## Tnfrsf1a 1.0196453 1.0356289 0.9909272 0.9341439 2
## Pde2a 0.9950061 1.0437024 1.0036394 0.9912396 2
## Gprc5b 0.9359255 0.9440990 1.0133725 1.0411849 3
## Adam8 1.0053608 1.0149603 1.0010803 1.0082890 2
## Cx3cl1 0.9103365 0.9611052 0.9865766 1.0385095 3
## Tlr3 1.0712287 0.9860523 1.0268784 0.9320602 1
## Ednra 0.9760352 1.0376250 0.9664353 1.0616230 3
## Ldlr 0.9951658 1.0412401 1.0044728 0.9757518 2
## Ets1 0.9649667 0.9932523 0.9633367 1.0178563 3
## Tlr9 0.9845119 1.0433439 1.0179088 0.9331340 2
## Hyal2 0.9805355 0.9952188 0.9537844 1.0180491 3
## Ccr2 1.0781580 0.9166186 0.9737452 0.9468374 1
## Ccr5 1.0379514 1.0791485 0.9571004 0.9994896 2
## Cdk19 1.0834689 0.9774149 0.9731305 1.0011084 1
## Egfr 0.9594616 0.9865012 0.9968854 1.0231812 3
## Ccl2 0.9410630 0.9234708 0.9172620 1.0855231 3
## Ccl7 0.9377025 0.9959727 0.9830637 0.9999279 3
## Ccl11 0.9395817 1.0046657 0.9952031 1.1051408 3
## Ccl12 1.0210894 0.9428178 0.9676126 1.0958590 3
## Ccl4 0.9173600 1.0137382 0.9935696 0.8478784 2
## Stat5a 0.9984534 1.0020705 0.9927384 1.0171578 3
## Ace 0.9411769 0.9791180 0.9885292 1.0293273 3
## Tnip1 0.9967821 1.0371306 0.9868051 0.9751483 2
## Ccl1 0.9490765 1.0061028 1.0299452 1.0591937 3
## Ccl9 1.0748156 1.0255019 1.0143933 1.0110047 1
## Ccl6 0.9836699 1.0019130 0.9749013 1.0098487 3
## Ccl3 1.0027344 1.0298867 1.0158420 0.9723747 2
## Stat5b 1.0067589 1.0196183 0.9994107 0.9936060 2
## Prkca 1.0188645 0.9967514 0.9482353 0.9719247 1
## Itga2 0.9682602 0.9558409 0.9855012 1.0177017 3
## Wnt5a 0.9772947 0.9830877 0.9540308 1.0548477 3
## Il17rb 0.9341813 0.9980961 0.9743833 1.0585646 3
## Ptger4 0.9765908 1.0148698 1.0042148 0.9546091 2
## Cd47 0.9976905 0.9901815 1.0223011 1.0073487 3
## Ager 0.9438379 1.0014070 0.9860253 1.0298349 3
## Tnf 1.0131602 1.0011471 1.0135716 0.9635546 1
## Gpsm3 0.9960939 1.0271684 0.9634463 0.9898771 2
## Jak2 1.0512738 0.9872834 0.9996479 0.9795569 1
## Il33 1.0075621 0.9593182 0.9695374 1.0968678 3
按照聚類順序?qū)⒕垲?和2的數(shù)據(jù)提取出來
order_row = A$tree_row$order #記錄熱圖的行排序
data_pos = data_pos[order_row,] # 按照熱圖的順序几缭,重新排原始數(shù)據(jù)
data_pos = data.frame(rownames(data_pos),data_pos,check.names =T) # 將行名加到表格數(shù)據(jù)中
data_pos =data_pos[(data_pos$clus %in% c("1","2")),]
#或者data_pos =data_pos[(data_pos$clus < 3),]
data_pos #查看數(shù)據(jù)
## rownames.data_pos. X12h X24h X3d X7d clus_a
## Prkca Prkca 1.0188645 0.9967514 0.9482353 0.9719247 1
## Ccr2 Ccr2 1.0781580 0.9166186 0.9737452 0.9468374 1
## Jak2 Jak2 1.0512738 0.9872834 0.9996479 0.9795569 1
## Cdk19 Cdk19 1.0834689 0.9774149 0.9731305 1.0011084 1
## Ccl9 Ccl9 1.0748156 1.0255019 1.0143933 1.0110047 1
## Tlr4 Tlr4 1.0503438 1.0367557 1.0205949 0.9283988 1
## Il17ra Il17ra 1.0549646 1.0201907 0.9829150 0.9061829 1
## Tlr3 Tlr3 1.0712287 0.9860523 1.0268784 0.9320602 1
## Tnf Tnf 1.0131602 1.0011471 1.0135716 0.9635546 1
## S100a8 S100a8 1.1048567 0.9885955 1.1913848 0.9661311 1
## Il1rl1 Il1rl1 1.0475045 0.9948235 1.0268230 1.0106219 1
## Pla2g4a Pla2g4a 1.0225494 0.9872852 1.0146842 1.0025812 1
## Tnfsf4 Tnfsf4 0.9327795 1.0742431 0.9676329 1.0100173 2
## Adam8 Adam8 1.0053608 1.0149603 1.0010803 1.0082890 2
## Ccr5 Ccr5 1.0379514 1.0791485 0.9571004 0.9994896 2
## Gpsm3 Gpsm3 0.9960939 1.0271684 0.9634463 0.9898771 2
## Ccl4 Ccl4 0.9173600 1.0137382 0.9935696 0.8478784 2
## Ptger4 Ptger4 0.9765908 1.0148698 1.0042148 0.9546091 2
## Tlr9 Tlr9 0.9845119 1.0433439 1.0179088 0.9331340 2
## Ccl3 Ccl3 1.0027344 1.0298867 1.0158420 0.9723747 2
## Ldlr Ldlr 0.9951658 1.0412401 1.0044728 0.9757518 2
## Tlr2 Tlr2 0.9925440 1.0575861 0.9961778 0.9766016 2
## Pde2a Pde2a 0.9950061 1.0437024 1.0036394 0.9912396 2
## Tnfrsf1a Tnfrsf1a 1.0196453 1.0356289 0.9909272 0.9341439 2
## Clock Clock 1.0138973 1.0231884 1.0113094 0.9766799 2
## Serpine1 Serpine1 0.9992854 1.0101892 0.9987391 0.9768985 2
## Fabp4 Fabp4 1.0056085 1.0287909 0.9618795 0.9613461 2
## Tnip1 Tnip1 0.9967821 1.0371306 0.9868051 0.9751483 2
## Stat5b Stat5b 1.0067589 1.0196183 0.9994107 0.9936060 2
data_pos_reorder=data_pos[,-6]
write.table(data_pos_reorder,file="data_pos_reorder.txt",row.names=FALSE,quote = FALSE,sep='\t') #輸出結(jié)果河泳,按照熱圖中的順序
2. 對抑炎數(shù)據(jù)做聚類分析并提取所需信息
讀入抑炎數(shù)據(jù)并對數(shù)據(jù)做簡單處理
data_neg<-read_excel("demo3.xlsx",2) #讀入excel數(shù)據(jù)
data_neg<-as.data.frame(data_neg) # 將data轉(zhuǎn)換為data.frame格式
rownames(data_neg)<-data_neg[,1] #將第一列數(shù)據(jù)設(shè)置為行名
data_neg<-data_neg[,-1] #去除重復的第一列
colnames(data_neg) = gsub("_expression","",colnames(data_neg))
head(data_neg)
## 12h 24h 3d 7d
## Il10 0.8684374 0.9661305 0.9805500 1.053469
## Adora1 0.9303991 0.9886013 0.9663481 1.056422
## Il2ra 0.9800390 0.9831342 0.9467422 1.011937
## Tnfaip6 0.9267055 1.0009174 1.0224794 1.063077
## Tyro3 0.9413618 0.9496315 0.9760268 1.060393
## Gata3 0.9527483 0.9656051 0.9631235 1.045775
繪圖查看基因聚類情況
B<-pheatmap(data_neg,fontsize_col=12,fontsize_row=8,
show_rownames=F,show_colnames = T,
cluster_row=T,cluster_col=F,scale="row",
treeheight_row=25,treeheight_col=25,
border_color=NA,main="Heatmap",
color = colorRampPalette(c("blue", "white", "red"))(50),
cutree_rows=2) # 畫圖
ggsave(B,filename = "demo3_neg.pdf",width=3,height=8)
ggsave(B,filename = "demo3_neg.png",width=3,height=8)
從上圖發(fā)現(xiàn),基因按照表達模式可分為2大類年栓,其中1類是想要的數(shù)據(jù)拆挥。
那如何提取這部分數(shù)據(jù)呢?
首先將表達數(shù)據(jù)與聚類分類信息合并
clus_b=cutree(B$tree_row,2) #獲取按照3塊某抓,得到聚類模塊信息
table(clus_b)# 每個聚類模塊的數(shù)量
## clus_b
## 1 2
## 52 30
clus_b<-as.data.frame(clus_b)
data_neg= cbind(data_neg, clus_b)
head(data_pos) #查看合并后數(shù)據(jù)
## rownames.data_pos. X12h X24h X3d X7d clus_a
## Prkca Prkca 1.018865 0.9967514 0.9482353 0.9719247 1
## Ccr2 Ccr2 1.078158 0.9166186 0.9737452 0.9468374 1
## Jak2 Jak2 1.051274 0.9872834 0.9996479 0.9795569 1
## Cdk19 Cdk19 1.083469 0.9774149 0.9731305 1.0011084 1
## Ccl9 Ccl9 1.074816 1.0255019 1.0143933 1.0110047 1
## Tlr4 Tlr4 1.050344 1.0367557 1.0205949 0.9283988 1
按照聚類順序?qū)⒕垲?和2的數(shù)據(jù)提取出來
order_row = B$tree_row$order #記錄熱圖的行排序
data_neg = data_neg[order_row,] # 按照熱圖的順序纸兔,重新排原始數(shù)據(jù)
data_neg = data.frame(rownames(data_neg),data_neg,check.names =T) # 將行名加到表格數(shù)據(jù)中
table(data_neg$clus)
##
## 1 2
## 52 30
data_neg =data_neg[(data_neg$clus %in% c("1")),]
#或者data_neg =data_neg[(data_neg$clus < 2),]
head(data_neg)
## rownames.data_neg. X12h X24h X3d X7d clus_b
## Chrna7 Chrna7 0.9867821 0.9545026 0.9598899 0.9981845 1
## Aoah Aoah 1.0066938 0.9492302 0.9263789 1.0097112 1
## Pbk Pbk 1.0071100 0.9945398 1.0009174 1.0369612 1
## Il2 Il2 0.9414986 0.9277887 0.9402898 1.1531456 1
## Apoe Apoe 0.9638917 0.9554340 0.9667258 1.0191987 1
## Apoa1 Apoa1 0.9878721 0.9745789 0.9888851 1.0551538 1
data_neg_reorder=data_neg[,-6]
write.table(data_neg_reorder,file="data_neg_reorder.txt",row.names=FALSE,quote = FALSE,sep='\t') #輸出結(jié)果,按照熱圖中的順序
3. 限定pos組數(shù)據(jù)的顏色否副,統(tǒng)一顏色標尺
首席清除環(huán)境汉矿,加載R包并讀入上游篩選出的促炎數(shù)據(jù)
rm(list = ls()) #清除環(huán)境內(nèi)存
library(pheatmap) #加載pheatmap包
library(readxl) #加載readxl包
data=read.table('data_pos_reorder.txt',sep = '\t',stringsAsFactors = F,header = T)
data<-as.data.frame(data) # 將data轉(zhuǎn)換為data.frame格式
rownames(data)<-data[,1] #將第一列數(shù)據(jù)設(shè)置為行名
data<-data[,-1] #去除重復的第一列
colnames(data) = gsub("X","",colnames(data))
限定顏范圍
ys <- c(seq(-1,-0.01,by=0.01),seq(0,1,by=0.01))#設(shè)定顏色范圍數(shù)
繪圖(促炎熱圖)
p1=pheatmap(data,fontsize_col=8,fontsize_row=7,
show_rownames=T,show_colnames = T,
cluster_row=T,cluster_col=F,scale="row",
treeheight_row=25,treeheight_col=25,
border_color=NA,main="Heatmap",
color = c(colorRampPalette(colors = c("blue","white"))(length(ys)/2),colorRampPalette(colors = c("white","red"))(length(ys)/2)),
legend_breaks=seq(-1,1,0.5),
breaks=ys)
ggsave(p1,filename = "demo3_1.pdf",width=3,height=5)
ggsave(p1,filename = "demo3_1.png",width=3,height=5)
4. 限定neg組數(shù)據(jù)的顏色,統(tǒng)一顏色標尺
首席清除環(huán)境备禀,加載R包并讀入上游篩選出的促炎數(shù)據(jù)
rm(list = ls()) #清除環(huán)境內(nèi)存
library(pheatmap) #加載pheatmap包
library(readxl) #加載readxl包
data=read.table('data_neg_reorder.txt',sep = '\t',stringsAsFactors = F,header = T)
data<-as.data.frame(data) # 將data轉(zhuǎn)換為data.frame格式
rownames(data)<-data[,1] #將第一列數(shù)據(jù)設(shè)置為行名
data<-data[,-1] #去除重復的第一列
colnames(data) = gsub("X","",colnames(data))
限定顏范圍
ys <- c(seq(-1,-0.01,by=0.01),seq(0,1,by=0.01))#設(shè)定顏色范圍數(shù)
繪圖(抑炎熱圖)
p2=pheatmap(data,fontsize_col=8,fontsize_row=7,
show_rownames=T,show_colnames = T,
cluster_row=T,cluster_col=F,scale="row",
treeheight_row=25,treeheight_col=25,
border_color=NA,main="Heatmap",
color = c(colorRampPalette(colors = c("blue","white"))(length(ys)/2),colorRampPalette(colors = c("white","red"))(length(ys)/2)),
legend_breaks=seq(-1,1,0.5),
breaks=ys)
ggsave(p2,filename = "demo3_2.pdf",width=3,height=8)
ggsave(p2,filename = "demo3_2.png",width=3,height=8)
5. 合并兩張圖片
最后可以用AI組合兩個結(jié)果洲拇,也許R包也可以,還沒有學曲尸。呻待。。成圖如下:整體體現(xiàn)出促炎反應(yīng)隨著時間減弱队腐,抑炎隨著時間增強,兩個功能呈現(xiàn)此消彼長的趨勢奏篙,符合實驗表型柴淘。
顯示運行環(huán)境
sessionInfo()
## R version 3.6.2 (2019-12-12)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 18363)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=Chinese (Simplified)_China.936
## [2] LC_CTYPE=Chinese (Simplified)_China.936
## [3] LC_MONETARY=Chinese (Simplified)_China.936
## [4] LC_NUMERIC=C
## [5] LC_TIME=Chinese (Simplified)_China.936
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] export_0.2.2 readxl_1.3.1 pheatmap_1.0.12 ggplot2_3.2.1
##
## loaded via a namespace (and not attached):
## [1] rgl_0.100.50 Rcpp_1.0.3 lattice_0.20-38
## [4] tidyr_1.0.2 assertthat_0.2.1 digest_0.6.24
## [7] mime_0.9 R6_2.4.1 cellranger_1.1.0
## [10] backports_1.1.5 evaluate_0.14 pillar_1.4.3
## [13] gdtools_0.2.1 rlang_0.4.4 lazyeval_0.2.2
## [16] uuid_0.1-4 miniUI_0.1.1.1 data.table_1.12.8
## [19] flextable_0.5.9 rmarkdown_2.1 webshot_0.5.2
## [22] stringr_1.4.0 htmlwidgets_1.5.1 munsell_0.5.0
## [25] shiny_1.4.0 broom_0.5.5 compiler_3.6.2
## [28] httpuv_1.5.2 xfun_0.12 pkgconfig_2.0.3
## [31] systemfonts_0.1.1 base64enc_0.1-3 rvg_0.2.4
## [34] htmltools_0.4.0 tidyselect_1.0.0 tibble_2.1.3
## [37] crayon_1.3.4 dplyr_0.8.4 withr_2.1.2
## [40] later_1.0.0 grid_3.6.2 nlme_3.1-142
## [43] jsonlite_1.6.1 xtable_1.8-4 gtable_0.3.0
## [46] lifecycle_0.1.0 magrittr_1.5 scales_1.1.0
## [49] zip_2.0.4 stringi_1.4.6 promises_1.1.0
## [52] xml2_1.2.2 vctrs_0.2.3 generics_0.0.2
## [55] openxlsx_4.1.4 stargazer_5.2.2 RColorBrewer_1.1-2
## [58] tools_3.6.2 manipulateWidget_0.10.1 glue_1.3.1
## [61] officer_0.3.8 purrr_0.3.3 crosstalk_1.0.0
## [64] fastmap_1.0.1 yaml_2.2.1 colorspace_1.4-1
## [67] knitr_1.28
往期回顧
R繪圖|ggplot2散點圖的繪制
R繪圖|pheatmap熱圖繪制——基礎(chǔ)篇
R繪圖|pheatmap熱圖繪制——中階篇
今天的內(nèi)容就到這里~~迫淹,更多內(nèi)容可關(guān)注公共號“YJY技能修煉”~~