R繪圖|pheatmap熱圖繪制——高階篇

繪圖場景：在高通量檢測數(shù)據(jù)中，我們在數(shù)據(jù)展示上往往都喜歡將感興趣的功能或者通路基因繪制成熱圖形式，能夠讓讀者一目了然的發(fā)現(xiàn)某功能是增強了？還是減弱了岭接？在實際過程中富拗，表型實驗上已經(jīng)證明了炎癥反應(yīng)的走向，從初期增強到后期減弱鸣戴。但在對應(yīng)的高通量數(shù)據(jù)中卻不是如我們所想啃沪，趨勢很凌亂，很難說炎癥這個過程的增強減弱葵擎。因此在拿到數(shù)據(jù)后谅阿，需要有選擇性的篩選和展示。今天就以項目為例酬滤，記錄整個繪制過程签餐。

1. 對促炎數(shù)據(jù)做聚類分析并提取所需信息

首先清除環(huán)境，安裝并加載所需要的R包

rm(list = ls()) #清除環(huán)境內(nèi)存
#install.packages("pheatmap")  #安裝pheatmap包
#install.packages("readxl")  #安裝readxl包
#install.packages("ggplot2") #安裝ggplot2包
#install.packages(c('officer', 'rvg', 'flextable', 'stargazer', 'broom' ))#export的依賴包
#下載export盯串，鏈接https://cran.r-project.org/src/contrib/Archive/export/export_0.2.2.tar.gz
#install.packages("D:/R-3.6.2/library/export_0.2.2.tar.gz", repos = NULL)

library(ggplot2)
library(pheatmap) #加載pheatmap包
library(readxl)  #加載readxl包
library(export) #用于輸出不同格式的結(jié)果

讀入促炎數(shù)據(jù)并對數(shù)據(jù)做簡單處理

data_pos<-read_excel("demo3.xlsx",1)    #讀入excel數(shù)據(jù)
data_pos<-as.data.frame(data_pos) # 將data轉(zhuǎn)換為data.frame格式
rownames(data_pos)<-data_pos[,1]  #將第一列數(shù)據(jù)設(shè)置為行名
data_pos<-data_pos[,-1] #去除重復的第一列
colnames(data_pos) = gsub("_expression","",colnames(data_pos))
head(data_pos)
##               12h       24h        3d       7d
## Il1rl1  1.0475045 0.9948235 1.0268230 1.010622
## Tnfsf4  0.9327795 1.0742431 0.9676329 1.010017
## Tnfsf18 0.8085097 0.8327313 1.0099118 1.020472
## Hspd1   0.9905324 0.9932249 0.9839949 1.002263
## Pla2g4a 1.0225494 0.9872852 1.0146842 1.002581
## Xcl1    0.9028666 0.9687647 0.9989017 1.065237

繪圖查看基因聚類情況

A<-pheatmap(data_pos,fontsize_col=12,fontsize_row=8,
         show_rownames=F,show_colnames = T,
         cluster_row=T,cluster_col=F,scale="row",
         treeheight_row=25,treeheight_col=25,
         border_color=NA,main="Heatmap",
         color = colorRampPalette(c("blue", "white", "red"))(50),
         cutree_rows=3)  # 畫圖

image

ggsave(A,filename = "demo3_pos.pdf",width=3,height=5)
ggsave(A,filename = "demo3_pos.png",width=3,height=5)

從上圖發(fā)現(xiàn)氯檐，基因按照表達模式可分為3大類，其中1和2類是想要的數(shù)據(jù)体捏。
那如何提取這部分數(shù)據(jù)呢冠摄？

首先將表達數(shù)據(jù)與聚類分類信息合并

clus_a=cutree(A$tree_row,3) #獲取按照3塊，得到聚類模塊信息
table(clus_a)# 每個聚類模塊的數(shù)量
## clus_a
##  1  2  3
## 12 17 31
clus_a<-as.data.frame(clus_a)#轉(zhuǎn)化為data.frame格式
data_pos= cbind(data_pos, clus_a)# 按列合并數(shù)據(jù)
data_pos #查看合并后數(shù)據(jù)
##                12h       24h        3d        7d clus_a
## Il1rl1   1.0475045 0.9948235 1.0268230 1.0106219      1
## Tnfsf4   0.9327795 1.0742431 0.9676329 1.0100173      2
## Tnfsf18  0.8085097 0.8327313 1.0099118 1.0204721      3
## Hspd1    0.9905324 0.9932249 0.9839949 1.0022628      3
## Pla2g4a  1.0225494 0.9872852 1.0146842 1.0025812      1
## Xcl1     0.9028666 0.9687647 0.9989017 1.0652369      3
## Tgm2     0.9719927 0.9908516 0.9937439 1.0165441      3
## S100a8   1.1048567 0.9885955 1.1913848 0.9661311      1
## Ctss     1.0075379 0.9725514 1.0238447 1.0097870      3
## Adora3   0.8768417 0.9808549 0.9766739 1.0384855      3
## Fabp4    1.0056085 1.0287909 0.9618795 0.9613461      2
## Tlr2     0.9925440 1.0575861 0.9961778 0.9766016      2
## S100a9   0.9364502 0.9293324 0.9946467 1.0106743      3
## Tlr4     1.0503438 1.0367557 1.0205949 0.9283988      1
## Zp3      0.8793431 0.9968576 1.0090100 1.0340921      3
## Clock    1.0138973 1.0231884 1.0113094 0.9766799      2
## Trpv4    0.9513371 0.9818378 1.0161012 1.0326306      3
## Ccl26    0.9786242 0.9316714 1.0050418 1.0356912      3
## Ccl24    0.9361875 0.9389519 1.0271715 1.0666504      3
## Serpine1 0.9992854 1.0101892 0.9987391 0.9768985      2
## Il17ra   1.0549646 1.0201907 0.9829150 0.9061829      1
## Tnfrsf1a 1.0196453 1.0356289 0.9909272 0.9341439      2
## Pde2a    0.9950061 1.0437024 1.0036394 0.9912396      2
## Gprc5b   0.9359255 0.9440990 1.0133725 1.0411849      3
## Adam8    1.0053608 1.0149603 1.0010803 1.0082890      2
## Cx3cl1   0.9103365 0.9611052 0.9865766 1.0385095      3
## Tlr3     1.0712287 0.9860523 1.0268784 0.9320602      1
## Ednra    0.9760352 1.0376250 0.9664353 1.0616230      3
## Ldlr     0.9951658 1.0412401 1.0044728 0.9757518      2
## Ets1     0.9649667 0.9932523 0.9633367 1.0178563      3
## Tlr9     0.9845119 1.0433439 1.0179088 0.9331340      2
## Hyal2    0.9805355 0.9952188 0.9537844 1.0180491      3
## Ccr2     1.0781580 0.9166186 0.9737452 0.9468374      1
## Ccr5     1.0379514 1.0791485 0.9571004 0.9994896      2
## Cdk19    1.0834689 0.9774149 0.9731305 1.0011084      1
## Egfr     0.9594616 0.9865012 0.9968854 1.0231812      3
## Ccl2     0.9410630 0.9234708 0.9172620 1.0855231      3
## Ccl7     0.9377025 0.9959727 0.9830637 0.9999279      3
## Ccl11    0.9395817 1.0046657 0.9952031 1.1051408      3
## Ccl12    1.0210894 0.9428178 0.9676126 1.0958590      3
## Ccl4     0.9173600 1.0137382 0.9935696 0.8478784      2
## Stat5a   0.9984534 1.0020705 0.9927384 1.0171578      3
## Ace      0.9411769 0.9791180 0.9885292 1.0293273      3
## Tnip1    0.9967821 1.0371306 0.9868051 0.9751483      2
## Ccl1     0.9490765 1.0061028 1.0299452 1.0591937      3
## Ccl9     1.0748156 1.0255019 1.0143933 1.0110047      1
## Ccl6     0.9836699 1.0019130 0.9749013 1.0098487      3
## Ccl3     1.0027344 1.0298867 1.0158420 0.9723747      2
## Stat5b   1.0067589 1.0196183 0.9994107 0.9936060      2
## Prkca    1.0188645 0.9967514 0.9482353 0.9719247      1
## Itga2    0.9682602 0.9558409 0.9855012 1.0177017      3
## Wnt5a    0.9772947 0.9830877 0.9540308 1.0548477      3
## Il17rb   0.9341813 0.9980961 0.9743833 1.0585646      3
## Ptger4   0.9765908 1.0148698 1.0042148 0.9546091      2
## Cd47     0.9976905 0.9901815 1.0223011 1.0073487      3
## Ager     0.9438379 1.0014070 0.9860253 1.0298349      3
## Tnf      1.0131602 1.0011471 1.0135716 0.9635546      1
## Gpsm3    0.9960939 1.0271684 0.9634463 0.9898771      2
## Jak2     1.0512738 0.9872834 0.9996479 0.9795569      1
## Il33     1.0075621 0.9593182 0.9695374 1.0968678      3

按照聚類順序?qū)⒕垲?和2的數(shù)據(jù)提取出來

order_row = A$tree_row$order  #記錄熱圖的行排序
data_pos = data_pos[order_row,]   # 按照熱圖的順序几缭，重新排原始數(shù)據(jù)
data_pos = data.frame(rownames(data_pos),data_pos,check.names =T)  # 將行名加到表格數(shù)據(jù)中
data_pos =data_pos[(data_pos$clus %in% c("1","2")),]
#或者data_pos =data_pos[(data_pos$clus < 3),]
data_pos #查看數(shù)據(jù)
##          rownames.data_pos.      X12h      X24h       X3d       X7d clus_a
## Prkca                 Prkca 1.0188645 0.9967514 0.9482353 0.9719247      1
## Ccr2                   Ccr2 1.0781580 0.9166186 0.9737452 0.9468374      1
## Jak2                   Jak2 1.0512738 0.9872834 0.9996479 0.9795569      1
## Cdk19                 Cdk19 1.0834689 0.9774149 0.9731305 1.0011084      1
## Ccl9                   Ccl9 1.0748156 1.0255019 1.0143933 1.0110047      1
## Tlr4                   Tlr4 1.0503438 1.0367557 1.0205949 0.9283988      1
## Il17ra               Il17ra 1.0549646 1.0201907 0.9829150 0.9061829      1
## Tlr3                   Tlr3 1.0712287 0.9860523 1.0268784 0.9320602      1
## Tnf                     Tnf 1.0131602 1.0011471 1.0135716 0.9635546      1
## S100a8               S100a8 1.1048567 0.9885955 1.1913848 0.9661311      1
## Il1rl1               Il1rl1 1.0475045 0.9948235 1.0268230 1.0106219      1
## Pla2g4a             Pla2g4a 1.0225494 0.9872852 1.0146842 1.0025812      1
## Tnfsf4               Tnfsf4 0.9327795 1.0742431 0.9676329 1.0100173      2
## Adam8                 Adam8 1.0053608 1.0149603 1.0010803 1.0082890      2
## Ccr5                   Ccr5 1.0379514 1.0791485 0.9571004 0.9994896      2
## Gpsm3                 Gpsm3 0.9960939 1.0271684 0.9634463 0.9898771      2
## Ccl4                   Ccl4 0.9173600 1.0137382 0.9935696 0.8478784      2
## Ptger4               Ptger4 0.9765908 1.0148698 1.0042148 0.9546091      2
## Tlr9                   Tlr9 0.9845119 1.0433439 1.0179088 0.9331340      2
## Ccl3                   Ccl3 1.0027344 1.0298867 1.0158420 0.9723747      2
## Ldlr                   Ldlr 0.9951658 1.0412401 1.0044728 0.9757518      2
## Tlr2                   Tlr2 0.9925440 1.0575861 0.9961778 0.9766016      2
## Pde2a                 Pde2a 0.9950061 1.0437024 1.0036394 0.9912396      2
## Tnfrsf1a           Tnfrsf1a 1.0196453 1.0356289 0.9909272 0.9341439      2
## Clock                 Clock 1.0138973 1.0231884 1.0113094 0.9766799      2
## Serpine1           Serpine1 0.9992854 1.0101892 0.9987391 0.9768985      2
## Fabp4                 Fabp4 1.0056085 1.0287909 0.9618795 0.9613461      2
## Tnip1                 Tnip1 0.9967821 1.0371306 0.9868051 0.9751483      2
## Stat5b               Stat5b 1.0067589 1.0196183 0.9994107 0.9936060      2
data_pos_reorder=data_pos[,-6]
write.table(data_pos_reorder,file="data_pos_reorder.txt",row.names=FALSE,quote = FALSE,sep='\t')  #輸出結(jié)果河泳，按照熱圖中的順序

2. 對抑炎數(shù)據(jù)做聚類分析并提取所需信息

讀入抑炎數(shù)據(jù)并對數(shù)據(jù)做簡單處理

data_neg<-read_excel("demo3.xlsx",2)    #讀入excel數(shù)據(jù)
data_neg<-as.data.frame(data_neg) # 將data轉(zhuǎn)換為data.frame格式
rownames(data_neg)<-data_neg[,1]  #將第一列數(shù)據(jù)設(shè)置為行名
data_neg<-data_neg[,-1] #去除重復的第一列
colnames(data_neg) = gsub("_expression","",colnames(data_neg))
head(data_neg)
##               12h       24h        3d       7d
## Il10    0.8684374 0.9661305 0.9805500 1.053469
## Adora1  0.9303991 0.9886013 0.9663481 1.056422
## Il2ra   0.9800390 0.9831342 0.9467422 1.011937
## Tnfaip6 0.9267055 1.0009174 1.0224794 1.063077
## Tyro3   0.9413618 0.9496315 0.9760268 1.060393
## Gata3   0.9527483 0.9656051 0.9631235 1.045775

繪圖查看基因聚類情況

B<-pheatmap(data_neg,fontsize_col=12,fontsize_row=8,
            show_rownames=F,show_colnames = T,
            cluster_row=T,cluster_col=F,scale="row",
            treeheight_row=25,treeheight_col=25,
            border_color=NA,main="Heatmap",
            color = colorRampPalette(c("blue", "white", "red"))(50),
            cutree_rows=2)  # 畫圖

image.gif

ggsave(B,filename = "demo3_neg.pdf",width=3,height=8)
ggsave(B,filename = "demo3_neg.png",width=3,height=8)

從上圖發(fā)現(xiàn)，基因按照表達模式可分為2大類年栓，其中1類是想要的數(shù)據(jù)拆挥。
那如何提取這部分數(shù)據(jù)呢？

首先將表達數(shù)據(jù)與聚類分類信息合并

clus_b=cutree(B$tree_row,2) #獲取按照3塊某抓，得到聚類模塊信息
table(clus_b)# 每個聚類模塊的數(shù)量
## clus_b
##  1  2
## 52 30
clus_b<-as.data.frame(clus_b)
data_neg= cbind(data_neg, clus_b)
head(data_pos) #查看合并后數(shù)據(jù)
##       rownames.data_pos.     X12h      X24h       X3d       X7d clus_a
## Prkca              Prkca 1.018865 0.9967514 0.9482353 0.9719247      1
## Ccr2                Ccr2 1.078158 0.9166186 0.9737452 0.9468374      1
## Jak2                Jak2 1.051274 0.9872834 0.9996479 0.9795569      1
## Cdk19              Cdk19 1.083469 0.9774149 0.9731305 1.0011084      1
## Ccl9                Ccl9 1.074816 1.0255019 1.0143933 1.0110047      1
## Tlr4                Tlr4 1.050344 1.0367557 1.0205949 0.9283988      1

按照聚類順序?qū)⒕垲?和2的數(shù)據(jù)提取出來

order_row = B$tree_row$order  #記錄熱圖的行排序
data_neg = data_neg[order_row,]   # 按照熱圖的順序纸兔，重新排原始數(shù)據(jù)
data_neg = data.frame(rownames(data_neg),data_neg,check.names =T)  # 將行名加到表格數(shù)據(jù)中
table(data_neg$clus)
##
##  1  2
## 52 30
data_neg =data_neg[(data_neg$clus %in% c("1")),]
#或者data_neg =data_neg[(data_neg$clus < 2),]
head(data_neg)
##        rownames.data_neg.      X12h      X24h       X3d       X7d clus_b
## Chrna7             Chrna7 0.9867821 0.9545026 0.9598899 0.9981845      1
## Aoah                 Aoah 1.0066938 0.9492302 0.9263789 1.0097112      1
## Pbk                   Pbk 1.0071100 0.9945398 1.0009174 1.0369612      1
## Il2                   Il2 0.9414986 0.9277887 0.9402898 1.1531456      1
## Apoe                 Apoe 0.9638917 0.9554340 0.9667258 1.0191987      1
## Apoa1               Apoa1 0.9878721 0.9745789 0.9888851 1.0551538      1
data_neg_reorder=data_neg[,-6]
write.table(data_neg_reorder,file="data_neg_reorder.txt",row.names=FALSE,quote = FALSE,sep='\t')  #輸出結(jié)果，按照熱圖中的順序

3. 限定pos組數(shù)據(jù)的顏色否副，統(tǒng)一顏色標尺

首席清除環(huán)境汉矿，加載R包并讀入上游篩選出的促炎數(shù)據(jù)

rm(list = ls()) #清除環(huán)境內(nèi)存
library(pheatmap) #加載pheatmap包
library(readxl)  #加載readxl包
data=read.table('data_pos_reorder.txt',sep = '\t',stringsAsFactors = F,header = T)
data<-as.data.frame(data) # 將data轉(zhuǎn)換為data.frame格式
rownames(data)<-data[,1]  #將第一列數(shù)據(jù)設(shè)置為行名
data<-data[,-1] #去除重復的第一列
colnames(data) = gsub("X","",colnames(data))

限定顏范圍

ys <- c(seq(-1,-0.01,by=0.01),seq(0,1,by=0.01))#設(shè)定顏色范圍數(shù)

繪圖(促炎熱圖)

p1=pheatmap(data,fontsize_col=8,fontsize_row=7,
         show_rownames=T,show_colnames = T,
         cluster_row=T,cluster_col=F,scale="row",
         treeheight_row=25,treeheight_col=25,
         border_color=NA,main="Heatmap",
         color = c(colorRampPalette(colors = c("blue","white"))(length(ys)/2),colorRampPalette(colors = c("white","red"))(length(ys)/2)),
         legend_breaks=seq(-1,1,0.5),
         breaks=ys)

image

ggsave(p1,filename = "demo3_1.pdf",width=3,height=5)
ggsave(p1,filename = "demo3_1.png",width=3,height=5)

4. 限定neg組數(shù)據(jù)的顏色，統(tǒng)一顏色標尺

首席清除環(huán)境备禀，加載R包并讀入上游篩選出的促炎數(shù)據(jù)

rm(list = ls()) #清除環(huán)境內(nèi)存
library(pheatmap) #加載pheatmap包
library(readxl)  #加載readxl包
data=read.table('data_neg_reorder.txt',sep = '\t',stringsAsFactors = F,header = T)
data<-as.data.frame(data) # 將data轉(zhuǎn)換為data.frame格式
rownames(data)<-data[,1]  #將第一列數(shù)據(jù)設(shè)置為行名
data<-data[,-1] #去除重復的第一列
colnames(data) = gsub("X","",colnames(data))

限定顏范圍

ys <- c(seq(-1,-0.01,by=0.01),seq(0,1,by=0.01))#設(shè)定顏色范圍數(shù)

繪圖(抑炎熱圖)

p2=pheatmap(data,fontsize_col=8,fontsize_row=7,
         show_rownames=T,show_colnames = T,
         cluster_row=T,cluster_col=F,scale="row",
         treeheight_row=25,treeheight_col=25,
         border_color=NA,main="Heatmap",
         color = c(colorRampPalette(colors = c("blue","white"))(length(ys)/2),colorRampPalette(colors = c("white","red"))(length(ys)/2)),
         legend_breaks=seq(-1,1,0.5),
         breaks=ys)

image

ggsave(p2,filename = "demo3_2.pdf",width=3,height=8)
ggsave(p2,filename = "demo3_2.png",width=3,height=8)

5. 合并兩張圖片

最后可以用AI組合兩個結(jié)果洲拇，也許R包也可以，還沒有學曲尸。呻待。。成圖如下：整體體現(xiàn)出促炎反應(yīng)隨著時間減弱队腐，抑炎隨著時間增強，兩個功能呈現(xiàn)此消彼長的趨勢奏篙，符合實驗表型柴淘。

image.png

顯示運行環(huán)境

sessionInfo()
## R version 3.6.2 (2019-12-12)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 18363)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=Chinese (Simplified)_China.936
## [2] LC_CTYPE=Chinese (Simplified)_China.936
## [3] LC_MONETARY=Chinese (Simplified)_China.936
## [4] LC_NUMERIC=C
## [5] LC_TIME=Chinese (Simplified)_China.936
##
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base
##
## other attached packages:
## [1] export_0.2.2    readxl_1.3.1    pheatmap_1.0.12 ggplot2_3.2.1
##
## loaded via a namespace (and not attached):
##  [1] rgl_0.100.50            Rcpp_1.0.3              lattice_0.20-38
##  [4] tidyr_1.0.2             assertthat_0.2.1        digest_0.6.24
##  [7] mime_0.9                R6_2.4.1                cellranger_1.1.0
## [10] backports_1.1.5         evaluate_0.14           pillar_1.4.3
## [13] gdtools_0.2.1           rlang_0.4.4             lazyeval_0.2.2
## [16] uuid_0.1-4              miniUI_0.1.1.1          data.table_1.12.8
## [19] flextable_0.5.9         rmarkdown_2.1           webshot_0.5.2
## [22] stringr_1.4.0           htmlwidgets_1.5.1       munsell_0.5.0
## [25] shiny_1.4.0             broom_0.5.5             compiler_3.6.2
## [28] httpuv_1.5.2            xfun_0.12               pkgconfig_2.0.3
## [31] systemfonts_0.1.1       base64enc_0.1-3         rvg_0.2.4
## [34] htmltools_0.4.0         tidyselect_1.0.0        tibble_2.1.3
## [37] crayon_1.3.4            dplyr_0.8.4             withr_2.1.2
## [40] later_1.0.0             grid_3.6.2              nlme_3.1-142
## [43] jsonlite_1.6.1          xtable_1.8-4            gtable_0.3.0
## [46] lifecycle_0.1.0         magrittr_1.5            scales_1.1.0
## [49] zip_2.0.4               stringi_1.4.6           promises_1.1.0
## [52] xml2_1.2.2              vctrs_0.2.3             generics_0.0.2
## [55] openxlsx_4.1.4          stargazer_5.2.2         RColorBrewer_1.1-2
## [58] tools_3.6.2             manipulateWidget_0.10.1 glue_1.3.1
## [61] officer_0.3.8           purrr_0.3.3             crosstalk_1.0.0
## [64] fastmap_1.0.1           yaml_2.2.1              colorspace_1.4-1
## [67] knitr_1.28

往期回顧
R繪圖|ggplot2散點圖的繪制
 R繪圖|pheatmap熱圖繪制——基礎(chǔ)篇
 R繪圖|pheatmap熱圖繪制——中階篇

今天的內(nèi)容就到這里~~_{迫淹，更多內(nèi)容可關(guān)注公共號“YJY技能修煉”}~~