在簡書 土豆學(xué)生信 分享的內(nèi)容看到這篇論文 簡書的鏈接是 http://www.reibang.com/p/bbf9cb13b41a
論文是
image.png
論文對應(yīng)的代碼是公開的 https://github.com/ajwilk/2020_Wilk_COVID
image.png
在學(xué)習(xí)他這個代碼的時候發(fā)現(xiàn)其中自定義了一個函數(shù)可以操作熱圖的文字標(biāo)簽税娜,可以讓熱圖上只顯示我們感興趣的文字標(biāo)簽。
我在運行這個代碼的時候遇到了報錯,沒有把代碼完全運行完,但是已經(jīng)獲得和
NK.markers
這個表達(dá)量文件舱呻,部分內(nèi)容如下
image.png
我們用這個表達(dá)量文件先做一個簡單的熱圖
讀入數(shù)據(jù)
df<-read.csv("NM/NK_markers_1.csv",header=T,row.names = 1)
head(df)
最簡單的熱圖
library(pheatmap)
pdf(file = "NM/hp-1.pdf",width = 4,height = 10)
pheatmap(df,fontsize = 3)
dev.off()
image.png
我們可以看到上圖右側(cè)所有的基因名都顯示出來了,如果我們想只顯示自己感興趣的嚷炉,那該如何實現(xiàn)呢捅厂?可以用開頭提到的自定義函數(shù)
add.flag <- function(pheatmap,
kept.labels,
repel.degree) {
# repel.degree = number within [0, 1], which controls how much
# space to allocate for repelling labels.
## repel.degree = 0: spread out labels over existing range of kept labels
## repel.degree = 1: spread out labels over the full y-axis
heatmap <- pheatmap$gtable
new.label <- heatmap$grobs[[which(heatmap$layout$name == "row_names")]]
# keep only labels in kept.labels, replace the rest with ""
new.label$label <- ifelse(new.label$label %in% kept.labels,
new.label$label, "")
# calculate evenly spaced out y-axis positions
repelled.y <- function(d, d.select, k = repel.degree){
# d = vector of distances for labels
# d.select = vector of T/F for which labels are significant
# recursive function to get current label positions
# (note the unit is "npc" for all components of each distance)
strip.npc <- function(dd){
if(!"unit.arithmetic" %in% class(dd)) {
return(as.numeric(dd))
}
d1 <- strip.npc(dd$arg1)
d2 <- strip.npc(dd$arg2)
fn <- dd$fname
return(lazyeval::lazy_eval(paste(d1, fn, d2)))
}
full.range <- sapply(seq_along(d), function(i) strip.npc(d[i]))
selected.range <- sapply(seq_along(d[d.select]), function(i) strip.npc(d[d.select][i]))
return(unit(seq(from = max(selected.range) + k*(max(full.range) - max(selected.range)),
to = min(selected.range) - k*(min(selected.range) - min(full.range)),
length.out = sum(d.select)),
"npc"))
}
new.y.positions <- repelled.y(new.label$y,
d.select = new.label$label != "")
new.flag <- segmentsGrob(x0 = new.label$x,
x1 = new.label$x + unit(0.15, "npc"),
y0 = new.label$y[new.label$label != ""],
y1 = new.y.positions)
# shift position for selected labels
new.label$x <- new.label$x + unit(0.2, "npc")
new.label$y[new.label$label != ""] <- new.y.positions
# add flag to heatmap
heatmap <- gtable::gtable_add_grob(x = heatmap,
grobs = new.flag,
t = 4,
l = 4
)
# replace label positions in heatmap
heatmap$grobs[[which(heatmap$layout$name == "row_names")]] <- new.label
# plot result
grid.newpage()
grid.draw(heatmap)
# return a copy of the heatmap invisibly
invisible(heatmap)
}
將以上函數(shù)放到文本文件里,通過source()加載這個函數(shù)
source("useful_R_function/add_flag.r")
選擇感興趣的基因名爷耀,我這里就隨機選取幾個了
gene_name<-sample(rownames(df),10)
畫圖
source("useful_R_function/add_flag.r")
library(grid)
gene_name<-sample(rownames(df),10)
p1<-pheatmap(df)
add.flag(p1,
kept.labels = gene_name,
repel.degree = 0.2)
結(jié)果就變成了如下
image.png
接下來是簡單的美化
代碼
source("useful_R_function/add_flag.r")
df<-read.csv("NM/NK_markers_1.csv",header=T,row.names = 1)
head(df)
library(pheatmap)
library(grid)
gene_name<-sample(rownames(df),10)
paletteLength <- 100
mycolor<-colorRampPalette(c("blue","white","red"))(100)
mycolor
myBreaks <- unique(c(seq(min(df), 0, length.out=ceiling(paletteLength/2) + 1),
seq(max(df)/paletteLength, max(df),
length.out=floor(paletteLength/2))))
p1<-pheatmap(df,color = mycolor,breaks = myBreaks)
pdf(file = "NM/hp-2.pdf",width = 4,height = 8)
add.flag(p1,
kept.labels = gene_name,
repel.degree = 0.2)
dev.off()
image.png
這個圖和開頭提到的論文里的Figure3f就有幾分相似了甘桑,但是還沒有添加分組信息
歡迎大家關(guān)注我的公眾號
小明的數(shù)據(jù)分析筆記本
小明的數(shù)據(jù)分析筆記本 公眾號 主要分享:1、R語言和python做數(shù)據(jù)分析和數(shù)據(jù)可視化的簡單小例子歹叮;2跑杭、園藝植物相關(guān)轉(zhuǎn)錄組學(xué)、基因組學(xué)咆耿、群體遺傳學(xué)文獻閱讀筆記德谅;3、生物信息學(xué)入門學(xué)習(xí)資料及自己的學(xué)習(xí)筆記萨螺!