ComplexHeatmap復(fù)雜熱圖繪制學(xué)習(xí)——11.例子

更多示例

1.為基因表達(dá)矩陣添加更多信息

熱圖非常流行用于可視化基因表達(dá)矩陣驹止。矩陣中的行對應(yīng)于基因惭蹂，在表達(dá)熱圖之后可以添加關(guān)于這些基因的更多信息管宵。

在下面的例子中纫塌，熱圖可視化了基因的相對表達(dá)（每個基因的表達(dá)被縮放）洞翩。在右側(cè)焊虏，我們將基因的絕對表達(dá)水平作為單列熱圖港令∩度荩基因長度和基因類型（即蛋白質(zhì)編碼或lincRNA）也作為熱圖注釋或熱圖。

在熱圖的最左側(cè)顷霹，繪制了彩色矩形 anno_block()以識別 k 均值聚類中的五個聚類咪惠。在“基本均值”和“基因類型”熱圖之上，還有匯總圖（條形圖和箱形圖）顯示五個集群中數(shù)據(jù)點的統(tǒng)計數(shù)據(jù)或分布淋淀。

library(ComplexHeatmap)
library(circlize)

expr = readRDS(system.file(package = "ComplexHeatmap", "extdata", "gene_expression.rds"))
mat = as.matrix(expr[, grep("cell", colnames(expr))])
base_mean = rowMeans(mat)
mat_scaled = t(apply(mat, 1, scale))

type = gsub("s\\d+_", "", colnames(mat))
ha = HeatmapAnnotation(type = type, annotation_name_side = "left")

ht_list = Heatmap(mat_scaled, name = "expression", row_km = 5, 
    col = colorRamp2(c(-2, 0, 2), c("green", "white", "red")),
    top_annotation = ha, 
    show_column_names = FALSE, row_title = NULL, show_row_dend = FALSE) +
Heatmap(base_mean, name = "base mean", 
    top_annotation = HeatmapAnnotation(summary = anno_summary(gp = gpar(fill = 2:6), 
        height = unit(2, "cm"))),
    width = unit(15, "mm")) +
rowAnnotation(length = anno_points(expr$length, pch = 16, size = unit(1, "mm"), 
    axis_param = list(at = c(0, 2e5, 4e5, 6e5), 
        labels = c("0kb", "200kb", "400kb", "600kb")),
    width = unit(2, "cm"))) +
Heatmap(expr$type, name = "gene type", 
    top_annotation = HeatmapAnnotation(summary = anno_summary(height = unit(2, "cm"))),
    width = unit(15, "mm"))

ht_list = rowAnnotation(block = anno_block(gp = gpar(fill = 2:6, col = NA)), 
    width = unit(2, "mm")) + ht_list

draw(ht_list)

image

2.麻疹疫苗熱圖

以下代碼重現(xiàn)了此處和此處介紹的熱圖遥昧。

mat = readRDS(system.file("extdata", "measles.rds", package = "ComplexHeatmap"))
ha1 = HeatmapAnnotation(
    dist1 = anno_barplot(
        colSums(mat), 
        bar_width = 1, 
        gp = gpar(col = "white", fill = "#FFE200"), 
        border = FALSE,
        axis_param = list(at = c(0, 2e5, 4e5, 6e5, 8e5),
            labels = c("0", "200k", "400k", "600k", "800k")),
        height = unit(2, "cm")
    ), show_annotation_name = FALSE)
ha2 = rowAnnotation(
    dist2 = anno_barplot(
        rowSums(mat), 
        bar_width = 1, 
        gp = gpar(col = "white", fill = "#FFE200"), 
        border = FALSE,
        axis_param = list(at = c(0, 5e5, 1e6, 1.5e6),
            labels = c("0", "500k", "1m", "1.5m")),
        width = unit(2, "cm")
    ), show_annotation_name = FALSE)
year_text = as.numeric(colnames(mat))
year_text[year_text %% 10 != 0] = ""
ha_column = HeatmapAnnotation(
    year = anno_text(year_text, rot = 0, location = unit(1, "npc"), just = "top")
)
col_fun = colorRamp2(c(0, 800, 1000, 127000), c("white", "cornflowerblue", "yellow", "red"))
ht_list = Heatmap(mat, name = "cases", col = col_fun,
    cluster_columns = FALSE, show_row_dend = FALSE, rect_gp = gpar(col= "white"), 
    show_column_names = FALSE,
    row_names_side = "left", row_names_gp = gpar(fontsize = 8),
    column_title = 'Measles cases in US states 1930-2001\nVaccine introduced 1961',
    top_annotation = ha1, bottom_annotation = ha_column,
    heatmap_legend_param = list(at = c(0, 5e4, 1e5, 1.5e5), 
        labels = c("0", "50k", "100k", "150k"))) + ha2
draw(ht_list, ht_gap = unit(3, "mm"))
decorate_heatmap_body("cases", {
    i = which(colnames(mat) == "1961")
    x = i/ncol(mat)
    grid.lines(c(x, x), c(0, 1), gp = gpar(lwd = 2, lty = 2))
    grid.text("Vaccine introduced", x, unit(1, "npc") + unit(5, "mm"))
})

image

3.從單細(xì)胞 RNASeq 中可視化細(xì)胞異質(zhì)性

在這個例子中，小鼠 T 細(xì)胞的單細(xì)胞 RNA-Seq 數(shù)據(jù)被可視化以顯示細(xì)胞的異質(zhì)性朵纷。 ( mouse_scRNAseq_corrected.txt) 數(shù)據(jù)來自Buettner 等人炭臭，2015 年，補充數(shù)據(jù) 1mouse_scRNAseq_corrected.txt袍辞，表“細(xì)胞周期校正基因表達(dá)”鞋仍。你可以到這里。

在以下代碼中搅吁，刪除了重復(fù)的基因凿试。

expr = read.table("data/mouse_scRNAseq_corrected.txt", sep = "\t", header = TRUE)
expr = expr[!duplicated(expr[[1]]), ]
rownames(expr) = expr[[1]]
expr = expr[-1]
expr = as.matrix(expr)

過濾掉一半以上細(xì)胞中未表達(dá)的基因。

expr = expr[apply(expr, 1, function(x) sum(x > 0)/length(x) > 0.5), , drop = FALSE]

該get_correlated_variable_rows()函數(shù)在這里被定義似芝。它提取在細(xì)胞之間可變表達(dá)與其他基因相關(guān)的特征基因那婉。

get_correlated_variable_genes = function(mat, n = nrow(mat), cor_cutoff = 0, n_cutoff = 0) {
    ind = order(apply(mat, 1, function(x) {
            q = quantile(x, c(0.1, 0.9))
            x = x[x < q[1] & x > q[2]]
            var(x)/mean(x)
        }), decreasing = TRUE)[1:n]
    mat2 = mat[ind, , drop = FALSE]
    dt = cor(t(mat2), method = "spearman")
    diag(dt) = 0
    dt[abs(dt) < cor_cutoff] = 0
    dt[dt < 0] = -1
    dt[dt > 0] = 1

    i = colSums(abs(dt)) > n_cutoff

    mat3 = mat2[i, ,drop = FALSE]
    return(mat3)
}

特征基因被定義為一個基因列表，其中每個基因的絕對相關(guān)性超過 0.5 并且有 20 多個相關(guān)基因党瓮。

mat2包含按基因縮放的表達(dá)值详炬，這意味著它包含每個基因跨細(xì)胞的相對表達(dá)。由于單細(xì)胞 RNASeq 數(shù)據(jù)高度可變且異常值頻繁出現(xiàn)寞奸，基因表達(dá)僅在第10和第90分位數(shù)內(nèi)縮放呛谜。

mat = get_correlated_variable_genes(expr, cor_cutoff = 0.5, n_cutoff = 20)
mat2 = t(apply(mat, 1, function(x) {
    q10 = quantile(x, 0.1)
    q90 = quantile(x, 0.9)
    x[x < q10] = q10
    x[x > q90] = q90
    scale(x)
}))
colnames(mat2) = colnames(mat)

加載細(xì)胞周期基因和核糖核蛋白基因。細(xì)胞周期基因列表來自Buettner et al., 2015 , 補充表 1, “ Union of Cyclebase and GO gene”枪萄。核糖核蛋白基因來自 GO:0030529隐岛。基因列表存儲在 mouse_cell_cycle_gene.rds和mouse_ribonucleoprotein.rds中瓷翻。這兩個文件可以在這里和這里找到聚凹。

cc = readRDS("data/mouse_cell_cycle_gene.rds")
ccl = rownames(mat) %in% cc
cc_gene = rownames(mat)[ccl]

rp = readRDS("data/mouse_ribonucleoprotein.rds")
rpl = rownames(mat) %in% rp

由于縮放每個基因的表達(dá)值割坠，一個基因相對于其他基因的表達(dá)水平已經(jīng)丟失，我們將基本平均值計算為所有樣本中基因的平均表達(dá)妒牙”撕撸基本平均值可用于比較基因之間的表達(dá)水平。

base_mean = rowMeans(mat)

現(xiàn)在可以使用以下信息：

縮放表達(dá)式, mat2,
基本均值, base_mean,
基因是否為核糖核蛋白基因rpl湘今，
基因是否是細(xì)胞周期基因ccl敢朱，
細(xì)胞周期基因的符號, cc_gene,

在下一步中，我們可以將信息放在一起并將其可視化為熱圖列表摩瞎。最后添加了一個基因-基因相關(guān)熱圖拴签，并定義為 main_heatmap，這意味著所有熱圖/行注釋的行順序都基于此相關(guān)矩陣的聚類旗们。

對于表達(dá)水平相對較高的細(xì)胞周期基因（大于所有基因的 25% 分位數(shù)）蚓哩，基因名稱以文本標(biāo)簽表示。在第一個熱圖中蚪拦，列樹狀圖下面有兩種不同的顏色杖剪，這些顏色基于通過層次聚類得出的兩個主要組冻押，以突出顯示兩個亞群驰贷。

library(GetoptLong)
ht_list = Heatmap(mat2, col = colorRamp2(c(-1.5, 0, 1.5), c("blue", "white", "red")), 
    name = "scaled_expr", column_title = qq("relative expression for @{nrow(mat)} genes"),
    show_column_names = FALSE, width = unit(8, "cm"),
    heatmap_legend_param = list(title = "Scaled expr")) +
    Heatmap(base_mean, name = "base_expr", width = unit(5, "mm"),
        heatmap_legend_param = list(title = "Base expr")) +
    Heatmap(rpl + 0, name = "ribonucleoprotein", col = c("0" = "white", "1" = "purple"), 
        show_heatmap_legend = FALSE, width = unit(5, "mm")) +
    Heatmap(ccl + 0, name = "cell_cycle", col = c("0" = "white", "1" = "red"), 
        show_heatmap_legend = FALSE, width = unit(5, "mm")) +
    rowAnnotation(link = anno_mark(at = which(ccl & base_mean > quantile(base_mean, 0.25)), 
        labels = rownames(mat)[ccl & base_mean > quantile(base_mean, 0.25)], 
        labels_gp = gpar(fontsize = 10), padding = unit(1, "mm"))) +
    Heatmap(cor(t(mat2)), name = "cor", 
        col = colorRamp2(c(-1, 0, 1), c("green", "white", "red")), 
        show_row_names = FALSE, show_column_names = FALSE, row_dend_side = "right", 
        show_column_dend = FALSE, column_title = "pairwise correlation between genes",
        heatmap_legend_param = list(title = "Correlation"))
ht_list = draw(ht_list, main_heatmap = "cor")
decorate_column_dend("scaled_expr", {
    tree = column_dend(ht_list)$scaled_expr
    ind = cutree(as.hclust(tree), k = 2)[order.dendrogram(tree)]

    first_index = function(l) which(l)[1]
    last_index = function(l) { x = which(l); x[length(x)] }
    x1 = c(first_index(ind == 1), first_index(ind == 2)) - 1
    x2 = c(last_index(ind == 1), last_index(ind == 2))
    grid.rect(x = x1/length(ind), width = (x2 - x1)/length(ind), just = "left",
        default.units = "npc", gp = gpar(fill = c("#FF000040", "#00FF0040"), col = NA))
})

image

熱圖清楚地表明細(xì)胞被分成兩個亞群。第一個熱圖中左側(cè)的群體表現(xiàn)出細(xì)胞周期基因子集的高表達(dá)（細(xì)胞周期基因在“ cell_cycle ”熱圖中表示）洛巢。然而括袒，這些基因的整體表達(dá)水平相對較低（參見“ base_expr ”熱圖）。右側(cè)的群體在其他特征基因中具有更高的表達(dá)稿茉。有趣的是锹锰，在該亞群中高表達(dá)的特征基因富含編碼核糖核蛋白的基因（參見“核糖核蛋白”熱圖）。核糖核蛋白基因的一個子集顯示出強烈的共表達(dá)（參見相關(guān)熱圖）和整體高表達(dá)水平（“ base_expr”熱圖）漓库。

4.甲基化恃慧、表達(dá)值和其他基因組特征之間的相關(guān)性

在以下示例中，數(shù)據(jù)是根據(jù)未發(fā)表分析中發(fā)現(xiàn)的模式隨機生成的渺蒿。首先我們加載數(shù)據(jù)痢士。meth.rds可以在這里找到。

res_list = readRDS("data/meth.rds")
type = res_list$type
mat_meth = res_list$mat_meth
mat_expr = res_list$mat_expr
direction = res_list$direction
cor_pvalue = res_list$cor_pvalue
gene_type = res_list$gene_type
anno_gene = res_list$anno_gene
dist = res_list$dist
anno_enhancer = res_list$anno_enhancer

不同的信息來源和相應(yīng)的變量是：

type：顯示樣本是腫瘤還是正常的標(biāo)簽茂装。
mat_meth：一個矩陣怠蹂，其中的行對應(yīng)于差異甲基化區(qū)域 (DMR)。矩陣中的值是每個樣品中 DMR 中的平均甲基化水平少态。
mat_expr：一個矩陣城侧，其中的行對應(yīng)于與 DMR 相關(guān)的基因（即最接近 DMR 的基因）。矩陣中的值是每個樣本中每個基因的表達(dá)水平彼妻。表達(dá)針對樣本中的每個基因進(jìn)行縮放嫌佑。
direction：甲基化變化的方向（hyper 表示腫瘤樣本中較高的甲基化豆茫，hypo 表示腫瘤樣本中較低的甲基化）。
cor_pvalue：甲基化和相關(guān)基因表達(dá)之間相關(guān)性檢驗的 p 值歧强。
gene_type：基因的類型（例如蛋白質(zhì)編碼基因或lincRNA）澜薄。
anno_gene: 基因模型的注釋（基因間、基因內(nèi)或 TSS）摊册。
dist：相關(guān)基因的 DMR 到 TSS 的距離肤京。
anno_enhancer：與增強子重疊的 DMR 的分?jǐn)?shù)。

數(shù)據(jù)僅包括相關(guān)基因的甲基化和表達(dá)呈負(fù)相關(guān)的 DMR茅特。

首先計算甲基化矩陣的列聚類忘分，以便可以將表達(dá)矩陣中的列調(diào)整為與甲基化矩陣中的列順序相同。

column_tree = hclust(dist(t(mat_meth)))
column_order = column_tree$order

library(RColorBrewer)
meth_col_fun = colorRamp2(c(0, 0.5, 1), c("blue", "white", "red"))
direction_col = c("hyper" = "red", "hypo" = "blue")
expr_col_fun = colorRamp2(c(-2, 0, 2), c("green", "white", "red"))
pvalue_col_fun = colorRamp2(c(0, 2, 4), c("white", "white", "red"))
gene_type_col = structure(brewer.pal(length(unique(gene_type)), "Set3"), 
    names = unique(gene_type))
anno_gene_col = structure(brewer.pal(length(unique(anno_gene)), "Set1"), 
    names = unique(anno_gene))
dist_col_fun = colorRamp2(c(0, 10000), c("black", "white"))
enhancer_col_fun = colorRamp2(c(0, 1), c("white", "orange"))

我們首先定義兩列注釋白修，然后制作復(fù)雜的熱圖妒峦。

ht_opt(
    legend_title_gp = gpar(fontsize = 8, fontface = "bold"), 
    legend_labels_gp = gpar(fontsize = 8), 
    heatmap_column_names_gp = gpar(fontsize = 8),
    heatmap_column_title_gp = gpar(fontsize = 10),
    heatmap_row_title_gp = gpar(fontsize = 8)
)

ha = HeatmapAnnotation(type = type, 
    col = list(type = c("Tumor" = "pink", "Control" = "royalblue")),
    annotation_name_side = "left")
ha2 = HeatmapAnnotation(type = type, 
    col = list(type = c("Tumor" = "pink", "Control" = "royalblue")), 
    show_legend = FALSE)

ht_list = Heatmap(mat_meth, name = "methylation", col = meth_col_fun,
    column_order= column_order,
    top_annotation = ha, column_title = "Methylation") +
    Heatmap(direction, name = "direction", col = direction_col) +
    Heatmap(mat_expr[, column_tree$order], name = "expression", 
        col = expr_col_fun, 
        column_order = column_order, 
        top_annotation = ha2, column_title = "Expression") +
    Heatmap(cor_pvalue, name = "-log10(cor_p)", col = pvalue_col_fun) +
    Heatmap(gene_type, name = "gene type", col = gene_type_col) +
    Heatmap(anno_gene, name = "anno_gene", col = anno_gene_col) +
    Heatmap(dist, name = "dist_tss", col = dist_col_fun) +
    Heatmap(anno_enhancer, name = "anno_enhancer", col = enhancer_col_fun, 
        cluster_columns = FALSE, column_title = "Enhancer")

draw(ht_list, row_km = 2, row_split = direction,
    column_title = "Comprehensive correspondence between methylation, expression and other genomic features", 
    column_title_gp = gpar(fontsize = 12, fontface = "bold"), 
    merge_legends = TRUE, heatmap_legend_side = "bottom")

image

ht_opt(RESET = TRUE)

復(fù)雜的熱圖顯示高度甲基化的 DMR 富含基因間和基因內(nèi)區(qū)域，很少與增強子重疊兵睛。相比之下肯骇，低甲基化的 DMR 富含轉(zhuǎn)錄起始位點 (TSS) 和增強子。

5. 使用復(fù)雜注釋可視化甲基化概況

在這個例子中祖很，Strum et al., 2012 中的圖 1 重新實現(xiàn)了一些調(diào)整笛丙。

需要先加載包。

library(matrixStats)
library(GenomicRanges)

甲基化圖譜可以從GEO 數(shù)據(jù)庫下載假颇。該GEOquery 包用于從GEO檢索數(shù)據(jù)胚鸯。

library(GEOquery)
gset = getGEO("GSE36278")

已通過 Illumina HumanMethylation450 BeadChip 陣列測量甲基化圖譜。我們通過IlluminaHumanMethylation450kanno.ilmn12.hg19包加載探針數(shù)據(jù)笨鸡。

調(diào)整矩陣中的行名稱與探針相同姜钳。

library(IlluminaHumanMethylation450kanno.ilmn12.hg19)
data(Locations)

mat = exprs(gset[[1]])
colnames(mat) = phenoData(gset[[1]])@data$title
mat = mat[rownames(Locations), ]

probe包含探針的位置以及 CpG 位點是否與 SNP 重疊的信息。在這里形耗，我們?nèi)コ诵匀旧w上的探針和SNP 重疊的探針哥桥。

data(SNPs.137CommonSingle)
data(Islands.UCSC)
l = Locations$chr %in% paste0("chr", 1:22) & is.na(SNPs.137CommonSingle$Probe_rs)
mat = mat[l, ]

相應(yīng)地獲取探針位置的子集和對 CpG 島的注釋。

cgi = Islands.UCSC$Relation_to_Island[l]
loc = Locations[l, ]

將基質(zhì)分為腫瘤樣品基質(zhì)和正常樣品基質(zhì)激涤。還要修改腫瘤樣本的列名以和表型數(shù)據(jù)一致拟糕。

mat1 = as.matrix(mat[, grep("GBM", colnames(mat))])   # tumor samples
mat2 = as.matrix(mat[, grep("CTRL", colnames(mat))])  # normal samples
colnames(mat1) = gsub("GBM", "dkfz", colnames(mat1))

表型數(shù)據(jù)來自Sturm 等人，2012 年昔期，補充表 S1已卸，可在此處找到。

表型數(shù)據(jù)的行被調(diào)整為與甲基化矩陣的列相同硼一。

phenotype = read.table("data/450K_annotation.txt", header = TRUE, sep = "\t", 
    row.names = 1, check.names = FALSE, comment.char = "", stringsAsFactors = FALSE)
phenotype = phenotype[colnames(mat1), ]

請注意累澡，我們僅使用來自 DKFZ 的 136 個樣本，而在Sturm et al., 2012 中般贼，使用了額外的 74 個 TCGA 樣本愧哟。

提取腫瘤樣本中甲基化變化最大的前 8000 個探針奥吩，并相應(yīng)地對其他信息進(jìn)行子集化。

ind = order(rowVars(mat1, na.rm = TRUE), decreasing = TRUE)[1:8000]
m1 = mat1[ind, ]
m2 = mat2[ind, ]
cgi2 = cgi[ind]
cgi2 = ifelse(grepl("Shore", cgi2), "Shore", cgi2)
cgi2 = ifelse(grepl("Shelf", cgi2), "Shelf", cgi2)
loc = loc[ind, ]

對于每個探頭蕊梧，找到到最近的 TSS 的距離霞赫。pc_tx_tss.bed包含來自蛋白質(zhì)編碼基因的 TSS 位置。

gr = GRanges(loc[, 1], ranges = IRanges(loc[, 2], loc[, 2]+1))
tss = read.table("data/pc_tx_tss.bed", stringsAsFactors = FALSE)
tss = GRanges(tss[[1]], ranges = IRanges(tss[, 2], tss[, 3]))

tss_dist = distanceToNearest(gr, tss)
tss_dist = tss_dist@elementMetadata$distance

因為NA矩陣中有一些( sum(is.na(m1))/length(m1)= 0.0011967) 會破壞cor()函數(shù)肥矢，所以我們替換NA為中間甲基化 (0.5)端衰。請注意，盡管ComplexHeatmap允許NA在矩陣中甘改，但刪除NA會加速聚類旅东。

m1[is.na(m1)] = 0.5
m2[is.na(m2)] = 0.5

以下注釋將添加到甲基化矩陣的列中：

年齡
DKFZ 亞型分類
TCGA 亞型分類
基于表達(dá)譜的 TCGA 亞型分類
IDH1突變
H3F3A突變
TP53突變
chr7增益
chr10 損失
CDKN2A 缺失
EGFR擴(kuò)增
PDGFRA 擴(kuò)增

在下面的代碼中，我們在ha變量中定義了列注釋十艾。我們還自定義注釋的顏色抵代、圖例和高度。

mutation_col = structure(names = c("MUT", "WT", "G34R", "G34V", "K27M"), 
    c("black", "white", "#4DAF4A", "#4DAF4A", "#377EB8"))
cnv_col = c("gain" = "#E41A1C", "loss" = "#377EB8", "amp" = "#E41A1C", 
    "del" = "#377EB8", "normal" = "white")
ha = HeatmapAnnotation(
    age = anno_points(phenotype[[13]], 
        gp = gpar(col = ifelse(phenotype[[13]] > 20, "black", "red")), 
        height = unit(3, "cm")),
    dkfz_cluster = phenotype[[1]],
    tcga_cluster = phenotype[[2]],
    tcga_expr = phenotype[[3]],
    IDH1 = phenotype[[5]],
    H3F3A = phenotype[[4]],
    TP53 = phenotype[[6]],
    chr7_gain = ifelse(phenotype[[7]] == 1, "gain", "normal"),
    chr10_loss = ifelse(phenotype[[8]] == 1, "loss", "normal"),
    CDKN2A_del = ifelse(phenotype[[9]] == 1, "del", "normal"),
    EGFR_amp = ifelse(phenotype[[10]] == 1, "amp", "normal"),
    PDGFRA_amp = ifelse(phenotype[[11]] == 1, "amp", "normal"),
    col = list(dkfz_cluster = structure(names = c("IDH", "K27", "G34", "RTK I PDGFRA", 
            "Mesenchymal", "RTK II Classic"), brewer.pal(6, "Set1")),
        tcga_cluster = structure(names = c("G-CIMP+", "Cluster #2", "Cluster #3"), 
            brewer.pal(3, "Set1")),
        tcga_expr = structure(names = c("Proneural", "Classical", "Mesenchymal"), 
            c("#377EB8", "#FFFF33", "#FF7F00")),
        IDH1 = mutation_col,
        H3F3A = mutation_col,
        TP53 = mutation_col,
        chr7_gain = cnv_col,
        chr10_loss = cnv_col,
        CDKN2A_del = cnv_col,
        EGFR_amp = cnv_col,
        PDGFRA_amp = cnv_col),
    na_col = "grey", border = TRUE,
    show_legend = c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE),
    show_annotation_name = FALSE,
    annotation_legend_param = list(
        dkfz_cluster = list(title = "DKFZ Methylation"),
        tcga_cluster = list(title = "TCGA Methylation"),
        tcga_expr = list(title = "TCGA Expression"),
        H3F3A = list(title = "Mutations"),
        chr7_gain = list(title = "CNV"))
)

在最后的圖中忘嫉，添加了四個熱圖荤牍。從左到右，有

腫瘤樣本中甲基化的熱圖
正常樣本中的甲基化
到最近的 TSS 的距離
CpG 島 (CGI) 注釋庆冕。

熱圖根據(jù) CGI 注釋按行拆分康吵。

繪制熱圖后，decorate_*()函數(shù)會添加附加圖形愧杯，例如注釋標(biāo)簽涎才。

col_fun = colorRamp2(c(0, 0.5, 1), c("#377EB8", "white", "#E41A1C"))
ht_list = Heatmap(m1, col = col_fun, name = "Methylation",
    clustering_distance_columns = "spearman",
    show_row_dend = FALSE, show_column_dend = FALSE,
    show_column_names = FALSE,
    bottom_annotation = ha, column_title = qq("GBM samples (n = @{ncol(m1)})"),
    row_split = factor(cgi2, levels = c("Island", "Shore", "Shelf", "OpenSea")), 
    row_title_gp = gpar(col = "#FFFFFF00")) + 
Heatmap(m2, col = col_fun, show_column_names = FALSE, 
    show_column_dend = FALSE, column_title = "Controls",
    show_heatmap_legend = FALSE, width = unit(1, "cm")) +
Heatmap(tss_dist, name = "tss_dist", col = colorRamp2(c(0, 2e5), c("white", "black")), 
    width = unit(5, "mm"),
    heatmap_legend_param = list(at = c(0, 1e5, 2e5), labels = c("0kb", "100kb", "200kb"))) + 
Heatmap(cgi2, name = "CGI", show_row_names = FALSE, width = unit(5, "mm"),
    col = structure(names = c("Island", "Shore", "Shelf", "OpenSea"), c("red", "blue", "green", "#CCCCCC")))
draw(ht_list, row_title = paste0("DNA methylation probes (n = ", nrow(m1), ")"),
    annotation_legend_side = "left", heatmap_legend_side = "left")

annotation_titles = c(dkfz_cluster = "DKFZ Methylation",
    tcga_cluster = "TCGA Methylation",
    tcga_expr = "TCGA Expression",
    IDH1 = "IDH1",
    H3F3A = "H3F3A",
    TP53 = "TP53",
    chr7_gain = "Chr7 gain",
    chr10_loss = "Chr10 loss",
    CDKN2A_del = "Chr10 loss",
    EGFR_amp = "EGFR amp",
    PDGFRA_amp = "PDGFRA amp")
for(an in names(annotation_titles)) {
    decorate_annotation(an, {
        grid.text(annotation_titles[an], unit(-2, "mm"), just = "right")
        grid.rect(gp = gpar(fill = NA, col = "black"))
    })
}
decorate_annotation("age", {
    grid.text("Age", unit(8, "mm"), just = "right")
    grid.rect(gp = gpar(fill = NA, col = "black"))
    grid.lines(unit(c(0, 1), "npc"), unit(c(20, 20), "native"), gp = gpar(lty = 2))
})
decorate_annotation("IDH1", {
    grid.lines(unit(c(-40, 0), "mm"), unit(c(1, 1), "npc"))
})
decorate_annotation("chr7_gain", {
    grid.lines(unit(c(-40, 0), "mm"), unit(c(1, 1), "npc"))
})

image

6. 添加多個箱線圖到單行

注釋函數(shù)anno_boxplot()只為一行繪制一個箱線圖鞋既。當(dāng)多個熱圖連接在一起力九，或者已經(jīng)定義了列的組時，對于每一行邑闺，我們要在熱圖或列組之間進(jìn)行比較跌前，因此，需要為每一行繪制多個箱線圖陡舅。

在以下示例中抵乓，我們演示了如何實現(xiàn)為單行繪制多個箱線圖的注釋函數(shù)。該grid.boxplot()函數(shù)來自ComplexHeatmap包靶衍，它可以很容易地在網(wǎng)格系統(tǒng)下繪制箱線圖灾炭。

m1 = matrix(sort(rnorm(100)), 10, byrow = TRUE)
m2 = matrix(sort(rnorm(100), decreasing = TRUE), 10, byrow = TRUE)

ht_list = Heatmap(m1, name = "m1") + Heatmap(m2, name = "m2")

rg = range(c(m1, m2))
rg[1] = rg[1] - (rg[2] - rg[1])* 0.02
rg[2] = rg[2] + (rg[2] - rg[1])* 0.02
anno_multiple_boxplot = function(index) {
    nr = length(index)
    pushViewport(viewport(xscale = rg, yscale = c(0.5, nr + 0.5)))
    for(i in seq_along(index)) {
        grid.rect(y = nr-i+1, height = 1, default.units = "native")
        grid.boxplot(m1[ index[i], ], pos = nr-i+1 + 0.2, box_width = 0.3, 
            gp = gpar(fill = "red"), direction = "horizontal")
        grid.boxplot(m2[ index[i], ], pos = nr-i+1 - 0.2, box_width = 0.3, 
            gp = gpar(fill = "green"), direction = "horizontal")
    }
    grid.xaxis()
    popViewport()
}

ht_list = ht_list + rowAnnotation(boxplot = anno_multiple_boxplot, width = unit(4, "cm"), 
    show_annotation_name = FALSE)
lgd = Legend(labels = c("m1", "m2"), title = "boxplots",
    legend_gp = gpar(fill = c("red", "green")))
draw(ht_list, padding = unit(c(20, 2, 2, 2), "mm"), heatmap_legend_list = list(lgd))

image

其他技巧

參考鏈接

1.為不同維度的不同熱圖設(shè)置相同的單元格大小

假設(shè)您有一個 heatmaps/oncoPrints 列表要保存為不同的文件（例如 png 或 pdf ），您可能想要做的一件事是使熱圖中每個網(wǎng)格/單元格的大小在熱圖中相同颅眶，因此蜈出，您需要根據(jù)熱圖中的行數(shù)或列數(shù)計算png/pdf文件的大小。在ComplexHeatmap生成的熱圖中涛酗，所有熱圖組件都具有絕對大小铡原，并且只有熱圖主體的大型迪谩（或單元格的大小）是可更改的（或者換句話說燕刻，如果您更改最終圖形設(shè)備的大小只泼，例如通過拖動圖形窗口，如果您繪制時卵洗，僅調(diào)整熱圖主體的大星氤），這意味著整個圖的大小與熱圖中的行數(shù)或列數(shù)呈線性相關(guān)过蹂。這意味著我們實際上可以擬合一個線性模型y = a*x + b籍滴，其中y是整個圖的高度，x是行數(shù)榴啸。

在下面的例子中孽惰，我們簡單地演示了如何在熱圖中建立繪圖高度和行數(shù)之間的關(guān)系。我們首先定義一個函數(shù)鸥印，該函數(shù)生成一個具有特定行數(shù)的 10 列矩陣勋功。請注意，矩陣中的值在本演示中并不重要库说。

random_mat = function(nr) {
    m = matrix(rnorm(10*nr), nc = 10)
    colnames(m) = letters[1:10]
    return(m)
}

由于是絕對線性的關(guān)系狂鞋，我們只需要測試兩個具有不同行數(shù)的熱圖，其中單行的高度為unit(5, "mm")潜的。在熱圖中骚揍，還有列標(biāo)題、列樹狀圖啰挪、列注釋和列名信不。

下面的代碼有幾點需要注意：

熱圖對象應(yīng)該由draw()返回，因為熱圖的布局是在執(zhí)行后才計算的draw()亡呵。
component_height()返回一個單位向量抽活，這些單位對應(yīng)于熱圖中從上到下的所有熱圖組件的高度。（component_width()返回?zé)釄D組件的寬度）锰什。
在計算時ht_height下硕，我們添加unit(4, "mm")，因為在最終圖的頂部和底部汁胆，有2mm白色邊框梭姓。
ht_height需要轉(zhuǎn)換為cm或inch單位。

在下文中嫩码，y包含以inch單位測量的值誉尖。

y = NULL
for(nr in c(10, 20)) {
    ht = draw(Heatmap(random_mat(nr), height = unit(5, "mm")*nr, 
        column_title = "foo", # one line text
        top_annotation = HeatmapAnnotation(bar = 1:10)))
    ht_height = sum(component_height(ht)) + unit(4, "mm")
    ht_height = convertHeight(ht_height, "inch", valueOnly = TRUE)
    y = c(y, ht_height)
}

然后我們可以擬合y和行數(shù)之間的線性關(guān)系：

x = c(10, 20)
lm(y ~ x)

## 
## Call:
## lm(formula = y ~ x)
## 
## Coefficients:
## (Intercept)            x  
##      1.2222       0.1969

這意味著行數(shù)x和繪圖高度y之間的關(guān)系是：y = 0.1969*x + 1.3150。

您可以通過以下代碼測試不同行的熱圖的單行高度是否相同谢谦。請注意释牺，所有熱圖配置都應(yīng)與您準(zhǔn)備的y相同萝衩。

for(nr in c(10, 20)) {
    png(paste0("test_heatmap_nr_", nr, ".png"), width = 5, height = 0.1969*nr + 1.3150, 
        units = "in", res = 100)
    draw(Heatmap(random_mat(nr), height = unit(5, "mm")*nr, 
        column_title = "foo", # column title can be any one-line string
        top_annotation = HeatmapAnnotation(bar = 1:10)))
    dev.off()
}

image

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者

人面猴
序言：七十年代末，一起剝皮案震驚了整個濱河市没咙，隨后出現(xiàn)的幾起案子猩谊，更是在濱河造成了極大的恐慌，老刑警劉巖祭刚，帶你破解...
沈念sama閱讀 206,126評論 6贊 481
死咒
序言：濱河連續(xù)發(fā)生了三起死亡事件牌捷，死亡現(xiàn)場離奇詭異，居然都是意外死亡涡驮，警方通過查閱死者的電腦和手機暗甥，發(fā)現(xiàn)死者居然都...
沈念sama閱讀 88,254評論 2贊 382
救了他兩次的神仙讓他今天三更去死
文/潘曉璐我一進(jìn)店門，熙熙樓的掌柜王于貴愁眉苦臉地迎上來捉捅，“玉大人撤防，你說我怎么就攤上這事“艨冢” “怎么了寄月？”我有些...
開封第一講書人閱讀 152,445評論 0贊 341
道士緝兇錄：失蹤的賣姜人
文/不壞的土叔我叫張陵，是天一觀的道長无牵。經(jīng)常有香客問我漾肮，道長，這世上最難降的妖魔是什么茎毁？我笑而不...
開封第一講書人閱讀 55,185評論 1贊 278
?港島之戀（遺憾婚禮）
正文為了忘掉前任克懊，我火速辦了婚禮，結(jié)果婚禮上七蜘，老公的妹妹穿的比我還像新娘谭溉。我一直安慰自己，他們只是感情好崔梗，可當(dāng)我...
茶點故事閱讀 64,178評論 5贊 371
惡毒庶女頂嫁案：這布局不是一般人想出來的
文/花漫我一把揭開白布夜只。她就那樣靜靜地躺著垒在，像睡著了一般蒜魄。火紅的嫁衣襯著肌膚如雪。梳的紋絲不亂的頭發(fā)上场躯，一...
開封第一講書人閱讀 48,970評論 1贊 284
城市分裂傳說
那天谈为，我揣著相機與錄音，去河邊找鬼踢关。笑死伞鲫，一個胖子當(dāng)著我的面吹牛，可吹牛的內(nèi)容都是我干的签舞。我是一名探鬼主播秕脓，決...
沈念sama閱讀 38,276評論 3贊 399
雙鴛鴦連環(huán)套：你想象不到人心有多黑
文/蒼蘭香墨我猛地睜開眼柒瓣，長吁一口氣：“原來是場噩夢啊……” “哼！你這毒婦竟也來了吠架？” 一聲冷哼從身側(cè)響起芙贫，我...
開封第一講書人閱讀 36,927評論 0贊 259
萬榮殺人案實錄
序言：老撾萬榮一對情侶失蹤，失蹤者是張志新（化名）和其女友劉穎傍药，沒想到半個月后磺平，有當(dāng)?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體，經(jīng)...
沈念sama閱讀 43,400評論 1贊 300
?護(hù)林員之死
正文獨居荒郊野嶺守林人離奇死亡拐辽，尸身上長有42處帶血的膿包…… 初始之章·張勛以下內(nèi)容為張勛視角年9月15日...
茶點故事閱讀 35,883評論 2贊 323
?白月光啟示錄
正文我和宋清朗相戀三年拣挪，在試婚紗的時候發(fā)現(xiàn)自己被綠了。大學(xué)時的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片俱诸。...
茶點故事閱讀 37,997評論 1贊 333
活死人
序言：一個原本活蹦亂跳的男人離奇死亡菠劝，死狀恐怖，靈堂內(nèi)的尸體忽然破棺而出睁搭，到底是詐尸還是另有隱情闸英，我是刑警寧澤，帶...
沈念sama閱讀 33,646評論 4贊 322
?日本核電站爆炸內(nèi)幕
正文年R本政府宣布介袜，位于F島的核電站甫何，受9級特大地震影響，放射性物質(zhì)發(fā)生泄漏遇伞。R本人自食惡果不足惜辙喂，卻給世界環(huán)境...
茶點故事閱讀 39,213評論 3贊 307
男人毒藥：我在死后第九天來索命
文/蒙蒙一、第九天我趴在偏房一處隱蔽的房頂上張望鸠珠。院中可真熱鬧巍耗，春花似錦、人聲如沸渐排。這莊子的主人今日做“春日...
開封第一講書人閱讀 30,204評論 0贊 19
一樁弒父案，背后竟有這般陰謀
文/蒼蘭香墨我抬頭看了看天上的太陽驯耻。三九已至亲族，卻和暖如春，著一層夾襖步出監(jiān)牢的瞬間可缚，已是汗流浹背霎迫。一陣腳步聲響...
開封第一講書人閱讀 31,423評論 1贊 260
情欲美人皮
我被黑心中介騙來泰國打工，沒想到剛下飛機就差點兒被人妖公主榨干…… 1. 我叫王不留帘靡，地道東北人知给。一個月前我還...
沈念sama閱讀 45,423評論 2贊 352
代替公主和親
正文我出身青樓，卻偏偏與公主長得像，于是被迫代替她去往敵國和親涩赢。傳聞我的和親對象是個殘疾皇子戈次，可洞房花燭夜當(dāng)晚...
茶點故事閱讀 42,722評論 2贊 345

ComplexHeatmap復(fù)雜熱圖繪制學(xué)習(xí)——11.例子

更多示例

1.為基因表達(dá)矩陣添加更多信息

2.麻疹疫苗熱圖

3.從單細(xì)胞 RNASeq 中可視化細(xì)胞異質(zhì)性

4.甲基化恃慧、表達(dá)值和其他基因組特征之間的相關(guān)性

5. 使用復(fù)雜注釋可視化甲基化概況

6. 添加多個箱線圖到單行

其他技巧

1.為不同維度的不同熱圖設(shè)置相同的單元格大小

推薦閱讀更多精彩內(nèi)容