學(xué)習(xí)一篇NC的單細(xì)胞文章(一):質(zhì)控
學(xué)習(xí)一篇NC的單細(xì)胞文章(二):標(biāo)準(zhǔn)化
學(xué)習(xí)一篇NC的單細(xì)胞文章(三):復(fù)雜熱圖
學(xué)習(xí)一篇NC的單細(xì)胞文章(四):Clustering
學(xué)習(xí)一篇NC的單細(xì)胞文章(五):tSNE
在第五節(jié)芹血,由于大部分細(xì)胞(868 個(gè))都被歸為上皮細(xì)胞群中(Fig2 c)驱还,這868個(gè)細(xì)胞可被分成5個(gè)cluster徘层,接著對(duì)這5個(gè)cluster細(xì)胞進(jìn)行探索。我們使用一組來自對(duì)乳腺腫塊的非監(jiān)督分析的基因表達(dá)特征對(duì)5個(gè)cluster進(jìn)行了研究吭敢。這些基因表達(dá)特征通過比較三陰性乳腺癌(TNBC)的四個(gè)亞型(ERBB2 amplicon养泡,Luminal Subtype 埂奈、Basal epithelial-cell enriched 和Luminal epithelial gene cluster containing ER)而建立迄损。先看看這5個(gè)clusters的基底細(xì)胞來源的細(xì)胞群有多少。
## 讀取數(shù)據(jù)
basal_PNAS_all <- read.table("data/genes_for_basal_vs_non_basal_tnbc_PNAS.txt", header = TRUE, sep = "\t")
#提取Basal.epithelial.cell.enriched.cluster的基因
basal_PNAS_long <- basal_PNAS_all$Basal.epithelial.cell.enriched.cluster
#合并剩下17個(gè)基因
basal_PNAS <- intersect(basal_PNAS_long, rownames(mat_ct))
> basal_PNAS
[1] "SOX9" "GALNT3" "CDH3" "LAMC2" "CX3CL1" "TRIM29" "KRT17" "KRT5" "CHI3L2"
[10] "SLPI" "NFIB" "MRAS" "TGFB2" "CAPN6" "DMD" "FABP7" "CXCL1"
#算出17個(gè)basal_PNAS基因在1112個(gè)細(xì)胞的表達(dá)平均值
basal_PNAS_avg_exprs <- apply(mat_ct[match(basal_PNAS, rownames(mat_ct)),], 2, mean)
#檢查一下數(shù)據(jù)
all.equal(names(basal_PNAS_avg_exprs), colnames(mat_ct))
#提取868個(gè)上皮細(xì)胞群體的17個(gè)basal_PNAS基因表達(dá)平均值
basal_PNAS_avg_exprs <- basal_PNAS_avg_exprs[which(pd_ct$cell_types_cl_all == "epithelial")]
#檢查一下數(shù)據(jù)
all.equal(colnames(HSMM_allepith_clustering), names(basal_PNAS_avg_exprs))
#把17個(gè)basal_PNAS基因表達(dá)平均值賦給HSMM_allepith_clustering账磺,以便于后續(xù)分析
pData(HSMM_allepith_clustering)$basal_PNAS_avg_exprs <- basal_PNAS_avg_exprs
#畫figS9b
plot_cell_clusters(HSMM_allepith_clustering, 1, 2, color = "basal_PNAS_avg_exprs", cell_size = 2) + facet_wrap(~patient) +
scale_color_continuous(low = "yellow", high = "blue")
#畫figS9a
plot_cell_clusters(HSMM_allepith_clustering, 1, 2, color = "basal_PNAS_avg_exprs", cell_size = 2) + facet_wrap(~Cluster) +
scale_color_continuous(low = "yellow", high = "blue")
figS9a:大多數(shù)TNBC樣本都有basal gene signature的表達(dá)。figS9b:在868個(gè)上皮細(xì)胞群中痊远,cluster2的basal gene signature表達(dá)量最豐富垮抗。
接著,使用另外一個(gè)基因表達(dá)特征數(shù)據(jù)集TNBCtype4 signatures碧聪,這個(gè)signatures根據(jù)基因?qū)⒓?xì)胞分為6個(gè)類:basal_like_1冒版、basal_like_2、immunomodulatory逞姿、mesenchymal辞嗡、mesenchymal_stem_like和luminal_ar。作者將基因表達(dá)特征中上調(diào)基因的平均表達(dá)值減去下調(diào)基因的平均表達(dá)值滞造,將差值作為每個(gè)細(xì)胞在TNBCtype4 signatures (basal_like_1续室、basal_like_2、mesenchymal和luminal_ar)中的每個(gè)基因表達(dá)值谒养,挑選最高基因表達(dá)值對(duì)應(yīng)的signature挺狰,將其分配給對(duì)應(yīng)的細(xì)胞。
#讀取數(shù)據(jù)
lehman_long <- read.table("data/Lehman_signature.txt", sep = "\t", header = TRUE, stringsAsFactors = FALSE)
#這個(gè)for循環(huán)提取了lehman_long里面的四列g(shù)ene买窟、regulation丰泊、no_samples和signature,建立一個(gè)data.rrame
for (i in 0:5) {
gene <- "gene"
regulation <- "regulation"
no_samples <- "no_samples"
signature <- "signature"
if (i == 0) {
lehman <- lehman_long[, 1:4]
lehman <- lehman[-which(lehman$signature == ""),]
}
if (i > 0) {
gene <- paste("gene", i, sep = ".")
regulation <- paste("regulation", i, sep = ".")
no_samples <- paste("no_samples", i, sep = ".")
signature <- paste("signature", i, sep = ".")
mat_to_bind <- lehman_long[, c(gene, regulation, no_samples, signature)]
colnames(mat_to_bind) <- c("gene", "regulation", "no_samples", "signature")
if (length(which(is.na(mat_to_bind$no_samples))) > 0 )
mat_to_bind <- mat_to_bind[-which(mat_to_bind$signature == ""),]
lehman <- rbind(lehman, mat_to_bind)
}
}
#刪掉一些mat_ct沒有檢測(cè)到的基因
lehman <- lehman[which(!is.na(match(lehman$gene, rownames(mat_ct)))),]
lehman_signatures <- unique(lehman$signature)
lehman_avg_exps <- apply(mat_ct, 2, function(x){
mns <- matrix(NA, nrow = length(lehman_signatures), ncol = 2)
rownames(mns) <- lehman_signatures
for (s in 1:length(lehman_signatures)) {
sign <- lehman_signatures[s] # current signature
lehman_here <- lehman %>%
dplyr::filter(signature == sign)
lehman_here_up <- lehman_here %>%
dplyr::filter(regulation == "UP")
lehman_here_down <- lehman_here %>%
dplyr::filter(regulation == "DOWN")
idx_genes_up <- match(lehman_here_up$gene, rownames(mat_ct))
idx_genes_down <- match(lehman_here_down$gene, rownames(mat_ct))
mns[s,] <- c(mean(x[idx_genes_up]), mean(x[idx_genes_down])) #算上調(diào)始绍、下調(diào)的基因在樣本中的平均表達(dá)值
}
return(mns)
})
#檢查數(shù)據(jù)
all.equal(colnames(lehman_avg_exps), rownames(pd_ct))
#只看868個(gè)上皮細(xì)胞的情況
lehman_avg_exprs_epithelial <- lehman_avg_exps[,which(pd_ct$cell_types_cl_all == "epithelial")]
#提取lehman_avg_exps前面6行瞳购,對(duì)應(yīng)的是up
lehman_avg_ups <- lehman_avg_exps[c(1:6), ]
rownames(lehman_avg_ups) <- lehman_signatures
all.equal(colnames(lehman_avg_ups), rownames(pd_ct))
lehman_avg_ups_epithelial <- lehman_avg_ups[,which(pd_ct$cell_types_cl_all == "epithelial")]
#提取lehman_avg_exps后面6行,對(duì)應(yīng)的是down
lehman_avg_downs <- lehman_avg_exps[c(7:12),]
rownames(lehman_avg_downs) <- lehman_signatures
all.equal(colnames(lehman_avg_downs), rownames(pd_ct))
lehman_avg_downs_epithelial <- lehman_avg_downs[,which(pd_ct$cell_types_cl_all == "epithelial")]
#上調(diào)基因的平均表達(dá)值減去下調(diào)基因的平均表達(dá)值
lehman_avg_both <- lehman_avg_ups - lehman_avg_downs
all.equal(colnames(lehman_avg_both), rownames(pd_ct))
#挑選最高基因表達(dá)值對(duì)應(yīng)的signature亏推,將其分配給對(duì)應(yīng)的細(xì)胞学赛。
assignments_lehman_both <- apply(lehman_avg_both, 2, function(x){rownames(lehman_avg_both)[which.max(x)]})
assignments_lehman_both_epithelial <- assignments_lehman_both[which(pd_ct$cell_types_cl_all == "epithelial")]
#刪除immunomodulatory和mesenchymal_stem_like signature
lehman_avg_both_epithelial_new <- lehman_avg_both_epithelial[-which(rownames(lehman_avg_both_epithelial) %in% c("immunomodulatory", "mesenchymal_stem_like")),]
assignments_lehman_both_epithelial_new <- apply(lehman_avg_both_epithelial_new, 2, function(x){rownames(lehman_avg_both_epithelial_new)[which.max(x)]})
接下來畫圖,同樣地径簿,需要對(duì)heatmap函數(shù)代碼進(jìn)行修改罢屈。
ha_lehman_epith_pat <- list()
for (i in 1:length(patients_now)) {
if (i == 1)
ha_lehman_epith_pat[[i]] <- HeatmapAnnotation(df=data.frame(cluster_all = clusterings_sep_allepith[[i]]),
col = list(cluster_all = c("1" = "#ee204d", "2" = "#17806d", "3" = "#b2ec5d", "4" = "#cda4de", "5" = "#1974d2")),
annotation_name_side = "left", annotation_name_gp = gpar(fontsize = 12),
annotation_legend_param = list(list(title_position = "topcenter", title = "cluster")),
show_annotation_name = FALSE,
gap = unit(c(2), "mm"),
show_legend = FALSE)
if (i > 1 && i != 5 )
ha_lehman_epith_pat[[i]] <- HeatmapAnnotation(df=data.frame(cluster_all = clusterings_sep_allepith[[i]]),
col = list(cluster_all = c("1" = "#ee204d", "2" = "#17806d", "3" = "#b2ec5d", "4" = "#cda4de", "5" = "#1974d2")),
annotation_name_side = "left", annotation_name_gp = gpar(fontsize = 12),
annotation_legend_param = list(list(title_position = "topcenter", title = "cluster")),
show_annotation_name = FALSE,
gap = unit(c(2), "mm"),
show_legend = FALSE)
if (i == 5)
ha_lehman_epith_pat[[i]] <- HeatmapAnnotation(df=data.frame(cluster_all = clusterings_sep_allepith[[i]]),
col = list(cluster_all = c("1" = "#ee204d", "2" = "#17806d", "3" = "#b2ec5d", "4" = "#cda4de", "5" = "#1974d2")),
annotation_name_side = "right", annotation_name_gp = gpar(fontsize = 12),
annotation_legend_param = list(list(title_position = "topcenter",title = "cluster")),
show_annotation_name = FALSE,
gap = unit(c(2), "mm"),
show_legend = TRUE)
}
#檢查數(shù)據(jù)
all.equal(names(lehmans_epith_pat_both), patients_now)
#將basal signature添加進(jìn)去,以便后續(xù)作圖
lehmans_epith_pat_both_wbasal_new <- lehmans_epith_pat_both_new
for (i in 1:length(patients_now)) {
lehmans_epith_pat_both_wbasal_new[[i]] <- rbind(lehmans_epith_pat_both_new[[i]], pData(HSMM_allepith_clustering)$basal_PNAS_avg_exprs[which(HSMM_allepith_clustering$patient == patients_now[i])])
rownames(lehmans_epith_pat_both_wbasal_new[[i]])[5] <- "intrinsic_basal"
}
# 畫圖
ht_sep_lehmans_both_wbasal_new <-
Heatmap(lehmans_epith_pat_both_wbasal_new[[1]],
col = colorRamp2(c(-0.7, 0, 1), c("blue","white", "red")),
cluster_rows = FALSE,
show_column_names = FALSE,
column_title = patients_now[1],
column_title_gp = gpar(fontsize = 12),
top_annotation = ha_lehman_epith_pat[[1]],
name = patients_now[1],
show_row_names = FALSE,
heatmap_legend_param = list(title_gp = gpar(fontsize = 9), labels_gp = gpar(fontsize = 9))) +
Heatmap(lehmans_epith_pat_both_wbasal_new[[2]],
col = colorRamp2(c(-0.7, 0, 1), c("blue","white", "red")),
cluster_rows = FALSE,
show_column_names = FALSE,
column_title = patients_now[2],
column_title_gp = gpar(fontsize = 12),
top_annotation = ha_lehman_epith_pat[[2]],
name = patients_now[2],
show_row_names = FALSE,
heatmap_legend_param = list(title_gp = gpar(fontsize = 9), labels_gp = gpar(fontsize = 9))) +
Heatmap(lehmans_epith_pat_both_wbasal_new[[3]],
col = colorRamp2(c(-0.7, 0, 1), c("blue","white", "red")),
cluster_rows = FALSE,
show_column_names = FALSE,
column_title = patients_now[3],
column_title_gp = gpar(fontsize = 12),
top_annotation = ha_lehman_epith_pat[[3]],
name = patients_now[3],
show_row_names = FALSE,
heatmap_legend_param = list(title_gp = gpar(fontsize = 9), labels_gp = gpar(fontsize = 9))) +
Heatmap(lehmans_epith_pat_both_wbasal_new[[4]],
col = colorRamp2(c(-0.7, 0, 1), c("blue","white", "red")),
cluster_rows = FALSE,
show_column_names = FALSE,
column_title = patients_now[4],
column_title_gp = gpar(fontsize = 12),
top_annotation = ha_lehman_epith_pat[[4]],
name = patients_now[4],
show_row_names = FALSE,
heatmap_legend_param = list(title_gp = gpar(fontsize = 9), labels_gp = gpar(fontsize = 9))) +
Heatmap(lehmans_epith_pat_both_wbasal_new[[5]],
col = colorRamp2(c(-0.7, 0, 1), c("blue","white", "red")),
cluster_rows = FALSE,
show_column_names = FALSE,
column_title = patients_now[5],
column_title_gp = gpar(fontsize = 12),
top_annotation = ha_lehman_epith_pat[[5]],
name = patients_now[5],
show_row_names = FALSE,
heatmap_legend_param = list(title_gp = gpar(fontsize = 9), labels_gp = gpar(fontsize = 9))) +
Heatmap(lehmans_epith_pat_both_wbasal_new[[6]],
col = colorRamp2(c(-0.7, 0, 1), c("blue","white", "red")),
cluster_rows = FALSE,
row_names_side = "right",
column_title = patients_now[6],
column_title_gp = gpar(fontsize = 12),
top_annotation = ha_lehman_epith_pat[[6]],
name = patients_now[6],
show_column_names = FALSE,
top_annotation_height = unit(c(2), "cm"),
heatmap_legend_param = list(title_gp = gpar(fontsize = 9), labels_gp = gpar(fontsize = 9)))
#畫fig3d
print(draw(ht_sep_lehmans_both_wbasal_new, annotation_legend_side = "right"))
我們只需要把右邊注釋條PS一下篇亭,就可以達(dá)到跟文獻(xiàn)的圖片一模一樣了缠捌。
#檢查數(shù)據(jù)
all.equal(colnames(HSMM_allepith_clustering), names(assignments_lehman_both_epithelial_new))
pData(HSMM_allepith_clustering)$assignments_lehman_both_new <- assignments_lehman_both_epithelial_new
畫fig3g
plot_cell_clusters(HSMM_allepith_clustering, 1, 2, color = "assignments_lehman_both_new", cell_size = 2) + facet_wrap(~patient)
#畫figS8
plot_cell_clusters(HSMM_allepith_clustering, 1, 2, color = "assignments_lehman_both_new", cell_size = 2) + facet_wrap(~Cluster)
cluster4也富集了 Basal-Like 1 signature,而cluster3高度富集了TNBCtype “Luminal Androgen Receptor” signature。為了更清楚的看到上皮細(xì)胞群的5個(gè)cluster對(duì)應(yīng)的TNBCtype signatures的平均表達(dá)量曼月,接著繼續(xù)探索下去...
clust_avg_lehman_both_new <- matrix(NA, nrow = length(unique(HSMM_allepith_clustering$Cluster)), ncol = nrow(lehman_avg_both_epithelial_new))
#列名:cluster1,cluster2,cluster3,cluster4,cluster5
rownames(clust_avg_lehman_both_new) <- paste("clust", c(1:length(unique(HSMM_allepith_clustering$Cluster))), sep = "")
#行名:basal_like_1谊却、basal_like_2、mesenchymal哑芹、luminal_ar"
colnames(clust_avg_lehman_both_new) <- rownames(lehman_avg_both_epithelial_new)
#算出每個(gè)cluster的signatures 平均值
for (c in 1:length(unique(HSMM_allepith_clustering$Cluster))) {
clust_avg_lehman_both_new[c,] <- apply(lehman_avg_both_epithelial_new[,which(HSMM_allepith_clustering$Cluster == c)], 1, mean)
}
clust_avg_lehman_both_new <- as.data.frame(clust_avg_lehman_both_new)
#增加一列cluster
clust_avg_lehman_both_new$Cluster <- rownames(clust_avg_lehman_both_new)
#拆分?jǐn)?shù)據(jù)
clust_avg_lehman_melt_new <- melt(clust_avg_lehman_both_new, "Cluster")
#畫fig3e
ggplot(clust_avg_lehman_melt_new, aes(Cluster, value, fill = factor(variable), color = factor(variable),
shape = factor(variable))) +
geom_point(size = 3, stroke = 1) +
scale_shape_discrete(solid = T) +
#guides(colour = guide_legend(override.aes = list(size=3))) +
ylab("average expression of signature in cluster") +
xlab("cluster") +
ylim(c(-0.35, 0.5))
可以看到:cluster2和4強(qiáng)烈地表達(dá)Basal-Like 1 signature炎辨,而cluster3顯著表達(dá)Basal-Like 2 signature和 luminal AR signature
接著,讀取另外一個(gè)gene signatures聪姿,作者通過從上調(diào)基因的平均表達(dá)量中減去下調(diào)基因的平均表達(dá)量碴萧,計(jì)算出three normal breast signatures下每個(gè)細(xì)胞的表達(dá)量(Lim et al., 2009a),然后將每個(gè)細(xì)胞分配給其具有最高表達(dá)量的signatures末购。這三個(gè)normal breast signatures 是:mature luminal (ML)破喻,basal和luminal progenitor (LP),在每個(gè)signatures中盟榴,都有對(duì)應(yīng)的上調(diào)基因和下調(diào)基因曹质。
## normal signatures
ml_signature_long <- read.table("data/ML_signature.txt", sep = "\t", header = TRUE)
if (length(which(ml_signature_long$Symbol == "")) > 0)
#將空白行去掉
ml_signature_long <- ml_signature_long[-which(ml_signature_long$Symbol == ""),]
#按照基因字母進(jìn)行排序,如果基因字母有一樣的擎场,則按照Average.log.fold.change絕對(duì)值的負(fù)數(shù)進(jìn)行從小到大排序
ml_signature_long <- ml_signature_long[order(ml_signature_long$Symbol, -abs(ml_signature_long$Average.log.fold.change) ), ]
#對(duì)基因取唯一值羽德,去重復(fù)
ml_signature_long <- ml_signature_long[ !duplicated(ml_signature_long$Symbol), ]
#總共有818個(gè)基因
ml_signature <- ml_signature_long[which(!is.na(match(ml_signature_long$Symbol, rownames(mat_ct)))), ]
#上調(diào)基因有384個(gè)
ml_up <- ml_signature[which(ml_signature$Average.log.fold.change > 0), ]
#下調(diào)基因有197個(gè)
ml_down <- ml_signature[which(ml_signature$Average.log.fold.change < 0), ]
#匹配一下
idx_ml_up <- match(ml_up$Symbol, rownames(mat_ct))
idx_ml_down <- match(ml_down$Symbol, rownames(mat_ct))
#讀取basal signature,處理過程跟上面的一樣的迅办。
basal_signature_long <- read.table("data/basal_signature.txt", sep = "\t", header = TRUE)
if (length(which(basal_signature_long$Symbol == "")) > 0)
basal_signature_long <- basal_signature_long[-which(basal_signature_long$Symbol == ""),]
basal_signature_long <- basal_signature_long[order(basal_signature_long$Symbol, -abs(basal_signature_long$Average.log.fold.change) ), ]
basal_signature_long <- basal_signature_long[ !duplicated(basal_signature_long$Symbol), ]
#總共有1335個(gè)基因
basal_signature <- basal_signature_long[which(!is.na(match(basal_signature_long$Symbol, rownames(mat_ct)))), ]
#上調(diào)基因有588個(gè)
basal_up <- basal_signature[which(basal_signature$Average.log.fold.change > 0), ]
#下調(diào)基因有757個(gè)
basal_down <- basal_signature[which(basal_signature$Average.log.fold.change < 0), ]
idx_basal_up <- match(basal_up$Symbol, rownames(mat_ct))
idx_basal_down <- match(basal_down$Symbol, rownames(mat_ct))
#讀取LP signature,還是同樣的操作
lp_signature_long <- read.table("data/Lp_signature.txt", sep = "\t", header = TRUE)
if (length(which(lp_signature_long$Symbol == "")) > 0)
lp_signature_long <- lp_signature_long[-which(lp_signature_long$Symbol == ""),]
lp_signature_long <- lp_signature_long[order(lp_signature_long$Symbol, -abs(lp_signature_long$Average.log.fold.change) ), ]
lp_signature_long <- lp_signature_long[ !duplicated(lp_signature_long$Symbol), ]
lp_signature <- lp_signature_long[which(!is.na(match(lp_signature_long$Symbol, rownames(mat_ct)))), ]
lp_up <- lp_signature[which(lp_signature$Average.log.fold.change > 0), ]
lp_down <- lp_signature[which(lp_signature$Average.log.fold.change < 0), ]
idx_lp_up <- match(lp_up$Symbol, rownames(mat_ct))
idx_lp_down <- match(lp_down$Symbol, rownames(mat_ct))
#對(duì)ML宅静、basal和LP 3個(gè)signatures基因,將上調(diào)基因的表達(dá)值減去下調(diào)基因表達(dá)值礼饱,并分別返回結(jié)果坏为。
normsig_avg_exprs <- apply(mat_ct, 2, function(x){
avg_ml_up <- mean(x[idx_ml_up])
avg_ml_down <- mean(x[idx_ml_down])
avg_ml_both <- avg_ml_up - avg_ml_down
avg_basal_up <- mean(x[idx_basal_up])
avg_basal_down <- mean(x[idx_basal_down])
avg_basal_both <- avg_basal_up - avg_basal_down
avg_lp_up <- mean(x[idx_lp_up])
avg_lp_down <- mean(x[idx_lp_down])
avg_lp_both <- avg_lp_up - avg_lp_down
return(c(avg_ml_up, avg_basal_up, avg_lp_up, avg_ml_both, avg_basal_both, avg_lp_both))
})
rownames(normsig_avg_exprs) <- c("avg_ml_up", "avg_basal_up", "avg_lp_up", "avg_ml_both", "avg_basal_both", "avg_lp_both")
#檢查數(shù)據(jù)
all.equal(colnames(normsig_avg_exprs), rownames(pd_ct))
#只看上皮細(xì)胞群
normsig_avg_exprs_epithelial <- normsig_avg_exprs[,which(pd_ct$cell_types_cl_all == "epithelial")]
normsig_avg_ups <- normsig_avg_exprs[c(1:3), ]
all.equal(colnames(normsig_avg_ups), rownames(pd_ct))
normsig_avg_ups_epithelial <- normsig_avg_ups[,which(pd_ct$cell_types_cl_all == "epithelial")]
normsig_avg_both <- normsig_avg_exprs[c(4:6),]
all.equal(colnames(normsig_avg_both), rownames(pd_ct))
normsig_avg_both_epithelial <- normsig_avg_both[,which(pd_ct$cell_types_cl_all == "epithelial")]
#挑選最大值=上調(diào)基因的平均表達(dá)值最大數(shù)值分配給對(duì)應(yīng)的細(xì)胞類型
assignments_normsig_ups <- apply(normsig_avg_ups, 2, function(x){rownames(normsig_avg_ups)[which.max(x)]})
assignments_normsig_ups_epithelial <- assignments_normsig_ups[which(pd_ct$cell_types_cl_all == "epithelial")]
#上調(diào)基因的平均表達(dá)值-下調(diào)基因的平均表達(dá)值的最大數(shù)值分配給對(duì)應(yīng)的細(xì)胞類型
assignments_normsig_both <- apply(normsig_avg_both, 2, function(x){rownames(normsig_avg_both)[which.max(x)]})
assignments_normsig_both_epithelial <- assignments_normsig_both[which(pd_ct$cell_types_cl_all == "epithelial")]
pd_ct_epith <- pd_ct[which(pd_ct$cell_types_cl_all == "epithelial"),]
normsig_epith_pat_both <- list()
normsig_epith_pat_ups <- list()
pds_epith_ct <- list()
for (i in 1:length(patients_now)) {
normsig_epith_pat_both[[i]] <- normsig_avg_both_epithelial[,which(pd_ct_epith$patient == patients_now[i])]
normsig_epith_pat_ups[[i]] <- normsig_avg_ups_epithelial[,which(pd_ct_epith$patient == patients_now[i])]
pds_epith_ct[[i]] <- pds_ct[[i]][which(pds_ct[[i]]$cell_types_cl_all == "epithelial"),]
}
names(normsig_epith_pat_both) <- patients_now
names(normsig_epith_pat_ups) <- patients_now
names(pds_epith_ct) <- patients_now
ht_sep_normsig_both <-
Heatmap(normsig_epith_pat_both[[1]],
col = colorRamp2(c(-0.7, -0.2, 0.7), c("blue","white", "red")),
cluster_rows = FALSE,
show_column_names = FALSE,
column_title = patients_now[1],
top_annotation = ha_lehman_epith_pat[[1]],
column_title_gp = gpar(fontsize = 12),
show_row_names = FALSE,
name = patients_now[1],
heatmap_legend_param = list(title_gp = gpar(fontsize = 9), labels_gp = gpar(fontsize = 9))) +
Heatmap(normsig_epith_pat_both[[2]],
col = colorRamp2(c(-0.7, -0.2, 0.7), c("blue","white", "red")),
cluster_rows = FALSE,
show_column_names = FALSE,
column_title = patients_now[2],
column_title_gp = gpar(fontsize = 12),
top_annotation = ha_lehman_epith_pat[[2]],
name = patients_now[2],
show_row_names = FALSE,
heatmap_legend_param = list(title_gp = gpar(fontsize = 9), labels_gp = gpar(fontsize = 9))) +
Heatmap(normsig_epith_pat_both[[3]],
col = colorRamp2(c(-0.7, -0.2, 0.7), c("blue","white", "red")),
cluster_rows = FALSE,
show_column_names = FALSE,
column_title = patients_now[3],
column_title_gp = gpar(fontsize = 12),
top_annotation = ha_lehman_epith_pat[[3]],
name = patients_now[3],
show_row_names = FALSE,
top_annotation_height = unit(c(2), "cm"),
heatmap_legend_param = list(title_gp = gpar(fontsize = 9), labels_gp = gpar(fontsize = 9))) +
Heatmap(normsig_epith_pat_both[[4]],
col = colorRamp2(c(-0.7, -0.2, 0.7), c("blue","white", "red")),
cluster_rows = FALSE,
show_column_names = FALSE,
column_title = patients_now[4],
column_title_gp = gpar(fontsize = 12),
top_annotation = ha_lehman_epith_pat[[4]],
name = patients_now[4],
show_row_names = FALSE,
top_annotation_height = unit(c(2), "cm"),
heatmap_legend_param = list(title_gp = gpar(fontsize = 9), labels_gp = gpar(fontsize = 9))) +
Heatmap(normsig_epith_pat_both[[5]],
col = colorRamp2(c(-0.7, -0.2, 0.7), c("blue","white", "red")),
cluster_rows = FALSE,
show_column_names = FALSE,
column_title = patients_now[5],
column_title_gp = gpar(fontsize = 12),
top_annotation = ha_lehman_epith_pat[[5]],
name = patients_now[5],
show_row_names = FALSE,
top_annotation_height = unit(c(2), "cm"),
heatmap_legend_param = list(title_gp = gpar(fontsize = 9), labels_gp = gpar(fontsize = 9))) +
Heatmap(normsig_epith_pat_both[[6]],
col = colorRamp2(c(-0.7, -0.2, 0.7), c("blue","white", "red")),
cluster_rows = FALSE,
row_names_side = "right",
column_title = patients_now[6],
column_title_gp = gpar(fontsize = 12),
top_annotation = ha_lehman_epith_pat[[6]],
name = patients_now[6],
show_column_names = FALSE,
top_annotation_height = unit(c(2), "cm"),
heatmap_legend_param = list(title_gp = gpar(fontsize = 9), labels_gp = gpar(fontsize = 9)))
#畫fig3b.pdf
print(draw(ht_sep_normsig_both, annotation_legend_side = "bottom"))
# 每個(gè)樣本的normal signatures 數(shù)目
all.equal(colnames(HSMM_allepith_clustering), names(assignments_normsig_both_epithelial))
pData(HSMM_allepith_clustering)$assignments_normsig_both <- assignments_normsig_both_epithelial
pData(HSMM_allepith_clustering)$assignments_normsig_ups <- assignments_normsig_ups_epithelial
#畫fig3f
plot_cell_clusters(HSMM_allepith_clustering, 1, 2, color = "assignments_normsig_both", cell_size = 2) + facet_wrap(~patient)
plot_cell_clusters(HSMM_allepith_clustering, 1, 2, color = "assignments_normsig_both", cell_size = 2) + facet_wrap(~Cluster)
#檢查數(shù)據(jù)
all.equal(HSMM_allepith_clustering$Cluster, clustering_allepith)
all.equal(colnames(normsig_avg_both_epithelial), colnames(HSMM_allepith_clustering))
clust_avg_normsig_both <- matrix(NA, nrow = length(unique(HSMM_allepith_clustering$Cluster)), ncol = nrow(normsig_avg_both_epithelial))
rownames(clust_avg_normsig_both) <- paste("clust", c(1:length(unique(HSMM_allepith_clustering$Cluster))), sep = "")
colnames(clust_avg_normsig_both) <- rownames(normsig_avg_both_epithelial)
#算上皮細(xì)胞群的avg_both 平均表達(dá)值,接下來還是同樣的操作
for (c in 1:length(unique(HSMM_allepith_clustering$Cluster))) {
clust_avg_normsig_both[c,] <- apply(normsig_avg_both_epithelial[,which(HSMM_allepith_clustering$Cluster == c)], 1, mean)
}
clust_avg_normsig_both <- as.data.frame(clust_avg_normsig_both)
clust_avg_normsig_both$Cluster <- rownames(clust_avg_normsig_both)
clust_avg_normsig_melt <- melt(clust_avg_normsig_both, "Cluster")
#畫fig3e
ggplot(clust_avg_normsig_melt, aes(Cluster, value, fill = factor(variable), color = factor(variable),
shape = factor(variable))) +
geom_point(size = 3, stroke = 1) +
scale_shape_discrete(solid = T) +
ylab("average expression of signature in cluster") +
xlab("cluster") +
ylim(c(-0.35, 0.5))
fig3e:Clusters 2和Clusters4 強(qiáng)烈表達(dá)the LP signature, 而cluster 3則高表達(dá) ML signature.
接著為了進(jìn)一步探究臨床相關(guān)性镊绪,作者使用了三個(gè)gene signatures匀伏,進(jìn)一步探究這868個(gè)上皮細(xì)胞的特征,這868個(gè)上皮細(xì)胞真的被研究到很徹底蝴韭,真的佩服够颠,這工作量真的好大....
第一個(gè)gene signatures:70-gene prognostic signature 缤剧,該signatures最初是從對(duì)有無轉(zhuǎn)移復(fù)發(fā)患者的原發(fā)腫瘤之間差異表達(dá)基因的分析中得出的外永,總共70個(gè)基因像寒。
mammaprint_long <- read.table("data/mammaprint_sig_new.txt", header = TRUE, sep = "\t")
mammaprint <- apply(mammaprint_long, 2, function(x){return(intersect(x, rownames(mat_ct)))})[,1]
mammaprint_avg_exprs <- apply(mat_ct[match(mammaprint, rownames(mat_ct)),], 2, mean)
all.equal(names(mammaprint_avg_exprs), colnames(mat_ct))
mammaprint_avg_exprs <- mammaprint_avg_exprs[which(pd_ct$cell_types_cl_all == "epithelial")]
all.equal(colnames(HSMM_allepith_clustering), names(mammaprint_avg_exprs))
pData(HSMM_allepith_clustering)$mammaprint_avg_exprs <- mammaprint_avg_exprs
# 畫figS13b
plot_cell_clusters(HSMM_allepith_clustering, 1, 2, color = "mammaprint_avg_exprs", cell_size = 2) + facet_wrap(~patient) +
scale_color_continuous(low = "yellow", high = "blue")
# 畫figS13a
plot_cell_clusters(HSMM_allepith_clustering, 1, 2, color = "mammaprint_avg_exprs", cell_size = 2) + facet_wrap(~Cluster) +
scale_color_continuous(low = "yellow", high = "blue")
第二個(gè)gene signatures:49-gene metastatic burden signature.該signatures可以區(qū)分了患者來源的小鼠TNBC異種移植模型中單個(gè)循環(huán)轉(zhuǎn)移細(xì)胞所產(chǎn)生的高轉(zhuǎn)移負(fù)荷和低轉(zhuǎn)移負(fù)荷稠集,總共包括49個(gè)基因。
zenawerb_long <- read.table("data/werb_49_metastasis_sig.txt", header = TRUE, sep = "\t")
zenawerb <- apply(zenawerb_long, 2, function(x){return(intersect(x, rownames(mat_ct)))})[,1]
zenawerb_avg_exprs <- apply(mat_ct[match(zenawerb, rownames(mat_ct)),], 2, mean)
all.equal(names(zenawerb_avg_exprs), colnames(mat_ct))
zenawerb_avg_exprs <- zenawerb_avg_exprs[which(pd_ct$cell_types_cl_all == "epithelial")]
all.equal(colnames(HSMM_allepith_clustering), names(zenawerb_avg_exprs))
pData(HSMM_allepith_clustering)$zenawerb_avg_exprs <- zenawerb_avg_exprs
#畫figS14b
plot_cell_clusters(HSMM_allepith_clustering, 1, 2, color = "zenawerb_avg_exprs", cell_size = 2) + facet_wrap(~patient) +
scale_color_continuous(low = "yellow", high = "blue")
#畫figS14a
plot_cell_clusters(HSMM_allepith_clustering, 1, 2, color = "zenawerb_avg_exprs", cell_size = 2) + facet_wrap(~Cluster) +
scale_color_continuous(low = "yellow", high = "blue")
第三個(gè)gene siganatures:從接受手術(shù)前化療治療的原發(fā)性乳腺癌患者的殘存腫瘤群體中富集的基因中獲得的撑教,包括354個(gè)基因厕怜。
artega_long <- read.table("data/artega_sig.txt", header = TRUE, sep = "\t")
artega <- apply(artega_long, 2, function(x){return(intersect(x, rownames(mat_ct)))})[,1]
artega_avg_exprs <- apply(mat_ct[match(artega, rownames(mat_ct)),], 2, mean)
all.equal(names(artega_avg_exprs), colnames(mat_ct))
artega_avg_exprs <- artega_avg_exprs[which(pd_ct$cell_types_cl_all == "epithelial")]
all.equal(colnames(HSMM_allepith_clustering), names(artega_avg_exprs))
pData(HSMM_allepith_clustering)$artega_avg_exprs <- artega_avg_exprs
畫figS15a
plot_cell_clusters(HSMM_allepith_clustering, 1, 2, color = "artega_avg_exprs", cell_size = 2) + facet_wrap(~patient) +
scale_color_continuous(low = "yellow", high = "blue")
#畫figS15b
plot_cell_clusters(HSMM_allepith_clustering, 1, 2, color = "artega_avg_exprs", cell_size = 2) + facet_wrap(~Cluster) +
scale_color_continuous(low = "yellow", high = "blue")
#將3個(gè)gene signatures的表達(dá)值合成一個(gè)數(shù)據(jù)框
prognosis_sig <- cbind(mammaprint_avg_exprs, zenawerb_avg_exprs, artega_avg_exprs)
#取行名
colnames(prognosis_sig) <- c("mammaprint", "zenawerb", "artega")
prognosis_epith_pat <- list()
for (i in 1:length(patients_now)) {
prognosis_epith_pat[[i]] <- t(prognosis_sig)[,which(pd_ct_epith$patient == patients_now[i])]
}
names(prognosis_epith_pat) <- patients_now
for (i in 1:length(patients_now)) {
print(all.equal(colnames(prognosis_epith_pat[[1]]), rownames(pds_epith_ct[[1]])))
print(all.equal(names(clusterings_sep_allepith[[1]]), colnames(prognosis_epith_pat[[1]])))
}
ht_sep_prognosis <-
Heatmap(prognosis_epith_pat[[1]],
cluster_rows = FALSE,
col = colorRamp2(c(-0.2, 0.2, 1), c("blue","white", "red")),
show_column_names = FALSE,
column_title = patients_now[1],
top_annotation = ha_lehman_epith_pat[[1]],
column_title_gp = gpar(fontsize = 12),
show_row_names = FALSE,
name = patients_now[1],
heatmap_legend_param = list(title_gp = gpar(fontsize = 9), labels_gp = gpar(fontsize = 9))) +
Heatmap(prognosis_epith_pat[[2]],
col = colorRamp2(c(-0.2, 0.2, 1), c("blue","white", "red")),
cluster_rows = FALSE,
show_column_names = FALSE,
column_title = patients_now[2],
column_title_gp = gpar(fontsize = 12),
top_annotation = ha_lehman_epith_pat[[2]],
name = patients_now[2],
show_row_names = FALSE,
heatmap_legend_param = list(title_gp = gpar(fontsize = 9), labels_gp = gpar(fontsize = 9))) +
Heatmap(prognosis_epith_pat[[3]],
col = colorRamp2(c(-0.2, 0.2, 1), c("blue","white", "red")),
cluster_rows = FALSE,
show_column_names = FALSE,
column_title = patients_now[3],
column_title_gp = gpar(fontsize = 12),
top_annotation = ha_lehman_epith_pat[[3]],
name = patients_now[3],
show_row_names = FALSE,
heatmap_legend_param = list(title_gp = gpar(fontsize = 9), labels_gp = gpar(fontsize = 9))) +
Heatmap(prognosis_epith_pat[[4]],
col = colorRamp2(c(-0.2, 0.2, 1), c("blue","white", "red")),
cluster_rows = FALSE,
show_column_names = FALSE,
column_title = patients_now[4],
column_title_gp = gpar(fontsize = 12),
top_annotation = ha_lehman_epith_pat[[4]],
name = patients_now[4],
show_row_names = FALSE,
heatmap_legend_param = list(title_gp = gpar(fontsize = 9), labels_gp = gpar(fontsize = 9))) +
Heatmap(prognosis_epith_pat[[5]],
col = colorRamp2(c(-0.2, 0.2, 1), c("blue","white", "red")),
cluster_rows = FALSE,
show_column_names = FALSE,
column_title = patients_now[5],
column_title_gp = gpar(fontsize = 12),
top_annotation = ha_lehman_epith_pat[[5]],
name = patients_now[5],
show_row_names = FALSE,
heatmap_legend_param = list(title_gp = gpar(fontsize = 9), labels_gp = gpar(fontsize = 9))) +
Heatmap(prognosis_epith_pat[[6]],
col = colorRamp2(c(-0.2, 0.2, 1), c("blue","white", "red")),
cluster_rows = FALSE,
row_names_side = "right",
column_title = patients_now[6],
column_title_gp = gpar(fontsize = 12),
top_annotation = ha_lehman_epith_pat[[6]],
name = patients_now[6],
show_column_names = FALSE,
heatmap_legend_param = list(title_gp = gpar(fontsize = 9), labels_gp = gpar(fontsize = 9)))
#畫fig4a
print(draw(ht_sep_prognosis, annotation_legend_side = "right"))
接著看這3個(gè)gene signatures在不同clusters的細(xì)胞之間的表達(dá)分布纺讲。
clust_avg_prognosis <- matrix(NA, nrow = length(unique(HSMM_allepith_clustering$Cluster)), ncol = ncol(prognosis_sig))
#列明是clusters1-5
rownames(clust_avg_prognosis) <- paste("clust", c(1:length(unique(HSMM_allepith_clustering$Cluster))), sep = "")
#行名是mammaprint驶忌、zenawerb矛辕、artega和Cluster
colnames(clust_avg_prognosis) <- colnames(prognosis_sig)
#每個(gè)clusters的三個(gè)gene signatures的平均值
for (c in 1:length(unique(HSMM_allepith_clustering$Cluster))) {
clust_avg_prognosis[c,] <- apply(prognosis_sig[which(HSMM_allepith_clustering$Cluster == c),], 2, mean)}
prognosis_sig <- as.data.frame(prognosis_sig)
all.equal(rownames(prognosis_sig), colnames(HSMM_allepith_clustering))
prognosis_sig$Cluster <- as.numeric(HSMM_allepith_clustering$Cluster)
prognosis_melt <- melt(prognosis_sig, id.vars = c("Cluster"))
prognosis_melt$value <- as.numeric(prognosis_melt$value)
prognosis_melt <- prognosis_melt %>%
filter(Cluster %in% c(2,3,4))
#畫fig4b
p <- ggplot(prognosis_melt, aes(factor(Cluster), value, fill = factor(Cluster))) +
scale_fill_manual(values = c("1" = "#ee204d", "2" = "#17806d", "3" = "#b2ec5d", "4" = "#cda4de", "5" = "#1974d2"))
p + geom_violin(adjust = .5) + facet_wrap(~variable) + stat_summary(fun.y = mean, geom = "point", shape = 22, size = 3)
#畫figS16
venn(list("PS" = mammaprint, "MBS" = zenawerb, "RTS" = artega))
可以看到這3個(gè)gene signatures沒有重疊的基因,并且它們來源不同飞蹂,但這3個(gè) gene signatures均在clusters 2亞群中都高表達(dá)惊窖。接著赁遗,作者想跟臨床預(yù)后聯(lián)系到一起逝淹,于是選取了3個(gè)有代表性的數(shù)據(jù)集gene signature,對(duì)868個(gè)上皮細(xì)胞的5個(gè)clusters進(jìn)行探索横殴,結(jié)果提示clusters2具有與其他clusters不同的特征,接著對(duì)clusters2進(jìn)行進(jìn)一步探索陷虎。在之前的第五篇筆記tSNE中窝稿,已經(jīng)用differentialGeneTest對(duì)每一個(gè)上皮細(xì)胞cluster與全部的上皮細(xì)胞進(jìn)行差異分析后,以FDR=0.1為閾值凿掂,選取前10個(gè)有顯著差異變化的基因在METABRIC 數(shù)據(jù)庫(kù)中進(jìn)行生存分析伴榔,作者在補(bǔ)充材料里已經(jīng)描述得很詳細(xì)。
結(jié)果顯示高表達(dá)clusters2的腫瘤與OS顯著負(fù)相關(guān)相關(guān)性庄萎。相反踪少,但是在三個(gè)gene signatures中的預(yù)后則無統(tǒng)計(jì)學(xué)意義。此外糠涛,clusters1援奢、3、4和5的基因無法預(yù)測(cè)臨床結(jié)果忍捡。這些發(fā)現(xiàn)揭示了單細(xì)胞測(cè)序分析能揭示與臨床預(yù)后相關(guān)細(xì)胞狀態(tài)的能力集漾,但這些細(xì)胞狀態(tài)還沒有在腫瘤真實(shí)世界樣本中沒被發(fā)現(xiàn)。
我們繼續(xù)探索砸脊。在fig3a中具篇,我們已經(jīng)對(duì)868個(gè)上皮細(xì)胞進(jìn)行tSNE,接著我們將這個(gè)tSNE結(jié)果,進(jìn)行進(jìn)一步注釋可并可視化脓规。
## 可視化normal signatures栽连,figS10b
plot_cell_clusters(HSMM_allepith_clustering, 1, 2, color = "assignments_normsig_both", cell_size = 3)
#可視化TNBCtype4 Lehman signatures
#去掉immunomodulatory和mesenchymal_stem_like上皮細(xì)胞
lehman_avg_both_epithelial_new <- lehman_avg_both_epithelial[-which(rownames(lehman_avg_both_epithelial) %in% c("immunomodulatory", "mesenchymal_stem_like")),]
#給868個(gè)上皮細(xì)胞進(jìn)行分類(basal_like_1,basal_like_2,mesenchymal,luminal_ar)
assignments_lehman_both_epithelial_new <- apply(lehman_avg_both_epithelial_new, 2, function(x){rownames(lehman_avg_both_epithelial_new)[which.max(x)]})
all.equal(colnames(HSMM_allepith_clustering), names(assignments_lehman_both_epithelial_new))
pData(HSMM_allepith_clustering)$assignments_lehman_both_new <- assignments_lehman_both_epithelial_new
#畫figS10c
plot_cell_clusters(HSMM_allepith_clustering, 1, 2, color = "assignments_lehman_both_new", cell_size = 3)
#可視化mammaprint signature (PS)
plot_cell_clusters(HSMM_allepith_clustering, 1, 2, color = "mammaprint_avg_exprs", cell_size = 3) +
scale_color_continuous(low = "yellow", high = "blue")
然后,通過研究在cluster2中與其他上皮細(xì)胞群的差異表達(dá)的基因中富集的通路變化豐侨舆,來確定cluster2亞群背后的功能機(jī)制秒紧。我們發(fā)現(xiàn),最富集的通路與鞘糖脂的生物合成和溶酶體的周轉(zhuǎn)有關(guān)挨下,這兩種途徑影響了同樣富集到的先天免疫系統(tǒng)的細(xì)胞因子途徑熔恢。接著可視化每個(gè)樣本的鞘糖脂的生物合成和先天免疫系統(tǒng)的細(xì)胞因子兩條途徑的熱圖。
糖鞘糖脂最近被認(rèn)為是乳腺癌中許多促進(jìn)腫瘤特性的介質(zhì)臭笆,包括改變的生長(zhǎng)因子信號(hào)叙淌、EMT和干樣行為秤掌。該途徑中的多個(gè)關(guān)鍵基因在cluster2亞群中高表達(dá),包括糖脂轉(zhuǎn)移蛋白基因GLTP和鞘脂生物合成關(guān)鍵亞單位基因SPTLC1鹰霍。鞘磷脂也被認(rèn)為是天然免疫系統(tǒng)和獲得性免疫系統(tǒng)的重要調(diào)節(jié)劑闻鉴,并與多種上皮組織炎癥相關(guān)癌變有關(guān)。
#path1是糖鞘糖脂的關(guān)鍵基因
path1 <- c("ST3GAL4", "ST3GAL6", "ST8SIA1", "FUT1", "FUT2", "FUT3", "FUT4", "FUT6", "FUT9", "GLTP", "SPTLC1", "KDSR", "SPTLC2", "CERK", "ARSA")
idx_path1 <- match(path1, rownames(HSMM_allepith_clustering))
#path1是先天免疫系統(tǒng)的細(xì)胞因子途徑的關(guān)鍵基因
path2 <- c("CCL20", "CCL22", "CCL4", "CCR6", "IL11", "IL12RB1", "IL6R", "CSF2RA", "BMP7", "GLMN", "GPI", "INHBA", "TNF", "TNFSF15", "GHR", "LEPR", "TLR1", "TLR2", "TLR5", "TLR7", "TLR10", "F11R")
idx_path2 <- match(path2, rownames(HSMM_allepith_clustering))
path3 <- c("ERBB2", "AKT1", "AKT3", "PIK3R3", "PIK3R4", "RPS6KB2", "TRIB3", "BTK", "GRB10", "GRB2", "ILK", "PAK1", "PRKCZ", "CSNK2A1",
"IRS1", "IRAK1", "MYD88", "MAP2K1", "MAPK8", "MAPK1", "PTPN11", "EIF4E", "EIF4EBP1", "EIF4G1", "FKBP1A", "PDK1", "RHEB", "RPS6KB1")
idx_path3 <- match(path3, rownames(HSMM_allepith_clustering))
all_idx_paths <- c(idx_path1, idx_path2)
#命名
names_path <- c(rep("glycosphigolipid metabolism", length(idx_path1)), rep("innate immunity", length(idx_path2)))
#上色
anno_cols_path <- c("glycosphigolipid metabolism" = "#ff9baa", "innate immunity" = "#17806d")
ha_path_rows <- HeatmapAnnotation(df = data.frame(pathway = names_path),
annotation_legend_param = list(pathway = list(ncol = 1, title = "pathway", title_position = "topcenter")),
which = "row", col = list("pathway" = anno_cols_path), annotation_width = unit(3, "mm"))
#分離每個(gè)cluster的矩陣茂洒,并提取關(guān)鍵基因孟岛。
mat_epith_allpats <- exprs(HSMM_allepith_clustering)
mats_epith_patient <- list()
pds_epith_patient <- list()
mats_epith_patient_clusters <- list()
for (i in 1:length(patients_now)) {
pds_epith_patient[[i]] <- pData(HSMM_allepith_clustering)[which(pData(HSMM_allepith_clustering)$patient == patients_now[i]), ]
mats_epith_patient[[i]] <- mat_epith_allpats[all_idx_paths, which(pData(HSMM_allepith_clustering)$patient == patients_now[i])]
mats_epith_patient_clusters[[i]] <- list()
for (c in 1:length(unique(HSMM_allepith_clustering$Cluster))) {
mats_epith_patient_clusters[[i]][[c]] <- mats_epith_patient[[i]][, which(pds_epith_patient[[i]]$Cluster == c)]
}
names(mats_epith_patient_clusters[[i]]) <- paste0("clust", c(1:5))
}
names(mats_epith_patient) <- patients_now
names(pds_epith_patient) <- patients_now
# 接下來就是畫圖了。
ht_paths <-
ha_path_rows +
Heatmap(mats_epith_patient_clusters[[1]][[1]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
show_column_dend = TRUE, show_column_names = FALSE,
name = "cluster1", column_title = "cluster1",
row_names_gp = gpar(fontsize = 9, col = anno_cols_path),
split = names_path) +
Heatmap(mats_epith_patient_clusters[[1]][[2]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster2", column_title = "cluster2",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[1]][[3]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster3", column_title = "cluster3",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[1]][[4]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster4", column_title = "cluster4",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[1]][[5]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster5", column_title = "cluster5",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[2]][[1]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
show_column_dend = TRUE, show_column_names = FALSE,
name = "cluster1_2", column_title = "cluster1",
row_names_gp = gpar(fontsize = 9, col = anno_cols_path),
split = names_path) +
Heatmap(mats_epith_patient_clusters[[2]][[2]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster2_2", column_title = "cluster2",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[2]][[3]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster3_2", column_title = "cluster3",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[2]][[4]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster4_2", column_title = "cluster4",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[2]][[5]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster5_2", column_title = "cluster5",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[3]][[1]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster1_3", column_title = "cluster1",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[3]][[2]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster2_3", column_title = "cluster2",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[3]][[3]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster3_3", column_title = "cluster3",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[3]][[4]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster4_3", column_title = "cluster4",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[3]][[5]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster5_3", column_title = "cluster5",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[4]][[1]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster1_4", column_title = "cluster1",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[4]][[2]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster2_4", column_title = "cluster2",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[4]][[3]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster3_4", column_title = "cluster3",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[4]][[4]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster4_4", column_title = "cluster4",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[4]][[5]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster5_4", column_title = "cluster5",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[5]][[1]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster1_5", column_title = "cluster1",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[5]][[2]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster2_5", column_title = "cluster2",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[5]][[3]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster3_5", column_title = "cluster3",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[5]][[4]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster4_5", column_title = "cluster4",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[5]][[5]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster5_5", column_title = "cluster5",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[6]][[1]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster1_6", column_title = "cluster1",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[6]][[2]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster2_6", column_title = "cluster2",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[6]][[3]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster3_6", column_title = "cluster3",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[6]][[4]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster4_6", column_title = "cluster4",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE) +
Heatmap(mats_epith_patient_clusters[[6]][[5]],
col = colorRamp2(c(-0.5, 1, 3), c("blue", "white", "red")),
cluster_rows = TRUE, show_row_dend = FALSE,
name = "cluster5_6", column_title = "cluster5",
show_row_names = FALSE,
show_column_dend = TRUE, show_column_names = FALSE)
畫得有點(diǎn)丑督勺,畫圖參數(shù)得調(diào)整需要參考ComplexHeatmap包的說明書渠羞,繼續(xù)美化一下。
到了這里智哀,這篇文獻(xiàn)的單細(xì)胞部分就學(xué)習(xí)完了次询,花了一個(gè)月的時(shí)間斷斷續(xù)續(xù)把這個(gè)系列共6篇筆記寫完。作者把這1000多個(gè)細(xì)胞不斷分群瓷叫,不斷探索屯吊,最后集中在了上皮細(xì)胞群,進(jìn)一步細(xì)分摹菠,然后結(jié)合臨床預(yù)后情況雌芽,繼續(xù)探索。這是很標(biāo)準(zhǔn)的單細(xì)胞分析流程辨嗽。當(dāng)然,文章中也加了一些免疫組化的實(shí)驗(yàn)方法淮腾,加以驗(yàn)證糟需。還有一部分外顯子測(cè)序,這又是另外一個(gè)需要學(xué)習(xí)的領(lǐng)域了谷朝,在單細(xì)胞學(xué)習(xí)的路上洲押,感恩遇到得到了“單細(xì)胞天地”公眾號(hào)團(tuán)隊(duì)的答疑解惑。2021年圆凰,希望能學(xué)到更多的東西杈帐,寫更多的筆記,相信一切的努力都不會(huì)白白付出专钉。