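Single-Cell Data Mining in Practice: Reproducing a Paper (2) Batch-Creating Seurat Objects and Quality Control
Single-Cell Data Mining in Practice: Reproducing a Paper (3) Dimensionality Reduction, Clustering, and Cell Annotation
The previous post annotated the cells and found many Mo/MΦ cells in the tumor samples. A figure conveys this more directly than a table, so this post tries to reproduce Fig. 1d from the paper.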
1. Load the R packages
# Install any missing package, then attach it
# (the original if(!require(...)) pattern does not attach a package right after installing it)
pkgs <- c("BiocManager", "Seurat", "Matrix", "ggplot2", "cowplot",
          "magrittr", "dplyr", "purrr", "ggrepel", "ggpubr")
for (pkg in pkgs) {
  if (!requireNamespace(pkg, quietly = TRUE)) install.packages(pkg)
  library(pkg, character.only = TRUE)
}
2. Read in the data
sex_condition_objects = readRDS("sex_condition_objects.RDS")
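3. Collect the cell annotation results in a spreadsheet and read it in
The previous post produced annotation results for the four samples. Collect them in one Excel sheet and save it as CSV, since the code below reads it with read.csv. Part of the sheet is shown in the screenshot below.
Note that the values in the cluster and cell_type columns must follow the naming rules shown in the screenshot, otherwise the later code will not work as intended. Alternatively, adapt the later code to your own naming.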
cell_types<-read.csv("./anno_cell/cell_type_index.csv", header = T)
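The screenshot is not reproduced here, so the sketch below shows one table layout that would be compatible with the matching code in the next step. The rows are invented for illustration. The key format (sex letter, condition, cluster number, joined by underscores, for example `F_ctrl_0`) is inferred from how `full_cluster_id` is built later, not copied from the author's file. `check_annotation_table` is a hypothetical helper, not part of any package.

```r
# Hypothetical example rows; the real table comes from your own annotation.
cell_types_demo <- data.frame(
  cluster   = c("F_ctrl_0", "F_ctrl_1", "F_tumor_0", "M_tumor_3"),
  cell_type = c("micro", "pre-micro", "macro", "T-cells"),
  stringsAsFactors = FALSE
)

# Labels that the later factor() call recognizes.
allowed_types <- c("micro", "pre-micro", "macro", "BAM", "NKT", "NK",
                   "B-cells", "T-cells", "Ncam1+", "DC", "other")

# Hypothetical helper: returns a character vector of problems (empty if none).
check_annotation_table <- function(tab, allowed) {
  problems <- character(0)
  missing_cols <- setdiff(c("cluster", "cell_type"), colnames(tab))
  if (length(missing_cols) > 0) {
    return(paste("missing column:", missing_cols))
  }
  # Each cluster key should appear only once, otherwise the lookup is ambiguous.
  dup <- unique(tab$cluster[duplicated(tab$cluster)])
  if (length(dup) > 0) {
    problems <- c(problems, paste("duplicated cluster key:", dup))
  }
  # Labels outside the allowed set would become NA in the factor() call.
  bad <- setdiff(unique(tab$cell_type), allowed)
  if (length(bad) > 0) {
    problems <- c(problems, paste("unknown cell_type label:", bad))
  }
  problems
}

print(check_annotation_table(cell_types_demo, allowed_types))
```

Running the same check on the table returned by `read.csv` before going further may catch naming mistakes early, because a mismatched key or label otherwise turns into `NA` without any warning.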
4. Add the cell types to sex_condition_objects
sex_condition_objects <- lapply(sex_condition_objects, function(x) {
x$full_cluster_id <- paste(substring(x$shortID,12,12), x$condition, Idents(x), sep="_")
x$cell_type <- cell_types[match(x$full_cluster_id, cell_types$cluster), "cell_type"]
x$cell_type <- factor(x$cell_type, levels= c("micro", "pre-micro", "macro", "BAM", "NKT", "NK","B-cells", "T-cells","Ncam1+", "DC", "other"))
x$cell_type_selection <- ""
x$cell_type_selection[x$cell_type %in% c("micro", "pre-micro")] <- "Microglia"
x$cell_type_selection[x$cell_type == "macro"] <- "Macrophages"
x$cell_type_selection[x$cell_type == "BAM"] <- "BAM"
x
})
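To make the lookup above easier to follow, here is a minimal base-R sketch of the same steps on toy vectors instead of a Seurat object. It assumes that `shortID` looks like the sample names used later (for example `GSM4039241-F-ctrl`), so that character 12 is the sex letter. All values are invented for illustration.

```r
# Toy per-cell metadata; the values are invented for illustration.
short_id  <- rep("GSM4039241-F-ctrl", 4)
condition <- rep("ctrl", 4)
cluster   <- c("0", "1", "0", "7")   # stands in for Idents(x)

# Character 12 of the sample ID is the sex letter ("F" or "M").
sex <- substring(short_id, 12, 12)
full_cluster_id <- paste(sex, condition, cluster, sep = "_")

# Toy annotation table with the same two columns as cell_type_index.csv.
cell_types_demo <- data.frame(
  cluster   = c("F_ctrl_0", "F_ctrl_1"),
  cell_type = c("micro", "BAM"),
  stringsAsFactors = FALSE
)

# match() returns the row index for each key, or NA when the key is absent.
idx <- match(full_cluster_id, cell_types_demo$cluster)
cell_type <- cell_types_demo[idx, "cell_type"]

# Same grouping logic as above: every other cell keeps the empty string.
cell_type_selection <- rep("", length(cell_type))
cell_type_selection[cell_type %in% c("micro", "pre-micro")] <- "Microglia"
cell_type_selection[cell_type == "macro"] <- "Macrophages"
cell_type_selection[cell_type == "BAM"] <- "BAM"

print(data.frame(full_cluster_id, cell_type, cell_type_selection))
```

The last cell uses the key `F_ctrl_7`, which is missing from the toy table, so its `cell_type` is `NA`. This is why the naming in the annotation table matters: a mismatch does not stop the code, it leaves cells unannotated.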
5. Plot
# Figure 1d (pie charts)
# Define the color for each cell type
micro<-"#53AFE6"
pre_micro<-"#2DA7C8"
BAM<- "#0DD1AD"
UN<-"grey"
Mo<-"#FCE80C"
Mo_Mg<-"#FABF00"
Mg<-"#E98934"
NK<-"#8c42a3"
ncam<-"#C2B4FC"
NKT<-"#DFA5F2"
DC<-"#bf7a58"
Tcells<-"#94112f"
Bcells<-"#EC5CA5"
freq_list <- lapply(sex_condition_objects, function(x) {
freq <- data.frame(cell_type = x$cell_type)
freq <- freq %>%
group_by(cell_type) %>%
count() %>%
ungroup %>%
mutate(per = `n`/sum(`n`))
freq$cell_type <- factor(freq$cell_type, levels= c("micro", "pre-micro", "macro", "BAM", "NKT","NK", "B-cells", "T-cells","Ncam1+", "DC", "other"))
freq$label <- scales::percent(freq$per)
freq
})
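As a cross-check of the frequency table built above, the same counts and proportions can be computed with base R alone. This is a sketch on invented labels, not the author's data, and `sprintf` stands in for `scales::percent` only to keep the example free of extra packages.

```r
# Invented cell-type labels for ten cells.
cell_type <- factor(
  c("micro", "micro", "micro", "micro", "micro", "micro",
    "BAM", "BAM", "macro", "DC"),
  levels = c("micro", "pre-micro", "macro", "BAM", "DC")
)

counts <- table(cell_type)             # cells per type, unused levels included
counts <- counts[counts > 0]           # drop types absent from this sample
per    <- as.numeric(prop.table(counts))

freq_demo <- data.frame(
  cell_type = names(counts),
  n         = as.integer(counts),
  per       = per,
  label     = sprintf("%.0f%%", 100 * per),   # stand-in for scales::percent()
  stringsAsFactors = FALSE
)
print(freq_demo)
```

The `per` column should always sum to 1 within a sample, which is a quick sanity check before drawing the pie charts.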
cf<-ggplot(freq_list$`GSM4039241-F-ctrl`,
aes(x="", y=per, fill=cell_type))+
geom_bar(stat="identity", width=1, color="white")+
coord_polar("y", start=0)+
scale_fill_manual(values=c(micro, pre_micro, BAM))+
theme_light()+
geom_label_repel(aes(label = label), size=3, show.legend = F, nudge_x = 1)
cm<-ggplot(freq_list$`GSM4039245-M-ctrl`,
aes(x="", y=per, fill=cell_type))+
geom_bar(stat="identity", width=1, color="white")+
coord_polar("y", start=0)+
scale_fill_manual(values=c(micro, pre_micro, BAM, NK, DC,UN))+
theme_light()+
geom_label_repel(aes(label = label), size=3, show.legend = F, nudge_x = 1)
tf<-ggplot(freq_list$`GSM4039243-F-tumor`,
aes(x="", y=per, fill=cell_type))+
geom_bar(stat="identity", width=1, color="white")+
coord_polar("y", start=0)+
scale_fill_manual(values=c(micro, Mo_Mg, BAM, NKT, NK, Bcells, Tcells, ncam, DC,UN))+
theme_light()+
geom_label_repel(aes(label = label), size=3, show.legend = F, nudge_x = 1)
tm<-ggplot(freq_list$`GSM4039247-M-tumor`,
aes(x="", y=per, fill=cell_type))+
geom_bar(stat="identity", width=1, color="white")+
coord_polar("y", start=0)+
scale_fill_manual(values=c(micro, Mo_Mg, BAM, NKT, Bcells,Tcells, DC, UN ))+
theme_light()+
geom_label_repel(aes(label = label), size=3, show.legend = F, nudge_x = 1)
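# Save the four pie charts side by side in one PDF
pdf(file = "pie.pdf",width = 20,height = 10)
print(ggarrange(cf, cm, tf, tm, ncol = 4))  # print() so the figure is also drawn when the script is run with source()
dev.off()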
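The four plots above pass unnamed color vectors to `scale_fill_manual`, so each color is assigned by position among the cell types present in that sample. One possible alternative, sketched below, is a single named palette plus a small helper function. `cell_colors` and `make_pie` are hypothetical names. Mapping "macro" to the `Mo_Mg` color and "other" to grey is an assumption inferred from the positional order of the colors in the code above.

```r
library(ggplot2)

# Named palette: names must equal the cell_type labels.
# "macro" -> Mo_Mg color and "other" -> grey are assumptions inferred
# from the positional order used in the four plots above.
cell_colors <- c(
  "micro" = "#53AFE6", "pre-micro" = "#2DA7C8", "macro" = "#FABF00",
  "BAM" = "#0DD1AD", "NKT" = "#DFA5F2", "NK" = "#8c42a3",
  "B-cells" = "#EC5CA5", "T-cells" = "#94112f", "Ncam1+" = "#C2B4FC",
  "DC" = "#bf7a58", "other" = "grey"
)

# Hypothetical helper: one pie chart from a frequency table with the
# columns cell_type and per.
make_pie <- function(freq, colors = cell_colors) {
  ggplot(freq, aes(x = "", y = per, fill = cell_type)) +
    geom_bar(stat = "identity", width = 1, color = "white") +
    coord_polar("y", start = 0) +
    scale_fill_manual(values = colors) +   # matched by name, not by position
    theme_light()
}

# Toy input with the same columns as one element of freq_list.
freq_demo <- data.frame(
  cell_type = factor(c("micro", "BAM", "DC"), levels = names(cell_colors)),
  per       = c(0.7, 0.2, 0.1),
  label     = c("70%", "20%", "10%")
)
p <- make_pie(freq_demo)
```

With this helper, something like `plots <- lapply(freq_list, make_pie)` would build all four charts in one call, and the label layer from the original code can be added to each with `+ geom_label_repel(...)`. A cell type then keeps the same color in every sample even when the set of types present differs.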
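Compare the result with the figure in the paper.
The proportion of each cell type is largely consistent with the paper. In the tumor samples MG is still the most abundant population, but its share drops and many other cell types appear, which reflects the heterogeneity of the tumor.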