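These are my study notes on "CNS Figure Reproduction 12: Checking the marker genes of the cell subpopulations in the original paper"; I may have adapted some parts as I worked through it.
In the earlier tutorials, we first did a preliminary clustering following "CNS Figure Reproduction 08: General rules for the first round of clustering of tumor single-cell data", as shown below: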
immune (CD45+,PTPRC), epithelial/cancer (EpCAM+,EPCAM), and stromal (CD10+,MME,fibo or CD31+,PECAM1,endo)
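As a rough illustration of this first-pass rule, the base-R sketch below assigns a compartment to each cluster from the average expression of the four marker genes. The matrix `avg_expr`, its values, the cluster names and the helper `assign_compartment` are all invented for illustration; they are not taken from the original analysis.

```r
# Toy per-cluster average expression (rows = marker genes, columns = clusters).
# All values are hypothetical; matrix() fills column by column.
avg_expr <- matrix(
  c(3.1, 0.0, 0.1, 0.0,   # cluster c0
    0.1, 2.8, 0.0, 0.1,   # cluster c1
    0.0, 0.1, 2.2, 0.2,   # cluster c2
    0.1, 0.0, 0.1, 2.5),  # cluster c3
  nrow = 4,
  dimnames = list(c("PTPRC", "EPCAM", "MME", "PECAM1"),
                  c("c0", "c1", "c2", "c3"))
)

# Marker-to-compartment lookup following the rule quoted above.
marker_to_compartment <- c(PTPRC = "immune", EPCAM = "epithelial",
                           MME = "stromal", PECAM1 = "stromal")

# Hypothetical helper: label each cluster by its most highly expressed marker.
assign_compartment <- function(avg_expr, lookup) {
  top_marker <- rownames(avg_expr)[apply(avg_expr, 2, which.max)]
  setNames(unname(lookup[top_marker]), colnames(avg_expr))
}

compartment <- assign_compartment(avg_expr, marker_to_compartment)
print(compartment)
```

Taking the single top marker is the simplest possible rule; on real data one would still look at the dot plot, since a cluster can express none of the four markers convincingly.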
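Then, following "CNS Figure Reproduction 06: Manually validating immune cell subpopulations with the CellMarker website", we split the immune cells into subpopulations. However, I noticed that the paper actually provides its own curated marker genes as the basis for its clustering, listed below: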
Table , Description
1. General Cell Markers , General markers used for differing between non-immune and immune cell types as well as non-immune epithelial cell types
2. COSMIC mutation list , COSMIC Tier 1 genes and overlap with genes used in clinical DNA assays
3. Cancer Cell Signature Genes , Gene lists of each cancer signature
4. Immune Markers , Markers used for differing between primary immune cell types
Now let's check how reliable the paper's cell-subpopulation marker genes are:
First: General Cell Markers
Start by taking the gene names from the General Cell Markers table (general markers used for differing between non-immune and immune cell types as well as non-immune epithelial cell types):
The code is as follows:
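# Start from a clean workspace and keep strings as character vectors
rm(list=ls())
options(stringsAsFactors = FALSE)
library(Seurat)
library(ggplot2)
# Load the Seurat object and the metadata saved after the first round of annotation
load(file = 'first_sce.Rdata')
load(file = 'phe-of-first-anno.Rdata')
sce=sce.first
table(phe$immune_annotation)
# Attach the annotation and build a combined "annotation + cluster" label
sce@meta.data=phe
sce@meta.data$new=paste(phe$immune_annotation,phe$seurat_clusters)
# Genes taken from the General Cell Markers table
genes_to_check=c('PTPRC','CD3G','CD3E','CD79A','BLNK','CD68','CSF1R','MARCO','CD207','PMEL','MLANA','PECAM1','CD34','VWF','EPCAM','SFN','KRT19','ACTA2','MCAM','MYLK','MYL9','FAP','THY1','ALB')
# Dot plot of the markers, grouped by the combined label
p3 <- DotPlot(sce, features = genes_to_check,
              assay='RNA',group.by = 'new' ) #+ coord_flip()
p3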
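The plot shows clearly that the Hepatocytes, which express ALB at a high level, ended up in my stromal group and need to be separated out. The Melanocytes, which express PMEL and MLANA at a high level, were also placed in the stromal group and need to be separated out as well.
There is also a group of cells that expresses epithelial markers such as EPCAM together with MYL9, a Fibroblast gene. This is most likely an impure subpopulation, or in other words a case of doublets.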
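The kind of mixed-lineage check described here can be sketched in base R. The snippet works on a made-up matrix of "fraction of cells expressing" values (the quantity that dot size encodes in a dot plot); the group names, the numbers, the 0.5 cutoff, the pairing of KRT19 with EPCAM as epithelial markers, and the helper `flag_mixed_groups` are all assumptions for illustration only.

```r
# Hypothetical fraction of cells per group with detectable expression of each marker.
pct_expr <- matrix(
  c(0.90, 0.02, 0.75,   # EPCAM
    0.85, 0.01, 0.70,   # KRT19
    0.05, 0.88, 0.65),  # MYL9
  nrow = 3, byrow = TRUE,
  dimnames = list(c("EPCAM", "KRT19", "MYL9"),
                  c("epi_1", "fibro_2", "mixed_3"))
)

# Assumed lineage marker sets (illustrative, not the paper's full lists).
lineage_markers <- list(epithelial = c("EPCAM", "KRT19"),
                        fibroblast = c("MYL9"))

# Hypothetical helper: flag groups in which markers of two or more lineages pass the cutoff.
flag_mixed_groups <- function(pct_expr, lineage_markers, min_pct = 0.5) {
  # One column per lineage: does any marker of that lineage reach min_pct in the group?
  hits <- sapply(lineage_markers, function(genes) {
    apply(pct_expr[genes, , drop = FALSE] >= min_pct, 2, any)
  })
  rowSums(hits) >= 2
}

mixed <- flag_mixed_groups(pct_expr, lineage_markers)
print(names(mixed)[mixed])
```

A flag like this only points at candidate groups; whether they are true doublets or a genuinely mixed population still needs to be judged from the data.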
Next: Immune Markers
Start by taking the gene names from the Immune Markers table (markers used for differing between primary immune cell types).
Continuing from the code above:
# Keep only the cells annotated as immune in the first round
cells.use <- row.names(sce@meta.data)[which(phe$immune_annotation=='immune')]
length(cells.use)
sce <-subset(sce, cells=cells.use)
sce
# Load the manually curated immune-subtype metadata (this overwrites phe);
# its rows are assumed to be in the same order as the cells in sce
load(file = 'phe-of-subtypes-Immune-by-manual.Rdata')
sce@meta.data=phe
table(phe$immuSub)
table(phe$immuSub,phe$seurat_clusters)
# Combined "immune subtype + cluster" label used for grouping
sce@meta.data$new=paste(phe$immuSub,phe$seurat_clusters)
table(sce@meta.data$new)
# Genes taken from the Immune Markers table
genes_to_check=c( 'CD2','CD3D','CD3E','CD3G','MARCO','CSF1R','CD68','GLDN','APOE','CCL3L1',
                  'TREM2','C1QB','NUPR1','FOLR2','RNASE1','C1QA','CD1E','CD1C','FCER1A','PKIB',
                  'CYP2S1','NDRG2','CMA1','MS4A2','TPSAB1','TPSB2','IGLL5','MZB1','JCHAIN','DERL3',
                  'SDC1','MS4A1','BANK1','PAX5','CD79A','PRDM1','XBP1','IRF4','MS4A1','IRF8','ACTB',
                  'GAPDH','MALAT1','FCGR3B','ALPL','CXCR1','CXCR2','ADGRG3','CMTM2','PROK2','MME','MMP25',
                  'TNFRSF10C','SLC32A1','SHD','LRRC26','PACSIN1','LILRA4','CLEC4C','DNASE1L3',
                  'CLEC4C','LRRC26','SCT','LAMP5')
# Some genes appear more than once in the list, so drop the duplicates
genes_to_check=unique(genes_to_check)
# Dot plot of the immune markers, grouped by the combined label
p4 <- DotPlot(sce, features = genes_to_check,
              assay='RNA',group.by = 'new' ) #+ coord_flip()
p4
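The gene list above is deduplicated with unique() before plotting. A quick base-R check on the same vector, before deduplication, shows which symbols were repeated; `genes_raw` and `dup_genes` are just names chosen here for illustration.

```r
# The immune marker list exactly as typed above, before unique() is applied.
genes_raw <- c('CD2','CD3D','CD3E','CD3G','MARCO','CSF1R','CD68','GLDN','APOE','CCL3L1',
               'TREM2','C1QB','NUPR1','FOLR2','RNASE1','C1QA','CD1E','CD1C','FCER1A','PKIB',
               'CYP2S1','NDRG2','CMA1','MS4A2','TPSAB1','TPSB2','IGLL5','MZB1','JCHAIN','DERL3',
               'SDC1','MS4A1','BANK1','PAX5','CD79A','PRDM1','XBP1','IRF4','MS4A1','IRF8','ACTB',
               'GAPDH','MALAT1','FCGR3B','ALPL','CXCR1','CXCR2','ADGRG3','CMTM2','PROK2','MME','MMP25',
               'TNFRSF10C','SLC32A1','SHD','LRRC26','PACSIN1','LILRA4','CLEC4C','DNASE1L3',
               'CLEC4C','LRRC26','SCT','LAMP5')

# duplicated() marks the second and later occurrences of each symbol.
dup_genes <- unique(genes_raw[duplicated(genes_raw)])
print(dup_genes)

# Compare the list length before and after deduplication.
print(c(raw = length(genes_raw), unique = length(unique(genes_raw))))
```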
The genes expressed in every cell subpopulation are the three housekeeping genes: 'ACTB', 'GAPDH', 'MALAT1'.
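The same observation can be made without reading the plot by eye. The base-R sketch below picks out the genes that pass a detection cutoff in every group; the matrix `pct_by_group`, its values, the group names, the 0.5 cutoff and the helper `expressed_everywhere` are invented for illustration and do not come from the real data.

```r
# Hypothetical fraction of cells expressing each gene in each immune group.
pct_by_group <- matrix(
  c(0.99, 0.98, 0.97,   # ACTB
    0.95, 0.96, 0.94,   # GAPDH
    1.00, 0.99, 1.00,   # MALAT1
    0.92, 0.03, 0.02,   # CD3E
    0.01, 0.90, 0.04),  # MS4A1
  nrow = 5, byrow = TRUE,
  dimnames = list(c("ACTB", "GAPDH", "MALAT1", "CD3E", "MS4A1"),
                  c("T_cells", "B_cells", "Macrophages"))
)

# Hypothetical helper: genes whose detection rate reaches min_pct in all groups.
expressed_everywhere <- function(pct, min_pct = 0.5) {
  rownames(pct)[apply(pct >= min_pct, 1, all)]
}

broad_genes <- expressed_everywhere(pct_by_group)
print(broad_genes)
```

Genes that come out of such a filter carry no information for telling the subpopulations apart, which is why they are useful only as a sanity check.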