1. Load the data
library(Seurat)
library(SeuratData)
pbmc <- LoadData("pbmc3k", type = "pbmc3k.final")
> pbmc
An object of class Seurat
13714 features across 2638 samples within 1 assay
Active assay: RNA (13714 features, 2000 variable features)
2 dimensional reductions calculated: pca, umap
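2. Perform default differential expression tests
Most of Seurat's differential expression functionality is accessed through the FindMarkers() function. By default, Seurat performs differential expression testing with the non-parametric Wilcoxon rank sum test, which replaces the previous default test ('bimod'). To test for differential expression between two specific groups of cells, specify the ident.1 and ident.2 parameters.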
# list options for groups to perform differential expression on
> levels(pbmc)
[1] "Naive CD4 T" "Memory CD4 T" "CD14+ Mono" "B" "CD8 T"
[6] "FCGR3A+ Mono" "NK" "DC" "Platelet"
# Find differentially expressed features between CD14+ and FCGR3A+ Monocytes
> monocyte.de.markers <- FindMarkers(pbmc, ident.1 = "CD14+ Mono", ident.2 = "FCGR3A+ Mono")
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=04s
# view results
> head(monocyte.de.markers)
p_val avg_log2FC pct.1 pct.2 p_val_adj
FCGR3A 1.193617e-101 -3.776553 0.131 0.975 1.636926e-97
LYZ 8.134552e-75 2.614275 1.000 0.988 1.115572e-70
RHOC 4.479768e-68 -2.325013 0.162 0.864 6.143554e-64
S100A8 7.471811e-65 3.766437 0.975 0.500 1.024684e-60
S100A9 1.318422e-64 3.299060 0.996 0.870 1.808084e-60
IFITM2 4.821669e-64 -2.085807 0.677 1.000 6.612437e-60
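The p_val_adj column is consistent with a Bonferroni correction over all 13714 features in the object. The following base R check is a minimal sketch using two rows copied from the table above; the variable names are illustrative and not part of Seurat.

```r
# Raw p-values copied from the FindMarkers() output above.
p_val <- c(FCGR3A = 1.193617e-101, LYZ = 8.134552e-75)
# Adjusted p-values reported in the same rows.
reported <- c(FCGR3A = 1.636926e-97, LYZ = 1.115572e-70)

# Number of features in the object (from the printed Seurat summary).
n_features <- 13714

# Bonferroni: multiply each p-value by the number of tests, capped at 1.
p_adj <- p.adjust(p_val, method = "bonferroni", n = n_features)

# Relative difference from the reported values; limited only by print rounding.
rel_diff <- abs(p_adj / reported - 1)
print(rel_diff)
```

Because the correction uses every feature in the object rather than only the features that were tested, the adjusted values are conservative.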
# Find differentially expressed features between CD14+ Monocytes and all other cells, keeping only positive markers
monocyte.de.markers <- FindMarkers(pbmc, ident.1 = "CD14+ Mono", ident.2 = NULL, only.pos = TRUE)
head(monocyte.de.markers)
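To make the default test more concrete, the sketch below runs a Wilcoxon rank sum test on a single simulated feature with base R's wilcox.test(). The expression values are invented for illustration and are not taken from pbmc3k. Seurat applies the same kind of test feature by feature, but its internal implementation may differ.

```r
# Simulated normalized expression of one feature in two groups of cells.
# These numbers are made up; they only stand in for real expression values.
set.seed(42)
group1 <- rnorm(50, mean = 2.0, sd = 0.5)
group2 <- rnorm(50, mean = 0.5, sd = 0.5)

# Two-sided Wilcoxon rank sum test from the stats package.
res <- wilcox.test(group1, group2)

# The p-value plays the role of p_val for this one feature.
print(res$p.value)
```

The test compares ranks rather than raw values, so it makes no assumption about the shape of the expression distribution.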
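3. Prefilter features or cells to speed up DE testing
To speed up marker discovery, particularly for large datasets, Seurat allows features or cells to be pre-filtered. For example, features that are rarely detected in either group of cells, or features that are expressed at similar average levels in both groups, are unlikely to be differentially expressed. Example use cases of the min.pct, logfc.threshold, min.diff.pct, and max.cells.per.ident parameters are shown below.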
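# Only test features detected in at least 50% of cells in either group
head(FindMarkers(pbmc, ident.1 = "CD14+ Mono", ident.2 = "FCGR3A+ Mono", min.pct = 0.5))
# Only test features whose average log fold change between the groups is at least log(2) (about 0.69)
head(FindMarkers(pbmc, ident.1 = "CD14+ Mono", ident.2 = "FCGR3A+ Mono", logfc.threshold = log(2)))
# Only test features whose detection rate differs by at least 25 percentage points between the groups
head(FindMarkers(pbmc, ident.1 = "CD14+ Mono", ident.2 = "FCGR3A+ Mono", min.diff.pct = 0.25))
# Downsample each identity class to at most 200 cells before testing
head(FindMarkers(pbmc, ident.1 = "CD14+ Mono", ident.2 = "FCGR3A+ Mono", max.cells.per.ident = 200))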
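The detection-rate filters can be sketched in base R on a toy matrix. This is an approximation of what min.pct and min.diff.pct do, under the assumption that a feature counts as detected in a cell when its value is above zero; the matrix, group labels, and variable names are hypothetical.

```r
# Toy expression matrix: rows are features, columns are cells (made-up values).
expr <- matrix(
  c(1, 2, 0, 3,   0, 0, 0, 1,
    0, 0, 0, 1,   0, 0, 1, 0,
    2, 1, 3, 1,   1, 2, 2, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("cell", 1:8))
)
group <- rep(c("g1", "g2"), each = 4)

# Fraction of cells in each group where the feature is detected (value > 0).
pct.1 <- rowMeans(expr[, group == "g1", drop = FALSE] > 0)
pct.2 <- rowMeans(expr[, group == "g2", drop = FALSE] > 0)

# min.pct: detected in at least this fraction of cells in either group.
min.pct <- 0.5
pass_min_pct <- pmax(pct.1, pct.2) >= min.pct

# min.diff.pct: detection rates differ by at least this much between groups.
min.diff.pct <- 0.25
pass_diff_pct <- abs(pct.1 - pct.2) >= min.diff.pct

# Features that would still be tested after both filters.
kept <- names(which(pass_min_pct & pass_diff_pct))
print(kept)
```

Here geneB is dropped because it is rarely detected in both groups, and geneC is dropped because it is detected equally often in both, so only geneA goes on to the statistical test.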
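4. Perform DE analysis using alternative tests
The following tests can be selected through the test.use parameter of FindMarkers():
- "wilcox": Wilcoxon rank sum test (default)
- "bimod": likelihood ratio test for single-cell feature expression
- "roc": standard AUC classifier
- "t": Student's t-test
- "poisson": likelihood ratio test assuming an underlying Poisson distribution. Use only for UMI-based datasets
- "negbinom": likelihood ratio test assuming an underlying negative binomial distribution. Use only for UMI-based datasets
- "LR": uses a logistic regression framework to determine differentially expressed genes. Builds a logistic regression model that predicts group membership from each feature individually and compares it to a null model with a likelihood ratio test
- "MAST": GLM framework that treats the cellular detection rate as a covariate
- "DESeq2": DE based on a model using the negative binomial distribution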
# necessary to get MAST to work properly
library(SingleCellExperiment)
head(FindMarkers(pbmc, ident.1 = "CD14+ Mono", ident.2 = "FCGR3A+ Mono", test.use = "MAST"))
head(FindMarkers(pbmc, ident.1 = "CD14+ Mono", ident.2 = "FCGR3A+ Mono", test.use = "DESeq2", max.cells.per.ident = 50))
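The idea behind the "LR" test can be sketched with base R's glm(): fit a logistic regression that predicts group membership from one feature, fit a null model with only an intercept, and compare the two with a likelihood ratio test. The data below are simulated and the variable names are illustrative; this is not Seurat's own code.

```r
# Simulated expression of one feature in two groups of 100 cells each.
set.seed(1)
n <- 100
group <- factor(rep(c("g1", "g2"), each = n))
expr <- c(rnorm(n, mean = 2, sd = 1), rnorm(n, mean = 1, sd = 1))

# Full model: group membership predicted from the feature's expression.
full <- glm(group ~ expr, family = binomial)
# Null model: intercept only.
null <- glm(group ~ 1, family = binomial)

# Likelihood ratio statistic: drop in deviance from the null to the full model.
lr_stat <- deviance(null) - deviance(full)

# The models differ by one parameter, so compare to chi-squared with 1 df.
p_val <- pchisq(lr_stat, df = 1, lower.tail = FALSE)
print(p_val)
```

A small p-value means the feature's expression carries information about which group a cell belongs to, which is the sense in which the feature is called differentially expressed.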