FindMarkers
WeChat official account: BBio
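FindMarkers is the function that identifies differentially expressed genes from clustering results. Ideally, for each cell cluster, the marker genes rank near the top of the upregulated genes.
# FindMarkers results are sorted by p_val in ascending order. The wilcox test uses the values in the data slot. pct.1 and pct.2 are the fractions of cells expressing the gene in the ident.1 and ident.2 groups, respectively.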
df <- FindMarkers(pbmc, ident.1=1, ident.2=0, slot="data", logfc.threshold=0.25, min.pct=0.1, test.use = "wilcox")
head(df)
# p_val avg_logFC pct.1 pct.2 p_val_adj
# TYMP 1.702818e-11 2.539289 1.00 0.111 3.916481e-09
# CST3 4.469249e-11 2.552769 1.00 0.306 1.027927e-08
# S100A8 5.334985e-11 4.037048 0.96 0.111 1.227047e-08
# LYZ 6.997602e-11 3.082150 1.00 0.417 1.609449e-08
# HLA-DRB1 3.287672e-10 3.325130 0.88 0.083 7.561646e-08
# HLA-DPB1 4.061018e-10 3.547416 0.88 0.083 9.340340e-08
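To make the columns concrete, the sketch below recomputes pct.1 and pct.2 on a made-up toy matrix and reproduces the p_val_adj values printed above. It assumes that p_val_adj is a Bonferroni correction over all genes in the object, and that the object holds 230 genes. That number is inferred from the ratio p_val_adj / p_val in the printed output and is not stated in the text. The matrix `expr` and the cell names are hypothetical.

```r
# Toy log-normalized expression matrix: 3 genes x 6 cells (made-up values)
expr <- matrix(
  c(1.2, 0.0, 2.1, 0.0, 0.0, 0.4,
    0.0, 0.0, 0.0, 1.5, 1.1, 0.9,
    0.7, 0.3, 0.0, 0.0, 0.2, 0.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("cell", 1:6))
)
cells.1 <- c("cell1", "cell2", "cell3")  # stands in for ident.1
cells.2 <- c("cell4", "cell5", "cell6")  # stands in for ident.2

# Fraction of cells in each group with expression above zero
pct.1 <- rowMeans(expr[, cells.1] > 0)  # geneA 2/3, geneB 0, geneC 2/3
pct.2 <- rowMeans(expr[, cells.2] > 0)  # geneA 1/3, geneB 1, geneC 1/3

# p_val values copied from the output above
p_val <- c(TYMP = 1.702818e-11, CST3 = 4.469249e-11, S100A8 = 5.334985e-11)
n_genes <- 230  # assumption: inferred from the printed output
p_val_adj <- p.adjust(p_val, method = "bonferroni", n = n_genes)
print(p_val_adj)
```

With these assumptions the adjusted values agree with the p_val_adj column above to the printed precision, which suggests the correction is simply p_val multiplied by the number of genes (capped at 1).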
avg_logFC
Is avg_logFC in the result the natural logarithm of the fold change? Let's look at the source code.
getAnywhere('FindMarkers.default')
# mean.fxn <- if (is.null(x = reduction) && slot != "scale.data") {
# switch(EXPR = slot, data = function(x) {
# return(log(x = rowMeans(x = expm1(x = x)) + pseudocount.use))
# }, function(x) {
# return(log(x = rowMeans(x = x) + pseudocount.use))
# })
# }
# else {
# rowMeans
# }
# data.1 <- mean.fxn(data[features, cells.1, drop = FALSE])
# data.2 <- mean.fxn(data[features, cells.2, drop = FALSE])
# total.diff <- (data.1 - data.2)
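The excerpt above uses R's `log`, which is the natural logarithm, so in this version avg_logFC is ln(mean.1 + pseudocount) minus ln(mean.2 + pseudocount), with the means taken on the back-transformed (expm1) scale. A minimal sketch on made-up numbers follows. It assumes pseudocount.use = 1, and the matrices `norm.1` and `norm.2` are hypothetical. Other Seurat releases may report a different log base, so the installed version's source is worth checking.

```r
pseudocount.use <- 1  # assumption: the default value

# Same computation as the slot = "data" branch in the excerpt above
mean.fxn <- function(x) log(rowMeans(expm1(x)) + pseudocount.use)

# Hypothetical normalized expression of one gene in two groups of 3 cells
norm.1 <- matrix(c(3, 5, 7), nrow = 1, dimnames = list("geneA", NULL))
norm.2 <- matrix(c(1, 1, 1), nrow = 1, dimnames = list("geneA", NULL))

# The data slot stores log1p-transformed values
data.1 <- log1p(norm.1)
data.2 <- log1p(norm.2)

# Group means are 5 and 1, so this is log((5 + 1) / (1 + 1)) = log(3)
avg_logFC <- mean.fxn(data.1) - mean.fxn(data.2)
print(avg_logFC)

# Converting a natural-log fold change to log2
avg_log2FC <- avg_logFC / log(2)
print(avg_log2FC)
```

Note that the mean is taken before the logarithm, so avg_logFC is the log of a ratio of means and not a difference of per-cell log means.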
logfc.threshold and min.pct
Do the fold-change and expression-fraction thresholds affect the run time?
system.time(FindMarkers(pbmc, ident.1=1, logfc.threshold=0.25, min.pct=0.1))
#user system elapsed
#21.154 0.435 21.590
system.time(FindMarkers(pbmc, ident.1=1, logfc.threshold=0, min.pct=0))
#user system elapsed
#330.599 1.365 332.409
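Relaxing the thresholds increases the run time substantially (about 21.6 s versus 332 s elapsed here). The source code shows why: min.pct and logfc.threshold are both applied near the top of the function to filter genes, so looser thresholds leave more genes to be tested.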
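The filtering step can be sketched in base R as follows. This is a simplified illustration and not the Seurat source: the helper `filter_genes` and the toy matrix are hypothetical, and the use of `>=` in the comparisons is an assumption (whether the comparison is strict may differ between versions).

```r
# Simplified pre-filter: keep genes that pass both min.pct and logfc.threshold
filter_genes <- function(expr, cells.1, cells.2,
                         min.pct = 0.1, logfc.threshold = 0.25,
                         pseudocount.use = 1) {
  x1 <- expr[, cells.1, drop = FALSE]
  x2 <- expr[, cells.2, drop = FALSE]
  # Fraction of expressing cells per group; a gene passes if either group qualifies
  keep.pct <- pmax(rowMeans(x1 > 0), rowMeans(x2 > 0)) >= min.pct
  # Natural-log fold change, computed as in the source excerpt for slot = "data"
  mean.fxn <- function(x) log(rowMeans(expm1(x)) + pseudocount.use)
  total.diff <- mean.fxn(x1) - mean.fxn(x2)
  keep.fc <- abs(total.diff) >= logfc.threshold
  rownames(expr)[keep.pct & keep.fc]
}

# Toy data: 4 genes x 6 cells, given on the linear scale and then log1p-transformed
linear <- matrix(
  c(9, 9, 9, 0, 0, 0,   # geneA: strongly up in group 1
    1, 1, 1, 1, 1, 1,   # geneB: expressed everywhere, no difference
    0, 0, 0, 0, 0, 0,   # geneC: not expressed at all
    0, 0, 3, 0, 0, 0),  # geneD: expressed in one cell of group 1
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", LETTERS[1:4]), paste0("cell", 1:6))
)
expr <- log1p(linear)
cells.1 <- paste0("cell", 1:3)
cells.2 <- paste0("cell", 4:6)

strict  <- filter_genes(expr, cells.1, cells.2)  # default thresholds
relaxed <- filter_genes(expr, cells.1, cells.2, min.pct = 0, logfc.threshold = 0)
print(strict)
print(relaxed)
```

With the default thresholds only geneA and geneD remain, while setting both thresholds to 0 keeps all four genes. Because the statistical test is run once per remaining gene, the run time grows with the number of genes that survive this filter.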
Bookmarking markers
# B cells
FeaturePlot(object = pbmc_small, features = c('MS4A1', 'CD19', 'CD79B'), ncol = 3)