SCENIC | 從單細(xì)胞數(shù)據(jù)推斷基因調(diào)控網(wǎng)絡(luò)和細(xì)胞類型

在2019/08/07的Nature刊中殷勘,中科院景乃禾課題組發(fā)表了文章——Molecular architecture of lineage allocation and tissue organization in early mouse embryo 厌殉,我在這篇文章中發(fā)現(xiàn)了一個(gè)被湯神組 (就是Hemberg-lab單細(xì)胞轉(zhuǎn)錄組數(shù)據(jù)分析(二)- 實(shí)驗(yàn)平臺(tái)中開辟了單細(xì)胞轉(zhuǎn)錄組領(lǐng)域的人)反復(fù)用到的R工具-SCENIC皿淋,現(xiàn)在讓我們來(lái)一起看看該工具有何妙用!

SCENIC簡(jiǎn)介

SCENIC是一種同時(shí)重建基因調(diào)控網(wǎng)絡(luò)并從單細(xì)胞RNA-seq數(shù)據(jù)中鑒定stable cell states的工具藐俺∪笄福基于共表達(dá)和DNA淖阍桑基序 (motif)分析推斷基因調(diào)控網(wǎng)絡(luò) ,然后在每個(gè)細(xì)胞中分析網(wǎng)絡(luò)活性以鑒定細(xì)胞狀態(tài)读虏。

SCENIC發(fā)表于2017年的Nature method文章责静。具體見鏈接:

https://www.nature.com/articles/nmeth.4463

image

要求:當(dāng)前版本的SCENIC支持人類,鼠和果蠅(Drosophila melanogaster)盖桥。

要將SCENIC應(yīng)用于其他物種灾螃,需要手動(dòng)調(diào)整第二步(例如使用新的RcisTarget數(shù)據(jù)庫(kù)或使用不同的motif-enrichment-analysis工具)。

輸入:SCENIC需要輸入的是單細(xì)胞RNA-seq表達(dá)矩陣—— 每列對(duì)應(yīng)于樣品(細(xì)胞)揩徊,每行對(duì)應(yīng)一個(gè)基因腰鬼。基因ID應(yīng)該是gene-symbol并存儲(chǔ)為rownames (尤其是基因名字部分是為了與RcisTarget數(shù)據(jù)庫(kù)兼容)塑荒;表達(dá)數(shù)據(jù)是Gene的reads count熄赡。根據(jù)作者的測(cè)試,提供原始的或Normalized UMI count袜炕,無(wú)論是否log轉(zhuǎn)換本谜,或使用TPM值,結(jié)果相差不大偎窘。(Overall, SCENIC is quite robust to this choice, we have applied SCENIC to datasets using raw (logged) UMI counts, normalized UMI counts, and TPM and they all provided reliable results (see Aibar et al. (2017)).)

SCENIC

先安裝R乌助,如果處理大樣本數(shù)據(jù)溜在,則建議使用Python版 (https://pyscenic.readthedocs.io/en/latest/);

SCENIC在R中實(shí)現(xiàn)基于三個(gè)R包:

  1. GENIE3

    推斷基因共表達(dá)網(wǎng)絡(luò)

  2. RcisTarget

    用于分析轉(zhuǎn)錄因子結(jié)合motif

  3. AUCell

    用于鑒定scRNA-seq數(shù)據(jù)中具有活性基因集(基因網(wǎng)絡(luò))的細(xì)胞

運(yùn)行SCENIC需要安裝這些軟件包以及一些額外的依賴包:

if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::version()
# If your bioconductor version is previous to 3.9, see the section bellow
## Required
BiocManager::install(c("AUCell", "RcisTarget"))
BiocManager::install(c("GENIE3")) # Optional. Can be replaced by GRNBoost
## Optional (but highly recommended):
# To score the network on cells (i.e. run AUCell):
BiocManager::install(c("zoo", "mixtools", "rbokeh"))
# For various visualizations and perform t-SNEs:
BiocManager::install(c("DT", "NMF", "pheatmap", "R2HTML", "Rtsne"))
# To support paralell execution (not available in Windows):
BiocManager::install(c("doMC", "doRNG"))
# To export/visualize in http://scope.aertslab.org
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
devtools::install_github("aertslab/SCopeLoomR", build_vignettes = TRUE)

Version: AUCell >=1.4.1 (minimum 1.2.4), RcisTarget>=1.2.0 (minimum 1.0.2) and GENIE3>=1.4.0(minimum 1.2.1).

安裝SCENIC:

if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
devtools::install_github("aertslab/SCENIC")
packageVersion("SCENIC")

下載評(píng)分?jǐn)?shù)據(jù)庫(kù)

除了必要的R包之外,需要下載RcisTarget的物種特定數(shù)據(jù)庫(kù)(https://resources.aertslab.org/cistarget/他托;主題排名)掖肋。默認(rèn)情況下,SCENIC使用在基因啟動(dòng)子(TSS上游500 bp)和TSS周圍 20 kb (+/- 10kb)中對(duì)模序進(jìn)行評(píng)分的數(shù)據(jù)庫(kù)赏参。

  • For human:

    dbFiles <- c("https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-500bp-upstream-7species.mc9nr.feather",
    "https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-tss-centered-10kb-7species.mc9nr.feather")
    # mc9nr: Motif collection version 9: 24k motifs

  • For mouse:

dbFiles <- c("https://resources.aertslab.org/cistarget/databases/mus_musculus/mm9/refseq_r45/mc9nr/gene_based/mm9-500bp-upstream-7species.mc9nr.feather",
"https://resources.aertslab.org/cistarget/databases/mus_musculus/mm9/refseq_r45/mc9nr/gene_based/mm9-tss-centered-10kb-7species.mc9nr.feather")
# mc9nr: Motif collection version 9: 24k motifs

  • For fly:

dbFiles <- c("https://resources.aertslab.org/cistarget/databases/drosophila_melanogaster/dm6/flybase_r6.02/mc8nr/gene_based/dm6-5kb-upstream-full-tx-11species.mc8nr.feather")
# mc8nr: Motif collection version 8: 20k motifs

注意:下載后最好確認(rèn)下載的數(shù)據(jù)是否完整志笼,可基于MD5值評(píng)估。(參考鏈接:https://resources.aertslab.org/cistarget/databases/sha256sum.txt

完成這些設(shè)置步驟后把篓,SCENIC即可運(yùn)行纫溃!

數(shù)據(jù)格式不同時(shí)如何讀入?

最終讀入的信息有兩個(gè)韧掩,一個(gè)是前面說(shuō)的表達(dá)矩陣紊浩,還有一個(gè)是樣品分組信息。

  • a) From .loom file

.loom文件可以通過(guò)SCopeLoomR包直接導(dǎo)入SCENIC疗锐。(loom格式是用于存儲(chǔ)非常大的組學(xué)數(shù)據(jù)集的專屬格式坊谁,具體見 http://linnarssonlab.org/loompy/)

## Download:download.file("http://loom.linnarssonlab.org/clone/Previously%20Published/Cortex.loom", "Cortex.loom")
loomPath <- "Cortex.loom"
  • b) From 10X/CellRanger output files

10X/CellRanger輸出結(jié)果可用作SCENIC的輸入文件 (需要先安裝Seurat)。

# BiocManager::install("Seurat")
# 測(cè)試數(shù)據(jù)也可以從Seurat官網(wǎng)下載滑臊,自己修改路徑
# 測(cè)試數(shù)據(jù):https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/output/matrices#r-load-mat>
singleCellMatrix <- Seurat::Read10X(data.dir="data/pbmc3k/filtered_gene_bc_matrices/hg19/")
cellInfo <- data.frame(seuratCluster=Idents(seuratObject))
  • c) From other R objects (e.g. Seurat, SingleCellExperiment)
sce <- load_as_sce(dataPath) # any SingleCellExperiment object
exprMat <- counts(sce)
cellInfo <- colData(sce)
  • d) From GEO

從GEO下載和格式化數(shù)據(jù)集的示例(例如口芍,GEOGSE60361:3005小鼠腦細(xì)胞數(shù)據(jù)集)

# 獲取數(shù)據(jù)及數(shù)據(jù)標(biāo)注
# dir.create("SCENIC_MouseBrain"); setwd("SCENIC_MouseBrain") # if needed
# (This may take a few minutes)
if (!requireNamespace("GEOquery", quietly = TRUE)) BiocManager::install("GEOquery")
library(GEOquery)
geoFile <- getGEOSuppFiles("GSE60361", makeDirectory=FALSE)
gzFile <- grep("Expression", basename(rownames(geoFile)), value=TRUE)
txtFile <- gsub(".gz", "", gzFile)
gunzip(gzFile, destname=txtFile, remove=TRUE)
library(data.table)
geoData <- fread(txtFile, sep="\t")
geneNames <- unname(unlist(geoData[,1, with=FALSE]))
exprMatrix <- as.matrix(geoData[,-1, with=FALSE])
rm(geoData)
dim(exprMatrix)
rownames(exprMatrix) <- geneNames
exprMatrix <- exprMatrix[unique(rownames(exprMatrix)),]
exprMatrix[1:5,1:4]
# Remove file downloaded:
file.remove(txtFile)

運(yùn)行 SCENIC

SCENIC workflow:

建立基因調(diào)控網(wǎng)絡(luò)Gene Regulation Network,GRN):

  1. 基于共表達(dá)識(shí)別每個(gè)轉(zhuǎn)錄因子TF的潛在靶標(biāo)雇卷。
    過(guò)濾表達(dá)矩陣并運(yùn)行GENIE3或者GRNBoost鬓椭,它們是利用表達(dá)矩陣推斷基因調(diào)控網(wǎng)絡(luò)的一種算法,能得到轉(zhuǎn)錄因子和潛在靶標(biāo)的相關(guān)性網(wǎng)絡(luò)聋庵;
    將目標(biāo)從GENIE3或者GRNBoost格式轉(zhuǎn)為共表達(dá)模塊膘融。

  2. 根據(jù)DNA模序分析(motif)選擇潛在的直接結(jié)合靶標(biāo)(調(diào)節(jié)因子)(利用RcisTarget包:TF基序分析)

確定細(xì)胞狀態(tài)及其調(diào)節(jié)因子

  1. 分析每個(gè)細(xì)胞中的網(wǎng)絡(luò)活性(AUCell)
  • 在細(xì)胞中評(píng)分調(diào)節(jié)子(計(jì)算AUC)

SCENIC完整流程

### 導(dǎo)入數(shù)據(jù)
loomPath <- system.file(package="SCENIC", "examples/mouseBrain_toy.loom")
library(SCopeLoomR)
loom <- open_loom(loomPath, mode="r")
exprMat <- get_dgem(loom)
cellInfo <- get_cellAnnotation(loom)
close_loom(loom)
### Initialize settings 初始設(shè)置,導(dǎo)入評(píng)分?jǐn)?shù)據(jù)庫(kù)
library(SCENIC)
scenicOptions <- initializeScenic(org="mgi", dbDir="cisTarget_databases", nCores=10)
# scenicOptions@inputDatasetInfo$cellInfo <- "int/cellInfo.Rds"
saveRDS(scenicOptions, file="int/scenicOptions.Rds")
### 共表達(dá)網(wǎng)絡(luò)
genesKept <- geneFiltering(exprMat, scenicOptions)
exprMat_filtered <- exprMat[genesKept, ]
runCorrelation(exprMat_filtered, scenicOptions)
exprMat_filtered_log <- log2(exprMat_filtered+1)
runGenie3(exprMat_filtered_log, scenicOptions)
### Build and score the GRN 構(gòu)建并給基因調(diào)控網(wǎng)絡(luò)(GRN)打分
exprMat_log <- log2(exprMat+1)
scenicOptions@settings$dbs <- scenicOptions@settings$dbs["10kb"] # Toy run settings
runSCENIC_1_coexNetwork2modules(scenicOptions)
runSCENIC_2_createRegulons(scenicOptions, coexMethod=c("top5perTarget")) # Toy run settings
runSCENIC_3_scoreCells(scenicOptions, exprMat_log)
# Export:
export2scope(scenicOptions, exprMat)
# Binarize activity?
# aucellApp <- plotTsne_AUCellApp(scenicOptions, exprMat_log)
# savedSelections <- shiny::runApp(aucellApp)
# newThresholds <- savedSelections$thresholds
# scenicOptions@fileNames$int["aucell_thresholds",1] <- "int/newThresholds.Rds"
# saveRDS(newThresholds, file=getIntName(scenicOptions, "aucell_thresholds"))
# saveRDS(scenicOptions, file="int/scenicOptions.Rds")
runSCENIC_4_aucell_binarize(scenicOptions)
### Exploring output
# Check files in folder 'output'
# .loom file @ http://scope.aertslab.org
# output/Step2_MotifEnrichment_preview.html in detail/subset:
motifEnrichment_selfMotifs_wGenes <- loadInt(scenicOptions, "motifEnrichment_selfMotifs_wGenes")
tableSubset <- motifEnrichment_selfMotifs_wGenes[highlightedTFs=="Sox8"]
viewMotifs(tableSubset)
# output/Step2_regulonTargetsInfo.tsv in detail:
regulonTargetsInfo <- loadInt(scenicOptions, "regulonTargetsInfo")
tableSubset <- regulonTargetsInfo[TF=="Stat6" & highConfAnnot==TRUE]
viewMotifs(tableSubset)

舉個(gè)栗子祭玉!

輸入表達(dá)矩陣

在本教程中氧映,我們提供了一個(gè)示例,樣本是小鼠大腦的200個(gè)細(xì)胞和862個(gè)基因:

loomPath <- system.file(package="SCENIC", "examples/mouseBrain_toy.loom")

打開loom文件并加載表達(dá)矩陣脱货;

library(SCopeLoomR)
loom <- open_loom(loomPath, mode="r")
exprMat <- get_dgem(loom)
cellInfo <- get_cellAnnotation(loom)
close_loom(loom)
dim(exprMat)
image

細(xì)胞信息/表型

# cellInfo$nGene <- colSums(exprMat>0)
head(cellInfo)
image
cellInfo <- data.frame(cellInfo)
cellTypeColumn <- "Class"
colnames(cellInfo)[which(colnames(cellInfo)==cellTypeColumn)] <- "CellType"
cbind(table(cellInfo$CellType))
image
# Color to assign to the variables (same format as for NMF::aheatmap)
colVars <- list(CellType=c("microglia"="forestgreen",
                           "endothelial-mural"="darkorange",
                           "astrocytes_ependymal"="magenta4",
                           "oligodendrocytes"="hotpink",
                           "interneurons"="red3",
                           "pyramidal CA1"="skyblue",
                           "pyramidal SS"="darkblue"))
colVars$CellType <- colVars$CellType[intersect(names(colVars$CellType), cellInfo$CellType)]
saveRDS(colVars, file="int/colVars.Rds")
plot.new(); legend(0,1, fill=colVars$CellType, legend=names(colVars$CellType))

image

初始化SCENIC設(shè)置

為了在SCENIC的多個(gè)步驟中保持設(shè)置一致岛都,SCENIC包中的大多數(shù)函數(shù)使用一個(gè)公共對(duì)象,該對(duì)象存儲(chǔ)當(dāng)前運(yùn)行的選項(xiàng)并代替大多數(shù)函數(shù)的“參數(shù)”振峻。比如下面的org臼疫,dbDir等,可以在開始就將物種rog(mgi—— mouse扣孟, hgnc —— human烫堤, dmel —— fly)和RcisTarge數(shù)據(jù)庫(kù)位置分別讀給對(duì)象orgdbDir,之后統(tǒng)一用函數(shù)initializeScenic得到對(duì)象scenicOptions鸽斟。具體參數(shù)設(shè)置可以用?initializeScenichelp一下拔创。

library(SCENIC)
org="mgi" # or hgnc, or dmel
dbDir="cisTarget_databases" # RcisTarget databases location
myDatasetTitle="SCENIC example on Mouse brain" # choose a name for your analysis
data(defaultDbNames)
dbs <- defaultDbNames[[org]]
scenicOptions <- initializeScenic(org=org, dbDir=dbDir, dbs=dbs, datasetTitle=myDatasetTitle, nCores=10)
image
# Modify if needed
scenicOptions@inputDatasetInfo$cellInfo <- "int/cellInfo.Rds"
scenicOptions@inputDatasetInfo$colVars <- "int/colVars.Rds"
# Databases:
# scenicOptions@settings$dbs <- c("mm9-5kb-mc8nr"="mm9-tss-centered-5kb-10species.mc8nr.feather")
# scenicOptions@settings$db_mcVersion <- "v8"
# Save to use at a later time...
saveRDS(scenicOptions, file="int/scenicOptions.Rds")

共表達(dá)網(wǎng)絡(luò)

SCENIC工作流程的第一步是根據(jù)表達(dá)數(shù)據(jù)推斷潛在的轉(zhuǎn)錄因子靶標(biāo)。為此富蓄,我們使用GENIE3或GRNBoost剩燥,輸入文件是表達(dá)矩陣(過(guò)濾后的)和轉(zhuǎn)錄因子列表。GENIE3/GRBBoost的輸出結(jié)果和相關(guān)矩陣將用于創(chuàng)建共表達(dá)模塊(runSCENIC_1_coexNetwork2modules())立倍。

基因過(guò)濾/選擇

  • 按每個(gè)基因的reads總數(shù)進(jìn)行過(guò)濾灭红。

    該filter旨在去除最可能是噪音的基因。

    默認(rèn)情況下口注,它(minCountsPerGene)保留所有樣品中至少帶有6個(gè)UMI reads的基因(例如变擒,如果在1%的細(xì)胞中以3的值表達(dá),則基因?qū)⒕哂械目倲?shù))寝志。

  • 通過(guò)基因的細(xì)胞數(shù)來(lái)實(shí)現(xiàn)過(guò)濾(例如 UMI > 0 赁项,或log 2(TPM)> 1 )。

    默認(rèn)情況下(minSamples)澈段,保留下來(lái)的基因能在至少1%的細(xì)胞中檢測(cè)得到。

  • 最后舰攒,只保留RcisTarget數(shù)據(jù)庫(kù)中可用的基因败富。

# (Adjust minimum values according to your dataset)
genesKept <- geneFiltering(exprMat, scenicOptions=scenicOptions,
                           minCountsPerGene=3*.01*ncol(exprMat),
                           minSamples=ncol(exprMat)*.01)

image

在進(jìn)行網(wǎng)絡(luò)推斷之前,檢查是否有任何已知的相關(guān)基因被過(guò)濾掉(如果缺少任何相關(guān)基因摩窃,請(qǐng)仔細(xì)檢查filter設(shè)置是否合適):


interestingGenes <- c("Sox9", "Sox10", "Dlx5")
# any missing?
interestingGenes[which(!interestingGenes %in% genesKept)]

相關(guān)性

GENIE33或者GRNBoost可以檢測(cè)正負(fù)關(guān)聯(lián)兽叮。為了區(qū)分潛在的激活和抑制,我們將目標(biāo)分為正相關(guān)和負(fù)相關(guān)目標(biāo)(比如TF與潛在目標(biāo)之間的Spearman相關(guān)性)猾愿。

runCorrelation(exprMat_filtered, scenicOptions)

運(yùn)行GENIE3得到潛在轉(zhuǎn)錄因子TF


## If launched in a new session, you will need to reload...
# setwd("...")
# loomPath <- "..."
# loom <- open_loom(loomPath, mode="r")
# exprMat <- get_dgem(loom)
# close_loom(loom)
# genesKept <- loadInt(scenicOptions, "genesKept")
# exprMat_filtered <- exprMat[genesKept,]
# library(SCENIC)
# scenicOptions <- readRDS("int/scenicOptions.Rds")
# Optional: add log (if it is not logged/normalized already)
exprMat_filtered <- log2(exprMat_filtered+1)
# Run GENIE3
runGenie3(exprMat_filtered, scenicOptions)

構(gòu)建并評(píng)分GRN(runSCENIC_ …)

必要時(shí)重新加載表達(dá)式矩陣:


loom <- open_loom(loomPath, mode="r")
exprMat <- get_dgem(loom)
close_loom(loom)
# Optional: log expression (for TF expression plot, it does not affect any other calculation)
logMat <- log2(exprMat+1)
dim(exprMat)

使用wrapper函數(shù)運(yùn)行其余步驟:


library(SCENIC)
scenicOptions <- readRDS("int/scenicOptions.Rds")
scenicOptions@settings$verbose <- TRUE
scenicOptions@settings$nCores <- 10
scenicOptions@settings$seed <- 123
# For a very quick run:
# coexMethod=c("top5perTarget")
scenicOptions@settings$dbs <- scenicOptions@settings$dbs["10kb"] # For toy run
# save...
runSCENIC_1_coexNetwork2modules(scenicOptions)
runSCENIC_2_createRegulons(scenicOptions, coexMethod=c("top5perTarget")) #** Only for toy run!!
runSCENIC_3_scoreCells(scenicOptions, logMat)

可選步驟

將network activity轉(zhuǎn)換成ON/OFF(二進(jìn)制)格式

nPcs <- c(5) # For toy dataset
# nPcs <- c(5,15,50)

scenicOptions@settings$seed <- 123 # same seed for all of them
# Run t-SNE with different settings:
fileNames <- tsneAUC(scenicOptions, aucType="AUC", nPcs=nPcs, perpl=c(5,15,50))
fileNames <- tsneAUC(scenicOptions, aucType="AUC", nPcs=nPcs, perpl=c(5,15,50), onlyHighConf=TRUE, filePrefix="int/tSNE_oHC")
# Plot as pdf (individual files in int/):
fileNames <- paste0("int/",grep(".Rds", grep("tSNE_", list.files("int"), value=T), value=T))


par(mfrow=c(length(nPcs), 3))
fileNames <- paste0("int/",grep(".Rds", grep("tSNE_AUC", list.files("int"), value=T, perl = T), value=T))
plotTsne_compareSettings(fileNames, scenicOptions, showLegend=FALSE, varName="CellType", cex=.5)

image

# Using only "high-confidence" regulons (normally similar)
par(mfrow=c(3,3))
fileNames <- paste0("int/",grep(".Rds", grep("tSNE_oHC_AUC", list.files("int"), value=T, perl = T), value=T))
plotTsne_compareSettings(fileNames, scenicOptions, showLegend=FALSE, varName="CellType", cex=.5)

輸出到 loom/SCope

SCENIC生成的結(jié)果既能在http://scope.aertslab.org查看鹦聪,還能用函數(shù)export2scope()(需要SCopeLoomR包)保存成.loom文件。

# DGEM (Digital gene expression matrix)
# (non-normalized counts)
# exprMat <- get_dgem(open_loom(loomPath))
# dgem <- exprMat
# head(colnames(dgem))  #should contain the Cell ID/name
# Export:
scenicOptions@fileNames$output["loomFile",] <- "output/mouseBrain_SCENIC.loom"
export2scope(scenicOptions, exprMat)

加載.loom文件中的結(jié)果

SCopeLoomR中也有函數(shù)可以導(dǎo)入.loom文件中的內(nèi)容蒂秘,比如調(diào)節(jié)因子泽本,AUC和封裝內(nèi)容(比如regulon activityt-SNE和UMAP結(jié)果)。


library(SCopeLoomR)
scenicLoomPath <- getOutName(scenicOptions, "loomFile")
loom <- open_loom(scenicLoomPath)
# Read information from loom file:
regulons_incidMat <- get_regulons(loom)
regulons <- regulonsToGeneLists(regulons_incidMat)
regulonsAUC <- get_regulonsAuc(loom)
regulonsAucThresholds <- get_regulonThresholds(loom)
embeddings <- get_embeddings(loom)

解讀結(jié)果

1. 細(xì)胞狀態(tài)

AUCell提供跨細(xì)胞的調(diào)節(jié)子的活性,AUCell使用“Area under Curve 曲線下面積”(AUC)來(lái)計(jì)算輸入基因集的關(guān)鍵子集是否在每個(gè)細(xì)胞的表達(dá)基因中富集姻僧。通過(guò)該調(diào)節(jié)子活性(連續(xù)或二進(jìn)制AUC矩陣)來(lái)聚類細(xì)胞规丽,我們可以看出是否存在傾向于具有相同調(diào)節(jié)子活性的細(xì)胞群,并揭示在多個(gè)細(xì)胞中反復(fù)發(fā)生的網(wǎng)絡(luò)狀態(tài)撇贺。這些狀態(tài)等同于網(wǎng)絡(luò)的吸引子狀態(tài)赌莺。將這些聚類與不同的可視化方法相結(jié)合,我們可以探索細(xì)胞狀態(tài)與特定調(diào)節(jié)子的關(guān)聯(lián)松嘶。

將AUC和TF表達(dá)投射到t-SNE上


logMat <- exprMat # Better if it is logged/normalized
aucellApp <- plotTsne_AUCellApp(scenicOptions, logMat) # default t-SNE
savedSelections <- shiny::runApp(aucellApp)

print(tsneFileName(scenicOptions))

tSNE_scenic <- readRDS(tsneFileName(scenicOptions))
aucell_regulonAUC <- loadInt(scenicOptions, "aucell_regulonAUC")
# Show TF expression:
par(mfrow=c(2,3))
AUCell::AUCell_plotTSNE(tSNE_scenic$Y, exprMat, aucell_regulonAUC[onlyNonDuplicatedExtended(rownames(aucell_regulonAUC))[c("Dlx5", "Sox10", "Sox9","Irf1", "Stat6")],], plots="Expression")

image

# Save AUC as PDF:
Cairo::CairoPDF("output/Step4_BinaryRegulonActivity_tSNE_colByAUC.pdf", width=20, height=15)
par(mfrow=c(4,6))
AUCell::AUCell_plotTSNE(tSNE_scenic$Y, cellsAUC=aucell_regulonAUC, plots="AUC")
dev.off()
library(KernSmooth)
library(RColorBrewer)
dens2d <- bkde2D(tSNE_scenic$Y, 1)$fhat
image(dens2d, col=brewer.pal(9, "YlOrBr"), axes=FALSE)
contour(dens2d, add=TRUE, nlevels=5, drawlabels=FALSE)
image

#par(bg = "black")
par(mfrow=c(1,2))
regulonNames <- c( "Dlx5","Sox10")
cellCol <- plotTsne_rgb(scenicOptions, regulonNames, aucType="AUC", aucMaxContrast=0.6)
text(0, 10, attr(cellCol,"red"), col="red", cex=.7, pos=4)
text(-20,-10, attr(cellCol,"green"), col="green3", cex=.7, pos=4)
regulonNames <- list(red=c("Sox10", "Sox8"),
                     green=c("Irf1"),
                     blue=c( "Tef"))
cellCol <- plotTsne_rgb(scenicOptions, regulonNames, aucType="Binary")
text(5, 15, attr(cellCol,"red"), col="red", cex=.7, pos=4)
text(5, 15-4, attr(cellCol,"green"), col="green3", cex=.7, pos=4)
text(5, 15-8, attr(cellCol,"blue"), col="blue", cex=.7, pos=4)

image

GRN:Regulon靶標(biāo)和模序


regulons <- loadInt(scenicOptions, "regulons")
regulons[c("Dlx5", "Irf1")]
image

regulons <- loadInt(scenicOptions, "aucell_regulons")
head(cbind(onlyNonDuplicatedExtended(names(regulons))))

image

regulonTargetsInfo <- loadInt(scenicOptions, "regulonTargetsInfo")
tableSubset <- regulonTargetsInfo[TF=="Stat6" & highConfAnnot==TRUE]
viewMotifs(tableSubset)

2. 細(xì)胞群的調(diào)控因子

regulonAUC <- loadInt(scenicOptions, "aucell_regulonAUC")
regulonAUC <- regulonAUC[onlyNonDuplicatedExtended(rownames(regulonAUC)),]
regulonActivity_byCellType <- sapply(split(rownames(cellInfo), cellInfo$CellType),
                                     function(cells) rowMeans(getAUC(regulonAUC)[,cells]))
regulonActivity_byCellType_Scaled <- t(scale(t(regulonActivity_byCellType), center = T, scale=T))
pheatmap::pheatmap(regulonActivity_byCellType_Scaled, #fontsize_row=3,
                   color=colorRampPalette(c("blue","white","red"))(100), breaks=seq(-3, 3, length.out = 100),
                   treeheight_row=10, treeheight_col=10, border_color=NA)

image

# filename="regulonActivity_byCellType.pdf", width=10, height=20)
topRegulators <- reshape2::melt(regulonActivity_byCellType_Scaled)
colnames(topRegulators) <- c("Regulon", "CellType", "RelativeActivity")
topRegulators <- topRegulators[which(topRegulators$RelativeActivity>0),]
viewTable(topRegulators)


minPerc <- .7
binaryRegulonActivity <- loadInt(scenicOptions, "aucell_binary_nonDupl")
cellInfo_binarizedCells <- cellInfo[which(rownames(cellInfo)%in% colnames(binaryRegulonActivity)),, drop=FALSE]
regulonActivity_byCellType_Binarized <- sapply(split(rownames(cellInfo_binarizedCells), cellInfo_binarizedCells$CellType),
                                               function(cells) rowMeans(binaryRegulonActivity[,cells, drop=FALSE]))
binaryActPerc_subset <- regulonActivity_byCellType_Binarized[which(rowSums(regulonActivity_byCellType_Binarized>minPerc)>0),]
pheatmap::pheatmap(binaryActPerc_subset, # fontsize_row=5,
                   color = colorRampPalette(c("white","pink","red"))(100), breaks=seq(0, 1, length.out = 100),
                   treeheight_row=10, treeheight_col=10, border_color=NA)

image

參考文獻(xiàn)

Aibar, Sara, Carmen Bravo González-Blas, Thomas Moerman, Jasper Wouters, Van Anh Huynh-Thu, Hana Imrichová, Zeynep Kalender Atak, et al. 2017. “SCENIC: Single-Cell Regulatory Network Inference and Clustering.” Nature Methods 14 (october): 1083–6. doi:10.1038/nmeth.4463.

Davie, Kristofer, Jasper Janssens, Duygu Koldere, and “et al.” 2018. “A Single-Cell Transcriptome Atlas of the Aging Drosophila Brain.” Cell, june. doi:10.1016/j.cell.2018.05.057.

Huynh-Thu, Van Anh, Alexandre Irrthum, Louis Wehenkel, and Pierre Geurts. 2010. “Inferring Regulatory Networks from Expression Data Using Tree-Based Methods.” PloS One 5 (9). doi:10.1371/journal.pone.0012776.

Marbach, Daniel, James C. Costello, Robert Küffner, Nicole M. Vega, Robert J. Prill, Diogo M. Camacho, Kyle R. Allison, et al. 2012. “Wisdom of Crowds for Robust Gene Network Inference.” Nature Methods 9 (8): 796–804. doi:10.1038/nmeth.2016.

https://rawcdn.githack.com/aertslab/SCENIC/0a4c96ed8d930edd8868f07428090f9dae264705/inst/doc/SCENIC_Running.html#directories

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者
  • 序言:七十年代末艘狭,一起剝皮案震驚了整個(gè)濱河市,隨后出現(xiàn)的幾起案子,更是在濱河造成了極大的恐慌巢音,老刑警劉巖遵倦,帶你破解...
    沈念sama閱讀 216,324評(píng)論 6 498
  • 序言:濱河連續(xù)發(fā)生了三起死亡事件,死亡現(xiàn)場(chǎng)離奇詭異港谊,居然都是意外死亡骇吭,警方通過(guò)查閱死者的電腦和手機(jī),發(fā)現(xiàn)死者居然都...
    沈念sama閱讀 92,356評(píng)論 3 392
  • 文/潘曉璐 我一進(jìn)店門歧寺,熙熙樓的掌柜王于貴愁眉苦臉地迎上來(lái)燥狰,“玉大人,你說(shuō)我怎么就攤上這事斜筐×拢” “怎么了?”我有些...
    開封第一講書人閱讀 162,328評(píng)論 0 353
  • 文/不壞的土叔 我叫張陵顷链,是天一觀的道長(zhǎng)目代。 經(jīng)常有香客問(wèn)我,道長(zhǎng)嗤练,這世上最難降的妖魔是什么榛了? 我笑而不...
    開封第一講書人閱讀 58,147評(píng)論 1 292
  • 正文 為了忘掉前任,我火速辦了婚禮煞抬,結(jié)果婚禮上霜大,老公的妹妹穿的比我還像新娘。我一直安慰自己革答,他們只是感情好战坤,可當(dāng)我...
    茶點(diǎn)故事閱讀 67,160評(píng)論 6 388
  • 文/花漫 我一把揭開白布。 她就那樣靜靜地躺著残拐,像睡著了一般途茫。 火紅的嫁衣襯著肌膚如雪。 梳的紋絲不亂的頭發(fā)上溪食,一...
    開封第一講書人閱讀 51,115評(píng)論 1 296
  • 那天囊卜,我揣著相機(jī)與錄音,去河邊找鬼错沃。 笑死边败,一個(gè)胖子當(dāng)著我的面吹牛,可吹牛的內(nèi)容都是我干的捎废。 我是一名探鬼主播笑窜,決...
    沈念sama閱讀 40,025評(píng)論 3 417
  • 文/蒼蘭香墨 我猛地睜開眼,長(zhǎng)吁一口氣:“原來(lái)是場(chǎng)噩夢(mèng)啊……” “哼登疗!你這毒婦竟也來(lái)了排截?” 一聲冷哼從身側(cè)響起嫌蚤,我...
    開封第一講書人閱讀 38,867評(píng)論 0 274
  • 序言:老撾萬(wàn)榮一對(duì)情侶失蹤,失蹤者是張志新(化名)和其女友劉穎断傲,沒(méi)想到半個(gè)月后脱吱,有當(dāng)?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體,經(jīng)...
    沈念sama閱讀 45,307評(píng)論 1 310
  • 正文 獨(dú)居荒郊野嶺守林人離奇死亡认罩,尸身上長(zhǎng)有42處帶血的膿包…… 初始之章·張勛 以下內(nèi)容為張勛視角 年9月15日...
    茶點(diǎn)故事閱讀 37,528評(píng)論 2 332
  • 正文 我和宋清朗相戀三年箱蝠,在試婚紗的時(shí)候發(fā)現(xiàn)自己被綠了。 大學(xué)時(shí)的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片垦垂。...
    茶點(diǎn)故事閱讀 39,688評(píng)論 1 348
  • 序言:一個(gè)原本活蹦亂跳的男人離奇死亡宦搬,死狀恐怖,靈堂內(nèi)的尸體忽然破棺而出劫拗,到底是詐尸還是另有隱情间校,我是刑警寧澤,帶...
    沈念sama閱讀 35,409評(píng)論 5 343
  • 正文 年R本政府宣布页慷,位于F島的核電站憔足,受9級(jí)特大地震影響,放射性物質(zhì)發(fā)生泄漏酒繁。R本人自食惡果不足惜滓彰,卻給世界環(huán)境...
    茶點(diǎn)故事閱讀 41,001評(píng)論 3 325
  • 文/蒙蒙 一、第九天 我趴在偏房一處隱蔽的房頂上張望州袒。 院中可真熱鬧找蜜,春花似錦、人聲如沸稳析。這莊子的主人今日做“春日...
    開封第一講書人閱讀 31,657評(píng)論 0 22
  • 文/蒼蘭香墨 我抬頭看了看天上的太陽(yáng)彰居。三九已至,卻和暖如春撰筷,著一層夾襖步出監(jiān)牢的瞬間陈惰,已是汗流浹背。 一陣腳步聲響...
    開封第一講書人閱讀 32,811評(píng)論 1 268
  • 我被黑心中介騙來(lái)泰國(guó)打工毕籽, 沒(méi)想到剛下飛機(jī)就差點(diǎn)兒被人妖公主榨干…… 1. 我叫王不留抬闯,地道東北人。 一個(gè)月前我還...
    沈念sama閱讀 47,685評(píng)論 2 368
  • 正文 我出身青樓关筒,卻偏偏與公主長(zhǎng)得像溶握,于是被迫代替她去往敵國(guó)和親。 傳聞我的和親對(duì)象是個(gè)殘疾皇子蒸播,可洞房花燭夜當(dāng)晚...
    茶點(diǎn)故事閱讀 44,573評(píng)論 2 353

推薦閱讀更多精彩內(nèi)容