SCENIC/pySCENIC分析補(bǔ)充+小鼠數(shù)據(jù)示例2022-12-14

適用背景

之前寫(xiě)了兩篇博客(四步完成單細(xì)胞數(shù)據(jù)調(diào)控網(wǎng)絡(luò)流程分析-SCENIC/pySCENIC-2022-09-06SCENIC/pySCENIC結(jié)果可視化 2022-11-08)介紹SCENIC/pySCENIC的使用,最近在使用的時(shí)候遇到一些問(wèn)題阐污,因此這篇文章作為補(bǔ)充迫淹,如果看不懂本篇可以查看前兩篇博客末秃。

補(bǔ)充內(nèi)容主要有以下幾點(diǎn):

  • 1、小鼠的數(shù)據(jù)庫(kù)構(gòu)建
  • 2蒸痹、從四步流程縮減到三步
  • 3、環(huán)境構(gòu)建遇到的一些errors
  • 4、SCENIC版本的bugs

小鼠的數(shù)據(jù)庫(kù)構(gòu)建

之前的博客構(gòu)建的是人的數(shù)據(jù)庫(kù)蛔糯,但博主最近分析需要用到小鼠的,因此需要構(gòu)建一下新的數(shù)據(jù)庫(kù)窖式。正如之前博客寫(xiě)到的蚁飒,其實(shí)構(gòu)建這個(gè)數(shù)據(jù)庫(kù)只需要替換3個(gè)文件,轉(zhuǎn)錄因子(TF)列表文件萝喘,feather文件已經(jīng)tbl文件淮逻,建議使用最新的v10版本,當(dāng)然也可以根據(jù)網(wǎng)站提示挑選適合自己數(shù)據(jù)的文件阁簸。我下載的文件如下:


從四步流程縮減到三步

之前生成pyscenic需要的loom文件要先用R導(dǎo)出矩陣爬早,再用Python生成loom文件,實(shí)在有點(diǎn)繁瑣启妹,那可不可以直接從Seurat 對(duì)象生成loom文件呢筛严?答案當(dāng)然是可以,使用SCopeLoomR即可饶米。只需要使用build_loom函數(shù)桨啃,在get_count_from_seurat.R文件后面添加幾行代碼即可车胡。
get_count_from_seurat.R文件內(nèi)容如下:

library(optparse)
op_list <- list(
make_option(c("-i", "--inrds"), type = "character", default = NULL, action = "store", help = "The input of Seurat RDS",metavar="rds"),
make_option(c("-d", "--ident"), type = "character", default = NULL, action = "store", help = "The sample Ident of Seurat object",metavar="idents"),
make_option(c("-s", "--size"),  type = "integer", default = NULL, action = "store", help = "The sample size of Seurat object",metavar="size"),
make_option(c("-l", "--label"), type = "character", default = "out", action = "store", help = "The label of output file",metavar="label"),
make_option(c("-a", "--assay"), type = "character", default = "Spatial", action = "store", help = "The assay of input file",metavar="assay")
)
parser <- OptionParser(option_list = op_list)
opt = parse_args(parser)

assay <- opt$assay

library(Seurat)
obj <- readRDS(opt$inrds)
if (!is.null(opt$ident)) {
Idents(obj) <-  opt$ident
size=opt$size
if (!is.null(size)) {
obj <- subset(x = obj, downsample = opt$size)
}
saveRDS(obj,"subset.rds")
}
if (is.null(opt$label)) {
label1 <- 'out'
}else{
label1 <- opt$label
}

library(SCopeLoomR)
outloom <- paste0(label1,".loom")
build_loom(file.name = outloom,dgem = obj@assays[[assay]]@counts)
write.table(obj@meta.data,'metadata_subset.xls',sep='\t',quote=F)

使用方法:
-i,輸入Seurat對(duì)象的RDS文件
-d照瘾,隨機(jī)取樣分組的列名匈棘,例如groups,如果不賦值則表示不隨機(jī)取樣析命,使用全部細(xì)胞
-s主卫,隨機(jī)取樣的大小,例如20碳却,因?yàn)檫@里用的是pbmc_small队秩,所以設(shè)置小一點(diǎn),實(shí)際情況可能需要設(shè)置大一點(diǎn)
-l昼浦,輸出文件的標(biāo)簽馍资,默認(rèn)為out,
-a,使用的Seurat對(duì)象assay关噪,默認(rèn)為Spatial鸟蟹。

Rscript get_count_from_seurat.R -i test.rds -d groups -s 20 -l out -a Spatial

需要注意的是,新腳本添加了assay這個(gè)參數(shù)使兔,默認(rèn)為Spatial建钥,因?yàn)樽罱谟门芸臻g組數(shù)據(jù),如果要應(yīng)用到單細(xì)胞需要改為RNA虐沥。


環(huán)境構(gòu)建遇到的一些errors

因?yàn)樽罱涸诟南到y(tǒng)盤(pán)名稱(chēng)還有不斷刪文件導(dǎo)致我用別人的軟件熊经、數(shù)據(jù)庫(kù)文件和庫(kù)路徑時(shí)總是報(bào)錯(cuò),因?yàn)樽约洪_(kāi)始來(lái)重新構(gòu)建SCENIC/pyscenic運(yùn)行需要的環(huán)境欲险,期間遇到不少問(wèn)題镐依,這里匯總一下。

sh Miniconda3-latest-Linux-x86_64.sh

創(chuàng)建一個(gè)環(huán)境务唐,安裝Python和R

conda create -n pyscenic
conda activate pyscenic
conda install r-base=4.2.1
conda install python=3.9.13
pip install pyscenic -i https://mirrors.aliyun.com/pypi/simple/
pip install pandas -i https://mirrors.aliyun.com/pypi/simple/
pip install scanpy -i https://mirrors.aliyun.com/pypi/simple/
pip install opencv-python -i https://mirrors.aliyun.com/pypi/simple/
pip install loompy -i https://mirrors.aliyun.com/pypi/simple/

這樣就配置好基本環(huán)境了,接下來(lái)是安裝R包带兜,確保以下需要library的包就可以枫笛。

library(Seurat)
library(SCopeLoomR)
library(AUCell)
library(SCENIC)
library(dplyr)
library(KernSmooth)
library(RColorBrewer)
library(plotly)
library(BiocParallel)
library(optparse)

可以先用BiocManager::install() 安裝,如果安裝不了刚照,谷歌搜一下conda install [package name](百度搜不出來(lái)刑巧,不推薦)就能找到conda 安裝這個(gè)包的官網(wǎng),雖然conda很萬(wàn)能,但是慢啊海诲,BiocManager::install會(huì)快很多。當(dāng)然還有一些包是裝不了檩互,需要用devtools安裝特幔,例如SCopeLoomR,那哪些包要用devtools裝呢闸昨?一般BiocManager::install和conda裝不了的包就極大概率要用devtools裝了蚯斯,所以直接搜這個(gè)包的GitHub或官網(wǎng),例如谷歌搜SCopeLoomR GitHub饵较,GitHub里面說(shuō)明需要這樣安裝:

devtools::install_github("aertslab/SCopeLoomR")
  • 談?wù)凷CopeLoomR安裝過(guò)程遇到的一些坑
    本來(lái)想偷懶直接用別人的庫(kù)路徑拍嵌,library的時(shí)候結(jié)果報(bào)錯(cuò)HDF5 library version mismatched error,R還abort突然中斷了循诉,就是說(shuō)我的HDF5庫(kù)版本跟我這個(gè)包安裝時(shí)的版本不匹配横辆,只能重裝。用devtools重裝又各種報(bào)錯(cuò)茄猫,缺失某個(gè)包狈蚤,缺失某個(gè)動(dòng)態(tài)庫(kù)……反正各種缺失,沒(méi)辦法划纽,只能缺什么裝什么了脆侮,而且裝devtools的時(shí)候就很容易報(bào)錯(cuò),建議先用conda安裝以下包和動(dòng)態(tài)庫(kù)再安裝SCopeLoomR:
conda install -c anaconda git
conda install -c anaconda hdf5
conda install -c conda-forge libgit2
conda install -c conda-forge r-ragg
conda install -c conda-forge r-xml

再在R里運(yùn)行:

devtools::install_github("aertslab/SCopeLoomR")

SCENIC版本的bugs

在進(jìn)行可視化的時(shí)候需要用到SCENIC的R包勇劣,但是我發(fā)現(xiàn)調(diào)函數(shù)plotRSS的過(guò)濾參數(shù)時(shí)靖避,不管怎么調(diào),出來(lái)的圖都是一樣的比默。因?yàn)榭匆幌逻@個(gè)函數(shù)的源代碼幻捏,好家伙,發(fā)現(xiàn)它莫名奇妙過(guò)濾了1.5的值退敦,而且這個(gè)值是固定在函數(shù)里無(wú)法更改粘咖。

  • 查看SCENIC版本1.2.4
> packageVersion("SCENIC")
[1] ‘1.2.4’

函數(shù)plotRSS源代碼

> plotRSS
function (rss, labelsToDiscard = NULL, zThreshold = 1, cluster_columns = FALSE,
    order_rows = TRUE, trh = 0.01, varName = "cellType", col.low = "grey90",
    col.mid = "darkolivegreen3", col.high = "darkgreen", revCol = FALSE)
{
    varSize = "RSS"
    varCol = "Z"
    if (revCol) {
        varSize = "Z"
        varCol = "RSS"
    }
    rssNorm <- scale(rss)
    rssNorm[rssNorm < zThreshold] <- 0
    rssNorm <- rssNorm[, which(!colnames(rssNorm) %in% labelsToDiscard)]
    tmp <- .plotRSS_heatmap(rssNorm, trh = trh, cluster_columns = cluster_columns,
        order_rows = order_rows)
    rowOrder <- rev(tmp@row_names_param$labels)
    rss.df <- reshape2::melt(rss)
    head(rss.df)
    colnames(rss.df) <- c("Topic", varName, "RSS")
    rssNorm.df <- reshape2::melt(rssNorm)
    colnames(rssNorm.df) <- c("Topic", varName, "Z")
    rss.df <- merge(rss.df, rssNorm.df)
    rss.df <- rss.df[which(rss.df$Z >= 1.5), ]
    rss.df <- rss.df[which(!rss.df[, varName] %in% labelsToDiscard),
        ]
    rss.df[, "Topic"] <- factor(rss.df[, "Topic"], levels = rowOrder)
    p <- dotHeatmap(rss.df, var.x = varName, var.y = "Topic",
        var.size = varSize, min.size = 0.5, max.size = 5, var.col = varCol,
        col.low = col.low, col.mid = col.mid, col.high = col.high)
    invisible(list(plot = p, df = rss.df, rowOrder = rowOrder))
}
<bytecode: 0x55b0ea0f86b8>
<environment: namespace:SCENIC>

于是,我把這個(gè)包更新了一些再看函數(shù)代碼侈百,發(fā)現(xiàn)去掉了過(guò)濾1.5值的操作瓮下,果然是舊版本的一個(gè)bug啊。

  • 查看SCENIC版本1.3.1
> packageVersion("SCENIC")
[1] ‘1.3.1’

函數(shù)plotRSS源代碼

> plotRSS
function (rss, labelsToDiscard = NULL, zThreshold = 1, cluster_columns = FALSE,
    order_rows = TRUE, thr = 0.01, varName = "cellType", col.low = "grey90",
    col.mid = "darkolivegreen3", col.high = "darkgreen", revCol = FALSE,
    verbose = TRUE)
{
    varSize = "RSS"
    varCol = "Z"
    if (revCol) {
        varSize = "Z"
        varCol = "RSS"
    }
    rssNorm <- scale(rss)
    rssNorm <- rssNorm[, which(!colnames(rssNorm) %in% labelsToDiscard)]
    rssNorm[rssNorm < 0] <- 0
    rssSubset <- rssNorm
    if (!is.null(zThreshold))
        rssSubset[rssSubset < zThreshold] <- 0
    tmp <- .plotRSS_heatmap(rssSubset, thr = thr, cluster_columns = cluster_columns,
        order_rows = order_rows, verbose = verbose)
    rowOrder <- rev(tmp@row_names_param$labels)
    rm(tmp)
    rss.df <- reshape2::melt(rss)
    head(rss.df)
    colnames(rss.df) <- c("Topic", varName, "RSS")
    rssNorm.df <- reshape2::melt(rssNorm)
    colnames(rssNorm.df) <- c("Topic", varName, "Z")
    rss.df <- base::merge(rss.df, rssNorm.df)
    rss.df <- rss.df[which(!rss.df[, varName] %in% labelsToDiscard),
        ]
    if (nrow(rss.df) < 2)
        stop("Insufficient rows left to plot RSS.")
    rss.df <- rss.df[which(rss.df$Topic %in% rowOrder), ]
    rss.df[, "Topic"] <- factor(rss.df[, "Topic"], levels = rowOrder)
    p <- dotHeatmap(rss.df, var.x = varName, var.y = "Topic",
        var.size = varSize, min.size = 0.5, max.size = 5, var.col = varCol,
        col.low = col.low, col.mid = col.mid, col.high = col.high)
    invisible(list(plot = p, df = rss.df, rowOrder = rowOrder))
}
<bytecode: 0x56295d178670>
<environment: namespace:SCENIC>

最后腳本匯總
三步完成單細(xì)胞數(shù)據(jù)調(diào)控網(wǎng)絡(luò)流程分析和可視化

需要四個(gè)腳本文件:create_loom_input.R钝域,pyscenic_from_loom.sh讽坏,calcRSS_by_scenic.R和function_pyscenic_visualize.R,我把四個(gè)腳本都放在目錄/00.scripts/下例证,然后只需要輸入Seurat對(duì)象的rds路徑inrds(例如/test.rds)和細(xì)胞分組列ingroup(例如sub)路呜,
運(yùn)行示例:

source ~/Miniconda/md/bin/activate pyscenic

inscp='~/00.scripts/'
inrds='~/test.rds'
ingroup='sub'

#Step 1
Rscript ${inscp}/create_loom_input.R -i ${inrds} -d ${ingroup} -l out -a Spatial
#Step 2
sh ${inscp}/pyscenic_from_loom.sh -i out.loom -n 20
#Step 3
Rscript ${inscp}/calcRSS_by_scenic.R -l aucell.loom -m metadata_subset.xls -c ${ingroup}

運(yùn)行上述腳本即可,有關(guān)參數(shù)請(qǐng)查看這篇博客四步完成單細(xì)胞數(shù)據(jù)調(diào)控網(wǎng)絡(luò)流程分析-SCENIC/pySCENIC-2022-09-06,這里不再贅述胀葱。
請(qǐng)注意漠秋,這里的腳本是適用于小鼠的,如果要用人的請(qǐng)?zhí)鎿Q數(shù)據(jù)庫(kù)三個(gè)文件抵屿。


附上需要用到的4個(gè)文件
  • create_loom_input.R
library(optparse)
op_list <- list(
make_option(c("-i", "--inrds"), type = "character", default = NULL, action = "store", help = "The input of Seurat RDS",metavar="rds"),
make_option(c("-d", "--ident"), type = "character", default = NULL, action = "store", help = "The sample Ident of Seurat object",metavar="idents"),
make_option(c("-s", "--size"),  type = "integer", default = NULL, action = "store", help = "The sample size of Seurat object",metavar="size"),
make_option(c("-l", "--label"), type = "character", default = "out", action = "store", help = "The label of output file",metavar="label"),
make_option(c("-a", "--assay"), type = "character", default = "Spatial", action = "store", help = "The assay of input file",metavar="assay")
)
parser <- OptionParser(option_list = op_list)
opt = parse_args(parser)

assay <- opt$assay

library(Seurat)
obj <- readRDS(opt$inrds)
if (!is.null(opt$ident)) {
Idents(obj) <-  opt$ident
size=opt$size
if (!is.null(size)) {
obj <- subset(x = obj, downsample = opt$size)
}
saveRDS(obj,"subset.rds")
}
if (is.null(opt$label)) {
label1 <- 'out'
}else{
label1 <- opt$label
}

library(SCopeLoomR)
outloom <- paste0(label1,".loom")
build_loom(file.name = outloom,dgem = obj@assays[[assay]]@counts)
write.table(obj@meta.data,'metadata_subset.xls',sep='\t',quote=F)
  • pyscenic_from_loom.sh
input_loom=out.loom
n_workers=20
#help function
function usage() {
echo -e "OPTIONS:\n-i|--input_loom:\t input loom file"
echo -e "-n|--n_workers:\t working core number"
echo -e "-h|--help:\t Usage information"
exit 1
}
#get value
while getopts :i:n:h opt
do
    case "$opt" in
        i) input_loom="$OPTARG" ;;
        n) n_workers="$OPTARG" ;;
        h) usage ;;
        :) echo "This option -$OPTARG requires an argument."
           exit 1 ;;
        ?) echo "-$OPTARG is not an option"
           exit 2 ;;
    esac
done
#在這里改數(shù)據(jù)庫(kù)文件
database='~/01.mouse/'
tfs=${database}/allTFs_mm.txt
feather=${database}/mm10_10kbp_up_10kbp_down_full_tx_v10_clust.genes_vs_motifs.rankings.feather
tbl=${database}/motifs-v10nr_clust-nr.mgi-m0.001-o0.0.tbl.txt
pyscenic=pyscenic

# grn
$pyscenic grn \
--num_workers $n_workers \
--output grn.tsv \
--method grnboost2 \
$input_loom  $tfs

# cistarget
$pyscenic ctx \
grn.tsv $feather \
--annotations_fname $tbl \
--expression_mtx_fname $input_loom \
--mode "dask_multiprocessing" \
--output ctx.csv \
--num_workers $n_workers   \
--mask_dropouts

# AUCell
$pyscenic aucell \
$input_loom \
ctx.csv \
--output aucell.loom \
--num_workers $n_workers
  • calcRSS_by_scenic.R
library(optparse)
op_list <- list(
make_option(c("-l", "--input_loom"), type = "character", default = NULL, action = "store", help = "The input of aucell loom file",metavar="rds"),
make_option(c("-m", "--input_meta"), type = "character", default = NULL, action = "store", help = "The metadata of Seurat object",metavar="idents"),
make_option(c("-a", "--assay"), type = "character", default = 'Spatial', action = "store", help = "The assay of Seurat object",metavar="assay"),
make_option(c("-c", "--celltype"), type = "character", default = NULL, action = "store", help = "The colname of metadata to calculate RSS",metavar="label")
)
parser <- OptionParser(option_list = op_list)
opt = parse_args(parser)

library(Seurat)
library(SCopeLoomR)
library(AUCell)
library(SCENIC)
library(dplyr)
library(KernSmooth)
library(RColorBrewer)
library(plotly)
library(BiocParallel)

celltype <- opt$celltype
assay <- opt$assay
loom <- open_loom(opt$input_loom)

regulons_incidMat <- get_regulons(loom, column.attr.name="Regulons")
regulons <- regulonsToGeneLists(regulons_incidMat)
regulonAUC <- get_regulons_AUC(loom,column.attr.name='RegulonsAUC')
regulonAucThresholds <- get_regulon_thresholds(loom)
close_loom(loom)

meta <- read.table(opt$input_meta,sep='\t',header=T,stringsAsFactor=F)
cellinfo <- meta[,c(opt$celltype,paste0("nFeature_",assay),paste0("nCount_",assay))]
colnames(cellinfo)=c('celltype', 'nGene' ,'nUMI')
cellTypes <-  as.data.frame(subset(cellinfo,select = 'celltype'))
selectedResolution <- "celltype"

sub_regulonAUC <- regulonAUC
rss <- calcRSS(AUC=getAUC(sub_regulonAUC),
               cellAnnotation=cellTypes[colnames(sub_regulonAUC),
                                        selectedResolution])
rss=na.omit(rss)
try({
rssPlot <- plotRSS(rss)
save(regulonAUC,rssPlot,regulons,file='regulon_RSS.Rdata')
})

saveRDS(rss,paste0(celltype,"_rss.rds"))

source('~/04.database/00.common_scripts/function_pyscenic_visualize.R')
plot_pyscenic(inloom='aucell.loom',incolor=incolor,inrss=paste0(celltype,"_rss.rds"),inrds='subset.rds',infun='median', ct.col=celltype,inregulons=NULL,ingrn='grn.tsv',ntop1=5,ntop2=50)
  • function_pyscenic_visualize.R
library(Seurat)
library(SCopeLoomR)
library(AUCell)
library(SCENIC)
library(dplyr)
library(KernSmooth)
library(RColorBrewer)
library(plotly)
library(BiocParallel)
library(pheatmap)

library(cowplot)
library(ggpubr)
library(ggsci)
library(ggplot2)
library(tidygraph)
library(ggraph)
library(stringr)



colpalettes<-unique(c(pal_npg("nrc")(10),pal_aaas("default")(10),pal_nejm("default")(8),pal_lancet("lanonc")(9),
                      pal_jama("default")(7),pal_jco("default")(10),pal_ucscgb("default")(26),pal_d3("category10")(10),
                      pal_locuszoom("default")(7),pal_igv("default")(51),
                      pal_uchicago("default")(9),pal_startrek("uniform")(7),
                      pal_tron("legacy")(7),pal_futurama("planetexpress")(12),pal_rickandmorty("schwifty")(12),
                      pal_simpsons("springfield")(16),pal_gsea("default")(12)))
len <- 100
incolor<-c(brewer.pal(8, "Dark2"),brewer.pal(12, "Paired"),brewer.pal(8, "Set2"),brewer.pal(9, "Set1"),colpalettes,rainbow(len))

plot_pyscenic <- function(inloom='aucell.loom',incolor=incolor,inrss='seurat_annotations_rss.rds',inrds='subset.rds',infun='median', ct.col='seurat_annotations',inregulons=NULL,ingrn='grn.tsv',ntop1=5,ntop2=50){
###load data
loom <- open_loom(inloom)

regulons_incidMat <- get_regulons(loom, column.attr.name="Regulons")
regulons <- regulonsToGeneLists(regulons_incidMat)
regulonAUC <- get_regulons_AUC(loom,column.attr.name='RegulonsAUC')
regulonAucThresholds <- get_regulon_thresholds(loom)

embeddings <- get_embeddings(loom)
close_loom(loom)

rss <- readRDS(inrss)
sce <- readRDS(inrds)

##calculate  RSS fc
df = do.call(rbind,
             lapply(1:ncol(rss), function(i){
                dat= data.frame(
                regulon  = rownames(rss),
                cluster =  colnames(rss)[i],
                sd.1 = rss[,i],
                sd.2 = apply(rss[,-i], 1, get(infun))
               )
             }))

df$fc = df$sd.1 - df$sd.2

#select top regulon
ntopg <- df %>% group_by(cluster) %>% top_n(ntop1, fc)

ntopgene <- unique(ntopg$regulon)
write.table(ntopgene,'sd_regulon_RSS.list',sep='\t',quote=F,row.names=F,col.names=F)
#plot rss by cluster

#using plotRSS
rssPlot <- plotRSS(rss)
regulonsToPlot <- rssPlot$rowOrder
rp_df <- rssPlot$df

write.table(regulonsToPlot,'rss_regulon.list',sep='\t',quote=F,row.names=F,col.names=F)
write.table(rp_df,'rssPlot_data.xls',sep='\t',quote=F)
nlen <- length(regulonsToPlot)
hei <- ceiling(nlen)*0.4
blu<-colorRampPalette(brewer.pal(9,"Blues"))(100)
lgroup <- levels(rssPlot$df$cellType)

nlen2 <- length(lgroup)
wei <- nlen2*2
pdf(paste0('regulons_RSS_',ct.col,'_in_dotplot.pdf'),wei,hei)
print(rssPlot$plot)
dev.off()

# sd top gene
anrow = data.frame( group = ntopg$cluster)
lcolor <- incolor[1:length(unique(ntopg$cluster))]
names(lcolor) <- unique(anrow$group)
annotation_colors <- list(group=lcolor)

pn1 = rss[ntopg$regulon,]
pn2 = rss[unique(ntopg$regulon),]
rownames(pn1) <-  make.unique(rownames(pn1))
rownames(anrow) <- rownames(pn1)
scale='row'
hei <- ceiling(length(ntopg$regulon)*0.4)
pdf(paste0('regulon_RSS_in_sd_topgene_',ct.col,'.pdf'),wei,hei)
print(
pheatmap(pn1,annotation_row = anrow,scale=scale,annotation_colors=annotation_colors,show_rownames = T,main='sd top regulons')
)
print(
pheatmap(pn2,scale=scale,show_rownames = T, main='sd top unique regulons')
)
dev.off()

#plotRSS gene

pn2 = rss[unique(rp_df$Topic),]
scale='row'
hei <- ceiling(length(unique(rp_df$Topic))*0.4)
pdf(paste0('regulon_RSS_in_plotRSS_',ct.col,'.pdf'),wei,hei)
print(
pheatmap(pn2,scale=scale,show_rownames = T, main='plotRSS unique regulons')
)
dev.off()

#all regulons

hei <- ceiling(length(rownames(rss))*0.2)
pdf(paste0('all_regulons_RSS_in_',ct.col,'.pdf'),wei,hei)
print(
pheatmap(rss,scale=scale,show_rownames = T,main='all regulons RSS')
)
dev.off()
#plot rss by all cells
if (is.null(inregulons)){
inregulons <- regulonsToPlot
}else{
inregulons <- intersect(inregulons,rownames(rss))
regulonsToPlot <- inregulons

}
pn3=as.matrix(regulonAUC@assays@data$AUC)
regulon <- rownames(pn3)
#regulon <- inregulons
pn3 <- pn3[regulon,]
#pn3 <- pn3[,sample(1:dim(pn3)[2],500)]

sce$group1=sce@meta.data[,ct.col]

meta <- sce@meta.data
meta <- meta[order(meta$group1),]
#meta <- meta[colnames(pn3),]
ancol = data.frame(meta[,c('group1')])
colnames(ancol) <- c('group1')
rownames(ancol) <- rownames(meta)
lcolor <- incolor[1:length(unique(ntopg$cluster))]
names(lcolor) <- unique(ntopg$cluster)
annotation_colors <- list(group1 =lcolor)

df1 <- ancol
df1$cell <- rownames(df1)
df1 <- df1[order(df1$group1),]
pn3 <- pn3[,rownames(df1)]
torange=c(-2,2)
pn3 <- scales::rescale(pn3,to=torange)
pn3 <- pn3[,rownames(ancol)]

scale='none'
hei <- ceiling(length(unique(regulon))*0.2)
pdf(paste0('all_regulon_activity_in_allcells.pdf'),10,hei)
print(
pheatmap(pn3,annotation_col = ancol,scale=scale,annotation_colors=annotation_colors,show_rownames = T,show_colnames = F,cluster_cols=F)
)
#pheatmap(pn3,scale=scale,show_rownames = T, show_colnames = F,cluster_cols=F)
dev.off()

#plot in seurat
regulonsToPlot = inregulons
sce$sub_celltype <- sce@meta.data[,ct.col]
sub_regulonAUC <- regulonAUC[,match(colnames(sce),colnames(regulonAUC))]

cellClusters <- data.frame(row.names = colnames(sce),
                           seurat_clusters = as.character(sce$seurat_clusters))
cellTypes <- data.frame(row.names = colnames(sce),
                        celltype = sce$sub_celltype)

sce@meta.data = cbind(sce@meta.data ,t(sub_regulonAUC@assays@data@listData$AUC[regulonsToPlot,]))
Idents(sce) <- sce$sub_celltype

nlen <- length(regulonsToPlot)
hei <- ceiling(nlen)*0.4
blu<-colorRampPalette(brewer.pal(9,"Blues"))(100)
nlen2 <- length(unique(sce$sub_celltype))
wei <- nlen2*2
pdf('regulons_activity_in_dotplot.pdf',wei,hei)
print(DotPlot(sce, features = unique(regulonsToPlot)) + coord_flip()+
      theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = .5))+
      scale_color_gradientn(colours = blu)
      )
dev.off()

hei=ceiling(nlen/4)*4
pdf('regulons_activity_in_umap.pdf',16,hei)
print(RidgePlot(sce, features = regulonsToPlot , ncol = 4))
print(VlnPlot(sce, features = regulonsToPlot,pt.size = 0 ))
print(FeaturePlot(sce, features = regulonsToPlot))
dev.off()

grn <- read.table(ingrn,sep='\t',header=T,stringsAsFactors=F)
inregulons1=gsub('[(+)]','',inregulons)

c1 <- which(grn$TF %in% inregulons1)
grn <- grn[c1,]
#edge1 <- data.frame()
#node1 <- data.frame()
pdf(paste0(ntop2,'_regulon_netplot.pdf'),10,10)
for (tf in unique(grn$TF)) {
tmp <- subset(grn,TF==tf)
if (dim(tmp)[1] > ntop2) {
tmp <- tmp[order(tmp$importance,decreasing=T),]
tmp <- tmp[1:ntop2,]
}
node2 <- data.frame(tmp$target)
node2$node.size=1.5
node2$node.colour <- 'black'
colnames(node2) <- c('node','node.size','node.colour')
df1 <- data.frame(node=tf,node.size=2,node.colour='#FFDA00')
node2 <- rbind(df1,node2)


edge2 <- tmp
colnames(edge2) <- c('from','to','edge.width')
edge2$edge.colour <- "#1B9E77"
torange=c(0.1,1)
edge2$edge.width <- scales::rescale(edge2$edge.width,to=torange)

graph_data <- tidygraph::tbl_graph(nodes = node2, edges = edge2, directed = T)
p1 <- ggraph(graph = graph_data, layout = "stress", circular = TRUE) + geom_edge_arc(aes(edge_colour = edge.colour, edge_width = edge.width)) +
  scale_edge_width_continuous(range = c(1,0.2)) +geom_node_point(aes(colour = node.colour, size = node.size))+ theme_void() +
      geom_node_label(aes(label = node,colour = node.colour),size = 3.5, repel = TRUE)
p1 <- p1 + scale_color_manual(values=c('#FFDA00','black'))+scale_edge_color_manual(values=c("#1B9E77"))
print(p1)
}
dev.off()
#plot activity heatmap
meta <- sce@meta.data
celltype <- ct.col
cellsPerGroup <- split(rownames(meta),meta[,celltype])
sub_regulonAUC <- regulonAUC[onlyNonDuplicatedExtended(rownames(regulonAUC)),]
# Calculate average expression:
regulonActivity_byGroup <- sapply(cellsPerGroup,
                                  function(cells)
                                    rowMeans(getAUC(sub_regulonAUC)[,cells]))
scale='row'
rss <- regulonActivity_byGroup
hei <- ceiling(length(regulonsToPlot)*0.4)
pn1 <- rss[regulonsToPlot,]
pdf(paste0('regulon_activity_in_',ct.col,'.pdf'),wei,hei)
print(
pheatmap(pn1,scale=scale,show_rownames = T, main='regulons activity')
)
dev.off()

hei <- ceiling(length(rownames(rss))*0.2)
pdf(paste0('all_regulons_activity_in_',ct.col,'.pdf'),wei,hei)
print(
pheatmap(rss,scale=scale,show_rownames = T,main='all regulons activity')
)
dev.off()

##calculate fc
df = do.call(rbind,
             lapply(1:ncol(rss), function(i){
                dat= data.frame(
                regulon  = rownames(rss),
                cluster =  colnames(rss)[i],
                sd.1 = rss[,i],
                sd.2 = apply(rss[,-i], 1, get(infun))
               )
             }))

df$fc = df$sd.1 - df$sd.2

#select top regulon
ntopg <- df %>% group_by(cluster) %>% top_n(ntop1, fc)

ntopgene <- unique(ntopg$regulon)
write.table(ntopgene,'sd_regulon_activity.list',sep='\t',quote=F,row.names=F,col.names=F)

anrow = data.frame( group = ntopg$cluster)
lcolor <- incolor[1:length(unique(ntopg$cluster))]
names(lcolor) <- unique(anrow$group)
annotation_colors <- list(group=lcolor)
pn1 = rss[ntopg$regulon,]
pn2 = rss[unique(ntopg$regulon),]
rownames(pn1) <-  make.unique(rownames(pn1))
rownames(anrow) <- rownames(pn1)
scale='row'
hei <- ceiling(length(ntopg$regulon)*0.4)
pdf(paste0('regulon_activity_in_sd_topgene_',ct.col,'.pdf'),wei,hei)
print(
pheatmap(pn1,annotation_row = anrow,scale=scale,annotation_colors=annotation_colors,show_rownames = T,main='sd top regulons')
)
print(
pheatmap(pn2,scale=scale,show_rownames = T, main='sd top unique regulons ')
)
dev.off()

}

最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者
  • 序言:七十年代末刻盐,一起剝皮案震驚了整個(gè)濱河市秘豹,隨后出現(xiàn)的幾起案子锌半,更是在濱河造成了極大的恐慌扮惦,老刑警劉巖,帶你破解...
    沈念sama閱讀 206,126評(píng)論 6 481
  • 序言:濱河連續(xù)發(fā)生了三起死亡事件尿扯,死亡現(xiàn)場(chǎng)離奇詭異求晶,居然都是意外死亡,警方通過(guò)查閱死者的電腦和手機(jī)衷笋,發(fā)現(xiàn)死者居然都...
    沈念sama閱讀 88,254評(píng)論 2 382
  • 文/潘曉璐 我一進(jìn)店門(mén)芳杏,熙熙樓的掌柜王于貴愁眉苦臉地迎上來(lái),“玉大人右莱,你說(shuō)我怎么就攤上這事蚜锨。” “怎么了慢蜓?”我有些...
    開(kāi)封第一講書(shū)人閱讀 152,445評(píng)論 0 341
  • 文/不壞的土叔 我叫張陵亚再,是天一觀的道長(zhǎng)。 經(jīng)常有香客問(wèn)我晨抡,道長(zhǎng)氛悬,這世上最難降的妖魔是什么? 我笑而不...
    開(kāi)封第一講書(shū)人閱讀 55,185評(píng)論 1 278
  • 正文 為了忘掉前任耘柱,我火速辦了婚禮如捅,結(jié)果婚禮上,老公的妹妹穿的比我還像新娘调煎。我一直安慰自己镜遣,他們只是感情好,可當(dāng)我...
    茶點(diǎn)故事閱讀 64,178評(píng)論 5 371
  • 文/花漫 我一把揭開(kāi)白布士袄。 她就那樣靜靜地躺著悲关,像睡著了一般。 火紅的嫁衣襯著肌膚如雪娄柳。 梳的紋絲不亂的頭發(fā)上寓辱,一...
    開(kāi)封第一講書(shū)人閱讀 48,970評(píng)論 1 284
  • 那天,我揣著相機(jī)與錄音赤拒,去河邊找鬼秫筏。 笑死诱鞠,一個(gè)胖子當(dāng)著我的面吹牛,可吹牛的內(nèi)容都是我干的这敬。 我是一名探鬼主播航夺,決...
    沈念sama閱讀 38,276評(píng)論 3 399
  • 文/蒼蘭香墨 我猛地睜開(kāi)眼,長(zhǎng)吁一口氣:“原來(lái)是場(chǎng)噩夢(mèng)啊……” “哼崔涂!你這毒婦竟也來(lái)了敷存?” 一聲冷哼從身側(cè)響起,我...
    開(kāi)封第一講書(shū)人閱讀 36,927評(píng)論 0 259
  • 序言:老撾萬(wàn)榮一對(duì)情侶失蹤堪伍,失蹤者是張志新(化名)和其女友劉穎,沒(méi)想到半個(gè)月后觅闽,有當(dāng)?shù)厝嗽跇?shù)林里發(fā)現(xiàn)了一具尸體帝雇,經(jīng)...
    沈念sama閱讀 43,400評(píng)論 1 300
  • 正文 獨(dú)居荒郊野嶺守林人離奇死亡,尸身上長(zhǎng)有42處帶血的膿包…… 初始之章·張勛 以下內(nèi)容為張勛視角 年9月15日...
    茶點(diǎn)故事閱讀 35,883評(píng)論 2 323
  • 正文 我和宋清朗相戀三年蛉拙,在試婚紗的時(shí)候發(fā)現(xiàn)自己被綠了尸闸。 大學(xué)時(shí)的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片。...
    茶點(diǎn)故事閱讀 37,997評(píng)論 1 333
  • 序言:一個(gè)原本活蹦亂跳的男人離奇死亡孕锄,死狀恐怖吮廉,靈堂內(nèi)的尸體忽然破棺而出,到底是詐尸還是另有隱情畸肆,我是刑警寧澤宦芦,帶...
    沈念sama閱讀 33,646評(píng)論 4 322
  • 正文 年R本政府宣布,位于F島的核電站轴脐,受9級(jí)特大地震影響调卑,放射性物質(zhì)發(fā)生泄漏。R本人自食惡果不足惜大咱,卻給世界環(huán)境...
    茶點(diǎn)故事閱讀 39,213評(píng)論 3 307
  • 文/蒙蒙 一恬涧、第九天 我趴在偏房一處隱蔽的房頂上張望。 院中可真熱鬧碴巾,春花似錦溯捆、人聲如沸。這莊子的主人今日做“春日...
    開(kāi)封第一講書(shū)人閱讀 30,204評(píng)論 0 19
  • 文/蒼蘭香墨 我抬頭看了看天上的太陽(yáng)。三九已至旷痕,卻和暖如春碳锈,著一層夾襖步出監(jiān)牢的瞬間,已是汗流浹背欺抗。 一陣腳步聲響...
    開(kāi)封第一講書(shū)人閱讀 31,423評(píng)論 1 260
  • 我被黑心中介騙來(lái)泰國(guó)打工售碳, 沒(méi)想到剛下飛機(jī)就差點(diǎn)兒被人妖公主榨干…… 1. 我叫王不留,地道東北人。 一個(gè)月前我還...
    沈念sama閱讀 45,423評(píng)論 2 352
  • 正文 我出身青樓贸人,卻偏偏與公主長(zhǎng)得像间景,于是被迫代替她去往敵國(guó)和親。 傳聞我的和親對(duì)象是個(gè)殘疾皇子艺智,可洞房花燭夜當(dāng)晚...
    茶點(diǎn)故事閱讀 42,722評(píng)論 2 345

推薦閱讀更多精彩內(nèi)容