適用背景
之前寫(xiě)了兩篇博客(四步完成單細(xì)胞數(shù)據(jù)調(diào)控網(wǎng)絡(luò)流程分析-SCENIC/pySCENIC-2022-09-06和SCENIC/pySCENIC結(jié)果可視化 2022-11-08)介紹SCENIC/pySCENIC的使用,最近在使用的時(shí)候遇到一些問(wèn)題阐污,因此這篇文章作為補(bǔ)充迫淹,如果看不懂本篇可以查看前兩篇博客末秃。
補(bǔ)充內(nèi)容主要有以下幾點(diǎn):
- 1、小鼠的數(shù)據(jù)庫(kù)構(gòu)建
- 2蒸痹、從四步流程縮減到三步
- 3、環(huán)境構(gòu)建遇到的一些errors
- 4、SCENIC版本的bugs
小鼠的數(shù)據(jù)庫(kù)構(gòu)建
之前的博客構(gòu)建的是人的數(shù)據(jù)庫(kù)蛔糯,但博主最近分析需要用到小鼠的,因此需要構(gòu)建一下新的數(shù)據(jù)庫(kù)窖式。正如之前博客寫(xiě)到的蚁飒,其實(shí)構(gòu)建這個(gè)數(shù)據(jù)庫(kù)只需要替換3個(gè)文件,轉(zhuǎn)錄因子(TF)列表文件萝喘,feather文件已經(jīng)tbl文件淮逻,建議使用最新的v10版本,當(dāng)然也可以根據(jù)網(wǎng)站提示挑選適合自己數(shù)據(jù)的文件阁簸。我下載的文件如下:
- 小鼠TF文件:allTFs_mm.txt
https://resources.aertslab.org/cistarget/tf_lists/ - feather文件:mm10_10kbp_up_10kbp_down_full_tx_v10_clust.genes_vs_motifs.rankings.feather
https://resources.aertslab.org/cistarget/databases/mus_musculus/mm10/refseq_r80/mc_v10_clust/gene_based/ - tbl文件:motifs-v10nr_clust-nr.mgi-m0.001-o0.0.tbl.txt
https://resources.aertslab.org/cistarget/motif2tf/motifs-v10nr_clust-nr.mgi-m0.001-o0.0.tbl
從四步流程縮減到三步
之前生成pyscenic需要的loom文件要先用R導(dǎo)出矩陣爬早,再用Python生成loom文件,實(shí)在有點(diǎn)繁瑣启妹,那可不可以直接從Seurat 對(duì)象生成loom文件呢筛严?答案當(dāng)然是可以,使用SCopeLoomR即可饶米。只需要使用build_loom函數(shù)桨啃,在get_count_from_seurat.R文件后面添加幾行代碼即可车胡。
get_count_from_seurat.R文件內(nèi)容如下:
library(optparse)
op_list <- list(
make_option(c("-i", "--inrds"), type = "character", default = NULL, action = "store", help = "The input of Seurat RDS",metavar="rds"),
make_option(c("-d", "--ident"), type = "character", default = NULL, action = "store", help = "The sample Ident of Seurat object",metavar="idents"),
make_option(c("-s", "--size"), type = "integer", default = NULL, action = "store", help = "The sample size of Seurat object",metavar="size"),
make_option(c("-l", "--label"), type = "character", default = "out", action = "store", help = "The label of output file",metavar="label"),
make_option(c("-a", "--assay"), type = "character", default = "Spatial", action = "store", help = "The assay of input file",metavar="assay")
)
parser <- OptionParser(option_list = op_list)
opt = parse_args(parser)
assay <- opt$assay
library(Seurat)
obj <- readRDS(opt$inrds)
if (!is.null(opt$ident)) {
Idents(obj) <- opt$ident
size=opt$size
if (!is.null(size)) {
obj <- subset(x = obj, downsample = opt$size)
}
saveRDS(obj,"subset.rds")
}
if (is.null(opt$label)) {
label1 <- 'out'
}else{
label1 <- opt$label
}
library(SCopeLoomR)
outloom <- paste0(label1,".loom")
build_loom(file.name = outloom,dgem = obj@assays[[assay]]@counts)
write.table(obj@meta.data,'metadata_subset.xls',sep='\t',quote=F)
使用方法:
-i,輸入Seurat對(duì)象的RDS文件
-d照瘾,隨機(jī)取樣分組的列名匈棘,例如groups,如果不賦值則表示不隨機(jī)取樣析命,使用全部細(xì)胞
-s主卫,隨機(jī)取樣的大小,例如20碳却,因?yàn)檫@里用的是pbmc_small队秩,所以設(shè)置小一點(diǎn),實(shí)際情況可能需要設(shè)置大一點(diǎn)
-l昼浦,輸出文件的標(biāo)簽馍资,默認(rèn)為out,
-a,使用的Seurat對(duì)象assay关噪,默認(rèn)為Spatial鸟蟹。
Rscript get_count_from_seurat.R -i test.rds -d groups -s 20 -l out -a Spatial
需要注意的是,新腳本添加了assay這個(gè)參數(shù)使兔,默認(rèn)為Spatial建钥,因?yàn)樽罱谟门芸臻g組數(shù)據(jù),如果要應(yīng)用到單細(xì)胞需要改為RNA虐沥。
環(huán)境構(gòu)建遇到的一些errors
因?yàn)樽罱涸诟南到y(tǒng)盤(pán)名稱(chēng)還有不斷刪文件導(dǎo)致我用別人的軟件熊经、數(shù)據(jù)庫(kù)文件和庫(kù)路徑時(shí)總是報(bào)錯(cuò),因?yàn)樽约洪_(kāi)始來(lái)重新構(gòu)建SCENIC/pyscenic運(yùn)行需要的環(huán)境欲险,期間遇到不少問(wèn)題镐依,這里匯總一下。
- 首先天试,需要安裝一下miniconda槐壳,在這個(gè)網(wǎng)站下載根據(jù)網(wǎng)站教程進(jìn)行安裝
https://docs.conda.io/en/latest/miniconda.html#linux-installers
其實(shí)就是下載一個(gè)shell腳本,然后運(yùn)行就好喜每,例如:
sh Miniconda3-latest-Linux-x86_64.sh
創(chuàng)建一個(gè)環(huán)境务唐,安裝Python和R
conda create -n pyscenic
conda activate pyscenic
conda install r-base=4.2.1
conda install python=3.9.13
pip install pyscenic -i https://mirrors.aliyun.com/pypi/simple/
pip install pandas -i https://mirrors.aliyun.com/pypi/simple/
pip install scanpy -i https://mirrors.aliyun.com/pypi/simple/
pip install opencv-python -i https://mirrors.aliyun.com/pypi/simple/
pip install loompy -i https://mirrors.aliyun.com/pypi/simple/
這樣就配置好基本環(huán)境了,接下來(lái)是安裝R包带兜,確保以下需要library的包就可以枫笛。
library(Seurat)
library(SCopeLoomR)
library(AUCell)
library(SCENIC)
library(dplyr)
library(KernSmooth)
library(RColorBrewer)
library(plotly)
library(BiocParallel)
library(optparse)
可以先用BiocManager::install() 安裝,如果安裝不了刚照,谷歌搜一下conda install [package name](百度搜不出來(lái)刑巧,不推薦)就能找到conda 安裝這個(gè)包的官網(wǎng),雖然conda很萬(wàn)能,但是慢啊海诲,BiocManager::install會(huì)快很多。當(dāng)然還有一些包是裝不了檩互,需要用devtools安裝特幔,例如SCopeLoomR,那哪些包要用devtools裝呢闸昨?一般BiocManager::install和conda裝不了的包就極大概率要用devtools裝了蚯斯,所以直接搜這個(gè)包的GitHub或官網(wǎng),例如谷歌搜SCopeLoomR GitHub饵较,GitHub里面說(shuō)明需要這樣安裝:
devtools::install_github("aertslab/SCopeLoomR")
- 談?wù)凷CopeLoomR安裝過(guò)程遇到的一些坑
本來(lái)想偷懶直接用別人的庫(kù)路徑拍嵌,library的時(shí)候結(jié)果報(bào)錯(cuò)HDF5 library version mismatched error,R還abort突然中斷了循诉,就是說(shuō)我的HDF5庫(kù)版本跟我這個(gè)包安裝時(shí)的版本不匹配横辆,只能重裝。用devtools重裝又各種報(bào)錯(cuò)茄猫,缺失某個(gè)包狈蚤,缺失某個(gè)動(dòng)態(tài)庫(kù)……反正各種缺失,沒(méi)辦法划纽,只能缺什么裝什么了脆侮,而且裝devtools的時(shí)候就很容易報(bào)錯(cuò),建議先用conda安裝以下包和動(dòng)態(tài)庫(kù)再安裝SCopeLoomR:
conda install -c anaconda git
conda install -c anaconda hdf5
conda install -c conda-forge libgit2
conda install -c conda-forge r-ragg
conda install -c conda-forge r-xml
再在R里運(yùn)行:
devtools::install_github("aertslab/SCopeLoomR")
SCENIC版本的bugs
在進(jìn)行可視化的時(shí)候需要用到SCENIC的R包勇劣,但是我發(fā)現(xiàn)調(diào)函數(shù)plotRSS的過(guò)濾參數(shù)時(shí)靖避,不管怎么調(diào),出來(lái)的圖都是一樣的比默。因?yàn)榭匆幌逻@個(gè)函數(shù)的源代碼幻捏,好家伙,發(fā)現(xiàn)它莫名奇妙過(guò)濾了1.5的值退敦,而且這個(gè)值是固定在函數(shù)里無(wú)法更改粘咖。
- 查看SCENIC版本1.2.4
> packageVersion("SCENIC")
[1] ‘1.2.4’
函數(shù)plotRSS源代碼
> plotRSS
function (rss, labelsToDiscard = NULL, zThreshold = 1, cluster_columns = FALSE,
order_rows = TRUE, trh = 0.01, varName = "cellType", col.low = "grey90",
col.mid = "darkolivegreen3", col.high = "darkgreen", revCol = FALSE)
{
varSize = "RSS"
varCol = "Z"
if (revCol) {
varSize = "Z"
varCol = "RSS"
}
rssNorm <- scale(rss)
rssNorm[rssNorm < zThreshold] <- 0
rssNorm <- rssNorm[, which(!colnames(rssNorm) %in% labelsToDiscard)]
tmp <- .plotRSS_heatmap(rssNorm, trh = trh, cluster_columns = cluster_columns,
order_rows = order_rows)
rowOrder <- rev(tmp@row_names_param$labels)
rss.df <- reshape2::melt(rss)
head(rss.df)
colnames(rss.df) <- c("Topic", varName, "RSS")
rssNorm.df <- reshape2::melt(rssNorm)
colnames(rssNorm.df) <- c("Topic", varName, "Z")
rss.df <- merge(rss.df, rssNorm.df)
rss.df <- rss.df[which(rss.df$Z >= 1.5), ]
rss.df <- rss.df[which(!rss.df[, varName] %in% labelsToDiscard),
]
rss.df[, "Topic"] <- factor(rss.df[, "Topic"], levels = rowOrder)
p <- dotHeatmap(rss.df, var.x = varName, var.y = "Topic",
var.size = varSize, min.size = 0.5, max.size = 5, var.col = varCol,
col.low = col.low, col.mid = col.mid, col.high = col.high)
invisible(list(plot = p, df = rss.df, rowOrder = rowOrder))
}
<bytecode: 0x55b0ea0f86b8>
<environment: namespace:SCENIC>
于是,我把這個(gè)包更新了一些再看函數(shù)代碼侈百,發(fā)現(xiàn)去掉了過(guò)濾1.5值的操作瓮下,果然是舊版本的一個(gè)bug啊。
- 查看SCENIC版本1.3.1
> packageVersion("SCENIC")
[1] ‘1.3.1’
函數(shù)plotRSS源代碼
> plotRSS
function (rss, labelsToDiscard = NULL, zThreshold = 1, cluster_columns = FALSE,
order_rows = TRUE, thr = 0.01, varName = "cellType", col.low = "grey90",
col.mid = "darkolivegreen3", col.high = "darkgreen", revCol = FALSE,
verbose = TRUE)
{
varSize = "RSS"
varCol = "Z"
if (revCol) {
varSize = "Z"
varCol = "RSS"
}
rssNorm <- scale(rss)
rssNorm <- rssNorm[, which(!colnames(rssNorm) %in% labelsToDiscard)]
rssNorm[rssNorm < 0] <- 0
rssSubset <- rssNorm
if (!is.null(zThreshold))
rssSubset[rssSubset < zThreshold] <- 0
tmp <- .plotRSS_heatmap(rssSubset, thr = thr, cluster_columns = cluster_columns,
order_rows = order_rows, verbose = verbose)
rowOrder <- rev(tmp@row_names_param$labels)
rm(tmp)
rss.df <- reshape2::melt(rss)
head(rss.df)
colnames(rss.df) <- c("Topic", varName, "RSS")
rssNorm.df <- reshape2::melt(rssNorm)
colnames(rssNorm.df) <- c("Topic", varName, "Z")
rss.df <- base::merge(rss.df, rssNorm.df)
rss.df <- rss.df[which(!rss.df[, varName] %in% labelsToDiscard),
]
if (nrow(rss.df) < 2)
stop("Insufficient rows left to plot RSS.")
rss.df <- rss.df[which(rss.df$Topic %in% rowOrder), ]
rss.df[, "Topic"] <- factor(rss.df[, "Topic"], levels = rowOrder)
p <- dotHeatmap(rss.df, var.x = varName, var.y = "Topic",
var.size = varSize, min.size = 0.5, max.size = 5, var.col = varCol,
col.low = col.low, col.mid = col.mid, col.high = col.high)
invisible(list(plot = p, df = rss.df, rowOrder = rowOrder))
}
<bytecode: 0x56295d178670>
<environment: namespace:SCENIC>
最后腳本匯總
三步完成單細(xì)胞數(shù)據(jù)調(diào)控網(wǎng)絡(luò)流程分析和可視化
需要四個(gè)腳本文件:create_loom_input.R钝域,pyscenic_from_loom.sh讽坏,calcRSS_by_scenic.R和function_pyscenic_visualize.R,我把四個(gè)腳本都放在目錄/00.scripts/下例证,然后只需要輸入Seurat對(duì)象的rds路徑inrds(例如/test.rds)和細(xì)胞分組列ingroup(例如sub)路呜,
運(yùn)行示例:
source ~/Miniconda/md/bin/activate pyscenic
inscp='~/00.scripts/'
inrds='~/test.rds'
ingroup='sub'
#Step 1
Rscript ${inscp}/create_loom_input.R -i ${inrds} -d ${ingroup} -l out -a Spatial
#Step 2
sh ${inscp}/pyscenic_from_loom.sh -i out.loom -n 20
#Step 3
Rscript ${inscp}/calcRSS_by_scenic.R -l aucell.loom -m metadata_subset.xls -c ${ingroup}
運(yùn)行上述腳本即可,有關(guān)參數(shù)請(qǐng)查看這篇博客四步完成單細(xì)胞數(shù)據(jù)調(diào)控網(wǎng)絡(luò)流程分析-SCENIC/pySCENIC-2022-09-06,這里不再贅述胀葱。
請(qǐng)注意漠秋,這里的腳本是適用于小鼠的,如果要用人的請(qǐng)?zhí)鎿Q數(shù)據(jù)庫(kù)三個(gè)文件抵屿。
附上需要用到的4個(gè)文件
- create_loom_input.R
library(optparse)
op_list <- list(
make_option(c("-i", "--inrds"), type = "character", default = NULL, action = "store", help = "The input of Seurat RDS",metavar="rds"),
make_option(c("-d", "--ident"), type = "character", default = NULL, action = "store", help = "The sample Ident of Seurat object",metavar="idents"),
make_option(c("-s", "--size"), type = "integer", default = NULL, action = "store", help = "The sample size of Seurat object",metavar="size"),
make_option(c("-l", "--label"), type = "character", default = "out", action = "store", help = "The label of output file",metavar="label"),
make_option(c("-a", "--assay"), type = "character", default = "Spatial", action = "store", help = "The assay of input file",metavar="assay")
)
parser <- OptionParser(option_list = op_list)
opt = parse_args(parser)
assay <- opt$assay
library(Seurat)
obj <- readRDS(opt$inrds)
if (!is.null(opt$ident)) {
Idents(obj) <- opt$ident
size=opt$size
if (!is.null(size)) {
obj <- subset(x = obj, downsample = opt$size)
}
saveRDS(obj,"subset.rds")
}
if (is.null(opt$label)) {
label1 <- 'out'
}else{
label1 <- opt$label
}
library(SCopeLoomR)
outloom <- paste0(label1,".loom")
build_loom(file.name = outloom,dgem = obj@assays[[assay]]@counts)
write.table(obj@meta.data,'metadata_subset.xls',sep='\t',quote=F)
- pyscenic_from_loom.sh
input_loom=out.loom
n_workers=20
#help function
function usage() {
echo -e "OPTIONS:\n-i|--input_loom:\t input loom file"
echo -e "-n|--n_workers:\t working core number"
echo -e "-h|--help:\t Usage information"
exit 1
}
#get value
while getopts :i:n:h opt
do
case "$opt" in
i) input_loom="$OPTARG" ;;
n) n_workers="$OPTARG" ;;
h) usage ;;
:) echo "This option -$OPTARG requires an argument."
exit 1 ;;
?) echo "-$OPTARG is not an option"
exit 2 ;;
esac
done
#在這里改數(shù)據(jù)庫(kù)文件
database='~/01.mouse/'
tfs=${database}/allTFs_mm.txt
feather=${database}/mm10_10kbp_up_10kbp_down_full_tx_v10_clust.genes_vs_motifs.rankings.feather
tbl=${database}/motifs-v10nr_clust-nr.mgi-m0.001-o0.0.tbl.txt
pyscenic=pyscenic
# grn
$pyscenic grn \
--num_workers $n_workers \
--output grn.tsv \
--method grnboost2 \
$input_loom $tfs
# cistarget
$pyscenic ctx \
grn.tsv $feather \
--annotations_fname $tbl \
--expression_mtx_fname $input_loom \
--mode "dask_multiprocessing" \
--output ctx.csv \
--num_workers $n_workers \
--mask_dropouts
# AUCell
$pyscenic aucell \
$input_loom \
ctx.csv \
--output aucell.loom \
--num_workers $n_workers
- calcRSS_by_scenic.R
library(optparse)
op_list <- list(
make_option(c("-l", "--input_loom"), type = "character", default = NULL, action = "store", help = "The input of aucell loom file",metavar="rds"),
make_option(c("-m", "--input_meta"), type = "character", default = NULL, action = "store", help = "The metadata of Seurat object",metavar="idents"),
make_option(c("-a", "--assay"), type = "character", default = 'Spatial', action = "store", help = "The assay of Seurat object",metavar="assay"),
make_option(c("-c", "--celltype"), type = "character", default = NULL, action = "store", help = "The colname of metadata to calculate RSS",metavar="label")
)
parser <- OptionParser(option_list = op_list)
opt = parse_args(parser)
library(Seurat)
library(SCopeLoomR)
library(AUCell)
library(SCENIC)
library(dplyr)
library(KernSmooth)
library(RColorBrewer)
library(plotly)
library(BiocParallel)
celltype <- opt$celltype
assay <- opt$assay
loom <- open_loom(opt$input_loom)
regulons_incidMat <- get_regulons(loom, column.attr.name="Regulons")
regulons <- regulonsToGeneLists(regulons_incidMat)
regulonAUC <- get_regulons_AUC(loom,column.attr.name='RegulonsAUC')
regulonAucThresholds <- get_regulon_thresholds(loom)
close_loom(loom)
meta <- read.table(opt$input_meta,sep='\t',header=T,stringsAsFactor=F)
cellinfo <- meta[,c(opt$celltype,paste0("nFeature_",assay),paste0("nCount_",assay))]
colnames(cellinfo)=c('celltype', 'nGene' ,'nUMI')
cellTypes <- as.data.frame(subset(cellinfo,select = 'celltype'))
selectedResolution <- "celltype"
sub_regulonAUC <- regulonAUC
rss <- calcRSS(AUC=getAUC(sub_regulonAUC),
cellAnnotation=cellTypes[colnames(sub_regulonAUC),
selectedResolution])
rss=na.omit(rss)
try({
rssPlot <- plotRSS(rss)
save(regulonAUC,rssPlot,regulons,file='regulon_RSS.Rdata')
})
saveRDS(rss,paste0(celltype,"_rss.rds"))
source('~/04.database/00.common_scripts/function_pyscenic_visualize.R')
plot_pyscenic(inloom='aucell.loom',incolor=incolor,inrss=paste0(celltype,"_rss.rds"),inrds='subset.rds',infun='median', ct.col=celltype,inregulons=NULL,ingrn='grn.tsv',ntop1=5,ntop2=50)
- function_pyscenic_visualize.R
library(Seurat)
library(SCopeLoomR)
library(AUCell)
library(SCENIC)
library(dplyr)
library(KernSmooth)
library(RColorBrewer)
library(plotly)
library(BiocParallel)
library(pheatmap)
library(cowplot)
library(ggpubr)
library(ggsci)
library(ggplot2)
library(tidygraph)
library(ggraph)
library(stringr)
colpalettes<-unique(c(pal_npg("nrc")(10),pal_aaas("default")(10),pal_nejm("default")(8),pal_lancet("lanonc")(9),
pal_jama("default")(7),pal_jco("default")(10),pal_ucscgb("default")(26),pal_d3("category10")(10),
pal_locuszoom("default")(7),pal_igv("default")(51),
pal_uchicago("default")(9),pal_startrek("uniform")(7),
pal_tron("legacy")(7),pal_futurama("planetexpress")(12),pal_rickandmorty("schwifty")(12),
pal_simpsons("springfield")(16),pal_gsea("default")(12)))
len <- 100
incolor<-c(brewer.pal(8, "Dark2"),brewer.pal(12, "Paired"),brewer.pal(8, "Set2"),brewer.pal(9, "Set1"),colpalettes,rainbow(len))
plot_pyscenic <- function(inloom='aucell.loom',incolor=incolor,inrss='seurat_annotations_rss.rds',inrds='subset.rds',infun='median', ct.col='seurat_annotations',inregulons=NULL,ingrn='grn.tsv',ntop1=5,ntop2=50){
###load data
loom <- open_loom(inloom)
regulons_incidMat <- get_regulons(loom, column.attr.name="Regulons")
regulons <- regulonsToGeneLists(regulons_incidMat)
regulonAUC <- get_regulons_AUC(loom,column.attr.name='RegulonsAUC')
regulonAucThresholds <- get_regulon_thresholds(loom)
embeddings <- get_embeddings(loom)
close_loom(loom)
rss <- readRDS(inrss)
sce <- readRDS(inrds)
##calculate RSS fc
df = do.call(rbind,
lapply(1:ncol(rss), function(i){
dat= data.frame(
regulon = rownames(rss),
cluster = colnames(rss)[i],
sd.1 = rss[,i],
sd.2 = apply(rss[,-i], 1, get(infun))
)
}))
df$fc = df$sd.1 - df$sd.2
#select top regulon
ntopg <- df %>% group_by(cluster) %>% top_n(ntop1, fc)
ntopgene <- unique(ntopg$regulon)
write.table(ntopgene,'sd_regulon_RSS.list',sep='\t',quote=F,row.names=F,col.names=F)
#plot rss by cluster
#using plotRSS
rssPlot <- plotRSS(rss)
regulonsToPlot <- rssPlot$rowOrder
rp_df <- rssPlot$df
write.table(regulonsToPlot,'rss_regulon.list',sep='\t',quote=F,row.names=F,col.names=F)
write.table(rp_df,'rssPlot_data.xls',sep='\t',quote=F)
nlen <- length(regulonsToPlot)
hei <- ceiling(nlen)*0.4
blu<-colorRampPalette(brewer.pal(9,"Blues"))(100)
lgroup <- levels(rssPlot$df$cellType)
nlen2 <- length(lgroup)
wei <- nlen2*2
pdf(paste0('regulons_RSS_',ct.col,'_in_dotplot.pdf'),wei,hei)
print(rssPlot$plot)
dev.off()
# sd top gene
anrow = data.frame( group = ntopg$cluster)
lcolor <- incolor[1:length(unique(ntopg$cluster))]
names(lcolor) <- unique(anrow$group)
annotation_colors <- list(group=lcolor)
pn1 = rss[ntopg$regulon,]
pn2 = rss[unique(ntopg$regulon),]
rownames(pn1) <- make.unique(rownames(pn1))
rownames(anrow) <- rownames(pn1)
scale='row'
hei <- ceiling(length(ntopg$regulon)*0.4)
pdf(paste0('regulon_RSS_in_sd_topgene_',ct.col,'.pdf'),wei,hei)
print(
pheatmap(pn1,annotation_row = anrow,scale=scale,annotation_colors=annotation_colors,show_rownames = T,main='sd top regulons')
)
print(
pheatmap(pn2,scale=scale,show_rownames = T, main='sd top unique regulons')
)
dev.off()
#plotRSS gene
pn2 = rss[unique(rp_df$Topic),]
scale='row'
hei <- ceiling(length(unique(rp_df$Topic))*0.4)
pdf(paste0('regulon_RSS_in_plotRSS_',ct.col,'.pdf'),wei,hei)
print(
pheatmap(pn2,scale=scale,show_rownames = T, main='plotRSS unique regulons')
)
dev.off()
#all regulons
hei <- ceiling(length(rownames(rss))*0.2)
pdf(paste0('all_regulons_RSS_in_',ct.col,'.pdf'),wei,hei)
print(
pheatmap(rss,scale=scale,show_rownames = T,main='all regulons RSS')
)
dev.off()
#plot rss by all cells
if (is.null(inregulons)){
inregulons <- regulonsToPlot
}else{
inregulons <- intersect(inregulons,rownames(rss))
regulonsToPlot <- inregulons
}
pn3=as.matrix(regulonAUC@assays@data$AUC)
regulon <- rownames(pn3)
#regulon <- inregulons
pn3 <- pn3[regulon,]
#pn3 <- pn3[,sample(1:dim(pn3)[2],500)]
sce$group1=sce@meta.data[,ct.col]
meta <- sce@meta.data
meta <- meta[order(meta$group1),]
#meta <- meta[colnames(pn3),]
ancol = data.frame(meta[,c('group1')])
colnames(ancol) <- c('group1')
rownames(ancol) <- rownames(meta)
lcolor <- incolor[1:length(unique(ntopg$cluster))]
names(lcolor) <- unique(ntopg$cluster)
annotation_colors <- list(group1 =lcolor)
df1 <- ancol
df1$cell <- rownames(df1)
df1 <- df1[order(df1$group1),]
pn3 <- pn3[,rownames(df1)]
torange=c(-2,2)
pn3 <- scales::rescale(pn3,to=torange)
pn3 <- pn3[,rownames(ancol)]
scale='none'
hei <- ceiling(length(unique(regulon))*0.2)
pdf(paste0('all_regulon_activity_in_allcells.pdf'),10,hei)
print(
pheatmap(pn3,annotation_col = ancol,scale=scale,annotation_colors=annotation_colors,show_rownames = T,show_colnames = F,cluster_cols=F)
)
#pheatmap(pn3,scale=scale,show_rownames = T, show_colnames = F,cluster_cols=F)
dev.off()
#plot in seurat
regulonsToPlot = inregulons
sce$sub_celltype <- sce@meta.data[,ct.col]
sub_regulonAUC <- regulonAUC[,match(colnames(sce),colnames(regulonAUC))]
cellClusters <- data.frame(row.names = colnames(sce),
seurat_clusters = as.character(sce$seurat_clusters))
cellTypes <- data.frame(row.names = colnames(sce),
celltype = sce$sub_celltype)
sce@meta.data = cbind(sce@meta.data ,t(sub_regulonAUC@assays@data@listData$AUC[regulonsToPlot,]))
Idents(sce) <- sce$sub_celltype
nlen <- length(regulonsToPlot)
hei <- ceiling(nlen)*0.4
blu<-colorRampPalette(brewer.pal(9,"Blues"))(100)
nlen2 <- length(unique(sce$sub_celltype))
wei <- nlen2*2
pdf('regulons_activity_in_dotplot.pdf',wei,hei)
print(DotPlot(sce, features = unique(regulonsToPlot)) + coord_flip()+
theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = .5))+
scale_color_gradientn(colours = blu)
)
dev.off()
hei=ceiling(nlen/4)*4
pdf('regulons_activity_in_umap.pdf',16,hei)
print(RidgePlot(sce, features = regulonsToPlot , ncol = 4))
print(VlnPlot(sce, features = regulonsToPlot,pt.size = 0 ))
print(FeaturePlot(sce, features = regulonsToPlot))
dev.off()
grn <- read.table(ingrn,sep='\t',header=T,stringsAsFactors=F)
inregulons1=gsub('[(+)]','',inregulons)
c1 <- which(grn$TF %in% inregulons1)
grn <- grn[c1,]
#edge1 <- data.frame()
#node1 <- data.frame()
pdf(paste0(ntop2,'_regulon_netplot.pdf'),10,10)
for (tf in unique(grn$TF)) {
tmp <- subset(grn,TF==tf)
if (dim(tmp)[1] > ntop2) {
tmp <- tmp[order(tmp$importance,decreasing=T),]
tmp <- tmp[1:ntop2,]
}
node2 <- data.frame(tmp$target)
node2$node.size=1.5
node2$node.colour <- 'black'
colnames(node2) <- c('node','node.size','node.colour')
df1 <- data.frame(node=tf,node.size=2,node.colour='#FFDA00')
node2 <- rbind(df1,node2)
edge2 <- tmp
colnames(edge2) <- c('from','to','edge.width')
edge2$edge.colour <- "#1B9E77"
torange=c(0.1,1)
edge2$edge.width <- scales::rescale(edge2$edge.width,to=torange)
graph_data <- tidygraph::tbl_graph(nodes = node2, edges = edge2, directed = T)
p1 <- ggraph(graph = graph_data, layout = "stress", circular = TRUE) + geom_edge_arc(aes(edge_colour = edge.colour, edge_width = edge.width)) +
scale_edge_width_continuous(range = c(1,0.2)) +geom_node_point(aes(colour = node.colour, size = node.size))+ theme_void() +
geom_node_label(aes(label = node,colour = node.colour),size = 3.5, repel = TRUE)
p1 <- p1 + scale_color_manual(values=c('#FFDA00','black'))+scale_edge_color_manual(values=c("#1B9E77"))
print(p1)
}
dev.off()
#plot activity heatmap
meta <- sce@meta.data
celltype <- ct.col
cellsPerGroup <- split(rownames(meta),meta[,celltype])
sub_regulonAUC <- regulonAUC[onlyNonDuplicatedExtended(rownames(regulonAUC)),]
# Calculate average expression:
regulonActivity_byGroup <- sapply(cellsPerGroup,
function(cells)
rowMeans(getAUC(sub_regulonAUC)[,cells]))
scale='row'
rss <- regulonActivity_byGroup
hei <- ceiling(length(regulonsToPlot)*0.4)
pn1 <- rss[regulonsToPlot,]
pdf(paste0('regulon_activity_in_',ct.col,'.pdf'),wei,hei)
print(
pheatmap(pn1,scale=scale,show_rownames = T, main='regulons activity')
)
dev.off()
hei <- ceiling(length(rownames(rss))*0.2)
pdf(paste0('all_regulons_activity_in_',ct.col,'.pdf'),wei,hei)
print(
pheatmap(rss,scale=scale,show_rownames = T,main='all regulons activity')
)
dev.off()
##calculate fc
df = do.call(rbind,
lapply(1:ncol(rss), function(i){
dat= data.frame(
regulon = rownames(rss),
cluster = colnames(rss)[i],
sd.1 = rss[,i],
sd.2 = apply(rss[,-i], 1, get(infun))
)
}))
df$fc = df$sd.1 - df$sd.2
#select top regulon
ntopg <- df %>% group_by(cluster) %>% top_n(ntop1, fc)
ntopgene <- unique(ntopg$regulon)
write.table(ntopgene,'sd_regulon_activity.list',sep='\t',quote=F,row.names=F,col.names=F)
anrow = data.frame( group = ntopg$cluster)
lcolor <- incolor[1:length(unique(ntopg$cluster))]
names(lcolor) <- unique(anrow$group)
annotation_colors <- list(group=lcolor)
pn1 = rss[ntopg$regulon,]
pn2 = rss[unique(ntopg$regulon),]
rownames(pn1) <- make.unique(rownames(pn1))
rownames(anrow) <- rownames(pn1)
scale='row'
hei <- ceiling(length(ntopg$regulon)*0.4)
pdf(paste0('regulon_activity_in_sd_topgene_',ct.col,'.pdf'),wei,hei)
print(
pheatmap(pn1,annotation_row = anrow,scale=scale,annotation_colors=annotation_colors,show_rownames = T,main='sd top regulons')
)
print(
pheatmap(pn2,scale=scale,show_rownames = T, main='sd top unique regulons ')
)
dev.off()
}