10X單細(xì)胞(10X空間轉(zhuǎn)錄組)細(xì)胞通訊分析之InterCellDB(推斷通訊中感興趣的生物學(xué)功能)

作者,追風(fēng)少年i

hello牧抵,大家好笛匙,今天這個(gè)日子相信大家都不會(huì)陌生吧,不知道有沒(méi)有當(dāng)年復(fù)讀的童鞋犀变,日子很快妹孙,恐怕高考之后,再也沒(méi)有經(jīng)歷過(guò)高中那種煉獄的日子获枝。

今天我們分享一個(gè)好的內(nèi)容蠢正,關(guān)于細(xì)胞通訊,其實(shí)關(guān)于細(xì)胞通訊省店,最終的目的還是要知道通訊的生物學(xué)作用嚣崭,文章在InterCellDB: A User-Defined Database for Inferring Intercellular Networks,2022年發(fā)表在advanced science,單位是浙江大學(xué)懦傍,看看這個(gè)國(guó)產(chǎn)分析方法是否能媲美CellChat雹舀。

單細(xì)胞組學(xué)技術(shù)為分析細(xì)胞間網(wǎng)絡(luò)提供了足夠的分辨率。已經(jīng)開(kāi)發(fā)了幾種分析工具來(lái)揭示細(xì)胞間相互作用粗俱,例如 CellPhoneDB说榆、SingleCellSignalR、CellChat寸认、iTALK 和 NicheNet签财。但是,該領(lǐng)域仍有未滿(mǎn)足的需求偏塞。首先唱蒸,所有當(dāng)前的分析工具都提供有限類(lèi)型的交互。實(shí)際上烛愧,細(xì)胞間網(wǎng)絡(luò)要復(fù)雜得多油宜,包括配體-受體掂碱、受體-受體、細(xì)胞外基質(zhì)-受體和囊泡-細(xì)胞相互作用慎冤。此外疼燥,所有這些已發(fā)表的工具在提供特定功能的細(xì)胞間相互作用方面的能力有限。例如蚁堤,在最近的研究中醉者,發(fā)現(xiàn)沒(méi)有可用的分析方法能夠預(yù)測(cè)調(diào)節(jié)性 T 細(xì)胞(Treg 細(xì)胞)和小膠質(zhì)細(xì)胞之間的配體-受體相互作用如何促進(jìn)缺血性中風(fēng)后的少突膠質(zhì)細(xì)胞生成和白質(zhì)修復(fù)(說(shuō)白了就是通訊具體的生物學(xué)功能的挖掘)。最后披诗,大多數(shù)當(dāng)前方法都有固定的內(nèi)置搜索標(biāo)準(zhǔn)撬即,這限制了分析的維度。需要一個(gè)新平臺(tái)來(lái)以用戶(hù)定義的方式分析通訊網(wǎng)絡(luò)呈队。因此剥槐,開(kāi)發(fā)了一個(gè)名為 InterCellDB 的新分析平臺(tái)來(lái)解決這些差距,并使用單細(xì)胞 RNA 測(cè)序 (scRNA-seq) 數(shù)據(jù)集結(jié)合感興趣的特定生物學(xué)功能宪摧,定義的細(xì)胞間通訊綜合分析粒竖。

Overview of InterCellDB database and workflow.

首先看看軟件特點(diǎn)

  • 對(duì)配受體基因進(jìn)行了位置和功能分類(lèi)
  • 差異基因納入通訊分析,All the DEGs were annotated by matching to the gene database, including their cellular localization, functional classification, and their attribution in GO terms几于。當(dāng)然蕊苗,基因可以自我設(shè)定(比如某個(gè)GO通路的基因)。
  • 置換檢驗(yàn)
  • 可視化做的不錯(cuò)

參考示例

install.packages("devtools")
install.packages("BiocManager")
BiocManager::install(pkgs = "GO.db")
devtools::install_github("ZJUDBlab/InterCellDB")
library(InterCellDB)  # 0.9.2
library(Seurat)       # 3.2.0
library(dplyr)        # 1.0.5
library(future)       # 1.21.0

1沿彭、Data preparation and create InterCell object

InterCellDB requires 2 data as input:

  • normalized count matrix
  • differentially expressed genes (DEGs) with their belonging clusters
seurat.obj <- readRDS(url("https://figshare.com/ndownloader/files/31497371"))
tmp.counts <- seurat.obj[["RNA"]]@data
colnames(tmp.counts) <- as.character(Idents(seurat.obj))
# show data structure, row names are genes, column names are clusters
tmp.counts[1:5, 1:5]
# 5 x 5 sparse Matrix of class "dgCMatrix"
#            cDC2        NK        NK       cDC2   Myeloid
# Gnai3 .         1.4526836 0.7893417 0.02189368 .        
# Cdc45 0.1342404 0.1345867 .         1.01913912 0.1698247
# Scml2 .         .         .         .          .        
# Apoh  .         .         .         .          .        
# Narf  .         .         .         0.02189368 .

make sure the data normalization and cell clustering are done.Take Seurat for example, the NormalizeData or SCTransform should be processed for getting normalized data, and FindClusters should be processed for getting cell clustering result

差異基因示例

markers.used <- readRDS(url("https://figshare.com/ndownloader/files/31497401"))
#                p_val    LogFC pct.1 pct.2          PVal cluster    gene
# Plbd1   2.436468e-234 1.751750 0.990 0.293 4.703844e-230    cDC2   Plbd1
# Cd209a  1.126670e-219 2.934390 0.642 0.099 2.175148e-215    cDC2  Cd209a
# Ms4a6c  2.003678e-217 1.306534 0.916 0.244 3.868301e-213    cDC2  Ms4a6c
# Alox5ap 2.043584e-213 1.118439 0.922 0.214 3.945343e-209    cDC2 Alox5ap
# Fgr     3.582238e-205 1.177831 0.775 0.154 6.915869e-201    cDC2     Fgr

To note, InterCellDB defines several mandatory column names when read in DEGs, which are columns named 'gene', 'cluster', 'LogFC', 'PVal', and their meanings are listed below:

  • Column 'gene', the gene name.
  • Column 'cluster', the belonging cluster of every DEG.
  • Column 'LogFC', the log fold changes of each DEG.
  • Column 'PVal', the statistical significance for the gene being defined as DEG.

Then, the InterCell object is created and we name it here as inter.obj. This variable is used for doing all the following analysis.

inter.obj <- CreateInterCellObject(markers.used, species = "mouse", add.exprs = TRUE, exprs.data = tmp.counts)

2朽砰、Customize interactions and proteins from databases

For customizing interactions, InterCellDB provides 4 options: evidence source, credibility score, action mode and action effect. To run this example data, we use all experimentally validated interactions, pathway-curated interactions and predicted interactions, and select those physically associated ones.

inter.obj <- SelectDBSubset(inter.obj,
        use.exp = TRUE,   # to use experimentally validated interactions
        exp.score.range = c(1, 1000),  # use all credibility score for 'use.exp'
        use.know = TRUE,  # to use pathway curated interactions
        know.score.range = c(1, 1000),  # use all credibility score for 'use.know'
        use.pred = TRUE,  # to use predicted interactions
        pred.score.range = c(1, 1000),  # use all credibility score for 'use.pred' 
        sel.action.mode = "binding",  # select physically associated ones
        sel.action.effect = "ALL")    # not limiting action effects

Details about 4 options on customizing interaction database are given in the supporting information of the article. Here, we list some of the most commonly used options.

  • evidence sources: experimentally validated (use.exp), pathway curated (use.know), predicted (use.pred).
  • credibility score: ranging from 1 to 1000. Highest confidence should >= 900, and high confidence should >= 700, and medium >= 400, while low < 400.
  • action mode: giving the relation of interacting proteins. Commonly used one is 'binding', which selects physically associated protein interactions.
  • action effect: giving the effect of protein interaction. Commonly used ones are 'positive' and 'negative'. 'positive' means one protein promotes the expression of the other, while 'negative' means the inhibition of the other.

For customizing included proteins, InterCellDB also provides 4 options: protein expression change, protein subcellular location, function and relevant biological process.

# fetch genes of interest. GO.db version is v3.8.2 here, different version may have non-identical result.
genes.receiver <- FetchGeneOI(inter.obj, 
        sel.location = "Plasma Membrane",  # fetch proteins located in plasma membrane
        sel.location.score = c(4:5),       # 4 and 5 are of high confidence
        sel.type = "Receptor",             # fetch those receptors
        sel.go.terms = "GO:0006955"        # fetch [GO:0006955]immune response related proteins
        )
# [1] "Fetch 160 genes of interest."
genes.sender <- FetchGeneOI(inter.obj, 
        sel.location = "Extracellular Region",  # fetch proteins located in extracellular region
        sel.location.score = c(4:5),            # 4 and 5 are of high confidence
        sel.go.terms = "GO:0006955"             # fetch [GO:0006955]immune response related proteins
        )
# [1] "Fetch 281 genes of interest."

To note, as the protein expression change is highly associated with the input data, InterCellDB put it to be done in later steps, see parameter sel.exprs.change in AnalyzeInterInFullView. If users want to explore more selections on the other 3 options, InterCellDB provides several functions to list all items for them.

  • protein subcellular location: ListAllGeneLocation
  • protein function: ListAllGeneType and ListAllGeneMergeType
  • protein involved biological process: ListAllGeneGOTerm

3、Perform full network analysis

Full network analysis summarizes the interactions between every 2 cell clusters, and infers main participants by aggregated power and total count of gene pairs. The power of every gene pair is calculated by the product of expressions of 2 participating genes. The selected gene pairs are those with p-value < 0.05 in cell label permutation test (the p-value cutoff can also be set by users).

Full network analysis goes with 2 steps:

  • generate cell label permutation list
  • network analysis and filter statistically significant gene pairs
plan("multiprocess", workers = 2)  # package future provides the parallel interface, here create 2 parallel processes  
# using options(future.globals.maxSize=1024^2 * 1000) to get more space to run parallel process
tmp.permlist <- Tool.GenPermutation(inter.obj, tmp.counts, perm.times = 100)
plan("sequential")  # close the parallel processes

推斷網(wǎng)絡(luò)

plan("multiprocess", workers = 4)
inter.obj <- AnalyzeInterInFullView(inter.obj, 
        sel.some.genes.X = genes.receiver,  # set the genes for receiver cells
        sel.some.genes.Y = genes.sender,    # set the genes for sender cells
        sel.exprs.change = "Xup.Yup",       # select genes up-regulated for both sender cells and receiver cells
        run.permutation = TRUE,             # run statistical test with permutation list
        perm.expression = tmp.permlist,     # given the permutation list, generated by `Tool.GenPermutation`
        perm.pval.sides = "one-side-large", # use one-tailed test
        perm.pval.cutoff = 0.05)            # p-value cutoff 0.05
plan("sequential")

The results of full network analysis can be further collected by GetResultFullView for visualization or table output. Users can define their clusters of interest. Here, we select the interactions among clusters in tumor from example data, which exclude interactions involving 'LN T cell', 'Endo lymphatic', 'Endo LN' and 'Fibroblast LN' clusters.

torm.LN.clusters <- c("LN T cell", "Endo lymphatic", "Endo LN", "Fibroblast LN")
all.clusters <- ListAllClusters(inter.obj)                # list all clusters
used.clusters <- setdiff(all.clusters, torm.LN.clusters)  # select those clusters in tumor
used.clusters <- used.clusters[order(used.clusters)]
# show the result of full network analysis
fullview.result <- GetResultFullView(inter.obj, 
        show.clusters.in.x = used.clusters,
        show.clusters.in.y = used.clusters,
        plot.axis.x.name = "signal receiving cells",
        plot.axis.y.name = "signal sending cells",
        nodes.size.range = c(1,8))
Tool.ShowGraph(fullview.result)  # visualization
Tool.WriteTables(fullview.result, dir.path = "./")  # write result into csv file
圖片.png

4喉刘、Perform intercellular analysis

Intercellular analysis focuses on one cell-cell interaction, and explores the results in 4 aspects:

  • summarize action mode and action effect
  • subset and rank gene pairs
  • evaluate the specificity of gene pairs
  • summarize gene pairs in spatial pattern

4.1 Summarize action mode and action effect

inter.obj <- FetchInterOI(inter.obj, cluster.x = "Myeloid", cluster.y = "CAF 1")
inter.obj <- AnalyzeInterInAction(inter.obj)
used.action.mode <- c("activation", "inhibition", "catalysis", "reaction", "expression", "ptmod")  # action mode: not showing mode 'binding' and 'other'
used.color.mode <- c("#FB8072", "#80B1D3", "#8DD3C7", "#FFFFB3", "#BEBADA", "#FDB462")  # customized color, one-to-one corresponding to those in `used.action.mode`
action.mode.result <- GetResultPieActionMode(inter.obj, 
        limits.exprs.change = c("Xup.Yup"),
        limits.action.mode = used.action.mode,
        color.action.mode = used.color.mode)
Tool.ShowGraph(action.mode.result)
圖片.png
action.effect.result <- GetResultPieActionEffect(inter.obj, limits.exprs.change = c("Xup.Yup"))
Tool.ShowGraph(action.effect.result)
圖片.png

4.2 Subset and rank gene pairs

Circumstances exist when too many gene pairs are returned or the initial selection of gene subsets are not well satisfied to users' needs. We provide additional function SelectInterSubset for picking up subsets of gene pairs in intercellular scale. All subset selection given in full network analysis are applicable in this function. In addition, action mode and action effect can be customized.

Here, we filter activated gene pairs, i.e. selecting those with either 'activation' action mode or 'positive' action effect.

inter.obj <- SelectInterSubset(inter.obj, 
        sel.action.mode = "activation",
        sel.action.effect = "positive",
        sel.action.merge.option = "union"
        )
result.inter.pairs <- GetResultTgCrosstalk(inter.obj, 
        func.to.use = list(
            Power = function(x) { log1p(x) },        # further log-transform the power
            Confidence = function(x) { 1/(1+x) }     # invert the confidence to show
        ),  
        axis.order.xy = c("Power", "Power"),       # order genes by power
        axis.order.xy.decreasing = c(TRUE, TRUE),  # select if ordering is decreasing
        nodes.size.range = c(1, 8))
Tool.ShowGraph(result.inter.pairs)
圖片.png

We rank gene pairs by their power, and select those top ranked genes

inter.obj <- SelectInterSubset(inter.obj, 
        sel.some.genes.X = c("Itgb2","Itgam","C5ar1","Ccr2","C3ar1","Ccr1","Ccr5"),
        sel.some.genes.Y = c("C3","Ccl11","Cxcl12","Cxcl1","Ccl2","Ccl7","C4b")
        )

4.3 Evaluate the specificity of gene pairs

Besides the power rank, we also evaluate the specificity of the gene pairs

tmp.target.cluster.groups <- ListClusterGroups(inter.obj, 
        use.former = TRUE,
        cluster.former = c("Myeloid")  # list all crosstalk with Myeloid as receiver cell, this order is aligned with the X and Y axis in fullview result. In this example, clusters listed in X axis are the receiver cells.
        )
tmp.target.cluster.groups <- setdiff(tmp.target.cluster.groups, c("Myeloid~Myeloid", "Myeloid~Endo LN", "Myeloid~Endo lymphatic", "Myeloid~Fibroblast LN", "Myeloid~LN T cell"))

The analysis is processed in AnalyzeInterSpecificity and users can visualize the result by GetResultTgSpecificity.

inter.obj <- AnalyzeInterSpecificity(inter.obj, 
        to.cmp.cluster.groups = tmp.target.cluster.groups
        )

result.specificity <- GetResultTgSpecificity(inter.obj,
        sel.uq.cnt.options = seq_along(tmp.target.cluster.groups),
        plot.uq.cnt.merged = TRUE, 
        plot.name.X.to.Y = FALSE,
        func.to.use = list(Power = function(x) { log1p(x) },
                       Confidence = function(x) { 1/(1+x) }),
        dot.size.range = c(2,8),
        dot.colour.seq = c("#00809D", "#EEEEEE", "#C30000"),
        dot.colour.value.seq = c(0, 0.4, 1)
        )
Tool.ShowGraph(result.specificity)
圖片.png

4.4 Summarize gene pairs in spatial pattern

蛋白質(zhì)-蛋白質(zhì)相互作用受到它們空間接近性的限制瞧柔。 InterCellDB 為基因提供了相應(yīng)的基因產(chǎn)物亞細(xì)胞位置,還提供了可視化方法以空間模式顯示基因?qū)Α?可以將作用方式饱搏、作用效果非剃、蛋白質(zhì)亞細(xì)胞定位、基因表達(dá)變化整合到現(xiàn)實(shí)情況中推沸。
Here, we choose only the plasma membrane protein and extracellular region protein to show the paracrine protein-protein interactions.

tmp.hide.locations <- setdiff(ListAllGeneLocation(inter.obj), c("Plasma Membrane", "Extracellular Region"))  # select the subcellular locations allowed to show
set.seed(101L)  # set seed to make reproducible result, or gene will be re-arranged every time re-running GetResultTgCellPlot
result.sptialpattern <- GetResultTgCellPlot(inter.obj, 
        area.extend.times = 20,  # control the size of plotting. Increase the value by 10 when warnings ask to
        hide.other.area = TRUE,
        hide.locations.X = tmp.hide.locations,
        hide.locations.Y = tmp.hide.locations,
    expand.gap.radius.list = list(ECM = 8, CTP = 2, NC = 2, OTHER = 2),
        link.size = 0.3,
        link.alpha = 0.8,
        legend.manual.left.spacing = grid::unit(0.1, "cm")
        )
Tool.ShowGraph(result.sptialpattern)
圖片.png

生活很好,有你更好券坞,百度文庫(kù)出現(xiàn)了大量抄襲我的文章鬓催,對(duì)此我深表無(wú)奈,我寫(xiě)的文章恨锚,別人掛上去賺錢(qián)宇驾,抄襲可恥,掛到百度文庫(kù)的人更可恥

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者
禁止轉(zhuǎn)載猴伶,如需轉(zhuǎn)載請(qǐng)通過(guò)簡(jiǎn)信或評(píng)論聯(lián)系作者课舍。
  • 序言:七十年代末塌西,一起剝皮案震驚了整個(gè)濱河市,隨后出現(xiàn)的幾起案子筝尾,更是在濱河造成了極大的恐慌捡需,老刑警劉巖,帶你破解...
    沈念sama閱讀 206,013評(píng)論 6 481
  • 序言:濱河連續(xù)發(fā)生了三起死亡事件筹淫,死亡現(xiàn)場(chǎng)離奇詭異站辉,居然都是意外死亡,警方通過(guò)查閱死者的電腦和手機(jī)损姜,發(fā)現(xiàn)死者居然都...
    沈念sama閱讀 88,205評(píng)論 2 382
  • 文/潘曉璐 我一進(jìn)店門(mén)饰剥,熙熙樓的掌柜王于貴愁眉苦臉地迎上來(lái),“玉大人摧阅,你說(shuō)我怎么就攤上這事汰蓉。” “怎么了棒卷?”我有些...
    開(kāi)封第一講書(shū)人閱讀 152,370評(píng)論 0 342
  • 文/不壞的土叔 我叫張陵顾孽,是天一觀的道長(zhǎng)。 經(jīng)常有香客問(wèn)我娇跟,道長(zhǎng)岩齿,這世上最難降的妖魔是什么? 我笑而不...
    開(kāi)封第一講書(shū)人閱讀 55,168評(píng)論 1 278
  • 正文 為了忘掉前任苞俘,我火速辦了婚禮盹沈,結(jié)果婚禮上,老公的妹妹穿的比我還像新娘吃谣。我一直安慰自己乞封,他們只是感情好,可當(dāng)我...
    茶點(diǎn)故事閱讀 64,153評(píng)論 5 371
  • 文/花漫 我一把揭開(kāi)白布岗憋。 她就那樣靜靜地躺著肃晚,像睡著了一般。 火紅的嫁衣襯著肌膚如雪仔戈。 梳的紋絲不亂的頭發(fā)上关串,一...
    開(kāi)封第一講書(shū)人閱讀 48,954評(píng)論 1 283
  • 那天,我揣著相機(jī)與錄音监徘,去河邊找鬼晋修。 笑死,一個(gè)胖子當(dāng)著我的面吹牛凰盔,可吹牛的內(nèi)容都是我干的墓卦。 我是一名探鬼主播,決...
    沈念sama閱讀 38,271評(píng)論 3 399
  • 文/蒼蘭香墨 我猛地睜開(kāi)眼户敬,長(zhǎng)吁一口氣:“原來(lái)是場(chǎng)噩夢(mèng)啊……” “哼落剪!你這毒婦竟也來(lái)了睁本?” 一聲冷哼從身側(cè)響起,我...
    開(kāi)封第一講書(shū)人閱讀 36,916評(píng)論 0 259
  • 序言:老撾萬(wàn)榮一對(duì)情侶失蹤忠怖,失蹤者是張志新(化名)和其女友劉穎呢堰,沒(méi)想到半個(gè)月后,有當(dāng)?shù)厝嗽跇?shù)林里發(fā)現(xiàn)了一具尸體脑又,經(jīng)...
    沈念sama閱讀 43,382評(píng)論 1 300
  • 正文 獨(dú)居荒郊野嶺守林人離奇死亡暮胧,尸身上長(zhǎng)有42處帶血的膿包…… 初始之章·張勛 以下內(nèi)容為張勛視角 年9月15日...
    茶點(diǎn)故事閱讀 35,877評(píng)論 2 323
  • 正文 我和宋清朗相戀三年,在試婚紗的時(shí)候發(fā)現(xiàn)自己被綠了问麸。 大學(xué)時(shí)的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片往衷。...
    茶點(diǎn)故事閱讀 37,989評(píng)論 1 333
  • 序言:一個(gè)原本活蹦亂跳的男人離奇死亡,死狀恐怖严卖,靈堂內(nèi)的尸體忽然破棺而出席舍,到底是詐尸還是另有隱情,我是刑警寧澤哮笆,帶...
    沈念sama閱讀 33,624評(píng)論 4 322
  • 正文 年R本政府宣布来颤,位于F島的核電站,受9級(jí)特大地震影響稠肘,放射性物質(zhì)發(fā)生泄漏福铅。R本人自食惡果不足惜,卻給世界環(huán)境...
    茶點(diǎn)故事閱讀 39,209評(píng)論 3 307
  • 文/蒙蒙 一项阴、第九天 我趴在偏房一處隱蔽的房頂上張望滑黔。 院中可真熱鬧,春花似錦环揽、人聲如沸略荡。這莊子的主人今日做“春日...
    開(kāi)封第一講書(shū)人閱讀 30,199評(píng)論 0 19
  • 文/蒼蘭香墨 我抬頭看了看天上的太陽(yáng)汛兜。三九已至,卻和暖如春通今,著一層夾襖步出監(jiān)牢的瞬間粥谬,已是汗流浹背。 一陣腳步聲響...
    開(kāi)封第一講書(shū)人閱讀 31,418評(píng)論 1 260
  • 我被黑心中介騙來(lái)泰國(guó)打工辫塌, 沒(méi)想到剛下飛機(jī)就差點(diǎn)兒被人妖公主榨干…… 1. 我叫王不留帝嗡,地道東北人。 一個(gè)月前我還...
    沈念sama閱讀 45,401評(píng)論 2 352
  • 正文 我出身青樓璃氢,卻偏偏與公主長(zhǎng)得像,于是被迫代替她去往敵國(guó)和親狮辽。 傳聞我的和親對(duì)象是個(gè)殘疾皇子一也,可洞房花燭夜當(dāng)晚...
    茶點(diǎn)故事閱讀 42,700評(píng)論 2 345

推薦閱讀更多精彩內(nèi)容