10X單細(xì)胞(10X空間轉(zhuǎn)錄組)之單細(xì)胞精度分析細(xì)胞通訊

開工了灾前,到了我們上班的時(shí)間了,剛開始孟辑,我們來分享一些新出現(xiàn)的方法哎甲,看看方法上有哪些更新蔫敲,我們雖然做不了潮流的引領(lǐng)者,但是要跟隨潮流炭玫,保證自己不被時(shí)代拋棄奈嘿,今天分享的文獻(xiàn)在Comparative analysis of cell-cell communication at single-cell resolution,里面對(duì)當(dāng)前細(xì)胞通訊的方法做了一定的總結(jié),我們借鑒一下吞加,并看看作者的方法有哪些可取之處裙犹。

目前的單細(xì)胞通訊的分析特點(diǎn)

  • at the level of the cell type or cluster, discarding single-cell level information.(這個(gè)大家應(yīng)該都非常清楚了,通訊強(qiáng)度就是基因表達(dá)的平均值相乘)

然而真實(shí)的細(xì)胞通訊

  • 細(xì)胞通訊 does not operate at the level of the group; rather, such interactions take place between individual cells(真正的細(xì)胞通訊是細(xì)胞之間榴鼎,而不是細(xì)胞類型群體之間伯诬,這也是空間轉(zhuǎn)錄組為什么分析空間臨近通訊的原因)。

如果我們要真正的研究細(xì)胞之間的通訊巫财,那么

  • analyze interactions at the level of the single cell
  • leverage the full information content contained within scRNA-seq data by looking at up- and down-stream cellular activity
  • enable comparative analysis between conditions;
  • robust to multiple experimental designs

不知道大家對(duì)單細(xì)胞做細(xì)胞通訊的方式有什么更好的建議呢盗似??平项?赫舒,我們來看看作者開發(fā)的方法,Scriabin闽瓢,an adaptable and computationally-efficient method for CCC analysis.Scriabin 通過結(jié)合收集的配體-受體相互作用數(shù)據(jù)庫(kù)接癌、下游細(xì)胞內(nèi)信號(hào)傳導(dǎo)模型、基于anchors的數(shù)據(jù)集集成和基因網(wǎng)絡(luò)分析扣讼,以單細(xì)胞分辨率剖析復(fù)雜的交流途徑缺猛,以在單細(xì)胞分辨率下恢復(fù)具有生物學(xué)意義的 CCC edges。

Scriabin的分析框架

  • the cell-cell interaction matrix workflow, optimal for smaller datasets, analyzes communication methods used for each cell-cell pair in the dataset(細(xì)胞對(duì)分析通訊)椭符,CCC 的基本單位是表達(dá)配體的發(fā)送細(xì)胞 Ni荔燎,這些配體被接收細(xì)胞 Nj 表達(dá)的同源受體接收。 Scriabin 通過計(jì)算數(shù)據(jù)集中每對(duì)細(xì)胞對(duì)每個(gè)配體-受體對(duì)表達(dá)的幾何平均值销钝,在細(xì)胞-細(xì)胞相互作用矩陣 M 中編碼此信息.由于配體-受體相互作用是定向的有咨,Scriabin將每個(gè)細(xì)胞分別視為“發(fā)送者”(配體表達(dá))和“接收者”(受體表達(dá)),從而保持 CCC 網(wǎng)絡(luò)的定向性質(zhì)蒸健。 M 可以類似于基因表達(dá)矩陣進(jìn)行處理座享,并用于降維、聚類和差異分析
  • the summarized interaction graph workflow, designed for large comparative analyses, identifies cell-cell pairs with different total communicative potential between samples(識(shí)別有意義的配受體對(duì))
  • the interaction program discovery workflow, suitable for any dataset size, finds modules of co-expressed ligand-receptor pairs(第三點(diǎn)尤為重要)似忧。
圖片.png

圖片.png

具體的方法

Generation of cell-cell interaction matrix

將一對(duì)細(xì)胞之間的細(xì)胞-細(xì)胞相互作用向量定義為每個(gè)同源配體-受體對(duì)的表達(dá)值的幾何平均值渣叛。 形式上,sender cell Ni 和receiver cell Nj 之間的交互向量 V 由下式給出:

圖片.png

where ????,???? represent a cognate ligand-receptor pair.選擇將配體和受體表達(dá)值相乘橡娄,以便配體或受體表達(dá)的零值將導(dǎo)致相互作用向量的相應(yīng)索引為零诗箍。 此外,選擇取配體-受體表達(dá)值乘積的平方根挽唉,這樣高表達(dá)的配體-受體對(duì)不會(huì)不成比例地驅(qū)動(dòng)下游分析滤祖。This definition is equivalent to the geometric mean。細(xì)胞-細(xì)胞相互作用矩陣 M 是通過連接細(xì)胞-細(xì)胞相互作用向量來構(gòu)建的瓶籽。 線性回歸用于校正 M 因測(cè)序深度引起的變化匠童,其中 Ni、Nj 的總測(cè)序深度定義為 Ni 和 Ni 中唯一分子標(biāo)識(shí)符 (UMI) 的總和塑顺。 M 用作低維嵌入的輸入以進(jìn)行可視化汤求,并用作最近鄰圖的輸入以進(jìn)行基于圖的聚類。

矩陣的下游分析(就是Seurat的基本操作)
圖片.png

通訊programs的分析严拒,Interaction program discovery workflow

Iterative approximation of a ligand-receptor pair topological overlap matrix (TOM)
圖片.png
Identification and significance testing of interaction programs(層次聚類,WGCNA)
圖片.png

看一看示例代碼

安裝
devtools::install_github("BlishLab/scriabin", ref = "main")
single-dataset(單一數(shù)據(jù)樣本)
library(Seurat)
library(SeuratData)
library(scriabin)
library(tidyverse)
#####示例數(shù)據(jù)
install.packages("https://seurat.nygenome.org/src/contrib/panc8.SeuratData_3.0.2.tar.gz", repos = NULL, type = "source") 
library(panc8.SeuratData)
panc8 <- LoadData("panc8")
####Analyze the indrop dataset for the interaction programs tutorial
panc_fl <- subset(panc8, cells = colnames(panc8)[panc8$tech=="fluidigmc1"])
panc_fl <- SCTransform(panc_fl, verbose = F) %>%
  RunPCA(verbose = F) %>%
  RunUMAP(dims = 1:30, verbose = F)
DimPlot(panc_fl, group.by = "celltype", label = T, repel = T) + NoLegend()
圖片.png
First we generate a cell-cell interaction matrix object on the alpha and beta cells in the dataset. Then we scale, find variable features, run PCA and UMAP, find neighbor graphs and clusters, treating this matrix just like we would a gene expression matrix
panc_fl_ccim <- GenerateCCIM(subset(panc_fl, cells = colnames(panc_fl)[panc_fl$celltype %in% c("alpha","beta")]))
panc_fl_ccim <- ScaleData(panc_fl_ccim) %>% 
  FindVariableFeatures() %>%
  RunPCA() %>% 
  RunUMAP(dims = 1:10) %>%
  FindNeighbors(dims = 1:10) %>%
  FindClusters(resolution = 0.2)
p1 <- DimPlot(panc_fl_ccim, group.by = "sender_celltype")
p2 <- DimPlot(panc_fl_ccim, group.by = "receiver_celltype")
p1+p2
DimPlot(panc_fl_ccim, label = T, repel = T) + NoLegend()

cluster4_5.edges <- FindMarkers(panc_fl_ccim, ident.1 = "4", ident.2 = "5")
cluster4_5.edges %>% top_n(20,wt = abs(avg_log2FC))

CCIMFeaturePlot(panc_fl_ccim, seu = panc_fl, 
            features = c("EFNA1","NPNT"), type_plot = "sender")
圖片.png

multi-dataset

library(Seurat)
library(SeuratData)
library(scriabin)

scriabin::load_nichenet_database()

library(SeuratData)
InstallData("ifnb")
ifnb <- UpdateSeuratObject(LoadData("ifnb"))
ifnb <- PercentageFeatureSet(ifnb, pattern = "MT-", col.name = "percent.mt") %>%
  SCTransform(ifnb, vars.to.regress = "percent.mt", verbose = F) %>%
  RunPCA(ifnb, verbose = F) %>% 
  FindNeighbors(ifnb, dims = 1:30, verbose = F) %>%
  FindClusters(ifnb, verbose = F) %>%
  RunUMAP(ifnb, dims = 1:30, verbose = F)
DimPlot(ifnb, label = T, repel = T) + NoLegend()
DimPlot(ifnb, label = T, repel = T, group.by = "seurat_annotations") + NoLegend()
DimPlot(ifnb, group.by = "stim")
圖片.png

This dataset is composed of two sub-datasets: PBMCs stimulated with IFNB (STIM) or control (CTRL).

If we assume there is no batch effect between these two datasets, we can observe from these dimensionality reduction projections the intense transcriptional perturbation caused by this stimulation. This is one situation in which it is easy to see why clustering/subclustering for high resolution CCC analysis can be a problematic task: the degree of perturbation is so high that the monocytes from these two datasets will never cluster together. We need a way to align these subspaces together. Given a monocyte in CTRL, what is its molecular counterpart in STIM?

We take recent progress in dataset integration methodology to develop a high-resolution alignment process we call "binning", where we assign each cell a bin identity so that we maximize the similarity of the cells within a bin, and maximize the representation of all the samples we wish to compare in each bin. After bins are identified, they are tested for the significance of their connectivity.

Dataset binning

There are two workflow options for binning in terms of bin identification:

  1. Bin by coarse cell identities (recommended). If the user specifies coarse cell identities (eg. in a PBMC dataset, T/NK cells, B cells, myeloid cells), this can prevent anchors from being identified between cells that the user believes to be in non-overlapping cell categories. This can result in cleaner bins, but there must be enough cells in each coarse identity from each dataset to proceed properly. A downside is that there may be some small groups of cells that aren't related to major cell populations (eg. a small population of platelets in a PBMC dataset).
  2. Bin all cells. Without specifying coarse cell identities for binning, potentially spurious associations may form between cells (especially in lower quality samples).

Significance testing proceeds by a permutation test: comparing connectivity of the identified bin against randomly generated bins. There are two workflow options for binning in terms of generating random bins:

  1. Pick cells from the same cell type in each random bin (recommended). If the user supplies granular cell type identities, a random bin will be constructed with cells from the same cell type identity. The more granular the cell type calls supplied, the more rigorous the significance testing.
  2. Without supplying granular cell type identities, the dataset will be clustered and the cluster identities used for significance testing. Generating random bins across the entire dataset would result in very few non-significant bins identified.

Here we will define a coarse cell identity to use for bin identification, and then use the Seurat-provided annotations for significance testing.

Additionally, Scriabin allows different reductions to be used for neighbor graph construction. For example, the results from a reference-based sPCA can be used for binning if the cell type relationships in the reference are considered more informative.

ifnb$coarse_ident <- mapvalues(ifnb$seurat_annotations, from = unique(ifnb$seurat_annotations),
                               to = c("myeloid","myeloid","T/misc","T/misc",
                                      "T/misc","T/misc","T/misc","B",
                                      "B","myeloid","myeloid","T/misc","T/misc"))
ifnb <- BinDatasets(ifnb, split.by = "stim", dims = 1:30, 
                    coarse_cell_types = "coarse_ident", sigtest_cell_types = "seurat_annotations")
圖片.png
ifnb_split <- SplitObject(ifnb, split.by = "stim")
sum_ig <- AssembleInteractionGraphs(ifnb, by = "prior", split.by = "stim")
ifnb_split <- pblapply(ifnb_split, function(x) {BuildPriorInteraction(x, correct.depth = T)})
ogig <- lapply(ifnb_split, function(x) {
  as.matrix(x@graphs$prior_interaction)
})
圖片.png

interaction-programs

library(Seurat)
library(SeuratData)
library(scriabin)
library(tidyverse)
library(ComplexHeatmap)
library(cowplot)
install.packages("https://seurat.nygenome.org/src/contrib/panc8.SeuratData_3.0.2.tar.gz", repos = NULL, type = "source") 
library(panc8.SeuratData)
panc8 <- LoadData("panc8")

panc_id <- subset(panc8, cells = colnames(panc8)[panc8$tech=="indrop"])
panc_id <- SCTransform(panc_id, verbose = F) %>%
  RunPCA(verbose = F) %>%
  RunUMAP(dims = 1:30, verbose = F)
DimPlot(panc_id, group.by = "celltype", label = T, repel = T) + NoLegend()
圖片.png
panc_ip <- InteractionPrograms(panc_id, iterate.threshold = 300)
#test for interaction program significance
panc_ip_sig <- InteractionProgramSignificance(panc_ip, n.replicate = 500)
#score cells by expression of interaction program
panc_id <- ScoreInteractionPrograms(panc_id, panc_ip_sig)

panc_id_ip_lig <- as.matrix(panc_id@meta.data %>% 
  select("celltype",
         starts_with("ligands")) %>%
  group_by(celltype) %>%
  summarise_if(is.numeric, mean) %>% column_to_rownames("celltype"))
Heatmap(panc_id_ip_lig, show_column_names = F)
panc_id_ip_rec <- as.matrix(panc_id@meta.data %>% 
  select("celltype",
         starts_with("receptors")) %>%
  group_by(celltype) %>%
  summarise_if(is.numeric, mean) %>% column_to_rownames("celltype"))
Heatmap(panc_id_ip_rec, show_column_names = F)
圖片.png
act_stellate_ip <- panc_id_ip_lig["activated_stellate",]
poi <- gsub("ligands_","",names(which(act_stellate_ip==max(act_stellate_ip))))
#Seurat's FeaturePlot has a nice option to blend expression of two features together on the same plot
p <- FeaturePlot(panc_id, 
          features = c(paste0("ligands_",poi),
                            paste0("receptors_",poi)), 
          blend = T, combine = F, 
          cols = c("grey90","purple","yellowgreen"), order = T)
p[[3]] + NoLegend()
DimPlot(panc_id, group.by = "celltype", label = T, repel = T) + NoLegend()
圖片.png
moi <- reshape2::melt(mod_df %>% dplyr::filter(name==poi) %>%
  select("lr_pair",contains("connectivity"))) %>% arrange(-value)
moi$lr_pair <- factor(moi$lr_pair, levels = unique(moi$lr_pair))
ggplot(moi, aes(x = lr_pair, y = value, color = variable)) + 
  geom_point() + theme_cowplot() + ggpubr::rotate_x_text() + labs(x = NULL, y = "Intramodular\nconnectivity")
圖片.png
最后問一句扬绪,大家覺得這個(gè)方法改進(jìn)之處合理么?

生活很好裤唠,有你更好

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者
禁止轉(zhuǎn)載挤牛,如需轉(zhuǎn)載請(qǐng)通過簡(jiǎn)信或評(píng)論聯(lián)系作者。
  • 序言:七十年代末种蘸,一起剝皮案震驚了整個(gè)濱河市墓赴,隨后出現(xiàn)的幾起案子,更是在濱河造成了極大的恐慌航瞭,老刑警劉巖诫硕,帶你破解...
    沈念sama閱讀 216,324評(píng)論 6 498
  • 序言:濱河連續(xù)發(fā)生了三起死亡事件,死亡現(xiàn)場(chǎng)離奇詭異刊侯,居然都是意外死亡章办,警方通過查閱死者的電腦和手機(jī),發(fā)現(xiàn)死者居然都...
    沈念sama閱讀 92,356評(píng)論 3 392
  • 文/潘曉璐 我一進(jìn)店門滨彻,熙熙樓的掌柜王于貴愁眉苦臉地迎上來藕届,“玉大人,你說我怎么就攤上這事疮绷『采啵” “怎么了?”我有些...
    開封第一講書人閱讀 162,328評(píng)論 0 353
  • 文/不壞的土叔 我叫張陵冬骚,是天一觀的道長(zhǎng)椅贱。 經(jīng)常有香客問我,道長(zhǎng)只冻,這世上最難降的妖魔是什么庇麦? 我笑而不...
    開封第一講書人閱讀 58,147評(píng)論 1 292
  • 正文 為了忘掉前任,我火速辦了婚禮喜德,結(jié)果婚禮上山橄,老公的妹妹穿的比我還像新娘。我一直安慰自己舍悯,他們只是感情好航棱,可當(dāng)我...
    茶點(diǎn)故事閱讀 67,160評(píng)論 6 388
  • 文/花漫 我一把揭開白布睡雇。 她就那樣靜靜地躺著,像睡著了一般饮醇。 火紅的嫁衣襯著肌膚如雪它抱。 梳的紋絲不亂的頭發(fā)上,一...
    開封第一講書人閱讀 51,115評(píng)論 1 296
  • 那天朴艰,我揣著相機(jī)與錄音观蓄,去河邊找鬼。 笑死祠墅,一個(gè)胖子當(dāng)著我的面吹牛侮穿,可吹牛的內(nèi)容都是我干的。 我是一名探鬼主播毁嗦,決...
    沈念sama閱讀 40,025評(píng)論 3 417
  • 文/蒼蘭香墨 我猛地睜開眼亲茅,長(zhǎng)吁一口氣:“原來是場(chǎng)噩夢(mèng)啊……” “哼!你這毒婦竟也來了金矛?” 一聲冷哼從身側(cè)響起芯急,我...
    開封第一講書人閱讀 38,867評(píng)論 0 274
  • 序言:老撾萬榮一對(duì)情侶失蹤,失蹤者是張志新(化名)和其女友劉穎驶俊,沒想到半個(gè)月后娶耍,有當(dāng)?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體,經(jīng)...
    沈念sama閱讀 45,307評(píng)論 1 310
  • 正文 獨(dú)居荒郊野嶺守林人離奇死亡饼酿,尸身上長(zhǎng)有42處帶血的膿包…… 初始之章·張勛 以下內(nèi)容為張勛視角 年9月15日...
    茶點(diǎn)故事閱讀 37,528評(píng)論 2 332
  • 正文 我和宋清朗相戀三年榕酒,在試婚紗的時(shí)候發(fā)現(xiàn)自己被綠了。 大學(xué)時(shí)的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片故俐。...
    茶點(diǎn)故事閱讀 39,688評(píng)論 1 348
  • 序言:一個(gè)原本活蹦亂跳的男人離奇死亡想鹰,死狀恐怖,靈堂內(nèi)的尸體忽然破棺而出药版,到底是詐尸還是另有隱情辑舷,我是刑警寧澤,帶...
    沈念sama閱讀 35,409評(píng)論 5 343
  • 正文 年R本政府宣布槽片,位于F島的核電站何缓,受9級(jí)特大地震影響,放射性物質(zhì)發(fā)生泄漏还栓。R本人自食惡果不足惜碌廓,卻給世界環(huán)境...
    茶點(diǎn)故事閱讀 41,001評(píng)論 3 325
  • 文/蒙蒙 一、第九天 我趴在偏房一處隱蔽的房頂上張望剩盒。 院中可真熱鬧谷婆,春花似錦、人聲如沸。這莊子的主人今日做“春日...
    開封第一講書人閱讀 31,657評(píng)論 0 22
  • 文/蒼蘭香墨 我抬頭看了看天上的太陽(yáng)。三九已至廷区,卻和暖如春唯灵,著一層夾襖步出監(jiān)牢的瞬間贾铝,已是汗流浹背隙轻。 一陣腳步聲響...
    開封第一講書人閱讀 32,811評(píng)論 1 268
  • 我被黑心中介騙來泰國(guó)打工, 沒想到剛下飛機(jī)就差點(diǎn)兒被人妖公主榨干…… 1. 我叫王不留垢揩,地道東北人玖绿。 一個(gè)月前我還...
    沈念sama閱讀 47,685評(píng)論 2 368
  • 正文 我出身青樓,卻偏偏與公主長(zhǎng)得像叁巨,于是被迫代替她去往敵國(guó)和親斑匪。 傳聞我的和親對(duì)象是個(gè)殘疾皇子,可洞房花燭夜當(dāng)晚...
    茶點(diǎn)故事閱讀 44,573評(píng)論 2 353

推薦閱讀更多精彩內(nèi)容