10X single-cell (10X spatial transcriptomics) joint TCR and transcriptome data analysis with TCRdist3 (6): clonotype neighbor graph analysis (CoNGA)

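Hello everyone. Today we continue the series with an important method for the joint analysis of the transcriptome and the TCR. The paper is "Linking T cell receptor sequence to transcriptional profiles with clonotype neighbor graph analysis (CoNGA)", published in Nature Biotechnology (impact factor about 54). This is the original paper.

The method matters a great deal. This post covers the paper; the software's example code will follow in a later post.

Below are the key points I excerpted from the paper.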

Abstract

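1. Multi-modal single-cell technologies capable of simultaneously assaying gene expression and surface phenotype across large numbers of immune cells have described extensive heterogeneity within these complex populations, in healthy and diseased states. (Single-cell multi-omics now covers a lot of ground. The combination getting the most attention lately is single-nucleus RNA + ATAC; 5' + TCR is another very important direction, and if resources allow, single-cell proteomics can be added as well.)
2. These rich, high-dimensional datasets have the potential to reveal new relationships between TCR sequence and T cell phenotype that go beyond identifying features shared by clonally related cells. (Working out the relationship between TCR and transcriptome is also very hard in practice, as anyone who has tried this kind of analysis knows.)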

Introduction

1. Previous work pairing gene expression and TCR sequence has largely focused on the TCR sequence as a unique 'barcode' by which to identify clonally related cells. (This is of course very important for cancer research.)
2. From these works we see that T cell clones derived from a common clonal ancestor tend to display a similar transcriptional profile. (This is the first time I have seen it stated this way, but it is probably right.)
3. However, the relationship between TCR sequence similarity and cellular phenotype has not been systematically explored using the large single-cell datasets now available. (Most current work analyzes the two separately.)
4. but approaches that can identify completely new populations or subpopulations by correlating GEX and TCR sequence have not been reported. Also lacking are methods for identifying correlations between TCR sequence and GEX that do not extend to global similarity or associate with a defined cell population. (Papers doing this kind of analysis are indeed rare; I can hardly recall any.)
5. In parallel to the developments in single-cell profiling, methods for quantifying TCR repertoire features and identifying patterns within them have matured, helping extend our understanding of T cell biology. (Separate analysis of each modality is of course mature.) This is where the method from the earlier posts comes in: TCRdist, which measures the similarity between TCR sequences.

Key point 6. it is clear that T cells targeting the same pathogen-derived epitope utilize T cell receptors that share consistent, definable amino acid motifs. (No doubt about this; otherwise how would they work?)

7. In addition to these conventional T cell responses, it is well known that certain unconventional T cell populations, such as mucosal-associated invariant T (MAIT) cells and invariant natural killer T (iNKT) cells, are characterized by conserved TCR sequence features and GEX profiles. (The implication seems to be that the same TCR sequences come with similar gene expression.)

8. Many distinct T cell subsets have been described, along with markers suitable for enriching them, but it is likely that other subsets linked by TCR and GEX remain undiscovered.
9. hypothesized that by identifying correlations between "TCR neighborhoods", defined by shared sequence features, and gene expression, we could overcome the strict limitation of examining these correlations within individual clonal families and potentially identify novel associations between T cell antigen-specificities and phenotypes. (This is the answer I have long wanted as a bioinformatician.)
10. CoNGA identifies correlations between GEX profile and TCR sequence features by analyzing similarity graphs defined over a set of T cell clonotypes. The authors applied it to a collection of publicly available T cell datasets to search, in an unbiased way, for T cell populations linked by covariation between their repertoire features and GEX profiles.
11. T cell populations specific for individual pMHC epitopes showed distinct gene expression profiles, with EBV epitope-specific T cell populations appearing to cluster according to the stage (latent vs early) of the antigen from which the peptide epitope was derived. (So TCR and transcriptome appear to be strongly correlated.)

Results

CoNGA algorithm (key points)

CoNGA was developed to identify correlations between gene expression profile and TCR sequence in diverse T cell populations without prior knowledge of the precise nature of these correlations. (In other words, an association analysis between gene expression and TCR information.)

The authors envisioned two broad categories of correlation:

1. one based on similarity, in which cells similar with respect to GEX are also similar with respect to TCR sequence, and one based on features, in which specific aspects of GEX and of TCR sequence are correlated, without global similarity of both properties. (Personally I lean toward the second kind.)
2. CoNGA graph-vs-graph correlation was developed to detect the first category of correlation, using the mathematical concept of graph neighborhoods to formalize our intuitive notion of global similarity.
3. Without prior knowledge of the relevant features, discovering feature-based correlations de novo is more challenging, because it requires enumerating and testing all possible pairs of features. (It is, of course, also the most reliable.)
4. CoNGA graph-vs-feature analysis represents a compromise approach in which we assume that, at least on one side of the correlation, some degree of global similarity is present.
5. we then enumerate possible features defined by the other property, and test for graph neighborhoods with biased feature distributions. (How to find the features is the crux.) In practice, we find substantial overlap between the results of these two approaches, as, for example, when the identified features in graph-vs-feature correlations are marker genes for a subpopulation of cells that also share detectable global similarity of gene expression.
6. However, we also see cases in which graph-vs-feature analysis reveals a correlation, for example between expression of a specific gene and usage of a particular V gene segment, that is not characterized by global similarity with respect to both gene expression and TCR sequence. (A correlation like this would be very hard to find otherwise.)
7. These two approaches are also quite complementary: retrospective analysis of graph-vs-graph correlations can, as in the case of the putative MHC-independent population described below, suggest specific gene expression or TCR sequence features that can then be input to graph-vs-feature analysis for sensitive detection of specific correlations. (The two modes are closely linked.)

(Definition) CoNGA similarity graphs are defined at the level of clonotypes rather than individual cells. T cells of the same clonotype, which by definition have the same TCR sequence, tend to have similar GEX profiles.

[Figure]

Thus, similarity graphs based on gene expression drawn at the level of individual cells will contain many edges connecting cells within the same clonal family. (The two observations corroborate each other.)

8. In the TCR similarity graph, each node (clonotype) is connected by edges to its K nearest-neighbor (KNN) nodes based on TCR similarity as assessed by the TCRdist measure, which scores sequence similarity in the pMHC-contacting CDR loops of the TCR alpha and beta chains (here K is an adjustable parameter specified as a fraction of the total number of clonotypes). (This looks like the neighbor-finding approach borrowed from single-cell transcriptomics, except that the distance used is TCRdist, which earlier posts in this series introduced.)
9. In the gene expression (GEX) similarity graph, each clonotype is connected by edges to its KNN clonotypes based on similarity in GEX profile (the Methods explain in detail how this is done). Expanded clones are represented by the GEX profile of a single representative cell, the one with the smallest average distance to the rest of the clonal family.
10. In graph-vs-graph correlation analysis (the figure deserves careful study), CoNGA identifies statistically significant overlap between the GEX similarity graph and the TCR similarity graph.
[Figure]
  • Figure legend: T cell clonotype neighbor graph analysis (CoNGA). (a) In graph-vs-graph analysis, CoNGA identifies correlation between T cell gene expression (GEX) and TCR sequence by constructing a gene expression similarity graph and a TCR sequence similarity graph and looking for statistically significant overlap between them. Overlap is assessed on a per-clonotype basis by counting the number of edges that originate at each clonotype and are shared between the two graphs, or equivalently by measuring the overlap between each clonotype's GEX graph neighbors and its TCR graph neighbors, and assigning a score that reflects the likelihood of seeing equal or greater overlap by chance. (b) A single clonotype and its GEX and TCR neighbors are shown in the GEX (left panel) and TCR (right panel) 2D UMAP projections for the 10x 200k donor2a dataset. The clonotype is marked with a black 'x', its GEX neighbors are shown as blue points, its TCR neighbors as green points, and the clonotypes that are both GEX and TCR neighbors are shown in red. The significance of the observed overlap (8 clones shared between two neighbor sets of size 24 in a total population of 2427 clonotypes) is calculated using the hypergeometric distribution, giving a P value of 1.7×10^-11.
11. Method: We consider each node (clonotype) in turn, count the overlap between its neighbors in the two graphs, and assign a significance score that contrasts this observed overlap to that expected under a simple null model: the CoNGA score for this clonotype, equal to the hypergeometric probability of seeing the observed overlap by chance, multiplied by the total number of clonotypes, to adjust for multiple testing. (Some algorithmic background is clearly needed here.)
12. Interpreting the score: CoNGA scores range from 0 to the number of clonotypes; scores close to 0 are significant, scores around 1 are borderline, and scores above 1 are expected to occur by chance. This mode of analysis identifies T cell clonotypes whose neighbors in gene expression space overlap significantly with their neighbors in TCR sequence space. (A joint analysis like this should be the more valuable one, yet we have usually looked only at the transcriptome or only at the TCR and ignored the joint information.)
13. The authors model the concept of a clonotype's neighbors in GEX or TCR space using the mathematical concept of a graph neighborhood, defined as all the vertices directly connected to one central vertex.
[Figure]
  • Figure legend: (d) The gene KLRB1 (CD161) shows a non-uniform distribution over the TCR sequence landscape (discrete regions of higher expression in red against a background of lower expression in blue), suggesting correlation between gene expression and TCR sequence. This is quantified for a single clonotype (green outline) and its TCR sequence neighbors (black outlines) in the inset violin plot, which shows the KLRB1 expression level for the clonotype and its neighbors on the right and for the remainder of the dataset on the left. The Mann-Whitney-Wilcoxon P value for this expression difference is 1.5×10^-46.
14. CoNGA's second mode of analysis, graph-vs-feature analysis, was developed to detect GEX/TCR correlation that involves specific gene expression or TCR features rather than overall similarity. (This analysis may be the more valuable of the two.) This mode of analysis can identify TCR sequence neighborhoods with differentially expressed genes (this is more like it), for example, or gene expression neighborhoods with distinctive CDR3 sequence features (length, hydrophobicity, charge, etc). (With some background in place this reads much more easily. Going straight into this paper at first, I could not tell what much of it was saying.)
15. In graph-vs-feature correlation analysis, CoNGA maps numerical features derived from one property (gene expression or TCR sequence) onto the similarity graph defined by the other property and looks for neighborhoods in the graph with unexpectedly high or low feature distributions. (The results also display nicely.)
[Figure]
  • Figure legend (c): In graph-vs-feature analysis, a numerical feature defined by one property (here gene expression) is mapped onto a similarity graph defined by the other property (TCR sequence), and graph neighborhoods with skewed score distributions are identified using statistical tests that compare the scores for each neighborhood (including the central vertex) with the scores of the remaining clonotypes.
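To make the graph-vs-graph score concrete, here is a minimal Python sketch of the idea described above: build K-nearest-neighbor sets from two distance matrices, count the per-clonotype overlap, and turn it into a hypergeometric tail probability multiplied by the number of clonotypes. This is not CoNGA's own code. The function names are made up for illustration, and whether the central clonotype is excluded from the population size is my assumption.

```python
from math import comb

def knn_neighbors(dist, frac):
    """For each clonotype, return the set of its K nearest neighbors.

    dist is a square list-of-lists distance matrix. K is given as a fraction
    of the number of clonotypes (at least 1), like the adjustable K in the text.
    """
    n = len(dist)
    k = max(1, int(frac * n))
    neighbors = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: dist[i][j])
        neighbors.append(set(others[:k]))
    return neighbors

def hypergeom_tail(overlap, n_gex, n_tcr, population):
    """P(X >= overlap) when n_tcr items are drawn from a population
    that contains n_gex marked items (hypergeometric upper tail)."""
    total = comb(population, n_tcr)
    upper = min(n_gex, n_tcr)
    favorable = sum(comb(n_gex, i) * comb(population - n_gex, n_tcr - i)
                    for i in range(overlap, upper + 1))
    return favorable / total

def conga_like_scores(gex_nbrs, tcr_nbrs):
    """Per-clonotype score: hypergeometric P value times number of clonotypes."""
    n = len(gex_nbrs)
    scores = []
    for g, t in zip(gex_nbrs, tcr_nbrs):
        # Assumption: the clonotype itself is not part of the population.
        p = hypergeom_tail(len(g & t), len(g), len(t), n - 1)
        scores.append(p * n)
    return scores

# Numbers quoted in the figure legend: overlap of 8 between two neighbor
# sets of size 24, in a population of 2427 clonotypes.
p_example = hypergeom_tail(8, 24, 24, 2427)
score_example = p_example * 2427

# Tiny toy distance matrix (four points on a line), K = half the clonotypes.
toy = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
toy_nbrs = knn_neighbors(toy, 0.5)
```

With the legend's numbers, `p_example` should come out on the order of 10^-11, in line with the reported P value, and `score_example` stays far below 1, i.e. a clear hit under the "scores close to 0 are significant" reading.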

Next, the analysis of specific cases. (Honestly, this part is a bit hard to follow.)

CoNGA graph-vs-graph analysis identifies correlation between gene expression and TCR sequence (starting with public data: paired T cell transcriptomes and TCRs).

[Figure]

The figure above illustrates the CoNGA graph-vs-graph analysis workflow for two datasets of human peripheral blood T cells, one a mix of CD4+ and CD8+ cells (vdj_v1_hs_pbmc, Fig. 2a-c) and one containing flow-sorted CD8+ T cells (10x_200k_donor2a, Fig. 2d-f). Let us go through the steps in detail.

Step 1: the UMAP algorithm is applied to the gene expression and TCRdist matrices of each dataset (the TCRdist matrix is built from pairwise TCR sequence similarity) to generate two dimensional projections of the GEX and TCR landscapes. (In effect, dimensionality reduction.)
Step 2: a graph-based clustering algorithm is applied to the GEX matrix to partition the dataset into clusters of clonotypes with similar transcriptional profiles and to the TCR distance matrix to produce clusters of clonotypes with similar TCR sequences. (In effect, clustering.) The GEX and TCR landscape projections are colored by CoNGA score to visualize the relative location of the top-scoring CoNGA hits in these landscapes. (The score was defined above; keep its properties in mind.)
Step 3: the GEX and TCR cluster assignments of CoNGA hits with scores below a threshold (here 1.0) are shown in the 2D projections using bicolored disks whose left (right) half corresponds to the GEX (TCR) cluster assignment. (This shows how the CoNGA scores are distributed.)
These plots reveal that both datasets contain a substantial number of clonotypes with significant CoNGA scores, and that these CoNGA hits are located in specific regions of the GEX and TCR landscapes. (Consistent with expectations.)
Step 4: To gain insight into these groups of related clonotypes, we leverage the fact that each dataset has been clustered for both GEX and TCR sequence similarity, independently, and thus each clonotype maps to a pair of clusters (a GEX cluster and a TCR sequence cluster). (A mapping that pairs the clusters with each other.) These cluster pairs provide useful handles by which to identify CoNGA hits because they contain information on GEX and TCR, allowing us to map between the two landscapes (which would require a four-dimensional plot for direct visual correspondence).
For example, in Figure 2a at the top of the GEX landscape we can see a cluster of CoNGA hits which all belong to GEX cluster 2 (light green on the left half of the disk) and TCR cluster 3 (red on the right half of the disk), or equivalently, cluster pair (2,3). (This is the first time I have seen this kind of pairing in a paper.) We can infer that these correspond to the group of clonotypes in the TCR landscape also located near the top of the plot, that they are likely CD8+ (from the thumbnail in Fig. 2b), and largely TRAV14 (from the TCR cluster identifier in Fig. 2a). (A clever way to pair things up.)
Each cluster pair containing at least an arbitrary minimum number of CoNGA hits (here 5) is characterized by a row of sequence-logo-style visualizations (Fig. 2c/f). These summarize the salient features of the CoNGA hits, including the top DEGs, TCR gene segment usage, CDR3 motifs, and a GEX logo that highlights several landmark genes defining the canonical T cell subsets (CD4, CD8, etc.). They are laid out in a consistent format that can be scanned to quickly judge where a cluster sits among the major cell subsets.
The paper then gives further graph-vs-graph examples, which surfaced things that had not been noticed before. The pairs found this way characterize both marker genes (mainly the significant DEGs) and the motif sequences of the associated TCR groups. Linking the two is very nice, because it interprets the biology from both the functional angle and the epitope-recognition angle. Let us look at these cases.
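The cluster-pair bookkeeping in steps 3 and 4 is simple enough to sketch. The snippet below is only an illustration under my own assumptions (plain Python lists as input, a hypothetical function name), not the CoNGA implementation: it keeps clonotypes whose score is below the threshold (1.0 in the text), pairs each with its GEX and TCR cluster, and reports the pairs that collect at least the minimum number of hits (5 in the text).

```python
from collections import Counter

def cluster_pairs_with_hits(scores, gex_clusters, tcr_clusters,
                            score_threshold=1.0, min_hits=5):
    """Count CoNGA hits per (GEX cluster, TCR cluster) pair.

    scores, gex_clusters and tcr_clusters are parallel lists with one entry
    per clonotype. A hit is a clonotype whose score is below the threshold.
    Only pairs with at least min_hits hits are returned.
    """
    hits = [i for i, s in enumerate(scores) if s < score_threshold]
    counts = Counter((gex_clusters[i], tcr_clusters[i]) for i in hits)
    return {pair: n for pair, n in counts.items() if n >= min_hits}

# Toy data: six hits in pair (2, 3), three non-hits in the same pair,
# and two hits in pair (0, 1), which is too few to be reported.
toy_scores = [0.01] * 6 + [5.0] * 3 + [0.5] * 2
toy_gex = [2] * 9 + [0] * 2
toy_tcr = [3] * 9 + [1] * 2
toy_pairs = cluster_pairs_with_hits(toy_scores, toy_gex, toy_tcr)
```

On this toy input only the pair (2, 3) survives, mirroring how the paper talks about "cluster pair (2,3)".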

Case 1: CoNGA defines a HOBIT+/HELIOS+ T cell population shared across multiple donors

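1. The data here are CD8+ T cells enriched with pMHC multimers, and the experiments confirmed clear epitope-specific responses. Some non-specific binding was also found, for example to MAIT cells, or to cells that were very likely part of epitope-specific responses to other epitopes. In other words, not every TCR among the collected T cells specifically binds the supplied epitopes. CoNGA detected a large number of significant GEX/TCR correlations across these datasets, identifying 62 cluster pairs of size at least 5 and 42 using the more stringent size threshold of 0.1% of the dataset. (So the results leave room for many possibilities.)
[Figure]
Further analysis splits the data into three groups (see the figure above): (1) Flu M158-responding clones; (2) MAIT cells; (3) a population of clonotypes with a shared expression profile (high expression of genes including the transcription factors ZNF683 (aka HOBIT) and IKZF2 (aka HELIOS), along with DUSP1/2, CD7, CD99, and KLRD1), diverse TCR gene usage, and rather long CDR3 regions. The third group is the one of interest. (The point of this grouping is to find out what the pairs mean biologically, in terms of epitope recognition and gene expression changes.)
To dissect the third group (HOBIT-expressing clonotypes) further, its TCR sequences were compared with background. As expected from examination of the TCR sequence logos in Figure 3, the CDR3α and CDR3β loops are significantly longer in the HOBIT+ CoNGA population than in background (length differs, which is one thing to watch). The CDR3s also (1) carry more positive charge (P<10^-40); (2) contain more aromatic residues, especially tryptophan (P<10^-60), and more hydrophobic and bulky amino acids in general; (3) contain more cysteine (>100-fold enriched in CDR3β, P<10^-50). (So TCR analysis ultimately comes down to protein sequence analysis.) These sequence features closely resemble those found in experimental studies of the TCR repertoire of MHC knockout mice that compared MHC-independent with MHC-restricted TCR sequences.
For MHC-restricted TCRs, the cysteine pattern is thought to reflect negative selection in the thymus, imposed through disulfide bond formation with cysteines in MHC-presented peptides. Hydrophobic residues at the apex of the CDR3 region are important for mediating interactions with self-peptide MHC in the thymus. Based on these trends, the authors hypothesize that this CoNGA-identified population represents a non-canonical, self-specific, or MHC-independent T cell population. To make the analysis easier they developed a numerical score, the iMHC score (for "independent of pMHC"), which captures these defining CDR3 sequence features. (Note that "independent" here means MHC-independent: a higher iMHC score means the CDR3 looks more like those of MHC-independent TCRs, not that it binds MHC more specifically.)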
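The CDR3 properties compared above (length, charge, aromatic content, tryptophan, cysteine) are easy to compute per sequence. The sketch below only counts these raw features; it is not the iMHC score, whose fitted weights are not given in this post. The residue groupings (K/R as positive, D/E as negative, F/W/Y as aromatic), the choice to skip the conserved leading cysteine, and the example sequence are all my own assumptions.

```python
POSITIVE = set("KR")   # assumption: histidine is not counted as charged
NEGATIVE = set("DE")
AROMATIC = set("FWY")

def cdr3_features(cdr3):
    """Return simple per-sequence CDR3 features as a dict."""
    positive = sum(aa in POSITIVE for aa in cdr3)
    negative = sum(aa in NEGATIVE for aa in cdr3)
    return {
        "length": len(cdr3),
        "net_charge": positive - negative,
        "aromatic": sum(aa in AROMATIC for aa in cdr3),
        "tryptophan": cdr3.count("W"),
        # Assumption: the first residue is the conserved cysteine that opens
        # most CDR3 sequences, so only later cysteines are counted.
        "extra_cysteine": cdr3[1:].count("C"),
    }

# Hypothetical CDR3 beta sequence, invented for illustration only.
example = cdr3_features("CASSWRCKGYEQYF")
```

Comparing the distribution of each feature between a CoNGA population and the background set is then a matter of running a two-sample test on these numbers.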
The authors next tried to determine the frequency of the HOBIT+ population among peripheral blood T cells, using putative cell-surface markers identified from its DEGs. The first sample (10x_200k_donor1) suggested that they were likely CD45RA+ CD45ROdim based on TotalSeq labeling, negative for CCR7 expression, and positive for KLRC2, KLRC3, and a number of KIR genes. So in this donor the population is characterized as CD45RA+ CD45ROdim/- CCR7- KLRC2+ KLRC3+ KIR+/-. Notably, the report describing the generation of the HOBIT monoclonal antibody found its expression to be highest in CD45RA+ CCR7- CD8 T cells. Flow cytometry on PBMC samples from healthy blood donors using these surface markers confirmed the presence of populations expressing KLRC2 and KIR2D (i.e. KLRC2+KIR2D-, KLRC2+KIR2D+, and KLRC2-KIR2D+). (In the expected phenotype KLRC2 is always expressed, while KIR2D expression varies.) The KLRC2-KIR2D+ phenotype, however, does not fit these criteria and may represent a different (but sizable) CD8 subset.
[Figure]

[Figure]
As a percentage of total PBMC CD8 T cells, the KLRC2+ KIR2D+/- subset is in the range of 0.2-10.1% while KLRC2- KIR2D+ cells ranged between 0.3-7.6%.
[Figure]
Next, KLRC2+ KIR2D+/- and KLRC2-KIR2D+ CD8 T cells were sorted, and qRT-PCR was used to measure the expression of ZNF683, KLRC2, and KLRC3 in these populations relative to each donor's own sorted CD8+CD45RA-CD45RO+ memory subset. Here, we found expression of KLRC2 and KLRC3 was enriched in the KLRC2+ KIR2D+/- CD8 T cells, and to a lesser extent in the KLRC2- KIR2D+ subset.
[Figure]
However, ZNF683 appeared to be enriched only within the KLRC2+ KIR2D+/- subset, supporting their identity as the putative HOBIT+ population and further suggesting KLRC2- KIR2D+ T cells are in fact a separate, distinct subset.
Taken together, these data confirm that ZNF683-expressing CD8+ CD45RA+ CD45ROdim/- CCR7- KLRC2+ KIR2D+/- T cells exist in peripheral blood, consistent with the HOBIT+ population, and that this subset, although it varies between individuals, makes up a sizable fraction of CD8 T cells (up to 10% in some individuals). (A new subset has been found.)

Case 2: CoNGA identifies GEX/TCR correlation in thymic T cells (an analysis of thymic T cells; the data cover individuals at different time points, totaling over 9400 clonotypes with paired alpha and beta TCR sequences).

CoNGA identified a large number of significant hits in this rich and complex dataset, primarily within the DP (double-positive), CD8 single positive (SP), CD4 SP, Treg, and CD8αα+ thymic populations.
[Figure]
In TCR sequence space, we see a concentration of hits in the TRAV41 cluster (this TRAV gene is enriched in DP cells), the TRAV1 and TRAV12 clusters (enriched in CD8 cells), and in the TRAV14 cluster (enriched in CD8αα cells). The CD8+ cluster pairs identified by CoNGA also show high CD8 sequence scores and high values of a score ("alphadist") that reflects the genomic distance between the TRAV and TRAJ gene segments incorporated into a clonotype's TCR α chain. The DP cluster pairs show lower alphadist scores, preferential use of TRAV41 and other TRAV genes at the 3' end of the locus, longer CDR3 loops (CDR3 length has been shown to decrease during thymic selection), and higher scores for the rim, surface, and disorder amino acid properties, which may indicate CDR3 regions that are more polar, less bulky, and more weakly interacting. However, CoNGA further identified high iMHC scores and longer CDR3 loops as TCR features of these clusters. Interestingly, the CD8αα(II) cluster pair expressed both ZNF683 and IKZF2, which together with TCR features similar to those of the HOBIT+ T cells in the blood identified above, suggests a possible precursor relationship between these two populations that warrants further investigation. (It really does deserve further study.)

Part two: CoNGA graph-vs-feature analysis confirms sharing of the HOBIT+/HELIOS+ T cell subset across donors (time to shift perspective).

Case 1

We have seen that CoNGA graph-vs-graph analysis can identify a variety of correlations between gene expression and TCR sequence, from the invariant MAIT and iNKT lineages, to the sequence motifs and expression biases of epitope-specific responses, to the weaker CDR3 sequence preferences and differentially expressed genes that characterize the HOBIT+ population (which would likely be difficult to identify from analysis of TCR sequence or gene expression alone).
To be detected, these correlations must be characterized by some degree of elevated global similarity in both transcriptional profile and TCR sequence within the relevant cell population (the correlations come in tiers). Thus, correlations that involve only a few genes or very specific TCR sequence features, or ones that are not well captured by our global GEX and TCR distance measures, may go undetected. (The analysis is getting more fine-grained.) This is where the graph-vs-feature strategy comes in.
2. CoNGA graph-vs-feature analysis was developed as a complementary graph based approach that could detect GEX/TCR correlations that are not characterized by global similarity of both properties. (This is the key point.)
3. In graph-vs-feature analysis, numerical features calculated on the basis of one cellular property, GEX or TCR sequence, are mapped onto a similarity graph defined by the other property, and the feature score distributions for each of the neighborhoods in the graph are compared to the background distributions to identify neighborhoods with skewed scores (here a graph neighborhood consists of a single central vertex together with all of its directly connected neighbors). (From global to local.)
[Figure]
4. As GEX features, we consider the expression levels of individual genes, and for TCR sequence features, we use a set of CDR3 amino acid property values as well as a handful of additional, sequence-based scores. (Why not say it in plain language like this from the start?)
5. We used graph-vs-feature analysis to identify additional members of the HOBIT+/HELIOS+ unconventional T cell subset by looking for GEX graph neighborhoods with elevated iMHC scores. Although the per-clonotype iMHC score is highly variable, by computing averages over GEX graph neighborhoods we can identify a subregion of GEX space with enhanced scores, whose significance can be assessed using standard statistical tests.
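Here is a rough sketch of the graph-vs-feature test in items 3 to 5: for every graph neighborhood (the central clonotype plus its neighbors), compare the feature values inside the neighborhood with the values of all remaining clonotypes. I use a small hand-written Mann-Whitney test with a normal approximation so the snippet has no dependencies. The paper only says "standard statistical tests", so the choice of test, the lack of a tie correction and of any multiple-testing adjustment, and the function names are all my assumptions.

```python
from math import erfc, sqrt

def mann_whitney_p(sample, rest):
    """Two-sided Mann-Whitney P value, normal approximation.

    Tied values receive midranks; the variance is not tie-corrected.
    """
    pooled = sorted([(v, 0) for v in sample] + [(v, 1) for v in rest])
    rank_sum = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        rank_sum += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(sample), len(rest)
    u = rank_sum - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    return erfc(abs(z) / sqrt(2.0))

def graph_vs_feature(neighbors, feature):
    """Raw P value per neighborhood: neighborhood scores vs all other clonotypes."""
    n = len(feature)
    pvalues = []
    for center, nbrs in enumerate(neighbors):
        hood = set(nbrs) | {center}
        inside = [feature[i] for i in hood]
        outside = [feature[i] for i in range(n) if i not in hood]
        pvalues.append(mann_whitney_p(inside, outside))
    return pvalues

# Toy data: clonotypes 0-4 have a high feature value (think of an iMHC-like
# score) and are each other's neighbors in the GEX graph; 5-19 are low.
toy_feature = [2.0 + 0.1 * i for i in range(5)] + [0.01 * i for i in range(15)]
toy_neighbors = [set(range(5)) - {i} for i in range(5)]
toy_neighbors += [{5 + (i - 5 + d) % 15 for d in range(1, 5)} for i in range(5, 20)]
toy_pvalues = graph_vs_feature(toy_neighbors, toy_feature)
```

In the toy data the neighborhoods centered on clonotypes 0 to 4 get small P values, which is the "subregion of GEX space with enhanced scores" pattern the text describes. A real analysis would also need to account for testing many neighborhoods and many features.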
[Figure]
  • Figure legend: Fig. 6.
Three of the four 10x_200k donors show populations of clonotypes with significantly enhanced iMHC scores (Fig. 6c-f) whose DEGs correlate well with one another and with the key marker genes (ZNF683, CD7, CD99, DUSP1/2) for the original HOBIT+ CoNGA clusters. The donor that is the exception is older, and the number of natural T cells declines with age. Comparing the iMHC score distribution of the HOBIT+ CoNGA clonotypes with the distributions for TCRs of known MHC restriction suggests a possible affinity with other MHC-dependent T cell subsets.
[Figure]
  • Figure legend: Single-chain iMHC score distributions for TCR subsets. Score distributions for CDR3α repertoires are shown on the left and for CDR3β repertoires on the right. Single-chain variants of the iMHC score were fit with L1-regularized logistic regression just as for the paired iMHC score.

Case 2: Graph-vs-feature analysis reveals differential gene expression across the TCR landscape

(Switching strategy) We applied graph-vs-feature analysis in the reverse direction to identify genes that are differentially expressed in specific TCR graph neighborhoods (the top significant gene for each cluster pair and a maximum of 10 genes per dataset).
The figure below illustrates four graph-vs-feature correlations, showing visually how specific TCR-based and GEX-based features correlate across the 2D clonotype landscapes.
[Figure]
Our TCR graph-based differential expression analysis identified several associations with the EPHB6 gene (and its murine homolog Ephb6), which codes for the Ephrin-B receptor Type 6 protein EPHB6. A recurring feature of these associations is the usage of the TRBV30 gene segment (TRBV31 in mouse). A focused search for covariation between TCR gene segment usage and gene expression using differential expression analysis confirmed a strong tendency for higher EPHB6 expression in clonotypes that incorporate the TRBV30 gene segment. (So the correlation is indeed strong.)
The TRBV30 segment is unique among TRBV genes in being located downstream of the TRBJ and TRBC genes at the end of the TCR beta locus; incorporation of TRBV30 into the TCR by V(D)J recombination requires an altered joining process in which intervening DNA sequence is inverted rather than being deleted. Providing a potential clue into the mechanism underlying this covariation, EPHB6 is located adjacent to TRBV30 on Chromosome 7, ~40kb downstream from the TCR beta locus.
[Figure]
The strong correlation between TRBV30 usage and EPHB6 expression may indicate that expression of a TRBV30-containing TCR transcript also boosts expression of the EPHB6 gene (the mouse TRBV31 gene segment is located at an analogous location to that of TRBV30 in the mouse TR locus, and is also directly adjacent to the mouse homolog Ephb6). (So the reason these genes are co-expressed may simply be that they sit close together on the chromosome.)

These findings suggest an interplay between the usage of TCR genes at the edge of the TCR locus and the expression of non-TCR genes flanking that locus (the most important point).

Case 3: Neighbor-graph analysis of TCR:pMHC binding highlights GEX similarity among T cells that recognize the same epitope

For each pMHC, the authors ask whether the set of clonotypes positive for that pMHC contains more GEX (or TCR) similarity edges than expected by chance, and quantify this with a fold-enrichment and an approximate P value.
[Figure]
From this analysis we can see, as expected, that nearly all the pMHC-positive clonotype subsets show greater than expected TCR sequence similarity. Indeed, the only pMHCs with a negative TCR neighbor-enrichment score are A03_KLG, which appears to show high levels of non-specific binding. pMHCs with large numbers of analyzed clonotypes show highly significant TCR similarity as assessed by the TCR-pMHC graph overlap. Likewise, pMHC-positive populations show greater than expected GEX similarity, with highly significant P-values and large fold-enrichments for most pMHCs with a sufficient number of analyzed clones. These results indicate that clonotypes positive for the same pMHC have gene expression profiles that are more similar than expected by chance.
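One simple way to picture the fold-enrichment is to count how many neighbor edges start and end inside the pMHC-positive set and divide by the number expected if neighbors were picked at random. The sketch below does exactly that. The null model (each neighbor is positive with probability (n_pos - 1) / (n - 1)) is my own simplification, the function name is made up, and the paper's approximate P value is not reproduced here.

```python
def edge_enrichment(neighbors, positive):
    """Observed vs expected neighbor edges within a pMHC-positive set.

    neighbors[i] is the set of graph neighbors of clonotype i (excluding i).
    positive is an iterable of clonotype indices positive for one pMHC.
    Returns (observed, expected, fold_enrichment).
    """
    positive = set(positive)
    n = len(neighbors)
    # Edges that originate at a positive clonotype and end at a positive one.
    observed = sum(len(neighbors[i] & positive) for i in positive)
    # Assumed null: every neighbor slot of a positive clonotype is filled at
    # random from the other n - 1 clonotypes.
    total_slots = sum(len(neighbors[i]) for i in positive)
    expected = total_slots * (len(positive) - 1) / (n - 1)
    return observed, expected, observed / expected

# Toy graph: clonotypes 0-2 are positive and are each other's neighbors;
# clonotypes 3-9 each point at the next two clonotypes within 3-9.
toy_graph = [{1, 2}, {0, 2}, {0, 1}]
toy_graph += [{3 + (i - 3 + d) % 7 for d in (1, 2)} for i in range(3, 10)]
obs, exp, fold = edge_enrichment(toy_graph, [0, 1, 2])
```

A fold-enrichment above 1 means the positive clonotypes are more connected to each other than chance would suggest; below 1 corresponds to the "negative" enrichment mentioned for A03_KLG.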
[Figure]

Finally, the authors discuss some limitations of the software.

1. a consequence of operating at the level of clonotypes rather than individual cells is that we miss out on variation within the cells of expanded clones.
2. Although we found that gene expression was largely consistent within clonally related cells, it may be worth exploring approaches in which cellular resolution is preserved, for example by defining graphs at the level of individual cells and masking out intra-clonotype neighbor edges to eliminate the strong signal of clonal GEX/TCR correlation. (This is indeed a problem.)
3. results of applying CoNGA will depend critically on the distance measures used to define clonotype similarity and construct the neighbor graphs. (There are many methods to choose from.)
4. Another limitation is that, in our experience, successful application of CoNGA requires a relatively large number of unique clones (at least a few hundred, so single-cell data are a good fit), which depending on the degree of clonal expansion may require a substantially larger number of individual cells.
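The cell-level idea floated in limitation 2 can be illustrated in a few lines: build the neighbor graph over individual cells, then drop every edge that connects two cells of the same clonotype. This is only my sketch of what the authors say "may be worth exploring", with hypothetical names and toy data, not something CoNGA is stated to implement.

```python
def mask_intra_clonotype_edges(cell_neighbors, cell_clonotype):
    """Remove neighbor edges between cells that share a clonotype.

    cell_neighbors[i] is the set of neighbor cell indices of cell i.
    cell_clonotype[i] is the clonotype label of cell i.
    """
    return [
        {j for j in nbrs if cell_clonotype[j] != cell_clonotype[i]}
        for i, nbrs in enumerate(cell_neighbors)
    ]

# Toy data: cells 0-2 belong to one expanded clone, cells 3 and 4 are singletons.
toy_clonotype = ["c1", "c1", "c1", "c2", "c3"]
toy_cell_nbrs = [{1, 2, 3}, {0, 2, 4}, {0, 1, 3}, {0, 4}, {0, 3}]
toy_masked = mask_intra_clonotype_edges(toy_cell_nbrs, toy_clonotype)
```

After masking, cells of the expanded clone keep only their links to other clones, so the trivial "same clone, same expression" signal no longer dominates the overlap counts.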

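We are optimistic that new analytical approaches combined with novel high-throughput single-cell experiments will continue to illuminate new aspects of adaptive immunology in the coming years. Of course, the method is new, which means a lot remains to be filled in.

There is a lot to this topic. Life is good, and better with you. In the next post we will go through the principles behind the software's two analysis modes.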
