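Today we reach part five of our series on joint single-cell and spatial analysis. Some readers may be wondering why we share and study so many methods; wouldn't one be enough? If that is your question, it suggests you still need a broader view of the field. With that said, let's start today's topic: a method for joint single-cell and spatial analysis, spatialDWLS.
The paper is "SpatialDWLS: accurate deconvolution of spatial transcriptomic data". It is currently a preprint, written by Chinese authors, and its method is again deconvolution, so it is worth studying side by side with SPOTlight, which we covered earlier. Here we focus on the key points.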
Background
1. Why can't deconvolution methods designed for bulk-seq data be applied directly to spatial transcriptomics data?
(1) The number of cells within each spot is typically small. For example, each spot in the 10X Genomics Visium platform has a diameter of 55 μm, corresponding to a spatial resolution of 5-10 cells. The application of a bulk RNAseq deconvolution method to such a small sample size would result in noise from unrelated cell types. (The key word is noise.)
(2) As spatial expression datasets usually contain thousands of spots, it would be time and memory consuming if deconvolution methods designed for bulk RNA-seq were applied on spatial expression datasets. (This second reason is secondary; the first is the main one.)
2. How spatialDWLS works
(1) It identifies cell types that are likely to be present at each location by using a recently developed cell-type enrichment analysis method. (Note that an enrichment method is used here; we discuss it in the algorithm part.)
(2) The cell type composition at each location is inferred by extending the dampened weighted least squares (DWLS) method, which was originally developed for deconvolving bulk RNAseq data. (For now, just keep this simple two-step process in mind.)
3. Evaluation of the spatialDWLS method
The Root Mean Square Error (RMSE) associated with oligodendrocytes is only 0.03, with the predicted values approximately centered around the ground truth.
The metric here is the Root Mean Square Error (RMSE); if it is unfamiliar, look up root mean square error.
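As a quick reminder of what RMSE measures, here is a minimal Python sketch. It is not taken from the paper; the function name `rmse` and the toy numbers are assumptions made only for illustration. It compares predicted cell-type proportions against known ground-truth proportions.

```python
import math

def rmse(predicted, truth):
    """Root mean square error between two equal-length sequences of numbers."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predicted, truth)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Toy example (made-up numbers): predicted vs. true proportion of one
# cell type across four spots.
predicted = [0.10, 0.30, 0.25, 0.00]
truth = [0.20, 0.20, 0.25, 0.00]
print(rmse(predicted, truth))
```

A lower value means the predicted proportions sit closer to the ground truth, so an RMSE of 0.03 on proportions that range from 0 to 1 indicates a small average deviation.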
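As can be seen, the paper benchmarks the method against SPOTlight, which we introduced earlier.
A side note: every paper will claim that its own method is the best, so we need to judge such claims critically.
4. Application and validation; we list just one of the examples here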
During embryonic development, the spatial-temporal distribution of cell types changes dramatically. Therefore, it is of interest to test whether spatialDWLS could aid the discovery of such dynamic changes. Recently, Asp and colleagues studied the development of human heart in early embryos (4.5–5, 6.5, and 9 post-conception weeks) by using the Spatial Transcriptomics (ST) technology. Since the data does not have single-cell resolution, they were not able to identify cell-type distribution directly from the ST data. In order to apply spatialDWLS, we utilized the single-cell RNAseq derived gene signatures from this study as reference. All the cell types were mapped to expected locations.
In order to quantitatively compare the change of spatial-temporal organization of cell type composition during embryonic heart development, we first examined the overall abundance of different cell types.
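Some cell types became more abundant and some less (as seen from the joint analysis results). In short, the results look good and readers are encouraged to try the method (this is the authors' view).
Now we need to look closely at the methods section of the paper.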
Cell type selection of spatial expression data by enrichment analysis
We use an enrichment based weighted least squares approach for deconvolution of spatial expression datasets.
(1) Enrichment analysis using the Parametric Analysis of Gene Set Enrichment (PAGE) method (ref. 22) is applied on the spatial expression dataset as previously reported. (PAGE is a parametric variant of GSEA.) The marker genes can be identified via differential expression gene analysis of Giotto based on the single cell RNA-seq data provided by users. (Markers derived from the single-cell data; this feels a bit crude.) Alternatively, users can also provide marker gene expression for each cell type for deconvolution. (Or you supply the markers yourself, which is even more of a stretch.)
Let m be the number of marker genes for a cell type. For each gene, we calculate the fold change as the ratio of its expression value at each spot to its mean expression across all spots. The mean and standard deviation of the fold change values are defined as μ and δ, respectively. In addition, we calculate the mean fold change of the m marker genes, which is defined as Sm. The enrichment score (ES) is defined as the PAGE z-score: ES = (Sm − μ) × √m / δ.
Then, we binarize the enrichment matrix with the cutoff value of ES = 2 to select cell types that are likely to be present at each point.
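The enrichment step can be sketched in a few lines of Python with NumPy. This is an illustration under stated assumptions, not the Giotto implementation: it assumes the standard PAGE z-score, ES = (Sm − μ) × √m / δ, that μ and δ are taken over all genes within each spot, that the fold change is a plain ratio as described above, and that every gene has a non-zero mean. The function name `page_enrichment` and the toy matrix are hypothetical.

```python
import numpy as np

def page_enrichment(expr, marker_idx):
    """PAGE-style enrichment score per spot.

    expr: genes x spots matrix of non-negative expression values
          (assumption: every gene has a non-zero mean across spots).
    marker_idx: row indices of the m marker genes of one cell type.
    """
    gene_mean = expr.mean(axis=1, keepdims=True)
    fold = expr / gene_mean                  # fold change per gene and spot
    mu = fold.mean(axis=0)                   # mean fold change in each spot
    delta = fold.std(axis=0)                 # standard deviation in each spot
    s_m = fold[marker_idx].mean(axis=0)      # mean fold change of the markers
    m = len(marker_idx)
    return (s_m - mu) * np.sqrt(m) / delta

# Toy data (made up): 20 genes x 4 spots; genes 0-3 are markers of one
# cell type and are strongly expressed only in spot 0.
expr = np.ones((20, 4))
expr[0:4, 0] = 9.0
es = page_enrichment(expr, marker_idx=[0, 1, 2, 3])
present = es > 2.0                           # binarize with the cutoff ES = 2
print(es)        # ES is 4 for spot 0 and -4 for the other three spots
print(present)   # only spot 0 is flagged as containing the cell type
```

Running this once per cell type yields a cell type by spot matrix of booleans, which decides which cell types enter the deconvolution at each spot.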
Frankly, this enrichment method seems rather shaky.
Estimating cell type composition by using a weighted least squares approach
In previous work, we developed dampened weighted least squares (DWLS) for deconvolution of bulk RNAseq data. (Readers can look this method up.) This method is extended here to deconvolve spatial transcriptomic data using the signature gene identification step described above. In short, DWLS uses a weighted least squares approach to infer cell-type composition, where the weight is selected to minimize the overall relative error rate. In addition, a damping constant d is used to enhance numerical stability, whose value is determined by using a cross-validation procedure. Here, we use the same sets of weights and damping constant across spots within the same clusters to reduce technical variation. Finally, since the number of cells present at each spot is generally small, we perform another round of deconvolution after removing those cell types that are predicted to be present at a low frequency, by imposing an additional threshold (min frequency = 0.02 by default). (This part does get into the algorithm; readers who are interested can dig deeper.)
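To make the idea concrete, here is a simplified Python sketch of weighted least squares with damping, followed by the low-frequency filter and a second round of fitting. It is an assumption-laden illustration, not the authors' DWLS code: the real method solves a constrained quadratic program and picks the damping constant by cross-validation, whereas this sketch clips negative coefficients to zero and uses a fixed, hypothetical `damping` value. All function names are made up for the example.

```python
import numpy as np

def weighted_ls(S, y, w):
    """Minimize sum_i w_i * (y_i - (S x)_i)^2, then clip negatives to zero."""
    sw = np.sqrt(w)
    x, *_ = np.linalg.lstsq(S * sw[:, None], y * sw, rcond=None)
    return np.clip(x, 0.0, None)

def dampened_wls(S, y, damping=4.0, n_iter=20, eps=1e-8):
    """Iteratively reweighted fit with relative-error weights.

    damping caps the ratio between the largest and smallest weight
    (fixed here as an assumption; DWLS chooses it by cross-validation).
    """
    x = weighted_ls(S, y, np.ones(len(y)))      # start from ordinary least squares
    for _ in range(n_iter):
        pred = np.maximum(S @ x, eps)           # predicted expression per gene
        w = 1.0 / pred ** 2                     # penalize relative, not absolute, error
        w = np.minimum(w / w.min(), damping)    # damping for numerical stability
        x = weighted_ls(S, y, w)
    return x

def normalize(x):
    """Scale coefficients to proportions; return zeros if nothing was fitted."""
    total = x.sum()
    return x / total if total > 0 else np.zeros_like(x)

def deconvolve_spot(S, y, min_frequency=0.02):
    """S: genes x cell types signature matrix; y: expression of one spot."""
    frac = normalize(dampened_wls(S, y))
    keep = frac >= min_frequency                # drop cell types predicted to be rare
    out = np.zeros(S.shape[1])
    if keep.any():
        out[keep] = normalize(dampened_wls(S[:, keep], y))  # second round
    return out

# Toy signature matrix (made up): 6 genes x 3 cell types.
S = np.array([[10, 1, 1], [8, 2, 1], [1, 9, 2],
              [2, 12, 1], [1, 1, 7], [1, 2, 11]], dtype=float)
spot = S @ np.array([0.7, 0.3, 0.0])            # noise-free mixture of two types
print(deconvolve_spot(S, spot))
```

In spatialDWLS the columns of the signature matrix would additionally be restricted, for each spot, to the cell types that passed the enrichment cutoff, which keeps unrelated cell types from absorbing noise.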
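To finish, here is a figure showing the results.
The method is available as spatialDWLS. The code itself is simple, and the only function you need to pay attention to is runDWLSDeconv; the algorithm is the real essence.
Life is good, and even better with you!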