10X single-cell and spatial joint analysis, part 6 (joint analysis based on the number of cells in each spot): Tangram

Today we share a new method for 10X single-cell and spatial joint analysis: Tangram. Pay attention to what sets this approach apart: the number of nuclei in each spot is inferred from the stained tissue image, which gives the number of cells per spot, and that count is then used as the starting point for deconvolving the 10X spatial data.

Let's start with the paper.

Squidpy allows analysis of images in spatial omics analysis workflows

First, some background.

1. What is the Image Container?

The Image Container is an object for microscopy tissue images associated with spatial molecular datasets; in other words, it handles images and data together. The object is a thin wrapper around an xarray.Dataset and provides efficient access to in-memory and on-disk images. On-disk files are loaded lazily using dask through rasterio, meaning content is only read into memory when requested. The object can be saved as a zarr store. This allows handling very large files that do not fit in memory. Put simply, it is an image handler.
Image Container is initialised with an in-memory array or a path to an image file on disk. Images are saved with the key layer. If lazy loading is desired, the chunks parameter needs to be specified.

sq.im.ImageContainer(PATH, layer=<str>, chunks=<int>)
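The lazy-loading idea can be tried out without Squidpy. The sketch below is not how the Image Container is implemented (that relies on dask and xarray); it uses NumPy memory mapping as a stand-in, and the file name and image size are made up for illustration.

```python
import os
import tempfile

import numpy as np

# Write a fake 3-channel "tissue image" to disk (size is arbitrary).
rng = np.random.default_rng(0)
full_image = rng.integers(0, 256, size=(512, 512, 3), dtype=np.uint8)
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "tissue.npy")  # hypothetical file name
np.save(path, full_image)

# Open lazily: mmap_mode="r" maps the file instead of reading all of it.
lazy_image = np.load(path, mmap_mode="r")

# Only the requested window is pulled into memory.
crop = np.asarray(lazy_image[100:164, 200:264, :])
print(crop.shape)
```

The point is the same as with the chunks parameter: the full array stays on disk and only the window you index is materialised.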

More image layers with the same spatial dimensions x and y, such as segmentation masks, can be added to an existing Image Container.

img.add_img(PATH, layer_added=<str>)

The Image Container is able to interface with AnnData objects (this should be familiar: it is the object scanpy produces when processing single-cell data), in order to relate any pixel-level information to the observations stored in AnnData. For instance, it is possible to create a generator that yields image crops on the fly, corresponding to the locations of the spots in the image (in other words, the image information tied to the AnnData object can be read directly):

spot_generator = img.generate_spot_crops(adata)
lambda x: (x for x in spot_generator)  # yields crops at spot locations
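To make the "crops on the fly" idea concrete, here is a minimal NumPy sketch of such a generator. It is not Squidpy's implementation: the function name `iter_spot_crops`, the square crop shape and the pixel coordinates are all assumptions made for illustration.

```python
import numpy as np


def iter_spot_crops(image, centers, radius):
    """Yield one square crop per spot, clipped at the image border."""
    height, width = image.shape[:2]
    for y, x in centers:
        y0, y1 = max(y - radius, 0), min(y + radius + 1, height)
        x0, x1 = max(x - radius, 0), min(x + radius + 1, width)
        yield image[y0:y1, x0:x1]  # a view, nothing is copied until it is used


image = np.arange(100 * 120).reshape(100, 120)
spot_centers = [(50, 60), (10, 10), (2, 118)]  # (row, col) in pixels, made up
crops = list(iter_spot_crops(image, spot_centers, radius=5))
crop_shapes = [c.shape for c in crops]
print(crop_shapes)
```

Because it is a generator, each crop is produced only when the loop asks for it, so a whole slide never has to be cut up in advance.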

This works for features computed at crop level as well as at segmentation-object level. For instance, it is possible to get the centroid coordinates, as well as several other features, of the segmentation objects that overlap with the spot capture area. (It is enough to be aware of this.)

Part two: the image processing steps

(1)Image processing
Before extracting features from microscopy images, the images can be pre-processed. Squidpy implements commonly used preprocessing functions, such as conversion to gray-scale or smoothing with a Gaussian kernel.

sq.im.process(img, method="gray")  # img here is our original image

Implementations are based on the Scikit-image package and allow processing of very large images by tiling the image into smaller crops and processing these. This is the image preprocessing step; watch out for data format issues when you use it.
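The tiling idea can be sketched in a few lines of NumPy. This is only an illustration under simple assumptions: the helper names `to_gray` and `process_in_tiles` and the tile size are made up, and the gray-scale weights are the luminance weights used by Scikit-image's `rgb2gray`.

```python
import numpy as np

# Luminance weights used by skimage.color.rgb2gray.
GRAY_WEIGHTS = np.array([0.2125, 0.7154, 0.0721])


def to_gray(rgb):
    """Convert an (H, W, 3) float image to (H, W) gray-scale."""
    return rgb @ GRAY_WEIGHTS


def process_in_tiles(image, func, tile=64):
    """Apply a per-pixel function tile by tile and stitch the result."""
    height, width = image.shape[:2]
    out = np.empty((height, width), dtype=float)
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            out[y:y + tile, x:x + tile] = func(image[y:y + tile, x:x + tile])
    return out


rng = np.random.default_rng(0)
rgb = rng.random((150, 200, 3))
gray_tiled = process_in_tiles(rgb, to_gray, tile=64)
gray_whole = to_gray(rgb)
print(np.allclose(gray_tiled, gray_whole))
```

Stitching tiles like this is exact only for per-pixel operations such as gray-scale conversion; a filter like Gaussian smoothing looks at neighbouring pixels, so tiles would need to overlap to avoid seams.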

(2) Image segmentation (you can think of this as a finer-grained treatment of the image)
Nuclei segmentation is an important step when analysing microscopy images. (This is the key point: counting the nuclei in each spot, which depends on the staining.) It allows the quantitative analysis of the number of nuclei, their areas, and morphological features, that is, quantifying the number of cells per spot and obtaining area and morphology features. There is a wide range of approaches for nuclei segmentation, from established techniques like thresholding to modern deep learning-based approaches. (There are many such methods, and I still have plenty to learn here.)
A difficulty for nuclei segmentation is to distinguish between partially overlapping nuclei. (How to separate overlapping nuclei is an important problem, especially in tumour regions, where cells are small and densely packed.) Watershed is a classic algorithm used to separate overlapping objects by treating pixel values as local topography. For this, starting from points of lowest intensity, the image is flooded until basins from different starting points meet at the watershed ridge lines. (I do not know much about image processing myself.)

sq.im.segment(img, method="watershed")

This is quite similar to the image processing in stLearn.
Implementations in Squidpy are based on the original Scikit-image Python implementation. (The image processing is done by the Python package Scikit-image, which is worth studying in depth when you have time.)
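The flooding idea behind watershed can be sketched with a priority queue. This is a simplified, marker-based variant written for illustration, not the Scikit-image or Squidpy code: the seeds are given by hand, no ridge lines are drawn, and the two Gaussian blobs are synthetic stand-ins for overlapping nuclei.

```python
import heapq

import numpy as np


def marker_watershed(elevation, markers):
    """Flood `elevation` from labelled seeds; the lowest pixels are claimed first."""
    height, width = elevation.shape
    labels = markers.copy()
    heap = []
    order = 0  # tie-breaker so equal elevations pop in insertion order
    for r, c in zip(*np.nonzero(markers)):
        heapq.heappush(heap, (float(elevation[r, c]), order, int(r), int(c)))
        order += 1
    while heap:
        _, _, r, c = heapq.heappop(heap)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < height and 0 <= cc < width and labels[rr, cc] == 0:
                labels[rr, cc] = labels[r, c]  # neighbour joins this basin
                heapq.heappush(heap, (float(elevation[rr, cc]), order, rr, cc))
                order += 1
    return labels


# Two overlapping synthetic "nuclei": Gaussian blobs centred at columns 9 and 20.
yy, xx = np.mgrid[0:21, 0:30]
intensity = (np.exp(-((yy - 10) ** 2 + (xx - 9) ** 2) / 18.0)
             + np.exp(-((yy - 10) ** 2 + (xx - 20) ** 2) / 18.0))
markers = np.zeros(intensity.shape, dtype=int)
markers[10, 9], markers[10, 20] = 1, 2  # one hand-placed seed per nucleus

# Nuclei are bright, so negate the image to turn them into basins.
labels = marker_watershed(-intensity, markers)
print(np.unique(labels))
```

Each seed grows outwards through its own blob first, and the two regions only meet in the dim valley between the nuclei, which is why touching nuclei end up with different labels.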
(3) Custom approaches with deep learning (more advanced segmentation)
Depending on the quality of the data, simple segmentation approaches like watershed might not be appropriate. Nowadays, many complex segmentation algorithms are provided as pre-trained deep learning models, such as Stardist, Splinedist and Cellpose. These models can be easily used within the segmentation function. (Note that the data being segmented here is the image, not our sequenced transcriptome data.)

sq.im.segment(img, method=<pre-trained model>)

(4) Image features
Tissue organisation in microscopic images can be analysed with different image features. This filters relevant information from the (high-dimensional) images, allowing for easy interpretation and comparison with other features obtained at the same spatial location (that is, features from the same spatial region can be compared). Image features are calculated from the tissue image at each location (x, y) where transcriptomics information is available, resulting in an obs x features matrix similar to the obs x gene matrix (much like a single-cell expression matrix). This image feature matrix can then be used in any single-cell analysis workflow, just like the gene matrix. (So this part mainly serves the downstream analysis alongside the sequencing data.)
The scale and size of the image used to calculate features can be adjusted using the scale and spot_scale parameters. Feature extraction can be parallelized by providing n_jobs. The calculated feature matrix is stored in adata.obsm[key].

sq.im.calculate_image_features(adata, img, features=<list>, spot_scale=<float>,
                               scale=<float>, key_added=<str>)

Note that this is where the image and the expression data start to be analysed together.
Summary features calculate the mean, the standard deviation or specific quantiles for a color channel. Similarly, histogram features scan the histogram of a color channel to calculate quantiles according to a defined number of bins. (This is what some of the parameters do.)

sq.im.calculate_image_features(adata, img, features="summary")
sq.im.calculate_image_features(adata, img, features="histogram")
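To see what an obs x features matrix of this kind looks like, here is a small NumPy sketch. It is an assumption-laden illustration rather than Squidpy's code: the function names, the feature names, the chosen quantiles and the use of plain bin counts for the histogram features are all made up, and the crops are random arrays standing in for one image crop per spot.

```python
import numpy as np


def summary_features(crop, quantiles=(0.1, 0.5, 0.9)):
    """Mean, standard deviation and quantiles for every colour channel."""
    feats = {}
    for ch in range(crop.shape[-1]):
        values = crop[..., ch].ravel()
        feats[f"summary_ch-{ch}_mean"] = values.mean()
        feats[f"summary_ch-{ch}_std"] = values.std()
        for q in quantiles:
            feats[f"summary_ch-{ch}_quantile-{q}"] = np.quantile(values, q)
    return feats


def histogram_features(crop, bins=4, value_range=(0.0, 1.0)):
    """Pixel counts per intensity bin for every colour channel."""
    feats = {}
    for ch in range(crop.shape[-1]):
        counts, _ = np.histogram(crop[..., ch], bins=bins, range=value_range)
        for b, n in enumerate(counts):
            feats[f"histogram_ch-{ch}_bin-{b}"] = int(n)
    return feats


rng = np.random.default_rng(0)
crops = [rng.random((16, 16, 3)) for _ in range(5)]  # one fake crop per spot
rows = [{**summary_features(c), **histogram_features(c)} for c in crops]
feature_names = list(rows[0])
feature_matrix = np.array([[row[name] for name in feature_names] for row in rows])
print(feature_matrix.shape)
```

Each row is a spot and each column a feature, so the matrix can sit next to the gene expression matrix and go through the same clustering or embedding steps.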

The paper goes on to describe some further data-processing methods, but they are not our focus here; a quick look is enough.

2. Now let's look at the main text of the paper.

Squidpy implements a pipeline based on Scikit-image for preprocessing and segmenting images, extracting morphological, texture, and deep learning-powered features.

Do not underestimate this part. The software can handle both fluorescence and H&E stained images; preprocessing and segmentation both operate on the image, and features are finally extracted in combination with the sequencing data. My own understanding here is not yet very deep and needs more work.
To enable efficient processing of very large images, this pipeline utilises lazy loading, image tiling and multi-processing (the processing strategy mentioned earlier).

Features can be extracted from a raw tissue image crop, or Squidpy's nuclei-segmentation module can be used to extract nuclei counts and nuclei sizes (this is the nuclei-count analysis).

For instance, we can leverage segmented nuclei to inform cell-type deconvolution methods such as Tangram (today's focus) or Cell2Location (which I covered in an earlier post on 10X single-cell and spatial joint analysis with cell2location; compare the two).

Now for the most important part.

Cell-type deconvolution using Tangram

Mapping single-cell atlases to spatial transcriptomics data is a crucial analysis step to integrate cell-type annotation across technologies. Information on the number of nuclei under each spot can help cell-type deconvolution methods. (That is, the nuclei count of each spot is used in the joint analysis of 10X single-cell and spatial data.)
Tangram ([Biancalani et al., 2020], code) is a cell-type deconvolution method that enables mapping of cell-types to single nuclei under each spot. We will show how to leverage the image container segmentation capabilities, together with Tangram, to map cell types of the mouse cortex from scRNA-seq data to Visium data.
We will not reproduce all of the code here; adapt it to your own needs.
First load the modules; everything mentioned above is included.

import scanpy as sc
import squidpy as sq
import numpy as np
import pandas as pd
from anndata import AnnData
import pathlib
import matplotlib.pyplot as plt
import matplotlib as mpl
import skimage
# import tangram for spatial deconvolution
import tangram as tg

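Here we follow the example data; the main thing to note is what each dataset contains.
First, the spatial transcriptomics data: the full processed results of the 10X spatial transcriptomics analysis. Note that these are results from the Python workflow.
Second, the image data. Pay attention to the image: to analyse your own data you need to load your own high-resolution image.
Finally, the single-cell data, where the most important content is the cell-type annotation.

Nuclei segmentation and segmentation features (counting the cells in each spot)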

sq.im.process(img=img, layer="image", method="smooth")
sq.im.segment(
    img=img,
    layer="image_smooth",
    method="watershed",
    channel=0,
)

Visualization

inset_y = 1500
inset_x = 1700
inset_sy = 400
inset_sx = 500

fig, axs = plt.subplots(1, 3, figsize=(30, 10))
sc.pl.spatial(
    adata_st, color="cluster", alpha=0.7, frameon=False, show=False, ax=axs[0], title=""
)
axs[0].set_title("Clusters", fontdict={"fontsize": 20})
sf = adata_st.uns["spatial"]["V1_Adult_Mouse_Brain_Coronal_Section_2"]["scalefactors"][
    "tissue_hires_scalef"
]
rect = mpl.patches.Rectangle(
    # Rectangle expects (x, y), so the column offset comes first
    (inset_x * sf, inset_y * sf),
    width=inset_sx * sf,
    height=inset_sy * sf,
    ec="yellow",
    lw=4,
    fill=False,
)
axs[0].add_patch(rect)

axs[0].axes.xaxis.label.set_visible(False)
axs[0].axes.yaxis.label.set_visible(False)

axs[1].imshow(
    img["image"][inset_y : inset_y + inset_sy, inset_x : inset_x + inset_sx, 0] / 65536,
    interpolation="none",
)
axs[1].grid(False)
axs[1].set_xticks([])
axs[1].set_yticks([])
axs[1].set_title("DAPI", fontdict={"fontsize": 20})

crop = img["segmented_watershed"][
    inset_y : inset_y + inset_sy, inset_x : inset_x + inset_sx
].values
crop = skimage.segmentation.relabel_sequential(crop)[0]
# copy the colormap so the shared plasma colormap is not modified in place
cmap = plt.cm.plasma.copy()
cmap.set_under(color="black")
axs[2].imshow(crop, interpolation="none", cmap=cmap, vmin=0.001)
axs[2].grid(False)
axs[2].set_xticks([])
axs[2].set_yticks([])
axs[2].set_title("Nucleus segmentation", fontdict={"fontsize": 20})

I wonder how comfortable everyone is with plotting in Python.

We then need to extract some image features useful for the deconvolution task downstream. Specifically, we will need:
- the number of unique segmentation objects (i.e. nuclei) under each spot;
- the coordinates of the centroids of the segmentation objects.
This is the step that counts the cells in each spot.

# define image layer to use for segmentation
features_kwargs = {
    "segmentation": {
        "label_layer": "segmented_watershed",
        "props": ["label", "centroid"],
        "channels": [1, 2],
    }
}
# calculate segmentation features
sq.im.calculate_image_features(
    adata_st,
    img,
    layer="image",
    key_added="image_features",
    features_kwargs=features_kwargs,
    features="segmentation",
    mask_circle=True,
)

adata_st.obs["cell_count"] = adata_st.obsm["image_features"]["segmentation_label"]
sc.pl.spatial(adata_st, color=["cluster", "cell_count"], frameon=False)
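What the segmentation feature with a circular mask boils down to can be shown on a toy label image. The sketch below is not Squidpy's implementation; the helper name `count_labels_in_spot`, the spot radius and the three hand-drawn "nuclei" are assumptions for illustration.

```python
import numpy as np


def count_labels_in_spot(label_image, center, radius):
    """Count distinct segmentation labels inside a circular spot."""
    cy, cx = center
    height, width = label_image.shape
    yy, xx = np.ogrid[:height, :width]
    inside = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    found = np.unique(label_image[inside])
    return int(np.count_nonzero(found))  # label 0 is background


label_image = np.zeros((40, 40), dtype=int)
label_image[5:9, 5:9] = 1      # nucleus 1
label_image[10:14, 12:16] = 2  # nucleus 2
label_image[30:34, 30:34] = 3  # nucleus 3, far from the first spot
spot_centers = [(10, 10), (31, 31), (20, 35)]  # (row, col), made up
cell_counts = [count_labels_in_spot(label_image, c, radius=6) for c in spot_centers]
print(cell_counts)
```

A nucleus counts for a spot as soon as any of its pixels falls inside the circle, so a nucleus on the border between two spots can be counted for both.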

This gives the number of cells in each spot, which feeds the refined deconvolution below.
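One plausible way to turn these counts into inputs for the mapping step is to normalise them into a per-spot density and to keep their total as the expected number of cells. The sketch below assumes that reading; the variable names `density_prior`, `uniform_prior` and `target_count` and the toy counts are made up for illustration.

```python
import numpy as np

# Toy nuclei counts for four spots (in practice: adata_st.obs["cell_count"]).
cell_count = np.array([3, 0, 5, 2])

# Fraction of all segmented cells that sits under each spot.
density_prior = cell_count / cell_count.sum()

# What one would fall back to without image information: every spot equal.
uniform_prior = np.full(cell_count.size, 1 / cell_count.size)

# Total number of cells the mapping is expected to place.
target_count = int(cell_count.sum())

print(density_prior, target_count)
```

Compared with the uniform fallback, the image-derived density tells the mapping that an empty spot should receive no cells and a crowded spot several.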

Deconvolution and mapping

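At this stage, we have all we need for the deconvolution task. First, we need to find a set of genes shared by the single-cell and spatial datasets (the genes used for the joint analysis). The tutorial text speaks of intersecting highly variable genes; the code below actually ranks marker genes per cell subclass, keeps the top 100 of each, and intersects them with the genes present in the spatial data.
Adapt this step to your own needs.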

sc.tl.rank_genes_groups(adata_sc, groupby="cell_subclass")
markers_df = pd.DataFrame(adata_sc.uns["rank_genes_groups"]["names"]).iloc[0:100, :]
genes_sc = np.unique(markers_df.melt().value.values)
genes_st = adata_st.var_names.values
genes = list(set(genes_sc).intersection(set(genes_st)))
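One small caveat with the last line: the order of a list built from a set is not guaranteed, so the gene order can differ between runs. A minimal sketch of a reproducible variant, using made-up gene lists:

```python
# Toy gene lists standing in for genes_sc and genes_st.
genes_sc = ["Gad1", "Slc17a7", "Pvalb", "Sst", "Vip"]
genes_st = ["Actb", "Vip", "Gad1", "Slc17a7", "Mbp"]

shared = set(genes_sc) & set(genes_st)
genes = sorted(shared)                     # fixed order, same on every run
sc_only = sorted(set(genes_sc) - shared)   # markers missing from the spatial data

print(genes)
print(sc_only)
```

Checking `sc_only` is a cheap sanity test: if most marker genes are missing from the spatial data, the gene naming conventions of the two datasets probably differ.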

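Run the deconvolution

# S, G, d, device and hyperparm are prepared in the full tutorial and are not shown in this post.
# In Tangram, S is the single-cell expression matrix (cells x genes), G the spatial
# expression matrix (spots x genes), and d the expected cell density per spot.
mapper = tg.mapping_optimizer.MapperConstrained(
    S=S,
    G=G,
    d=d,
    device=device,
    **hyperparm,
    # total number of segmented nuclei across all spots
    target_count=adata_st.obs.cell_count.sum(),
)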
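Roughly speaking, the mapper learns a cells x spots matrix that says where each cell probably sits, and scores it by how well the mapped single-cell expression reproduces the measured spatial expression. The NumPy sketch below only illustrates that scoring idea on toy data; it is not Tangram's code, it leaves out the density and cell-count terms and the gradient-based optimisation, and all names and sizes are made up.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along one axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def gene_cosine_similarity(G_pred, G):
    """Cosine similarity between predicted and measured spatial profile, per gene."""
    num = (G_pred * G).sum(axis=0)
    den = np.linalg.norm(G_pred, axis=0) * np.linalg.norm(G, axis=0)
    return num / den


rng = np.random.default_rng(0)
n_cells, n_spots, n_genes = 6, 3, 4
S = rng.random((n_cells, n_genes))        # toy single-cell expression (cells x genes)
true_spot = np.array([0, 0, 1, 1, 2, 2])  # hidden ground truth: spot of each cell
G = np.zeros((n_spots, n_genes))
np.add.at(G, true_spot, S)                # each spot sums the cells it contains

# A confident, correct mapping: large logit on the true spot of every cell.
logits = np.full((n_cells, n_spots), -20.0)
logits[np.arange(n_cells), true_spot] = 20.0
M = softmax(logits, axis=1)               # each row is a distribution over spots
score_good = gene_cosine_similarity(M.T @ S, G).mean()

# An uninformative mapping: every cell spread evenly over all spots.
M_uniform = softmax(np.zeros((n_cells, n_spots)), axis=1)
score_uniform = gene_cosine_similarity(M_uniform.T @ S, G).mean()

print(score_good, score_uniform)
```

The correct mapping reproduces the spatial matrix and scores close to 1, while the even spread scores lower; an optimiser that pushes this score up is therefore pushed towards placing cells in the right spots.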

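Let's look at the results.

[Figure: result of the deconvolution analysis]

I hope you like this joint analysis method.
Life is good, and better with you.

The full analysis is available on the squidpy website.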
