10X單細胞空間聯(lián)合分析之再次解讀cell2location

到了年根了胡本，其實想總結(jié)一些之前的東西，其中cell2location很想再次分享一下兢哭，好好研讀一下庶香。

Cell2location 是一種有principled Bayesian model，可以解析空間轉(zhuǎn)錄組數(shù)據(jù)中的 fine-grained細胞類型泛源，并創(chuàng)建不同組織的綜合細胞圖拔妥。

Given cell type annotation for each cell, the corresponding reference cell type signatures (g_{f,g}), which represent the average mRNA count of each gene (g) in each cell type (f), can be estimated from sc/snRNA-seq data using 2 provided methods (see below). Cell2location needs untransformed unnormalised spatial mRNA counts as input. You also need to provide cell2location with the expected average cell abundance per location which is used as a prior to guide estimation of absolute cell abundance. This value depends on the tissue and can be estimated by counting nuclei for a few locations in the paired histology image but can be approximate (see paper methods for more guidance).

provide 2 methods for estimating reference cell type signatures from scRNA-seq data:

一種基于負二項式回歸的統(tǒng)計方法。通常建議使用 NB 回歸达箍，它允許跨技術(shù)和批次穩(wěn)健地組合數(shù)據(jù)没龙，從而提高空間映射精度。
單個基因的每個cluster平均 mRNA 計數(shù)的硬編碼計算 (cell2location.cluster_averages.compute_cluster_averages)缎玫。當(dāng)批次效應(yīng)較小時硬纤，這種更快的硬編碼計算每個集群平均值的方法提供了類似的高準(zhǔn)確度。

代碼部分也很值得多多學(xué)習(xí)

Loading packages

import sys

#if branch is stable, will install via pypi, else will install from source
branch = "github"
user = "BayraktarLab"
IN_COLAB = "google.colab" in sys.modules

if IN_COLAB and branch == "stable":
    !pip install --quiet cell2location
elif IN_COLAB and branch != "stable":
    !pip install --quiet --upgrade jsonschema
    !pip install --quiet git+https://github.com/BayraktarLab/cell2location#egg=cell2location[tutorials]

import scanpy as sc
import anndata
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl

import cell2location
import scvi

from matplotlib import rcParams
rcParams['pdf.fonttype'] = 42 # enables correct plotting of text
import seaborn as sns

First, let’s define where we save the results of our analysis:

results_folder = './results/lymph_nodes_analysis/'

# create paths and names to results folders for reference regression and cell2location models
ref_run_name = f'{results_folder}/reference_signatures'
run_name = f'{results_folder}/cell2location_map'

這里的用法我很喜歡赃磨， f'{results_folder}/reference_signatures'

Loading Visium and scRNA-seq reference data

首先從 10X Space Ranger 輸出中讀取空間 Visium 數(shù)據(jù)筝家。可以使用 scanpy 方便地下載和導(dǎo)入此數(shù)據(jù)集。

adata_vis = sc.datasets.visium_sge(sample_id="V1_Human_Lymph_Node")
adata_vis.obs['sample'] = list(adata_vis.uns['spatial'].keys())[0]

# rename genes to ENSEMBL
adata_vis.var['SYMBOL'] = adata_vis.var_names
adata_vis.var_names = adata_vis.var['gene_ids']
adata_vis.var_names.name = None

這里對scanpy讀入的對象的信息添加很值得學(xué)習(xí)邻辉。

注意溪王！線粒體編碼的基因（基因名稱以前綴 mt- 或 MT- 開頭）與空間映射無關(guān)，因為它們的表達代表了單細胞和細胞核數(shù)據(jù)中的技術(shù)產(chǎn)物值骇，而不是線粒體的生物學(xué)豐度莹菱。然而，這些基因在每個位置構(gòu)成了 15-40% 的 mRNA吱瘩。因此道伟，為了避免映射偽影，我們強烈建議去除線粒體基因使碾。

# find mitochondria-encoded (MT) genes
adata_vis.var['MT_gene'] = [gene.startswith('MT-') for gene in adata_vis.var['SYMBOL']]

# remove MT genes for spatial mapping (keeping their counts in the object)
adata_vis.obsm['MT'] = adata_vis[:, adata_vis.var['MT_gene'].values].X.toarray()
adata_vis = adata_vis[:, ~adata_vis.var['MT_gene'].values]

在單細胞參考中包含了跨越淋巴結(jié)蜜徽、脾臟和扁桃體的 scRNA-seq 數(shù)據(jù)集裹芝，以確保我們捕獲了空間轉(zhuǎn)錄組數(shù)據(jù)集中可能存在的免疫細胞狀態(tài)的全部多樣性。

# Download data if not already here
import os
if not os.path.exists('./data/sc.h5ad'):
    !cd ./data/ && wget https://cell2location.cog.sanger.ac.uk/paper/integrated_lymphoid_organ_scrna/RegressionNBV4Torch_57covariates_73260cells_10237genes/sc.h5ad

# Read data
adata_ref = sc.read(f'./data/sc.h5ad')

# Use ENSEMBL as gene IDs to make sure IDs are unique and correctly matched
adata_ref.var['SYMBOL'] = adata_ref.var.index
adata_ref.var.index = adata_ref.var['GeneID-2'].copy()
adata_ref.var_names = adata_ref.var['GeneID-2'].copy()
adata_ref.var.index.name = None
adata_ref.raw.var['SYMBOL'] = adata_ref.raw.var.index
adata_ref.raw.var.index = adata_ref.raw.var['GeneID-2'].copy()
adata_ref.raw.var.index.name = None
# before we estimate the reference cell type signature we recommend to perform very permissive genes selection
# in this 2D histogram orange rectangle lays over excluded genes.
# In this case, the downloaded dataset was already filtered using this method,
# hence no density under the orange rectangle
from cell2location.utils.filtering import filter_genes
selected = filter_genes(adata_ref, cell_count_cutoff=5, cell_percentage_cutoff2=0.03, nonz_mean_cutoff=1.12)

# filter the object
adata_ref = adata_ref[:, selected].copy()

圖片.png

Estimation of reference cell type signatures (NB regression)

The signatures are estimated from scRNA-seq data, accounting for batch effect, using a Negative binomial regression model.(這里需要考慮批次效應(yīng))

# prepare anndata for the regression model
scvi.data.setup_anndata(adata=adata_ref,
                        # 10X reaction / sample / batch
                        batch_key='Sample',
                        # cell type, covariate used for constructing signatures
                        labels_key='Subset',
                        # multiplicative technical effects (platform, 3' vs 5', donor effect)
                        categorical_covariate_keys=['Method']
                       )
scvi.data.view_anndata_setup(adata_ref)

圖片.png

# create and train the regression model
from cell2location.models import RegressionModel
mod = RegressionModel(adata_ref)

# Use all data for training (validation not implemented yet, train_size=1)
mod.train(max_epochs=250, batch_size=2500, train_size=1, lr=0.002, use_gpu=True)

# plot ELBO loss history during training, removing first 20 epochs from the plot
mod.plot_history(20)

圖片.png

# In this section, we export the estimated cell abundance (summary of the posterior distribution).
adata_ref = mod.export_posterior(
    adata_ref, sample_kwargs={'num_samples': 1000, 'batch_size': 2500, 'use_gpu': True}
)

# Save model
mod.save(f"{ref_run_name}", overwrite=True)

# Save anndata object with results
adata_file = f"{ref_run_name}/sc.h5ad"
adata_ref.write(adata_file)

Examine QC plots

1娜汁、重建準(zhǔn)確性以評估推理是否存在任何問題。
2兄朋、由于批次效應(yīng)掐禁，估計的表達特征不同于每個cluster中的平均表達。對于不受批效應(yīng)影響的 scRNA-seq 數(shù)據(jù)集（該數(shù)據(jù)集有）颅和，可以使用聚類平均表達而不是用模型估計特征傅事。當(dāng)此圖與對角線圖非常不同時（例如，Y 軸上的值非常低峡扩，到處都是密度）蹭越，則表明特征估計存在問題。

mod.plot_QC()

圖片.png

The model and output h5ad can be loaded later like this:

mod = cell2location.models.RegressionModel.load(f"{ref_run_name}", adata_ref)
adata_file = f"{ref_run_name}/sc.h5ad"
adata_ref = sc.read_h5ad(adata_file)

# export estimated expression in each cluster
if 'means_per_cluster_mu_fg' in adata_ref.varm.keys():
    inf_aver = adata_ref.varm['means_per_cluster_mu_fg'][[f'means_per_cluster_mu_fg_{i}'
                                    for i in adata_ref.uns['mod']['factor_names']]].copy()
else:
    inf_aver = adata_ref.var[[f'means_per_cluster_mu_fg_{i}'
                                    for i in adata_ref.uns['mod']['factor_names']]].copy()
inf_aver.columns = adata_ref.uns['mod']['factor_names']
inf_aver.iloc[0:5, 0:5]

圖片.png

Cell2location: spatial mapping

# find shared genes and subset both anndata and reference signatures
intersect = np.intersect1d(adata_vis.var_names, inf_aver.index)
adata_vis = adata_vis[:, intersect].copy()
inf_aver = inf_aver.loc[intersect, :].copy()

# prepare anndata for cell2location model
scvi.data.setup_anndata(adata=adata_vis, batch_key="sample")
scvi.data.view_anndata_setup(adata_vis)

圖片.png

Note! While you can often use the default value of detection_alpha hyperparameter, it is useful to adapt the expected cell abundance N_cells_per_location to every tissue. This value can be estimated from paired histology images and as described in the note above. Change the value presented in this tutorial (N_cells_per_location=30) to the value observed in your your tissue.

# create and train the model
mod = cell2location.models.Cell2location(
    adata_vis, cell_state_df=inf_aver,
    # the expected average cell abundance: tissue-dependent
    # hyper-prior which can be estimated from paired histology:
    N_cells_per_location=30,
    # hyperparameter controlling normalisation of
    # within-experiment variation in RNA detection (using default here):
    detection_alpha=200
)

mod.train(max_epochs=30000,
          # train using full data (batch_size=None)
          batch_size=None,
          # use all data points in training because
          # we need to estimate cell abundance at all locations
          train_size=1,
          use_gpu=True)

# plot ELBO loss history during training, removing first 100 epochs from the plot
mod.plot_history(1000)
plt.legend(labels=['full data training']);

圖片.png

# In this section, we export the estimated cell abundance (summary of the posterior distribution).
adata_vis = mod.export_posterior(
    adata_vis, sample_kwargs={'num_samples': 1000, 'batch_size': mod.adata.n_obs, 'use_gpu': True}
)

# Save model
mod.save(f"{run_name}", overwrite=True)

# mod = cell2location.models.Cell2location.load(f"{run_name}", adata_vis)

# Save anndata object with results
adata_file = f"{run_name}/sp.h5ad"
adata_vis.write(adata_file)

Visualising cell abundance in spatial coordinates

# add 5% quantile, representing confident cell abundance, 'at least this amount is present',
# to adata.obs with nice names for plotting
adata_vis.obs[adata_vis.uns['mod']['factor_names']] = adata_vis.obsm['q05_cell_abundance_w_sf']

# select one slide
from cell2location.utils import select_slide
slide = select_slide(adata_vis, 'V1_Human_Lymph_Node')

# plot in spatial coordinates
with mpl.rc_context({'axes.facecolor':  'black',
                     'figure.figsize': [4.5, 5]}):

    sc.pl.spatial(slide, cmap='magma',
                  # show first 8 cell types
                  color=['B_Cycling', 'B_GC_LZ', 'T_CD4+_TfH_GC', 'FDC',
                         'B_naive', 'T_CD4+_naive', 'B_plasma', 'Endo'],
                  ncols=4, size=1.3,
                  img_key='hires',
                  # limit color scale at 99.2% quantile of cell abundance
                  vmin=0, vmax='p99.2'
                 )

圖片.png

# Now we use cell2location plotter that allows showing multiple cell types in one panel
from cell2location.plt import plot_spatial

# select up to 6 clusters
clust_labels = ['T_CD4+_naive', 'B_naive', 'FDC']
clust_col = ['' + str(i) for i in clust_labels] # in case column names differ from labels

slide = select_slide(adata_vis, 'V1_Human_Lymph_Node')

with mpl.rc_context({'figure.figsize': (15, 15)}):
    fig = plot_spatial(
        adata=slide,
        # labels to show on a plot
        color=clust_col, labels=clust_labels,
        show_img=True,
        # 'fast' (white background) or 'dark_background'
        style='fast',
        # limit color scale at 99.2% quantile of cell abundance
        max_color_quantile=0.992,
        # size of locations (adjust depending on figure size)
        circle_diameter=6,
        colorbar_position='right'
    )

圖片.png

Downstream analysis

Identifying discrete tissue regions by Leiden clustering

通過使用由 cell2location 估計的細胞豐度對位置進行聚類來識別細胞組成不同的組織區(qū)域骂维。

我們通過使用每種細胞類型的估計細胞豐度對 Visium 點進行聚類來找到組織區(qū)域摇锋。我們構(gòu)建了一個 K-nearest neigbour (KNN) 圖奋献，表示估計細胞豐度中位置的相似性，然后應(yīng)用 Leiden 聚類买置。 KNN 鄰居的數(shù)量應(yīng)適應(yīng)數(shù)據(jù)集的大小和解剖學(xué)定義區(qū)域的大小（即海馬區(qū)域相當(dāng)小强霎，因此可能被大型 n_neighbors 掩蓋）忿项。這可以針對范圍 KNN 鄰居和 Leiden 聚類分辨率完成，直到獲得與組織解剖結(jié)構(gòu)匹配的聚類城舞。

聚類是在所有 Visium 部分/批次中聯(lián)合完成的轩触，因此區(qū)域身份是直接可比的。當(dāng)多個批次之間存在很強的技術(shù)影響時（這里不是這種情況）家夺，原則上可以使用 sc.external.pp.bbknn 來解釋 KNN 構(gòu)建過程中的這些影響脱柱。

The resulting clusters are saved in adata_vis.obs['region_cluster'].

# compute KNN using the cell2location output stored in adata.obsm
sc.pp.neighbors(adata_vis, use_rep='q05_cell_abundance_w_sf',
                n_neighbors = 15)

# Cluster spots into regions using scanpy
sc.tl.leiden(adata_vis, resolution=1.1)

# add region as categorical variable
adata_vis.obs["region_cluster"] = adata_vis.obs["leiden"].astype("category")

# compute UMAP using KNN graph based on the cell2location output
sc.tl.umap(adata_vis, min_dist = 0.3, spread = 1)

# show regions in UMAP coordinates
with mpl.rc_context({'axes.facecolor':  'white',
                     'figure.figsize': [8, 8]}):
    sc.pl.umap(adata_vis, color=['region_cluster'], size=30,
               color_map = 'RdPu', ncols = 2, legend_loc='on data',
               legend_fontsize=20)
    sc.pl.umap(adata_vis, color=['sample'], size=30,
               color_map = 'RdPu', ncols = 2,
               legend_fontsize=20)

# plot in spatial coordinates
with mpl.rc_context({'axes.facecolor':  'black',
                     'figure.figsize': [4.5, 5]}):
    sc.pl.spatial(adata_vis, color=['region_cluster'],
                  size=1.3, img_key='hires', alpha=0.5)

圖片.png

Identifying cellular compartments / tissue zones using matrix factorisation (NMF)(這部分應(yīng)該是新的內(nèi)容)

在這里，我們使用 cell2location 映射結(jié)果來識別細胞類型的空間共現(xiàn)拉馋，以便更好地了解組織組織并預(yù)測細胞相互作用褐捻。我們對來自 cell2location 的細胞類型豐度估計進行了非負矩陣分解（NMF）。與將 NMF 應(yīng)用于傳統(tǒng) scRNA-seq 的既定好處類似椅邓，附加 NMF 分解產(chǎn)生了一組空間細胞類型豐度曲線柠逞，將其分組為捕獲共定位細胞類型的組件。這種基于 NMF 的分解自然地解釋了這樣一個事實景馁，即多種細胞類型和微環(huán)境可以在相同的 Visium 位置共存板壮，同時跨組織區(qū)域（例如單個生發(fā)中心）共享信息。

提示在實踐中合住，最好針對一系列因子 (R={5, .., 30}) 訓(xùn)練 NMF绰精，并選擇 (R) 作為捕獲精細組織區(qū)域和拆分已知區(qū)室之間的平衡撒璧。如果您想找到幾個最明顯的細胞隔室，請使用少量因子笨使。如果您想找到非常強的協(xié)同定位信號并假設(shè)大多數(shù)細胞類型不協(xié)同定位卿樱，請使用很多因子（> 30 - 此處使用）。

Below we show how to perform this analysis. To aid this analysis, we wrapped the analysis shown the notebook on advanced downstream analysis into a pipeline that automates training of the NMF model with varying number of factors:

from cell2location import run_colocation
res_dict, adata_vis = run_colocation(
    adata_vis,
    model_name='CoLocatedGroupsSklearnNMF',
    train_args={
      'n_fact': np.arange(11, 13), # IMPORTANT: use a wider range of the number of factors (5-30)
      'sample_name_col': 'sample', # columns in adata_vis.obs that identifies sample
      'n_restarts': 3 # number of training restarts
    },
    export_args={'path': f'{run_name}/CoLocatedComb/'}
)

For every factor number, the model produces the following list of folder outputs:

cell_type_fractions_heatmap/: a dot plot of the estimated NMF weights of cell types (rows) across NMF components (columns)
cell_type_fractions_mean/: the data used for dot plot
factor_markers/: tables listing top 10 cell types most speficic to each NMF factor
models/: saved NMF models
predictive_accuracy/: 2D histogram plot showing how well NMF explains cell2location output
spatial/: NMF weights across locatinos in spatial coordinates
location_factors_mean/: the data used for the plot in spatial coordiantes
stability_plots/: stability of NMF weights between training restarts

檢查的關(guān)鍵輸出是 cell_type_fractions_heatmap/ 中的文件硫椰，它們顯示了與細胞隔室相對應(yīng)的 NMF 組件（列）中細胞類型（行）的估計 NMF 權(quán)重的點圖繁调。顯示的是相對權(quán)重，對每種細胞類型的組件進行了標(biāo)準(zhǔn)化靶草。

# Here we plot the NMF weights (Same as saved to `cell_type_fractions_heatmap`)
res_dict['n_fact12']['mod'].plot_cell_type_loadings()

圖片.png

Advanced use

Estimate cell-type specific expression of every gene in the spatial data

For this, we adapt the approach of estimating conditional expected expression proposed by RCTD (Cable et al) method. With cell2location, we can look at the posterior distribution rather than just point estimates of cell type specific expression (see mod.samples.keys() and next section on using full distribution).

# Compute expected expression per cell type
expected_dict = mod.module.model.compute_expected_per_cell_type(
    mod.samples["post_sample_q05"], mod.adata
)

# Add to anndata layers
for i, n in enumerate(mod.factor_names_):
    adata_vis.layers[n] = expected_dict['mu'][i]

# Save anndata object with results
adata_file = f"{run_name}/sp.h5ad"
adata_vis.write(adata_file)
adata_file

# Look at cell type specific expression in spatial coordinates,
# Here we highlight CD3D, pan T-cell marker expressed by
# 2 subtypes of T cells in distinct locations but not expressed by co-located B cells
ctypes = ['T_CD4+_TfH_GC', 'T_CD4+_naive', 'B_GC_LZ']
genes = ['CD3D', 'CR2']

with mpl.rc_context({'axes.facecolor':  'black'}):
    # select one slide
    slide = select_slide(adata_vis, 'V1_Human_Lymph_Node')

    from tutorial_utils import plot_genes_per_cell_type
    plot_genes_per_cell_type(slide, genes, ctypes);

圖片.png

就是想回顧一下蹄胰，當(dāng)然了，軟件也更新了很多內(nèi)容

生活很好奕翔，有你更好

最后編輯于：2021.12.30 13:39:35

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者

禁止轉(zhuǎn)載裕寨，如需轉(zhuǎn)載請通過簡信或評論聯(lián)系作者。

人面猴
序言：七十年代末派继，一起剝皮案震驚了整個濱河市宾袜，隨后出現(xiàn)的幾起案子，更是在濱河造成了極大的恐慌驾窟，老刑警劉巖试和，帶你破解...
沈念sama閱讀 206,013評論 6贊 481
死咒
序言：濱河連續(xù)發(fā)生了三起死亡事件，死亡現(xiàn)場離奇詭異纫普，居然都是意外死亡阅悍，警方通過查閱死者的電腦和手機，發(fā)現(xiàn)死者居然都...
沈念sama閱讀 88,205評論 2贊 382
救了他兩次的神仙讓他今天三更去死
文/潘曉璐我一進店門昨稼，熙熙樓的掌柜王于貴愁眉苦臉地迎上來节视，“玉大人，你說我怎么就攤上這事假栓⊙靶校” “怎么了？”我有些...
開封第一講書人閱讀 152,370評論 0贊 342
道士緝兇錄：失蹤的賣姜人
文/不壞的土叔我叫張陵匾荆，是天一觀的道長拌蜘。經(jīng)常有香客問我，道長牙丽，這世上最難降的妖魔是什么简卧？我笑而不...
開封第一講書人閱讀 55,168評論 1贊 278
?港島之戀（遺憾婚禮）
正文為了忘掉前任，我火速辦了婚禮烤芦，結(jié)果婚禮上举娩，老公的妹妹穿的比我還像新娘。我一直安慰自己，他們只是感情好铜涉，可當(dāng)我...
茶點故事閱讀 64,153評論 5贊 371
惡毒庶女頂嫁案：這布局不是一般人想出來的
文/花漫我一把揭開白布智玻。她就那樣靜靜地躺著，像睡著了一般芙代。火紅的嫁衣襯著肌膚如雪吊奢。梳的紋絲不亂的頭發(fā)上，一...
開封第一講書人閱讀 48,954評論 1贊 283
城市分裂傳說
那天纹烹，我揣著相機與錄音页滚，去河邊找鬼。笑死滔韵，一個胖子當(dāng)著我的面吹牛，可吹牛的內(nèi)容都是我干的掌实。我是一名探鬼主播陪蜻，決...
沈念sama閱讀 38,271評論 3贊 399
雙鴛鴦連環(huán)套：你想象不到人心有多黑
文/蒼蘭香墨我猛地睜開眼，長吁一口氣：“原來是場噩夢啊……” “哼贱鼻！你這毒婦竟也來了宴卖？” 一聲冷哼從身側(cè)響起，我...
開封第一講書人閱讀 36,916評論 0贊 259
萬榮殺人案實錄
序言：老撾萬榮一對情侶失蹤邻悬，失蹤者是張志新（化名）和其女友劉穎症昏，沒想到半個月后，有當(dāng)?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體父丰，經(jīng)...
沈念sama閱讀 43,382評論 1贊 300
?護林員之死
正文獨居荒郊野嶺守林人離奇死亡肝谭，尸身上長有42處帶血的膿包…… 初始之章·張勛以下內(nèi)容為張勛視角年9月15日...
茶點故事閱讀 35,877評論 2贊 323
?白月光啟示錄
正文我和宋清朗相戀三年，在試婚紗的時候發(fā)現(xiàn)自己被綠了蛾扇。大學(xué)時的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片攘烛。...
茶點故事閱讀 37,989評論 1贊 333
活死人
序言：一個原本活蹦亂跳的男人離奇死亡，死狀恐怖镀首，靈堂內(nèi)的尸體忽然破棺而出坟漱，到底是詐尸還是另有隱情，我是刑警寧澤更哄，帶...
沈念sama閱讀 33,624評論 4贊 322
?日本核電站爆炸內(nèi)幕
正文年R本政府宣布芋齿，位于F島的核電站，受9級特大地震影響成翩，放射性物質(zhì)發(fā)生泄漏觅捆。R本人自食惡果不足惜，卻給世界環(huán)境...
茶點故事閱讀 39,209評論 3贊 307
男人毒藥：我在死后第九天來索命
文/蒙蒙一麻敌、第九天我趴在偏房一處隱蔽的房頂上張望惠拭。院中可真熱鬧，春花似錦、人聲如沸职辅。這莊子的主人今日做“春日...
開封第一講書人閱讀 30,199評論 0贊 19
一樁弒父案，背后竟有這般陰謀
文/蒼蘭香墨我抬頭看了看天上的太陽域携。三九已至簇秒，卻和暖如春，著一層夾襖步出監(jiān)牢的瞬間秀鞭，已是汗流浹背趋观。一陣腳步聲響...
開封第一講書人閱讀 31,418評論 1贊 260
情欲美人皮
我被黑心中介騙來泰國打工，沒想到剛下飛機就差點兒被人妖公主榨干…… 1. 我叫王不留锋边，地道東北人皱坛。一個月前我還...
沈念sama閱讀 45,401評論 2贊 352
代替公主和親
正文我出身青樓，卻偏偏與公主長得像豆巨，于是被迫代替她去往敵國和親剩辟。傳聞我的和親對象是個殘疾皇子，可洞房花燭夜當(dāng)晚...
茶點故事閱讀 42,700評論 2贊 345

10X單細胞空間聯(lián)合分析之再次解讀cell2location

到了年根了胡本，其實想總結(jié)一些之前的東西，其中cell2location很想再次分享一下兢哭，好好研讀一下庶香。

provide 2 methods for estimating reference cell type signatures from scRNA-seq data:

代碼部分也很值得多多學(xué)習(xí)

Loading packages

First, let’s define where we save the results of our analysis:

Loading Visium and scRNA-seq reference data

在單細胞參考中包含了跨越淋巴結(jié)蜜徽、脾臟和扁桃體的 scRNA-seq 數(shù)據(jù)集裹芝，以確保我們捕獲了空間轉(zhuǎn)錄組數(shù)據(jù)集中可能存在的免疫細胞狀態(tài)的全部多樣性。

Estimation of reference cell type signatures (NB regression)

The signatures are estimated from scRNA-seq data, accounting for batch effect, using a Negative binomial regression model.(這里需要考慮批次效應(yīng))

Examine QC plots

The model and output h5ad can be loaded later like this:

Cell2location: spatial mapping

Visualising cell abundance in spatial coordinates

Downstream analysis

Identifying discrete tissue regions by Leiden clustering

The resulting clusters are saved in adata_vis.obs['region_cluster'].

Identifying cellular compartments / tissue zones using matrix factorisation (NMF)(這部分應(yīng)該是新的內(nèi)容)

Below we show how to perform this analysis. To aid this analysis, we wrapped the analysis shown the notebook on advanced downstream analysis into a pipeline that automates training of the NMF model with varying number of factors:

For every factor number, the model produces the following list of folder outputs:

Advanced use

Estimate cell-type specific expression of every gene in the spatial data

就是想回顧一下蹄胰，當(dāng)然了，軟件也更新了很多內(nèi)容

推薦閱讀更多精彩內(nèi)容