Revisiting stLearn for 10X single-cell spatial transcriptomics

Today we revisit stLearn, an excellent piece of software.

stSME clustering tutorial (spatial smoothing)

stSME is a novel normalisation method implemented in the stLearn software.

It is designed for spatial transcriptomics data and utilises tissue Spatial location, Morphology, and gene Expression.

This tutorial demonstrates how to use stLearn to perform stSME clustering on spatial transcriptomics data.
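To build some intuition for what the method does, here is a rough conceptual sketch in plain numpy (this is not stLearn's actual implementation; the function name, the 50/50 blend and the cosine-similarity weighting are illustrative assumptions): each spot's expression is blended with that of its nearest spatial neighbours, weighted by how similar their tissue-image features are.

import numpy as np

def sme_smooth_sketch(expr, coords, morph, n_neighbors=6):
    """expr: spots x genes, coords: spots x 2 positions, morph: spots x image features."""
    smoothed = np.empty_like(expr, dtype=float)
    for i in range(expr.shape[0]):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        nbrs = np.argsort(dist)[1:n_neighbors + 1]                # nearest spatial neighbours
        # cosine similarity of tile image features as the morphology weight
        sim = morph[nbrs] @ morph[i] / (
            np.linalg.norm(morph[nbrs], axis=1) * np.linalg.norm(morph[i]) + 1e-12)
        w = np.clip(sim, 0, None)
        w = w / w.sum() if w.sum() > 0 else np.full(len(nbrs), 1 / len(nbrs))
        smoothed[i] = 0.5 * expr[i] + 0.5 * w @ expr[nbrs]        # blend self with neighbours
    return smoothed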

Mouse Brain (Coronal)

1. Preparation

# import module
import stlearn as st
from pathlib import Path
st.settings.set_figure_params(dpi=180)
# specify PATH to data
BASE_PATH = Path("/home/uqysun19/60days/10x_visium/mouse_brain_coronal")

# spot tile is the intermediate result of image pre-processing
TILE_PATH = Path("/tmp/tiles")
TILE_PATH.mkdir(parents=True, exist_ok=True)

# output path
OUT_PATH = Path("/home/uqysun19/60days/stlearn_plot/mouse_brain_coronl")
OUT_PATH.mkdir(parents=True, exist_ok=True)
# load data
data = st.Read10X(BASE_PATH)
# pre-processing for gene count table
st.pp.filter_genes(data,min_cells=1)
st.pp.normalize_total(data)
st.pp.log1p(data)
# pre-processing for spot image
st.pp.tiling(data, TILE_PATH)

# this step uses deep learning model to extract high-level features from tile images
# may take a few minutes to complete
st.pp.extract_feature(data)

2. Run stSME clustering

# run PCA for gene expression data
st.em.run_pca(data,n_comps=50)
data_SME = data.copy()
# apply stSME to normalise log transformed data
st.spatial.SME.SME_normalize(data_SME, use_data="raw")
data_SME.X = data_SME.obsm['raw_SME_normalized']
st.pp.scale(data_SME)
st.em.run_pca(data_SME,n_comps=50)
# K-means clustering on stSME normalised PCA
st.tl.clustering.kmeans(data_SME,n_clusters=19, use_data="X_pca", key_added="X_pca_kmeans")
st.pl.cluster_plot(data_SME, use_label="X_pca_kmeans")
../_images/tutorials_stSME_clustering_14_1.png
# louvain clustering on stSME normalised data
st.pp.neighbors(data_SME,n_neighbors=17,use_rep='X_pca')
st.tl.clustering.louvain(data_SME, resolution=1.19)
st.pl.cluster_plot(data_SME,use_label="louvain")
../_images/tutorials_stSME_clustering_15_1.png

We now move to the Mouse Brain (Sagittal Posterior) Visium dataset from the 10x Genomics website.

Mouse Brain (Sagittal Posterior)

1. Preparation

# specify PATH to data
BASE_PATH = Path("/home/uqysun19/60days/10x_visium/mouse_brain_s_p_1/")

# spot tile is the intermediate result of image pre-processing
TILE_PATH = Path("/tmp/tiles")
TILE_PATH.mkdir(parents=True, exist_ok=True)

# output path
OUT_PATH = Path("/home/uqysun19/60days/stlearn_plot/mouse_brain_s_p_1/")
OUT_PATH.mkdir(parents=True, exist_ok=True)
# load data
data = st.Read10X(BASE_PATH)
# pre-processing for gene count table
st.pp.filter_genes(data,min_cells=1)
st.pp.normalize_total(data)
st.pp.log1p(data)
st.pp.scale(data)
# pre-processing for spot image
st.pp.tiling(data, TILE_PATH)

# this step uses deep learning model to extract high-level features from tile images
# may take a few minutes to complete
st.pp.extract_feature(data)

2. Run stSME clustering

# run PCA for gene expression data
st.em.run_pca(data,n_comps=50)
data_SME = data.copy()
# apply stSME to normalise log transformed data
# with weights from morphological similarity and physical distance
st.spatial.SME.SME_normalize(data_SME, use_data="raw",
                             weights="weights_matrix_pd_md")
data_SME.X = data_SME.obsm['raw_SME_normalized']
st.pp.scale(data_SME)
st.em.run_pca(data_SME,n_comps=50)
# K-means clustering on stSME normalised PCA
st.tl.clustering.kmeans(data_SME,n_clusters=17, use_data="X_pca", key_added="X_pca_kmeans")
st.pl.cluster_plot(data_SME, use_label="X_pca_kmeans")
../_images/tutorials_stSME_clustering_26_1.png
# louvain clustering on stSME normalised data
st.pp.neighbors(data_SME,n_neighbors=20,use_rep='X_pca')
st.tl.clustering.louvain(data_SME)
st.pl.cluster_plot(data_SME,use_label="louvain")
../_images/tutorials_stSME_clustering_27_1.png

Then we apply stSME clustering to the Human Brain dorsolateral prefrontal cortex (DLPFC) Visium dataset (the spatialLIBD data).

Human Brain dorsolateral prefrontal cortex (DLPFC)

import pandas as pd
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score, silhouette_score, \
                            homogeneity_completeness_v_measure
from sklearn.metrics.cluster import contingency_matrix
from sklearn.preprocessing import LabelEncoder
import numpy as np
import scanpy
def purity_score(y_true, y_pred):
    # compute contingency matrix (also called confusion matrix)
    cm = contingency_matrix(y_true, y_pred)
    # return purity
    return np.sum(np.amax(cm, axis=0)) / np.sum(cm)
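# quick sanity check of the metrics on toy labels (illustrative example only)
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 0, 1, 2, 2, 2]
print(adjusted_rand_score(toy_true, toy_pred))   # ~0.44
print(purity_score(toy_true, toy_pred))          # 5/6 ~ 0.83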
# specify PATH to data
BASE_PATH = Path("/home/uqysun19/60days/Human_Brain_spatialLIBD")
# here we include all 12 samples
sample_list = ["151507", "151508", "151509",
               "151510", "151669", "151670",
               "151671", "151672", "151673",
               "151674", "151675", "151676"]

Ground truth

for i in range(len(sample_list)):
    sample = sample_list[i]

    GROUND_TRUTH_PATH = BASE_PATH / sample / "cluster_labels_{}.csv".format(sample)
    ground_truth_df = pd.read_csv(GROUND_TRUTH_PATH, sep=',', index_col=0)
    ground_truth_df.index = ground_truth_df.index.map(lambda x: x[7:])

    le = LabelEncoder()
    ground_truth_le = le.fit_transform(list(ground_truth_df["ground_truth"].values))
    ground_truth_df["ground_truth_le"] = ground_truth_le

    # load data
    data = st.Read10X(BASE_PATH / sample)
    ground_truth_df = ground_truth_df.reindex(data.obs_names)
    data.obs["ground_truth"] = pd.Categorical(ground_truth_df["ground_truth"])
    st.pl.cluster_plot(data, use_label="ground_truth", cell_alpha=0.5)
[Ground-truth annotation cluster plots for the 12 DLPFC samples]
def calculate_clustering_matrix(pred, gt, sample, methods_):
    df = pd.DataFrame(columns=['Sample', 'Score', 'PCA_or_UMAP', 'Method', "test"])

    pca_ari = adjusted_rand_score(pred, gt)
    df = df.append(pd.Series([sample, pca_ari, "pca", methods_, "Adjusted_Rand_Score"],
                             index=['Sample', 'Score', 'PCA_or_UMAP', 'Method', "test"]), ignore_index=True)

    pca_nmi = normalized_mutual_info_score(pred, gt)
    df = df.append(pd.Series([sample, pca_nmi, "pca", methods_, "Normalized_Mutual_Info_Score"],
                             index=['Sample', 'Score', 'PCA_or_UMAP', 'Method', "test"]), ignore_index=True)

    pca_purity = purity_score(pred, gt)
    df = df.append(pd.Series([sample, pca_purity, "pca", methods_, "Purity_Score"],
                             index=['Sample', 'Score', 'PCA_or_UMAP', 'Method', "test"]), ignore_index=True)

    pca_homogeneity, pca_completeness, pca_v_measure = homogeneity_completeness_v_measure(pred, gt)

    df = df.append(pd.Series([sample, pca_homogeneity, "pca", methods_, "Homogeneity_Score"],
                             index=['Sample', 'Score', 'PCA_or_UMAP', 'Method', "test"]), ignore_index=True)

    df = df.append(pd.Series([sample, pca_completeness, "pca", methods_, "Completeness_Score"],
                             index=['Sample', 'Score', 'PCA_or_UMAP', 'Method', "test"]), ignore_index=True)

    df = df.append(pd.Series([sample, pca_v_measure, "pca", methods_, "V_Measure_Score"],
                             index=['Sample', 'Score', 'PCA_or_UMAP', 'Method', "test"]), ignore_index=True)
    return df
df = pd.DataFrame(columns=['Sample', 'Score', 'PCA_or_UMAP', 'Method', "test"])
for i in range(12):
    sample = sample_list[i]
    GROUND_TRUTH_PATH = BASE_PATH / sample / "cluster_labels_{}.csv".format(sample)
    ground_truth_df = pd.read_csv(GROUND_TRUTH_PATH, sep=',', index_col=0)
    ground_truth_df.index = ground_truth_df.index.map(lambda x: x[7:])

    le = LabelEncoder()
    ground_truth_le = le.fit_transform(list(ground_truth_df["ground_truth"].values))
    ground_truth_df["ground_truth_le"] = ground_truth_le
    TILE_PATH = Path("/tmp/{}_tiles".format(sample))
    TILE_PATH.mkdir(parents=True, exist_ok=True)
    data = st.Read10X(BASE_PATH / sample)
    ground_truth_df = ground_truth_df.reindex(data.obs_names)
    n_cluster = len((set(ground_truth_df["ground_truth"]))) - 1
    data.obs['ground_truth'] = ground_truth_df["ground_truth"]
    ground_truth_le = ground_truth_df["ground_truth_le"]

    # pre-processing for gene count table
    st.pp.filter_genes(data,min_cells=1)
    st.pp.normalize_total(data)
    st.pp.log1p(data)

    # run PCA for gene expression data
    st.em.run_pca(data,n_comps=15)

    # pre-processing for spot image
    st.pp.tiling(data, TILE_PATH)

    # this step uses deep learning model to extract high-level features from tile images
    # may take a few minutes to complete
    st.pp.extract_feature(data)

    # stSME
    st.spatial.SME.SME_normalize(data, use_data="raw", weights="physical_distance")
    data_ = data.copy()
    data_.X = data_.obsm['raw_SME_normalized']

    st.pp.scale(data_)
    st.em.run_pca(data_,n_comps=15)

    st.tl.clustering.kmeans(data_, n_clusters=n_cluster, use_data="X_pca", key_added="X_pca_kmeans")
    st.pl.cluster_plot(data_, use_label="X_pca_kmeans")

    methods_ = "stSME_disk"
    results_df = calculate_clustering_matrix(data_.obs["X_pca_kmeans"], ground_truth_le, sample, methods_)
    df = df.append(results_df, ignore_index=True)
[stSME k-means clustering plots for the 12 DLPFC samples]
# read clustering results from other methods
pca_df = pd.read_csv("./stSME_matrices_other_methods.csv")
df_all = pca_df.append(df, ignore_index=True)
import seaborn as sns
import matplotlib.pyplot as plt
sns.set_style("white")
from matplotlib.patches import PathPatch

def adjust_box_widths(g, fac):
    """
    Adjust the widths of a seaborn-generated boxplot.
    """

    # iterating through Axes instances
    for ax in g.axes:

        # iterating through axes artists:
        for c in ax.get_children():

            # searching for PathPatches
            if isinstance(c, PathPatch):
                # getting current width of box:
                p = c.get_path()
                verts = p.vertices
                verts_sub = verts[:-1]
                xmin = np.min(verts_sub[:, 0])
                xmax = np.max(verts_sub[:, 0])
                xmid = 0.5*(xmin+xmax)
                xhalf = 0.5*(xmax - xmin)

                # setting new width of box
                xmin_new = xmid-fac*xhalf
                xmax_new = xmid+fac*xhalf
                verts_sub[verts_sub[:, 0] == xmin, 0] = xmin_new
                verts_sub[verts_sub[:, 0] == xmax, 0] = xmax_new

                # setting new width of median line
                for l in ax.lines:
                    if np.all(l.get_xdata() == [xmin, xmax]):
                        l.set_xdata([xmin_new, xmax_new])

class GridShader():
    def __init__(self, ax, first=True, **kwargs):
        self.spans = []
        self.sf = first
        self.ax = ax
        self.kw = kwargs
        self.ax.autoscale(False, axis="x")
        self.cid = self.ax.callbacks.connect('xlim_changed', self.shade)
        self.shade()
    def clear(self):
        for span in self.spans:
            try:
                span.remove()
            except:
                pass
    def shade(self, evt=None):
        self.clear()
        xticks = self.ax.get_xticks()
        xlim = self.ax.get_xlim()
        xticks = xticks[(xticks > xlim[0]) & (xticks < xlim[-1])]
        locs = np.concatenate(([[xlim[0]], xticks+0.5, [xlim[-1]]]))

        start = locs[1-int(self.sf)::2]
        end = locs[2-int(self.sf)::2]

        for s, e in zip(start, end):
            self.spans.append(self.ax.axvspan(s, e, zorder=0, **self.kw))

import seaborn as sns
fig = plt.figure(figsize=(15, 5))
a = sns.boxplot(x="Method", y="Score", hue="test",
                width=0.7, data=df_all)
a.set_xticklabels(a.get_xticklabels(), rotation=45)
sns.despine(left=True)
a.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
adjust_box_widths(fig, 0.7)
plt.autoscale()
gs = GridShader(a, facecolor="lightgrey", first=False, alpha=0.7)
plt.tight_layout()
#plt.savefig("./clustering_performace.png", dpi=300)
plt.show()
../_images/tutorials_stSME_clustering_42_1.png

stSME normalization & imputation effects (data imputation)

This tutorial shows the effect of stSME normalization in two scenarios:
- (1) normal (without stSME)
- (2) stSME applied to raw gene counts

In this tutorial we use the Mouse Brain (Coronal) Visium dataset from the 10x Genomics website.

# import module
import stlearn as st
from pathlib import Path
st.settings.set_figure_params(dpi=180)
# specify PATH to data
BASE_PATH = Path("/home/uqysun19/60days/10x_visium/mouse_brain_coronal")

# spot tile is the intermediate result of image pre-processing
TILE_PATH = Path("/tmp/tiles")
TILE_PATH.mkdir(parents=True, exist_ok=True)

# output path
OUT_PATH = Path("/home/uqysun19/60days/stlearn_plot/mouse_brain_coronl")
OUT_PATH.mkdir(parents=True, exist_ok=True)
# load data
data = st.Read10X(BASE_PATH)
# pre-processing for gene count table
st.pp.filter_genes(data,min_cells=1)
st.pp.normalize_total(data)
st.pp.log1p(data)
# pre-processing for spot image
st.pp.tiling(data, TILE_PATH)

# this step uses deep learning model to extract high-level features from tile images
# may take a few minutes to complete
st.pp.extract_feature(data)

(1) normal (without stSME)

data_normal = data.copy()

marker gene for CA3

i="Lhfpl1"
st.pl.gene_plot(data_normal, gene_symbols=i, size=3,
                    fname=str(OUT_PATH) + "/without_SME_{}".format(str(i)) + ".png")
../_images/tutorials_stSME_comparison_11_1.png

marker gene for DG

i="Pla2g2f"
st.pl.gene_plot(data_normal, gene_symbols=i, size=3,
                    fname=str(OUT_PATH) + "/without_SME_{}".format(str(i)) + ".png")
../_images/tutorials_stSME_comparison_13_0.png

(2) stSME applied to raw gene counts

data_SME = data.copy()
# apply stSME to normalise log transformed data
st.spatial.SME.SME_normalize(data_SME, use_data="raw")
data_SME.X = data_SME.obsm['raw_SME_normalized']
st.em.run_pca(data_SME,n_comps=50)

marker gene for CA3

i="Lhfpl1"
st.pl.gene_plot(data_SME, gene_symbols=i, size=3,
                    fname=str(OUT_PATH) + "/with_SME_{}".format(str(i)) + ".png")
../_images/tutorials_stSME_comparison_17_1.png

marker gene for DG

i="Pla2g2f"
st.pl.gene_plot(data_SME, gene_symbols=i, size=3,
                    fname=str(OUT_PATH) + "/with_SME_{}".format(str(i)) + ".png")
../_images/tutorials_stSME_comparison_19_0.png

Spatial trajectory inference analysis tutorial (spatial trajectories)

1. Preparation

We try to keep the focus on spatial trajectory inference, so for every step from reading data and preprocessing to clustering we simply give the code here, to make it easier for users to follow.

Read and preprocess data

import stlearn as st
import scanpy as sc
sc.settings.verbosity = 3
st.settings.set_figure_params(dpi=120)
# Reading data
data = st.Read10X(path="path_to_BCBA")
# Save raw counts
data.raw = data
# Preprocessing
st.pp.filter_genes(data,min_cells=3)
st.pp.normalize_total(data)
st.pp.log1p(data)
st.pp.scale(data)

Clustering data

# Run PCA
st.em.run_pca(data,n_comps=50,random_state=0)
# Tiling image
st.pp.tiling(data,out_path="tiling",crop_size = 40)
# Using Deep Learning to extract feature
st.pp.extract_feature(data)
# Apply stSME spatial-PCA option
st.spatial.morphology.adjust(data,use_data="X_pca",radius=50,method="mean")
st.pp.neighbors(data,n_neighbors=25,use_rep='X_pca_morphology',random_state=0)
st.tl.clustering.louvain(data,random_state=0)
st.pl.cluster_plot(data,use_label="louvain",image_alpha=1,size=7)
../_images/tutorials_Pseudo-time-space-tutorial_8_0.png
st.add.annotation(data,label_list=['Fatty tissue,immune/lymphoid 1 MALAT1+',
                                   'Invasive cancer,fibrous tissue 1 CXCL14+',
                                   'Invasive cancer,fibrous tissue 2 CRISP3+',
                                   'Invasive cancer,fibrous tissue, fatty tissue',
                                   'Fatty tissue,immune/lymphoid 2 IGKC+',
                                   'Fibrous tissue',
                                   'Invasive cancer,fibrous tissue (DCIS)',
                                   'Fatty tissue, Fibrous tissue',
                                   'Invasive cancer,immune/lymphoid (IDC)' ,
                                   'Invasive cancer,fatty tissue 3 MUC5B+',
                                   'Fatty tissue'],
                 use_label="louvain")
st.pl.cluster_plot(data,use_label="louvain_anno",image_alpha=1,size=7)
../_images/tutorials_Pseudo-time-space-tutorial_9_1.png

2. Spatial trajectory inference

Choosing root

3733 is the index of the spot that we chose as the root. It is in the DCIS cluster (cluster 6). We recommend choosing a root spot at the beginning/end of a cluster in UMAP space; you can use the min/max point of a cluster in UMAP as the root.

data.uns["iroot"] = 3733
st.spatial.trajectory.pseudotime(data,eps=50,use_rep="X_pca",use_sme=True)
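If you need to locate such a root yourself, one possible approach (an illustrative sketch only, assuming a UMAP embedding is already stored in data.obsm["X_umap"], e.g. from sc.tl.umap(data) after the neighbours step above) is to take the spot of the target cluster with the extreme value on the first UMAP component:

import numpy as np

in_cluster = np.where((data.obs["louvain"] == "6").values)[0]   # spots in the DCIS cluster
umap1 = data.obsm["X_umap"][:, 0]                               # assumes UMAP has been computed
root_candidate = in_cluster[np.argmin(umap1[in_cluster])]       # spot at the "min" end of the cluster
# data.uns["iroot"] = int(root_candidate)                       # would replace the hard-coded 3733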

Spatial trajectory inference - global level

We run the global level of the pseudo-time-space (PSTS) method to reconstruct the spatial trajectory between cluster 6 (DCIS) and cluster 8 (IDC lesions).

st.spatial.trajectory.pseudotimespace_global(data,use_label="louvain",list_clusters=[6,8])
st.pl.cluster_plot(data,use_label="louvain",show_trajectories=True,list_clusters=[6,8],show_subcluster=True)
../_images/tutorials_Pseudo-time-space-tutorial_14_3.png
st.pl.trajectory.tree_plot(data)
../_images/tutorials_Pseudo-time-space-tutorial_15_0.png

Transition markers detection

Based on the spatial trajectory/tree plot, we can see two clades starting from sub-clusters 6 and 13. We then run the function to detect genes highly correlated with the PSTS values.

st.spatial.trajectory.detect_transition_markers_clades(data,clade=6,use_raw_count=False,cutoff_spearman=0.3)
st.spatial.trajectory.detect_transition_markers_clades(data,clade=13,use_raw_count=False,cutoff_spearman=0.3)
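Conceptually, a transition marker here is a gene whose expression is Spearman-correlated with the pseudotime values along a clade. The snippet below is only a rough illustration of that idea (it correlates over all spots and assumes the pseudotime values sit in data.obs["dpt_pseudotime"], which is an assumption about the storage key); the stLearn function additionally restricts to the spots of the chosen clade and applies the cutoff internally.

import numpy as np
from scipy.stats import spearmanr

pseudotime = data.obs["dpt_pseudotime"].values     # assumed storage key for the pseudotime values
X = data.X.toarray() if hasattr(data.X, "toarray") else np.asarray(data.X)
rho = np.array([spearmanr(pseudotime, X[:, j]).correlation for j in range(X.shape[1])])
order = np.argsort(np.nan_to_num(rho))
print(data.var_names[order[:10]])                  # most negatively correlated genes
print(data.var_names[order[-10:]])                 # most positively correlated genes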

In the transition markers plot, genes on the left (red) are negatively correlated with the spatial trajectory and genes on the right (blue) are positively correlated with it.

st.pl.trajectory.transition_markers_plot(data,top_genes=30,trajectory="clade_6")
../_images/tutorials_Pseudo-time-space-tutorial_20_0.png
st.pl.trajectory.transition_markers_plot(data,top_genes=30,trajectory="clade_13")
../_images/tutorials_Pseudo-time-space-tutorial_21_0.png

We also provide a function to compare the transition markers between two clades.

st.spatial.trajectory.compare_transitions(data,trajectories=("clade_6","clade_13"))
st.settings.set_figure_params(dpi=150)
st.pl.trajectory.DE_transition_plot(data,top_genes=10)
../_images/tutorials_Pseudo-time-space-tutorial_24_0.png

We can visualize some genes that differ between the two clades.

st.pl.gene_plot(data,gene_symbols="CTPS2",list_clusters=[6,8])
../_images/tutorials_Pseudo-time-space-tutorial_26_0.png
st.pl.gene_plot(data,gene_symbols="IFITM3",list_clusters=[6,8])
../_images/tutorials_Pseudo-time-space-tutorial_27_0.png

stLearn cell-cell interaction analysis (spatial neighbourhood communication)

1. Data loading and preprocessing

import stlearn as st
import pandas as pd
import random
st.settings.set_figure_params(dpi=100)
# read in visium dataset downloaded from: support.10xgenomics.com/spatial-gene-expression/datasets/1.0.0/V1_Breast_Cancer_Block_A_Section_2
data = st.Read10X("[PATH_TO_DATASET]")
# st.add.image(adata=data, imgpath="C:\\Users\\uqjxu8\\GIH\\Bioinformatics\\SPA\\Data\\visium\\Human_Breast_Cancer_Block_A_Section_1\\spatial\\tissue_hires_nobg.png",
#             library_id="V1_Breast_Cancer_Block_A_Section_1",visium=True)
# preprocessing
st.pp.filter_genes(data,min_cells=3)
st.pp.normalize_total(data)
st.pp.scale(data)

2. Count cell type diversity (between-spots)

Read in the cell type predictions for each spot from the Seurat label transfer results, based on a labelled scRNA-seq dataset.

st.add.labels(data, 'tutorials/label_transfer_bc.csv', sep='\t')
st.pl.cluster_plot(data,use_label="predictions")
../_images/tutorials_stLearn-CCI_4_1.png
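If you want to peek at the table before loading it into stLearn, a quick pandas check works (assuming the same tab-separated layout used above, with one row per spot barcode and one column per predicted cell type; the layout is an assumption):

import pandas as pd

lt = pd.read_csv('tutorials/label_transfer_bc.csv', sep='\t', index_col=0)
print(lt.shape)     # (number of spots, number of predicted cell types) -- assumed layout
print(lt.head())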

Count cell type diversity for between-spot mode

st.tl.cci.het.count(data, use_label='label_transfer')
st.pl.het_plot(data, use_het='cci_het')
../_images/tutorials_stLearn-CCI_6_1.png

3. Ligand-receptor co-expression (between-spots)

Read in user input LR pair

data.uns["lr"] = ['IL34_CSF1R']

Ligand-receptor co-expression in the neighbouring spots

st.tl.cci.lr(adata=data)
st.pl.het_plot(data, use_het='cci_lr', image_alpha=0.7)
Altogether 2 valid L-R pairs
L-R interactions with neighbours are counted and stored into adata.obsm['cci_lr']
../_images/tutorials_stLearn-CCI_11_1.png
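As a rough illustration of the between-spot idea (this is not stLearn's code, only a sketch of the scoring logic): split the pair into ligand and receptor, then score a spot highly when the ligand is expressed there while the receptor is expressed in its neighbouring spots, and vice versa. The imagerow/imagecol coordinate columns, the neighbour count and the symmetric scoring formula are assumptions made for the sketch.

import numpy as np
from sklearn.neighbors import NearestNeighbors

def gene_vector(adata, gene):
    x = adata[:, gene].X
    return np.asarray(x.todense()).ravel() if hasattr(x, "todense") else np.asarray(x).ravel()

ligand, receptor = "IL34_CSF1R".split("_")
L = gene_vector(data, ligand)
R = gene_vector(data, receptor)
coords = data.obs[["imagerow", "imagecol"]].values                         # assumed spot coordinate columns
_, idx = NearestNeighbors(n_neighbors=7).fit(coords).kneighbors(coords)   # each spot plus ~6 neighbours
L_nbr = L[idx].mean(axis=1)                                                # ligand level around each spot
R_nbr = R[idx].mean(axis=1)                                                # receptor level around each spot
data.obs["cci_lr_sketch"] = (L * R_nbr + R * L_nbr) / 2                    # symmetric co-expression score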

4. Calculate merged CCI scores (between-spots)

st.tl.cci.merge(data, use_lr='cci_lr', use_het='cci_het')
st.pl.het_plot(data, use_het='merged', cell_alpha=0.7)
../_images/tutorials_stLearn-CCI_13_1.png

5. Permutation test (between-spots) (could be time consuming)

Permutation Run

st.tl.cci.permutation(data, use_het='cci_het', n_pairs=200)

Significance Test against null distribution

# plot the -log10(pvalue) from permutation test on each spot
st.pl.het_plot(data, use_het='merged_pvalues', cell_alpha=0.7)
../_images/tutorials_stLearn-CCI_18_0.png

Final significant hotspots

st.pl.het_plot(data, use_het='merged_sign', cell_alpha=0.7)
../_images/tutorials_stLearn-CCI_20_0.png

6. Count cell type diversity (within-spots)

st.tl.cci.het.count(data, use_label='label_transfer', distance=0)
st.pl.het_plot(data, use_het='cci_het')
../_images/tutorials_stLearn-CCI_22_1.png

7. Ligand-receptor co-expression (within-spot)

Read in user input LR pair

data.uns["lr"] = ['IL34_CSF1R']

Ligand-receptor co-expression within each spot

st.tl.cci.lr(adata=data, distance=0)
st.pl.het_plot(data, use_het='cci_lr', cell_alpha=0.7)
../_images/tutorials_stLearn-CCI_27_1.png

8. Calculate merged CCI scores (within-spot)

st.tl.cci.merge(data, use_lr='cci_lr', use_het='cci_het')
st.pl.het_plot(data, use_het='merged', cell_alpha=0.7)
../_images/tutorials_stLearn-CCI_29_1.png

9. Permutation test (within-spot) (could be time consuming)

Permutation Run

st.tl.cci.permutation(data, use_het='cci_het', n_pairs=200, distance=0)

Significance Test against null distribution

# plot the -log10(pvalue) from permutation test on each spot
st.pl.het_plot(data, use_het='merged_pvalues', cell_alpha=0.7)
../_images/tutorials_stLearn-CCI_34_0.png

Final significant hotspots

st.pl.het_plot(data, use_het='merged_sign', cell_alpha=0.7)
../_images/tutorials_stLearn-CCI_36_0.png

Core plotting functions

Loading processed data

import stlearn as st
import scanpy as sc
st.settings.set_figure_params(dpi=120)

# Ignore all warnings
import warnings
warnings.filterwarnings("ignore")
# Read raw data
adata = sc.datasets.visium_sge()
adata = st.convert_scanpy(adata)

# Read processed data
adata_processed = st.datasets.example_bcba()

Gene plot

Here is the standard plot for gene expression; there are two options, for single genes and for multiple genes:

st.pl.gene_plot(adata, gene_symbols="BRCA1")
../_images/tutorials_Core_plots_7_0.png

For multiple genes, you can combine them using 'CumSum' (cumulative sum) or 'NaiveMean' (naive mean):

st.pl.gene_plot(adata, gene_symbols=["BRCA1","BRCA2"], method="CumSum")
../_images/tutorials_Core_plots_9_0.png
st.pl.gene_plot(adata, gene_symbols=["BRCA1","BRCA2"], method="NaiveMean")
../_images/tutorials_Core_plots_10_0.png
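For reference, a plausible reading of 'NaiveMean' (an assumption about the intent of the option, not stLearn's exact formula) is simply the per-spot mean of the selected genes:

import numpy as np

genes = ["BRCA1", "BRCA2"]
X = adata[:, genes].X
X = np.asarray(X.todense()) if hasattr(X, "todense") else np.asarray(X)
adata.obs["BRCA_combined"] = X.mean(axis=1)    # one combined value per spot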

You can also plot genes as a contour plot to see their spatial distribution more clearly:

st.pl.gene_plot(adata, gene_symbols="GAPDH", contour=True,cell_alpha=0.5)
../_images/tutorials_Core_plots_12_0.png

You can change step_size to adjust the binning of the displayed contour range:

st.pl.gene_plot(adata, gene_symbols="GAPDH", contour=True,cell_alpha=0.5, step_size=200)
../_images/tutorials_Core_plots_14_0.png

Cluster plot

There are different options for displaying clustering results; several show_* options let the user control which parts of the figure are displayed:

st.pl.cluster_plot(adata_processed,use_label="louvain")
../_images/tutorials_Core_plots_17_0.png
st.pl.cluster_plot(adata_processed,use_label="louvain",show_cluster_labels=True,show_color_bar=False)
../_images/tutorials_Core_plots_18_0.png

Subcluster plot

stLearn can also plot spatial subclusters based on the spatial location within a cluster.

You have two options here: display subclusters for multiple clusters using show_subcluster in st.pl.cluster_plot, or use st.pl.subcluster_plot to display the subclusters within a single cluster, each with a different color.

st.pl.cluster_plot(adata_processed,use_label="louvain",show_subcluster=True,show_color_bar=False, list_clusters=[6,8])
../_images/tutorials_Core_plots_21_0.png
st.pl.subcluster_plot(adata_processed, use_label="louvain", cluster = 6)
../_images/tutorials_Core_plots_22_0.png

Spatial trajectory plot

st.pl.trajectory.pseudotime_plot visualizes the PAGA graph mapped onto the spatial transcriptomics array.

st.pl.trajectory.pseudotime_plot(adata_processed, use_label="louvain", show_node=True)
../_images/tutorials_Core_plots_25_0.png

You can plot the spatial trajectory analysis results with the node of each subcluster using the show_trajectories and show_node parameters.

st.pl.cluster_plot(adata_processed,use_label="louvain",show_trajectories=True,
                   show_color_bar=True, list_clusters=[6,8], show_node=True)
../_images/tutorials_Core_plots_27_0.png

Cell-cell interaction plot

For cell-cell interaction, you can display the result with st.pl.het_plot. There is also an option to display it as a contour plot.

st.pl.het_plot(adata_processed, use_het="lr_pvalues")
../_images/tutorials_Core_plots_30_0.png
st.pl.het_plot(adata_processed, use_het="lr",contour=True,step_size=1)
../_images/tutorials_Core_plots_31_0.png

Spatial transcriptomics deconvolution visualization (single-cell and spatial integration)

SPOTlight

You can follow the tutorial of SPOTlight in their git repository: https://github.com/MarcElosua/SPOTlight

After completing the tutorial, you can run this R code to produce deconvolution_result.csv:

# This is R code. Run it after completing the SPOTlight tutorial

tmp = as.data.frame(decon_mtrx)
row.names(tmp) <- row.names(brain@meta.data)
write.csv(t(tmp[1:(length(tmp)-1)]),"deconvolution_result.csv")

Note that:
- brain is the Seurat object of the spatial transcriptomics data
- we save decon_mtrx to a .csv file as the input to the deconvolution visualization function

Seurat label transferring

Seurat provides a vignette about spatial transcriptomics analysis (https://satijalab.org/seurat/v3.2/spatial_vignette.html).

In the section 'Integration with single-cell data', you can follow the steps to do the label transferring.

After that, you can run this R code to produce deconvolution_result.csv:

# This is R code. Run it after completing the 'Integration with single-cell data' tutorial

write.csv(predictions.assay@data[-nrow(predictions.assay@data),],"deconvolution_result.csv")

Other software

If you use other software, you should convert the result to this format:

[Image: example layout of the deconvolution_result table]
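As a quick sanity check in Python (the layout below — cell types as rows, spot barcodes as columns, proportions summing to roughly 1 per spot — is inferred from the R export above, so treat it as an assumption):

import pandas as pd

decon = pd.read_csv("deconvolution_result.csv", index_col=0)
print(decon.shape)                   # (number of cell types, number of spots)
print(decon.sum(axis=0).head())      # each spot's proportions should sum to ~1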

stLearn deconvolution visualization

Running the basic analysis steps of stLearn

import stlearn as st
data = st.Read10X(path="BRAIN_PATH")
data.var_names_make_unique()
st.pp.filter_genes(data,min_cells=3)
st.pp.normalize_total(data)
st.pp.log1p(data)
st.pp.scale(data)
st.em.run_pca(data,n_comps=10,random_state=0)
st.pp.neighbors(data,n_neighbors=15,use_rep='X_pca',random_state=0)
st.tl.clustering.louvain(data,random_state=0)
st.pl.cluster_plot(data,use_label="louvain",image_alpha=1,size=7)
../_images/tutorials_ST_deconvolution_visualization_17_1.png

Add the annotation from other software and visualize it as a scatter pie plot with a donut chart.

st.add.add_deconvolution(data,annotation_path="deconvolution_result.csv")

There is also a threshold parameter that filters based on a quantile; the objective is to remove noisy labels.

st.pl.deconvolution_plot(data,threshold=0.5)
../_images/tutorials_ST_deconvolution_visualization_21_0.png
st.pl.deconvolution_plot(data,threshold=0.0)
../_images/tutorials_ST_deconvolution_visualization_22_0.png
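To make the threshold behaviour concrete, here is a toy illustration of one reading of "filtering based on quantile" (this interpretation is an assumption, not taken from the stLearn source): per spot, proportions below the chosen quantile are dropped and the rest re-normalised before drawing the pie.

import numpy as np

spot_props = np.array([0.55, 0.25, 0.12, 0.05, 0.03])   # proportions for one hypothetical spot
cutoff = np.quantile(spot_props, 0.5)                    # threshold=0.5 read as a quantile
kept = spot_props[spot_props >= cutoff]
print(kept / kept.sum())                                 # re-normalised proportions that get drawn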

You can also examine the proportion of cell types in each cluster:

st.pl.deconvolution_plot(data,cluster=9)
../_images/tutorials_ST_deconvolution_visualization_24_0.png

Life is good, and it's even better with you.
