10X單細胞空間分析回顧之SPOTlight

一年了，我們都需要總結(jié)鹃答，這一篇我們回顧一下SPOTlight乎澄，非常好的方法，建議大家也總結(jié)一下测摔，看看2021年置济，自己都得到了什么。

SPOTlight 的目標是提供一種工具锋八，能夠?qū)Π毎旌衔锏拿總€捕獲位置中存在的細胞類型和細胞類型比例進行解卷積浙于，最初是為 10X 的 Visium - 空間轉(zhuǎn)錄組學技術(shù)開發(fā)的， it can be used for all technologies returning mixtures of cells挟纱。 SPOTlight 基于通過 NMFreg 模型為每種細胞類型查找topics profiles singatures羞酗，并找到最適合我們想要解卷積的spot的組合。

圖片.png

Libraries

library(Matrix)
library(data.table)
library(Seurat)
library(SeuratData)
library(dplyr)
library(gt)
library(SPOTlight)
library(igraph)
library(RColorBrewer)

Load data

Load single-cell reference dataset.

path_to_data <- system.file(package = "SPOTlight")
cortex_sc <- readRDS(glue::glue("{path_to_data}/allen_cortex_dwn.rds"))

Load Spatial data

if (! "stxBrain" %in% SeuratData::AvailableData()[, "Dataset"]) {
  # If dataset not downloaded proceed to download it
  SeuratData::InstallData("stxBrain")
}

# Load data
anterior <- SeuratData::LoadData("stxBrain", type = "anterior1")

Pre-processing,設置種子的功能大家還知道么紊服？檀轨？

set.seed(123)
cortex_sc <- Seurat::SCTransform(cortex_sc, verbose = FALSE) %>%
  Seurat::RunPCA(., verbose = FALSE) %>%
  Seurat::RunUMAP(., dims = 1:30, verbose = FALSE)

Visualize the clustering

Seurat::DimPlot(cortex_sc,
                group.by = "subclass",
                label = TRUE) + Seurat::NoLegend()

圖片.png

Descriptive

cortex_sc@meta.data %>%
  dplyr::count(subclass) %>%
  gt::gt(.[-1, ]) %>%
  gt::tab_header(
    title = "Cell types present in the reference dataset",
  ) %>%
  gt::cols_label(
    subclass = gt::html("Cell Type")
  )

圖片.png

Compute marker genes

為了確定最重要的標記基因，我們可以使用函數(shù) Seurat::FindAllMarkers围苫，它將返回每個cluster的標記裤园。

Seurat::Idents(object = cortex_sc) <- cortex_sc@meta.data$subclass
cluster_markers_all <- Seurat::FindAllMarkers(object = cortex_sc, 
                                              assay = "SCT",
                                              slot = "data",
                                              verbose = TRUE, 
                                              only.pos = TRUE)

saveRDS(object = cluster_markers_all,
        file = here::here("inst/markers_sc.RDS"))

SPOTlight Decomposition

set.seed(123)

spotlight_ls <- spotlight_deconvolution(
  se_sc = cortex_sc,
  counts_spatial = anterior@assays$Spatial@counts,
  clust_vr = "subclass", # Variable in sc_seu containing the cell-type annotation
  cluster_markers = cluster_markers_all, # Dataframe with the marker genes
  cl_n = 100, # number of cells per cell type to use
  hvg = 3000, # Number of HVG to use
  ntop = NULL, # How many of the marker genes to use (by default all)
  transf = "uv", # Perform unit-variance scaling per cell and spot prior to factorzation and NLS
  method = "nsNMF", # Factorization method
  min_cont = 0 # Remove those cells contributing to a spot below a certain threshold 
  )

saveRDS(object = spotlight_ls, file = here::here("inst/spotlight_ls.rds"))

Read RDS object

spotlight_ls <- readRDS(file = here::here("inst/spotlight_ls.rds"))

nmf_mod <- spotlight_ls[[1]]
decon_mtrx <- spotlight_ls[[2]]

Assess deconvolution

Before even looking at the decomposed spots we can gain insight on how well the model performed by looking at the topic profiles for the cell types.
The first thing we can do is look at how specific the topic profiles are for each cell type.

h <- NMF::coef(nmf_mod[[1]])
rownames(h) <- paste("Topic", 1:nrow(h), sep = "_")
topic_profile_plts <- SPOTlight::dot_plot_profiles_fun(
  h = h,
  train_cell_clust = nmf_mod[[2]])

topic_profile_plts[[2]] + ggplot2::theme(
  axis.text.x = ggplot2::element_text(angle = 90), 
  axis.text = ggplot2::element_text(size = 12))

圖片.png

接下來我們可以看看每個細胞類型中每個細胞的各個topic profiles的行為。
在這里剂府，我們期望來自同一細胞類型的所有細胞顯示出相似的topic profiles分布拧揽，否則該cluster中可能會有更多的子結(jié)構(gòu)，我們可能只捕獲其中一個。

topic_profile_plts[[1]] + theme(axis.text.x = element_text(angle = 90), 
                                axis.text = element_text(size = 12))

圖片.png

Lastly we can take a look at which genes are the most important for each topic and therefore get an insight into which genes are driving them.

basis_spotlight <- data.frame(NMF::basis(nmf_mod[[1]]))

colnames(basis_spotlight) <- unique(stringr::str_wrap(nmf_mod[[2]], width = 30))

basis_spotlight %>%
  dplyr::arrange(desc(Astro)) %>%
  round(., 5) %>% 
  DT::datatable(., filter = "top")

圖片.png

Visualization

Join decomposition with metadata

# This is the equivalent to setting min_cont to 0.04
decon_mtrx_sub <- decon_mtrx[, colnames(decon_mtrx) != "res_ss"]
decon_mtrx_sub[decon_mtrx_sub < 0.08] <- 0
decon_mtrx <- cbind(decon_mtrx_sub, "res_ss" = decon_mtrx[, "res_ss"])
rownames(decon_mtrx) <- colnames(anterior)

decon_df <- decon_mtrx %>%
  data.frame() %>%
  tibble::rownames_to_column("barcodes")

anterior@meta.data <- anterior@meta.data %>%
  tibble::rownames_to_column("barcodes") %>%
  dplyr::left_join(decon_df, by = "barcodes") %>%
  tibble::column_to_rownames("barcodes")

Specific cell-types

we can use the standard Seurat::SpatialFeaturePlot to view predicted celltype proportions one at a time.

Seurat::SpatialFeaturePlot(
  object = anterior,
  features = c("L2.3.IT", "L6b", "Meis2", "Oligo"),
  alpha = c(0.1, 1))

圖片.png

Spatial scatterpies

cell_types_all <- colnames(decon_mtrx)[which(colnames(decon_mtrx) != "res_ss")]

SPOTlight::spatial_scatterpie(se_obj = anterior,
                              cell_types_all = cell_types_all,
                              img_path = "sample_data/spatial/tissue_lowres_image.png",
                              pie_scale = 0.4)

圖片.png

Plot spot composition of spots containing cell-types of interest

SPOTlight::spatial_scatterpie(se_obj = anterior,
                              cell_types_all = cell_types_all,
                              img_path = "sample_data/spatial/tissue_lowres_image.png",
                              cell_types_interest = "L6b",
                              pie_scale = 0.8)

圖片.png

Spatial interaction graph

現(xiàn)在我們知道在每個點內(nèi)發(fā)現(xiàn)了哪些細胞類型淤袜，我們可以制作一個表示空間相互作用的圖痒谴，其中細胞類型之間的邊緣越強，我們在同一點內(nèi)發(fā)現(xiàn)它們的頻率越高铡羡。為此积蔚，我們只需要運行 get_spatial_interaction_graph 函數(shù)，該函數(shù)將打印繪圖并返回繪圖所需的元素烦周。

graph_ntw <- SPOTlight::get_spatial_interaction_graph(decon_mtrx = decon_mtrx[, cell_types_all])

If you want to tune how the graph looks you can do the following or you can check out more options here:

deg <- degree(graph_ntw, mode="all")

# Get color palette for difusion
edge_importance <- E(graph_ntw)$importance

# Select a continuous palette
qual_col_pals <- brewer.pal.info[brewer.pal.info$category == 'seq',]

# Create a color palette
getPalette <- colorRampPalette(brewer.pal(9, "YlOrRd"))

# Get how many values we need
grad_edge <- seq(0, max(edge_importance), 0.1)

# Generate extended gradient palette dataframe
graph_col_df <- data.frame(value = as.character(grad_edge),
                           color = getPalette(length(grad_edge)),
                           stringsAsFactors = FALSE)
# Assign color to each edge
color_edge <- data.frame(value = as.character(round(edge_importance, 1)), stringsAsFactors = FALSE) %>%
  dplyr::left_join(graph_col_df, by = "value") %>%
  dplyr::pull(color)

# Open a pdf file
plot(graph_ntw,
     # Size of the edge
     edge.width = edge_importance,
     edge.color = color_edge,
     # Size of the buble
     vertex.size = deg,
     vertex.color = "#cde394",
     vertex.frame.color = "white",
     vertex.label.color = "black",
     vertex.label.family = "Ubuntu", # Font family of the label (e.g.“Times”, “Helvetica”)
     layout = layout.circle)

圖片.png

Lastly one can compute cell-cell correlations to see groups of cells that correlate positively or negatively.

# Remove cell types not predicted to be on the tissue
decon_mtrx_sub <- decon_mtrx[, cell_types_all]
decon_mtrx_sub <- decon_mtrx_sub[, colSums(decon_mtrx_sub) > 0]

# Compute correlation
decon_cor <- cor(decon_mtrx_sub)

# Compute correlation P-value
p.mat <- corrplot::cor.mtest(mat = decon_mtrx_sub, conf.level = 0.95)

# Visualize
ggcorrplot::ggcorrplot(
  corr = decon_cor,
  p.mat = p.mat[[1]],
  hc.order = TRUE,
  type = "full",
  insig = "blank",
  lab = TRUE,
  outline.col = "lightgrey",
  method = "square",
  # colors = c("#4477AA", "white", "#BB4444"))
  colors = c("#6D9EC1", "white", "#E46726"),
  title = "Predicted cell-cell proportion correlation",
  legend.title = "Correlation\n(Pearson)") +
  ggplot2::theme(
    plot.title = ggplot2::element_text(size = 22, hjust = 0.5, face = "bold"),
    legend.text = ggplot2::element_text(size = 12),
    legend.title = ggplot2::element_text(size = 15),
    axis.text.x = ggplot2::element_text(angle = 90),
    axis.text = ggplot2::element_text(size = 18, vjust = 0.5))

圖片.png

Step-by-Step insight

Here we are going to show step by step what is going on and all the different steps involved in the process

圖片.png

Downsample data

如果數(shù)據(jù)集非常大尽爆，我們希望在細胞數(shù)量和基因數(shù)量方面對其進行下采樣，以訓練模型读慎。為了進行下采樣漱贱，我們希望保留每個簇的代表性細胞數(shù)量和最重要的基因。我們表明這種下采樣不會影響模型的性能并大大加快了模型訓練的速度夭委。

# Downsample scRNAseq to select gene set and number of cells to train the model
se_sc_down <- downsample_se_obj(se_obj = cortex_sc,
                                clust_vr = "subclass",
                                cluster_markers = cluster_markers_all,
                                cl_n = 100,
                                hvg = 3000)

Train NMF model

Once we have the data ready to pass to the model we can train it as shown below.

start_time <- Sys.time()
nmf_mod_ls <- train_nmf(cluster_markers = cluster_markers_all, 
                        se_sc = se_sc_down, 
                        mtrx_spatial = anterior@assays$Spatial@counts,
                        clust_vr = "subclass",
                        ntop = NULL,
                        hvg = 3000,
                        transf = "uv",
                        method = "nsNMF")

nmf_mod <- nmf_mod_ls[[1]]

Extract matrices form the model:

# get basis matrix W
w <- basis(nmf_mod)
dim(w)

# get coefficient matrix H
h <- coef(nmf_mod)
dim(h)

Look at cell-type specific topic profile

rownames(h) <- paste("Topic", 1:nrow(h), sep = "_")
topic_profile_plts <- dot_plot_profiles_fun(
  h = h,
  train_cell_clust = nmf_mod_ls[[2]]
  )

topic_profile_plts[[2]] + theme(axis.text.x = element_text(angle = 90))

Spot Deconvolution

# Extract count matrix
spot_counts <- anterior@assays$Spatial@counts

# Subset to genes used to train the model
spot_counts <- spot_counts[rownames(spot_counts) %in% rownames(w), ]

Run spots through the basis to get the pertinent coefficients. To do this for every spot we are going to set up a system of linear equations where we need to find the coefficient, we will use non-negative least squares to determine the best coefficient fit.

ct_topic_profiles <- topic_profile_per_cluster_nmf(h = h,
                              train_cell_clust = nmf_mod_ls[[2]])

decon_mtrx <- mixture_deconvolution_nmf(nmf_mod = nmf_mod,
                          mixture_transcriptome = spot_counts,
                          transf = "uv",
                          reference_profiles = ct_topic_profiles, 
                          min_cont = 0.09)

生活很好幅狮，有你更好

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者

禁止轉(zhuǎn)載，如需轉(zhuǎn)載請通過簡信或評論聯(lián)系作者株灸。

人面猴
序言：七十年代末崇摄，一起剝皮案震驚了整個濱河市，隨后出現(xiàn)的幾起案子慌烧，更是在濱河造成了極大的恐慌逐抑，老刑警劉巖，帶你破解...
沈念sama閱讀 206,013評論 6贊 481
死咒
序言：濱河連續(xù)發(fā)生了三起死亡事件杏死，死亡現(xiàn)場離奇詭異泵肄，居然都是意外死亡，警方通過查閱死者的電腦和手機淑翼，發(fā)現(xiàn)死者居然都...
沈念sama閱讀 88,205評論 2贊 382
救了他兩次的神仙讓他今天三更去死
文/潘曉璐我一進店門，熙熙樓的掌柜王于貴愁眉苦臉地迎上來品追，“玉大人玄括，你說我怎么就攤上這事∪馔撸” “怎么了遭京？”我有些...
開封第一講書人閱讀 152,370評論 0贊 342
道士緝兇錄：失蹤的賣姜人
文/不壞的土叔我叫張陵，是天一觀的道長泞莉。經(jīng)常有香客問我哪雕，道長，這世上最難降的妖魔是什么鲫趁？我笑而不...
開封第一講書人閱讀 55,168評論 1贊 278
?港島之戀（遺憾婚禮）
正文為了忘掉前任斯嚎，我火速辦了婚禮，結(jié)果婚禮上，老公的妹妹穿的比我還像新娘堡僻。我一直安慰自己糠惫，他們只是感情好，可當我...
茶點故事閱讀 64,153評論 5贊 371
惡毒庶女頂嫁案：這布局不是一般人想出來的
文/花漫我一把揭開白布钉疫。她就那樣靜靜地躺著硼讽，像睡著了一般。火紅的嫁衣襯著肌膚如雪牲阁。梳的紋絲不亂的頭發(fā)上固阁，一...
開封第一講書人閱讀 48,954評論 1贊 283
城市分裂傳說
那天，我揣著相機與錄音城菊，去河邊找鬼您炉。笑死，一個胖子當著我的面吹牛役电，可吹牛的內(nèi)容都是我干的赚爵。我是一名探鬼主播，決...
沈念sama閱讀 38,271評論 3贊 399
雙鴛鴦連環(huán)套：你想象不到人心有多黑
文/蒼蘭香墨我猛地睜開眼法瑟，長吁一口氣：“原來是場噩夢啊……” “哼冀膝！你這毒婦竟也來了？” 一聲冷哼從身側(cè)響起霎挟，我...
開封第一講書人閱讀 36,916評論 0贊 259
萬榮殺人案實錄
序言：老撾萬榮一對情侶失蹤窝剖，失蹤者是張志新（化名）和其女友劉穎，沒想到半個月后酥夭，有當?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體赐纱，經(jīng)...
沈念sama閱讀 43,382評論 1贊 300
?護林員之死
正文獨居荒郊野嶺守林人離奇死亡，尸身上長有42處帶血的膿包…… 初始之章·張勛以下內(nèi)容為張勛視角年9月15日...
茶點故事閱讀 35,877評論 2贊 323
?白月光啟示錄
正文我和宋清朗相戀三年熬北，在試婚紗的時候發(fā)現(xiàn)自己被綠了疙描。大學時的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片。...
茶點故事閱讀 37,989評論 1贊 333
活死人
序言：一個原本活蹦亂跳的男人離奇死亡讶隐，死狀恐怖起胰，靈堂內(nèi)的尸體忽然破棺而出，到底是詐尸還是另有隱情巫延，我是刑警寧澤效五，帶...
沈念sama閱讀 33,624評論 4贊 322
?日本核電站爆炸內(nèi)幕
正文年R本政府宣布，位于F島的核電站炉峰，受9級特大地震影響畏妖，放射性物質(zhì)發(fā)生泄漏。R本人自食惡果不足惜疼阔，卻給世界環(huán)境...
茶點故事閱讀 39,209評論 3贊 307
男人毒藥：我在死后第九天來索命
文/蒙蒙一戒劫、第九天我趴在偏房一處隱蔽的房頂上張望半夷。院中可真熱鬧，春花似錦谱仪、人聲如沸玻熙。這莊子的主人今日做“春日...
開封第一講書人閱讀 30,199評論 0贊 19
一樁弒父案疯攒，背后竟有這般陰謀
文/蒼蘭香墨我抬頭看了看天上的太陽嗦随。三九已至，卻和暖如春敬尺，著一層夾襖步出監(jiān)牢的瞬間枚尼，已是汗流浹背。一陣腳步聲響...
開封第一講書人閱讀 31,418評論 1贊 260
情欲美人皮
我被黑心中介騙來泰國打工砂吞，沒想到剛下飛機就差點兒被人妖公主榨干…… 1. 我叫王不留署恍，地道東北人。一個月前我還...
沈念sama閱讀 45,401評論 2贊 352
代替公主和親
正文我出身青樓蜻直，卻偏偏與公主長得像盯质，于是被迫代替她去往敵國和親。傳聞我的和親對象是個殘疾皇子概而，可洞房花燭夜當晚...
茶點故事閱讀 42,700評論 2贊 345

10X單細胞空間分析回顧之SPOTlight

一年了，我們都需要總結(jié)鹃答，這一篇我們回顧一下SPOTlight乎澄，非常好的方法，建議大家也總結(jié)一下测摔，看看2021年置济，自己都得到了什么。

Libraries

Load data

Pre-processing,設置種子的功能大家還知道么紊服？檀轨？

Descriptive

Compute marker genes

SPOTlight Decomposition

Assess deconvolution

Visualization

Specific cell-types

Spatial scatterpies

Plot spot composition of spots containing cell-types of interest

Spatial interaction graph

Step-by-Step insight

Downsample data

Train NMF model

Extract matrices form the model:

Look at cell-type specific topic profile

Spot Deconvolution

Run spots through the basis to get the pertinent coefficients. To do this for every spot we are going to set up a system of linear equations where we need to find the coefficient, we will use non-negative least squares to determine the best coefficient fit.

推薦閱讀更多精彩內(nèi)容