流程更新----空間細(xì)胞聚類及配受體共現(xiàn)分析（針對visium退疫、bin模式的Stereo-seq渠缕、以及HD）

作者，Evil Genius

今天我們分享一個(gè)簡答的內(nèi)容褒繁，空間細(xì)胞聚類與配受體共現(xiàn)亦鳞。

今年的空間課程給了大家一個(gè)方法，當(dāng)然了棒坏，都可以用燕差，也都有高分文章引用，我們今天更新一個(gè)方法俊抵，結(jié)果如下：

老粉應(yīng)該有印象分享的是哪篇文章谁不。

針對bin模式的Stereo-seq或者標(biāo)準(zhǔn)模式HD分析，不做圖像分割的情況下徽诲，合并后的superspot都跟visium分析差不多刹帕，需要和單細(xì)胞數(shù)據(jù)一起進(jìn)行解卷積。當(dāng)然了谎替，這就會(huì)有課程上講到的分析偷溺，分子聚類、細(xì)胞聚類钱贯。

解卷積的方法么挫掏，一般都是cell2location、RCTD居多秩命，當(dāng)然了尉共，像CellTrek褒傅、CellScope等方法也都有人引用，分析完拿到空間細(xì)胞矩陣袄友，針對這個(gè)矩陣殿托，也會(huì)有很多的個(gè)性化分析。

我們更新一下這個(gè)空間細(xì)胞聚類的方法剧蚣。分析細(xì)胞類型的空間共現(xiàn)支竹。

簡單的例子

出處：https://github.com/ati-lz/ISCHIA

代碼示例

# Loading required packages
library(ISCHIA)
library(robustbase)
library(data.table)
library(ggplot2)
library(Seurat)
library(dplyr)
library(factoextra)
library(cluster)
library(showtext)
library(gridExtra)
library(pdftools)

# Set random seed for reproducibility
set.seed(123)

# Load data
pdac <- readRDS("/path/to/pdac_mets_rctd.rds")
assay_matrix <- pdac[["rctd_tier1"]]@data
norm_weights <- as.data.frame(t(assay_matrix))

# Elbow Method
k.values <- 1:20
wss_values <- sapply(k.values, function(k) kmeans(norm_weights, k, nstart = 10)$tot.withinss)

pdf("1_elbow_plot.pdf")
plot(k.values, wss_values, type = "b", pch = 19, frame = FALSE, 
     xlab = "Number of clusters K", ylab = "Total within-cluster sum of squares",
     main = "Elbow Method for Optimal K")
dev.off()

# Gap Statistic
gap_stat <- function(k) {
  km.res <- kmeans(norm_weights, k, nstart = 10)
  if (k == 1) return(NA)
  obs_disp <- sum(km.res$withinss)
  reference_disp <- mean(replicate(10, {
    km.null <- kmeans(matrix(rnorm(nrow(norm_weights) * ncol(norm_weights)), 
                             ncol = ncol(norm_weights)), k, nstart = 10)
    sum(km.null$withinss)
  }))
  log(reference_disp) - log(obs_disp)
}

gap_stat_values <- sapply(k.values, gap_stat)

pdf("2_gap_statistic_plot.pdf")
plot(k.values, gap_stat_values, type = "b", pch = 19, frame = FALSE, 
     xlab = "Number of Clusters (K)", ylab = "Gap Statistic",
     main = "Gap Statistic: Determining Optimal K")
dev.off()

# Calinski-Harabasz Index
calinski_harabasz_index <- function(data, labels) {
  num_clusters <- length(unique(labels))
  num_points <- nrow(data)
  centroids <- tapply(data, labels, FUN = colMeans)
  between_disp <- sum(sapply(1:num_clusters, function(i) {
    cluster_points <- data[labels == i, ]
    nrow(cluster_points) * sum((colMeans(cluster_points) - centroids[i, ]) ^ 2)
  }))
  within_disp <- sum(sapply(1:num_clusters, function(i) {
    cluster_points <- data[labels == i, ]
    sum((cluster_points - centroids[i, ]) ^ 2)
  }))
  (between_disp / (num_clusters - 1)) / (within_disp / (num_points - num_clusters))
}

ch_values <- sapply(k.values, function(k) {
  km.res <- kmeans(norm_weights, k, nstart = 10)
  calinski_harabasz_index(norm_weights, km.res$cluster)
})

pdf("3_calinski_harabasz_plot.pdf")
plot(k.values, ch_values, type = "b", pch = 19, frame = FALSE,
     xlab = "Number of Clusters (K)", ylab = "Calinski-Harabasz Index",
     main = "Calinski-Harabasz Index: Determining Optimal K")
dev.off()

# ISCHIA Analysis
pdf("4_composition_cluster_k_plot.pdf")
Composition.cluster.k(norm_weights, 20)
dev.off()

pdac <- Composition.cluster(pdac, norm_weights, 12)
pdac$cc_12 <- pdac$CompositionCluster_CC

# Spatial Dimension Plot
image_names <- c("IU_PDA_T1", "IU_PDA_T2", "IU_PDA_HM2", "IU_PDA_HM2_2", "IU_PDA_NP2", 
                 "IU_PDA_T3", "IU_PDA_HM3", "IU_PDA_T4", "IU_PDA_HM4", "IU_PDA_HM5", 
                 "IU_PDA_T6", "IU_PDA_HM6", "IU_PDA_LNM6", "IU_PDA_LNM7", "IU_PDA_T8", 
                 "IU_PDA_HM8", "IU_PDA_LNM8", "IU_PDA_T9", "IU_PDA_HM9", "IU_PDA_T10", 
                 "IU_PDA_HM10", "IU_PDA_LNM10", "IU_PDA_NP10", "IU_PDA_T11", "IU_PDA_HM11", 
                 "IU_PDA_NP11", "IU_PDA_T12", "IU_PDA_HM12", "IU_PDA_LNM12", "IU_PDA_HM13")

paletteMartin <- c('#e6194b', '#3cb44b', '#ffe119', '#4363d8', '#f58231', '#911eb4', 
                   '#46f0f0', '#f032e6', '#bcf60c', '#fabebe', '#008080', '#e6beff')

all_ccs <- unique(pdac$CompositionCluster_CC)
color_mapping <- setNames(paletteMartin[1:length(all_ccs)], all_ccs)

pdf("5_spatial_plots_K12.pdf", width = 10, height = 7)
for (image_name in image_names) {
  plot <- SpatialDimPlot(pdac, group.by = "CompositionCluster_CC", images = image_name) +
    scale_fill_manual(values = color_mapping) +
    theme_minimal() +
    ggtitle(image_name)
  print(plot)
}
dev.off()

# Enriched Cell Types
save_cc_plot <- function(cc) {
  plot <- Composition_cluster_enrichedCelltypes(pdac, cc, as.matrix(norm_weights))
  pdf_name <- paste0(cc, ".pdf")
  pdf(file = pdf_name)
  print(plot)
  dev.off()
}

ccs <- paste0("CC", 1:12)
for (cc in ccs) {
  save_cc_plot(cc)
}

pdf_files <- paste0("CC", 1:12, ".pdf")
pdf_combine(pdf_files, output = "6_enrichedCelltypes_CC_12.pdf")

# UMAP
pdac.umap <- Composition_cluster_umap(pdac, norm_weights)
pdf("7_umap_pie_chart.pdf")
print(pdac.umap$umap.deconv.gg)
dev.off()

# Add UMAP to Seurat object
emb.umap <- pdac.umap$umap.table
emb.umap$CompositionCluster_CC <- NULL
emb.umap$Slide <- NULL
emb.umap <- as.matrix(emb.umap)
colnames(emb.umap) <- c("UMAP1", "UMAP2")

pdac[['umap.ischia12']] <- CreateDimReducObject(embeddings = emb.umap, key = 'umap.ischia12_', assay = 'rctd_tier1')

pdf("8_seurat_ischia_umap_12.pdf")
DimPlot(pdac, reduction = "umap.ischia12", label = FALSE, group.by="cc_12")
dev.off()

# Bar plots
pdf("9_barplot_SampVsorig_12.pdf", height=12, width=20)
dittoBarPlot(pdac, "orig.ident", group.by = "cc_12")
dev.off()

pdf("10_barplot_origVsSamp_12.pdf", height=10, width=20)
dittoBarPlot(pdac, "cc_12", group.by = "orig.ident")
dev.off()

# Cell type co-occurrence
CC4.celltype.cooccur <- spatial.celltype.cooccurence(spatial.object=pdac, deconv.prob.mat=norm_weights, 
                                                     COI="CC4", prob.th= 0.05, 
                                                     Condition=unique(pdac$orig.ident))
pdf("11_celltype_cooccurrence_CC4.pdf")
plot.celltype.cooccurence(CC4.celltype.cooccur)
dev.off()

最后畫一畫這個(gè)圖

### This script performs the wilcoxon rank sum test and hierarchical clustering on the RCTD tier data to identify the significant abundant cell types between the clusters.
  
#### Load necessary packages
library(Seurat)
library(compositions)
library(tidyverse)
library(clustree)
library(patchwork)
library(uwot)
library(scran)
library(cluster)
library(ggrastr)
library(cowplot)
# library(conflicted) # to be loaded in case of a conflict arises.
 
config <- config::get()
 
# source(here::here("pdac_nac", "visualization", "eda.R"))
 
### Load the seurat object and get the proportions data
so <- readRDS(here::here(config$data_processed, "06-pdac_CC10_msig.rds"))
 
# Join with metadata if needed
metadata <- so@meta.data %>%
  select(orig.ident, patient_id, neoadjuvant_chemo, CompositionCluster_CC) %>%
  rownames_to_column("row_id")
 
 
# Get the proportions data
rctd_tier2 <- t(so@assays$rctd_tier2@data)
 
# Ensure the data is in the right format
rownames(rctd_tier2) <- make.unique(rownames(rctd_tier2))
 
# Log transformation of rctd_tier1
log_comps <- log10(rctd_tier2)
## Perform the summary statistics
We perform the summary statistics for the RCTD tier data. We perform hierarchical clustering and do the wilcoxon rank sum test to identify the differentially abundant cell types between the clusters. 
#### Prepare the data for the summary statistics
# Prepare data for summary statistics
cluster_summary_pat <- rctd_tier2 %>%
  as.data.frame() %>%
  rownames_to_column("row_id") %>%
  left_join(metadata, by = "row_id") %>%  # Join with meta_data using row_id as the key
  pivot_longer(-c(row_id, orig.ident, patient_id, neoadjuvant_chemo, CompositionCluster_CC), values_to = "ct_prop", names_to = "cell_type") %>%
  group_by(orig.ident, patient_id, neoadjuvant_chemo, CompositionCluster_CC, cell_type) %>%
  summarize(median_ct_prop = median(ct_prop, na.rm = TRUE))

# Aggregate data for median ct prop
cluster_summary <- cluster_summary_pat %>%
  ungroup() %>%
  group_by(CompositionCluster_CC, cell_type) %>%
  summarize(patient_median_ct_prop = median(median_ct_prop, na.rm = TRUE))
 
# Prepare matrix for hierarchical clustering
cluster_summary_mat <- cluster_summary %>%
  pivot_wider(values_from = patient_median_ct_prop, names_from = cell_type, values_fill = list(patient_median_ct_prop = 0)) %>%
  column_to_rownames("CompositionCluster_CC") %>%
  as.matrix()
 
# conflicts_prefer(stats::"dist") # resolve conflicts between %*% functions

# Perform hierarchical clustering
cluster_order <- hclust(dist(cluster_summary_mat))$labels[hclust(dist(cluster_summary_mat))$order] # use this if you want to order the clusters based on the hierarchical clustering
ct_order <- hclust(dist(t(cluster_summary_mat)))$labels[hclust(dist(t(cluster_summary_mat)))$order]
 
# Order Clusters in ascending order
# cluster_order1 <- c("CC1", "CC2", "CC3", "CC4", "CC5", "CC6", "CC7", "CC8", "CC9", "CC10")
 
# Wilcoxon test for characteristic cell types
run_wilcox_up <- function(prop_data) {
  prop_data_group <- prop_data[["CompositionCluster_CC"]] %>% unique() %>% set_names()
  map(prop_data_group, function(g) {
    test_data <- prop_data %>%
      mutate(test_group = ifelse(CompositionCluster_CC == g, "target", "rest")) %>%
      mutate(test_group = factor(test_group, levels = c("target", "rest")))
    wilcox.test(median_ct_prop ~ test_group, data = test_data, alternative = "greater") %>%
      broom::tidy()
  }) %>% enframe("CompositionCluster_CC") %>% unnest()
}

 
wilcoxon_res <- cluster_summary_pat %>%
  ungroup() %>%
  group_by(cell_type) %>%
  nest() %>%
  mutate(wres = map(data, run_wilcox_up)) %>%
  dplyr::select(wres) %>%
  unnest() %>%
  ungroup() %>%
  mutate(p_corr = p.adjust(p.value)) %>%
  mutate(significant = ifelse(p_corr <= 0.15, "*", ""))

 
#### Save the summary statistics

# give the path to save the summary statistics
file_path_cluster_summ <- here::here(config$data_interim,  "summary_of_clusters.txt")
file_path_wilcox_res <- here::here(config$data_interim, ? "wilcoxon_res_cells_clusters.txt")
 
# Save the summary statistics
write.table(cluster_summary_pat, file = file_path_cluster_summ, col.names = TRUE, row.names = FALSE, quote = FALSE, sep = "\t")
write.table(wilcoxon_res, file = file_path_wilcox_res, col.names = TRUE, row.names = FALSE, quote = FALSE, sep = "\t")

 
#### Plot the summary statistics
 
# Plotting mean ct prop and barplots
mean_ct_prop_plt <- cluster_summary %>%
  left_join(wilcoxon_res, by = c("CompositionCluster_CC", "cell_type")) %>%
  mutate(cell_type = factor(cell_type, levels = ct_order), CompositionCluster_CC = factor(CompositionCluster_CC, levels = cluster_order)) %>%
  ungroup() %>%
  group_by(cell_type) %>%
  mutate(scaled_pat_median = (patient_median_ct_prop - mean(patient_median_ct_prop)) / sd(patient_median_ct_prop)) %>%
  ungroup() %>%
  ggplot(aes(x = cell_type, y = CompositionCluster_CC, fill = scaled_pat_median)) +
  geom_tile(color = "black") +
  geom_text(aes(label = significant)) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5, size = 12), legend.position = "bottom", plot.margin = unit(c(0, 0, 0, 0), "cm"), axis.text.y = element_text(size = 12)) +
  scale_fill_gradient2()
 
cluster_counts <- cluster_info %>%
  dplyr::select_at(c("row_id", "CompositionCluster_CC")) %>%
  group_by(CompositionCluster_CC) %>%
  summarize(nspots = length(CompositionCluster_CC)) %>%
  mutate(prop_spots = nspots / sum(nspots))
 
file_path_cluster_prop_summ <- here::here(config$data_interim, "cluster_prop_summary.csv")
 
write_csv(cluster_counts, file_path_cluster_prop_summ)
 
#barplots for cluster counts
barplts <- cluster_counts %>%
  mutate(CompositionCluster_CC = factor(CompositionCluster_CC, levels = cluster_order)) %>%
  ggplot(aes(y = CompositionCluster_CC, x = prop_spots)) +
  geom_bar(stat = "identity") +
  theme_classic() + ylab("") +
  theme(axis.text.y = element_blank(), plot.margin = unit(c(0, 0, 0, 0), "cm"), axis.text.x = element_text(size = 12))
 
cluster_summary_plt <- cowplot::plot_grid(mean_ct_prop_plt, barplts, align = "hv", axis = "tb") # use if barplots are needed to show the spot counts otherwise directly use mean_ct_prop_plt for the plot

 
#### plot the summary clusters
 
pdf_path_summ_clust <- here::here(config$plots, "wilcox_summary_clusters.pdf")
 
pdf(pdf_path_summ_clust, width = 20, height = 10)
plot(cluster_summary_plt)
dev.off()
 
#box plot for median ct prop
pdf_path_boxplot <- here::here(config$plots,  "wilcox_bboxplot_median_ct_prop.pdf")
 
plt <- cluster_summary_pat %>%
  ggplot(aes(x = CompositionCluster_CC, y = median_ct_prop)) +
  geom_boxplot() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5)) +
  facet_wrap(. ~ cell_type, ncol = 3, scales = "free_y")
 
pdf(pdf_path_boxplot, width = 20, height = 10)
plot(plt)
dev.off()

生活很好，有你更好

最后編輯于：2024.11.18 10:31:48

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者

人面猴
序言：七十年代末鸠按，一起剝皮案震驚了整個(gè)濱河市礼搁，隨后出現(xiàn)的幾起案子，更是在濱河造成了極大的恐慌目尖，老刑警劉巖馒吴，帶你破解...
沈念sama閱讀 206,013評論 6贊 481
死咒
序言：濱河連續(xù)發(fā)生了三起死亡事件，死亡現(xiàn)場離奇詭異卑雁，居然都是意外死亡募书，警方通過查閱死者的電腦和手機(jī)，發(fā)現(xiàn)死者居然都...
沈念sama閱讀 88,205評論 2贊 382
救了他兩次的神仙讓他今天三更去死
文/潘曉璐我一進(jìn)店門测蹲，熙熙樓的掌柜王于貴愁眉苦臉地迎上來莹捡，“玉大人，你說我怎么就攤上這事扣甲±河” “怎么了？”我有些...
開封第一講書人閱讀 152,370評論 0贊 342
道士緝兇錄：失蹤的賣姜人
文/不壞的土叔我叫張陵琉挖，是天一觀的道長启泣。經(jīng)常有香客問我，道長示辈，這世上最難降的妖魔是什么寥茫？我笑而不...
開封第一講書人閱讀 55,168評論 1贊 278
?港島之戀（遺憾婚禮）
正文為了忘掉前任，我火速辦了婚禮矾麻，結(jié)果婚禮上纱耻，老公的妹妹穿的比我還像新娘。我一直安慰自己险耀，他們只是感情好弄喘，可當(dāng)我...
茶點(diǎn)故事閱讀 64,153評論 5贊 371
惡毒庶女頂嫁案：這布局不是一般人想出來的
文/花漫我一把揭開白布。她就那樣靜靜地躺著甩牺，像睡著了一般蘑志。火紅的嫁衣襯著肌膚如雪。梳的紋絲不亂的頭發(fā)上，一...
開封第一講書人閱讀 48,954評論 1贊 283
城市分裂傳說
那天急但，我揣著相機(jī)與錄音澎媒，去河邊找鬼。笑死羊始，一個(gè)胖子當(dāng)著我的面吹牛旱幼，可吹牛的內(nèi)容都是我干的。我是一名探鬼主播突委，決...
沈念sama閱讀 38,271評論 3贊 399
雙鴛鴦連環(huán)套：你想象不到人心有多黑
文/蒼蘭香墨我猛地睜開眼，長吁一口氣：“原來是場噩夢啊……” “哼冬三！你這毒婦竟也來了匀油？” 一聲冷哼從身側(cè)響起，我...
開封第一講書人閱讀 36,916評論 0贊 259
萬榮殺人案實(shí)錄
序言：老撾萬榮一對情侶失蹤勾笆，失蹤者是張志新（化名）和其女友劉穎敌蚜，沒想到半個(gè)月后，有當(dāng)?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體窝爪，經(jīng)...
沈念sama閱讀 43,382評論 1贊 300
?護(hù)林員之死
正文獨(dú)居荒郊野嶺守林人離奇死亡弛车，尸身上長有42處帶血的膿包…… 初始之章·張勛以下內(nèi)容為張勛視角年9月15日...
茶點(diǎn)故事閱讀 35,877評論 2贊 323
?白月光啟示錄
正文我和宋清朗相戀三年，在試婚紗的時(shí)候發(fā)現(xiàn)自己被綠了蒲每。大學(xué)時(shí)的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片纷跛。...
茶點(diǎn)故事閱讀 37,989評論 1贊 333
活死人
序言：一個(gè)原本活蹦亂跳的男人離奇死亡，死狀恐怖邀杏，靈堂內(nèi)的尸體忽然破棺而出贫奠，到底是詐尸還是另有隱情，我是刑警寧澤望蜡，帶...
沈念sama閱讀 33,624評論 4贊 322
?日本核電站爆炸內(nèi)幕
正文年R本政府宣布唤崭，位于F島的核電站，受9級特大地震影響脖律，放射性物質(zhì)發(fā)生泄漏谢肾。R本人自食惡果不足惜，卻給世界環(huán)境...
茶點(diǎn)故事閱讀 39,209評論 3贊 307
男人毒藥：我在死后第九天來索命
文/蒙蒙一小泉、第九天我趴在偏房一處隱蔽的房頂上張望芦疏。院中可真熱鬧，春花似錦膏孟、人聲如沸眯分。這莊子的主人今日做“春日...
開封第一講書人閱讀 30,199評論 0贊 19
一樁弒父案柒桑，背后竟有這般陰謀
文/蒼蘭香墨我抬頭看了看天上的太陽弊决。三九已至，卻和暖如春，著一層夾襖步出監(jiān)牢的瞬間飘诗，已是汗流浹背与倡。一陣腳步聲響...
開封第一講書人閱讀 31,418評論 1贊 260
情欲美人皮
我被黑心中介騙來泰國打工，沒想到剛下飛機(jī)就差點(diǎn)兒被人妖公主榨干…… 1. 我叫王不留昆稿，地道東北人纺座。一個(gè)月前我還...
沈念sama閱讀 45,401評論 2贊 352
代替公主和親
正文我出身青樓，卻偏偏與公主長得像溉潭，于是被迫代替她去往敵國和親净响。傳聞我的和親對象是個(gè)殘疾皇子，可洞房花燭夜當(dāng)晚...
茶點(diǎn)故事閱讀 42,700評論 2贊 345

流程更新----空間細(xì)胞聚類及配受體共現(xiàn)分析（針對visium场仲、bin模式的Stereo-seq、以及HD）

流程更新----空間細(xì)胞聚類及配受體共現(xiàn)分析（針對visium退疫、bin模式的Stereo-seq渠缕、以及HD）

作者，Evil Genius

今天我們分享一個(gè)簡答的內(nèi)容褒繁，空間細(xì)胞聚類與配受體共現(xiàn)亦鳞。

今年的空間課程給了大家一個(gè)方法，當(dāng)然了棒坏，都可以用燕差，也都有高分文章引用，我們今天更新一個(gè)方法俊抵，結(jié)果如下：

老粉應(yīng)該有印象分享的是哪篇文章谁不。

解卷積的方法么挫掏，一般都是cell2location、RCTD居多秩命，當(dāng)然了尉共，像CellTrek褒傅、CellScope等方法也都有人引用，分析完拿到空間細(xì)胞矩陣袄友，針對這個(gè)矩陣殿托，也會(huì)有很多的個(gè)性化分析。

我們更新一下這個(gè)空間細(xì)胞聚類的方法剧蚣。分析細(xì)胞類型的空間共現(xiàn)支竹。

簡單的例子

出處：https://github.com/ati-lz/ISCHIA

代碼示例

最后畫一畫這個(gè)圖

生活很好，有你更好

推薦閱讀更多精彩內(nèi)容