Integrative analysis in Seurat v5
Reference
Introduction
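Integrating single-cell sequencing datasets, for example across experimental batches, donors, or conditions, is often an important step in scRNA-seq workflows. Integrative analysis helps match shared cell types and states across datasets, which can increase statistical power and, most importantly, supports accurate comparative analysis across datasets. [Note: choose a suitable integration method to remove the cell-to-cell differences caused by batch effects, so that the analysis focuses on the true biological clustering and differential expression of the cells.]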
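Seurat v5 provides streamlined integrative analysis through the IntegrateLayers function, which currently supports five methods. Each method performs the integration in low-dimensional space and returns a dimensional reduction (i.e. integrated.xxx) that aims to co-embed shared cell types across batches.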
- Anchor-based CCA integration (method=CCAIntegration)
- Anchor-based RPCA integration (method=RPCAIntegration)
- Harmony (method=HarmonyIntegration)
- FastMNN (method=FastMNNIntegration)
- scVI (method=scVIIntegration)
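The calls in the rest of this section assume that `obj` already holds one layer per batch, a `pca` reduction, and an unintegrated clustering, none of which is created here. A minimal sketch of that preparation, following the usual Seurat v5 workflow, might look like the following. The helper name `prepare_for_integration` and its arguments are hypothetical; the default `"Method"` is the metadata column used for grouping in the plots later on, and the assay is assumed to be named `RNA`.

```r
# Hypothetical helper: split the RNA assay by batch and run the standard
# unintegrated workflow, so that IntegrateLayers() has a "pca" reduction to start from.
prepare_for_integration <- function(obj, batch_col = "Method", dims = 1:30) {
  stopifnot(requireNamespace("Seurat", quietly = TRUE))

  # One layer per batch; assumes `batch_col` is a column of the cell metadata.
  obj[["RNA"]] <- split(obj[["RNA"]], f = obj@meta.data[[batch_col]])

  # Standard preprocessing, run per layer.
  obj <- Seurat::NormalizeData(obj)
  obj <- Seurat::FindVariableFeatures(obj)
  obj <- Seurat::ScaleData(obj)
  obj <- Seurat::RunPCA(obj)

  # Unintegrated baseline, kept for comparison with the integrated results.
  obj <- Seurat::FindNeighbors(obj, dims = dims, reduction = "pca")
  obj <- Seurat::FindClusters(obj, resolution = 2, cluster.name = "unintegrated_clusters")
  obj <- Seurat::RunUMAP(obj, dims = dims, reduction = "pca", reduction.name = "umap.unintegrated")
  obj
}
```

Under these assumptions, `obj <- prepare_for_integration(obj)` would produce the `pca`, `unintegrated_clusters`, and `umap.unintegrated` entries that the later code refers to.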
Pick one method and integrate with IntegrateLayers
obj <- IntegrateLayers(
object = obj, method = CCAIntegration,
orig.reduction = "pca", new.reduction = "integrated.cca",
verbose = FALSE
)
obj <- IntegrateLayers(
object = obj, method = RPCAIntegration,
orig.reduction = "pca", new.reduction = "integrated.rpca",
verbose = FALSE
)
obj <- IntegrateLayers(
object = obj, method = HarmonyIntegration,
orig.reduction = "pca", new.reduction = "harmony",
verbose = FALSE
)
obj <- IntegrateLayers(
object = obj, method = FastMNNIntegration,
new.reduction = "integrated.mnn",
verbose = FALSE
)
# scVI integration requires `reticulate`, which can be installed from CRAN (`install.packages("reticulate")`), as well as `scvi-tools` and its dependencies installed in a conda environment.
# See the scVI installation instructions: https://docs.scvi-tools.org/en/stable/installation.html
obj <- IntegrateLayers(
object = obj, method = scVIIntegration,
new.reduction = "integrated.scvi",
conda_env = "../miniconda3/envs/scvi-env", verbose = FALSE
)
Pick one method, then visualize and cluster
- FindNeighbors(), FindClusters(), RunUMAP()
## CCA--------------------------------------------------
obj <- FindNeighbors(obj, reduction = "integrated.cca", dims = 1:30)
obj <- FindClusters(obj, resolution = 2, cluster.name = "cca_clusters")
obj <- RunUMAP(obj, reduction = "integrated.cca", dims = 1:30, reduction.name = "umap.cca")
p1 <- DimPlot(
obj,
reduction = "umap.cca",
group.by = c("Method", "predicted.celltype.l2", "cca_clusters"),
combine = FALSE, label.size = 2
)
## SCVI--------------------------------------------------
obj <- FindNeighbors(obj, reduction = "integrated.scvi", dims = 1:30)
obj <- FindClusters(obj, resolution = 2, cluster.name = "scvi_clusters")
obj <- RunUMAP(obj, reduction = "integrated.scvi", dims = 1:30, reduction.name = "umap.scvi")
p2 <- DimPlot(
obj,
reduction = "umap.scvi",
group.by = c("Method", "predicted.celltype.l2", "scvi_clusters"),
combine = FALSE, label.size = 2
)
# wrap_plots() comes from the patchwork package
wrap_plots(c(p1, p2), ncol = 2, byrow = FALSE)
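- When choosing a method, the main consideration is how much biological information the clustering preserves. [Check how specific the marker genes are to each cluster: cells should cluster together by cell type rather than by batch. A marker gene that is highly expressed in several clusters is a warning sign.]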
Compare integration results: marker genes
p1 <- VlnPlot(
obj,
features = "rna_CD8A", group.by = "unintegrated_clusters"
) + NoLegend() + ggtitle("CD8A - Unintegrated Clusters")
p2 <- VlnPlot(
obj, "rna_CD8A",
group.by = "cca_clusters"
) + NoLegend() + ggtitle("CD8A - CCA Clusters")
p3 <- VlnPlot(
obj, "rna_CD8A",
group.by = "scvi_clusters"
) + NoLegend() + ggtitle("CD8A - scVI Clusters")
p1 | p2 | p3
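Alongside marker-gene specificity, the question of whether cells group by type or by batch can be checked numerically. The sketch below is one simple way to do that in base R: it computes the fraction of each batch within each cluster. The function name `batch_fraction_by_cluster` and the toy vectors are hypothetical and purely illustrative.

```r
# Fraction of cells from each batch within each cluster (each row sums to 1).
batch_fraction_by_cluster <- function(clusters, batch) {
  stopifnot(length(clusters) == length(batch))
  prop.table(table(cluster = clusters, batch = batch), margin = 1)
}

# Toy data (not from the dataset): cluster "0" is mixed, cluster "1" is a single batch.
clusters <- c("0", "0", "0", "0", "1", "1")
batch    <- c("A", "B", "A", "B", "A", "A")

frac <- batch_fraction_by_cluster(clusters, batch)
print(round(frac, 2))

# Largest single-batch share per cluster; values close to 1 flag clusters
# that may be driven by batch rather than by cell type.
max_share <- apply(frac, 1, max)
print(max_share)
```

On the Seurat object used here, the same function could be called as `batch_fraction_by_cluster(obj$cca_clusters, obj$Method)`. A cluster dominated by one batch is not necessarily an artifact, since a cell type may genuinely be present in only one dataset, so this is a prompt for closer inspection rather than a verdict.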
- See how the clusters obtained after CCA integration are distributed in the other embeddings:
obj <- RunUMAP(obj, reduction = "integrated.rpca", dims = 1:30, reduction.name = "umap.rpca")
p4 <- DimPlot(obj, reduction = "umap.unintegrated", group.by = c("cca_clusters"))
p5 <- DimPlot(obj, reduction = "umap.rpca", group.by = c("cca_clusters"))
p6 <- DimPlot(obj, reduction = "umap.scvi", group.by = c("cca_clusters"))
p4 | p5 | p6
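The visual comparison above can be complemented by a cross-tabulation of two sets of cluster labels. The sketch below, in base R, reports for each reference cluster the best-matching cluster in another clustering and the fraction of cells they share. The function name `best_match` and the toy labels are hypothetical.

```r
# For each cluster in `ref`, find the cluster in `other` that contains most of
# its cells, and the fraction of the reference cluster that falls into it.
best_match <- function(ref, other) {
  stopifnot(length(ref) == length(other))
  tab <- table(ref = ref, other = other)
  data.frame(
    ref_cluster = rownames(tab),
    best_other  = colnames(tab)[apply(tab, 1, which.max)],
    overlap     = apply(tab, 1, max) / rowSums(tab),
    row.names   = NULL
  )
}

# Toy labels (not from the dataset) standing in for two clusterings of six cells.
cca  <- c("0", "0", "0", "1", "1", "2")
scvi <- c("a", "a", "b", "b", "b", "c")

res <- best_match(cca, scvi)
print(res)
```

With the object from this section, `best_match(obj$cca_clusters, obj$scvi_clusters)` would summarize how the CCA clusters map onto the scVI clusters; low overlap values point to clusters that one method splits or merges differently from the other.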
Use the chosen integration result as the new layer for downstream analysis
Seurat v5 assays store data in layers. These layers can store raw, un-normalized counts (layer='counts'), normalized data (layer='data'), or z-scored/variance-stabilized data (layer='scale.data').
obj <- JoinLayers(obj)
obj
## An object of class Seurat
## 35789 features across 10434 samples within 5 assays
## Active assay: RNA (33694 features, 2000 variable features)
## 3 layers present: data, counts, scale.data
## 4 other assays present: prediction.score.celltype.l1, prediction.score.celltype.l2, prediction.score.celltype.l3, mnn.reconstructed
## 12 dimensional reductions calculated: integrated_dr, ref.umap, pca, umap.unintegrated, integrated.cca, integrated.rpca, harmony, integrated.mnn, integrated.scvi, umap.cca, umap.scvi, umap.rpca
- Example: SCT normalization followed by integration:
# Raise the size limit for globals passed to future workers (about 3 GB) before SCTransform
options(future.globals.maxSize = 3e+09)
obj <- SCTransform(obj)
obj <- RunPCA(obj, npcs = 30, verbose = FALSE)
obj <- IntegrateLayers(
object = obj,
method = RPCAIntegration,
normalization.method = "SCT",
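verbose = FALSE
)
# No new.reduction was set above, so the result is stored under the default name "integrated.dr"
obj <- FindNeighbors(obj, dims = 1:30, reduction = "integrated.dr")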
obj <- FindClusters(obj, resolution = 2)