單細(xì)胞測序分析: Seurat V3聯(lián)合harmony進(jìn)行單細(xì)胞數(shù)據(jù)整合分析

Harmony integrates spatially resolved transcriptomic with dissociated scRNAseq datasets

#R包導(dǎo)入

library(Seurat)
library(cowplot)
library(harmony)

#數(shù)據(jù)

數(shù)據(jù)下載pbmc_stim
數(shù)據(jù)導(dǎo)入

load('data/pbmc_stim.RData')

#初始化Seurat 對象

使用Harmony之前师枣，需要構(gòu)建一個(gè)Seurat對象，進(jìn)行一個(gè)標(biāo)準(zhǔn)的Seurat分析（包括PCA）。

Seurat中使用Harmony的流程與常規(guī)流程的區(qū)別是：可以所有細(xì)胞創(chuàng)建一個(gè)Seurat 對象黔夭，不需要為每個(gè)數(shù)據(jù)集創(chuàng)建一個(gè)Seurat 對象（Seurat list）帽哑。

pbmc <- CreateSeuratObject(counts = cbind(stim.sparse, ctrl.sparse), project = "PBMC", min.cells = 5) %>%
    Seurat::NormalizeData(verbose = FALSE) %>%
    FindVariableFeatures(selection.method = "vst", nfeatures = 2000) %>% 
    ScaleData(verbose = FALSE) %>% 
    RunPCA(pc.genes = pbmc@var.genes, npcs = 20, verbose = FALSE)

在object的metadata中定義細(xì)胞ID信息奇适，變量名為stim.

pbmc@meta.data$stim <- c(rep("STIM", ncol(stim.sparse)), rep("CTRL", ncol(ctrl.sparse)))

現(xiàn)在還沒有使用Harmony矯正主成分分析結(jié)果的數(shù)據(jù)，數(shù)據(jù)集之間還是有很大的差異芦鳍。

options(repr.plot.height = 5, repr.plot.width = 12)
p1 <- DimPlot(object = pbmc, reduction = "pca", pt.size = .1, group.by = "stim", do.return = TRUE)
p2 <- VlnPlot(object = pbmc, features = "PC_1", group.by = "stim", do.return = TRUE, pt.size = .1)
plot_grid(p1,p2)

Before Harmony

#運(yùn)行Harmony進(jìn)行數(shù)據(jù)整合(矯正批次效應(yīng))

輸入：使用Harmony嚷往，需要一個(gè)Seurat 對象和指定metadata信息中需要整合的變量名。
輸出：返回一個(gè)Seurat對象柠衅，以及矯正之后的Harmony 信息
plot_convergenc參數(shù)設(shè)置為TRUE皮仁，保證Harmony 在運(yùn)行中每一次迭代都在使矯正越累越好。

ptions(repr.plot.height = 2.5, repr.plot.width = 6)
pbmc <- pbmc %>% 
    RunHarmony("stim", plot_convergence = TRUE)

RunHarmony

##獲取Harmony 矯正之后的信息菲宴，使用Embeddings()函數(shù)

harmony_embeddings <- Embeddings(pbmc, 'harmony')
harmony_embeddings[1:5, 1:5]

harmony embeddings

##查看數(shù)據(jù)Harmony整合之后的前兩個(gè)維度上數(shù)據(jù)是不是很好的整合贷祈，最好是很好的整合結(jié)果。

options(repr.plot.height = 5, repr.plot.width = 12)
p1 <- DimPlot(object = pbmc, reduction = "harmony", pt.size = .1, group.by = "stim", do.return = TRUE)
p2 <- VlnPlot(object = pbmc, features = "harmony_1", group.by = "stim", do.return = TRUE, pt.size = .1)
plot_grid(p1,p2)

After harmony

#下游分析

下游分析都是基于Harmony矯正之后的PCA降維數(shù)據(jù)喝峦，不是基因表達(dá)數(shù)據(jù)和直接的PCA降維數(shù)據(jù)势誊。設(shè)置reduction = 'harmony'，后續(xù)分析就會(huì)調(diào)用Harmony矯正之后的PCA降維數(shù)據(jù)谣蠢。

使用Harmony 矯正之后的數(shù)據(jù)粟耻，UMAP 和 Nearest Neighbor分析。

pbmc <- pbmc %>% 
    RunUMAP(reduction = "harmony", dims = 1:20) %>% 
    FindNeighbors(reduction = "harmony", dims = 1:20) %>% 
    FindClusters(resolution = 0.5) %>% 
    identity()

##UMAP 結(jié)果

options(repr.plot.height = 4, repr.plot.width = 10)
DimPlot(pbmc, reduction = "umap", group.by = "stim", pt.size = .1, split.by = 'stim')

Umap

##聚類分析

options(repr.plot.height = 4, repr.plot.width = 6)
DimPlot(pbmc, reduction = "umap", label = TRUE, pt.size = .1)

Cluster analysis

#參考：

Seurat V3 with harmony

最后編輯于：2020.09.16 10:44:31

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者