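Standard Seurat workflow
The standard Seurat workflow takes raw single-cell expression data and aims to find clusters within the data. For full details, please read our tutorial. The process consists of data normalization and variable feature selection, data scaling, a PCA on the variable features, construction of a shared nearest neighbor graph, and clustering using a modularity optimizer. Finally, we use t-SNE to visualize the clusters in a two-dimensional space.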
# Load Seurat (assumes the package is already installed)
library(Seurat)
pbmc.counts <- Read10X(data.dir = "~/Downloads/pbmc3k/filtered_gene_bc_matrices/hg19/")
pbmc <- CreateSeuratObject(counts = pbmc.counts)
pbmc <- NormalizeData(object = pbmc)
pbmc <- FindVariableFeatures(object = pbmc)
pbmc <- ScaleData(object = pbmc)
pbmc <- RunPCA(object = pbmc)
pbmc <- FindNeighbors(object = pbmc)
pbmc <- FindClusters(object = pbmc)
pbmc <- RunTSNE(object = pbmc)
DimPlot(object = pbmc, reduction = "tsne")
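To make the normalization and scaling steps in the workflow a little more concrete, here is a minimal base-R sketch of the arithmetic that NormalizeData and ScaleData perform with their default settings, as we understand them: each cell's counts are divided by that cell's total, multiplied by a scale factor of 10,000 and log1p-transformed, and each feature is then centered, scaled across cells and clipped at 10. The toy matrix `toy_counts` and the helpers `log_normalize` and `scale_features` are made-up names for illustration and are not part of Seurat, whose own implementation works on sparse matrices and differs in detail.

```r
# Toy count matrix: 4 genes (rows) x 3 cells (columns). Values are made up.
toy_counts <- matrix(
  c(1, 0, 3,
    0, 2, 1,
    5, 1, 0,
    4, 7, 6),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3))
)

# Hypothetical helper mirroring the log-normalization idea:
# divide each cell's counts by that cell's total, multiply by a
# scale factor, then apply log1p.
log_normalize <- function(counts, scale.factor = 10000) {
  totals <- colSums(counts)
  log1p(sweep(counts, 2, totals, "/") * scale.factor)
}

# Hypothetical helper mirroring the scaling idea:
# center and scale each gene across cells, then clip large values.
scale_features <- function(data, scale.max = 10) {
  scaled <- t(scale(t(data)))
  scaled[scaled > scale.max] <- scale.max
  scaled
}

toy_norm <- log_normalize(toy_counts)
toy_scaled <- scale_features(toy_norm)
```

Undoing the log transform with `expm1(toy_norm)` gives columns that each sum to the scale factor, which is a quick sanity check on the normalization.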
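Seurat object interaction
With Seurat v3.0, we have made improvements to the Seurat object and added new methods for user interaction. We have also introduced simple functions for common tasks, such as subsetting and merging, that mirror standard R functions.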
# Get cell and feature names, and total numbers
colnames(x = pbmc)
Cells(object = pbmc)
rownames(x = pbmc)
ncol(x = pbmc)
nrow(x = pbmc)
# Get cell identity classes
Idents(object = pbmc)
levels(x = pbmc)
# Stash cell identity classes
pbmc[["old.ident"]] <- Idents(object = pbmc)
pbmc <- StashIdent(object = pbmc, save.name = "old.ident")
# Set identity classes
Idents(object = pbmc) <- "CD4 T cells"
Idents(object = pbmc, cells = 1:10) <- "CD4 T cells"
# Set identity classes to an existing column in meta data
Idents(object = pbmc, cells = 1:10) <- "orig.ident"
Idents(object = pbmc) <- "orig.ident"
# Rename identity classes
pbmc <- RenameIdents(object = pbmc, `CD4 T cells` = "T Helper cells")
# Subset Seurat object based on identity class, also see ?SubsetData
subset(x = pbmc, idents = "B cells")
subset(x = pbmc, idents = c("CD4 T cells", "CD8 T cells"), invert = TRUE)
# Subset on the expression level of a gene/feature
subset(x = pbmc, subset = MS4A1 > 3)
# Subset on a combination of criteria
subset(x = pbmc, subset = MS4A1 > 3 & PC_1 > 5)
subset(x = pbmc, subset = MS4A1 > 3, idents = "B cells")
# Subset on a value in the object meta data
subset(x = pbmc, subset = orig.ident == "Replicate1")
# Downsample the number of cells per identity class
subset(x = pbmc, downsample = 100)
# Merge two Seurat objects
merge(x = pbmc1, y = pbmc2)
# Merge more than two Seurat objects
merge(x = pbmc1, y = list(pbmc2, pbmc3))
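As a rough illustration of what `downsample = 100` asks for in the subset call above, the sketch below keeps at most a fixed number of randomly chosen cells per identity class, using base R only. The vector `toy_idents` and the helper `downsample_cells` are hypothetical names invented for this example; Seurat's own sampling may differ in detail.

```r
set.seed(42)  # make the random sampling reproducible

# Made-up identity classes for 10 cells, named by cell.
toy_idents <- c(rep("B cells", 6), rep("CD4 T cells", 3), "NK cells")
names(toy_idents) <- paste0("cell", seq_along(toy_idents))

# Hypothetical helper: keep at most `n` randomly chosen cells per class.
downsample_cells <- function(idents, n) {
  cells_by_class <- split(names(idents), idents)
  kept <- lapply(cells_by_class, function(cells) {
    # Classes that are already small enough are kept whole.
    if (length(cells) > n) sample(cells, size = n) else cells
  })
  unlist(kept, use.names = FALSE)
}

kept_cells <- downsample_cells(toy_idents, n = 2)
table(toy_idents[kept_cells])
```

Classes with fewer cells than the limit are left untouched, so downsampling only ever shrinks the larger classes.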
Data access
Accessing data in Seurat is simple: clearly defined accessors and setters let you quickly find the data you need.
# View metadata data frame, stored in object@meta.data
pbmc[[]]
# Retrieve specific values from the metadata
pbmc$nCount_RNA
pbmc[[c("percent.mito", "nFeature_RNA")]]
# Add metadata, see ?AddMetaData
random_group_labels <- sample(x = c("g1", "g2"), size = ncol(x = pbmc), replace = TRUE)
pbmc$groups <- random_group_labels
# Retrieve or set data in an expression matrix ('counts', 'data', and 'scale.data')
GetAssayData(object = pbmc, slot = "counts")
pbmc <- SetAssayData(object = pbmc, slot = "scale.data", new.data = new.data)
# Get cell embeddings and feature loadings
Embeddings(object = pbmc, reduction = "pca")
Loadings(object = pbmc, reduction = "pca")
Loadings(object = pbmc, reduction = "pca", projected = TRUE)
# FetchData can pull anything from expression matrices, cell embeddings, or metadata
FetchData(object = pbmc, vars = c("PC_1", "percent.mito", "MS4A1"))
Visualization in Seurat
Seurat has a vast, ggplot2-based plotting library. By default, all plotting functions return a ggplot2 plot, which allows easy customization with ggplot2.
# Dimensional reduction plot for PCA or tSNE
DimPlot(object = pbmc, reduction = "tsne")
DimPlot(object = pbmc, reduction = "pca")
# Dimensional reduction plot, with cells colored by a quantitative feature
FeaturePlot(object = pbmc, features = "MS4A1")
# Scatter plot across single cells, replaces GenePlot
FeatureScatter(object = pbmc, feature1 = "MS4A1", feature2 = "PC_1")
FeatureScatter(object = pbmc, feature1 = "MS4A1", feature2 = "CD3D")
# Scatter plot across individual features, replaces CellPlot
CellScatter(object = pbmc, cell1 = "AGTCTACTAGGGTG", cell2 = "CACAGATGGTTTCT")
VariableFeaturePlot(object = pbmc)
# Violin and Ridge plots
VlnPlot(object = pbmc, features = c("LYZ", "CCL5", "IL32"))
RidgePlot(object = pbmc, features = c("LYZ", "CCL5", "IL32"))
# Heatmaps
DoHeatmap(object = pbmc, features = heatmap_markers)
DimHeatmap(object = pbmc, reduction = "pca", cells = 200)
# New things to try! Note that plotting functions now return ggplot2 objects, so you can add themes, titles, and options
# onto them
VlnPlot(object = pbmc, features = "MS4A1", split.by = "groups")
DotPlot(object = pbmc, features = c("LYZ", "CCL5", "IL32"), split.by = "groups")
FeaturePlot(object = pbmc, features = c("MS4A1", "CD79A"), blend = TRUE)
DimPlot(object = pbmc) + DarkTheme()
DimPlot(object = pbmc) + labs(title = "2,700 PBMCs clustered using Seurat and viewed\non a two-dimensional tSNE")
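Seurat provides many prebuilt themes that can be added to ggplot2 plots for quick customization.
Theme          Function
DarkTheme      Sets a black background with white text
FontSize       Sets font sizes for various elements of a plot
NoAxes         Removes axes and axis text
NoLegend       Removes all legend elements
RestoreLegend  Restores a legend after removal
RotatedAxis    Rotates x-axis labels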
# Plotting helper functions work with ggplot2-based scatter plots, such as DimPlot, FeaturePlot, CellScatter, and
# FeatureScatter
plot <- DimPlot(object = pbmc) + NoLegend()
# HoverLocator replaces the former `do.hover` argument. It can also show extra data through the `information` argument,
# designed to work smoothly with FetchData
HoverLocator(plot = plot, information = FetchData(object = pbmc, vars = c("ident", "PC_1", "nFeature_RNA")))
# FeatureLocator replaces the former `do.identify`
select.cells <- FeatureLocator(plot = plot)
# Label points on a ggplot object
LabelPoints(plot = plot, points = TopCells(object = pbmc[["pca"]]), repel = TRUE)
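Multi-assay features
With Seurat v3.0, you can easily switch between different assays at the single-cell level (for example, ADT counts from CITE-seq, or integrated/batch-corrected data). Most functions now take an assay parameter, but you can set a default assay to avoid repeating it in every statement.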
cbmc <- CreateSeuratObject(counts = cbmc.rna)
# Add ADT data
cbmc[["ADT"]] <- CreateAssayObject(counts = cbmc.adt)
# Run analyses by specifying the assay to use
cbmc <- NormalizeData(object = cbmc, assay = "RNA")
cbmc <- NormalizeData(object = cbmc, assay = "ADT", normalization.method = "CLR")
# Retrieve and set the default assay
DefaultAssay(object = cbmc)
DefaultAssay(object = cbmc) <- "ADT"
DefaultAssay(object = cbmc)
# Pull feature expression from both assays by using keys
FetchData(object = cbmc, vars = c("rna_CD3E", "adt_CD3"))
# Plot data from multiple assays using keys
FeatureScatter(object = cbmc, feature1 = "rna_CD3E", feature2 = "adt_CD3")
Differences between v2 and v3
https://satijalab.org/seurat/essential_commands.html