往期回顧:
在前面的課程中我們已經(jīng)進(jìn)行過“單樣本數(shù)據(jù)分析”赠摇、“多樣本數(shù)據(jù)整合”固逗、“細(xì)胞類型注釋”等內(nèi)容的學(xué)習(xí),相信大家現(xiàn)在已經(jīng)能夠?qū)渭?xì)胞測序數(shù)據(jù)分析流程及Seurat對象的基本結(jié)構(gòu)擁有了一定的了解藕帜。這一講主要帶領(lǐng)大家進(jìn)行組間差異的計(jì)算及可視化方法的學(xué)習(xí)烫罩,這部分內(nèi)容能夠幫助科研工作者直接證明該數(shù)據(jù)集的前期試驗(yàn)設(shè)計(jì),從前期枯燥的數(shù)據(jù)預(yù)處理走向文章中的Figure洽故!
視頻教程:
保姆級教程 《手把手教你做單細(xì)胞測序數(shù)據(jù)分析》(六)——組間差異分析及可視化
(B站同步播出贝攒,先看一遍視頻再跟著代碼一起操作,建議每個視頻至少看三遍)
代碼:
測試數(shù)據(jù)與第四講多樣本整合相同:
讀入并檢查數(shù)據(jù)
library(Seurat)
## Attaching SeuratObject
library(dplyr)
##
## 載入程輯包:'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
pbmc <- readRDS('pbmcrenamed.rds')
pbmc
## An object of class Seurat
## 22254 features across 900 samples within 2 assays
## Active assay: RNA (20254 features, 0 variable features)
## 1 other assay present: integrated
## 3 dimensional reductions calculated: pca, umap, tsne
DimPlot(pbmc)
names(pbmc@meta.data)
## [1] "orig.ident" "nCount_RNA"
## [3] "nFeature_RNA" "percent.mt"
## [5] "group" "integrated_snn_res.0.025"
## [7] "seurat_clusters" "celltype.group"
## [9] "celltype"
unique(pbmc$group)
## [1] "C57" "AS1" "P3"
DimPlot(pbmc,split.by = 'group')
差異分析
pbmc$celltype.group <- paste(pbmc$celltype, pbmc$group, sep = "_")
pbmc$celltype <- Idents(pbmc)
Idents(pbmc) <- "celltype.group"
mydeg <- FindMarkers(pbmc,ident.1 = 'VSMC_AS1',ident.2 = 'VSMC_C57', verbose = FALSE, test.use = 'wilcox',min.pct = 0.1)
head(mydeg)
## p_val avg_log2FC pct.1 pct.2 p_val_adj
## Cd24a 6.327111e-07 1.4046048 0.500 0.016 0.01281493
## Spta1 9.387127e-07 0.3391453 0.375 0.000 0.01901269
## Lum 9.387127e-07 3.8953383 0.375 0.000 0.01901269
## Gda 9.387127e-07 0.6064680 0.375 0.000 0.01901269
## Isg20 6.651476e-06 1.4016408 0.500 0.032 0.13471900
## Hbb-bt 7.937909e-06 4.3779094 0.500 0.032 0.16077441
解放生產(chǎn)力 通過循環(huán)自動計(jì)算差異基因
cellfordeg<-levels(pbmc$celltype)
for(i in 1:length(cellfordeg)){
CELLDEG <- FindMarkers(pbmc, ident.1 = paste0(cellfordeg[i],"_P3"), ident.2 = paste0(cellfordeg[i],"_AS1"), verbose = FALSE)
write.csv(CELLDEG,paste0(cellfordeg[i],".CSV"))
}
list.files()
## [1] "B cell.CSV" "EC.CSV"
## [3] "Fibro.CSV" "Macro.CSV"
## [5] "Mono.CSV" "Myeloid cells.CSV"
## [7] "Neut.CSV" "pbmcrenamed.rds"
## [9] "T cell.CSV" "VSMC.CSV"
## [11] "組間差異分析及可視化.html" "組間差異分析及可視化.Rmd"
## [13] "組間差異分析及可視化_files" "組間差異及可視化.r"
差異分析解果解讀:
可視化方法
library(dplyr)
top10 <- CELLDEG %>% top_n(n = 10, wt = avg_log2FC) %>% row.names()
top10
## [1] "Thbs1" "Acta2" "Myl9" "Tagln" "Ccn2" "Plvap"
## [7] "Igfbp7" "Ifi27l2a" "Dcn" "Gdf15"
pbmc <- ScaleData(pbmc, features = rownames(pbmc))
## Centering and scaling data matrix
DoHeatmap(pbmc,features = top10,size=3)
Idents(pbmc) <- "celltype"
VlnPlot(pbmc,features = top10,split.by = 'group',idents = 'EC')
## The default behaviour of split.by has changed.
## Separate violin plots are now plotted side-by-side.
## To restore the old behaviour of a single split violin,
## set split.plot = TRUE.
##
## This message will be shown once per session.
FeaturePlot(pbmc,features = top10,split.by = 'group')
#DotPlot(pbmc,features = top10,split.by ='group')#默認(rèn)只有兩種顏色
DotPlot(pbmc,features = top10,split.by ='group',cols = c('blue','yellow','pink'))
提取表達(dá)量时甚,用ggplot2 DIY一個箱線圖
####提取表達(dá)量#######
mymatrix <- as.data.frame(pbmc@assays$RNA@data)
mymatrix2<-t(mymatrix)%>%as.data.frame()
mymatrix2[,1]<-pbmc$celltype
colnames(mymatrix2)[1] <- "celltype"
mymatrix2[,ncol(mymatrix2)+1]<-pbmc$group
colnames(mymatrix2)[ncol(mymatrix2)] <- "group"
#繪圖
library(ggplot2)
p1<- ggplot2::ggplot(mymatrix2,aes(x=celltype,y=Thbs1,fill=group))+
geom_boxplot(alpha=0.7)+
scale_y_continuous(name = "Expression")+
scale_x_discrete(name="Celltype")+
scale_fill_manual(values = c('DeepSkyBlue','Orange','pink'))
p1
#########另一種提取方法########
Idents(pbmc) <- colnames(pbmc)
mymatrix <- log1p(AverageExpression(pbmc, verbose = FALSE)$RNA)
mymatrix2<-t(mymatrix)%>%as.data.frame()
mymatrix2[,1]<-pbmc$celltype
colnames(mymatrix2)[1] <- "celltype"
mymatrix2[,ncol(mymatrix2)+1]<-pbmc$group
colnames(mymatrix2)[ncol(mymatrix2)] <- "group"
library(ggplot2)
p2<- ggplot2::ggplot(mymatrix2,aes(x=celltype,y=Thbs1,fill=group))+
geom_boxplot(alpha=0.7)+
scale_y_continuous(name = "Expression")+
scale_x_discrete(name="Celltype")+
scale_fill_manual(values = c('DeepSkyBlue','Orange','pink'))
###比較一下兩種方法隘弊,發(fā)現(xiàn)并沒有差異
library(patchwork)
p1|p2
往期回顧
單細(xì)胞數(shù)據(jù)分析系列教程:
B站視頻,先看一遍視頻再去看推送操作荒适,建議至少看三遍:https://www.bilibili.com/video/BV1S44y1b76Z/
單細(xì)胞測序基礎(chǔ)數(shù)據(jù)分析保姆級教程梨熙,代碼部分整理在往期推送之中:
手把手教你做單細(xì)胞測序數(shù)據(jù)分析(一)——緒論
手把手教你做單細(xì)胞測序數(shù)據(jù)分析(二)——各類輸入文件讀取
手把手教你做單細(xì)胞測序數(shù)據(jù)分析(三)——單樣本分析
手把手教你做單細(xì)胞測序數(shù)據(jù)分析(四)——多樣本整合
手把手教你做單細(xì)胞測序數(shù)據(jù)分析(五)——細(xì)胞類型注釋
歡迎關(guān)注同名公眾號~