Series recap:
ArchR Official Tutorial Study Notes 1: Getting Started with ArchR
ArchR Official Tutorial Study Notes 2: Inferring Doublets with ArchR
ArchR Official Tutorial Study Notes 3: Creating an ArchRProject
ArchR Official Tutorial Study Notes 4: Dimensionality Reduction with ArchR
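Most single-cell clustering methods compute nearest neighbor graphs in the reduced dimensions and then identify "communities", or clusters of cells. These methods work very well and are the standard approach for scRNA-seq. For this reason, ArchR performs clustering with existing state-of-the-art clustering methods from scRNA-seq packages.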
(1) Clustering with Seurat's FindClusters() function
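We have had a lot of success with the graph clustering approach implemented in Seurat. In ArchR, clustering is performed with the addClusters() function, which accepts additional clustering parameters and passes them on to Seurat::FindClusters(). Clustering with Seurat::FindClusters() is deterministic, meaning that exactly the same input produces exactly the same output.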
> projHeme2 <- addClusters(
input = projHeme2,
reducedDims = "IterativeLSI",
method = "Seurat",
name = "Clusters",
resolution = 0.8
)
ArchR logging to : ArchRLogs\ArchR-addClusters-28e87e1c6324-Date-2020-11-20_Time-03-10-43.log
If there is an issue, please report to github with logFile!
2020-11-20 03:10:44 : Running Seurats FindClusters (Stuart et al. Cell 2019), 0.006 mins elapsed.
Computing nearest neighbor graph
Computing SNN
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
Number of nodes: 10251
Number of edges: 499370
Running Louvain algorithm...
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 0.8573
Number of communities: 12
Elapsed time: 1 seconds
2020-11-20 03:11:16 : Testing Outlier Clusters, 0.549 mins elapsed.
2020-11-20 03:11:16 : Assigning Cluster Names to 12 Clusters, 0.549 mins elapsed.
2020-11-20 03:11:17 : Finished addClusters, 0.551 mins elapsed.
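The determinism mentioned above can be checked by running the same clustering twice and comparing the labels. The following is only a minimal sketch in base R: it uses stats::kmeans() with a fixed seed as a stand-in and does not call Seurat::FindClusters(). The names toy_dims and cluster_once() are hypothetical and introduced purely for illustration.

```r
# Toy stand-in for a reduced-dimension matrix (cells x components).
# This is simulated data, not the tutorial's IterativeLSI result.
set.seed(42)
toy_dims <- rbind(
  matrix(rnorm(100, mean = 0), ncol = 2),
  matrix(rnorm(100, mean = 5), ncol = 2)
)

# Hypothetical helper: a seeded clustering call, so repeated runs
# start from the same random state.
cluster_once <- function(x, seed = 1) {
  set.seed(seed)
  paste0("C", kmeans(x, centers = 2, nstart = 10)$cluster)
}

run1 <- cluster_once(toy_dims)
run2 <- cluster_once(toy_dims)

# TRUE when both runs assign every cell the same cluster label.
same_labels <- identical(run1, run2)
print(same_labels)
```

With a real project, the same identical() comparison could be applied to the label vectors produced by two clustering runs on the same input.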
We can inspect the cluster assignments, which are stored in the Clusters column:
> head(projHeme2$Clusters)
[1] "C4" "C7" "C9" "C9" "C9" "C4"
Count the number of cells in each cluster:
> table(projHeme2$Clusters)
  C1  C10  C11  C12   C2   C3   C4   C5   C6   C7   C8   C9
1479  436  306  383 1102  845 1168 1403  806 1268  705  350
To better understand which samples reside in which clusters, we can use the confusionMatrix() function to build a cluster-by-sample confusion matrix:
> cM <- confusionMatrix(paste0(projHeme2$Clusters), paste0(projHeme2$Sample))
> cM
12 x 3 sparse Matrix of class "dgCMatrix"
    scATAC_BMMC_R1 scATAC_CD34_BMMC_R1 scATAC_PBMC_R1
C4             352                 813              3
C7            1222                   .             46
C9             350                   .              .
C10            258                   4            174
C1            1448                   4             27
C5             139                1264              .
C3             189                 646             10
C8             133                   1            571
C11            152                 145              9
C6             254                   .            552
C12             93                 290              .
C2              99                   1           1002
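Then plot this confusion matrix as a heatmap, after dividing each row by its total so that every cluster is shown as fractions across samples: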
> library(pheatmap)
> cM <- cM / Matrix::rowSums(cM)
> p <- pheatmap::pheatmap(
mat = as.matrix(cM),
color = paletteContinuous("whiteBlue"),
border_color = "black"
)
> p
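To make the row normalization in the heatmap code concrete, here is a small sketch in base R on made-up labels. It uses table() as a stand-in for confusionMatrix() (which returns a sparse matrix in ArchR); the vectors clusters and samples are toy data, not the tutorial's.

```r
# Toy cluster and sample labels (illustrative only, not the tutorial data).
clusters <- c("C1", "C1", "C1", "C2", "C2", "C3", "C3", "C3", "C3")
samples  <- c("S1", "S1", "S2", "S2", "S2", "S1", "S2", "S2", "S2")

# Cross-tabulate: rows are clusters, columns are samples.
counts <- unclass(table(clusters, samples))

# Divide every row by its total so that each row sums to 1.
# R recycles the row-sum vector down the columns, so entry [i, j]
# is divided by the total of row i.
fractions <- counts / rowSums(counts)

print(round(fractions, 2))
```

After this step each row reads as "what fraction of this cluster comes from each sample", which keeps large clusters from dominating the heatmap colors.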
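Sometimes the relative positions of cells in the 2-dimensional embedding do not agree perfectly with the identified clusters. More explicitly, cells from a single cluster may appear in several distinct regions of the embedding. In that case, it is appropriate to adjust the clustering parameters or the embedding parameters until the two agree.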
(2) Clustering with scran
The second clustering method is selected by changing the method parameter of addClusters():
> projHeme2 <- addClusters(
input = projHeme2,
reducedDims = "IterativeLSI",
method = "scran",
name = "ScranClusters",
k = 15
)
ArchR logging to : ArchRLogs\ArchR-addClusters-2e10d2f4585-Date-2020-11-20_Time-03-47-21.log
If there is an issue, please report to github with logFile!
2020-11-20 03:47:22 : Running Scran SNN Graph (Lun et al. F1000Res. 2016), 0.017 mins elapsed.
2020-11-20 03:47:30 : Identifying Clusters (Lun et al. F1000Res. 2016), 0.152 mins elapsed.
2020-11-20 03:50:33 : Testing Outlier Clusters, 3.199 mins elapsed.
2020-11-20 03:50:33 : Assigning Cluster Names to 9 Clusters, 3.199 mins elapsed.
2020-11-20 03:50:33 : Finished addClusters, 3.201 mins elapsed.
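The two runs above report different numbers of clusters (12 with Seurat, 9 with scran), so it can help to see how the two sets of labels overlap. The sketch below is a base R illustration on made-up labels; seurat_like and scran_like are hypothetical vectors, and with a real project the corresponding inputs would be the Clusters and ScranClusters columns.

```r
# Toy labels for the same eight cells from two hypothetical clustering runs.
seurat_like <- c("C1", "C1", "C1", "C2", "C2", "C2", "C3", "C3")
scran_like  <- c("C1", "C1", "C2", "C2", "C2", "C2", "C3", "C3")

# Rows: clusters from the first run; columns: clusters from the second run.
overlap <- unclass(table(seurat_like, scran_like))

# For each first-run cluster, the share of its cells that fall into
# its best-matching second-run cluster.
best_match_fraction <- apply(overlap, 1, max) / rowSums(overlap)

print(overlap)
print(round(best_match_fraction, 2))
```

Values close to 1 indicate that a cluster from the first run maps almost entirely onto a single cluster from the second run; lower values indicate that it is split across several.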