免疫組庫(kù)基礎(chǔ)知識(shí)見:單細(xì)胞免疫組庫(kù):TCR基因重排原理和TCR測(cè)序建庫(kù)方法
scRepertoire由華盛頓大學(xué)的Nick Borcherding博士開發(fā),是一款針對(duì)10X V(D)J數(shù)據(jù)的免疫組庫(kù)分析工具。它直接導(dǎo)入10X cellranger的輸出結(jié)果,具有豐富的分析項(xiàng)目和美觀的可視化結(jié)果攘轩。
官方教程:https://ncborcherding.github.io/vignettes/vignette.html
1. 數(shù)據(jù)準(zhǔn)備
library(devtools)
install_github("ncborcherding/scRepertoire")
#正式版许昨,需要R 4.0
V(D)J數(shù)據(jù)分析
導(dǎo)入文件說明:scRepertoire需要導(dǎo)入后綴為contig_annotations.csv的CellRanger輸出文件禀倔,要求barcode沒有前綴且以列表的方式傳遞給scRepertoire鳖粟。本教程使用scRepertoire包自帶的VDJ數(shù)據(jù),這些數(shù)據(jù)來自于三個(gè)腎透明細(xì)胞癌患者的T細(xì)胞效览,由成對(duì)的外周血和腫瘤浸潤(rùn)組成,有效地為T細(xì)胞受體(TCR)富集創(chuàng)造了6個(gè)不同的runs荡短∝ね鳎可通過data("contig_list"
)加載。
library(Seurat)
library(scRepertoire)
library(tidyverse)
library(patchwork)
rm(list=ls())
dir.create("VDJ")
#加載scRepertoire自帶的contig_annotations文件
data("contig_list")
head(contig_list[[1]])
## barcode is_cell contig_id high_confidence length
## 1 AAACCTGAGAGCTGGT TRUE AAACCTGAGAGCTGGT-1_contig_1 TRUE 705
## 2 AAACCTGAGAGCTGGT TRUE AAACCTGAGAGCTGGT-1_contig_2 TRUE 502
## 3 AAACCTGAGCATCATC TRUE AAACCTGAGCATCATC-1_contig_1 TRUE 693
## 4 AAACCTGAGCATCATC TRUE AAACCTGAGCATCATC-1_contig_2 TRUE 567
## 5 AAACCTGAGCATCATC TRUE AAACCTGAGCATCATC-1_contig_5 TRUE 361
## 6 AAACCTGAGTGGTCCC TRUE AAACCTGAGTGGTCCC-1_contig_1 TRUE 593
## chain v_gene d_gene j_gene c_gene full_length productive cdr3
## 1 TRB TRBV20-1 TRBD1 TRBJ1-5 TRBC1 TRUE TRUE CSASMGPVVSNQPQHF
## 2 TRB None None TRBJ1-5 TRBC1 FALSE None None
## 3 TRB TRBV5-1 TRBD2 TRBJ2-2 TRBC2 TRUE TRUE CASSWSGAGDGELFF
## 4 TRA TRAV12-1 None TRAJ37 TRAC TRUE TRUE CVVNDEGSSNTGKLIF
## 5 TRB None None TRBJ1-5 TRBC1 FALSE None None
## 6 TRB TRBV7-9 TRBD1 TRBJ2-5 TRBC2 TRUE TRUE CASSPSEGGRQETQYF
## cdr3_nt reads umis raw_clonotype_id
## 1 TGCAGTGCTAGCATGGGACCGGTAGTGAGCAATCAGCCCCAGCATTTT 16718 6 clonotype96
## 2 None 6706 3 clonotype96
## 3 TGCGCCAGCAGCTGGTCAGGAGCGGGAGACGGGGAGCTGTTTTTT 26719 11 clonotype97
## 4 TGTGTGGTGAACGATGAAGGCTCTAGCAACACAGGCAAACTAATCTTT 18297 6 clonotype97
## 5 None 882 4 clonotype97
## 6 TGTGCCAGCAGCCCCTCCGAAGGGGGGAGACAAGAGACCCAGTACTTC 11218 6 clonotype98
## raw_consensus_id
## 1 clonotype96_consensus_1
## 2 None
## 3 clonotype97_consensus_2
## 4 clonotype97_consensus_1
## 5 None
## 6 clonotype98_consensus_1
2. Combining the Contigs
cellranger輸出的文件是對(duì)TCRA和TCRB鏈的量化掘托,沒有細(xì)胞barcode與clonotype直接對(duì)應(yīng)的數(shù)據(jù)瘦锹,因此不能直接整合到seurat中。scRepertoire分析的第一步是把上面的文件轉(zhuǎn)換為barcode與clonotype對(duì)應(yīng)的數(shù)據(jù)闪盔,它通過combineTCR函數(shù)實(shí)現(xiàn)(如果是BCR用combineBCR)弯院,輸入的是contig_list。還可以根據(jù)樣本和身份信息重新標(biāo)記條形碼泪掀,以防止重復(fù)听绳。
?combineTCR
combined <- combineTCR(contig_list,
samples = c("PY", "PY", "PX", "PX", "PZ","PZ"), #樣本來自Patient Y,Patient X和Patient Z(各2個(gè))
ID = c("P", "T", "P", "T", "P", "T"), #P是外周血 T是腫瘤
cells ="T-AB")
#參數(shù)說明:
#samples:用于指定樣本名稱的參數(shù),因?yàn)閏ontig_list包含了6個(gè)樣本的VDJ數(shù)據(jù)异赫,所以這里傳遞了一個(gè)6個(gè)字符元素的向量椅挣;
#ID:用于指定樣本名稱之外的其他分類信息头岔,例如分組信息;
#cells:指定VDJ類型鼠证,"T-AB"代表TCR的α和β鏈配對(duì)切油,"T-GD"代表γ和δ配對(duì);
#removeNA + TRUE -這是一個(gè)嚴(yán)格的過濾器名惩,用于移除至少有一個(gè)NA值的細(xì)胞條碼+ FALSE -包含和合并NA值為1的細(xì)胞的默認(rèn)設(shè)置澎胡。
#removeMulti + TRUE -這是一個(gè)嚴(yán)格的過濾器,可以移除任何超過2個(gè)免疫受體鏈的細(xì)胞條碼+ FALSE -包含和合并帶有> 2鏈的細(xì)胞的默認(rèn)設(shè)置娩鹉。
#filterMulti + TRUE -用多個(gè)鏈分離細(xì)胞條形碼中前2個(gè)表達(dá)的鏈+ FALSE -包含和合并細(xì)胞與> 2鏈的默認(rèn)設(shè)置攻谁。
輸出的結(jié)果combined依然是一個(gè)列表
View(combined[[1]])
可以看到barcode都加上了由samples和ID組成的前綴,并且把每個(gè)細(xì)胞配對(duì)的兩條鏈放在一行弯予,之后的分析都基于配對(duì)信息定義的clonotype戚宦。下面分析所用的函數(shù),一般都有兩個(gè)通用參數(shù):
exportTable
:默認(rèn)賦值F锈嫩,函數(shù)分析結(jié)果為圖形受楼;改為T之后函數(shù)分析結(jié)果輸出為表格。
cloneCall
:可選"aa", "nt", "gene", "gene+nt"呼寸,代表用CDR3氨基酸序列艳汽、CDR3核苷酸序列、VDJ基因還是VDJ基因+CDR3核苷酸序列中的哪一種來定義克隆型对雪。
3. 其他處理功能
3.1 加入其他變量
如果除了sample和ID之外還需要添加更多的變量可以使用addVariable()
函數(shù)來添加河狐。我們需要的只是要添加的變量的名稱和特定的字符或數(shù)字值(變量)。
#例如添加樣品被處理和測(cè)序的批次
example <- addVariable(combined, name = "batch",
variables = c("b1", "b1", "b2", "b2", "b2", "b2"))
example[[1]][1:5,ncol(example[[1]])] # This is showing the first 5 values of the new column added
## [1] "b1" "b1" "b1" "b1" "b1"
3.2 Subsetting Contigs
同樣瑟捣,我們可以使用subsetContig()
函數(shù)在combineTCR()之后刪除特定的列表元素馋艺。為了進(jìn)行子集化,我們需要確定要用于子集化的向量(名稱)和要子集化的變量值(變量)迈套。下面你可以看到我們從PX和PY中分離出4個(gè)測(cè)序結(jié)果捐祠。
subset <- subsetContig(combined, name = "sample", variables = c("PX", "PY"))
4. 可視化
- cloneCall
"gene":VDJ基因
"nt":CDR3核苷酸序列
"aa":代表用CDR3氨基酸序列、
"gene+nt":VDJ基因+CDR3核苷酸序列
需要注意的是桑李,克隆型基本上是利用兩個(gè)位點(diǎn)的基因組合或nt/aa CDR3序列來命名的踱蛀。在scRepertoire實(shí)現(xiàn)中,clonotype調(diào)用沒有整合CDR3序列中的微小變異芙扎。因此星岗,gene
方法將是最敏感的,而使用nt
或aa
是中度敏感戒洼,對(duì)克隆型最具特異性的則是gene+nt
俏橘。此外,克隆型calling試圖整合兩個(gè)位點(diǎn)圈浇,即TCRA和TCRB鏈寥掐,以及如果單個(gè)細(xì)胞條碼有多個(gè)序列被識(shí)別(即一個(gè)細(xì)胞中表達(dá)2個(gè)TCRA鏈)靴寂。使用10x方法有時(shí),有一部分barcode將只返回一個(gè)免疫受體鏈召耘,未返回鏈會(huì)分配一個(gè)NA值百炬。
研究克隆類型的第一個(gè)函數(shù)是quantContig()
,它返回唯一克隆類型的總數(shù)或相對(duì)數(shù)量污它。
- scale
TRUE -獨(dú)特克隆型的相對(duì)百分比由克隆型庫(kù)的總大小
FALSE -報(bào)告獨(dú)特克隆型的總數(shù)量剖踊。
##展示每個(gè)樣本的克隆型數(shù)量
p1 <- quantContig(combined, cloneCall="gene+nt", scale = F)
p2 <- quantContig(combined, cloneCall="gene+nt", scale = T)
plotc = p1 + p2
ggsave('VDJ/quantContig.png', plotc, width = 8, height = 4)
#設(shè)置exportTable = T,則輸出表格而非圖形
quantContig(combined, cloneCall="gene+nt", scale = T, exportTable = T)
# contigs values total scaled
# 1 2692 PY_P 3208 83.91521
# 2 1513 PY_T 3119 48.50914
# 3 823 PX_P 1068 77.05993
# 4 928 PX_T 1678 55.30393
# 5 1147 PZ_P 1434 79.98605
# 6 764 PZ_T 2768 27.60116
我們也可以通過豐度來查看克隆型的相對(duì)分布德澈。在這里,abundanceContig=()
將生成一個(gè)線圖固惯,展示樣本中總clonotypes數(shù)量梆造。與上面一樣,我們還可以使用函數(shù)中的group變量根據(jù)contig對(duì)象中的向量對(duì)其進(jìn)行分組葬毫。
abundanceContig(combined, cloneCall = "gene", scale = F)
abundanceContig(combined, cloneCall = "gene", exportTable = T)
## # A tibble: 7,248 x 3
## CTgene Abundance values
## <chr> <int> <chr>
## 1 NA_TRBV10-1.TRBJ2-2.TRBD1.TRBC2 1 PY_P
## 2 NA_TRBV10-2.TRBJ1-1.TRBD1.TRBC1 1 PY_P
## 3 NA_TRBV10-2.TRBJ2-1.TRBD2.TRBC2 1 PY_P
## 4 NA_TRBV10-2.TRBJ2-3.TRBD2.TRBC2 1 PY_P
## 5 NA_TRBV10-3.TRBJ1-1.None.TRBC1 2 PY_P
## 6 NA_TRBV10-3.TRBJ1-1.TRBD1.TRBC1 2 PY_P
## 7 NA_TRBV10-3.TRBJ1-5.None.TRBC1 1 PY_P
## 8 NA_TRBV10-3.TRBJ1-5.TRBD2.TRBC1 1 PY_P
## 9 NA_TRBV10-3.TRBJ2-1.TRBD1.TRBC2 2 PY_P
## 10 NA_TRBV10-3.TRBJ2-2.TRBD2.TRBC2 1 PY_P
## # … with 7,238 more rows
我們可以通過調(diào)用lengtheContig()
函數(shù)來查看CDR3序列的長(zhǎng)度分布镇辉。重要的是,與其他基本可視化不同贴捡,在這里忽肛, cloneCall只能是nt
或aa
。
##CDR3序列的長(zhǎng)度分布栈暇,“aa”代表統(tǒng)計(jì)氨基酸序列長(zhǎng)度
p1 <- lengthContig(combined, cloneCall="aa", chains = "combined")
p2 <- lengthContig(combined, cloneCall="aa", chains = "single")
plotc = p1 + p2
ggsave('VDJ/lengthContig.png', plotc, width = 8, height = 4)
我們還可以使用compareClonotypes()
函數(shù)來查看樣本之間的克隆型的動(dòng)態(tài)變化箍镜。
##對(duì)比兩個(gè)樣本的克隆型
p1 = compareClonotypes(combined, numbers = 10, samples = c("PX_P", "PX_T"),
cloneCall="aa", graph = "alluvial")
ggsave('VDJ/compareClonotypes.png', p1, width = 8, height = 4)
vizVgenes(combined, TCR="TCR1", facet.x = "sample", facet.y = "ID")
5. 更多高級(jí)克隆分析
5.1 克隆空間內(nèi)穩(wěn)態(tài)(Clonal Space Homeostasis)
此分析將克隆型按其相對(duì)豐度分為rare, small, medium, large, hyperexpanded 5大類源祈,并統(tǒng)計(jì)各個(gè)類別的占比
clonalHomeostasis(combined, cloneCall = "gene")
clonalHomeostasis(combined, cloneCall = "aa")
5.2 克隆型分類占比
與上一個(gè)分析相似,只是分類方法調(diào)整了
clonalProportion(combined, cloneCall = "gene")
clonalProportion(combined, cloneCall = "nt")
5.3 Overlap Analysis
使用clonalOverlap()
分析樣本相似性
目前有兩種方法可用于clonalOverlap():1)overlap coefficient和2)Morisita index色迂。前者是在較小的樣本中以獨(dú)特的克隆型的長(zhǎng)度來衡量克隆型的重疊性香缺。Morisita index更復(fù)雜,它是對(duì)一個(gè)細(xì)胞群中個(gè)體分散程度的一種生態(tài)測(cè)量歇僧,整合了細(xì)胞群的大小图张。
clonalOverlap(combined, cloneCall = "aa", method = "morisita")
scRepertoire的另一個(gè)特性改編自powerTCR R包,使用clonesizeDistribution()
通過克隆大小分布對(duì)樣本進(jìn)行聚類诈悍。
clonesizeDistribution(combined, cloneCall = "gene+nt", method="ward.D2")
5.4 多樣性分析
多樣性也可以通過樣本或其他變量來衡量祸轮。多樣性是通過以下四個(gè)度量來計(jì)算的: 1)Shannon, 2) inverse Simpson, 3) Chao1和4)based Coverage Estimator (ACE)。前兩者一般用于估計(jì)基線多樣性侥钳,Chao/ACE指數(shù)用于估計(jì)樣本的豐富度适袜。
clonalDiversity(combined, cloneCall = "aa", group = "samples")
clonalDiversity(combined, cloneCall = "gene", group = "ID")
5.5 Clustering Clonotypes
sub_combined <- clusterTCR(combined[[1]], chain = "TCRA",
sequence = "aa", threshold = 0.85)
sub_combined <- as.data.frame(sub_combined)
counts_TCRAcluster <- table(sub_combined$TCRA_cluster)
counts_TCRA<- table(sub_combined$cdr3_aa1)
#Change in histogram range using clusters over exact amino acid sequence
plot(counts_TCRAcluster, xaxt='n')
plot(counts_TCRA, xaxt='n')
6. VDJ與scRNA(Seurat)整合分析
前面介紹的分析內(nèi)容只能依據(jù)樣本或分組信息(可通過ID參數(shù)指定)開展,其實(shí)我們還可以通過barcode將VDJ數(shù)據(jù)與scRNA數(shù)據(jù)聯(lián)系起來舷夺,這樣就可以將clonotype顯示在降維圖上苦酱,也可以基于cluster展示clonotype的情況售貌。
官方提供的scRNA數(shù)據(jù)需要下載:https://drive.google.com/open?id=1np-EzG7U9W_Fz_SchBrsAhtqE3_rB_H9
下載之后會(huì)得到一個(gè)seurat2.rda的文件,這是一個(gè)降維聚類后的seurat對(duì)象疫萤,請(qǐng)保存在分析相關(guān)目錄下颂跨。
seurat <- get(load("VDJ/seurat2.rda"))
#查看UMAP圖
DimPlot(seurat, label = T) + NoLegend()
#查看原始seurat的meta.data有哪些內(nèi)容
names(seurat@meta.data)
# [1] "nCount_RNA" "nFeature_RNA" "integrated_snn_res.0.5" "seurat_clusters"
# [5] "Patient" "Type" "RawBarcode"
接下來獲取clonotypic信息,并使用combineExpression()
函數(shù)將其添加到Seurat對(duì)象中扯饶。需要注意恒削,附件的主要需求是匹配contig細(xì)胞條形碼和seurat或SCE對(duì)象的元數(shù)據(jù)行名中的條形碼。如果這些不匹配尾序,就會(huì)失敗蔓同。因此建議對(duì)Seurat對(duì)象的行名稱進(jìn)行更改。
?combineExpression
seurat <- combineExpression(combined, seurat,
cloneCall="gene", groupBy = "sample", proportion = FALSE,
cloneTypes=c(Single=1, Small=5, Medium=20, Large=100, Hyperexpanded=500))
combineExpression
的源代碼??
combineExpression
function (df, sc, cloneCall = "gene+nt", groupBy = "none", proportion = TRUE,
cloneTypes = c(Rare = 1e-04, Small = 0.001, Medium = 0.01,
Large = 0.1, Hyperexpanded = 1), filterNA = FALSE)
{
cloneTypes <- c(None = 0, cloneTypes)
df <- checkList(df)
cloneCall <- theCall(cloneCall)
Con.df <- NULL
meta <- grabMeta(sc)
cell.names <- rownames(meta)
if (groupBy == "none") {
for (i in seq_along(df)) {
data <- data.frame(df[[i]], stringsAsFactors = FALSE)
data2 <- unique(data[, c("barcode", cloneCall)])
data2 <- na.omit(data2[data2[, "barcode"] %in% cell.names,
])
if (proportion == TRUE) {
data2 <- data2 %>% group_by(data2[, cloneCall]) %>%
summarise(Frequency = n()/nrow(data2))
}
else {
data2 <- data2 %>% group_by(data2[, cloneCall]) %>%
summarise(Frequency = n())
}
colnames(data2)[1] <- cloneCall
data <- merge(data, data2, by = cloneCall, all = TRUE)
Con.df <- rbind.data.frame(Con.df, data)
}
}
else if (groupBy != "none") {
data <- data.frame(bind_rows(df), stringsAsFactors = FALSE)
data2 <- na.omit(unique(data[, c("barcode", cloneCall,
groupBy)]))
data2 <- data2[data2[, "barcode"] %in% cell.names, ]
data2 <- as.data.frame(data2 %>% group_by(data2[, cloneCall],
data2[, groupBy]) %>% summarise(Frequency = n()))
colnames(data2)[c(1, 2)] <- c(cloneCall, groupBy)
x <- unique(data[, groupBy])
for (i in seq_along(x)) {
sub1 <- subset(data, data[, groupBy] == x[i])
sub2 <- subset(data2, data2[, groupBy] == x[i])
merge <- merge(sub1, sub2, by = cloneCall)
if (proportion == TRUE) {
merge$Frequency <- merge$Frequency/length(merge$Frequency)
}
Con.df <- rbind.data.frame(Con.df, merge)
}
nsize <- Con.df %>% group_by(Con.df[, paste0(groupBy,
".x")]) %>% summarise(n = n())
}
Con.df$cloneType <- NA
for (x in seq_along(cloneTypes)) {
names(cloneTypes)[x] <- paste0(names(cloneTypes[x]),
" (", cloneTypes[x - 1], " < X <= ", cloneTypes[x],
")")
}
for (i in 2:length(cloneTypes)) {
Con.df$cloneType <- ifelse(Con.df$Frequency > cloneTypes[i -
1] & Con.df$Frequency <= cloneTypes[i], names(cloneTypes[i]),
Con.df$cloneType)
}
PreMeta <- unique(Con.df[, c("barcode", "CTgene", "CTnt",
"CTaa", "CTstrict", "Frequency", "cloneType")])
rownames(PreMeta) <- PreMeta$barcode
if (inherits(x = sc, what = "Seurat")) {
col.name <- names(PreMeta) %||% colnames(PreMeta)
sc[[col.name]] <- PreMeta
}
else {
rownames <- rownames(colData(sc))
colData(sc) <- cbind(colData(sc), PreMeta[rownames, ])[,
union(colnames(colData(sc)), colnames(PreMeta))]
rownames(colData(sc)) <- rownames
}
if (filterNA == TRUE) {
sc <- filteringNA(sc)
}
return(sc)
}
<bytecode: 0x7fdcedf8ecc0>
<environment: namespace:scRepertoire>
查看整合后的metadata
names(seurat@meta.data)
# [1] "nCount_RNA" "nFeature_RNA" "integrated_snn_res.0.5"
# [4] "seurat_clusters" "Patient" "Type"
# [7] "RawBarcode" "barcode" "CTgene"
#[10] "CTnt" "CTaa" "CTstrict"
#[13] "Frequency" "cloneType"
查看外周血和腫瘤中的T細(xì)胞
colorblind_vector <- colorRampPalette(c("#FF4B20", "#FFB433",
"#C6FDEC", "#7AC5FF", "#0348A6"))
DimPlot(seurat, group.by = "Type") + NoLegend() +
scale_color_manual(values=colorblind_vector(2))
UMAP圖展示cloneType的分布
slot(seurat, "meta.data")$cloneType <- factor(slot(seurat, "meta.data")$cloneType,
levels = c("Hyperexpanded (100 < X <= 500)",
"Large (20 < X <= 100)",
"Medium (5 < X <= 20)",
"Small (1 < X <= 5)",
"Single (0 < X <= 1)", NA))
DimPlot(seurat, group.by = "cloneType") +
scale_color_manual(values = colorblind_vector(5), na.value="grey")
6.1 clonalOverlay
clonalOverlay(seurat, reduction = "umap",
freq.cutpoint = 30, bins = 10, facet = "Patient") +
guides(color = FALSE)
UMAP圖展示特定克隆型的分布
seurat <- highlightClonotypes(seurat, cloneCall= "aa", sequence = c("CAVNGGSQGNLIF_CSAEREDTDTQYF", "NA_CATSATLRVVAEKLFF"))
DimPlot(seurat, group.by = "highlight")
6.2 occupiedscRepertoire
我們還可以使用occupiedscRepertoire()
函數(shù)并通過定義x.axis來顯示細(xì)胞對(duì)象的元數(shù)據(jù)中的集群或其他變量蹲诀,從而查看分配到特定頻率范圍的集群的單元數(shù)斑粱。
occupiedscRepertoire(seurat, x.axis = "cluster")
6.3 alluvialClonotypes
桑基圖展示特定克隆型的來源
最后脯爪,在修改了所有元數(shù)據(jù)之后则北,我們可以使用alluvialClonotypes()
函數(shù)查看跨多個(gè)類別的clonotypes。
??要理解這種繪圖方法的基本概念痕慢,強(qiáng)烈推薦閱讀Alluvial Plots in ggplot2尚揣。(本質(zhì)上我們能夠使用繪圖來檢查分類變量的流向。因?yàn)檫@個(gè)函數(shù)將生成一個(gè)圖形掖举,其中每個(gè)克隆型按所謂的分層進(jìn)行排列快骗,這將花費(fèi)一些時(shí)間,具體時(shí)間取決于細(xì)胞總數(shù)的大小塔次。)
alluvialClonotypes(seurat, cloneCall = "gene",
y.axes = c("Patient", "cluster", "Type"),
color = "TRAV12-2.TRAJ42.TRAC_TRBV20-1.TRBJ2-3.TRBD2.TRBC2") +
scale_fill_manual(values = c("grey", colorblind_vector(1)))
煞嚼海基圖展示克隆型在病例-cluster-組織類型之間的關(guān)系
alluvialClonotypes(seurat, cloneCall = "gene",
y.axes = c("Patient", "cluster", "Type"),
color = "cluster")
6.4 getCirclize
library(circlize)
library(scales)
circles <- getCirclize(seurat, groupBy = "cluster")
#Just assigning the normal colors to each cluster
grid.cols <- scales::hue_pal()(length(unique(seurat@active.ident)))
names(grid.cols) <- levels(seurat@active.ident)
#Graphing the chord diagram
circlize::chordDiagram(circles, self.link = 1, grid.col = grid.cols)
subset <- subset(seurat, Type == "T")
circles <- getCirclize(subset, groupBy = "cluster")
grid.cols <- hue_pal()(length(unique(subset@active.ident)))
names(grid.cols) <- levels(subset@active.ident)
chordDiagram(circles, self.link = 1, grid.col = grid.cols)
6.5 Diversity of single-cells using Startrac
StartracDiversity(seurat, type = "Type", sample = "Patient", by = "overall")
## [2021-05-17 16:29:17] initialize Startrac ...
## [2021-05-17 16:29:17] calculate startrac index ...
## [2021-05-17 16:29:17] calculate pairwise index ...
## [2021-05-17 16:29:18] calculate indices of each patient ...
## [2021-05-17 16:29:22] collect result
## [2021-05-17 16:29:22] return
7. Working with clonotypes after clustering
對(duì)于希望在Seurat對(duì)象中使用元數(shù)據(jù)來執(zhí)行scRepertoire提供的分析的用戶來說,還有一個(gè)選項(xiàng)是使用expression2List()
函數(shù)励负,該函數(shù)將獲取元數(shù)據(jù)并按亞群將數(shù)據(jù)輸出為列表藕溅。也就是人們常說的,細(xì)胞類型特異的克隆型分析继榆。
combined2 <- expression2List(seurat, group = "cluster")
length(combined2) #now listed by cluster
## [1] 12
7.1 Clonal Diversity
clonalDiversity(combined2, cloneCall = "nt")
7.2 Clonal Homeostasis
clonalHomeostasis(combined2, cloneCall = "nt")
7.3 Clonal Proportion
clonalProportion(combined2, cloneCall = "nt")
7.4 Clonal Overlap
clonalOverlap(combined2, cloneCall="aa", method="overlap")
8. 總結(jié)
This has been a general overview of the capabilities for scRepertoire from the initial processing and visualization to attach to the mRNA expression values in a Seurat object.
sessionInfo()
## R version 4.0.3 (2020-10-10)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.7
## Matrix products: default
## BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## attached base packages:
## [1] parallel stats graphics grDevices utils datasets
## [7] methods base
## other attached packages:
## [1] scales_1.1.1 patchwork_1.1.1 scRepertoire_1.0.3
## [4] devtools_2.4.0 usethis_2.0.1 forcats_0.5.1
## [7] stringr_1.4.0 dplyr_1.0.5 purrr_0.3.4
## [10] readr_1.4.0 tidyr_1.1.3 tibble_3.1.1
## [13] ggplot2_3.3.3 tidyverse_1.3.1 SeuratObject_4.0.0
## [16] Seurat_4.0.1 nichenetr_1.0.0 circlize_0.4.12
## [19] Biobase_2.50.0 BiocGenerics_0.36.1
## loaded via a namespace (and not attached):
## [1] SparseM_1.81 scattermore_0.7
## [3] ModelMetrics_1.2.2.2 evmix_2.12
## [5] pkgmaker_0.32.2 knitr_1.33
## [7] DelayedArray_0.16.3 irlba_2.3.3
## [9] data.table_1.14.0 rpart_4.1-15
## [11] RCurl_1.98-1.3 doParallel_1.0.16
## [13] generics_0.1.0 callr_3.7.0
## [15] cowplot_1.1.1 VGAM_1.1-5
## [17] RANN_2.6.1 proxy_0.4-25
## [19] future_1.21.0 DiagrammeR_1.0.6.1
## [21] spatstat.data_2.1-0 xml2_1.3.2
## [23] lubridate_1.7.10 httpuv_1.6.0
## [25] isoband_0.2.4 SummarizedExperiment_1.20.0
## [27] ggsci_2.9 assertthat_0.2.1
## [29] gower_0.2.2 xfun_0.22
## [31] hms_1.0.0 evaluate_0.14
## [33] promises_1.2.0.1 fansi_0.4.2
## [35] caTools_1.18.2 dbplyr_2.1.1
## [37] readxl_1.3.1 igraph_1.2.6.9118
## [39] DBI_1.1.1.9000 htmlwidgets_1.5.3
## [41] powerTCR_1.10.3 spatstat.geom_2.1-0
## [43] stringdist_0.9.6.3 stats4_4.0.3
## [45] ellipsis_0.3.2 ggpubr_0.4.0
## [47] backports_1.2.1 permute_0.9-5
## [49] gridBase_0.4-7 MatrixGenerics_1.2.1
## [51] deldir_0.2-10 ggalluvial_0.12.3
## [53] vctrs_0.3.8 remotes_2.3.0
## [55] Cairo_1.5-12.2 ROCR_1.0-11
## [57] abind_1.4-5 caret_6.0-86
## [59] cachem_1.0.4 withr_2.4.2
## [61] checkmate_2.0.0 sctransform_0.3.2
## [63] vegan_2.5-7 fdrtool_1.2.16
## [65] prettyunits_1.1.1 goftest_1.2-2
## [67] cluster_2.1.2 gsl_2.1-6
## [69] lazyeval_0.2.2 crayon_1.4.1
## [71] recipes_0.1.16 pkgconfig_2.0.3
## [73] labeling_0.4.2 GenomeInfoDb_1.26.7
## [75] nlme_3.1-152 pkgload_1.2.1
## [77] nnet_7.3-15 rlang_0.4.11
## [79] globals_0.14.0 lifecycle_1.0.0
## [81] miniUI_0.1.1.1 registry_0.5-1
## [83] modelr_0.1.8 rprojroot_2.0.2
## [85] cellranger_1.1.0 randomForest_4.6-14
## [87] polyclip_1.10-0 matrixStats_0.58.0
## [89] lmtest_0.9-38 rngtools_1.5
## [91] Matrix_1.3-2 carData_3.0-4
## [93] zoo_1.8-9 reprex_2.0.0
## [95] base64enc_0.1-3 ggridges_0.5.3
## [97] GlobalOptions_0.1.2 processx_3.5.2
## [99] pheatmap_1.0.12 png_0.1-7
## [101] viridisLite_0.4.0 rjson_0.2.20
## [103] bitops_1.0-7 KernSmooth_2.23-18
## [105] visNetwork_2.0.9 pROC_1.17.0.1
## [107] shape_1.4.5 parallelly_1.25.0
## [109] jpeg_0.1-8.1 rstatix_0.7.0
## [111] S4Vectors_0.28.1 ggsignif_0.6.1
## [113] memoise_2.0.0 magrittr_2.0.1
## [115] plyr_1.8.6 ica_1.0-2
## [117] zlibbioc_1.36.0 compiler_4.0.3
## [119] tinytex_0.31 RColorBrewer_1.1-2
## [121] clue_0.3-59 fitdistrplus_1.1-3
## [123] cli_2.5.0 XVector_0.30.0
## [125] listenv_0.8.0 pbapply_1.4-3
## [127] ps_1.6.0 htmlTable_2.1.0
## [129] Formula_1.2-4 MASS_7.3-53.1
## [131] mgcv_1.8-35 tidyselect_1.1.1
## [133] stringi_1.5.3 yaml_2.2.1
## [135] latticeExtra_0.6-29 ggrepel_0.9.1
## [137] grid_4.0.3 tools_4.0.3
## [139] future.apply_1.7.0 rio_0.5.26
## [141] rstudioapi_0.13 foreach_1.5.1
## [143] foreign_0.8-81 cubature_2.0.4.2
## [145] gridExtra_2.3 prodlim_2019.11.13
## [147] farver_2.1.0 Rtsne_0.15
## [149] digest_0.6.27 shiny_1.6.0
## [151] lava_1.6.9 Rcpp_1.0.6
## [153] GenomicRanges_1.42.0 car_3.0-10
## [155] broom_0.7.6 later_1.2.0
## [157] RcppAnnoy_0.0.18 httr_1.4.2
## [159] ComplexHeatmap_2.7.10.9001 colorspace_2.0-0
## [161] rvest_1.0.0 fs_1.5.0
## [163] tensor_1.5 reticulate_1.20
## [165] IRanges_2.24.1 splines_4.0.3
## [167] uwot_0.1.10 spatstat.utils_2.1-0
## [169] sessioninfo_1.1.1 plotly_4.9.3
## [171] xtable_1.8-4 truncdist_1.0-2
## [173] jsonlite_1.7.2 timeDate_3043.102
## [175] testthat_3.0.2 ipred_0.9-11
## [177] R6_2.5.0 Hmisc_4.5-0
## [179] pillar_1.6.0 htmltools_0.5.1.1
## [181] mime_0.10 NMF_0.23.0
## [183] glue_1.4.2 fastmap_1.1.0
## [185] DT_0.18 class_7.3-18
## [187] codetools_0.2-18 pkgbuild_1.2.0
## [189] utf8_1.2.1 lattice_0.20-44
## [191] spatstat.sparse_2.0-0 evd_2.3-3
## [193] curl_4.3.1 leiden_0.3.7
## [195] zip_2.1.1 openxlsx_4.2.3
## [197] survival_3.2-11 limma_3.46.0
## [199] rmarkdown_2.7 desc_1.3.0
## [201] munsell_0.5.0 GenomeInfoDbData_1.2.4
## [203] e1071_1.7-6 GetoptLong_1.0.5
## [205] iterators_1.0.13 haven_2.4.1
## [207] reshape2_1.4.4 gtable_0.3.0
## [209] spatstat.core_2.1-2