immunarch — Fast and Seamless Exploration of Single-cell and Bulk T-cell/Antibody Immune Repertoires in R
我們已經(jīng)講到祝钢,immunarch分析免疫主庫數(shù)據(jù)是很有好的搪花,那么有多友好呢?真的可以5行代碼出博士論文嗎键痛?我們來看看這五行代碼:
install.packages("immunarch") # Install the package
library(immunarch); data(immdata) # Load the package and the test dataset
repOverlap(immdata$data) %>% vis() # Compute and visualise the most important statistics:
geneUsage(immdata$data[[1]]) %>% vis() # public clonotypes, gene usage, sample diversity
repDiversity(immdata$data) %>% vis(.by = "Status", .meta = immdata$meta) # Group samples
為了系統(tǒng)地了解這個包,我們引入一個包:pacman。
library(pacman)
p_functions(immunarch)
[1] "apply_asymm"
[2] "apply_symm"
[3] "bunch_translate"
[4] "coding"
[5] "cross_entropy"
[6] "dbAnnotate"
[7] "dbLoad"
[8] "entropy"
[9] "fixVis"
[10] "gene_stats"
[11] "geneUsage"
[12] "geneUsageAnalysis"
[13] "get_genes"
[14] "getKmers"
[15] "immunr_dbscan"
[16] "immunr_hclust"
[17] "immunr_kmeans"
[18] "immunr_mds"
[19] "immunr_pca"
[20] "immunr_tsne"
[21] "inc_overlap"
[22] "inframes"
[23] "js_div"
[24] "kl_div"
[25] "kmer_profile"
[26] "noncoding"
[27] "outofframes"
[28] "public_matrix"
[29] "publicRepertoire"
[30] "publicRepertoireApply"
[31] "publicRepertoireFilter"
[32] "pubRep"
[33] "pubRepApply"
[34] "pubRepFilter"
[35] "pubRepStatistics"
[36] "repClonality"
[37] "repDiversity"
[38] "repExplore"
[39] "repLoad"
[40] "repOverlap"
[41] "repOverlapAnalysis"
[42] "repSample"
[43] "repSave"
[44] "select_barcodes"
[45] "select_clusters"
[46] "spectratype"
[47] "split_to_kmers"
[48] "top"
[49] "trackClonotypes"
[50] "vis"
[51] "vis_bar"
[52] "vis_box"
[53] "vis_circos"
[54] "vis_heatmap"
[55] "vis_heatmap2"
[56] "vis_hist"
[57] "vis_immunr_kmer_profile_main"
[58] "vis_seqlogo"
[59] "vis_textlogo"
也是就是這個R包有59個函數(shù)(功能)曲初,但是我們并不需要全部記住,常用的有:
-
repExplore
- to compute basic statistics, such as number of clones or distributions of lengths and counts. To explore them you need to pass the statistics, e.g. count, to the .method. -
repClonality
- to compute the clonality of repertoires. -
repOverlap
- to compute the repertoire overlap. -
repOverlapAnalysis
- to analyse the repertoire overlap, including different clustering procedures and PCA. -
geneUsage
- to compute the distributions of V or J genes. -
geneUsageAnalysis
- to analyse the distributions of V or J genes, including clustering and PCA. -
repDiversity
- to estimate the diversity of repertoires. -
trackClonotypes
- to analyse the dynamics of repertoires across time points. -
spectratype
- to compute spectratype of clonotypes. -
getKmers
and kmer_profile - to compute distributions of kmers and sequence profiles
在immunarch中統(tǒng)計只是一條命令杯聚,而可視化半條命令就夠了臼婆。每個分析函數(shù)的輸出可以直接傳遞到vis
函數(shù):用于可視化的一般函數(shù)。下面是用法示例幌绍。幾乎所有可視化的分析結(jié)果都支持根據(jù)元數(shù)據(jù)表中的各自屬性或使用用戶提供的屬性對數(shù)據(jù)進行分組颁褂。分組可以通過傳遞.by
參數(shù)或同時傳遞.by
和.meta
參數(shù)給vis
函數(shù)來實現(xiàn)。
library(immunarch) # Load the package into R
data(immdata) # Load the test dataset
immdata$meta
exp_vol <- repExplore(immdata$data, .method = "volume")
p1 <- vis(exp_vol, .by = c("Status"), .meta = immdata$meta)
p2 <- vis(exp_vol, .by = c("Status", "Sex"), .meta = immdata$meta)
p1 + p2
同樣傀广,您可以將.by作為一個字符向量傳遞颁独,它與數(shù)據(jù)中的樣本數(shù)量精確匹配,每個值都應(yīng)該對應(yīng)于樣本的屬性伪冰。它將用于根據(jù)所提供的值對數(shù)據(jù)進行分組誓酒。注意,在這種情況下贮聂,您應(yīng)該將NA傳遞給.meta靠柑。
exp_vol <- repExplore(immdata$data, .method = "volume")
by_vec <- c("C", "C", "C", "C", "C", "C", "MS", "MS", "MS", "MS", "MS", "MS")
p <- vis(exp_vol, .by = by_vec)
p
你要說,哎吓懈,我想提出數(shù)據(jù)自己畫圖歼冰。
exp_vol <- repExplore(immdata$data, .method = "volume")
exp_vol
Sample Volume
A2-i129 A2-i129 6443
A2-i131 A2-i131 6375
A2-i133 A2-i133 6300
A2-i132 A2-i132 6721
A4-i191 A4-i191 5058
A4-i192 A4-i192 5763
MS1 MS1 5301
MS2 MS2 7043
MS3 MS3 6310
MS4 MS4 7313
MS5 MS5 5588
MS6 MS6 7278
如果數(shù)據(jù)是分組的,將執(zhí)行比較組均值的統(tǒng)計檢驗骄瓣,除非提供.test = F停巷。在只有兩組的情況下,采用Wilcoxon秩和檢驗(R函數(shù)wilcox)榕栏。exact = F)參數(shù)進行測試,以測試兩組之間是否存在平均秩值的差異蕾各。當大于兩組時扒磁,采用Kruskal-Wallis檢驗(R函數(shù)kruskar .test),相當于秩次方差分析(ANOVA)式曲,檢驗來自不同組的樣本是否來自同一分布妨托。顯著的Kruskal-Wallis檢驗表明,至少一個樣本隨機地占優(yōu)勢于另一個樣本吝羞。經(jīng)過多次比較調(diào)整后的p值繪制在組的頂部兰伤。p值的調(diào)整使用Holm方法(也稱為Holm- bonferroni校正)。您可以執(zhí)行命令?p.adjust
在R控制臺看到更多钧排。
如果對默認出的圖不滿意敦腔,你可以用fixVis
打開一個shiny窗口,一點一點調(diào)整直到可以發(fā)表恨溜。
# 1. Analyse
exp_len <- repExplore(immdata$data, .method = "len", .col = "aa")
# 2. Visualise
p1 <- vis(exp_len)
# 3. Fix and make publication-ready results
fixVis(p1)
對于基本的探索性分析符衔,比如比較每個指repertoire or distribution的reads/ UMIs數(shù)量找前,可以使用repExplore函數(shù)。
exp_len <- repExplore(immdata$data, .method = "len", .col = "aa")
exp_cnt <- repExplore(immdata$data, .method = "count")
exp_vol <- repExplore(immdata$data, .method = "volume")
p1 <- vis(exp_len)
p2 <- vis(exp_cnt)
p3 <- vis(exp_vol)
p1
p2 + p3
加入分組信息即可得到統(tǒng)計結(jié)果判族。
# You can group samples by their metainformation
p4 <- vis(exp_len, .by = "Status", .meta = immdata$meta)
p5 <- vis(exp_cnt, .by = "Sex", .meta = immdata$meta)
p6 <- vis(exp_vol, .by = c("Status", "Sex"), .meta = immdata$meta)
p4
p5 + p6
評價樣本多樣性的方法之一是評價克隆性(Clonality躺盛,主要區(qū)別于clonotypes)。repClonality 是指最常見或最不常見的克隆性的數(shù)量形帮。有幾種方法來評估克隆性槽惫,讓我們來看看它們。clonal.prop
方法計算細胞克隆池占用曲目的比例:
imm_pr <- repClonality(immdata$data, .method = "clonal.prop")
imm_pr
Clones Percentage Clonal.count.prop
A2-i129 18 10.0 0.0027556644
A2-i131 28 10.0 0.0042728521
A2-i133 9 10.3 0.0014077898
A2-i132 113 10.0 0.0164987589
A4-i191 4 11.5 0.0007773028
A4-i192 8 10.4 0.0013738623
MS1 2 10.1 0.0003700278
MS2 66 10.0 0.0092372288
MS3 2 10.6 0.0003095496
MS4 176 10.0 0.0236336780
MS5 3 13.2 0.0005303164
MS6 150 10.0 0.0202456472
attr(,"class")
[1] "immunr_clonal_prop" "matrix"
第一種方法考慮的是最豐富的細胞克隆型:
imm_top <- repClonality(immdata$data, .method = "top", .head = c(10, 100, 1000, 3000, 10000))
imm_top
10 100 1000 3000 10000
A2-i129 0.08011765 0.17282353 0.3491765 0.5844706 1
A2-i131 0.06670588 0.15647059 0.3467059 0.5820000 1
A2-i133 0.10505882 0.18294118 0.3655294 0.6008235 1
A2-i132 0.02388235 0.09423529 0.3118824 0.5471765 1
A4-i191 0.17176471 0.32047059 0.5122353 0.7475294 1
A4-i192 0.11541176 0.22141176 0.4325882 0.6678824 1
MS1 0.19164706 0.30894118 0.4817647 0.7170588 1
MS2 0.04458824 0.11470588 0.2770588 0.5123529 1
MS3 0.16482353 0.23011765 0.3575294 0.5928235 1
MS4 0.02329412 0.07517647 0.2415294 0.4768235 1
MS5 0.20611765 0.29188235 0.4521176 0.6874118 1
MS6 0.02835294 0.08235294 0.2460000 0.4812941 1
attr(,"class")
[1] "immunr_top_prop" "matrix"
imm_top %>% vis()
而稀有(rare )方法處理的是最不多產(chǎn)的克隆型:
imm_rare <- repClonality(immdata$data, .method = "rare")
imm_rare
imm_rare
1 3 10 30 100 MAX
A2-i129 0.6991765 0.8215294 0.8710588 0.9267059 0.9604706 1
A2-i131 0.6969412 0.8243529 0.8995294 0.9436471 0.9869412 1
A2-i133 0.6800000 0.8100000 0.8690588 0.8974118 0.9357647 1
A2-i132 0.7088235 0.8831765 0.9643529 0.9951765 1.0000000 1
A4-i191 0.5355294 0.6484706 0.7189412 0.7707059 0.8697647 1
A4-i192 0.5976471 0.7487059 0.8342353 0.8675294 0.9156471 1
MS1 0.5780000 0.6692941 0.7438824 0.7929412 0.8571765 1
MS2 0.7788235 0.8891765 0.9322353 0.9718824 1.0000000 1
MS3 0.7272941 0.7825882 0.8071765 0.8385882 0.8652941 1
MS4 0.8109412 0.9343529 0.9725882 1.0000000 1.0000000 1
MS5 0.6112941 0.7001176 0.7575294 0.7809412 0.8284706 1
MS6 0.8107059 0.9208235 0.9703529 0.9897647 1.0000000 1
最后辩撑,用homeo方法評估克隆空間的穩(wěn)態(tài)躯枢,即。槐臀,給定大小的無性系所占曲目的比例
imm_hom <- repClonality(immdata$data,
.method = "homeo",
.clone.types = c(Small = .0001, Medium = .001, Large = .01, Hyperexpanded = 1)
)
imm_hom
Small (0 < X <= 1e-04) Medium (1e-04 < X <= 0.001)
A2-i129 0 0.8634118
A2-i131 0 0.8858824
A2-i133 0 0.8597647
A2-i132 0 0.9542353
A4-i191 0 0.7135294
A4-i192 0 0.8183529
MS1 0 0.7248235
MS2 0 0.9244706
MS3 0 0.8061176
MS4 0 0.9683529
MS5 0 0.7520000
MS6 0 0.9625882
Large (0.001 < X <= 0.01)
A2-i129 0.09705882
A2-i131 0.09011765
A2-i133 0.07600000
A2-i132 0.04576471
A4-i191 0.15623529
A4-i192 0.08611765
MS1 0.12082353
MS2 0.06411765
MS3 0.05917647
MS4 0.03164706
MS5 0.07647059
MS6 0.03741176
Hyperexpanded (0.01 < X <= 1)
A2-i129 0.03952941
A2-i131 0.02400000
A2-i133 0.06423529
A2-i132 0.00000000
A4-i191 0.13023529
A4-i192 0.09552941
MS1 0.15435294
MS2 0.01141176
MS3 0.13470588
MS4 0.00000000
MS5 0.17152941
MS6 0.00000000
attr(,"class")
[1] "immunr_homeo" "matrix"
vis(imm_top) + vis(imm_top, .by = "Status", .meta = immdata$meta)
vis(imm_rare) + vis(imm_rare, .by = "Status", .meta = immdata$meta)
vis(imm_hom) + vis(imm_hom, .by = c("Status", "Sex"), .meta = immdata$meta)
到這里锄蹂,我們已經(jīng)寫完一篇博士論文了。
參考:
https://immunarch.com/articles/web_only/v3_basic_analysis.html