免疫組庫分析實戰(zhàn)/mixcr+vdjtools+R實現(xiàn)

使用《OncoImmunology》一篇paper中的數(shù)據(jù)進行實戰(zhàn)

數(shù)據(jù)獲取地址：https://www.ncbi.nlm.nih.gov/bioproject/PRJEB33490
關于免疫組庫的背景，MIXCR和VDJTOOLS的使用可以參考之前的文章
使用mixcr構建免疫組庫及下游分析
 使用vdjtools進行免疫組庫分析
直接進入主題

1.下載數(shù)據(jù)

1.SRR_Acc_List.txt （所有測序數(shù)據(jù)的樣本號魁淳，直接下載得到）
2.SraRunTable.txt （樣本的分組信息低散，后面數(shù)據(jù)分析會用到）

#linux
prefetch -O ./rawdata --option-file ./SRR_Acc_List.txt #批量下載數(shù)據(jù)
cat SRR_Acc_List.txt | while read id; do fastq-dump --split-files --gzip -O ./rawdata/ ./rawdata/${id}/${id}.sra;done # sra to fastq
fastqc -o ./qc ./rawdata/*gz #質控

詳細方法參考：RNA-seq轉錄組差異分析及可視化

2.使用MIXCR構建免疫組庫

#只分析TRB
cat SRR_Acc_List.txt|while read name
do 
mixcr analyze amplicon \
-s hsa \
--starting-material dna \
--5-end v-primers \
--3-end c-primers \
--adapters adapters-present \
--only-productive \
--receptor-type trb \
--report ./reports/${name}.report \
./rawdata/${name}_1.fastq.gz ./rawdata/${name}_2.fastq.gz ./results/${name}
done

詳細方法參考：使用mixcr構建免疫組庫及下游分析

3.VDJTOOLS分析

這一步需要用到之前下載的SraRunTable.txt蜒程，處理得到想要的分組文件级零。

分組文件

前兩列必須為“file_name”,"sample_id"，后面列可根據(jù)需求進行分組
第一列是MIXCR恢復出來數(shù)據(jù)的路徑囤踩，第二列為樣本名稱

vdjtools=~/project/biosoft/vdjtools/vdjtools-1.2.1/vdjtools-1.2.1.jar
mkdir ./results/TRB/VDJconvert/
mkdir ./results/vdjtools-results/
java -jar $vdjtools Convert -S mixcr -m ./results/TRB/metadata.txt ./results/TRB/VDJconvert/ 
java -jar $vdjtools CalcBasicStats -m ./results/TRB/VDJconvert/metadata.txt ./results/vdjtools-results/
java -jar $vdjtools CalcSegmentUsage -m ./results/TRB/VDJconvert/metadata.txt ./results/vdjtools-results/
java -jar $vdjtools RarefactionPlot -m ./results/TRB/VDJconvert/metadata.txt ./results/vdjtools-results/
java -jar $vdjtools CalcDiversityStats -m ./results/TRB/VDJconvert/metadata.txt ./results/vdjtools-results/

轉換格式后在目標文件夾會生成一個新的metadata.txt珍特，后續(xù)分析使用此分組文件祝峻。
共得到6個結果文件分別為basicstats.txt、diversity.strict.exact.txt扎筒、diversity.strict.resampled.txt呼猪、rarefaction.strict.txt、segments.wt.J.txt砸琅、segments.wt.V.txt。

4.使用R包進行可視化處理

4.1 clonality diversity richness

inputpath <- "./results/vdjtools-results/"
outputpath <- "./R-results/"
library(ggplot2)
library(reshape2)
library(ggpubr)
df <- read.csv(paste0(inputpath,"diversity.strict.exact.txt"), sep = "\t",stringsAsFactors = FALSE)
df$real_diversity <- log(df$shannonWienerIndex_mean) 
df$clonality <- 1 - df$normalizedShannonWienerIndex_mean 
work_df <- df
sample_name <- work_df$sample_id
Group <- work_df$sample_type
richness<- work_df$observedDiversity_mean
diversity <- work_df$real_diversity
clonality <- work_df$clonality
work_df <- data.frame(Group,richness,diversity,clonality)
work_df$Group <- factor(work_df$Group)  
work_res<- melt(work_df, id.vars = "Group")
work_richness <- work_res[which(work_res$variable=="richness"),]
work_diversity <- work_res[which(work_res$variable=="diversity"),]
work_clonality <- work_res[which(work_res$variable=="clonality"),]
p1 <- ggboxplot(work_diversity, x = "Group", y = "value",
                legend = "right",color = "black", palette = "jama",
                add = "jitter",add.params=list(color = "Group",size=3),
                title = "Diversity",  xlab = "",size = 0.5,width =0.5,
                ylab = "Shannon Diversity Index of \n TCR repertoire") +
  theme(axis.text.x =element_text(size=13), axis.title.y=element_text(size=16,face="bold"))
p1 + stat_compare_means(aes(group = work_df$Group), 
                        label.x = 2,label = "p.signif",
                        method = 'wilcox.test',hide.ns = TRUE)
ggsave(paste0(outputpath,"Diversity.png"))
dev.off()
p2 <- ggboxplot(work_richness, x = "Group", y = "value",
                legend = "right",color = "black", palette = "jama",
                add = "jitter",add.params=list(color = "Group",size=3),
                title = "Richness",  xlab = "",size = 0.5,width =0.5,
                ylab = "Richness of \n TCR repertoire")+
  theme(axis.text.x =element_text(size=13), axis.title.y=element_text(size=16,face="bold"))
p2 + stat_compare_means(aes(group =work_df$Group), 
                        label.x = 2,label = "p.signif",
                        method = 'wilcox.test',hide.ns = TRUE)
ggsave(paste0(outputpath,"Richness.png"))
dev.off()
p3 <- ggboxplot(work_clonality, x = "Group", y = "value",
                legend = "right",color = "black", palette = "jama",
                add = "jitter",add.params=list(color = "Group",size=3),
                title = "Clonality",  xlab = "",size = 0.5,width =0.5,
                ylab = "Clonality Index of \n TCR repertoire")+
  theme(axis.text.x =element_text(size=13), axis.title.y=element_text(size=16,face="bold"))
p3 + stat_compare_means(aes(group =work_df$Group),  
                        label = "p.signif",label.x = 2,
                        method = 'wilcox.test',hide.ns = TRUE)
ggsave(paste0(outputpath,"Clonality.png"))
dev.off()

Clonality

Diversity

Richness

4.2Gene Usage熱圖繪制

inputpath <- "./results/vdjtools-results/"
outputpath <- "./R-results/"
library(pheatmap)
file_input <- c(paste0(inputpath,"segments.wt.V.txt"),
                paste0(inputpath,"segments.wt.J.txt"))
file_output <- c(paste0(outputpath,"V_heatmap.png"),paste0(outputpath,"J_heatmap.png")) 
for (i in 1:length(file_input)){
  df <- read.csv(file_input[i],sep="\t",row.names = 1)
  annotation_col <- data.frame(df$sample_type)
  colnames(annotation_col) <- "Type"
  rownames(annotation_col) <- rownames(df)
  work_df <- df
  work_df <- work_df[,-1:-2] # check this and change
  work_df <- t(work_df)
  work_df <- work_df[rowSums(work_df)!=0,]
  pheatmap(work_df,filename = file_output[i],
           annotation_col = annotation_col,
           cluster_cols = FALSE, #height = 10,
           display_numbers = FALSE,labels_col ="" )# check this and change
}

V_heatmap

J_heatmap

4.3 VJ基因的差異分析及火山圖繪制

mymetadata <- read.csv("./results/TRB/VDJconvert/metadata.txt",sep="\t",stringsAsFactors = F)
sample_name <- mymetadata$sample_id
vgene_matrix <- function(df){
  vgene_name <- c()
  for (i in 1:nrow(df)){
    if (df[i,"v"] %in% names(vgene_name)){
      vgene_name[df[i,"v"]] <- vgene_name[df[i,"v"]] + df[i,"count"]
    } else{
      vgene_name[df[i,"v"]] <- df[i,"count"]
    }
  }
  return(vgene_name)
}
jgene_matrix <- function(df){
  jgene_name <- c()
  for (i in 1:nrow(df)){
    if (df[i,"j"] %in% names(jgene_name)){
      jgene_name[df[i,"j"]] <- jgene_name[df[i,"j"]] + df[i,"count"]
    } else{
      jgene_name[df[i,"j"]] <- df[i,"count"]
    }
  }
  return(jgene_name)
}
VGeneMatrix <- data.frame()
JGeneMatrix <- data.frame()
matrix_V <- function(df,V,sample_name){
  for (i in 1:nrow(df)){
    if (df[i,"v"] %in% colnames(V)){
      V[is.na(V)] <- 0
      V[sample_name[n],df[i,"v"]] <- V[sample_name[n],df[i,"v"]] + df[i,"count"]
    } else{
      V[sample_name[n],df[i,"v"]] <- df[i,"count"]
    }
  }
  return(V)
}
matrix_J <- function(df,J,sample_name){
  for (i in 1:nrow(df)){
    if (df[i,"j"] %in% colnames(J)){
      J[is.na(J)] <- 0
      J[sample_name[n],df[i,"j"]] <- J[sample_name[n],df[i,"j"]] + df[i,"count"]
    } else{
      J[sample_name[n],df[i,"j"]] <- df[i,"count"]
    }
  }
  return(J)
}

for (n in 1:length(sample_name)){
  df <- read.csv(paste0("./results/TRB/VDJconvert/",sample_name[n],".txt"),sep="\t",stringsAsFactors = F)
  # assign(paste0(sample_name[n],"_Vgene"),vgene_matrix(df))
  # assign(paste0(sample_name[n],"_Jgene"),jgene_matrix(df)) 每個樣本的表達情況
  VGeneMatrix <- matrix_V(df,VGeneMatrix,sample_name)
  JGeneMatrix <- matrix_J(df,JGeneMatrix,sample_name)
}
write.table(VGeneMatrix,file = paste0(outputpath,"matrix_V.txt"),sep="\t",quote = F)
write.table(JGeneMatrix,file = paste0(outputpath,"matrix_J.txt"),sep="\t",quote = F)

#########DESeq2#######

library(DESeq2)
library(ggpubr)
library(ggthemes)
mycounts <- t(JGeneMatrix)
mycounts <- mycounts[rowSums(mycounts)!=0,]
mymetas <- mymetadata[,2:3]
colnames(mycounts) == mymetas$sample_id
mymetas$sample_type <- factor(mymetas$sample_type)

dds <- DESeqDataSetFromMatrix(countData = mycounts, colData = mymetas,desig=~sample_type)
dds <- DESeq(dds)
res <- results(dds,contrast = c("sample_type","healthy","hematological_cancer"))
res <- data.frame(res)
DEG <- res[order(res$padj),] 


DEG$change =as.factor(ifelse(DEG$pvalue < 0.05 & DEG$log2FoldChange > 0.5,'UP',
                             
                             ifelse(DEG$pvalue < 0.05 & DEG$log2FoldChange < -0.5,'DOWN',
                                    
                                    "NOT")))
DEG$logp = -log10(DEG$pvalue)


table(DEG$change)

DEG <- DEG[order(DEG$pvalue),]
up_genes <- head(row.names(DEG)[which(DEG$change == "UP")],10)
down_genes <- head(row.names(DEG)[which(DEG$change =="DOWN")],10)
DEG_top <- c(as.character(up_genes),as.character(down_genes))
DEG$label  <- ""
DEG$label[match(DEG_top,row.names(DEG))] <- DEG_top

ggscatter(DEG, x = "log2FoldChange", y = "logp",
          color = "change",
          palette = c("#2f5688","#BBBBBB","#CC0000"),
          size = 6,
          label = DEG$label,
          font.label = 10,
          repel = T,
          xlab = "log2FoldChange",
          ylab ="-log10(p value)") + theme_base() +
  geom_hline(yintercept = -log10(0.05),linetype="dashed")+
  geom_vline(xintercept = c(-0.5,0.5),linetype="dashed")

ggsave("./results/J_gene_valcano.png",dpi=600)
dev.off()

V_gene_valcano

J_gene_valcano

最后編輯于：2022.03.14 09:54:46

?著作權歸作者所有,轉載或內(nèi)容合作請聯(lián)系作者

人面猴
序言：七十年代末轴踱，一起剝皮案震驚了整個濱河市症脂，隨后出現(xiàn)的幾起案子，更是在濱河造成了極大的恐慌，老刑警劉巖诱篷，帶你破解...
沈念sama閱讀 217,277評論 6贊 503
死咒
序言：濱河連續(xù)發(fā)生了三起死亡事件壶唤，死亡現(xiàn)場離奇詭異，居然都是意外死亡棕所，警方通過查閱死者的電腦和手機闸盔，發(fā)現(xiàn)死者居然都...
沈念sama閱讀 92,689評論 3贊 393
救了他兩次的神仙讓他今天三更去死
文/潘曉璐我一進店門，熙熙樓的掌柜王于貴愁眉苦臉地迎上來琳省，“玉大人迎吵，你說我怎么就攤上這事≌氡幔” “怎么了击费？”我有些...
開封第一講書人閱讀 163,624評論 0贊 353
道士緝兇錄：失蹤的賣姜人
文/不壞的土叔我叫張陵，是天一觀的道長桦他。經(jīng)常有香客問我蔫巩，道長，這世上最難降的妖魔是什么快压？我笑而不...
開封第一講書人閱讀 58,356評論 1贊 293
?港島之戀（遺憾婚禮）
正文為了忘掉前任圆仔，我火速辦了婚禮，結果婚禮上蔫劣，老公的妹妹穿的比我還像新娘坪郭。我一直安慰自己，他們只是感情好拦宣，可當我...
茶點故事閱讀 67,402評論 6贊 392
惡毒庶女頂嫁案：這布局不是一般人想出來的
文/花漫我一把揭開白布截粗。她就那樣靜靜地躺著，像睡著了一般鸵隧。火紅的嫁衣襯著肌膚如雪绸罗。梳的紋絲不亂的頭發(fā)上，一...
開封第一講書人閱讀 51,292評論 1贊 301
城市分裂傳說
那天豆瘫，我揣著相機與錄音珊蟀，去河邊找鬼。笑死外驱，一個胖子當著我的面吹牛育灸，可吹牛的內(nèi)容都是我干的。我是一名探鬼主播昵宇，決...
沈念sama閱讀 40,135評論 3贊 418
雙鴛鴦連環(huán)套：你想象不到人心有多黑
文/蒼蘭香墨我猛地睜開眼磅崭，長吁一口氣：“原來是場噩夢啊……” “哼！你這毒婦竟也來了瓦哎？” 一聲冷哼從身側響起砸喻，我...
開封第一講書人閱讀 38,992評論 0贊 275
萬榮殺人案實錄
序言：老撾萬榮一對情侶失蹤柔逼，失蹤者是張志新（化名）和其女友劉穎，沒想到半個月后割岛，有當?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體愉适，經(jīng)...
沈念sama閱讀 45,429評論 1贊 314
?護林員之死
正文獨居荒郊野嶺守林人離奇死亡，尸身上長有42處帶血的膿包…… 初始之章·張勛以下內(nèi)容為張勛視角年9月15日...
茶點故事閱讀 37,636評論 3贊 334
?白月光啟示錄
正文我和宋清朗相戀三年癣漆，在試婚紗的時候發(fā)現(xiàn)自己被綠了维咸。大學時的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片。...
茶點故事閱讀 39,785評論 1贊 348
活死人
序言：一個原本活蹦亂跳的男人離奇死亡惠爽，死狀恐怖癌蓖，靈堂內(nèi)的尸體忽然破棺而出，到底是詐尸還是另有隱情疆股，我是刑警寧澤费坊，帶...
沈念sama閱讀 35,492評論 5贊 345
?日本核電站爆炸內(nèi)幕
正文年R本政府宣布，位于F島的核電站旬痹，受9級特大地震影響附井，放射性物質發(fā)生泄漏。R本人自食惡果不足惜两残，卻給世界環(huán)境...
茶點故事閱讀 41,092評論 3贊 328
男人毒藥：我在死后第九天來索命
文/蒙蒙一永毅、第九天我趴在偏房一處隱蔽的房頂上張望。院中可真熱鬧人弓，春花似錦沼死、人聲如沸。這莊子的主人今日做“春日...
開封第一講書人閱讀 31,723評論 0贊 22
一樁弒父案意蛀，背后竟有這般陰謀
文/蒼蘭香墨我抬頭看了看天上的太陽。三九已至健芭，卻和暖如春县钥，著一層夾襖步出監(jiān)牢的瞬間，已是汗流浹背慈迈。一陣腳步聲響...
開封第一講書人閱讀 32,858評論 1贊 269
情欲美人皮
我被黑心中介騙來泰國打工若贮，沒想到剛下飛機就差點兒被人妖公主榨干…… 1. 我叫王不留，地道東北人痒留。一個月前我還...
沈念sama閱讀 47,891評論 2贊 370
代替公主和親
正文我出身青樓谴麦，卻偏偏與公主長得像，于是被迫代替她去往敵國和親伸头。傳聞我的和親對象是個殘疾皇子匾效，可洞房花燭夜當晚...
茶點故事閱讀 44,713評論 2贊 354