Trinity had always run without trouble, then suddenly threw an error I had never seen before. Stranger still, only some samples failed; others ran normally.
The full error message is as follows:
Converting input files. (in parallel)Tuesday, April 4, 2023: 10:59:14 CMD: gunzip -c /home/jjp/Project/trans_119/test/Unknown_BD459-02T0001_1.clean.fq.gz | fastool --illumina-trinity --to-fasta >> left.fa 2> /home/jjp/Project/trans_119/test/Unknown_BD459-02T0001_1.clean.fq.gz.readcount
Tuesday, April 4, 2023: 10:59:14 CMD: gunzip -c /home/jjp/Project/trans_119/test/Unknown_BD459-02T0001_2.clean.fq.gz | fastool --illumina-trinity --to-fasta >> right.fa 2> /home/jjp/Project/trans_119/test/Unknown_BD459-02T0001_2.clean.fq.gz.readcount
Thread 1 terminated abnormally: Error, cmd: gunzip -c /home/jjp/Project/trans_119/test/Unknown_BD459-02T0001_1.clean.fq.gz | fastool --illumina-trinity --to-fasta >> left.fa 2> /home/jjp/Project/trans_119/test/Unknown_BD459-02T0001_1.clean.fq.gz.readcount died with ret 256 at /home/jjp/Software/miniconda3/bin/Trinity line 2183.
Thread 2 terminated abnormally: Error, counts of reads in FQ: 20882642 (as per gunzip -c /home/jjp/Project/trans_119/test/Unknown_BD459-02T0001_2.clean.fq.gz | wc -l) doesn't match fastool's report of FA records: 4497094 at /home/jjp/Software/miniconda3/bin/Trinity line 3060 thread 2.
main::ensure_complete_FQtoFA_conversion("gunzip -c /home/jjp/Project/trans_119/test/Unknown_BD459-02T0"..., "/home/jjp/Project/trans_119/test/Unknown_BD459-02T0001_2.clea"...) called at /home/jjp/Software/miniconda3/bin/Trinity line 2099 thread 2
main::prep_seqs(ARRAY(0x55898fddbc28), "fq", "right", undef) called at /home/jjp/Software/miniconda3/bin/Trinity line 1313 thread 2
eval {...} called at /home/jjp/Software/miniconda3/bin/Trinity line 1313 thread 2
Trinity run failed. Must investigate error above.
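At first glance, the message points to the step where gunzip is piped into fastool.
So I checked the two commands, gunzip and fastool, and both run normally. The problem is therefore not in how these two programs are invoked.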
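One quick way to confirm that both programs are reachable from the shell is to look them up on PATH. This is only a minimal sketch; `have_tool` is a hypothetical helper name, not something Trinity provides.

```shell
# have_tool (hypothetical helper): report whether a command is on PATH.
have_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check the two programs used in the failing pipeline.
for tool in gunzip fastool; do
    have_tool "$tool" || true   # keep going so both results are printed
done
```

A "found" line only shows that the command exists, not that it handles a given input correctly.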
Since some files succeed and others fail, the files themselves might be at fault, so I checked the MD5 checksum of every data file. That check turned up no problems either.
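An MD5 match only shows that a file is identical to the copy the checksum was computed from. As a complementary check, `gzip -t` tests whether an archive decompresses cleanly. The sketch below assumes the files sit in the current directory and follow the `*.clean.fq.gz` naming used in this post; `check_gz` is a hypothetical helper name.

```shell
# check_gz (hypothetical helper): test whether a gzip archive decompresses cleanly.
# Prints "OK <file>" or "BAD <file>"; returns non-zero for a bad archive.
check_gz() {
    if gzip -t "$1" 2>/dev/null; then
        echo "OK $1"
    else
        echo "BAD $1"
        return 1
    fi
}

# Check every cleaned FASTQ archive in the current directory.
for f in ./*.clean.fq.gz; do
    [ -e "$f" ] || continue     # skip if the glob matched nothing
    check_gz "$f" || true       # keep going after a bad file
done
```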
Next I looked at the commands themselves. The error occurs when Trinity executes the following two commands:
gunzip -c /home/jjp/Project/trans_119/test/Unknown_BD459-02T0001_2.clean.fq.gz | fastool --illumina-trinity --to-fasta >> right.fa 2> /home/jjp/Project/trans_119/test/Unknown_BD459-02T0001_2.clean.fq.gz.readcount
gunzip -c /home/jjp/Project/trans_119/test/Unknown_BD459-02T0001_1.clean.fq.gz | fastool --illumina-trinity --to-fasta >> left.fa 2> /home/jjp/Project/trans_119/test/Unknown_BD459-02T0001_1.clean.fq.gz.readcount
So I ran these two commands on their own and found that they completed successfully. Once they had finished, I ran the Trinity script again.
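The error in the log is a mismatch between the read count of the FASTQ file and the number of FASTA records fastool reported, so it seems worth repeating that comparison after a manual conversion. The sketch below assumes four-line FASTQ records and FASTA headers that start with `>`; the function names are hypothetical.

```shell
# count_fq_reads: number of reads in a gzipped FASTQ (assumes 4 lines per record).
count_fq_reads() {
    echo $(( $(gunzip -c "$1" | wc -l) / 4 ))
}

# count_fa_records: number of FASTA records (lines starting with ">").
# grep -c exits non-zero when the count is 0, hence "|| true".
count_fa_records() {
    grep -c '^>' "$1" || true
}

# compare_counts FASTQ_GZ FASTA: report whether the two counts agree.
compare_counts() {
    fq=$(count_fq_reads "$1")
    fa=$(count_fa_records "$2")
    if [ "$fq" -eq "$fa" ]; then
        echo "match: $fq reads"
    else
        echo "mismatch: FASTQ=$fq FASTA=$fa"
        return 1
    fi
}

# Usage (paths are placeholders):
#   compare_counts sample_1.clean.fq.gz left.fa
#   compare_counts sample_2.clean.fq.gz right.fa
```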
----------------------------------------------------------------------------------
-------------- Trinity Phase 1: Clustering of RNA-Seq Reads ---------------------
----------------------------------------------------------------------------------
Converting input files. (in parallel)Tuesday, April 4, 2023: 11:05:31 CMD: touch left.fa.ok right.fa.ok
Tuesday, April 4, 2023: 11:05:31 CMD: cat left.fa right.fa > both.fa
Tuesday, April 4, 2023: 11:05:32 CMD: touch both.fa.ok
-------------------------------------------
----------- Jellyfish --------------------
-- (building a k-mer catalog from reads) --
-------------------------------------------
* Running CMD: jellyfish count -t 40 -m 25 -s 61096915194 --canonical both.fa
* Running CMD: jellyfish dump -L 1 mer_counts.jf > jellyfish.kmers.fa
* Running CMD: jellyfish histo -t 40 -o jellyfish.kmers.fa.histo mer_counts.jf
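As the log shows, the run now proceeds normally: in Trinity Phase 1 the conversion step is skipped and Trinity goes straight to CMD: cat left.fa right.fa > both.fa.
That essentially solves the problem. To summarize: the error occurs when Trinity itself calls gunzip and fastool, so do that step by hand beforehand to generate left.fa and right.fa, create the default trinity_out_dir folder in advance, and put the two files in it. (Alternatively, create a separate folder with a different name.)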
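Before launching Trinity it may help to confirm that both converted files are actually in place. This is a small sketch, with `fasta_ready` as a hypothetical helper name; it only checks that the files exist and are non-empty, not that their contents are complete.

```shell
# fasta_ready DIR (hypothetical helper): succeed only if DIR holds
# non-empty left.fa and right.fa.
fasta_ready() {
    for f in "$1/left.fa" "$1/right.fa"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f"
            return 1
        fi
    done
    echo "ready: $1"
}

# Usage (directory name as in the text above):
#   fasta_ready trinity_out_dir
```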
Final code
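# Mem (memory limit in GB) and thread (CPU count) must be set before this loop runs.
: "${Mem:?set Mem to the memory limit in GB}"
: "${thread:?set thread to the number of CPUs}"

for fn in *_1.clean.fq.gz
do
    # Strip the "_1.clean.fq.gz" suffix to get the sample name.
    sample=${fn%_1.clean*}
    left_all=${sample}_1.clean.fq.gz
    right_all=${sample}_2.clean.fq.gz
    # -p: do not fail if the directory already exists.
    mkdir -p "${sample}_trinity"
    # Convert FASTQ to FASTA by hand with the same commands Trinity runs.
    # ">>" appends, so remove any stale left.fa/right.fa before rerunning.
    gunzip -c "./${left_all}" | fastool --illumina-trinity --to-fasta >> "./${sample}_trinity/left.fa" 2> "./${left_all}.readcount"
    gunzip -c "./${right_all}" | fastool --illumina-trinity --to-fasta >> "./${sample}_trinity/right.fa" 2> "./${right_all}.readcount"
    # Trinity finds left.fa and right.fa in the output directory and skips the conversion.
    Trinity \
        --seqType fq \
        --max_memory "${Mem}G" \
        --left "$left_all" \
        --right "$right_all" \
        --CPU "$thread" \
        --output "${sample}_trinity"
done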