進(jìn)到align目錄
對(duì)質(zhì)量好的測(cè)序數(shù)據(jù)進(jìn)行比對(duì)
1. 一個(gè)個(gè)比對(duì)猖腕,生成BAM文件
align目錄
sample=SRR7696207
bwa mem -t 2 -R "@RG\tID:$sample\tSM:$sample\tLB:WGS\tPL:Illumina" ../hg38/bwa_index/gatk_hg38 ../clean/SRR7696207_1_val_1.fq.gz ../clean/SRR7696207_2_val_2.fq.gz |samtools sort -@ 2 -o SRR7696207.bam -
不用-R參數(shù)也可以執(zhí)行拆祈,但后面gatk的時(shí)候會(huì)報(bào)錯(cuò)
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 143150 sequences (20000163 bp)...
[M::process] read 142658 sequences (20000278 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (1, 61056, 1, 1)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (135, 165, 207)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 351)
[M::mem_pestat] mean and std.dev: (174.05, 52.67)
......
2或者循環(huán)批量比對(duì)
#clean目錄
ls *1.fastq.gz > 1
ls *2.fastq.gz > 2
paste 1 2 > config
vim config
增加第一列文件名,記得不能空格倘感,要Tab分隔
如果樣本量很多就用腳本放坏,具體見(jiàn)大樣本分析那篇
align目錄下
INDEX=../hg38/bwa_index/gatk_hg38
cat ../clean/config|while read id
do
arr=($id)
sample=${arr[0]}
fq1=${arr[1]}
fq2=${arr[2]}
bwa mem -t 5 -R "@RG\tID:$sample\tSM:$sample\tLB:WGS\tPL:Illumina" $INDEX ../clean/$fq1 ../clean/$fq2 |samtools sort -@ 2 -o $sample.bam -
done &
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 142876 sequences (20000122 bp)...
[M::process] read 142628 sequences (20000141 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 61992, 1, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (137, 174, 219)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 383)
[M::mem_pestat] mean and std.dev: (181.21, 59.08)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 465)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 142876 reads in 24.094 CPU sec, 11.833 real sec
3 查看bam文件
$ samtools view -H SRR8517853.bam |grep -v "SQ"
@HD VN:1.6 SO:coordinate
@RG ID:SRR8517853/tSM:SRR8517853 LB:WGS PL:Illumina
@PG ID:bwa PN:bwa VN:0.7.17-r1188 CL:bwa mem -t 2 -R @RG\tID:SRR8517853/tSM:SRR8517853\tLB:WGS\tPL:Illumina ../hg38/bwa_index/gatk_hg38 ../clean/SRR8517853_1_val_1.fq.gz ../clean/SRR8517853_2_val_2.fq.gz