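A piece of software that was still being updated in 2023: https://www.genome.med.kyoto-u.ac.jp/HLA-HD/
The official documentation is very clear, and I ran into no problems during installation or use, so this post is just a few notes.
Installation
bowtie2 must be installed beforehand:
sudo apt install bowtie2
Download the installation package, extract it, and run: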
sh install.sh
Installation is complete. Then add the bin directory to PATH:
export PATH=$PATH:/path_to_HLA-HD_install_directory/bin
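Before a run, it can help to confirm that the required commands actually resolve on PATH. The following is a minimal sketch; the helper name check_tools is hypothetical and not part of HLA-HD.

```shell
# check_tools: hypothetical helper (not part of HLA-HD).
# Prints each command that cannot be found on PATH to stderr
# and returns non-zero if any of them is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call (commented out so the snippet has no side effects):
# check_tools bowtie2 hlahd.sh
```

If check_tools bowtie2 hlahd.sh reports nothing, both the aligner and the HLA-HD entry script are reachable from the current shell.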
Update the dictionary
sh update.dictionary.sh
This takes a while, a few minutes.
Running
ulimit -Sa
ulimit -n 1024
ulimit -n sets the maximum number of open file descriptors. If your machine has plenty of memory, this value can be set higher; with a low value, multithreaded runs sometimes fail with errors.
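The soft limit cannot be raised beyond the hard limit, so it may be worth capping the requested value first. This is a sketch under that assumption; both function names are hypothetical, and the target value is whatever you choose.

```shell
# pick_nofile_limit: hypothetical helper. Prints the smaller of the
# requested value and the hard limit ("unlimited" means no cap).
pick_nofile_limit() {
    want=$1
    hard=$2
    if [ "$hard" = "unlimited" ] || [ "$want" -le "$hard" ]; then
        echo "$want"
    else
        echo "$hard"
    fi
}

# raise_nofile_limit: hypothetical helper. Raises the soft open-file
# limit of the current shell to the target (capped at the hard limit),
# leaves it alone if it is already at least that high, and prints the
# value now in effect.
raise_nofile_limit() {
    target=$1
    soft=$(ulimit -S -n)
    hard=$(ulimit -H -n)
    new=$(pick_nofile_limit "$target" "$hard")
    if [ "$soft" != "unlimited" ] && [ "$soft" -lt "$new" ]; then
        ulimit -S -n "$new"
    fi
    ulimit -S -n
}
```

Calling raise_nofile_limit 4096 in the same shell that later starts hlahd.sh applies the new limit to that run; 4096 here is only an illustrative number.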
The official documentation says fastq.gz files must be decompressed first, but in practice compressed files run as well!
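If you would rather follow the official advice and pass uncompressed files, a decompressed copy can be made while keeping the original. A minimal sketch, with a hypothetical helper name plain_fastq:

```shell
# plain_fastq: hypothetical helper. If the given path ends in .gz,
# write a decompressed copy next to it (the .gz file is kept) and
# print the new path; otherwise print the path unchanged.
plain_fastq() {
    src=$1
    case "$src" in
        *.gz)
            out=${src%.gz}
            gzip -dc "$src" > "$out" || return 1
            echo "$out"
            ;;
        *)
            echo "$src"
            ;;
    esac
}
```

For example, fq1=$(plain_fastq data/sample_1.fastq.gz) leaves data/sample_1.fastq.gz in place and sets fq1 to data/sample_1.fastq; the file names are illustrative.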
Example: fastq input
hlahd.sh -t [thread_num] -m [minimum length of reads] \
-c [trimming rate] \
-f [path_to freq_data directory] \
fastq_1 fastq_2 \
gene_split_filt path_to_dictionary_directory \
IDNAME[any name] output_directory
hlahd.sh -t 2 -m 100 -c 0.95 -f freq_data/ \
data/sample_1.fastq data/sample_2.fastq \
HLA_gene.split.txt dictionary/ \
sampleID estimation
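With several samples, the same command can be generated per read pair. The sketch below only prints the commands (a dry run) and assumes files are named <id>_1.fastq and <id>_2.fastq in one directory; the helper name print_hlahd_cmds is hypothetical, and the option values and paths are copied from the example above.

```shell
# print_hlahd_cmds: hypothetical helper. For every *_1.fastq in a
# directory that has a matching *_2.fastq, print (not run) an hlahd.sh
# command using the same options as the single-sample example.
print_hlahd_cmds() {
    dir=$1
    for fq1 in "$dir"/*_1.fastq; do
        [ -e "$fq1" ] || continue            # no match: skip the literal pattern
        fq2=${fq1%_1.fastq}_2.fastq          # mate file name
        id=$(basename "$fq1" _1.fastq)       # sample ID from the file name
        if [ ! -e "$fq2" ]; then
            echo "skipping $id: missing $fq2" >&2
            continue
        fi
        echo "hlahd.sh -t 2 -m 100 -c 0.95 -f freq_data/ $fq1 $fq2 HLA_gene.split.txt dictionary/ $id estimation"
    done
}
```

Running print_hlahd_cmds data lets you review the command lines first; once they look right, they can be piped to sh. Paths containing spaces would need extra quoting, which this sketch does not handle.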
If you have bam files:
Using bam files mapped to human genome
If you have mapped result to human genome, you can create fastq of mhc region and unmapped reads by using samtools and picard tools as follows:
#Extract MHC region
:for GRCh38.p12
>samtools view -h -b sample.hgmap.sorted.bam chr6:28,510,120-33,480,577 > sample.mhc.bam
:for GRCh37
>samtools view -h -b sample.hgmap.sorted.bam chr6:28,477,797-33,448,354 > sample.mhc.bam
#Extract unmap reads
>samtools view -b -f 4 sample.hgmap.sorted.bam > sample.unmap.bam
#Merge bam files
>samtools merge -o sample.merge.bam sample.unmap.bam sample.mhc.bam
#Create fastq
>java -jar picard.jar SamToFastq I=sample.merge.bam F=sample.hlatmp.1.fastq F2=sample.hlatmp.2.fastq
#Change fastq ID
>cat sample.hlatmp.1.fastq |awk '{if(NR%4 == 1){O=$0;gsub("/1"," 1",O);print O}else{print $0}}' > sample.hla.1.fastq
>cat sample.hlatmp.2.fastq |awk '{if(NR%4 == 1){O=$0;gsub("/2"," 2",O);print O}else{print $0}}' > sample.hla.2.fastq
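The read-ID step rewrites only the header line of each four-line fastq record, turning a trailing /1 or /2 into a space followed by the mate number. The sketch below wraps the same awk logic in a function (the name fix_read_id is hypothetical) and tries it on a toy record.

```shell
# fix_read_id: hypothetical wrapper around the awk step, with the mate
# number (1 or 2) passed as an argument. Only header lines
# (NR % 4 == 1) are rewritten; all other lines pass through unchanged.
fix_read_id() {
    mate=$1
    awk -v m="$mate" '{
        if (NR % 4 == 1) { O = $0; gsub("/" m, " " m, O); print O }
        else { print $0 }
    }'
}

# Toy record: the header changes, the quality line containing "/1" does not.
printf '@read1/1\nACGT\n+\nII/1\n' | fix_read_id 1
```

On real data the equivalent call would be fix_read_id 1 < sample.hlatmp.1.fastq > sample.hla.1.fastq, and likewise with 2 for the second mate.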
Other HLA typing tools:
HLA typing tool for 10X single-cell RNA-seq data: scHLAcount
For WES data, types only HLA-A, -B, and -C: OptiType
Covers a larger number of genes: HLA_scan