1 Software installation
http://www.reibang.com/p/eb89ab4af035
Software to install on Linux: fastqc, fastp, hisat2, samtools, htseq
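Before running the later steps, it can help to confirm that each tool is on the PATH. The following is a minimal sketch, not part of the original workflow: check_tools is a hypothetical helper name, and it assumes htseq is invoked through its htseq-count executable, as in the counting step.

```shell
# Report which of the pipeline's tools are missing from PATH.
# check_tools is a hypothetical helper, not part of any of these packages.
check_tools() {
    local missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Non-zero exit status if at least one tool was not found
    return "$missing"
}

# htseq installs its counting program as htseq-count
check_tools fastqc fastp hisat2 samtools htseq-count \
    || echo "install the missing tools before continuing"
```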
2 Download the genome sequence and genome annotation files
Aspergillus niger N402 genome:
Ensembl Fungi
or NCBI:
Aspergillus niger (ID 429) - Genome - NCBI (nih.gov)
wget -c https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/248/155/GCA_900248155.1_Aniger_ATCC_64974_N402/GCA_900248155.1_Aniger_ATCC_64974_N402_genomic.fna.gz
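Build the index files (decompress the downloaded FASTA first, since hisat2-build is given the .fna file)
gunzip GCA_900248155.1_Aniger_ATCC_64974_N402_genomic.fna.gz
hisat2-build -p 3 GCA_900248155.1_Aniger_ATCC_64974_N402_genomic.fna genome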
Download the genome annotation file
wget -c https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/248/155/GCA_900248155.1_Aniger_ATCC_64974_N402/GCA_900248155.1_Aniger_ATCC_64974_N402_genomic.gff.gz
Filter the raw reads
mkdir -p fastp
ls *.fastq.gz|while read id;
do
# -5 and -3 are on/off switches in fastp; the mean-quality threshold is set with -M
fastp -5 -3 -M 20 -i "$id" -o "${id%%.*}.clean.fq.gz" \
-h ./fastp/${id%%.*}.html -j ./fastp/${id%%.*}.json;
done
Alignment
ls *clean.fq.gz|while read id;
do
hisat2 -t -p 3 -x /media/lzx/0000678400004823/Indexs/Hisat2/Aspergillus_niger/Aspergillus_niger \
-U $id \
2>${id%%.*}.hisat2.log \
|samtools sort -@ 3 -o ${id%%.*}_ht2p.bam
done
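HISAT2 writes its alignment summary to stderr, which the loop above redirects into one .hisat2.log file per sample. A quick way to compare samples is to pull the "overall alignment rate" line out of each log. The sketch below is only an illustration: alignment_rates is a hypothetical helper name, and it assumes the logs sit in the current directory.

```shell
# Print "<log file><TAB><overall alignment rate>" for each HISAT2 log given.
# alignment_rates is a hypothetical helper name.
alignment_rates() {
    for log in "$@"; do
        # Skip arguments that are not existing files (e.g. an unmatched glob)
        [ -f "$log" ] || continue
        rate=$(grep 'overall alignment rate' "$log" | awk '{print $1}')
        printf '%s\t%s\n' "$log" "$rate"
    done
}

alignment_rates *.hisat2.log
```

A sample whose rate is much lower than the others is worth checking before counting, for example for a wrong index or contaminated reads.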
Counting
mkdir -p htseq
ls *.bam |while read id;
do
htseq-count -f bam -s no -t gene -i Dbxref $id /media/lzx/0000678400004823/Gtf_gff/Aspergillus_niger/GCF_000002855.3_ASM285v2_genomic.gff \
1>./htseq/${id%_*}.txt 2>./htseq/${id%_*}.HTseq.log
done
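Each file in ./htseq holds two tab-separated columns (feature ID and count), followed by htseq-count's summary rows whose names start with "__" (such as __no_feature). For downstream analysis the per-sample files are usually combined into one matrix. The following is a minimal sketch under stated assumptions: merge_counts is a hypothetical helper, and every input file is assumed to come from the same GFF so that the feature IDs match.

```shell
# Merge per-sample htseq-count tables into one tab-separated matrix.
# merge_counts is a hypothetical helper; it assumes all inputs were counted
# against the same annotation, so they list the same feature IDs.
# Usage: merge_counts ./htseq/*.txt > count_matrix.tsv
merge_counts() {
    awk -F'\t' '
        FNR == 1 {                          # first line of each input file
            nfiles++
            name = FILENAME
            sub(/.*\//, "", name)           # drop the directory part
            sub(/\.txt$/, "", name)         # drop the .txt extension
            header = header "\t" name
        }
        /^__/ { next }                      # skip htseq-count summary rows
        {
            if (!($1 in seen)) { seen[$1] = 1; order[++n] = $1 }
            count[$1, nfiles] = $2
        }
        END {
            print "gene" header
            for (i = 1; i <= n; i++) {
                line = order[i]
                for (f = 1; f <= nfiles; f++) line = line "\t" count[order[i], f]
                print line
            }
        }
    ' "$@"
}
```

The output is written with a .tsv extension in the usage line so that a second run of ./htseq/*.txt does not pick up the matrix itself as an input.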
Download the ID conversion file:
Aspergillus niger (ID 429) - Genome - NCBI (nih.gov)