Commonly used reference genomes come from three main sources:
- Ensembl: ftp://ftp.ensembl.org/pub/
- UCSC: http://hgdownload.cse.ucsc.edu/downloads.html
- NCBI: ftp://ftp.ncbi.nih.gov/genomes/
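In general, the files you need to download are the FASTA sequence and the GTF annotation file.
The most commonly used genomes can be downloaded from Ensembl: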
hg38 (human; the assembly is named GRCh38 at Ensembl)
genome, FASTA format
wget ftp://ftp.ensembl.org/pub/release-99/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.toplevel.fa.gz
annotation, GTF format
wget ftp://ftp.ensembl.org/pub/release-99/gtf/homo_sapiens/Homo_sapiens.GRCh38.99.chr.gtf.gz
mus_musculus
genome
wget ftp://ftp.ensembl.org/pub/release-99/fasta/mus_musculus/dna/Mus_musculus.GRCm38.dna.toplevel.fa.gz
gtf
wget ftp://ftp.ensembl.org/pub/release-99/gtf/mus_musculus/Mus_musculus.GRCm38.99.chr.gtf.gz
caenorhabditis_elegans
genome
wget ftp://ftp.ensembl.org/pub/release-99/fasta/caenorhabditis_elegans/dna/Caenorhabditis_elegans.WBcel235.dna.toplevel.fa.gz
gtf
wget ftp://ftp.ensembl.org/pub/release-99/gtf/caenorhabditis_elegans/Caenorhabditis_elegans.WBcel235.99.gtf.gz
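The commands above all follow the same URL pattern, so they can be generated rather than typed by hand. The following is a minimal sketch, assuming the release-99 layout shown above also holds for the release and species you need. The function names `ensembl_fasta_url` and `ensembl_gtf_url` are hypothetical helpers written for this note, not part of any Ensembl tool.

```shell
# Hypothetical helpers: build Ensembl FTP URLs from the pattern used above.
# Assumption: the directory layout matches the release-99 URLs in this note.

# $1 = release number, $2 = species directory, $3 = file prefix (Species.Assembly)
ensembl_fasta_url() {
    printf 'ftp://ftp.ensembl.org/pub/release-%s/fasta/%s/dna/%s.dna.toplevel.fa.gz\n' \
        "$1" "$2" "$3"
}

# $4 = optional file suffix; defaults to "gtf", pass "chr.gtf" for the
# chromosome-only annotation used for human and mouse above.
ensembl_gtf_url() {
    printf 'ftp://ftp.ensembl.org/pub/release-%s/gtf/%s/%s.%s.%s.gz\n' \
        "$1" "$2" "$3" "$1" "${4:-gtf}"
}

# Example usage (commented out so that sourcing this file downloads nothing):
# wget "$(ensembl_fasta_url 99 mus_musculus Mus_musculus.GRCm38)"
# wget "$(ensembl_gtf_url 99 mus_musculus Mus_musculus.GRCm38 chr.gtf)"
```

Keeping the release number as a single argument means the FASTA and GTF files always come from the same release, which avoids mixing an annotation with a genome build it was not made for.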