Common bioinformatics data downloads, including genomes, GTF, BED, and annotation files

cd ~/reference

mkdir -p genome/hg19 && cd genome/hg19

nohup wget http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz &

## the wget above runs in the background; extract only after it has finished
tar zvfx chromFa.tar.gz

cat *.fa > hg19.fa

rm chr*.fa
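
The merge step above can be sketched as a small function, to be run once the background download and the extraction have finished. This is only an illustration: `merge_chromosomes` is a hypothetical helper name, and it assumes the extracted files are named `chr*.fa`. Matching `chr*.fa` rather than `*.fa` keeps the merged file out of its own input list if the step is rerun.

```shell
# Sketch: merge per-chromosome FASTA files into one genome FASTA.
# merge_chromosomes is a hypothetical helper, not part of any tool.
merge_chromosomes() {
    out="$1"
    # Match only chr*.fa so the output file is never one of its own inputs.
    set -- chr*.fa
    if [ ! -e "$1" ]; then
        echo "no chr*.fa files in $(pwd)" >&2
        return 1
    fi
    cat "$@" > "$out" || return 1
    # Print how many sequences ended up in the merged file.
    grep -c '^>' "$out"
}
```

For example, `merge_chromosomes hg19.fa` in `~/reference/genome/hg19` writes the merged file and prints the number of FASTA records, which is a quick sanity check before deleting the per-chromosome files.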



cd ~/reference

mkdir -p genome/hg38 && cd genome/hg38

nohup wget http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz &
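
The index-building commands later in these notes read `hg38/hg38.fa`, so the downloaded `hg38.fa.gz` presumably has to be decompressed first. A minimal sketch of that step, assuming the download has already finished; `unpack_fasta` is a hypothetical helper name:

```shell
# Sketch: verify a finished download, then decompress it.
# unpack_fasta is a hypothetical helper, not part of any tool.
unpack_fasta() {
    gz="$1"
    # gzip -t fails on a truncated or corrupt archive
    # (for example, a download that is still in progress).
    gzip -t "$gz" || { echo "$gz is incomplete or corrupt" >&2; return 1; }
    # -c writes to stdout, so the original .gz is kept.
    gunzip -c "$gz" > "${gz%.gz}"
}

# Usage, from ~/reference/genome/hg38 once wget has exited:
# unpack_fasta hg38.fa.gz
```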


cd ~/reference

mkdir -p genome/mm10 && cd genome/mm10

nohup wget http://hgdownload.cse.ucsc.edu/goldenPath/mm10/bigZips/chromFa.tar.gz &

## the wget above runs in the background; extract only after it has finished
tar zvfx chromFa.tar.gz

cat *.fa > mm10.fa

rm chr*.fa



cd ~/biosoft/RNA-SeQC

wget http://www.broadinstitute.org/cancer/cga/sites/default/files/data/tools/rnaseqc/ThousandReads.bam

wget http://www.broadinstitute.org/cancer/cga/sites/default/files/data/tools/rnaseqc/gencode.v7.annotation_goodContig.gtf.gz

wget http://www.broadinstitute.org/cancer/cga/sites/default/files/data/tools/rnaseqc/Homo_sapiens_assembly19.fasta.gz

wget http://www.broadinstitute.org/cancer/cga/sites/default/files/data/tools/rnaseqc/Homo_sapiens_assembly19.other.tar.gz

wget http://www.broadinstitute.org/cancer/cga/sites/default/files/data/tools/rnaseqc/gencode.v7.gc.txt

wget http://www.broadinstitute.org/cancer/cga/sites/default/files/data/tools/rnaseqc/rRNA.tar.gz


cd ~/reference

mkdir -p index/bowtie && cd index/bowtie

nohup time ~/biosoft/bowtie/bowtie2-2.2.9/bowtie2-build ~/reference/genome/hg19/hg19.fa ~/reference/index/bowtie/hg19 1>hg19.bowtie_index.log 2>&1 &

nohup time ~/biosoft/bowtie/bowtie2-2.2.9/bowtie2-build ~/reference/genome/hg38/hg38.fa ~/reference/index/bowtie/hg38 1>hg38.bowtie_index.log 2>&1 &

nohup time ~/biosoft/bowtie/bowtie2-2.2.9/bowtie2-build ~/reference/genome/mm10/mm10.fa ~/reference/index/bowtie/mm10 1>mm10.bowtie_index.log 2>&1 &


cd ~/reference

mkdir -p index/bwa && cd index/bwa

nohup time ~/biosoft/bwa/bwa-0.7.15/bwa index -a bwtsw -p ~/reference/index/bwa/hg19 ~/reference/genome/hg19/hg19.fa 1>hg19.bwa_index.log 2>&1 &

nohup time ~/biosoft/bwa/bwa-0.7.15/bwa index -a bwtsw -p ~/reference/index/bwa/hg38 ~/reference/genome/hg38/hg38.fa 1>hg38.bwa_index.log 2>&1 &

nohup time ~/biosoft/bwa/bwa-0.7.15/bwa index -a bwtsw -p ~/reference/index/bwa/mm10 ~/reference/genome/mm10/mm10.fa 1>mm10.bwa_index.log 2>&1 &
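
Because these index jobs run in the background for a long time, it can help to confirm afterwards that each one produced a complete index. The sketch below checks for the five files that `bwa index` writes next to the given prefix (`.amb`, `.ann`, `.bwt`, `.pac`, `.sa`); `check_bwa_index` is a hypothetical helper name, and a non-empty file is only a rough proxy for a successful run, so the log files are still worth reading.

```shell
# Sketch: confirm that a BWA index prefix has all five index files.
# check_bwa_index is a hypothetical helper, not part of bwa.
check_bwa_index() {
    prefix="$1"
    missing=0
    for ext in amb ann bwt pac sa; do
        # -s is true only if the file exists and is not empty.
        if [ ! -s "$prefix.$ext" ]; then
            echo "missing or empty: $prefix.$ext" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
# check_bwa_index ~/reference/index/bwa/hg19 && echo "hg19 index looks complete"
```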


cd ~/reference

mkdir -p index/hisat && cd index/hisat

nohup wget ftp://ftp.ccb.jhu.edu/pub/infphilo/hisat2/data/hg19.tar.gz &

nohup wget ftp://ftp.ccb.jhu.edu/pub/infphilo/hisat2/data/hg38.tar.gz &

nohup wget ftp://ftp.ccb.jhu.edu/pub/infphilo/hisat2/data/grcm38.tar.gz &

nohup wget ftp://ftp.ccb.jhu.edu/pub/infphilo/hisat2/data/mm10.tar.gz &

tar zxvf hg19.tar.gz

tar zxvf grcm38.tar.gz

tar zxvf hg38.tar.gz

tar zxvf mm10.tar.gz



mkdir -p ~/annotation/variation/human/ExAC

cd ~/annotation/variation/human/ExAC

## http://exac.broadinstitute.org/

## ftp://ftp.broadinstitute.org/pub/ExAC_release/current

wget ftp://ftp.broadinstitute.org/pub/ExAC_release/current/ExAC.r0.3.1.sites.vep.vcf.gz.tbi

nohup wget ftp://ftp.broadinstitute.org/pub/ExAC_release/current/ExAC.r0.3.1.sites.vep.vcf.gz &

wget ftp://ftp.broadinstitute.org/pub/ExAC_release/current/cnv/exac-final-cnv.gene.scores071316

wget ftp://ftp.broadinstitute.org/pub/ExAC_release/current/cnv/exac-final.autosome-1pct-sq60-qc-prot-coding.cnv.bed



mkdir -p ~/annotation/variation/human/dbSNP

cd ~/annotation/variation/human/dbSNP

## https://www.ncbi.nlm.nih.gov/projects/SNP/

## ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh38p2/

## ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/

nohup wget ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/VCF/All_20160601.vcf.gz &

wget ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/VCF/All_20160601.vcf.gz.tbi
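
Variant files and reference genomes from different sources do not always name chromosomes the same way: UCSC genomes such as hg19 use `chr1`, while GRCh37-style files typically use `1`. A quick way to see which convention a downloaded VCF follows is to list its contig names. This is a sketch only; `vcf_contigs` is a hypothetical helper name, and it streams the whole file, which takes a while on a file as large as dbSNP.

```shell
# Sketch: list the distinct contig names in a gzip- or bgzip-compressed VCF,
# in order of first appearance. vcf_contigs is a hypothetical helper.
vcf_contigs() {
    # Skip header lines (starting with #) and print each new value of column 1 once.
    gunzip -c "$1" | awk '!/^#/ && !seen[$1]++ { print $1 }'
}

# Usage:
# vcf_contigs All_20160601.vcf.gz | head
```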



mkdir -p ~/annotation/variation/human/1000genomes

cd ~/annotation/variation/human/1000genomes

## ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/

nohup wget -c -r -nd -np -k -L -p ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502 &
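
The short flags in the recursive download above are easier to read in their long form. The sketch below wraps the same option set in a function; `mirror_flat` and the `WGET` override variable are hypothetical names introduced here for illustration, and the options themselves are standard GNU Wget ones.

```shell
# Sketch: the recursive download written with long option names.
# mirror_flat and the WGET variable are hypothetical, for illustration only.
mirror_flat() {
    url="$1"
    # --continue         (-c)  resume partially downloaded files
    # --recursive        (-r)  follow links below the start URL
    # --no-directories   (-nd) save everything into the current directory
    # --no-parent        (-np) never ascend above the start directory
    # --convert-links    (-k)  rewrite links in saved pages for local viewing
    # --relative         (-L)  follow relative links only
    # --page-requisites  (-p)  also fetch files needed to display a page
    "${WGET:-wget}" --continue --recursive --no-directories --no-parent \
        --convert-links --relative --page-requisites "$url"
}

# Usage:
# nohup sh -c '. ./mirror.sh; mirror_flat ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502' &
```

The usage line assumes the function has been saved in a file called `mirror.sh`, which is also a made-up name.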


mkdir -p ~/annotation/variation/human/cosmic

cd ~/annotation/variation/human/cosmic

## we need to register before we can download this file.


mkdir -p ~/annotation/variation/human/ESP6500

cd ~/annotation/variation/human/ESP6500

# http://evs.gs.washington.edu/EVS/

nohup wget http://evs.gs.washington.edu/evs_bulk_data/ESP6500SI-V2-SSA137.GRCh38-liftover.snps_indels.vcf.tar.gz &


mkdir -p ~/annotation/variation/human/UK10K

cd ~/annotation/variation/human/UK10K

# http://www.uk10k.org/

nohup wget ftp://ngs.sanger.ac.uk/production/uk10k/UK10K_COHORT/REL-2012-06-02/UK10K_COHORT.20160215.sites.vcf.gz &


mkdir -p ~/annotation/variation/human/gonl

cd ~/annotation/variation/human/gonl

## http://www.nlgenome.nl/search/

## https://molgenis26.target.rug.nl/downloads/gonl_public/variants/release5/

nohup wget -c -r -nd -np -k -L -p https://molgenis26.target.rug.nl/downloads/gonl_public/variants/release5 &


mkdir -p ~/annotation/variation/human/omin

cd ~/annotation/variation/human/omin


mkdir -p ~/annotation/variation/human/GWAS

cd ~/annotation/variation/human/GWAS


mkdir -p ~/annotation/variation/human/hapmap

cd ~/annotation/variation/human/hapmap

# ftp://ftp.ncbi.nlm.nih.gov/hapmap/

wget ftp://ftp.ncbi.nlm.nih.gov/hapmap/phase_3/relationships_w_pops_051208.txt

nohup wget -c -r -np -k -L -p -nd -A.gz ftp://ftp.ncbi.nlm.nih.gov/hapmap/phase_3/hapmap3_reformatted &

# ftp://ftp.hgsc.bcm.tmc.edu/pub/data/HapMap3-ENCODE/ENCODE3/ENCODE3v1/

wget ftp://ftp.hgsc.bcm.tmc.edu/pub/data/HapMap3-ENCODE/ENCODE3/ENCODE3v1/bcm-encode3-QC.txt

wget ftp://ftp.hgsc.bcm.tmc.edu/pub/data/HapMap3-ENCODE/ENCODE3/ENCODE3v1/bcm-encode3-submission.txt.gz





## 1 million single nucleotide polymorphisms (SNPs) for DNA samples from each of the three ethnic groups in Singapore – Chinese, Malays and Indians.

## The Affymetrix Genome-Wide Human SNP Array 6.0 && The Illumina Human1M single BeadChip

## http://www.statgen.nus.edu.sg/~SGVP/

## http://www.statgen.nus.edu.sg/~SGVP/singhap/files-website/samples-information.txt

# http://www.statgen.nus.edu.sg/~SGVP/singhap/files-website/genotypes/2009-01-30/QC/


## Singapore Sequencing Malay Project (SSMP)

mkdir -p ~/annotation/variation/human/SSMP

cd ~/annotation/variation/human/SSMP

## http://www.statgen.nus.edu.sg/~SSMP/

## http://www.statgen.nus.edu.sg/~SSMP/download/vcf/2012_05



## Singapore Sequencing Indian Project (SSIP)

mkdir -p ~/annotation/variation/human/SSIP

cd ~/annotation/variation/human/SSIP

# http://www.statgen.nus.edu.sg/~SSIP/

## http://www.statgen.nus.edu.sg/~SSIP/download/vcf/dataFreeze_Feb2013




wget ftp://ftp.ensembl.org/pub/release-75/gtf/homo_sapiens/Homo_sapiens.GRCh37.75.gtf.gz

wget ftp://ftp.ensembl.org/pub/release-86/gtf/homo_sapiens/Homo_sapiens.GRCh38.86.chr.gtf.gz


mkdir -p ~/reference/gtf/gencode

cd ~/reference/gtf/gencode

## https://www.gencodegenes.org/releases/current.html

wget ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_25/gencode.v25.2wayconspseudos.gtf.gz

wget ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_25/gencode.v25.long_noncoding_RNAs.gtf.gz

wget ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_25/gencode.v25.polyAs.gtf.gz

wget ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_25/gencode.v25.annotation.gtf.gz

## https://www.gencodegenes.org/releases/25lift37.html

wget ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_25/GRCh37_mapping/gencode.v25lift37.annotation.gtf.gz

wget ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_25/GRCh37_mapping/gencode.v25lift37.metadata.HGNC.gz

wget ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_25/GRCh37_mapping/gencode.v25lift37.metadata.EntrezGene.gz

wget ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_25/GRCh37_mapping/gencode.v25lift37.metadata.RefSeq.gz



mkdir -p ~/reference/gtf/ensembl/homo_sapiens_86

cd ~/reference/gtf/ensembl/homo_sapiens_86

## http://asia.ensembl.org/info/data/ftp/index.html




cd ~/reference

mkdir -p genome/human_g1k_v37 && cd genome/human_g1k_v37

# http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/

nohup wget http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/human_g1k_v37.fasta.gz &

gunzip human_g1k_v37.fasta.gz

wget http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/human_g1k_v37.fasta.fai

wget http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/README.human_g1k_v37.fasta.txt

java -jar ~/biosoft/picardtools/picard-tools-1.119/CreateSequenceDictionary.jar R=human_g1k_v37.fasta O=human_g1k_v37.dict


## ftp://ftp.broadinstitute.org/bundle/b37/

mkdir -p ~/annotation/variation/GATK

cd ~/annotation/variation/GATK

wget ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/1000G_phase1.snps.high_confidence.b37.vcf.gz

wget ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/dbsnp_138.b37.vcf.gz

wget ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/human_g1k_v37.fasta.gz

wget ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/NA12878.HiSeq.WGS.bwa.cleaned.raw.subset.b37.sites.vcf.gz

wget ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/Mills_and_1000G_gold_standard.indels.b37.vcf.gz

wget ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/hapmap_3.3.b37.vcf.gz

wget ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/1000G_phase1.indels.b37.vcf.gz

wget ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/1000G_phase1.indels.b37.vcf.idx.gz

gunzip 1000G_phase1.indels.b37.vcf.idx.gz

gunzip 1000G_phase1.indels.b37.vcf.gz



mkdir -p ~/institute/ENSEMBL/gtf

cd ~/institute/ENSEMBL/gtf

wget ftp://ftp.ensembl.org/pub/release-87/gtf/homo_sapiens/Homo_sapiens.GRCh38.87.chr.gtf.gz

wget ftp://ftp.ensembl.org/pub/release-87/gtf/mus_musculus/Mus_musculus.GRCm38.87.chr.gtf.gz

wget ftp://ftp.ensembl.org/pub/release-87/gtf/danio_rerio/Danio_rerio.GRCz10.87.chr.gtf.gz






mkdir -p ~/institute/TCGA/firehose

cd ~/institute/TCGA/firehose

## https://gdac.broadinstitute.org/

wget http://gdac.broadinstitute.org/runs/stddata__2016_01_28/data/ACC/20160128/gdac.broadinstitute.org_ACC.Merge_snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg19__seg.Level_3.2016012800.0.0.tar.gz -O ACC.gistic.seg.tar.gz

wget http://gdac.broadinstitute.org/runs/stddata__2016_01_28/data/ACC/20160128/gdac.broadinstitute.org_ACC.Merge_snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_hg19__seg.Level_3.2016012800.0.0.tar.gz -O ACC.raw.seg.tar.gz

wget http://gdac.broadinstitute.org/runs/stddata__2016_01_28/data/ACC/20160128/gdac.broadinstitute.org_ACC.Mutation_Packager_Calls.Level_3.2016012800.0.0.tar.gz -O ACC.maf.tar.gz

wget http://gdac.broadinstitute.org/runs/stddata__2016_01_28/data/ACC/20160128/gdac.broadinstitute.org_ACC.Mutation_Packager_Oncotated_Calls.Level_3.2016012800.0.0.tar.gz -O ACC.maf.anno.tar.gz

Reference post: http://www.biotrainee.com/forum.php?mod=viewthread&tid=857&page=1&authorid=16
