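- Batch download of EBI data
1. Get the ascp download addresses
On the EBI home page, enter the accession number in the search box. Select the display column sample name, then the column FASTQ files (Aspera), clicking TEXT each time to download. This generates two files: sample_alias.txt and fastq_aspera.txt
[EBI URL] https://www.ebi.ac.uk/ena/data/view/PRJEB7888
2. Download the sequencing data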
cat fastq_aspera.txt | sed '1d' | sed 's/fasp.sra.ebi.ac.uk://g' | sed 's/;/\n/g' > sample_fastq.txt
nohup /export/home/hushy/.aspera/connect/bin/ascp -i /export/home/hushy/.aspera/connect/etc/asperaweb_id_dsa.openssh -k1 -Tr -l100m -P33001 --mode recv --host fasp.sra.ebi.ac.uk --user era-fasp --file-list sample_fastq.txt . &
# The . means output goes to the current directory; any other path can be given instead
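To check what the sed chain in step 2 writes into the file list, here is a minimal sketch run on a made-up record. The file names demo_fastq_aspera.txt and demo_sample_fastq.txt and the accession ERR0000001 are hypothetical, and the real TEXT export may contain more columns than this toy input. The \n in the replacement assumes GNU sed.

```shell
# Hypothetical input: one header line plus one paired-end record.
workdir=$(mktemp -d)
printf '%s\n' \
  'fastq_aspera' \
  'fasp.sra.ebi.ac.uk:/vol1/fastq/ERR000/001/ERR0000001/ERR0000001_1.fastq.gz;fasp.sra.ebi.ac.uk:/vol1/fastq/ERR000/001/ERR0000001/ERR0000001_2.fastq.gz' \
  > "$workdir/demo_fastq_aspera.txt"

# Same transformation as in step 2: drop the header, strip the host prefix,
# and put each ';'-separated path on its own line.
sed '1d' "$workdir/demo_fastq_aspera.txt" \
  | sed 's/fasp\.sra\.ebi\.ac\.uk://g' \
  | sed 's/;/\n/g' > "$workdir/demo_sample_fastq.txt"

# Each line is now a bare remote path, the form --file-list expects
# when the host is supplied separately with --host.
cat "$workdir/demo_sample_fastq.txt"
```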
3. Merge the sample names and data paths into one file
cat sample_alias.txt | sed '1d' > sample_name.txt
find /data/hushy/study03/* -name "*fastq.gz" | tee sample_data.txt | cut -d '/' -f 5 - | cut -d '_' -f 1 - | paste - sample_data.txt | awk '{a[$1]=a[$1]$2" "}END{for(i in a){print i,a[i]}}' | awk 'FNR==NR{a[NR]=$1;next}{$1=a[FNR]}1' sample_name.txt - | sed 's/ /\t/g' > study03_sample_list.txt
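The grouping stage of the step 3 pipeline can be tried on toy input, as in the sketch below. The paths under /data/demo and the ERR000000x accessions are hypothetical. Two things here are additions and not part of the original command: the `sort -k1,1`, included because awk's `for (i in a)` visits keys in an unspecified order while the later name substitution depends on row order, and the removal of the trailing space before spaces become tabs. The \t in the sed replacement assumes GNU sed.

```shell
# Toy "run_id<TAB>path" pairs standing in for the output of the paste stage.
grouped=$(
  printf '%s\t%s\n' \
    ERR0000002 /data/demo/ERR0000002_1.fastq.gz \
    ERR0000001 /data/demo/ERR0000001_1.fastq.gz \
    ERR0000001 /data/demo/ERR0000001_2.fastq.gz \
    ERR0000002 /data/demo/ERR0000002_2.fastq.gz \
  | awk '{a[$1]=a[$1]$2" "}END{for(i in a){print i,a[i]}}' \
  | sort -k1,1 \
  | sed 's/ *$//; s/ /\t/g'
)

# One row per run: the run ID followed by all of its FASTQ paths.
printf '%s\n' "$grouped"
```

If the rows of sample_name.txt are not in the same order as the sorted run IDs, the positional substitution in the last awk stage will attach names to the wrong runs, so it is worth spot-checking a few rows of study03_sample_list.txt.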
- Downloading EBI data one file at a time
1. Get the download address
Search EBI for the PRJEB ID, right-click the sample, and copy the link, for example:
fasp.sra.ebi.ac.uk:/vol1/fastq/ERR221/002/ERR2216042/ERR2216042_1.fastq.gz
2. Download the data
/export/home/hushy/.aspera/connect/bin/ascp -i /export/home/hushy/.aspera/connect/etc/asperaweb_id_dsa.openssh -k1 -Tr -l100m -P33001 era-fasp@fasp.sra.ebi.ac.uk:/vol1/fastq/ERR221/002/ERR2216042/ERR2216042_1.fastq.gz .
# The ENA user name on Aspera is era-fasp and ENA data is served from fasp.sra.ebi.ac.uk. The space and the . at the end of the command must not be omitted: the . is the destination directory
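When only the run accession is known, the remote path can usually be derived without visiting the website. The sketch below assumes ENA's directory layout as I understand it: /vol1/fastq/, then the first six characters of the accession, then (only for accessions longer than nine characters) the remaining digits zero-padded to three, then the accession itself. The function name ena_fastq_path is hypothetical, and the result is worth comparing against a link copied from the site before a large transfer.

```shell
# Build the ascp source argument for one FASTQ file of an ENA run.
# $1: run accession, e.g. ERR2216042
# $2: file suffix, e.g. _1 or _2 for paired-end reads, empty for single-end
ena_fastq_path() {
  run=$1
  suffix=$2
  prefix=$(printf '%s' "$run" | cut -c1-6)   # e.g. ERR221
  extra=$(printf '%s' "$run" | cut -c10-)    # digits beyond the ninth character
  if [ -n "$extra" ]; then
    # Left-pad to three characters without arithmetic, so leading zeros are safe.
    subdir="$(printf '%s' "00$extra" | tail -c 3)/"
  else
    subdir=""
  fi
  printf 'era-fasp@fasp.sra.ebi.ac.uk:/vol1/fastq/%s/%s%s/%s%s.fastq.gz\n' \
    "$prefix" "$subdir" "$run" "$run" "$suffix"
}

# Reproduces the address used in the example above.
ena_fastq_path ERR2216042 _1
```

The printed string can be passed to ascp as the source argument, followed by a space and the destination directory.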
- Downloading NCBI data one file at a time
1. Get the download address
ftp.ncbi.nlm.nih.gov:/sra/sra-instant/reads/ByRun/sra/SRR/SRR699/SRR6994553/SRR6994553.sra
2. Download the data
/export/home/hushy/.aspera/connect/bin/ascp -i /export/home/hushy/.aspera/connect/etc/asperaweb_id_dsa.openssh -k1 -Tr -l100m anonftp@ftp.ncbi.nlm.nih.gov:/sra/sra-instant/reads/ByRun/sra/SRR/SRR699/SRR6994553/SRR6994553.sra .
3. Use sratoolkit to convert the .sra file into .fastq.gz files
/share/apps/sratoolkit.2.9.6-1-centos_linux64/bin/fastq-dump --split-3 --gzip SRR6994553.sra
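When several .sra files have been downloaded, the conversion can be scripted. The sketch below is a dry run: it only prints one fastq-dump command per file so the list can be reviewed first. The function name print_convert_cmds is hypothetical; --gzip and --outdir are standard fastq-dump options, and writing the output next to the input is an assumption about the desired layout.

```shell
FASTQ_DUMP=/share/apps/sratoolkit.2.9.6-1-centos_linux64/bin/fastq-dump

# Print (do not run) a conversion command for every .sra file in directory $1.
print_convert_cmds() {
  for sra in "$1"/*.sra; do
    # Skip the unexpanded pattern when the directory holds no .sra files.
    [ -e "$sra" ] || continue
    printf '%s --split-3 --gzip --outdir %s %s\n' "$FASTQ_DUMP" "$1" "$sra"
  done
}

print_convert_cmds .
```

Once the printed commands look right, piping the output to `sh` would execute them one after another.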