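wget http://hgdownload.cse.ucsc.edu/goldenPath/mm10/bigZips/chromFa.tar.gz   # download the mm10 reference genome
tar -zxvf chromFa.tar.gz   # extract the per-chromosome FASTA files
cat *.fa > mm10.fa   # merge them into a single FASTA file
Files after extraction: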
-rw-rw-r--. 1 yangjy yangjy 133308900 Feb 10 2012 chr10.fa
-rw-rw-r--. 1 yangjy yangjy 124524201 Feb 10 2012 chr11.fa
-rw-rw-r--. 1 yangjy yangjy 122531610 Feb 10 2012 chr12.fa
-rw-rw-r--. 1 yangjy yangjy 122830079 Feb 10 2012 chr13.fa
-rw-rw-r--. 1 yangjy yangjy 127400296 Feb 10 2012 chr14.fa
-rw-rw-r--. 1 yangjy yangjy 106124566 Feb 10 2012 chr15.fa
-rw-rw-r--. 1 yangjy yangjy 100171931 Feb 10 2012 chr16.fa
-rw-rw-r--. 1 yangjy yangjy 96887024 Feb 10 2012 chr17.fa
-rw-rw-r--. 1 yangjy yangjy 92516699 Feb 10 2012 chr18.fa
-rw-rw-r--. 1 yangjy yangjy 62660205 Feb 10 2012 chr19.fa
-rw-rw-r--. 1 yangjy yangjy 199381417 Feb 10 2012 chr1.fa
-rw-rw-r--. 1 yangjy yangjy 173142 Feb 10 2012 chr1_GL456210_random.fa
-rw-rw-r--. 1 yangjy yangjy 246592 Feb 10 2012 chr1_GL456211_random.fa
-rw-rw-r--. 1 yangjy yangjy 156713 Feb 10 2012 chr1_GL456212_random.fa
-rw-rw-r--. 1 yangjy yangjy 40149 Feb 10 2012 chr1_GL456213_random.fa
-rw-rw-r--. 1 yangjy yangjy 211123 Feb 10 2012 chr1_GL456221_random.fa
-rw-rw-r--. 1 yangjy yangjy 185755495 Feb 10 2012 chr2.fa
-rw-rw-r--. 1 yangjy yangjy 163240480 Feb 10 2012 chr3.fa
-rw-rw-r--. 1 yangjy yangjy 159638285 Feb 10 2012 chr4.fa
-rw-rw-r--. 1 yangjy yangjy 68029 Feb 10 2012 chr4_GL456216_random.fa
-rw-rw-r--. 1 yangjy yangjy 232548 Feb 10 2012 chr4_GL456350_random.fa
-rw-rw-r--. 1 yangjy yangjy 15266 Feb 10 2012 chr4_JH584292_random.fa
-rw-rw-r--. 1 yangjy yangjy 212150 Feb 10 2012 chr4_JH584293_random.fa
-rw-rw-r--. 1 yangjy yangjy 195766 Feb 10 2012 chr4_JH584294_random.fa
-rw-rw-r--. 1 yangjy yangjy 2038 Feb 10 2012 chr4_JH584295_random.fa
-rw-rw-r--. 1 yangjy yangjy 154871384 Feb 10 2012 chr5.fa
-rw-rw-r--. 1 yangjy yangjy 199935 Feb 10 2012 chr5_GL456354_random.fa
-rw-rw-r--. 1 yangjy yangjy 203378 Feb 10 2012 chr5_JH584296_random.fa
-rw-rw-r--. 1 yangjy yangjy 209914 Feb 10 2012 chr5_JH584297_random.fa
-rw-rw-r--. 1 yangjy yangjy 187895 Feb 10 2012 chr5_JH584298_random.fa
-rw-rw-r--. 1 yangjy yangjy 972095 Feb 10 2012 chr5_JH584299_random.fa
-rw-rw-r--. 1 yangjy yangjy 152731283 Feb 10 2012 chr6.fa
-rw-rw-r--. 1 yangjy yangjy 148350295 Feb 10 2012 chr7.fa
-rw-rw-r--. 1 yangjy yangjy 179510 Feb 10 2012 chr7_GL456219_random.fa
-rw-rw-r--. 1 yangjy yangjy 131989244 Feb 10 2012 chr8.fa
-rw-rw-r--. 1 yangjy yangjy 127087019 Feb 10 2012 chr9.fa
-rw-rw-r--. 1 yangjy yangjy 16631 Feb 10 2012 chrM.fa
-rw-rw-r--. 1 yangjy yangjy 40874 Feb 10 2012 chrUn_GL456239.fa
-rw-rw-r--. 1 yangjy yangjy 23450 Feb 10 2012 chrUn_GL456359.fa
-rw-rw-r--. 1 yangjy yangjy 32355 Feb 10 2012 chrUn_GL456360.fa
-rw-rw-r--. 1 yangjy yangjy 48031 Feb 10 2012 chrUn_GL456366.fa
-rw-rw-r--. 1 yangjy yangjy 42915 Feb 10 2012 chrUn_GL456367.fa
-rw-rw-r--. 1 yangjy yangjy 20629 Feb 10 2012 chrUn_GL456368.fa
-rw-rw-r--. 1 yangjy yangjy 27316 Feb 10 2012 chrUn_GL456370.fa
-rw-rw-r--. 1 yangjy yangjy 29254 Feb 10 2012 chrUn_GL456372.fa
-rw-rw-r--. 1 yangjy yangjy 32251 Feb 10 2012 chrUn_GL456378.fa
-rw-rw-r--. 1 yangjy yangjy 73849 Feb 10 2012 chrUn_GL456379.fa
-rw-rw-r--. 1 yangjy yangjy 26405 Feb 10 2012 chrUn_GL456381.fa
-rw-rw-r--. 1 yangjy yangjy 23638 Feb 10 2012 chrUn_GL456382.fa
-rw-rw-r--. 1 yangjy yangjy 39449 Feb 10 2012 chrUn_GL456383.fa
-rw-rw-r--. 1 yangjy yangjy 35961 Feb 10 2012 chrUn_GL456385.fa
-rw-rw-r--. 1 yangjy yangjy 25195 Feb 10 2012 chrUn_GL456387.fa
-rw-rw-r--. 1 yangjy yangjy 29364 Feb 10 2012 chrUn_GL456389.fa
-rw-rw-r--. 1 yangjy yangjy 25178 Feb 10 2012 chrUn_GL456390.fa
-rw-rw-r--. 1 yangjy yangjy 24118 Feb 10 2012 chrUn_GL456392.fa
-rw-rw-r--. 1 yangjy yangjy 56842 Feb 10 2012 chrUn_GL456393.fa
-rw-rw-r--. 1 yangjy yangjy 24826 Feb 10 2012 chrUn_GL456394.fa
-rw-rw-r--. 1 yangjy yangjy 21681 Feb 10 2012 chrUn_GL456396.fa
-rw-rw-r--. 1 yangjy yangjy 116758 Feb 10 2012 chrUn_JH584304.fa
-rw-rw-r--. 1 yangjy yangjy 174451931 Feb 10 2012 chrX.fa
-rw-rw-r--. 1 yangjy yangjy 343694 Feb 10 2012 chrX_GL456233_random.fa
-rw-rw-r--. 1 yangjy yangjy 93579598 Feb 10 2012 chrY.fa
-rw-rw-r--. 1 yangjy yangjy 186016 Feb 10 2012 chrY_JH584300_random.fa
-rw-rw-r--. 1 yangjy yangjy 265095 Feb 10 2012 chrY_JH584301_random.fa
-rw-rw-r--. 1 yangjy yangjy 158977 Feb 10 2012 chrY_JH584302_random.fa
-rw-rw-r--. 1 yangjy yangjy 161283 Feb 10 2012 chrY_JH584303_random.fa
Merge the files into one FASTA:
cat *.fa > mm10.fa
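A quick sanity check after merging is to compare the number of FASTA header lines in the merged file with the number of input files. The sketch below shows the idea on three toy files in a scratch directory; the file names and sequences are made up for illustration, and it assumes each input file holds exactly one sequence. It also uses the pattern chr*.fa rather than *.fa so that the merged file can never match the pattern on a rerun.

```shell
# Toy demonstration in a scratch directory; file names and sequences are made up.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '>chrA\nACGT\n' > chrA.fa
printf '>chrB\nGGCC\n' > chrB.fa
printf '>chrC\nTTAA\n' > chrC.fa

# Count the input files.
n_inputs=$(ls chr*.fa | wc -l | tr -d ' ')

# Merge; merged.fa does not match chr*.fa, so it is never read as an input.
cat chr*.fa > merged.fa

# Every FASTA record starts with a '>' header line.
n_records=$(grep -c '^>' merged.fa)

echo "input files: $n_inputs, records in merged file: $n_records"
```

On the real data the same two counts would be taken in the directory holding the extracted chr*.fa files, with mm10.fa in place of merged.fa.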
Build the index with bowtie2-build:
bowtie2-build mm10.fa mm10
The resulting index files:
-rw-r--r--. 1 yangjy yangjy 888464705 May 2 2012 mm10.1.bt2
-rw-r--r--. 1 yangjy yangjy 663195880 May 2 2012 mm10.2.bt2
-rw-r--r--. 1 yangjy yangjy 6119 May 2 2012 mm10.3.bt2
-rw-r--r--. 1 yangjy yangjy 663195875 May 2 2012 mm10.4.bt2
-rw-r--r--. 1 yangjy yangjy 888464705 May 3 2012 mm10.rev.1.bt2
-rw-r--r--. 1 yangjy yangjy 663195880 May 3 2012 mm10.rev.2.bt2
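Because the build is slow, it can help to confirm that all six index files are present and non-empty before starting an alignment. The helper below is a hypothetical sketch, not part of bowtie2; it checks the six suffixes shown in the listing above, which apply to a small index (a large index would use a different extension). The demo uses placeholder files in a scratch directory, not real indexes.

```shell
# Hypothetical helper: print every expected index file that is missing or empty.
missing_bt2_files() {
    for suffix in 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2 rev.2.bt2; do
        [ -s "$1.$suffix" ] || echo "$1.$suffix"
    done
}

# Demo with placeholder files; rev.2.bt2 is deliberately left out.
workdir=$(mktemp -d)
for suffix in 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2; do
    echo placeholder > "$workdir/toy.$suffix"
done

missing_bt2_files "$workdir/toy"
n_missing=$(missing_bt2_files "$workdir/toy" | wc -l | tr -d ' ')
echo "files missing: $n_missing"
```

For the index built here, `missing_bt2_files mm10` should print nothing if the build finished.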
Build log (this step takes a while):
(chipseq) [yangjy@GSCG01 align]$ bowtie2-build mm10.fa mm10
Settings:
Output files: "mm10.*.bt2" # output file name pattern: mm10.*.bt2
Line rate: 6 (line is 64 bytes)
Lines per side: 1 (side is 64 bytes)
Offset rate: 4 (one in 16)
FTable chars: 10
Strings: unpacked
Max bucket size: default
Max bucket size, sqrt multiplier: default
Max bucket size, len divisor: 4
Difference-cover sample period: 1024
Endianness: little
Actual local endianness: little
Sanity checking: disabled
Assertions: disabled
Random seed: 0
Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
mm10.fa
Building a SMALL index
Reading reference sizes
Time reading reference sizes: 00:00:11
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
Time to join reference sequences: 00:00:09
bmax according to bmaxDivN setting: 663195875
Using parameters --bmax 497396907 --dcv 1024
Doing ahead-of-time memory usage test
Passed! Constructing with these parameters: --bmax 497396907 --dcv 1024
Constructing suffix-array element generator
Building DifferenceCoverSample
Building sPrime
Building sPrimeOrder
V-Sorting samples
V-Sorting samples time: 00:00:56
Allocating rank array
Ranking v-sort output
Ranking v-sort output time: 00:00:14
Invoking Larsson-Sadakane on ranks
Invoking Larsson-Sadakane on ranks time: 00:00:25
Sanity-checking and returning
Building samples
Reserving space for 12 sample suffixes
Generating random suffixes
QSorting 12 sample offsets, eliminating duplicates
QSorting sample offsets, eliminating duplicates time: 00:00:00
Multikey QSorting 12 samples
(Using difference cover)
Multikey QSorting samples time: 00:00:00
Calculating bucket sizes
Splitting and merging
Splitting and merging time: 00:00:00
Avg bucket size: 2.65278e+09 (target: 497396906)
Converting suffix-array elements to index image
Allocating ftab, absorbFtab
Entering Ebwt loop
Getting block 1 of 1
No samples; assembling all-inclusive block
Sorting block of length 2652783500 for bucket 1
(Using difference cover)
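Two numbers in the log above can be cross-checked by hand. The block length reported for bucket 1 (2652783500) divided by the "len divisor" setting (4) gives the value on the "bmax according to bmaxDivN setting" line. The sketch below only redoes that arithmetic with values copied from the log; it does not explain how bowtie2-build arrives at the smaller --bmax value it finally uses.

```shell
ref_len=2652783500   # from "Sorting block of length 2652783500 for bucket 1"
len_divisor=4        # from "Max bucket size, len divisor: 4"

# Integer division, as in the log line "bmax according to bmaxDivN setting".
bmax=$((ref_len / len_divisor))
echo "length / divisor = $bmax"
```

The result, 663195875, matches the log line. It is also the size in bytes of mm10.4.bt2 in the index listing, which would be consistent with packing four bases per byte, although the log itself does not say so.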
Run bowtie2 to produce the SAM file:
bowtie2 -p 6 -3 5 --local -x mm10 -1 example_1.fastq -2 example_2.fastq -S example.sam
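The command above packs several options into one line. The sketch below spells them out as variables with a comment each, prints the assembled command, and runs it only if bowtie2 and both read files are present, so it is safe as a dry run. The variable names are illustrative; the values are the ones used above.

```shell
threads=6                # -p: number of alignment threads
trim3=5                  # -3: bases trimmed from the 3' end of every read
index=mm10               # -x: index prefix, i.e. the mm10.*.bt2 files
reads1=example_1.fastq   # -1: first mates of the paired-end reads
reads2=example_2.fastq   # -2: second mates of the paired-end reads
out=example.sam          # -S: output file in SAM format

# --local selects local alignment, so read ends may be soft-clipped.
cmd="bowtie2 -p $threads -3 $trim3 --local -x $index -1 $reads1 -2 $reads2 -S $out"
echo "$cmd"

# Dry run unless bowtie2 and both input files exist.
# Unquoted $cmd relies on word splitting, so it assumes no spaces in file names.
if command -v bowtie2 >/dev/null 2>&1 && [ -f "$reads1" ] && [ -f "$reads2" ]; then
    $cmd
fi
```

Keeping the thread count and trim length in variables makes it easier to reuse the same line for other samples.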