2018-02-06 Differences between BLAST and DIAMOND

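I. NCBI BLAST+

1. Installing and configuring BLAST+

Download the latest BLAST executables from ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/. Do not download the source code, because compiling from source is very slow; choose a precompiled build such as ncbi-blast-2.2.30+-x64-linux.tar.gz. If the server has internet access, download the file directly with wget. Otherwise, download it locally and transfer it to the server with an SFTP client.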

wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ncbi-blast-2.2.30+-x64-linux.tar.gz

Extract the archive:

tar -zxvf ncbi-blast-2.2.30+-x64-linux.tar.gz
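The commands in the following sections are called by bare name, so the shell has to be able to find them. A minimal sketch of one way to arrange that, assuming the tarball was extracted in the current directory and unpacked into ncbi-blast-2.2.30+/ with the executables under bin/ (adjust the path if your layout differs):

```shell
# Assumption: the archive was extracted in the current directory, giving
# ncbi-blast-2.2.30+/bin/ with blastn, blastp, makeblastdb and so on.
BLAST_BIN="$PWD/ncbi-blast-2.2.30+/bin"
export PATH="$BLAST_BIN:$PATH"

# Confirm the shell can find the program before running a real job.
if command -v blastn >/dev/null 2>&1; then
    blastn -version
else
    echo "blastn not found; check that $BLAST_BIN exists" >&2
fi
```

The export only lasts for the current shell session; adding the same line to a shell startup file would make it persistent.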

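2. Basic usage

**Tip: BLAST supports many output formats. Format 11 contains the most complete information, and the blast_formatter program can convert a format 11 result into any of the other formats. For that reason, save alignment results in format 11.

1) Build a database from the reference sequences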

makeblastdb -in db.fasta -dbtype nucl -parse_seqids -out dbname

**With -dbtype nucl the database is built from nucleotide sequences; with -dbtype prot it is built from amino acid sequences.

2) Once the database is built, align the query sequences against it

blastn -query test.fa -db dbname -outfmt 11 -out "test.blastn@nr.asn" -num_threads 8

**The output file name test.blastn@nr.asn is a personal convention: "query file name.blast program name@database name.result format". It makes each result file easy to identify at a glance.
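The naming convention and the format 11 tip can be sketched as two small shell helpers. The function names result_name and asn_to_tab are hypothetical, introduced only for this illustration; blast_formatter with -archive, -outfmt and -out is the real BLAST+ program. The BLAST database used for the search typically has to remain available when reformatting.

```shell
# Build a result name following the convention
# "<query name>.<blast program>@<database>.<format>".
result_name() {
    query_base=$(basename "$1")
    query_base="${query_base%.*}"      # strip the extension: test.fa -> test
    printf '%s.%s@%s.%s\n' "$query_base" "$2" "$3" "$4"
}

# Convert a format 11 archive into tabular text (format 6).
# The output is written next to the archive with a .tsv extension.
asn_to_tab() {
    blast_formatter -archive "$1" -outfmt 6 -out "${1%.asn}.tsv"
}

result_name test.fa blastn nr asn      # prints test.blastn@nr.asn
```

Calling asn_to_tab "test.blastn@nr.asn" after a search would write test.blastn@nr.tsv; other -outfmt values can be substituted in the same way.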

**If the query sequences are proteins and are searched against nr, another protein database, or a protein database you built yourself, use blastp instead; the other parameters are similar.
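As a sketch of the protein case, the helper below builds a protein database and then searches it with blastp. The function name protein_search and the file names in the usage note are hypothetical; the flags are the same ones used in the nucleotide example, with -dbtype prot and blastp swapped in.

```shell
# Arguments: <reference FASTA> <query FASTA> <database name>
protein_search() {
    db_fasta="$1"
    query="$2"
    db_name="$3"

    # Build a protein database (-dbtype prot instead of nucl).
    makeblastdb -in "$db_fasta" -dbtype prot -parse_seqids -out "$db_name" || return 1

    # Search with blastp and save the result as a format 11 archive,
    # named "<query name>.blastp@<database name>.asn".
    blastp -query "$query" -db "$db_name" -outfmt 11 \
        -out "${query%.*}.blastp@${db_name}.asn" -num_threads 8
}
```

For example, protein_search ref_proteins.fasta query.faa mydb would write query.blastp@mydb.asn. The thread count of 8 is carried over from the blastn example and should be matched to the machine.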

II. DIAMOND

1. Installing DIAMOND

Get the download link from the DIAMOND download page:

wget http://github.com/bbuchfink/diamond/releases/download/v0.9.17/diamond-linux64.tar.gz

tar xzf diamond-linux64.tar.gz

**The archive extracts to a single executable binary named diamond. Add its directory to the PATH environment variable and it is ready to use.
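A minimal sketch of that step, assuming the diamond binary was extracted into the current directory (DIAMOND_DIR is a name chosen for this example):

```shell
# Assumption: the diamond binary sits in the current directory.
DIAMOND_DIR="$PWD"
export PATH="$DIAMOND_DIR:$PATH"

# "diamond version" is one of the commands listed by "diamond help".
if command -v diamond >/dev/null 2>&1; then
    diamond version
else
    echo "diamond not found in $DIAMOND_DIR" >&2
fi
```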

2. Basic usage

To run an alignment task, we assume a protein database file in FASTA format named nr.faa and a file of DNA reads that we want to align named reads.fna.

1) Build the database. In order to set up a reference database for DIAMOND, the makedb command needs to be executed with the following command line:

$ diamond makedb --in nr.faa -d nr ## build the database

$ diamond help

diamond v0.8.8.70 | by Benjamin Buchfink
Check http://github.com/bbuchfink/diamond for updates.

Syntax: diamond COMMAND [OPTIONS]

Commands:
makedb    Build DIAMOND database from a FASTA file
blastp    Align amino acid query sequences against a protein reference database
blastx    Align DNA query sequences against a protein reference database
view      View DIAMOND alignment archive (DAA) formatted file
help      Produce help message
version   Display version information

General options:
--threads (-p)           number of CPU threads
--db (-d)                database file
--daa (-a)               DIAMOND alignment archive (DAA) file
--verbose (-v)           verbose console output
--log                    enable debug log
--quiet                  disable console output

Makedb options:
--in                     input reference file in FASTA format
--block-size (-b)        sequence block size in billions of letters (default=2)

Aligner options:
--query (-q)             input query file
--max-target-seqs (-k)   maximum number of target sequences to report alignments for
--top                    report alignments within this percentage range of top alignment score (overrides --max-target-seqs)
--compress               compression for output files (0=none, 1=gzip)
--evalue (-e)            maximum e-value to report alignments
--min-score              minimum bit score to report alignments (overrides e-value setting)
--id                     minimum identity% to report an alignment
--query-cover            minimum query cover% to report an alignment
--sensitive              enable sensitive mode (default: fast)
--index-chunks (-c)      number of chunks for index processing
--tmpdir (-t)            directory for temporary files
--gapopen                gap open penalty (default=11 for protein)
--gapextend              gap extension penalty (default=1 for protein)
--matrix                 score matrix for protein alignment
--seg                    enable SEG masking of queries (yes/no)
--salltitles             print full subject titles in output files

Advanced options:
--seed-freq              maximum seed frequency
--run-len (-l)           mask runs between stop codons shorter than this length
--max-hits (-C)          maximum number of hits to consider for one seed
--id2                    minimum number of identities for stage 1 hit
--window (-w)            window size for local hit search
--xdrop (-x)             xdrop for ungapped alignment
--gapped-xdrop (-X)      xdrop for gapped alignment in bits
--ungapped-score         minimum raw alignment score to continue local extension
--hit-band               band for hit verification
--hit-score              minimum score to keep a tentative alignment
--band                   band for dynamic programming computation
--shapes (-s)            number of seed shapes (0 = all available)
--index-mode             index mode (0=4x12, 1=16x9)
--fetch-size             trace point fetch size
--single-domain          Discard secondary domains within one target sequence
--dbsize                 effective database size (in letters)
--no-auto-append         disable auto appending of DAA and DMND file extensions

View options:
--out (-o)               output file
--outfmt (-f)            output format (tab/sam/xml)
--forwardonly            only show alignments of forward strand

2) Sequence alignment

** The makedb step above produces a file named nr.dmnd. The alignment task may then be initiated using the blastx command like this:

$ diamond blastx -d nr -q reads.fna -o matches.m8

The output file here is specified with the -o option and named matches.m8. By default, it is generated in BLAST tabular format.
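As a sketch of what might come next, the block below wraps a stricter blastx run and a filter over the tabular output. The function names blastx_strict and filter_m8 are hypothetical, and the values 8, 1e-5 and 1 are illustrative assumptions rather than recommendations. The flags come from the help listing above, and the column positions assume the standard 12-column BLAST tabular layout.

```shell
# Run blastx with explicit settings from the help listing:
# -p threads, -e maximum e-value, -k maximum targets per query, --sensitive.
# Arguments: <database> <query reads> <output file>
blastx_strict() {
    diamond blastx -d "$1" -q "$2" -o "$3" -p 8 -e 1e-5 -k 1 --sensitive
}

# Keep only hits with percent identity >= min_id and e-value <= max_e.
# Assumed columns: qseqid sseqid pident length mismatch gapopen
#                  qstart qend sstart send evalue bitscore
# Arguments: <m8 file> <min identity %> <max e-value>
filter_m8() {
    awk -F '\t' -v min_id="$2" -v max_e="$3" \
        '$3 + 0 >= min_id + 0 && $11 + 0 <= max_e + 0' "$1"
}
```

For example, blastx_strict nr reads.fna matches.m8 followed by filter_m8 matches.m8 80 1e-10 would print only the stronger hits. The --id and --evalue options can apply similar thresholds during the search itself; filtering afterwards keeps the full result on disk for other cut-offs.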
