What
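RepeatMasker is a library-based tool that identifies repetitive sequences by similarity search. It masks transposon-derived repeats and low-complexity sequences in the input (by default replacing them with N), works for nearly all species, and is essential software for genome and non-coding RNA analysis. In human genome analysis, about 56% of the sequence gets masked. For the sequence comparison step RepeatMasker can use one of several common search engines, including nhmmer, cross_match, ABBlast/WUBlast, RMBlast and Decypher (several engines can be installed, but only one is used per run).
Repbase is created and maintained by the Genetic Information Research Institute (GIRI); it collects transposons and other repetitive sequences together with their annotations.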
RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns). Currently over 56% of human genomic sequence is identified and masked by the program. Sequence comparisons in RepeatMasker are performed by one of several popular search engines including nhmmer, cross_match, ABBlast/WUBlast, RMBlast and Decypher. RepeatMasker makes use of curated libraries of repeats and currently supports Dfam (a profile HMM library derived from Repbase sequences) and Repbase, a service of the Genetic Information Research Institute.
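Online service
- RepeatMasker offers an online service: upload a nucleotide sequence or a FASTA file, choose the search engine, speed/sensitivity, species (DNA source) and output format, then click submit. Results come back within a few minutes, which makes it a very handy tool.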
- Search Engine
  - abblast
  - rmblast
  - hmmer
  - cross_match
- Speed/Sensitivity
  - rush
  - quick
  - default
  - slow
- DNA source
  - Human
  - Mouse
  - Arabidopsis
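The three web-form choices above have command-line counterparts. The following is a minimal sketch that only builds and prints a command line, without running RepeatMasker. `build_rm_cmd` is a hypothetical helper written for illustration. The flags `-engine`, `-qq`, `-q`, `-s` and `-species` are RepeatMasker options, but the engine names accepted by `-engine` differ between releases (for example, cross_match is spelled `crossmatch`, and RMBlast has been called `ncbi` in some versions), so check `RepeatMasker -help` on your installation.

```shell
#!/bin/sh
# build_rm_cmd: hypothetical helper, not part of RepeatMasker.
# It prints the command line that corresponds to the web-form choices.
build_rm_cmd() {
    engine=$1   # search engine name as accepted by your RepeatMasker version
    speed=$2    # rush | quick | default | slow
    species=$3  # e.g. human, mouse, arabidopsis
    fasta=$4    # input FASTA file

    case "$speed" in
        rush)    speed_flag="-qq" ;;
        quick)   speed_flag="-q" ;;
        slow)    speed_flag="-s" ;;
        default) speed_flag="" ;;
        *) echo "unknown speed: $speed" >&2; return 1 ;;
    esac

    # $speed_flag is intentionally unquoted so an empty value adds no argument.
    echo RepeatMasker -engine "$engine" $speed_flag -species "$species" "$fasta"
}

build_rm_cmd rmblast quick human test.fa
```

Printing the command first makes it easy to review the options before committing to a long run.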
Installing RepeatMasker locally
Besides the RepeatMasker main program, a local installation also needs TRF (Tandem Repeats Finder), a sequence search engine (RMBlast in this example), and the Repbase database.
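- TRF
wget http://tandem.bu.edu/trf/downloads/trf407b.linux
chmod +x trf407b.linux # make the downloaded binary executable
sudo mv trf407b.linux /usr/local/bin/trf # remember this location (path 1)
/usr/local/bin/trf # should print the usage message if the binary runs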
- RMBlast
wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/rmblast/2.2.28/ncbi-rmblastn-2.2.28-src.tar.gz
tar -zxvf ncbi-rmblastn-2.2.28-src.tar.gz
cd ncbi-rmblastn-2.2.28-src/c++
./configure --with-mt --prefix=/usr/local/rmblast --without-debug
make
sudo make install
# Remember where RMBlast was installed (path 2), */ncbi-rmblastn-2.2.28-src/c++/GCC480-ReleaseMT64/bin
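Before moving on, it can help to confirm that both paths really point at executables, since the configure step asks for them. A minimal sketch follows. `check_exec` is a hypothetical helper, and `RMBLAST_BIN` is an assumption based on the `--prefix` used above; point it at the build directory instead if that is where your `rmblastn` binary lives.

```shell
#!/bin/sh
# check_exec: hypothetical helper. Succeeds only if the argument is an
# existing regular file that the current user may execute.
check_exec() {
    if [ -f "$1" ] && [ -x "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or not executable: $1" >&2
        return 1
    fi
}

TRF_PATH=/usr/local/bin/trf          # path 1 from the TRF step
RMBLAST_BIN=/usr/local/rmblast/bin   # assumed path 2; adjust to your system

for p in "$TRF_PATH" "$RMBLAST_BIN/rmblastn"; do
    check_exec "$p" || echo "fix this path before running 'perl configure'"
done
```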
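- Repbase
Downloading Repbase requires registering on the official site. Commercial organizations have to pay, non-profit organizations can use it free of charge, and applications are reviewed manually. Copies can also be found through Google or Baidu. Extract the download and keep it for later.
- RepeatMasker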
wget http://www.repeatmasker.org/RepeatMasker-open-4-0-6.tar.gz
tar -zxvf RepeatMasker-open-4-0-6.tar.gz # extract the archive before entering the directory
cd RepeatMasker
perl configure
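<PRESS ENTER TO CONTINUE> # press Enter to continue
Enter path [ ]: # enter the path to the perl program
Enter path [ ]: # enter the path where RepeatMasker is installed
Enter path [ ]: # enter the TRF path (path 1)
Add a Search Engine: # choose a search engine (it must already be installed) and enter its path (path 2)
1. CrossMatch: [ Un-configured ]
2. RMBlast - NCBI Blast with RepeatMasker extensions: [ Un-configured ]
3. WUBlast/ABBlast (required by DupMasker): [ Un-configured ]
4. HMMER3.1 & DFAM: [ Un-configured ]
5. Done
Do you want RMBlast to be your default # set the default search engine
search engine for Repeatmasker? (Y/N) [ Y ]:
# Several engines can be configured; press 5 when finished
Congratulations! RepeatMasker is now ready to use. # confirms that installation is complete
# RepeatMasker is now installed. Next, copy the Repbase files downloaded and extracted earlier into the Libraries folder under the RepeatMasker install path.
# If RepeatMasker does not pick up the new libraries, re-running perl configure may be needed.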
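The copy step described above can be sketched as a small function. `copy_repbase` is a hypothetical helper, and both directory arguments in the commented usage line are placeholders. The layout of the extracted Repbase download varies by release, so check what the archive actually contains before copying.

```shell
#!/bin/sh
# copy_repbase: hypothetical helper, not part of RepeatMasker.
# Copies everything from the extracted Repbase directory into
# <RepeatMasker install dir>/Libraries.
copy_repbase() {
    src=$1    # directory holding the extracted Repbase files
    dest=$2   # RepeatMasker install directory
    if [ ! -d "$src" ]; then
        echo "no such directory: $src" >&2
        return 1
    fi
    mkdir -p "$dest/Libraries" || return 1
    # "$src"/. copies the directory contents rather than the directory itself.
    cp -R "$src"/. "$dest/Libraries/"
}

# Example call with placeholder paths (uncomment and adjust):
# copy_repbase "$HOME/Downloads/Libraries" /usr/local/RepeatMasker
```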
- Simple use
RepeatMasker -species human test.fa
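A run like the one above normally writes its results next to the input, including `test.fa.masked` (the masked sequence), `test.fa.out` (the annotation table) and `test.fa.tbl` (a summary). As a quick sanity check, the share of masked bases can be estimated by counting N characters in the masked FASTA. The sketch below uses a tiny made-up FASTA file so that it runs without RepeatMasker. `masked_percent` is a hypothetical helper, and it also counts any N that was already present in the input.

```shell
#!/bin/sh
# masked_percent: hypothetical helper. Prints the percentage of sequence
# characters that are N (or n) in a FASTA file, ignoring header lines.
masked_percent() {
    awk '!/^>/ { total += length($0); masked += gsub(/[Nn]/, "") }
         END   { if (total > 0) printf "%.1f\n", 100 * masked / total
                 else print "0.0" }' "$1"
}

# Made-up demo input standing in for test.fa.masked.
demo=$(mktemp)
cat > "$demo" <<'EOF'
>seq1
ACGTNNNNAC
NNACGTACGT
EOF

masked_percent "$demo"   # prints 30.0 (6 of 20 bases are N)
rm -f "$demo"
```

On real output, `masked_percent test.fa.masked` gives a rough figure that can be compared with the summary in `test.fa.tbl`.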