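This course comes from the Bioinformatics Skill Tree (生信技能樹) forum: http://www.biotrainee.com/forum.php?mod=viewthread&tid=1750#lastpost
A Mac or Linux system is preferable, with at least 8 GB of RAM and about 500 GB of storage.
If you are on Windows, you must install git, Notepad++, Everything, and a virtual machine, and then install Linux (preferably Ubuntu) inside the virtual machine. The software to install includes sratoolkit, fastqc, hisat2, samtools, htseq-count, R, and RStudio. The installation code for this software can be obtained by replying "老司機" to the Bioinformatics Skill Tree WeChat official account.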
System preparation:
Windows 7 Ultimate; Ubuntu 14.04.5 LTS installed under VMware Workstation
Software preparation:
Download and installation path for software packages: /work/LXJ/software
SRA Toolkit
Function: download and process NCBI SRA data
Website: https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=software
Installation:
# Download the package
$ wget https://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/2.8.2-1/sratoolkit.2.8.2-1-ubuntu64.tar.gz
# Extract the archive
$ tar -zxvf sratoolkit.2.8.2-1-ubuntu64.tar.gz
# Add the bin directory to PATH
$ echo 'export PATH=$PATH:/work/LXJ/software/sratoolkit.2.8.2-1-ubuntu64/bin' >> ~/.bashrc
# Reload the shell startup file
$ source ~/.bashrc
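# Check that the installation succeeded (-v means verbose; the version flag is --version)
$ prefetch --version
# Remove the downloaded archive
$ rm sratoolkit.2.8.2-1-ubuntu64.tar.gz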
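Every tool in this post is put on PATH by editing ~/.bashrc, and running the same echo line twice leaves duplicate entries. Below is a minimal sketch of a helper that appends the line only once. The function name add_to_path_once is hypothetical and is not part of any of the tools installed here.

```shell
# add_to_path_once DIR [RCFILE]
# Append an "export PATH" line for DIR to RCFILE (default: ~/.bashrc),
# but only if that exact line is not already present.
add_to_path_once() {
    local dir="$1"
    local rcfile="${2:-$HOME/.bashrc}"
    local line="export PATH=\"\$PATH:$dir\""
    touch "$rcfile"
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -Fxq -- "$line" "$rcfile"; then
        printf '%s\n' "$line" >> "$rcfile"
    fi
}

# Usage (not run here):
# add_to_path_once /work/LXJ/software/sratoolkit.2.8.2-1-ubuntu64/bin
# source ~/.bashrc
```

The same call can be reused for the FastQC, samtools, and HISAT2 directories instead of editing the file by hand in vim.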
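FastQC
Function: check the quality of next-generation sequencing data
Website: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
Installation:
# Check whether Java is installed
$ java -version
# If it is not installed, install it with the following command
$ sudo apt install openjdk-9-jdk
# Verify that Java was installed successfully
$ java -version
# Install FastQC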
$ wget http://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.5.zip
$ unzip fastqc_v0.11.5.zip
$ cd FastQC
$ chmod 770 fastqc
# Add FastQC to PATH
$ vim ~/.bashrc
# Append :/work/LXJ/software/FastQC to the end of the export PATH line
$ source ~/.bashrc
# Check that the installation succeeded
$ fastqc -v
samtools
SAM: the standard format for storing high-throughput sequencing alignment results
Function: reading, writing, editing, indexing, and viewing SAM/BAM/CRAM files
Website: http://samtools.sourceforge.net/
Installation:
Dependencies: zlib, bzip2, curses, htslib
$ sudo apt install autoconf libz-dev libbz2-dev liblzma-dev libssl-dev
# zlib
$ wget http://zlib.net/zlib-1.2.11.tar.gz
$ tar -zxvf zlib-1.2.11.tar.gz && cd zlib-1.2.11 && ./configure && make && sudo make install && cd .. && rm -rf zlib-1.2.11
#bzip2
$ wget http://bzip.org/1.0.6/bzip2-1.0.6.tar.gz
$ tar -zxvf bzip2-1.0.6.tar.gz && cd bzip2-1.0.6 && make && sudo make install && cd .. && rm -rf bzip2-1.0.6
#curses
$ sudo apt-get install libncurses5-dev
#htslib
$ git clone https://github.com/samtools/htslib.git
$ cd htslib
$ autoreconf
# Build samtools (return to the parent directory first so that samtools is cloned next to htslib)
$ cd ..
$ git clone https://github.com/samtools/samtools.git
$ cd samtools
$ autoconf -Wno-syntax
$ ./configure
$ make && make install prefix=$HOME/biosoft/samtools
$ vim ~/.bashrc
# Append :/work/LXJ/software/samtools to the end of the export PATH line
$ source ~/.bashrc
$ samtools --help
# The installation came from GitHub, so update with the following commands:
$ cd htslib; git pull
$ cd ../samtools; git pull
$ make clean
$ make
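The update steps can also be wrapped in a small loop so that htslib and samtools are pulled and rebuilt in one go. This is only a sketch: the function name update_from_git is hypothetical, and it assumes both repositories were cloned side by side under one base directory, as in the steps above.

```shell
# update_from_git BASE REPO...
# For each REPO directory under BASE: pull the latest commits, then rebuild.
update_from_git() {
    local base="$1"
    shift
    local repo
    for repo in "$@"; do
        # Run in a subshell so the caller's working directory is unchanged;
        # stop at the first repository that fails to update or build.
        ( cd "$base/$repo" && git pull && make clean && make ) || return 1
    done
}

# Usage (not run here), htslib first because samtools builds against it:
# update_from_git /work/LXJ/software htslib samtools
```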
HISAT2
Function: align sequencing reads to a reference genome
Website: http://ccb.jhu.edu/software/hisat2/index.shtml
Installation:
Download the Linux build of HISAT2; once unzipped it is ready to use:
$ wget ftp://ftp.ccb.jhu.edu/pub/infphilo/hisat2/downloads/hisat2-2.1.0-Linux_x86_64.zip
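Unzip (-d extracts into the specified directory):
$ unzip -d /work/LXJ/software/ hisat2-2.1.0-Linux_x86_64.zip
Check that it runs:
$ cd /work/LXJ/software/hisat2-2.1.0
$ ./hisat2
(ERR): hisat2-align exited with value 1 (this error can be ignored)
$ vi ~/.bashrc
# Append :/work/LXJ/software/hisat2-2.1.0 to the end of the export PATH line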
$ source ~/.bashrc
HTSeq
Function: count reads per gene from the alignment results
Website: http://htseq.readthedocs.io/en/release_0.9.1/
Installation:
HTSeq dependencies: setuptools, Cython, NumPy, pysam. Reference installation:
$ wget https://pypi.python.org/packages/fd/94/b7c8c1dcb7a3c3dcbde66b8d29583df4fa0059d88cc3592f62d15ef539a2/HTSeq-0.9.1.tar.gz#md5=fc71e021bf284a68f5ac7533a57641ac
$ tar zxvf HTSeq-0.9.1.tar.gz
$ cd HTSeq-0.9.1/
$ sudo python setup.py install
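The download URL above carries an #md5=... fragment, which can be used to check the tarball before unpacking it. A minimal sketch follows, assuming GNU md5sum is available (it is on Ubuntu). The function name verify_md5 is hypothetical.

```shell
# verify_md5 FILE EXPECTED
# Succeeds (exit status 0) only if FILE's MD5 digest equals EXPECTED.
verify_md5() {
    local file="$1"
    local expected="$2"
    local actual
    # md5sum prints "<digest>  <filename>"; keep only the first field
    actual=$(md5sum "$file" | awk '{print $1}')
    [ "$actual" = "$expected" ]
}

# Usage (not run here), with the digest taken from the URL fragment:
# verify_md5 HTSeq-0.9.1.tar.gz fc71e021bf284a68f5ac7533a57641ac && echo "checksum OK"
```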
MultiQC
Function: combine the QC results of multiple sequencing samples into a single report.
Website: http://multiqc.info/
Installation:
# Install MultiQC directly with conda
$ conda install -c bioconda multiqc
Check that the installation succeeded:
$ multiqc --help
Options:
  -f, --force          Overwrite any existing reports
  -n, --filename TEXT  Report filename. Use 'stdout' to print to standard out.
  -o, --outdir TEXT    Create report in the specified output directory.
  --pdf                Creates PDF report with 'simple' template. Requires Pandoc to be installed.
Usage:
$ multiqc *fastqc.zip --pdf
# Scan the current directory
$ multiqc .
$ multiqc "$(pwd)"
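Once everything is installed, it helps to confirm in one pass that each executable is actually reachable from PATH. The sketch below uses the shell builtin command -v. The function name missing_tools is hypothetical, and the executable names in the usage line are the ones the packages above normally provide.

```shell
# missing_tools NAME...
# Print every NAME that cannot be found on PATH.
# Exit status is 0 if all were found and 1 if at least one is missing.
missing_tools() {
    local tool
    local status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            status=1
        fi
    done
    return "$status"
}

# Usage (not run here):
# missing_tools prefetch fastqc samtools hisat2 htseq-count multiqc
```

No output and an exit status of 0 mean every listed tool was found. Any name printed points to a PATH entry that still needs to be added to ~/.bashrc.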
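References:
Article by hoptop: Introduction to Transcriptomics (1): Software Preparation
Article by lxmic: Introduction to Transcriptomics (1): Computer Setup and Software Installation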