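Introduction
The previous post covered MetaPhlAn (a tutorial on taxonomic profiling of metagenomic microbial communities); this one walks through how to use MetaPhlAn2.
Bitbucket page: https://bitbucket.org/biobakery/biobakery/wiki/metaphlan2
Dependencies: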
Python (version >= 2.7)
Bowtie2
Numpy
Pandas (optional, only required by utility scripts)
BioPython (optional, only required by utility scripts)
SciPy (optional, only required by utility scripts)
Matplotlib (optional, only required by utility scripts)
biom (optional, only required for biom format input/output)
I. Install with conda
conda install -c bioconda metaphlan2
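After installation it can be worth confirming that the executables used later in this tutorial are actually on your PATH. The following is a minimal sketch; the helper name `missing_tools` is hypothetical, and the list of tools is simply taken from the commands used further down.

```shell
# Hypothetical helper: print each named command that is NOT found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Executables used in this tutorial; an empty result means all were found.
missing_tools metaphlan2.py bowtie2 merge_metaphlan_tables.py
```

If a name is printed, check that the conda environment containing MetaPhlAn2 is activated.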
II. Sequencing data
Download on Windows:
SRS014476-Supragingival_plaque.fasta.gz
SRS014494-Posterior_fornix.fasta.gz
SRS014459-Stool.fasta.gz
SRS014464-Anterior_nares.fasta.gz
SRS014470-Tongue_dorsum.fasta.gz
SRS014472-Buccal_mucosa.fasta.gz
Download on Linux:
curl -O https://bitbucket.org/biobakery/biobakery/raw/tip/demos/biobakery_demos/data/metaphlan2/input/SRS014476-Supragingival_plaque.fasta.gz
curl -O https://bitbucket.org/biobakery/biobakery/raw/tip/demos/biobakery_demos/data/metaphlan2/input/SRS014494-Posterior_fornix.fasta.gz
curl -O https://bitbucket.org/biobakery/biobakery/raw/tip/demos/biobakery_demos/data/metaphlan2/input/SRS014459-Stool.fasta.gz
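The curl commands above cover three of the six demo files. The sketch below lists a URL for every sample by reusing the same base path. Note the assumption: only the first three URLs are confirmed by the commands above, and the other three are guessed to live in the same directory. The helper name `sample_urls` is hypothetical.

```shell
# Base URL copied from the curl commands above.
base_url="https://bitbucket.org/biobakery/biobakery/raw/tip/demos/biobakery_demos/data/metaphlan2/input"

# Hypothetical helper: print one download URL per sample file.
# Assumption: all six files sit under the same base URL (only three are confirmed above).
sample_urls() {
    for name in \
        SRS014476-Supragingival_plaque \
        SRS014494-Posterior_fornix \
        SRS014459-Stool \
        SRS014464-Anterior_nares \
        SRS014470-Tongue_dorsum \
        SRS014472-Buccal_mucosa
    do
        printf '%s/%s.fasta.gz\n' "$base_url" "$name"
    done
}

sample_urls
```

To download instead of just listing, the output can be piped into curl, for example `sample_urls | xargs -n 1 curl -O`.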
III. MetaPhlAn2 analysis
1. Preparation
mkdir metaphlan2_analysis
mv ~/Downloads/SRS*.fasta.gz metaphlan2_analysis/
cd metaphlan2_analysis
ls
2. Single-sample analysis
# Profile the first sample
metaphlan2.py SRS014476-Supragingival_plaque.fasta.gz --input_type fasta > SRS014476-Supragingival_plaque_profile.txt
# View the alignment results
less -S SRS014476-Supragingival_plaque.fasta.gz.bowtie2out.txt
# View the species abundance table for this sample
less -S SRS014476-Supragingival_plaque_profile.txt
# Multithreaded mode, second sample
metaphlan2.py SRS014459-Stool.fasta.gz --input_type fasta --nproc 4 > SRS014459-Stool_profile.txt
3. Multi-sample analysis
# The remaining four samples
metaphlan2.py SRS014464-Anterior_nares.fasta.gz --input_type fasta --nproc 4 > SRS014464-Anterior_nares_profile.txt
metaphlan2.py SRS014470-Tongue_dorsum.fasta.gz --input_type fasta --nproc 4 > SRS014470-Tongue_dorsum_profile.txt
metaphlan2.py SRS014472-Buccal_mucosa.fasta.gz --input_type fasta --nproc 4 > SRS014472-Buccal_mucosa_profile.txt
metaphlan2.py SRS014494-Posterior_fornix.fasta.gz --input_type fasta --nproc 4 > SRS014494-Posterior_fornix_profile.txt
Or:
# Profile all six samples in one loop
for f in SRS*.fasta.gz
do
    metaphlan2.py "$f" --input_type fasta --nproc 4 > "${f%.fasta.gz}_profile.txt"
done
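The loop relies on the shell expansion `${f%.fasta.gz}`, which strips the `.fasta.gz` suffix so that the profile is named after the sample. The sketch below makes that naming explicit and also skips samples whose bowtie2out file is already present. It assumes, as MetaPhlAn2 usually behaves, that the tool stops when it finds an existing alignment file. The helper names `profile_name` and `bowtie2out_name` are hypothetical.

```shell
# Hypothetical helpers: derive the file names used in this tutorial from an input name.
profile_name()    { printf '%s\n' "${1%.fasta.gz}_profile.txt"; }
bowtie2out_name() { printf '%s\n' "$1.bowtie2out.txt"; }

for f in SRS*.fasta.gz; do
    [ -e "$f" ] || continue    # the glob matched nothing in this directory
    if [ -e "$(bowtie2out_name "$f")" ]; then
        # Assumption: MetaPhlAn2 refuses to overwrite an existing alignment file.
        echo "skipping $f: alignment file already present" >&2
        continue
    fi
    metaphlan2.py "$f" --input_type fasta --nproc 4 > "$(profile_name "$f")"
done
```

For example, `profile_name SRS014459-Stool.fasta.gz` prints `SRS014459-Stool_profile.txt`.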
4. Species abundance tables for the six samples
SRS014459-Stool_profile.txt
SRS014464-Anterior_nares_profile.txt
SRS014470-Tongue_dorsum_profile.txt
SRS014472-Buccal_mucosa_profile.txt
SRS014476-Supragingival_plaque_profile.txt
SRS014494-Posterior_fornix_profile.txt
5. Alignment results for the six samples
SRS014459-Stool.fasta.gz.bowtie2out.txt
SRS014464-Anterior_nares.fasta.gz.bowtie2out.txt
SRS014470-Tongue_dorsum.fasta.gz.bowtie2out.txt
SRS014472-Buccal_mucosa.fasta.gz.bowtie2out.txt
SRS014476-Supragingival_plaque.fasta.gz.bowtie2out.txt
SRS014494-Posterior_fornix.fasta.gz.bowtie2out.txt
6. Merge the six species abundance tables
merge_metaphlan_tables.py *_profile.txt > merged_abundance_table.txt
Resulting merged table: merged_abundance_table.txt
# View the merged table
less -S merged_abundance_table.txt
IV. Plotting a heatmap with hclust2
1. Install hclust2 with conda
conda install -c biobakery hclust2
2. Extract species-level abundances
grep -E "(s__)|(^ID)" merged_abundance_table.txt | grep -v "t__" | sed 's/^.*s__//g' > merged_abundance_table_species.txt
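To see what this filter does, here is a small sketch that runs the same grep/sed pipeline on a toy table. The table contents are made up for illustration, and the names `species_only` and `toy_table` are hypothetical. The first grep keeps the header row and every row containing a species name, the second grep drops strain-level rows (`t__`), and sed trims each lineage down to the bare species name.

```shell
# Same filter as the command above, wrapped as a function that reads stdin.
species_only() {
    grep -E "(s__)|(^ID)" | grep -v "t__" | sed 's/^.*s__//g'
}

# Toy merged table (made-up values, tab-separated), shaped like merged_abundance_table.txt.
toy_table() {
    printf 'ID\tSampleA\tSampleB\n'
    printf 'k__Bacteria\t100.0\t100.0\n'
    printf 'k__Bacteria|p__Firmicutes\t60.0\t40.0\n'
    printf 'k__Bacteria|p__Firmicutes|g__Streptococcus|s__Streptococcus_mitis\t60.0\t40.0\n'
    printf 'k__Bacteria|p__Firmicutes|g__Streptococcus|s__Streptococcus_mitis|t__Streptococcus_mitis_unclassified\t60.0\t40.0\n'
}

# Only the header row and the species row survive, with the lineage prefix removed.
toy_table | species_only
```

The kingdom, phylum, and strain rows are filtered out, which is why hclust2 later sees one row per species.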
3. Plot the heatmap
hclust2.py -i merged_abundance_table_species.txt -o abundance_heatmap_species.png --ftop 25 --f_dist_f braycurtis --s_dist_f braycurtis --cell_aspect_ratio 0.5 -l --flabel_size 6 --slabel_size 6 --max_flabel_len 100 --max_slabel_len 100 --minv 0.1 --dpi 300
V. Plotting a phylogenetic tree with GraPhlAn
1. Install GraPhlAn with conda
conda install -c biobakery graphlan
2. Prepare the input files
This step produces merged_abundance.tree.txt and merged_abundance.annot.txt
export2graphlan.py --skip_rows 1,2 -i merged_abundance_table.txt --tree merged_abundance.tree.txt --annotation merged_abundance.annot.txt --most_abundant 100 --abundance_threshold 1 --least_biomarkers 10 --annotations 5,6 --external_annotations 7 --min_clade_size 1
3. Draw the tree
Outputs:
merged_abundance.xml
merged_abundance.png
merged_abundance_legend.png
merged_abundance_annot.png
graphlan_annotate.py --annot merged_abundance.annot.txt merged_abundance.tree.txt merged_abundance.xml
graphlan.py --dpi 300 merged_abundance.xml merged_abundance.png --external_legends
VI. Plotting a species-level heatmap with PanPhlAn
1. Input data
MetaPhlAn intermediate bowtie2 output files
13530241_SF05.fasta.gz.bowtie2out.txt
13530241_SF06.fasta.gz.bowtie2out.txt
19272639_SF05.fasta.gz.bowtie2out.txt
19272639_SF06.fasta.gz.bowtie2out.txt
40476924_SF05.fasta.gz.bowtie2out.txt
40476924_SF06.fasta.gz.bowtie2out.txt
2. Build the abundance table for the selected species
Species: s__Eubacterium_siraeum
Abundance: greater than 1%
metaphlan2.py --input_type bowtie2out -t clade_specific_strain_tracker --clade s__Eubacterium_siraeum --min_ab 1.0 13530241_SF05.fasta.gz.bowtie2out.txt > 13530241_SF05.siraeum.txt
metaphlan2.py --input_type bowtie2out -t clade_specific_strain_tracker --clade s__Eubacterium_siraeum --min_ab 1.0 13530241_SF06.fasta.gz.bowtie2out.txt > 13530241_SF06.siraeum.txt
metaphlan2.py --input_type bowtie2out -t clade_specific_strain_tracker --clade s__Eubacterium_siraeum --min_ab 1.0 19272639_SF05.fasta.gz.bowtie2out.txt > 19272639_SF05.siraeum.txt
metaphlan2.py --input_type bowtie2out -t clade_specific_strain_tracker --clade s__Eubacterium_siraeum --min_ab 1.0 19272639_SF06.fasta.gz.bowtie2out.txt > 19272639_SF06.siraeum.txt
metaphlan2.py --input_type bowtie2out -t clade_specific_strain_tracker --clade s__Eubacterium_siraeum --min_ab 1.0 40476924_SF05.fasta.gz.bowtie2out.txt > 40476924_SF05.siraeum.txt
metaphlan2.py --input_type bowtie2out -t clade_specific_strain_tracker --clade s__Eubacterium_siraeum --min_ab 1.0 40476924_SF06.fasta.gz.bowtie2out.txt > 40476924_SF06.siraeum.txt
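The six commands above differ only in the input file, so they can also be written as a loop. This is a sketch that reuses exactly the flags shown above; the helper name `tracker_name` is hypothetical, and the loop processes every bowtie2out file in the current directory, which is an assumption about how your files are laid out.

```shell
# Hypothetical helper: map a bowtie2out file name to its strain-tracker output name.
tracker_name() {
    printf '%s\n' "${1%.fasta.gz.bowtie2out.txt}.siraeum.txt"
}

for f in *.fasta.gz.bowtie2out.txt; do
    [ -e "$f" ] || continue    # the glob matched nothing in this directory
    metaphlan2.py --input_type bowtie2out -t clade_specific_strain_tracker \
        --clade s__Eubacterium_siraeum --min_ab 1.0 "$f" > "$(tracker_name "$f")"
done
```

If the directory also holds the SRS bowtie2out files from the earlier section, narrow the glob or move them aside first, otherwise they would be processed as well.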
Results:
13530241_SF05.siraeum.txt
13530241_SF06.siraeum.txt
19272639_SF05.siraeum.txt
19272639_SF06.siraeum.txt
40476924_SF05.siraeum.txt
40476924_SF06.siraeum.txt
3. Merge the results
merge_metaphlan_tables.py *.siraeum.txt > siraeum_tracker.txt
4. Plot the heatmap
hclust2.py -i siraeum_tracker.txt -o siraeum_tracker.png --skip_rows 1 --f_dist_f hamming --no_flabels --dpi 300 --cell_aspect_ratio 0.01