An earlier post covered 16S rDNA phylogenetic trees. This one moves to the bacterial genome level and records the three different pipelines that the following paper used to reconstruct phylogenetic structure.
Title: Extensive Unexplored Human Microbiome Diversity Revealed by Over 150,000 Genomes from Metagenomes Spanning Age, Geography, and Lifestyle
Journal: Cell
Year: 2019
Strategy 1 (Figure 1)
PhyloPhlAn software
Phylogeny built from the 400 universal PhyloPhlAn markers
phylophlan parameters: --diversity high --accurate --min_num_markers 80
Internal steps:
diamond:
blastx --quiet --threads 1 --outfmt 6 --more-sensitive --id 50 --max-hsps 35 -k 0
diamond:
blastp --quiet --threads 1 --outfmt 6 --more-sensitive --id 50 --max-hsps 35 -k 0
mafft --anysymbol (alignment)
trimal -gappyout (trimming)
RAxML -m PROTCATLG -p 1989 (tree building)
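The effect of --min_num_markers 80 on the diamond output can be sketched in a few lines of Python. This is only an illustration, not PhyloPhlAn's own code: it assumes the standard 12-column tabular layout of --outfmt 6 (qseqid, sseqid, pident, ...), assumes that the subject ID (column 2) names the marker, and uses hypothetical helper names `count_markers` and `keep_genomes`.

```python
def count_markers(tab_lines, min_identity=50.0):
    """Count distinct markers hit in one genome's diamond --outfmt 6 output.

    Assumption: column 2 (sseqid) is the marker ID and column 3 (pident)
    is the percent identity, as in the default tabular layout.
    """
    markers = set()
    for line in tab_lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split("\t")
        sseqid = fields[1]
        pident = float(fields[2])
        if pident >= min_identity:
            markers.add(sseqid)
    return len(markers)


def keep_genomes(hits_by_genome, min_num_markers=80):
    """Return the genomes that map to at least min_num_markers markers."""
    return sorted(
        genome
        for genome, lines in hits_by_genome.items()
        if count_markers(lines) >= min_num_markers
    )
```

With real data, `hits_by_genome` would map each genome name to the lines of its diamond output file; genomes below the cut-off are the ones discarded before alignment.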
Strategy 2 (Figure S3)
PhyloPhlAn software
Phylogeny reconstructed from the 400 PhyloPhlAn markers
phylophlan parameters: --diversity high --fast --min_num_markers 80
Internal steps:
diamond:
blastx --quiet --threads 1 --outfmt 6 --more-sensitive --id 50 --max-hsps 35 -k 0
diamond:
blastp --quiet --threads 1 --outfmt 6 --more-sensitive --id 50 --max-hsps 35 -k 0
mafft --anysymbol (alignment)
trimal -gappyout (trimming)
RAxML -m PROTCATLG -p 1989 (tree building)
IQ-TREE -nt AUTO -m LG (tree building)
Strategy 3 (Figure 3)
Roary identifies the set of core genes at a 95% identity threshold
roary -e -n -v -p 4 -i 95 \
-f ./result_roary/ \
./out/*.gff
PhyloPhlAn --diversity low --fast
--min_num_markers <50% of the number of core genes identified>
--min_num_entries <90% of the number of input genomes>
--diversity {low,medium,high}
Specify the expected diversity of the phylogeny,
automatically adjust some parameters: "low": for
genus-/species-/strain-level phylogenies; "medium":
for class-/order-level phylogenies; "high": for
phylum-/tree-of-life size phylogenies (default: None)
--fast Perform more a faster phylogeny reconstruction by
reducing the phylogenetic positions to use; affected
parameters depend on the "--diversity" level (default:
False)
--min_num_markers MIN_NUM_MARKERS
Input genomes or proteomes that map to less than the
specified number of markers will be discarded
(default: 1)
--min_num_entries MIN_NUM_ENTRIES
The minimum number of entries to be present for each
of the markers in the database (default: 4)
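The two placeholder thresholds above depend on the Roary result, so they have to be computed per dataset. Below is a minimal sketch of that arithmetic. The function names are hypothetical, and rounding up is an assumption, since the notes do not say how fractional values were handled.

```python
def phylophlan_thresholds(n_core_genes, n_genomes):
    """Return (min_num_markers, min_num_entries).

    min_num_markers = 50% of the core genes identified by Roary
    min_num_entries = 90% of the input genomes
    Integer arithmetic is used for the ceiling (rounding up is an
    assumption) so that floating-point error cannot shift the result.
    """
    min_num_markers = (n_core_genes + 1) // 2      # ceil(n / 2)
    min_num_entries = (9 * n_genomes + 9) // 10    # ceil(0.9 * n)
    return min_num_markers, min_num_entries


def phylophlan_options(n_core_genes, n_genomes):
    """Build the option list used in strategy 3 (input/output options omitted)."""
    markers, entries = phylophlan_thresholds(n_core_genes, n_genomes)
    return [
        "--diversity", "low",
        "--fast",
        "--min_num_markers", str(markers),
        "--min_num_entries", str(entries),
    ]
```

For example, 200 core genes across 50 genomes gives --min_num_markers 100 and --min_num_entries 45.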
blastn -outfmt 6 -max_target_seqs 1000000
mafft --anysymbol --auto (alignment)
trimal -gappyout (trimming)
FastTree -mlacc 2 -slownni -spr 4 -fastest -mlnni 4 -no2nd -gtr -nt (tree building)
RAxML -p 1989 -m GTRCAT
-t <phylogenetic tree computed by FastTree>
NMDS based on Roary genetic distances
The non-metric multidimensional scaling plots
were computed on pairwise genetic distances between core gene alignments produced by Roary
using the nmds function in the ecodist R package
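The input to NMDS is a pairwise distance matrix between genomes. The notes do not say which genetic distance was used, so the sketch below assumes a simple p-distance (the proportion of differing sites, skipping gaps and N) computed on an aligned core-gene alignment. The function names are hypothetical, and the resulting matrix would still have to be passed to an NMDS routine such as the ecodist nmds function mentioned above.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') or an 'N' are skipped.
    This distance measure is an assumption, not taken from the paper.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            differences += 1
    return differences / compared if compared else 0.0


def distance_matrix(alignment):
    """Return (names, matrix) for a dict of genome name -> aligned sequence."""
    names = sorted(alignment)
    matrix = [
        [p_distance(alignment[a], alignment[b]) for b in names]
        for a in names
    ]
    return names, matrix
```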
Visualization
The phylogenetic trees were generated using GraPhlAn and the phylogenies were generated using FigTree
Further methods
Article: Insights on the Evolutionary Genomics of the Blautia Genus: Potential New Species and Genetic Content Among Lineages
Journal: Frontiers in Microbiology
Year: 2021
Strategy 4
OrthoFinder to obtain conserved gene families (orthogroups)
Perl script to retrieve the protein sequences
MAFFT (L-INS-i mode) to align the orthogroups
ModelTest-NG:
Akaike information criterion (AIC)
IQ-TREE 2:
1000 replicates of ultrafast bootstrap
UFBoot trees further optimized by NNI (-bnni)
SH-like approximate likelihood ratio test (-alrt)
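The model-selection step in strategy 4 relies on the Akaike information criterion, AIC = 2k - 2 ln L, where k is the number of free parameters and ln L is the maximized log-likelihood; the model with the lowest AIC is preferred. A minimal sketch of that comparison follows. The function names are hypothetical and the example numbers in the usage note are invented for illustration, not ModelTest-NG output.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_model(candidates):
    """Pick the model with the lowest AIC.

    candidates: dict mapping model name -> (log_likelihood, n_params)
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))
```

For instance, with invented values {"LG": (-1000.0, 10), "LG+G4": (-990.0, 11)}, the AIC values are 2020 and 2002, so LG+G4 would be chosen despite its extra parameter.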
Strategy 5
panX core genome used to construct a single-nucleotide polymorphism (SNP)-based tree
cophylo to compare the phylogenomic amino acid tree with the SNP-based tree
TypeMat from the Microbial Genomes Atlas (MiGA) for taxonomic classification of the bacteria, removing anomalous classifications