Microbiome Analysis Tools Developed at EMBL
Welcome to the web portal for computational microbiome analysis tools developed at EMBL by the groups of Peer Bork and Georg Zeller. The graph illustrates which analysis tasks can be performed by each of the tools listed to the right and how they can be combined into complex microbiome analysis pipelines. Find out more about the tools by expanding the corresponding list items. Tags indicating application areas show you which tools are applicable to the analysis of 16S amplicon sequencing data or prokaryotic genome sequencing data in addition to shotgun metagenomics data.
[Figure: workflow graph connecting data and analysis tasks. Legend: raw input data; preprocessed input data; intermediate results & resources; results & visualizations. Data nodes: raw shotgun metagenomic read data; reference genome sequences; meta-data (external factors); high-quality microbial reads; metagenomic genes; metagenomic gene catalog; annotated metagenomic gene catalog; marker gene database; mOTU database; taxonomic community profile; functional community profile; predictive microbiome models; metabolic pathway maps; community types (enterotypes); differentially abundant microbial features; phylogenetic tree of microbial life. Analysis tasks: quality and contamination filtering; assembly; gene prediction; redundancy removal; mapping against functional DBs; quality control and annotation; extraction of phylogenetic marker genes; clustering; mapping against taxonomic DBs; functional & phylogenetic annotation; visual exploration; phylogenetic analysis & visualization; statistical analysis; cluster analysis.]
MOCAT
Metagenomics assembly and profiling pipeline
MOCAT is a modular and scalable software pipeline for analyzing shotgun metagenomics datasets generated with Illumina technology. Starting from raw FASTQ files, it can quality-filter the reads and remove contaminants, assemble the reads into contigs, predict prokaryotic genes on these contigs, identify phylogenetic marker genes, and generate taxonomic abundance profiles by mapping reads to these marker genes.
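As a rough illustration of the first step named above (quality filtering of raw FASTQ reads), the following Python sketch keeps only reads that pass a mean-quality and a length threshold. It is not MOCAT code: the function names, the thresholds, and the assumption of Phred+33 quality encoding are all hypothetical choices made for this example.

```python
from statistics import mean


def parse_fastq(lines):
    """Return (header, sequence, quality) tuples from FASTQ lines (4 lines per record)."""
    rows = [ln.strip() for ln in lines if ln.strip()]
    usable = len(rows) - len(rows) % 4  # ignore a truncated trailing record
    return [(rows[i], rows[i + 1], rows[i + 3]) for i in range(0, usable, 4)]


def mean_phred(quality, offset=33):
    """Mean Phred score of a quality string, assuming Phred+33 encoding."""
    return mean(ord(ch) - offset for ch in quality)


def quality_filter(records, min_mean_quality=20, min_length=45):
    """Keep reads that are long enough and whose mean quality is high enough.

    Both thresholds are illustrative defaults, not values taken from MOCAT.
    """
    return [
        rec for rec in records
        if len(rec[1]) >= min_length and mean_phred(rec[2]) >= min_mean_quality
    ]


# Toy input: one good read, one low-quality read, one read that is too short.
reads = [
    "@read1", "ACGTACGT", "+", "IIIIIIII",  # 'I' encodes Phred 40
    "@read2", "ACGTACGT", "+", "!!!!!!!!",  # '!' encodes Phred 0
    "@read3", "ACG", "+", "III",
]
kept = quality_filter(parse_fastq(reads), min_mean_quality=20, min_length=5)
print([rec[0] for rec in kept])
```

A real pipeline would typically also trim low-quality read ends and screen reads against contaminant reference sequences; those steps are left out here to keep the sketch short.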

Homepage Source code User manual

Contact: zeller@embl.de
Version: 2.0

Publications:

Kultima et al., MOCAT2: a metagenomic assembly, annotation and profiling framework. Bioinformatics 16. 2016. PMID: 27153620

Kultima et al., MOCAT: a metagenomics assembly and gene prediction toolkit. PLoS ONE 10. 2012. PMID: 23082188



Tags: Metagenomics


proGenomes
Taxonomic and functional annotation of prokaryotes
proGenomes is a database providing consistent taxonomic and functional annotations for 87,920 bacterial and archaeal genomes belonging to over 12,000 species. The genomes can be explored interactively and downloaded, and users can define custom subsets such as taxonomic clades, representatives of each species, or habitat-specific organisms.

Homepage Documentation

Contact: zeller@embl.de
Version: 2.0

Publications:

Mende et al., proGenomes2: an improved database for accurate and consistent habitat, taxonomic and functional annotations of prokaryotic genomes. Nucleic Acids Res. D1. 2020. PMID: 31647096

Mende et al., proGenomes: a resource for consistent functional and taxonomic annotations of prokaryotic genomes. Nucleic Acids Res. D1. 2017. PMID: 28053165



Tags: Metagenomics; Genomics; Data submission, annotation and curation; Taxonomy


mOTUs
Metagenomic species profiling
Metagenomic operational taxonomic units (mOTUs) enable high-accuracy taxonomic profiling of known (sequenced) and unknown microorganisms at species-level resolution from shotgun metagenomic or metatranscriptomic data. The method clusters single-copy phylogenetic marker gene sequences from metagenomes and reference genomes into mOTUs to quantify their abundances in meta-omics data with very high precision and recall.
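The quantification idea described above (several single-copy marker genes per mOTU, combined into one abundance estimate) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the mOTUs implementation: the function name, the input layout, the length normalization, and the use of the median across marker genes are all hypothetical choices for this example.

```python
from statistics import median


def motu_relative_abundance(marker_counts, gene_lengths):
    """Estimate relative abundances from per-marker-gene read counts.

    marker_counts: {motu_id: {gene_id: mapped_read_count}}
    gene_lengths:  {gene_id: gene_length_in_bp}
    Each mOTU is summarised by the median length-normalised count of its
    marker genes, then values are scaled to sum to 1 across mOTUs.
    """
    coverage = {}
    for motu, genes in marker_counts.items():
        per_gene = [count / gene_lengths[gene] for gene, count in genes.items()]
        coverage[motu] = median(per_gene)
    total = sum(coverage.values())
    if total == 0:
        return {motu: 0.0 for motu in coverage}
    return {motu: value / total for motu, value in coverage.items()}


# Toy data: two mOTUs with three marker genes each (all values invented).
gene_lengths = {"g1": 1000, "g2": 500, "g3": 2000, "g4": 1000, "g5": 500, "g6": 2000}
marker_counts = {
    "mOTU_1": {"g1": 30, "g2": 15, "g3": 60},
    "mOTU_2": {"g4": 10, "g5": 5, "g6": 20},
}
abundances = motu_relative_abundance(marker_counts, gene_lengths)
print({motu: round(value, 3) for motu, value in abundances.items()})
```

Using a median rather than a sum makes the estimate less sensitive to a single marker gene that attracts spurious read mappings, which is the reason for that choice in this sketch.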

Homepage Source code Biological data Container Documentation

Contact: zeller@embl.de
Version: 2.5

Publications:

Milanese et al., Microbial abundance, activity and population genomic profiling with mOTUs2. Nat Commun 1. 2019. PMID: 30833550

Sunagawa et al., Metagenomic species profiling using universal phylogenetic marker genes. Nat. Methods 12. 2013. PMID: 24141494



Tags: Metagenomics


eggNOG
Resource for orthologous genes
eggNOG is a database of nested orthologous gene groups (NOGs) inferred using unsupervised clustering applied to >5,000 complete genomes followed by comprehensive characterization and analysis of the resulting gene families. eggNOG provides orthologous group assignments at 379 different taxonomic levels as well as multiple sequence alignments, maximum-likelihood trees and broad functional annotations for each group accessible via a web interface or through bulk download.

Homepage Biological data Documentation

Contact: zeller@embl.de
Version: 5.0

Publications:

Huerta-Cepas et al., eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses. Nucleic Acids Res. D1. 2019. PMID: 30418610

Huerta-Cepas et al., Fast Genome-Wide Functional Annotation through Orthology Assignment by eggNOG-Mapper. Mol. Biol. Evol. 8. 2017. PMID: 28460117

Huerta-Cepas et al., eggNOG 4.5: a hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences. Nucleic Acids Res. D1. 2016. PMID: 26582926

Powell et al., eggNOG v4.0: nested orthology inference across 3686 organisms. Nucleic Acids Res. Database issue. 2014. PMID: 24297252

Powell et al., eggNOG v3.0: orthologous groups covering 1133 organisms at 41 different taxonomic ranges. Nucleic Acids Res. Database issue. 2012. PMID: 22096231

Muller et al., eggNOG v2.0: extending the evolutionary genealogy of genes with enhanced non-supervised orthologous groups, species and functional annotations. Nucleic Acids Res. Database issue. 2010. PMID: 19900971

Jensen et al., eggNOG: automated construction and annotation of orthologous groups of genes. Nucleic Acids Res. Database issue. 2008. PMID: 17942413



Tags: Metagenomics; Genomics


iPath
Cellular pathway mapping tool
iPath is a web-based tool for the visualization and analysis of cellular pathways. Based on functional annotations (such as KEGG), it provides pathway maps for primary cellular metabolism as well as for some additional secondary metabolite synthesis and regulatory pathways. Users can map their own data onto these pathway maps. With its navigation and customization functions, iPath allows users to easily explore and analyze the functional and metabolic capabilities of their (meta-)genomic data sets.

Homepage Documentation

Contact: zeller@embl.de
Version: 3.0

Publications:

Darzi et al., iPath3.0: interactive pathways explorer v3. Nucleic Acids Res. W1. 2018. PMID: 29718427

Yamada et al., iPath2.0: interactive pathway explorer. Nucleic Acids Res. Web Server issue. 2011. PMID: 21546551

Letunic et al., iPath: interactive exploration of biochemical pathways and networks. Trends Biochem. Sci. 3. 2008. PMID: 18276143



Tags: Metagenomics; Genomics


Interactive Tree Of Life (iTOL)
Interactive tree visualization
Interactive Tree Of Life (iTOL) is an online tool for the display and manipulation of phylogenetic trees. It provides a large variety of tree layouts, drawing and annotation features including circular tree layout. iTOL is well-suited for a wide range of tree sizes up to several thousand leaves. Tree displays can be exported in several graphical formats, both bitmap and vector based.

Homepage User manual

Contact: zeller@embl.de
Version: 5.0

Publications:

Letunic & Bork, Interactive Tree Of Life (iTOL) v4: recent updates and new developments. Nucleic Acids Res. W1. 2019. PMID: 30931475

Letunic & Bork, Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees. Nucleic Acids Res. W1. 2016. PMID: 27095192

Letunic & Bork, Interactive Tree Of Life v2: online annotation and display of phylogenetic trees made easy. Nucleic Acids Res. Web Server issue. 2011. PMID: 21470960

Letunic & Bork, Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics 1. 2007. PMID: 17050570



Tags: Phylogenetics; Metagenomics; Genomics


Enterotyping
Gut microbial community typing
Enterotypes are densely populated regions in a high-dimensional space of microbiome community composition, by which human individuals can be stratified (Arumugam, Raes et al. Nature 2011). Computational methods to detect and characterise enterotypes in any dataset, either to reproduce previous reports or determine enterotypes in new studies, are provided and explained.
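Detecting dense regions in community-composition space requires a distance between the relative-abundance profiles of two samples. One measure often used for such profiles is the Jensen-Shannon distance, sketched below in Python. The choice of this metric and all function names are assumptions made for illustration; the sketch covers only the distance step and is not the code provided on the enterotyping site.

```python
from math import log2, sqrt


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence between two profiles.

    p and q are relative abundances over the same taxa, each summing to 1.
    With base-2 logarithms the result lies between 0 and 1.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return sqrt(max(jsd, 0.0))  # guard against tiny negative rounding errors


def distance_matrix(profiles):
    """Pairwise Jensen-Shannon distances for a list of sample profiles."""
    return [[jensen_shannon_distance(a, b) for b in profiles] for a in profiles]


# Toy genus-level profiles for three samples (values invented).
samples = [
    [0.70, 0.20, 0.10],
    [0.65, 0.25, 0.10],
    [0.10, 0.20, 0.70],
]
for row in distance_matrix(samples):
    print([round(value, 3) for value in row])
```

The resulting matrix can be passed to any clustering method that accepts precomputed distances, for example a medoid-based partitioning, to assign samples to community types.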

Homepage User manual

Contact: zeller@embl.de
Version: 2.0

Publications:

Costea et al., Enterotypes in the landscape of gut microbial community composition. Nat Microbiol 1. 2018. PMID: 29255284

Arumugam et al., Enterotypes of the human gut microbiome. Nature 7346. 2011. PMID: 21508958



Tags: Metagenomics


SIAMCAT
Statistical analysis of microbiome data
SIAMCAT is a modular R toolbox, which offers machine learning and statistical testing workflows enabling the user to associate microbial community profiles with host phenotypes, such as disease states in clinical case-control studies. SIAMCAT combines statistical rigor with flexible customization for training and evaluating several machine learning classifiers. From these, microbiome biomarkers or signatures can be extracted and validated.
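To make the idea of associating microbial features with a host phenotype concrete, the sketch below ranks features by how well their abundance alone separates cases from controls, using the area under the ROC curve (AUROC). SIAMCAT itself is an R package; this Python example does not use or mirror its API, and the function names, labels, and data are hypothetical.

```python
def auroc(case_values, control_values):
    """Probability that a random case value exceeds a random control value.

    Ties count as 0.5. A value of 0.5 means no separation; values near
    0 or 1 mean the feature is depleted or enriched in cases.
    """
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


def rank_features(profiles, labels):
    """Rank features by how far their single-feature AUROC is from 0.5.

    profiles: {feature_name: [abundance per sample]}
    labels:   list with "case" or "control" for each sample
    """
    ranked = []
    for feature, values in profiles.items():
        cases = [v for v, lab in zip(values, labels) if lab == "case"]
        controls = [v for v, lab in zip(values, labels) if lab == "control"]
        ranked.append((feature, auroc(cases, controls)))
    return sorted(ranked, key=lambda item: abs(item[1] - 0.5), reverse=True)


# Toy relative abundances for four samples (values invented).
labels = ["case", "case", "control", "control"]
profiles = {
    "taxon_A": [0.30, 0.25, 0.05, 0.02],
    "taxon_B": [0.10, 0.12, 0.11, 0.09],
}
ranked = rank_features(profiles, labels)
print(ranked)
```

A full workflow would add significance testing with multiple-testing correction and cross-validated model training; the single-feature ranking here is only the simplest form of association analysis.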

Homepage Source code Bioconductor Vignette

Contact: zeller@embl.de
Version: 1.9.0

Publications:

Wirbel et al., SIAMCAT: user-friendly and versatile machine learning workflows for statistically rigorous microbiome analyses. bioRxiv. 2020. doi:10.1101/2020.02.06.931808



Tags: Metagenomics; Genomics; Statistics and probability


For questions & feedback, please contact Peer Bork or Georg Zeller.