MetaProteomeAnalyzer

Development and goal of the MPA software

The MetaProteomeAnalyzer (MPA) is a dedicated software suite for metaproteomics. It features an intuitive graphical user interface for metaproteomics data analysis and interpretation. The first version of the MPA was developed by Thilo Muth, Alexander Behne, Robert Heyer, and Fabian Kohrs between 2013 and 2015.

Muth T, et al. The MetaProteomeAnalyzer: a powerful open-source software suite for metaproteomics data analysis and interpretation. J Proteome Res 14, 1557-1565 (2015).

The software has since been updated, and MPA 3.0 is now available. Please cite it as:

Heyer R, Schallert K (shared first authors), et al. MPA-WORKFLOW: A robust and universal metaproteomics workflow for research studies and routine diagnostics within 24 h using phenol extraction, FASP digest, and the MetaProteomeAnalyzer. Frontiers in Microbiology 10, 1883 (2019). doi: 10.3389/fmicb.2019.01883

The metaprotein concept

Metaproteins are protein groups designed for the special use case of metaproteomics. To deal with homologous proteins, which are expected in a multi-species system, proteins are grouped into metaproteins using a set of rules. Each metaprotein is then assigned a taxonomy based on the proteins it contains, according to the settings the user provides. Unlike the protein groups used by other proteomics tools, a metaprotein should not be considered a single protein with an ambiguous identification. Instead, it constitutes a group of related proteins, all of which are potentially present in the sample. It follows that metaproteins will sometimes be assigned apparently unspecific taxonomies (e.g. superkingdom rank). This indicates that the protein sequences on which the metaprotein is based are highly conserved across different taxa, which makes a specific taxonomic assignment impossible in a microbial community of multiple unknown species. Metaproteins also combine other metadata from their proteins into a single entry: UniProt keywords, UniRef clusters, KEGG orthologies and Enzyme Commission (EC) numbers.

Metaproteins are created according to the rules the user chooses. The three rules are: 1. Peptide Rule, 2. Cluster Rule and 3. Taxonomy Rule, as shown in Figure 1. The rules can be combined freely, with the restriction that the Taxonomy Rule must be used together with the Peptide Rule or the Cluster Rule. Table 1 lists all available options and describes how each one affects metaprotein generation.
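The "Shared Peptide" variant of the Peptide Rule listed in Table 1 can be read as finding connected components in a graph whose nodes are proteins and whose edges are shared peptides. The following is a minimal sketch of that idea in Python, not the MPA implementation: the function names, the input format and the use of a union-find structure are all assumptions made for illustration. The optional `il_equal` flag mimics the Leucine/Isoleucine option by mapping I to L before peptides are compared.

```python
from collections import defaultdict


def normalize(peptide, il_equal=True):
    """Map isoleucine (I) to leucine (L) when the two should count as equal."""
    return peptide.replace("I", "L") if il_equal else peptide


def group_by_shared_peptide(protein_peptides, il_equal=True):
    """Group proteins linked, directly or transitively, by a shared peptide.

    protein_peptides: dict mapping a protein accession to its peptides
    (hypothetical input format). Returns one set of accessions per group.
    """
    parent = {protein: protein for protein in protein_peptides}

    def find(p):
        # Follow parent links to the representative, shortening the path.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Remember the first protein seen for each peptide and merge later
    # proteins carrying the same peptide into that protein's group.
    first_owner = {}
    for protein, peptides in protein_peptides.items():
        for peptide in peptides:
            key = normalize(peptide, il_equal)
            if key in first_owner:
                union(first_owner[key], protein)
            else:
                first_owner[key] = protein

    groups = defaultdict(set)
    for protein in protein_peptides:
        groups[find(protein)].add(protein)
    return list(groups.values())


# Made-up example: P1 and P3 share no peptide but are linked through P2.
example = {
    "P1": ["AAK", "CCR"],
    "P2": ["CCR", "DDK"],
    "P3": ["DDK"],
    "P4": ["EEK"],
}
metaproteins = group_by_shared_peptide(example)
```

In this example P1, P2 and P3 end up in one group even though P1 and P3 have no peptide in common, which is the transitive behaviour the rule describes, while P4 stays on its own.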

Figure 1: Metaprotein Rules. Different rules can be applied to determine how proteins are grouped together into metaproteins: 1. Peptide Rule, 2. Cluster Rule, 3. Taxonomy Rule.

 

Table 1: List of metaprotein rules and other options.

Metaprotein Rule | Description
Peptide Rule: Shared Peptide | Two proteins are considered for one metaprotein if they have at least one peptide in common. Under this rule, two proteins of a metaprotein may have no peptides in common with each other if each shares a peptide with a third protein.
Peptide Rule: Shared Peptide Subset | Two proteins are considered for one metaprotein if they share a common set of peptides. Either both proteins contain exactly the same set of peptides, or the peptides of one protein form a subset of the peptides of the other. Under this rule, two proteins are not grouped if both possess unique peptides.
Peptide Rule: Leucine/Isoleucine | Since leucine and isoleucine have the same molecular weight, they are considered indistinguishable by mass spectrometry. This option determines whether peptides that differ only in these amino acids are treated as equal or as distinct for the purpose of the other peptide rules.
Peptide Rule: Levenshtein distance | The Levenshtein distance is the minimum number of single amino acid edits (substitutions, insertions or deletions) needed to turn one peptide sequence into another. Under this rule, peptides within the Levenshtein distance set by the user are considered equal for the purpose of the other peptide rules.
Cluster Rule: UniRef100 | Proteins are grouped into a metaprotein if they belong to the same UniRef100 cluster.
Cluster Rule: UniRef90 | Proteins are grouped into a metaprotein if they belong to the same UniRef90 cluster. This always includes all proteins that also share a UniRef100 cluster.
Cluster Rule: UniRef50 | Proteins are grouped into a metaprotein if they belong to the same UniRef50 cluster. This always includes all proteins that also share a UniRef90 or UniRef100 cluster.
Taxonomy Rule | The taxonomy rule prevents two proteins from being grouped into a metaprotein if they are not taxonomically close enough. The user chooses the highest taxonomic rank at which proteins are still grouped into a metaprotein. This rule does not work on its own and has to be used together with the peptide or cluster rule.
Peptide-to-Protein Taxonomy | Two options determine how protein taxonomies are redefined based on the peptide taxonomies: lowest common ancestor (LCA) or most specific member. LCA finds the lowest common ancestor taxonomy (up to "root") to which all peptides of the protein belong. Most specific member selects the first of the peptide taxonomies with the lowest rank (e.g. subspecies).
Protein-to-Metaprotein Taxonomy | Similarly, two options determine how metaprotein taxonomies are generated based on the protein taxonomies: lowest common ancestor (LCA) or most specific member. LCA finds the lowest common ancestor taxonomy (up to "root") to which all proteins of the metaprotein belong. Most specific member selects the first of the protein taxonomies with the lowest rank (e.g. subspecies).
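Two of the peptide options in Table 1 translate into short functions. The sketch below, in Python, shows a standard dynamic-programming Levenshtein distance and a subset test for the "Shared Peptide Subset" rule. It is an illustration only: the function names and the `max_distance` parameter are hypothetical, and treating the user setting as an upper bound ("at most this many edits") is an assumption about how the threshold is applied.

```python
def levenshtein(a, b):
    """Minimum number of single-residue insertions, deletions or substitutions."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, residue_a in enumerate(a, start=1):
        current = [i]
        for j, residue_b in enumerate(b, start=1):
            cost = 0 if residue_a == residue_b else 1
            current.append(min(
                previous[j] + 1,         # delete residue_a
                current[j - 1] + 1,      # insert residue_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def peptides_equal(a, b, max_distance=1):
    """Treat two peptides as equal if they are within max_distance edits."""
    return levenshtein(a, b) <= max_distance


def is_shared_subset(peptides_a, peptides_b):
    """True if one peptide set is contained in the other (or both are identical)."""
    set_a, set_b = set(peptides_a), set(peptides_b)
    return set_a <= set_b or set_b <= set_a
```

With these definitions, `peptides_equal("PEPTIDEK", "PEPTLDEK")` holds at a threshold of 1 because a single substitution separates the two sequences, and `is_shared_subset(["AAK", "CCR"], ["AAK", "DDK"])` is false because both proteins possess a unique peptide.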

 

Figure 2: Metaprotein Taxonomy. When metaproteins are created, the five main steps A-E are followed to determine the taxonomy of the metaprotein. The "Protein-to-Peptide" (C) taxonomy is set to be the lowest common ancestor (LCA) taxonomy. The "Peptide-to-Protein" (D) and "Protein-to-Metaprotein" (E) taxonomies can be set to LCA or "most specific member" independently of each other.
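The difference between the two taxonomy options can be made concrete with a small Python sketch. It is not taken from the MPA code base: the function names are hypothetical, each taxonomy is assumed to be given as a list of taxon names ordered from superkingdom downwards, and lineage length is used as a stand-in for rank depth, which is a simplification. The lineages in the example are illustrative.

```python
def lowest_common_ancestor(lineages):
    """Return the deepest taxon shared by all lineages, or "root" if none is shared."""
    common = "root"
    # Walk down the lineages level by level until they disagree.
    for taxa in zip(*lineages):
        if len(set(taxa)) != 1:
            break
        common = taxa[0]
    return common


def most_specific_member(lineages):
    """Return the final taxon of the first lineage that reaches the deepest level."""
    # max() returns the first maximal element, so ties go to the earliest lineage.
    deepest = max(lineages, key=len)
    return deepest[-1]


# Illustrative lineages for two proteins of one metaprotein.
e_coli = ["Bacteria", "Proteobacteria", "Gammaproteobacteria",
          "Enterobacterales", "Enterobacteriaceae", "Escherichia",
          "Escherichia coli"]
salmonella = ["Bacteria", "Proteobacteria", "Gammaproteobacteria",
              "Enterobacterales", "Enterobacteriaceae", "Salmonella"]

lca = lowest_common_ancestor([e_coli, salmonella])
specific = most_specific_member([e_coli, salmonella])
```

Here the LCA option yields the family Enterobacteriaceae, the deepest level on which both lineages agree, while the most specific member option yields Escherichia coli. If the lineages already differ at the superkingdom level, the LCA falls back to "root".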

Remote server solution: MPAv2

The development of the MPA software is ongoing, and many useful features have been added since the original version was published. Furthermore, the de.NBI partner project MetaProtServ was started in 2017 with the aim of making the MPA easily available to researchers. To this end, the MPA is now provided by the de.NBI network as a central remote server solution that can be freely accessed and used by researchers worldwide, representing a major step towards completely cloud-based data analysis solutions.

To gain access to the remote server version, contact us at mpa@ovgu.de.

Local installation of the MPAv2

Version Local

MPA Portable version

Version Portable

Future development: Web-based version of the MPA

Version Web in development