PMN SAVI Database Refinement

SAVI is a Semi-Automated Validation and Integration pipeline designed to select the final set of pathways contained in the database for each species.

The SAVI pipeline uses the initial set of pathways predicted by the Pathologic program for a given plant species, the enzyme predictions from E2P2 for the species, and six sets of pre-sorted SAVI pathways to generate the final set of pathways in each species-specific database. These pathways describe the basic "primary" metabolism shared throughout the plant kingdom and also provide access to experimentally verified and computationally predicted specialized (secondary) metabolic pathways.

Manually Check Pathway (MCP) 1.00
Conditionally Accepted PMN Pathway (CAPP) 3.00
Accept-if-Predicted Pathway (AIPP) 3.00
Non-PMN Pathway (NPP) 6.00
Common Viridiplantae Pathway (CVP) 4.00
Ubiquitous land Plant Pathway (UPP) 6.00

The SAVI pathway sets are used to help expedite the validation and new pathway integration process when generating new single-species databases. These lists are not comprehensive and grow over time as more pathways are reviewed and created. 

These SAVI pathway lists are updated during the preparation of each new PMN release, and both current and archived versions are available at our FTP site.


  • The Ubiquitous land Plant Pathway (UPP) list contains pathways that are expected to be present in all "land" plants, also referred to as "higher" plants or Embryophyta.
    The UPP list can be used to:

    • validate pathways that have been predicted for Embryophyta species using Pathologic
    • import pathways from MetaCyc that were NOT predicted for Embryophyta species, likely due to genome or proteome annotation problems

  • The Common Viridiplantae Pathway (CVP) list contains pathways that are expected to be present in all or most Viridiplantae species including algae. The CVP is a subset of the UPP. 
    The CVP list can be used to:

    • validate pathways that have been predicted for non-Embryophyta Viridiplantae species using Pathologic
    • import pathways from MetaCyc that were NOT predicted for non-Embryophyta Viridiplantae species, likely due to genome or proteome annotation problems
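
The validate-and-import use of the UPP and CVP lists can be sketched with set operations. This is a minimal illustration under stated assumptions, not the PMN implementation: the function name, the `is_embryophyte` flag, and the pathway identifiers are hypothetical placeholders.

```python
def apply_ubiquitous_list(predicted, upp, cvp, is_embryophyte):
    """Split a species' pathways into validated and imported sets.

    predicted: pathway IDs predicted for the species (e.g. by Pathologic)
    upp, cvp: the UPP and CVP lists as sets of pathway IDs
    is_embryophyte: True for land plants, False for other Viridiplantae
    """
    # Land plants are checked against the UPP; other Viridiplantae
    # (e.g. algae) are checked against the CVP.
    expected = upp if is_embryophyte else cvp
    validated = predicted & expected  # predicted and expected: validate
    imported = expected - predicted   # expected but not predicted: import
    return validated, imported


# Hypothetical pathway IDs, for illustration only.
upp = {"PWY-A", "PWY-B", "PWY-C"}
cvp = {"PWY-A", "PWY-B"}  # the CVP is a subset of the UPP
predicted = {"PWY-A", "PWY-X"}

validated, imported = apply_ubiquitous_list(predicted, upp, cvp, is_embryophyte=False)
print(sorted(validated))  # ['PWY-A']
print(sorted(imported))   # ['PWY-B']
```

In this sketch, a pathway such as `PWY-X` that is predicted but not on the selected list is left for the other SAVI lists to decide.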

  • The Non-PMN Pathway (NPP) list contains pathways that are excluded from PMN databases for several reasons including:

    • non-plant variants of common primary metabolic pathways
    • redundant short pathways that are wholly contained within larger more coherent pathways
    • non-small-molecule metabolic pathways (e.g. related to protein modification)

    The NPPs are automatically removed whenever they are predicted for new species using the Pathologic program.


  • The Accept-if-Predicted Pathway (AIPP) list contains pathways that are automatically accepted if they are predicted for any plant species database, similar to the UPP or CVP. However, unlike UPP and CVP pathways, AIPP pathways are not automatically imported into databases if they are not predicted by the Pathologic program.

    • AIPPs are chosen based on several criteria including:

      • primary metabolic pathway believed to be widespread
      • one of several variants of a primary metabolic pathway that has been identified in plants

  • The Conditionally Accepted PMN Pathway (CAPP) list contains pathways that are only accepted if they are predicted by the Pathologic program and meet additional criteria assigned to each pathway by PMN curators. Each CAPP can have criteria based on reactions and/or expected taxonomic range that are used to determine whether it should be accepted for a given species.

      • CAPP Reactions: A CAPP pathway will be accepted if it has one or more enzymes predicted (by E2P2 v2.1) to catalyze one or more "key" reaction(s) assigned to the pathway. "Key" reactions are selected by curators based on a number of criteria including:
        • Uniquely appears in a single pathway
        • Differentiates a particular variant pathway from other related variants
        • Appears as the final step in a biosynthetic pathway or as the first step in a degradation pathway
      • The CAPP approval process also identifies enzyme predictions linked to non-specific Enzyme Commission (EC) numbers (e.g. 1.1.1.1) and does not use them to accept any CAPP pathway.
      • Please note: The acceptance of a pathway based on reaction criteria does not necessarily mean that the exact pathway depicted will be present in the given species, especially in the case of "specialized" or "secondary" metabolic pathways. The pathway is retained to allow users to see the metabolic context of the "key" reaction(s) in different organisms. These pathways have a computational evidence code to indicate that there is no experimental support for their existence in the given species. Moreover, if a published paper suggests that a pathway is likely found only in a very restricted taxonomic range or is excluded from a specific taxonomic range, a special warning comment is placed at the beginning of the pathway summary whenever the pathway is accepted for a species outside of its expected taxonomic range.

       

      • CAPP Taxonomic Range: A CAPP pathway will be accepted if the species falls within the pathway's expected taxonomic range as described in a published manuscript. This criterion is typically applied only if the pathway is expected to be widespread in the taxon, e.g. a pathway that is very common to many legume species.

      A CAPP taxonomic range criterion is usually added to avoid rejecting true positive pathways when key enzymes are missing due to genome or proteome annotation problems.

      • Decision rules for CAPP pathways:
      1. If a pathway has more than one CAPP-Reaction and/or CAPP-taxonomic range associated with it, it will be accepted as long as one of the criteria is met for a given species. All criteria are treated as "OR" when the pathway is automatically evaluated during the SAVI pipeline.
      2. Reaction criteria are given first precedence. Whenever possible, reaction criteria are used in addition to or instead of an expected taxonomic range, especially given the number of examples of scattered occurrences of pathways and compounds throughout the plant lineage, particularly in the area of "specialized" or "secondary" metabolism.
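
The CAPP decision rules above can be sketched as a small function. This is an illustrative sketch, not the SAVI code: the `CappCriteria` structure, the function and parameter names, the way enzyme predictions are keyed by reaction, and all identifiers in the usage lines are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CappCriteria:
    """Curator-assigned criteria for one CAPP pathway (hypothetical structure)."""
    key_reactions: set = field(default_factory=set)  # "key" reaction IDs
    taxa: set = field(default_factory=set)           # expected taxonomic range(s)


def accept_capp(criteria, predicted_reaction_ecs, species_lineage, nonspecific_ecs):
    """Return True if any one criterion is met (criteria are OR-ed).

    predicted_reaction_ecs: maps each reaction that has a predicted enzyme
        to the set of EC numbers behind that prediction (assumed layout)
    species_lineage: set of taxa the species belongs to
    nonspecific_ecs: EC numbers treated as non-specific
    """
    # Reaction criterion: some key reaction has an enzyme prediction that
    # rests on at least one EC number not flagged as non-specific.
    for rxn in criteria.key_reactions:
        ecs = predicted_reaction_ecs.get(rxn, set())
        if ecs - nonspecific_ecs:
            return True
    # Taxonomic criterion: the species falls within an expected range.
    return bool(criteria.taxa & species_lineage)


# Hypothetical values, for illustration only.
criteria = CappCriteria(key_reactions={"RXN-1"}, taxa={"Fabaceae"})
nonspecific = {"1.1.1.1"}

# Only a non-specific EC supports the key reaction; species outside the range.
print(accept_capp(criteria, {"RXN-1": {"1.1.1.1"}}, {"Poaceae"}, nonspecific))   # False
# Same prediction, but the species falls within the expected range.
print(accept_capp(criteria, {"RXN-1": {"1.1.1.1"}}, {"Fabaceae"}, nonspecific))  # True
# A specific EC supports the key reaction.
print(accept_capp(criteria, {"RXN-1": {"2.3.1.74"}}, {"Poaceae"}, nonspecific))  # True
```

Because the criteria are OR-ed, the order of the two checks does not change the result; the reaction check is placed first only to mirror the precedence curators give to reaction criteria.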

  • The Manually Check Pathway (MCP) list contains a very small number of pathways that are likely to be relevant to the users of the databases and are likely to be reported in published literature for many plant species. The current manual list contains four pathways: three describe different C4 photosynthetic variants and one relates to photosynthesis in seeds. Whenever any of the pathways on this list is predicted by the Pathologic program, curators attempt to find references to determine whether it exists in the given species.
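
How the lists route a pathway that Pathologic has predicted can be summarized in one function. This is a rough sketch under stated assumptions: the function name and action labels are hypothetical, the lists are assumed not to overlap (apart from the CVP being a subset of the UPP), and the order of the checks is an assumption rather than a documented rule.

```python
def triage_predicted(pathway, npp, ubiquitous, aipp, capp, mcp):
    """Return an action label for a pathway predicted for a species.

    ubiquitous: the UPP for land plants, or the CVP for other Viridiplantae
    """
    if pathway in npp:
        return "remove"          # NPPs are automatically removed
    if pathway in ubiquitous or pathway in aipp:
        return "accept"          # accepted whenever predicted
    if pathway in capp:
        return "check-criteria"  # accepted only if a CAPP criterion is met
    if pathway in mcp:
        return "manual-review"   # curators look for published support
    return "unlisted"            # assumption: not covered by the lists


# Hypothetical pathway IDs, for illustration only.
lists = dict(npp={"PWY-N"}, ubiquitous={"PWY-U"}, aipp={"PWY-A"},
             capp={"PWY-C"}, mcp={"PWY-M"})
for pwy in ["PWY-N", "PWY-U", "PWY-A", "PWY-C", "PWY-M", "PWY-Z"]:
    print(pwy, triage_predicted(pwy, **lists))
```

The sketch covers only predicted pathways; importing UPP or CVP pathways that were not predicted is a separate step.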