We are pleased to announce the PMN 10 release. In this release, we introduce five new metabolic pathway databases for tomato, potato, two diploid progenitors of bread wheat, Aegilops tauschii (D-genome) and Triticum urartu (A-genome), and the great duckweed (Spirodela polyrhiza). We substantially updated the 17 existing species-specific databases and the multi-species database PlantCyc. PMN now hosts 23 databases in total.
The computational predictions of enzyme functions in PMN 10 are based on an enhanced Ensemble Enzyme Prediction Pipeline (E2P2 v3.0), with increased precision and a 62% increase in enzyme function space. We also made the following major changes to the scope and rules used in PMN's databases:
- To reinforce PMN's focus on small-molecule metabolism, enzymes that metabolize macromolecules, such as serine protein kinases, are excluded from the pathway databases.
- In our enzyme-to-reaction association, EC number-based enzyme predictions are transferred to distinct MetaCyc reaction IDs. To reduce false-positive associations, when multiple MetaCyc reactions exist for the same EC number, only the MetaCyc reaction corresponding to the official EC reaction is used for annotation.
- When multiple splice variants of a gene locus exist in a genome, only the protein sequence that represents the longest coding sequence is included in the pathway database.
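
The last two rules can be illustrated with a minimal Python sketch. This is not the PMN pipeline code: the function names, the input structures, and the identifiers such as "EC-X" and "RXN_A" are hypothetical. Returning no reaction when the official EC reaction cannot be identified, and keeping the first variant seen when coding sequence lengths tie, are assumptions that the rules above do not specify.

```python
def reaction_for_ec(ec_number, reactions_by_ec, official_by_ec):
    """Return the reaction ID to annotate for an EC-based prediction, or None.

    reactions_by_ec: dict mapping an EC number to a list of reaction IDs.
    official_by_ec:  dict mapping an EC number to its official EC reaction ID.
    Both structures are hypothetical stand-ins for the real mapping data.
    """
    candidates = reactions_by_ec.get(ec_number, [])
    if len(candidates) == 1:
        return candidates[0]
    # Several candidates: keep only the one flagged as the official EC reaction.
    # Assumption: if no official reaction is known, nothing is annotated.
    official = official_by_ec.get(ec_number)
    return official if official in candidates else None


def longest_variant_per_locus(variants):
    """Pick one protein per locus: the one with the longest coding sequence.

    variants: iterable of (locus_id, protein_id, cds_length) tuples.
    """
    best = {}
    for locus_id, protein_id, cds_length in variants:
        # Strict '>' keeps the first variant seen on ties (an assumption).
        if locus_id not in best or cds_length > best[locus_id][1]:
            best[locus_id] = (protein_id, cds_length)
    return {locus: protein for locus, (protein, _) in best.items()}


# Illustrative data only; none of these identifiers are real.
reactions_by_ec = {"EC-X": ["RXN_A", "RXN_B"], "EC-Y": ["RXN_C"]}
official_by_ec = {"EC-X": "RXN_B"}

variants = [
    ("locus1", "locus1.1", 900),
    ("locus1", "locus1.2", 1200),
    ("locus2", "locus2.1", 450),
]

print(reaction_for_ec("EC-X", reactions_by_ec, official_by_ec))
print(reaction_for_ec("EC-Y", reactions_by_ec, official_by_ec))
print(longest_variant_per_locus(variants))
```

In this example, "EC-X" maps to two reactions, so only the one marked official is kept, while "EC-Y" has a single reaction that is used directly. For locus1, the 1200-nucleotide variant is selected over the 900-nucleotide one.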