ValidatorDB
Database of validation results for ligands and non-standard residues in the Protein Data Bank.
Database last updated 29/9/2024:
225358 entries from PDBe.org, molecules relevant for validation,
42907 models from wwPDB CCD.
The molecules deemed relevant for validation are all ligands and non-standard residues with reasonable size (more than six heavy atoms). Standard amino acids and nucleotides are not covered.
The validation is performed against models from wwPDB Chemical Component Dictionary (wwPDB CCD). The database is updated weekly.
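The size criterion above (more than six heavy atoms, standard residues excluded) can be illustrated with a minimal sketch. It is not the ValidatorDB implementation: the function names, the element-list input, and the treatment of deuterium as hydrogen are assumptions made for illustration only.

```python
# Hypothetical helper names; ValidatorDB's actual filtering code is not shown on this page.
HYDROGEN_SYMBOLS = {"H", "D"}  # assumption: deuterium is counted as hydrogen


def heavy_atom_count(elements):
    """Count atoms whose element symbol is not hydrogen."""
    return sum(1 for e in elements if e.strip().upper() not in HYDROGEN_SYMBOLS)


def is_relevant_for_validation(elements, is_standard_residue=False, min_heavy_atoms=7):
    """Return True for non-standard molecules with more than six heavy atoms."""
    if is_standard_residue:
        # Standard amino acids and nucleotides are not covered.
        return False
    return heavy_atom_count(elements) >= min_heavy_atoms


# Glycerol (C3 O3 plus hydrogens) has six heavy atoms and is too small.
glycerol = ["C"] * 3 + ["O"] * 3 + ["H"] * 8
# Glucose (C6 O6 plus hydrogens) has twelve heavy atoms and qualifies.
glucose = ["C"] * 6 + ["O"] * 6 + ["H"] * 12

print(is_relevant_for_validation(glycerol))  # False
print(is_relevant_for_validation(glucose))   # True
```

The threshold is kept as a parameter so that the "more than six" rule stays visible in one place.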
This is the ValidatorDB synopsis page. Access different tabs for overviews and statistical evaluation of the
validation results, in graphical or tabular form.
Specific results can be examined in deeper detail by accessing the ValidatorDB specifics page.
Different sections of the web page offer interactive guides, which give a quick walk-through of all the main elements of the page.
Further help is provided by the info icons.
Many tool tips are available by hovering over any graphical or textual element in the interface.
ValidatorDB is a part of services provided by ELIXIR –
European research infrastructure for biological information.
For other services provided by ELIXIR's Czech Republic Node visit www.elixir-czech.cz/services.
Incomplete structure and chirality errors on carbon
Chlorophyll a (CLA) is an important green pigment used in oxygenic photosynthesis. In contrast to its paramount importance, it is structurally one of the worst determined ligands in the PDB. More than half of the CLA molecules found in the PDB are incomplete, in general missing the entire hydrocarbon tail. A fifth of the molecules which are complete exhibit at least one chirality error on a carbon atom (especially C8). Overall, only 20% of CLA molecules are complete and correct.
Incomplete structure and chirality errors on carbon
n-Dodecyl-β-D-Maltoside (DMU) is a detergent used for the solubilization of membrane proteins to preserve their activity. Almost 40% of DMU molecules in the PDB are incomplete, missing rings or the hydrocarbon tail. Over a third of the structurally complete molecules exhibit multiple chirality issues.
Missing rings
The nicotinamide-adenine-dinucleotide phosphate (NAP) is an important cofactor in anabolic reactions. Approximately 8% of NAP molecules are incomplete, missing ring atoms. In the majority of cases, entire parts of the molecule are missing, rather than just a few atoms.
Chirality errors on carbon
D-malate (MLT) is the metabolic product of the naturally occurring L-malate (LMT). All the MLT molecules in the PDB are complete. However, more than 40% of them exhibit wrong chirality on the C2 carbon, indicating that they are, in fact, incorrectly annotated LMTs.
Chirality errors on phosphorus
Adenosine triphosphate (ATP) is a coenzyme used for transporting chemical energy within cells for metabolism. Perhaps due to the fact that it is heavily studied, just a fraction of the ATP molecules in the PDB have incomplete structure.
Low error rate
This example illustrates a representative well-determined ligand, as almost all protoporphyrin IX molecules containing Fe (HEM) are structurally complete. All reported chirality errors lie on the planar Fe atom, most likely a result of slightly different puckering of the heme plane, and not a real chirality error.
Low error rate
Another example of a structurally well-determined ligand is the redox cofactor FAD (flavin-adenine dinucleotide). Again, almost all the FAD molecules in the PDB are structurally complete. Less than 3% of the validated FAD molecules exhibit chirality errors on carbon atoms.
PDB Entries
Examples of PDB entries having a high ratio of errors.
Missing atoms and chirality errors on carbon
Viral envelope glycoprotein
The glycoprotein gp350, enabling virus attachment, is highly covered by polyglycans. Validation identified foreign atoms in 90% of these residues, indicating long and branched oligosaccharides. A few residues exhibit substitutions (NAG, NDG) at the points of attachment of the polyglycan to the protein. About 15% of the polyglycan residues are incomplete (NAG, MAN), and 25% have incorrect chirality at position C1 (mostly MAN, FUC).
Chirality errors on carbon
Collagen complex structure
A fragment of the collagen-binding adhesin is complexed with collagen fibrils containing 47 (2S,4R)-4-hydroxyproline (HYP) molecules, none of which is correct. Five HYP molecules are incomplete, as they lie at the carboxyl end of five of the six polypeptide chains that make up the two triple-helical collagen fibrils. All HYP molecules exhibit a chirality error on the CG carbon, and thus the (2S,4S)-4-hydroxyproline isomer (HZP) should be used in the PDB file instead. Additional analysis of the electron density map revealed that the input orientation of the hydroxyl groups at the CG carbons in the crystal structure is indeed incorrect and they should be present in the 4R stereochemistry.
Incomplete residues and chirality errors on carbon and phosphorus
Spinach major light-harvesting complex
The major light-harvesting complex of photosystem II contains a variety of ligands, especially chlorophyll a and b. On average, one third of chlorophyll molecules are incomplete, missing part of the hydrocarbon tail due to its high flexibility and the resulting difficulty of identifying it in the crystal structure. Chirality issues are reported for 25% of chlorophylls and residual lipids.
Step-by-step Guides
Step-by-step guides for retrieving validation results from ValidatorDB.
Search by PDB Entry
Showcases validation results for molecules in selected PDB entries.
Search by Molecule Identifier
Showcases validation results for selected molecules.
Search by Molecule Annotation
Showcases validation results for molecules with specific annotations.
Search by PDB Entry and Molecule Annotation
Showcases validation results for molecules with specific annotations, from selected PDB entries.
Only categories with at least 0.5% representation are shown. Use the CSV link to view all data.
A list of PDB entry/molecule identifiers, separated by a new line or a comma. The chain identifier is case sensitive.
A list of PDB entry identifiers, separated by a new line or a comma.
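As a rough illustration of the accepted input format, the sketch below splits such a list on commas and new lines. The function name is hypothetical, and lower-casing and de-duplicating the identifiers are assumptions made for convenience, not documented ValidatorDB behavior.

```python
import re


def parse_pdb_ids(text):
    """Split a comma- or newline-separated list into unique PDB entry identifiers."""
    ids = []
    seen = set()
    for token in re.split(r"[,\n]+", text):
        token = token.strip()  # also removes a stray "\r" from Windows line endings
        if not token:
            continue
        key = token.lower()  # assumption: entry identifiers are treated case-insensitively
        if key not in seen:
            seen.add(key)
            ids.append(key)
    return ids


print(parse_pdb_ids("1TQN, 2h7c\n3D12,\n1tqn"))  # ['1tqn', '2h7c', '3d12']
```

Input order is preserved, so the parsed list can be compared directly with the text that was pasted in.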
You may also paste a list of pre-filtered PDB IDs (e.g., by organism, molecular weight, etc.) from PDB.org: