ValidatorDB:Database contents

From WebChem Wiki
Revision as of 14:14, 3 September 2014 by Crina

Please make sure to go through the terms and principles governing ValidatorDB before you read this page.

For each molecule, the validation report contains several types of results, based on how the molecule fared during the various validation analyses. The evaluation relies on comparing all atoms and bonds in the validated molecules to those in the model. ValidatorDB reports correct structures, as well as all potential issues found during validation, namely structures that are wrong either because they are incomplete (missing atoms or rings), or because the chirality of some atoms is incorrect.

If no issues are found during the completeness and chirality analyses, the molecule is marked as having complete structure and correct chirality.

Potential issues found during the completeness analyses.

Validated molecules exhibiting an error in at least one of the Completeness analyses are denoted as incomplete, whereas the remaining molecules are reported as complete.

Incomplete structures

Incomplete structures are validated residues that lack atoms in comparison to the model residue, or whose inter-atomic distances are far too large or too small. Namely:

  • Missing atoms: An atom in the model residue has no corresponding atom in the validated motif.
  • Missing rings: At least one missing atom originates from cycles (rings).
  • Degenerate: These motifs could not be properly analyzed due to their degenerate structure, i.e. suspicious discrepancies between the atoms and inter-atomic bonds in the validated motif and in the model residue prevented proper validation.

Please note that chirality is evaluated only for complete motifs, because the absence of some atoms can prevent the proper evaluation of chirality at the chiral centers present in the validated motif. Therefore, all motifs counted in the Wrong chirality category are in fact complete. Likewise, motifs with no missing atoms and no chirality error are counted in the Correct chirality category.
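The completeness rules above can be sketched in a few lines of Python. This is a minimal illustration, not ValidatorDB's actual code: the `ModelResidue` class and both function names are hypothetical, the Degenerate case is left out because its detection criteria are not spelled out here, and reporting a motif with a missing ring under both Missing atoms and Missing rings is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ModelResidue:
    """Hypothetical stand-in for a model residue (not a ValidatorDB type)."""
    atoms: set[str]                                      # all atom names in the model
    ring_atoms: set[str] = field(default_factory=set)    # subset located on rings


def completeness_flags(model: ModelResidue, matched: set[str]) -> list[str]:
    """Return the completeness issues of one validated motif.

    `matched` holds the model atom names that have a counterpart in the motif.
    """
    missing = model.atoms - matched
    flags = []
    if missing:
        flags.append("Missing atoms")
        # Assumption: a ring is reported as missing when at least one
        # of the missing atoms lies on a ring of the model residue.
        if missing & model.ring_atoms:
            flags.append("Missing rings")
    return flags


def chirality_is_evaluated(flags: list[str]) -> bool:
    """Chirality is only assessed for complete motifs (no completeness issue)."""
    return not flags
```

For example, a motif that lacks one ring atom of the model would receive both flags and would be skipped by the chirality analysis.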


Complete structures

Complete structures can contain chirality issues on a variety of atom types; all cases occurring in ValidatorDB are illustrated here.

Complete validated structures include all the atoms which are present in the model residue. However, issues can still be found and warnings raised:

  • Wrong chirality: an atom from the validated motif has a different chirality than the corresponding atom from the model residue. This category can be further specified by identifying the source of the chirality issue.
    • C atom: the chirality error was found on a single-bonded carbon atom.
    • Planar atom: the chirality error was found on a planar chiral center. Because of their spatial distribution, planar chiral centers are very sensitive even to small perturbations in the position of the substituents. Therefore, some of the errors reported here might not be significant.
    • Metal atom: the chirality error was found on a metal chiral center.
    • High order atom: the chirality error was detected on an atom that takes part in a bond of order higher than one. A typical example is phosphorus.
    • Other: an additional chirality issue which could not be assigned to any of the above categories, for example chirality issues on nitrogen.

If no issues are detected during the Chirality analyses, the validated molecule is marked as having Correct chirality, whereas the remaining molecules are marked as having Wrong chirality. When issues are found during an Advanced analysis, a warning is reported: Substitution, Foreign atom, Different naming, Zero RMSD or Alternate conformations. Each type of issue encountered during validation has a specific color code.

Some types of chirality errors do not constitute real issues, but are artifacts of the automated chirality determination procedure. Specifically, an error in planar chirality may just mean that the chiral atom is situated slightly above or below the plane compared to its equivalent in the model from wwPDB CCD. Further, an error in high order chirality often marks the involvement of phosphate O atoms in salt or ester formation, or merely a different PDB format identification of phosphate O atoms of the validated molecule compared to the model. Therefore, if the validated molecule is found to have planar or high order chirality errors, but no other type of chirality issues, the molecule is marked as having Correct chirality (tolerant).
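The tolerant rule described above amounts to a subset test on the set of chirality error types found for a molecule. The following Python sketch illustrates it under stated assumptions: the function name, the constant, and the idea of returning a single, most specific label are hypothetical and not taken from ValidatorDB itself.

```python
# Error types that may be artifacts of the automated chirality determination.
TOLERATED_ERRORS = {"Planar atom", "High order atom"}


def chirality_verdict(error_types: set) -> str:
    """Map the chirality error types found on a complete molecule to a label.

    `error_types` uses the category names from this page, e.g.
    {"Planar atom", "C atom"}. Returning one label per molecule is an
    assumption made for illustration.
    """
    if not error_types:
        return "Correct chirality"
    # Only planar and/or high order errors, and no other type: tolerated.
    if error_types <= TOLERATED_ERRORS:
        return "Correct chirality (tolerant)"
    return "Wrong chirality"
```

With this rule, a molecule whose only issue is a planar error is tolerated, whereas a planar error combined with an error on a carbon atom still yields Wrong chirality.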

While the results of the Advanced analyses have no bearing on the chemical soundness of the validated molecules, they indicate that further processing of these structures, especially automated processing, can be very problematic. Comparison between the structures of molecules with the same annotation (3-letter code) from different PDB entries might even be impossible in the presence of a substitution, as the corresponding atoms have different chemical elements. PDB atom names cannot be used straightforwardly, since even element symbols can differ and atoms can be formally included in neighboring residues.



Note that if a validated residue belongs to more than one category, it can be found in all of them (e.g. Planar and Metal, which is often the case for HEM residues).

  • Uncertain chirality: the presence of unusual bonds may cause an improper evaluation of chirality.
  • Substitution: An atom from the validated motif is of a different chemical element than the corresponding atom in the model residue (e.g. O mapped to N). This happens often at linkage sites.
  • Different naming: An atom from the validated motif has a different PDB atom name than the corresponding atom from the model residue (e.g. the C1 atom mapped to the C7 atom). This happens often when the original PDB files were produced by different software.
  • Foreign atom: An atom from the model residue was mapped to an atom from outside the validated residue (i.e. from its surroundings).
  • Alternate conformations: In the original PDB file, the validated residue contains atoms which were given in multiple conformations (most probably different rotamers). Only the first rotamer was considered during validation.
  • Zero model RMSD: The superimposition between the model residue and the validated motif has a root mean square deviation of zero, i.e., the validated motif is identical to the model residue used as reference.
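Three of the warnings above (Substitution, Different naming and Foreign atom) follow directly from comparing each model atom with the atom it was mapped to. The Python sketch below shows one way to derive them. The `Atom` class, the function name and the residue identifier format are hypothetical, and the three checks are treated as independent, so a substituted atom whose name also differs raises both warnings; whether ValidatorDB does the same is not stated here.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Atom:
    """Hypothetical minimal atom record (not a ValidatorDB type)."""
    name: str        # PDB atom name, e.g. "C1"
    element: str     # chemical element symbol, e.g. "C"
    residue_id: str  # identifier of the residue that owns the atom


def mapping_warnings(
    pairs: Iterable[tuple[Atom, Atom]], validated_residue_id: str
) -> set[str]:
    """Collect warnings from (model atom, mapped atom) pairs."""
    warnings: set[str] = set()
    for model_atom, mapped in pairs:
        if mapped.element != model_atom.element:
            warnings.add("Substitution")      # e.g. O mapped to N
        if mapped.name != model_atom.name:
            warnings.add("Different naming")  # e.g. C1 mapped to C7
        if mapped.residue_id != validated_residue_id:
            warnings.add("Foreign atom")      # atom taken from the surroundings
    return warnings
```

A mapping in which every pair agrees in element, name and residue produces an empty set, i.e. no warning of these three types.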

The last category, Correct chirality, includes all validated residues in which every atom from the validated motif has a matching partner in the model structure and the chirality of the chiral atoms is correct. Finally, the additional category Correct chirality (tolerant) also includes motifs whose only chirality issues are of the Planar or High order type, since these might not be significant.

Furthermore, each validation run has a Processing warnings tab, which lists warnings about the processed structures. Typically, it contains information about skipped residues with alternate conformations, wrong CONECT records in the parent PDB file, processed models (in case the parent PDB entry is composed of multiple models), etc. Finally, as a general rule, in the validation interface, errors are marked in red (missing atoms) or dark yellow (wrong chirality), correct structures in green, and warnings in cyan.

ValidatorDB also reports warnings in the complete structures such as substitutions, different naming or foreign atoms.