Difference between revisions of "ValidatorDB:Database contents"

From WebChem Wiki
(Created page with "=====FIX ME CRINA===== For each validated motif, a validation report contains several types of results. Since the evaluation of the validated motif relies on comparing all at...")
 
 
(14 intermediate revisions by one other user not shown)
Line 1: Line 1:
=====FIX ME CRINA=====
+
'''Please make sure to go through the [[ValidatorDB:Terminology | terms]] of MotiveValidator and ValidatorDB and principles governing [[ValidatorDB:Principles | ValidatorDB]] or [[MotiveValidator:Principles | MotiveValidator]]  before you read this page
  
For each validated motif, a validation report contains several types of results. Since the evaluation of the validated motif relies on comparing all atoms and bonds in the validated motif to those in the model residue, the first results that can be encountered are errors. In the first level '''Complete''' and '''Incomplete''' structures are distinguished. '''If you are unsure about the Terminology used in this section, please see [[MotiveValidator:Terminology | this webpage]]'''.
+
For each molecule, the validation report contains several types of results, based on how the molecule fared during the various [[ValidatorDB:Principles#Validation_analyses | validation analyses ]]. The evaluation relies on comparing all atoms and bonds in each ''validated molecule'' to those in the corresponding ''model''. '''ValidatorDB''' and '''MotiveValidator''' report correct structures, as well as all potential issues found during validation, namely structures that are wrong either because they are incomplete (missing atoms or rings), or because the chirality of some atoms is incorrect. If no issues are found during the ''completeness'' and ''chirality'' analyses, the molecule is marked as having ''complete structure and correct chirality''. The results of the ''advanced'' analyses are reported as ''warnings''. Additionally, unusual circumstances encountered during validation are reported separately as ''processing warnings''.
  
==Incomplete structures==
+
In '''ValidatorDB''' and '''MotiveValidator''', the results for different types of validation analyses are labeled using the unified color scheme: <span style="color:#006400">complete structure and correct chirality</span>, <span style="color:#FF0000">incomplete structure</span>, <span style="color:#DAA520">wrong chirality</span>, <span style="color:#607d8b">warning</span>.
  
Incomplete structures are validated residues, which were lacking atoms in their structure in comparison to the model residue, or the inter-atomic distances were way too big or too low. Namely:
+
__TOC__
* '''Missing atoms''': An atom in the model residue has no corresponding atom in the validated motif.
 
* '''Missing rings''': At least one missing atom originates from cycles (rings).
 
* '''Degenerate''': Those motifs could not be properly analyzed due their degenerate structure, i.e. suspicious discrepancies between the atoms and inter-atomic bonds in the validated motif and in the model residue prevented proper validation.
 
  
Please note, ''chirality'' is only evaluated for those motifs which are complete. This is because the absence of some atoms can prevent the proper evaluation of chirality on the chiral centers present in the validated motif. Therefore, note that all motifs which are counted in the Wrong chirality category are in fact complete. At the same time, the motifs with no missing atoms and no chirality error are actually counted in a category called '''Correct chirality'''.
+
=Incomplete structures=
 +
Validated molecules exhibiting an error in at least one of the ''Completeness'' analyses are denoted as ''incomplete'', whereas the remaining molecules are reported as ''complete''. ''Incomplete structures'' refer to molecules which lack some atoms, or whose structure is significantly distorted in comparison to the model:
 +
* '''Missing atoms''': An atom in the model has no corresponding atom in the validated molecule.
 +
* '''Missing rings''': At least one missing atom originates from cycles (rings). The formal distinction between ring atoms and non-ring atoms (simply denoted as atoms) is meant to allow a quick localization of potential issues in molecules containing rings, especially where atom identifiers are not useful.
 +
* '''Degenerate''': The molecules could not be validated due to their distorted structure, i.e. significant discrepancies between the atoms and inter-atomic bonds in the validated molecule and in the model. This generally happens when residues are overlapping in the 3D space, or some atoms appear disconnected from the rest of the structure.
  
[[File:VDB incomplete.png|thumb|center|1200px| '''ValidatorDB''' reports correct structures, as well as all potential issues found during validation, namely structures that are wrong either because they are incomplete (missing atoms or rings), or because the chirality of some atoms is incorrect. Here all the are potential issues of the incomplete structure are illustrated.]]
+
[[File:VDB incomplete.png|thumb|center|1100px| Potential issues found during the ''completeness'' analyses. All ''incomplete structures'' are marked in <span style="color:#FF0000">red</span> highlight]]
  
==Complete structures==
+
=Complete structures=
[[File:VDB chirality.png|thumb|right|500px| Complete structures can contain chirality issues on a variety of different atom types, here all possible cases occuring in '''ValidatorDB''' are illustrated.]]
+
[[File:VDB chirality.png|thumb|right|400px| Complete structures can contain chirality issues on a variety of different atom types, here all possible cases occuring in '''ValidatorDB''' are illustrated.]]
  
Complete validated structures include all the atoms which are present in the model residue. However, there are still possible issues and warnings can be raised:
+
If no issue was found during the ''completeness'' analyses, the validated molecule is marked as ''complete'', because it contains all the atoms which are present in the model. ''Chirality'' is further evaluated only for these molecules with ''complete structure'', because the absence of some atoms can make it difficult to check the chirality of the remaining atoms.
  
* '''Wrong chirality''': an atom from the validated motif has different chirality than the corresponding atom from the model residue. This category can be further specified by identifying a source of the chirality issue.
+
==Wrong chirality==
** '''C atom''': the chirality error was found on a single bonded carbon atom.
+
'''ValidatorDB''' and '''MotiveValidator''' present the results of several kinds of ''chirality'' analyses. If an issue is found, the following types of results are reported:
** '''Planar atom''': the chirality error was found on a planar chiral center. Because of their spacial distribution, planar chiral centers are very sensitive even to small perturbations in the position of the substituents. Therefore, some of the errors reported here might not be significant.
+
** '''Wrong chirality (Carbon)''': the chirality error was found on an sp3 hybridized carbon atom.
** '''Metal atom''': the chirality error was found on a metal chiral center.  
+
** '''Wrong chirality (Planar)''': the chirality error was found on a planar chiral center. Because of their spacial distribution, planar chiral centers are very sensitive even to small perturbations in the position of the substituents. Therefore, some of the errors reported here might not be significant.
** '''High order atom''': the chirality error was detected on an atom, which is bonded with a bond of higher order than one. A typical example is phosphorus.  
+
** '''Wrong chirality (Metal)''': the chirality error was found on a metal chiral center.  
** '''Other''': Additional chirality issue which could not be asserted to any of the above categories, for example chirality issues on Nitrogen.
+
** '''Wrong chirality (High order)''': the chirality error was detected on a chiral atom (e.g., phosphorus) where at least one substituent is bound by a bond of higher order.  
 +
** '''Other''': Additional chirality issue which could not be listed in any of the above categories (e.g., chirality issues on nitrogen atoms). These appear rarely in the Protein Data Bank (<0.5% of molecules).
  
 +
If any issue is detected on any atom during any ''chirality'' analysis, the molecule is marked as having ''wrong chirality'', because there is at least one atom in the validated molecule with different chirality than the corresponding atom in the model. If more atoms in a molecule exhibit the same type of chirality issue, the validated molecule is counted only once in each type of statistics (e.g., if 3 carbon atoms in one glucose molecule have wrong chirality, this glucose molecule will contribute only once to the statistics on "wrong chirality (carbon)"). If a validated molecule exhibits issues in more than one type of ''chirality'' analysis, it will be counted once in each corresponding statistics (e.g., heme molecules may appear both in the statistics for ''wrong chirality (planar)'', and for the ''wrong chirality (metal)'').
  
 +
==Correct chirality==
 +
If no issues are detected during the ''chirality'' analyses, the validated molecule is marked as having '''correct chirality''', because it contains all atoms present in the model, and each of these atoms has the same chirality in the validated molecule and in the model.
  
 +
Some types of chirality errors do not constitute real issues, but are artifacts of the automated chirality determination procedure. Specifically, an error in planar chirality may just mean that the chiral atom is situated slightly above or below the plane compared to its equivalent in the model from wwPDB CCD. Further, an error in high order chirality often marks the involvement of phosphate O atoms in salt or ester formation, or merely a different PDB format identification of phosphate O atoms of the validated molecule compared to the model. Therefore, if the validated molecule is found to have planar or high order chirality errors, but no other type of chirality issues, the molecule is marked as having '''correct chirality (tolerant)'''.
 +
<br style="clear:both" />
  
 +
=Warnings=
 +
[[File:VDB warnings.png|thumb|right|700px| Examples of ''warnings'' reported during the ''advanced'' analyses.]]
  
 +
Aside from the ''completeness'' and ''chirality'' analyses, '''ValidatorDB''' and '''MotiveValidator''' also perform a set of ''advanced'' analyses. These are meant to bring attention to aspects which may be indicators of further problems in the structure, or, on the contrary, explain why some errors were found during the ''completeness'' and ''chirality'' analyses. For example, if a molecule is part of a polymer, a reported change in chirality might be a consequence of some glitch in the polymer building process. Of course, each situation should be studied individually, but the ''advanced'' analyses provide a good starting point.
  
 +
When issues are found during an ''advanced'' analysis, a warning is reported:
 +
* '''Substitution''': An atom from the validated molecule is of a different chemical element than the corresponding atom in the model (e.g. O from the model is mapped to N from the validated molecule). This happens often at linkage sites for ligands which are covalently bound to biomolecules.
 +
* '''Different naming''': An atom from the validated molecule has a different PDB atom name than the corresponding atom from the model (e.g. the C1 atom from the model is mapped to the C7 atom from the validated molecule). This happens often when the original PDB files were produced by different software.
 +
* '''Foreign atom''': An atom from the model was mapped to an atom from outside the validated molecule (i.e. from its surroundings). This happens often at linkage sites for polymeric ligands or ligands which are covalently bound to biomolecules.
 +
* '''Alternate conformations''': In the original PDB file, the validated molecule contains at least some parts (some or all atoms) for which alternate conformers were given (i.e., most probably different rotamers). Only the first conformer was considered during validation.
 +
* '''Zero model RMSD''': The superimposition between the model and the validated molecule has a root mean square deviation of zero, i.e., the validated molecule is identical to the model used as reference.
  
Note that in case a validated residues belongs to more than one category, it can be found in all of them (i.e. Planar and Metal, which is often case of ''HEM'' residues)
+
While the results of the ''Advanced'' analyses have no bearing over the chemical soundness of the validated molecules, they indicate that further, especially automated processing of these structures can be very problematic. Comparison between the structures of molecules with the same annotation (3-letter code) from different PDB entries might even be impossible in the presence of a substitution, as the corresponding atoms have different chemical elements. PDB atom names cannot be used straightforwardly, since even element symbols can differ and atoms can be formally included in neighboring residues.
  
* '''Uncertain chirality''': the presence of unusual bonds may cause an improper evaluation of chirality.
+
<br style="clear:both" />
* '''Substitution''': An atom from the validated motif is of a different chemical element than the corresponding atom in the model residue (e.g. O mapped to N). This happens often at linkage sites.
 
* '''Different naming''': An atom from the validated motif has a different PDB atom name than the corresponding atom from the model residue (e.g. the C1 atom mapped to the C7 atom). This happens often when the original PDB files were produced by different software.
 
* '''Foreign atom''': An atom from the model residue was mapped to an atom from outside the validated residue (i.e. from its surroundings).
 
* '''Alternate conformations''': In the original PDB file, the validated residue contains atoms which were given in conformations (i.e., most probably different rotamers). Only the first rotamer was considered during validation.
 
* '''Zero model RMSD''': The superimposition between the model residue and the validated motif has a root mean square deviation of zero, i.e., the validated motif is identical to the model residue used as reference.
 
  
The last category  '''Correct chirality''' includes all the validated residues, where all the atoms from the validated motif have a matching partner in the model structure and the chirality of the chiral atoms is correct. Last but not least, additional category '''Correct (Tolerant) chirality''' further includes motifs with identified chirality issues as '''Planar''' and '''High order''' since these might not be significant.
+
=Processing warnings=
 +
Aside from the results of the ''completeness'', ''chirality'' and ''advanced'' analyses, '''ValidatorDB''' also provides a list of all unusual circumstances encountered during validation in the form of '''processing warnings'''. Typically, the '''processing warnings''' provide information about why a molecule marked as ''degenerate'' could not be validated, which conformer was used if more conformers were available in the original PDB entry (either in the form of alternate rotamers, or multiple NMR models), etc.
  
Furthermore, for each validation run, there is '''Processing warnings''' tab available, where one can find warnings about the processed structures. Typically, there is an information about skipped ''alternate  conformation residues'', ''wrong CONNECT records'' from the parent PDB file, ''processed models'' (in case the parent PDB is composed of multiple models) etc. Finally, as a general rule, in the validation interface, errors are marked in <span style="color:#FF0000">red</span> (missing atoms) or <span style="color:#DAA520">dark yellow</span> (wrong chirality), correct structures in <span style="color:#228B22">green</span>, and warnings in <span style="color:#008B8B">cyan</span>.
+
'''Continue by learning about the organization of [[ValidatorDB:Database_organization |  ValidatorDB]] or [[MotiveValidator:Service_Organization | MotiveValidator]].'''
 
 
[[File:VDB warnings.png|thumb|left|1000px| '''ValidatorDB''' also reports warnings in the complete structures such as ''substitutions'', ''different naming'' or ''foreign atoms''.]]
 

Latest revision as of 18:11, 10 August 2015

Before reading this page, please make sure to go through the terms of MotiveValidator and ValidatorDB and the principles governing ValidatorDB or MotiveValidator.

For each molecule, the validation report contains several types of results, based on how the molecule fared during the various validation analyses. The evaluation relies on comparing all atoms and bonds in each validated molecule to those in the corresponding model. ValidatorDB and MotiveValidator report correct structures, as well as all potential issues found during validation, namely structures that are wrong either because they are incomplete (missing atoms or rings), or because the chirality of some atoms is incorrect. If no issues are found during the completeness and chirality analyses, the molecule is marked as having complete structure and correct chirality. The results of the advanced analyses are reported as warnings. Additionally, unusual circumstances encountered during validation are reported separately as processing warnings.

In ValidatorDB and MotiveValidator, the results of the different types of validation analyses are labeled using a unified color scheme: complete structure and correct chirality (dark green), incomplete structure (red), wrong chirality (dark yellow), warning (blue-gray).

Incomplete structures

Validated molecules exhibiting an error in at least one of the completeness analyses are denoted as incomplete, whereas the remaining molecules are reported as complete. Incomplete structures are molecules which lack some atoms, or whose structure is significantly distorted in comparison to the model:

  • Missing atoms: An atom in the model has no corresponding atom in the validated molecule.
  • Missing rings: At least one missing atom originates from cycles (rings). The formal distinction between ring atoms and non-ring atoms (simply denoted as atoms) is meant to allow a quick localization of potential issues in molecules containing rings, especially where atom identifiers are not useful.
  • Degenerate: The molecule could not be validated due to its distorted structure, i.e. significant discrepancies between the atoms and inter-atomic bonds in the validated molecule and in the model. This generally happens when residues overlap in 3D space, or when some atoms appear disconnected from the rest of the structure.
Figure: Potential issues found during the completeness analyses. All incomplete structures are highlighted in red.
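
The distinction between missing atoms and missing rings can be illustrated with a minimal sketch. It assumes that the atom matching has already been computed and is available as a mapping from model atom names to matched atoms (or `None`), and that the ring membership of the model atoms is known. The function name and data layout are hypothetical and are not taken from ValidatorDB; the degenerate case is not covered.

```python
from typing import Dict, List, Optional, Set


def completeness_issues(
    model_atoms: Set[str],
    ring_atoms: Set[str],
    mapping: Dict[str, Optional[str]],
) -> List[str]:
    """Return the completeness issues for one validated molecule.

    model_atoms: names of all atoms in the model.
    ring_atoms:  subset of model_atoms that belong to rings.
    mapping:     model atom name -> matched atom in the validated
                 molecule, or None when no counterpart was found.
    """
    # A model atom is missing when it has no counterpart at all.
    missing = {a for a in model_atoms if mapping.get(a) is None}

    issues = []
    if missing - ring_atoms:   # at least one missing non-ring atom
        issues.append("missing atoms")
    if missing & ring_atoms:   # at least one missing ring atom
        issues.append("missing rings")
    return issues
```

An empty list corresponds to a complete structure, which is the precondition for the chirality analyses.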

Complete structures

Figure: Complete structures can contain chirality issues on a variety of different atom types; all possible cases occurring in ValidatorDB are illustrated here.

If no issue was found during the completeness analyses, the validated molecule is marked as complete, because it contains all the atoms which are present in the model. Chirality is evaluated only for molecules with complete structure, because the absence of some atoms can make it difficult to check the chirality of the remaining atoms.

Wrong chirality

ValidatorDB and MotiveValidator present the results of several kinds of chirality analyses. If an issue is found, the following types of results are reported:

    • Wrong chirality (Carbon): the chirality error was found on an sp3 hybridized carbon atom.
    • Wrong chirality (Planar): the chirality error was found on a planar chiral center. Because of their spatial arrangement, planar chiral centers are very sensitive even to small perturbations in the position of the substituents. Therefore, some of the errors reported here might not be significant.
    • Wrong chirality (Metal): the chirality error was found on a metal chiral center.
    • Wrong chirality (High order): the chirality error was detected on a chiral atom (e.g., phosphorus) where at least one substituent is bound by a bond of higher order.
    • Other: Any additional chirality issue which does not fall into any of the above categories (e.g., chirality issues on nitrogen atoms). These appear rarely in the Protein Data Bank (<0.5% of molecules).

If any issue is detected on any atom during any chirality analysis, the molecule is marked as having wrong chirality, because there is at least one atom in the validated molecule with different chirality than the corresponding atom in the model. If several atoms in a molecule exhibit the same type of chirality issue, the validated molecule is counted only once in the statistics for that type (e.g., if 3 carbon atoms in one glucose molecule have wrong chirality, this glucose molecule will contribute only once to the statistics on "wrong chirality (carbon)"). If a validated molecule exhibits issues in more than one type of chirality analysis, it is counted once in each corresponding statistic (e.g., heme molecules may appear both in the statistics for wrong chirality (planar) and in those for wrong chirality (metal)).
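
The counting rule described above can be sketched as follows. This is only an illustration: the molecule identifiers and the input layout (one issue type listed per affected atom) are assumptions made for the example, not the actual ValidatorDB data model.

```python
from collections import Counter
from typing import Dict, List


def chirality_statistics(molecules: Dict[str, List[str]]) -> Counter:
    """Count molecules per chirality issue type.

    molecules maps a (hypothetical) molecule identifier to the issue
    types found on its atoms, one list entry per affected atom.
    """
    stats: Counter = Counter()
    for issue_types in molecules.values():
        # set() collapses repeated issues of the same type, so each
        # molecule contributes at most once to each issue type.
        stats.update(set(issue_types))
    return stats


stats = chirality_statistics({
    "GLC-1": ["carbon", "carbon", "carbon"],  # 3 atoms, counted once
    "HEM-1": ["planar", "metal"],             # counted once in each type
    "GLC-2": [],                              # no chirality issue
})
```

Here the glucose molecule with three affected carbon atoms adds 1 to the "carbon" count, while the heme molecule adds 1 to both "planar" and "metal".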

Correct chirality

If no issues are detected during the chirality analyses, the validated molecule is marked as having correct chirality, because it contains all atoms present in the model, and each of these atoms has the same chirality in the validated molecule and in the model.

Some types of chirality errors do not constitute real issues, but are artifacts of the automated chirality determination procedure. Specifically, an error in planar chirality may just mean that the chiral atom is situated slightly above or below the plane compared to its equivalent in the model from wwPDB CCD. Further, an error in high order chirality often marks the involvement of phosphate O atoms in salt or ester formation, or merely a different PDB format identification of phosphate O atoms of the validated molecule compared to the model. Therefore, if the validated molecule is found to have planar or high order chirality errors, but no other type of chirality issues, the molecule is marked as having correct chirality (tolerant).
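
As a rough sketch of how these categories relate, the function below derives a single label from the issues found for one molecule. Collapsing the result into one label is a simplification made for this example; the issue names and the function itself are hypothetical, and a molecule marked as correct chirality (tolerant) may still be listed in the planar or high order statistics.

```python
from typing import Iterable

# Chirality issue types that are tolerated because they are often
# artifacts of the automated chirality determination.
TOLERATED = {"planar", "high order"}


def overall_result(completeness: Iterable[str], chirality: Iterable[str]) -> str:
    """Derive one label from the issues found for a validated molecule."""
    # Chirality is only evaluated for complete structures.
    if list(completeness):
        return "incomplete structure"

    issues = set(chirality)
    if not issues:
        return "correct chirality"
    # Only planar and/or high order issues, and nothing else.
    if issues <= TOLERATED:
        return "correct chirality (tolerant)"
    return "wrong chirality"
```

A molecule with both a planar and a carbon chirality issue is therefore still reported as having wrong chirality, since the carbon issue is not among the tolerated types.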

Warnings

Examples of warnings reported during the advanced analyses.

Aside from the completeness and chirality analyses, ValidatorDB and MotiveValidator also perform a set of advanced analyses. These are meant to bring attention to aspects which may be indicators of further problems in the structure, or, on the contrary, explain why some errors were found during the completeness and chirality analyses. For example, if a molecule is part of a polymer, a reported change in chirality might be a consequence of some glitch in the polymer building process. Of course, each situation should be studied individually, but the advanced analyses provide a good starting point.

When issues are found during an advanced analysis, a warning is reported:

  • Substitution: An atom from the validated molecule is of a different chemical element than the corresponding atom in the model (e.g. O from the model is mapped to N from the validated molecule). This happens often at linkage sites for ligands which are covalently bound to biomolecules.
  • Different naming: An atom from the validated molecule has a different PDB atom name than the corresponding atom from the model (e.g. the C1 atom from the model is mapped to the C7 atom from the validated molecule). This happens often when the original PDB files were produced by different software.
  • Foreign atom: An atom from the model was mapped to an atom from outside the validated molecule (i.e. from its surroundings). This happens often at linkage sites for polymeric ligands or ligands which are covalently bound to biomolecules.
  • Alternate conformations: In the original PDB file, the validated molecule contains at least some parts (some or all atoms) for which alternate conformers were given (i.e., most probably different rotamers). Only the first conformer was considered during validation.
  • Zero model RMSD: The superimposition between the model and the validated molecule has a root mean square deviation of zero, i.e., the validated molecule is identical to the model used as reference.
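
The first three warnings all follow from comparing each model atom with the atom it was mapped to. The sketch below illustrates this under simplifying assumptions: the `Atom` class, the residue identifier strings and the function name are hypothetical, and the mapping between model atoms and validated atoms is taken as given.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Atom:
    name: str     # PDB atom name, e.g. "C1"
    element: str  # chemical element symbol, e.g. "C"
    residue: str  # hypothetical residue identifier, e.g. "NAG 401 A"


def mapping_warnings(mapping: Dict[Atom, Atom], validated_residue: str) -> List[str]:
    """Collect warnings from a model atom -> matched atom mapping."""
    warnings = set()
    for model_atom, found in mapping.items():
        if found.element != model_atom.element:
            warnings.add("substitution")      # e.g. O mapped to N
        if found.name != model_atom.name:
            warnings.add("different naming")  # e.g. C1 mapped to C7
        if found.residue != validated_residue:
            warnings.add("foreign atom")      # atom from the surroundings
    return sorted(warnings)
```

For instance, a model oxygen mapped to the side-chain nitrogen of a neighboring residue at a linkage site would raise all three warnings at once.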

While the results of the advanced analyses have no bearing on the chemical soundness of the validated molecules, they indicate that further processing of these structures, especially automated processing, can be very problematic. Comparing the structures of molecules with the same annotation (3-letter code) from different PDB entries might even be impossible in the presence of a substitution, as the corresponding atoms have different chemical elements. PDB atom names cannot be used straightforwardly either, since even element symbols can differ and atoms can be formally included in neighboring residues.


Processing warnings

Aside from the results of the completeness, chirality and advanced analyses, ValidatorDB also provides a list of all unusual circumstances encountered during validation in the form of processing warnings. Typically, the processing warnings provide information about why a molecule marked as degenerate could not be validated, which conformer was used if more conformers were available in the original PDB entry (either in the form of alternate rotamers, or multiple NMR models), etc.

Continue by learning about the organization of ValidatorDB or MotiveValidator.