ValidatorDB is built around several key principles which govern the type of content available in this database, and how this content is organized.

Figure: A) Scheme of the validation procedure for the entire PDB; B) Scheme of the validation procedure for a sialic acid (SIA) molecule in PDB entry 4jtv; C) Typical validation results for SIA molecules.

Validation procedure

Within the ValidatorDB environment, the term validation refers to the process of determining whether a molecule is structurally complete and correctly annotated. This means checking whether the topology and chirality of each validated molecule correspond to those of the model with the same annotation (3-letter code) as the validated molecule.

ValidatorDB implements the validation of annotation approach, which consists of several steps. First, for each molecule under investigation, the input motif is extracted from the respective PDB entry. At the same time, the appropriate model is retrieved from wwPDB CCD. Then, the validated molecule (or validated motif) is identified as the subset of atoms common to the model and the input motif. Subsequently, the validated molecule is compared against the model, atom by atom. All the validation analyses in ValidatorDB are based on this comparison of atom properties (presence, chirality, element symbol, PDB name, etc.). Other unusual aspects encountered during validation are reported as processing warnings (e.g., which conformer was validated if several conformers were present).
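The atom-by-atom comparison can be sketched in Python as follows. This is a minimal illustration, not ValidatorDB's actual implementation: the `Atom` class, the `compare_to_model` function, and the toy data are hypothetical, and the sketch assumes that each atom of the input motif has already been paired with its counterpart in the model (how that pairing is done is not described here).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Atom:
    """Hypothetical atom record holding the properties compared during validation."""
    name: str                        # PDB atom name
    element: str                     # element symbol
    chirality: Optional[str] = None  # e.g. "R" or "S"; None if not a chiral center


def compare_to_model(model: Dict[str, Atom],
                     paired: Dict[str, Atom]) -> Dict[str, List[str]]:
    """Compare the validated molecule against the model, atom by atom.

    `model` maps each model atom name to the model atom.
    `paired` maps a model atom name to the matching atom of the input motif;
    model atoms without a match are simply absent from `paired`.
    """
    differences: Dict[str, List[str]] = {
        "missing": [], "chirality": [], "element": [], "name": [],
    }
    for key, reference in model.items():
        atom = paired.get(key)
        if atom is None:
            differences["missing"].append(key)
            continue
        if atom.chirality != reference.chirality:
            differences["chirality"].append(key)
        if atom.element != reference.element:
            differences["element"].append(key)
        if atom.name != reference.name:
            differences["name"].append(key)
    return differences


# Toy data, not taken from any real PDB entry or wwPDB CCD model.
model = {
    "C2": Atom("C2", "C", "R"),
    "O2": Atom("O2", "O"),
    "N5": Atom("N5", "N"),
}
paired = {
    "C2": Atom("C2", "C", "S"),  # chirality differs from the model
    "O2": Atom("OX", "O"),       # PDB name differs from the model
}                                # N5 has no counterpart: missing atom
result = compare_to_model(model, paired)
print(result)
```

In this sketch, each list of differences corresponds to one compared atom property (presence, chirality, element symbol, PDB name). How those raw differences are turned into validation results is a separate step.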

Validation analyses

The validation analyses performed by ValidatorDB cover all main issues which have been observed in the topology (2D structure) and geometry (3D structure) of ligands and non-standard residues. These validation analyses, along with their respective results, can be classified into three categories, namely Completeness, Chirality and Advanced analyses.

The Completeness analyses determine which atoms are missing, whether the missing atoms are part of rings, and whether the structure is degenerate, i.e., the molecule contains very severe errors. Such errors include residues overlapping in 3D space and atoms which are disconnected from the rest of the structure.

The Chirality analyses are performed only on complete structures, and aim to evaluate the chirality of each atom in the validated molecule. We distinguish between several types of chirality errors: on carbon atoms (C chirality), on metal atoms (Metal chirality), on atoms with 4 substituents in one plane (Planar chirality), on atoms connected to at least one substituent by a bond of higher order (High order chirality), and the remaining chirality issues (Other chirality).

The Advanced analyses focus on issues which are not real chemical problems, but which can complicate further processing and exploration of data, and thus should be noted. The Substitution analysis reports the replacement of an atom by an atom of a different chemical element. The Foreign atom analysis detects atoms which originate from the neighborhood of the validated molecule (i.e., atoms having a different PDB residue ID than the majority of the validated molecule), and generally marks sites of inter-molecular linkage. The Different naming analysis identifies atoms whose name in PDB format differs from the standard convention for the validated molecule. The Zero RMSD analysis reports molecules whose structure is identical (root mean square deviation = 0 Å) to the model from wwPDB CCD. The Alternate conformations analysis reports the occurrence of alternate conformations in the validated PDB entry.
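As an illustration of the Foreign atom criterion, the following Python sketch flags atoms whose residue ID differs from the one shared by the majority of atoms in the validated molecule. The function name `find_foreign_atoms`, the residue ID strings, and the input layout (atom name mapped to residue ID) are assumptions made for this example only.

```python
from collections import Counter
from typing import Dict, List


def find_foreign_atoms(residue_ids: Dict[str, str]) -> List[str]:
    """Return names of atoms whose residue ID differs from the majority residue ID.

    `residue_ids` maps an atom name to the PDB residue ID it was read from.
    If two residue IDs are equally frequent, the one seen first is treated
    as the majority (a simplification made for this sketch).
    """
    if not residue_ids:
        return []
    majority_id, _count = Counter(residue_ids.values()).most_common(1)[0]
    return sorted(name for name, rid in residue_ids.items() if rid != majority_id)


# Toy example: one oxygen atom formally belongs to a neighboring residue.
atoms = {
    "C1": "SIA 501 A",
    "C2": "SIA 501 A",
    "C3": "SIA 501 A",
    "O2": "GAL 500 A",
}
foreign = find_foreign_atoms(atoms)
print(foreign)
```

Here `O2` is the only atom flagged, since it carries a residue ID different from that of the other three atoms.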

Validation results

Each molecule is evaluated according to how it fares in the validation analyses described above. If no issues are found during the validation analyses, the molecule is marked as having a complete structure and correct chirality. Validated molecules exhibiting an error in at least one of the Completeness analyses are denoted as Incomplete, whereas the remaining molecules are reported as Complete. If no issues are detected during the Chirality analyses, the validated molecule is marked as having Correct chirality, whereas the remaining molecules are marked as having Wrong chirality. When issues are found during an Advanced analysis, a warning is reported: Substitution, Foreign atom, Different naming, Zero RMSD or Alternate conformations. Each type of issue encountered during validation has a specific color code.

Some types of chirality errors do not constitute real issues, but are artifacts of the automated chirality determination procedure. Specifically, an error in planar chirality may just mean that the chiral atom is situated slightly above or below the plane compared to its equivalent in the model from wwPDB CCD. Further, an error in high order chirality often marks the involvement of phosphate O atoms in salt or ester formation, or merely a different PDB format identification of phosphate O atoms of the validated molecule compared to the model. Therefore, if the validated molecule is found to have planar or high order chirality errors, but no other type of chirality issues, the molecule is marked as having Correct chirality (tolerant).
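The way these rules combine can be sketched as a small Python function. This is only an illustration of the logic described in the text: the function name `summarize` and the string keys used for the error types (`"planar"`, `"high_order"`, `"carbon"`, and so on) are hypothetical. Since the Chirality analyses are performed only on complete structures, the sketch returns no chirality label for incomplete molecules.

```python
from typing import Iterable, Optional, Tuple

# Chirality error types that are tolerated when they occur alone.
# The string keys are hypothetical labels chosen for this sketch.
TOLERATED = frozenset({"planar", "high_order"})


def summarize(completeness_errors: Iterable[str],
              chirality_errors: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Return (completeness label, chirality label) for one validated molecule."""
    if list(completeness_errors):
        # Chirality is evaluated only for complete structures.
        return ("Incomplete", None)
    found = set(chirality_errors)
    if not found:
        chirality = "Correct chirality"
    elif found <= TOLERATED:
        # Only planar and/or high order chirality errors were found.
        chirality = "Correct chirality (tolerant)"
    else:
        chirality = "Wrong chirality"
    return ("Complete", chirality)


print(summarize([], []))
print(summarize([], ["planar", "high_order"]))
print(summarize([], ["planar", "carbon"]))
print(summarize(["missing_atoms"], []))
```

The third call shows that a tolerated error type does not rescue a molecule which also has another type of chirality error.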

While the results of the Advanced analyses have no bearing on the chemical soundness of the validated molecules, they indicate that further processing of these structures, especially automated processing, can be very problematic. Comparison between the structures of molecules with the same annotation (3-letter code) from different PDB entries might even be impossible in the presence of a substitution, as the corresponding atoms have different chemical elements. PDB atom names cannot be used straightforwardly, since even element symbols can differ and atoms can be formally included in neighboring residues.

Validation reports

For each PDB entry, the relevant molecules are detected and validated. ValidatorDB is then built as the collection of validation results for all molecules in all PDB entries. The results are systematically organized into reports:

  • Detailed validation report for a particular molecule
  • Detailed validation report for a set of molecules with particular molecule identifiers
  • Detailed validation report for a set of molecules sharing a particular annotation
  • Detailed validation report for a set of molecules in a particular PDB entry
  • Summary of validation results for sets of molecules sharing the same annotation
  • Summary of validation results for sets of molecules originating from the same PDB entry
  • Validation overview for the entire PDB
