MotiveValidator:Limitations

From WebChem Wiki

Although we did our best to make the algorithm as general as possible, there are cases in which it does not work correctly. These cases should be borne in mind when interpreting the results.

The algorithm behind MotiveValidator has the following limitations:

  • It is necessary to ensure that the model serving as reference during validation is indeed correct. This limitation is overcome by using high-quality structures from wwPDB CCD.
  • The superimposition phase might not identify the optimal matching between the atoms of the model and those of the validated molecule if their 3D structures are too different. Specifically, the molecule may appear severely fragmented if some critical atoms are missing or misplaced (i.e., the length of the bond connecting that atom to the rest of the structure differs by over 3 standard deviations from the general PDB average expected for that bond type). In this case, the molecules are generally marked as degenerate. This limitation applies to no more than 0.3% of validated molecules across the entire wwPDB CCD.
  • The superimposition phase might not identify the optimal matching if the validated molecule contains very complicated scaffolds like cages. In such cases, the molecules may incorrectly be marked as degenerate (e.g., hexatantalum dodecabromide TBR). This limitation applies to no more than 0.5% of validated molecules across the entire Protein Data Bank, and is generally seen in organometallic cages.
  • Some molecules are counted as alternate conformers even if they are not marked by the standard alternate location identifiers in the PDB file. Such situations arise when two molecules with the same annotation (3-letter code) but different residue identifiers lie closer than 0.65 Å to each other. In this case, only the molecule with the lower residue identifier is validated. Alternate conformers, whether or not they are explicitly marked as such in the PDB file, are not validated. They add up to approximately 2.5% of the molecules relevant for validation across the entire Protein Data Bank.
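
The rule for implicit alternate conformers in the last item can be illustrated with a minimal Python sketch. It is not MotiveValidator's actual code: the `Residue` class and the `select_for_validation` function are hypothetical names, and the sketch assumes that "closer than 0.65 Å" refers to the distance between the geometric centers of the two molecules, which the text does not specify.

```python
from dataclasses import dataclass
from itertools import combinations
from math import dist

THRESHOLD_ANGSTROM = 0.65  # cutoff quoted in the text


@dataclass(frozen=True)
class Residue:
    """Hypothetical container for one molecule (residue) of a PDB entry."""
    name: str      # 3-letter annotation code
    res_id: int    # residue identifier
    atoms: tuple   # tuple of (x, y, z) coordinates in angstroms


def centroid(res):
    """Geometric center of the residue's atoms."""
    n = len(res.atoms)
    return tuple(sum(atom[i] for atom in res.atoms) / n for i in range(3))


def select_for_validation(residues):
    """Split residues into (kept, skipped).

    Two residues with the same name but different identifiers whose
    centers lie closer than the threshold are treated as implicit
    alternate conformers; the one with the higher identifier is skipped.
    ASSUMPTION: proximity is measured between geometric centers.
    """
    skipped = set()
    for a, b in combinations(residues, 2):
        if a.name != b.name or a.res_id == b.res_id:
            continue
        if dist(centroid(a), centroid(b)) < THRESHOLD_ANGSTROM:
            skipped.add(max(a, b, key=lambda r: r.res_id))
    kept = [r for r in residues if r not in skipped]
    return kept, sorted(skipped, key=lambda r: r.res_id)
```

Under these assumptions, each skipped residue is one that would carry the alternate-conformer processing warning, while the residue with the lower identifier remains in the list of molecules to validate.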

All limitations, except for the first one, cause the particular molecule to be marked by an explicit processing warning in all validation reports.
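
The bond-length criterion mentioned in the second limitation (a bond deviating by more than 3 standard deviations from the average for its bond type) can be sketched in the same spirit. This is an illustration only, not the tool's implementation: `find_misplaced_bonds` is a hypothetical name, and the reference table passed to it is supplied by the caller, since the actual PDB statistics are not given here.

```python
from math import dist

N_SIGMA = 3.0  # deviation cutoff quoted in the text


def find_misplaced_bonds(coords, bonds, reference):
    """Return bonds whose length is an outlier for their bond type.

    coords:    atom name -> (x, y, z) in angstroms
    bonds:     list of (atom_1, atom_2, bond_type)
    reference: bond_type -> (mean_length, standard_deviation)
               ASSUMPTION: supplied by the caller; no real PDB
               statistics are built into this sketch.
    """
    outliers = []
    for atom_1, atom_2, bond_type in bonds:
        mean_length, std_dev = reference[bond_type]
        length = dist(coords[atom_1], coords[atom_2])
        if abs(length - mean_length) > N_SIGMA * std_dev:
            outliers.append((atom_1, atom_2, length))
    return outliers


# Usage with illustrative placeholder statistics (not real PDB values).
coords = {"C1": (0.0, 0.0, 0.0), "C2": (1.5, 0.0, 0.0), "O1": (1.5, 2.5, 0.0)}
bonds = [("C1", "C2", "C-C"), ("C2", "O1", "C-O")]
reference = {"C-C": (1.50, 0.02), "C-O": (1.43, 0.02)}
outliers = find_misplaced_bonds(coords, bonds, reference)
```

A non-empty result would correspond to a molecule with a misplaced atom, which is the kind of case that receives an explicit processing warning in the validation reports.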


Continue by examining the technical details, or return to the Table of contents.