Difference between revisions of "ValidatorDB:CaseStudies"

From WebChem Wiki

One interesting question is how the quality of the structures varies for different classes of molecules. We have thus designed and conducted several case studies to show how '''ValidatorDB''' can answer such questions. We selected the molecules according to a combination of features related to chemical structure, biological function, area of application, availability, etc. The following classes were defined as subsets of models from wwPDB CCD:
  
1. '''Polycyclic compounds''': Contain 3 or more conjugated rings. Molecules containing metals were excluded, as their quality is influenced more by the presence of the metal than by their polycyclic structure.
:[[Media:Polycyclic.txt | List of annotations (3-letter codes) of all molecules in this study]]
: [http://webchemdev.ncbr.muni.cz/Platform/ValidatorDb/Custom/5bc6c604-8a6b-4d06-8402-1e33f0163893 Access the results of this case study]

Compared to the PDB-wide statistics for all ligands and non-standard residues, polycyclic molecules have higher overall quality (a higher percentage of molecules with complete structure and correct chirality). Nonetheless, they exhibit more errors in C chirality, probably due to their more complicated, carbon-based scaffolds.
  
2. '''Carbohydrates''': Contain the pyran or furan ring. Molecules containing P (e.g., ATP) were excluded, as their quality is influenced more by the presence of phosphate derivatives than by the sugar part.
:[[Media:Carbohydrates.txt | List of annotations (3-letter codes) of all molecules in this study]]
:[http://webchemdev.ncbr.muni.cz/Platform/ValidatorDb/Custom/1e5f959c-269f-49fe-b45c-d1797a2b4bd3 Access the results of this case study]

Carbohydrate molecules show trends similar to those of polycyclic molecules, since their structure is also ring-based. However, they exhibit a higher rate of errors in C chirality, a consequence of the fact that they generally contain more chiral atoms.
  
3. '''Mannose derivatives''': Subclass of carbohydrates.
:[[Media:Manose.txt | List of annotations (3-letter codes) of all molecules in this study]]
: [http://webchemdev.ncbr.muni.cz/Platform/ValidatorDb/Custom/66c257c4-4218-4db9-b2e8-4b5f3870c437 Access the results of this case study]

Mannose derivatives play an important role in cell-cell recognition, a biological function that relies heavily on chirality. They must therefore have a characteristic structure (determined by chirality), and they are also strongly predisposed to C chirality errors. We found that the percentage of errors in C chirality is over 3 times higher for mannose derivatives than in the PDB-wide evaluation of all ligands and non-standard residues.
  
4. '''Organometals''': Contain a metal atom.
:[[Media:Organometalls.txt | List of annotations (3-letter codes) of all molecules in this study]]
: [http://webchemdev.ncbr.muni.cz/Platform/ValidatorDb/Custom/8ab34844-1ecb-4e3e-8f2e-b8f51bd82102 Access the results of this case study]

Organometals seem to have lower overall quality. Some of the errors are artifacts of our validation algorithm, as such molecules can have very complicated scaffolds (see algorithm limitations in the Supplementary Material). However, the majority of the reported errors are significant, indicating that many challenges remain in the field of structure determination for organometals.
  
5. '''Experimental drugs''': Described in DrugBank as experimental drugs, i.e., have been shown to bind specific proteins in mammals, bacteria, viruses, fungi, or parasites.
:[[Media:Experimental drugs.txt | List of annotations (3-letter codes) of all molecules in this study]]
: [http://webchemdev.ncbr.muni.cz/Platform/ValidatorDb/Custom/bcce1691-085b-409a-86a4-d4ea365c59aa Access the results of this case study]

In contrast to organometals, the overall structure quality of experimental drugs is clearly much higher than in the PDB-wide statistics for all ligands and non-standard residues.
  
6. '''Approved drugs''': Described in DrugBank as approved drugs, i.e., have received approval in at least one country.
:[[Media:Approved drugs.txt | List of annotations (3-letter codes) of all molecules in this study]]
: [http://webchemdev.ncbr.muni.cz/Platform/ValidatorDb/Custom/8a5fc8ae-74ae-4a20-ad58-cc20d9a5e750 Access the results of this case study]

For approved drugs, i.e., drugs already on the market, the situation is even better. About 95 % of these molecules are complete and have correct chirality, a consequence of their comparatively simple structures and of the markedly greater effort expended in determining their structures in biomacromolecular complexes.
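
The structural selection rules for classes 1, 2, and 4 can be sketched as a small filter. This is only an illustration under stated assumptions, not the procedure actually used to build the lists above: the <code>Molecule</code> record, its field names, the <code>METALS</code> subset, and the ring counts in the usage note are all hypothetical, and the DrugBank-based classes (5 and 6) and the mannose subclass are not covered.

```python
from dataclasses import dataclass

# Hypothetical record for one wwPDB CCD model. The field names are
# assumptions made for this sketch, not ValidatorDB's data model.
@dataclass(frozen=True)
class Molecule:
    code: str                       # 3-letter annotation
    elements: frozenset             # element symbols present in the molecule
    conjugated_rings: int = 0       # number of conjugated rings (assumed precomputed)
    has_pyran_or_furan: bool = False

# Illustrative subset of metal symbols only; a real filter needs the full list.
METALS = frozenset({"Fe", "Zn", "Mg", "Cu", "Mn", "Co", "Ni", "Ca", "Na", "K"})

def classes_for(m):
    """Return the structural classes (as described in the text) that m falls into."""
    has_metal = bool(m.elements & METALS)
    result = set()
    # Class 1: 3 or more conjugated rings, metal-containing molecules excluded.
    if m.conjugated_rings >= 3 and not has_metal:
        result.add("polycyclic")
    # Class 2: pyran or furan ring, phosphorus-containing molecules excluded.
    if m.has_pyran_or_furan and "P" not in m.elements:
        result.add("carbohydrate")
    # Class 4: contains a metal atom.
    if has_metal:
        result.add("organometal")
    return result
```

With these assumptions, a mannose-like record (C, H, O with a pyran ring) is classified as a carbohydrate, an ATP-like record is excluded from the carbohydrates because it contains P, and a heme-like record with Fe is classified only as an organometal even though it has several conjugated rings.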

Revision as of 05:12, 3 September 2014
