Difference between revisions of "ValidatorDB:Terminology"

(Created page with "Before moving on to more extensive descriptions of features, it is important to clearly establish the meaning of a few key terms and principles within the [[ValidatorDB] envir...")
 
 
(8 intermediate revisions by 2 users not shown)
Line 1: Line 1:
Before moving on to more extensive descriptions of features, it is important to clearly establish the meaning of a few key terms and principles within the [[ValidatorDB] environment.
+
The key terms used within the '''MotiveValidator''' and '''ValidatorDB''' environment are defined below. It is important to establish these terms before moving on to the description of the [[ValidatorDB:Principles | '''ValidatorDB''' principles]], or [[MotiveValidator:Functionality | '''MotiveValidator''' functionality]] on which both validation tools are built.
 
 
=Ligand=
 
We use the term ''ligand'' to refer to a chemical compound which forms a complex with a biomacromolecule (e.g., sugar, drug, heme). Ions can also function as self standing ligands, or they can be part of a residue (such as Fe in heme). In the PDB format, a ligand has its own residue identifier and 3-letter code, and is composed from HETATM records. The '''ValidatorDB''' term [[#Residue | residue]] thus fully covers ligands, in addition to typical components like amino acids and nucleotides.  
 
  
 
=Residue=
 
=Residue=
We generally use the term ''residue'' to refer to any component of a biomacromolecule or a biomacromolecular complex. This includes amino acid residues and nucleotides, which are commonly referred to as residues as they form proteins and nucleic acids. Within the '''MotiveValidator''' environment, any collection of atoms bound by chemical bonds (covalent, coordinative or ionic) can be considered a residue as long as this fact is appropriately indicated in the input PDB file. Specifically, all the atoms that make up a residue should have the same ''residue annotation'' (3-letter code) and ''residue identifier'' (index internal to the input PDB file).
+
We generally use the term ''residue'' to refer to any component of a biomacromolecule or a biomacromolecular complex. Within the validation environment, a collection of atoms bound by chemical bonds (covalent, coordinative or ionic) can be considered a residue as long as this fact is appropriately indicated in the input PDB file. Specifically, all the atoms that make up a residue should have the same ''residue annotation'' (3-letter code) and ''residue identifier'' (index internal to the PDB file).
  
 
=Non-standard residue=
 
=Non-standard residue=
'''ValidatorDB''' does not cover the 5 standard nucleotides and their common deoxy- forms, the 20 standard amino acids and selenomethionine (MSE). Validation results for the standard building blocks of biomacromolecules are not included in our database because many tools already cover these. Additionally, MSE is also excluded from validation due to its extremely high occurrence in the PDB (markedly higher than other ligands and non-standard residues), and high incidence of circumstantial inclusion in biomacromolecules (to aid X-ray crystallography experiments). Therefore, '''ValidatorDB''' covers only ''non-standard residues''.
+
With respect to the building blocks of biomacromolecules, '''ValidatorDB''' denotes as ''standard residues'' the 5 standard nucleotides (A, C, G, T, U) together with their 5 common deoxy forms (DA, DC, DG, DT, DU), and the 20 standard amino acids. Validation results for these standard building blocks of biomacromolecules are not included in our database because many tools already cover these. Additionally, Selenomethionine (MSE) is also considered a ''standard residue'' here due to its extremely high occurrence in the Protein Data Bank (markedly higher than other ligands and non-standard residues), and high incidence of circumstantial inclusion in biomacromolecules (to aid X-ray crystallography experiments). Therefore, '''ValidatorDB''' covers only ''non-standard residues'', i.e., residues which cannot be denoted as ''standard'' by the above definition.
 +
 
 +
=Ligand=
 +
We use the term ''ligand'' to refer to a chemical compound which forms a complex with a biomacromolecule (e.g., sugar, drug, heme). Ions can also function as self standing ligands, or they can be part of a residue (such as Fe in heme). In the PDB format, a ligand has its own residue identifier and annotation (3-letter code), and is composed from HETATM records. The '''validation''' term ''residue'' thus fully covers ligands.
  
=Molecules relevant for validation=
+
=Molecule=
'''ValidatorDB''' covers all ligands and non-standard residues containing 7 or more heavy atoms.  
+
In the validation, ''molecules'' are used as an umbrella term for ''ligands and non-standard residues''. Therefore, all properties of ligands and non-standard residues are valid for molecules as well (PDB entry of origin, residue annotation, residue identifier, number of heavy atoms, properties assigned after the validation, etc.). Moreover, a single occurrence of a ligand or non-standard residue is also a ''molecule''.  
  
Within the '''MotiveValidator''' environment, a ''motif'' is generally a fragment of a biomacromolecule, biomacromolecular complex or ligand, made up of
+
It is essential to note that [[ValidatorDB:UserManual | ValidatorDB]] contains validation results for all ligands and non-standard residues containing ''7 or more heavy atoms''. These are denoted as ''molecules relevant for validation'', or simply ''molecules''. The reason [[ValidatorDB:UserManual | ValidatorDB]] focuses on these types of molecules is that they exhibit high diversity and nontriviality in their structure.
one or more residues or parts of residues. A ''motif'' can in principle be any fragment of a biomolecule. Nonetheless, '''MotiveValidator''' is focused on the validation of residues, thus here ''motif'' generally
 
refers to a fragment made up from the residue under study, together with its surroundings (i.e., atoms from neighboring residues). Note that the terms ''fragment'' and ''motif'' are used as synonyms in this manual.
 
  
 +
=Model=
 +
We use the term ''model'' to refer to a particular structure that is known to be correct. This structure will then be used as reference in the validation process. A model is identified by its residue annotation (3-letter code). The origin of the models used both by [[MotiveValidator:UserManual | MotiveValidator]] and [[ValidatorDB:UserManual | ValidatorDB]] is the wwPDB Chemical Component Dictionary (wwPDB CCD)<ref name="Sen_2014"/>.
  
=Motif=
+
=Input motif=
Only input motif or validated motif (=validated molecule)
+
The term ''motif'' is used here as a fragment of a biomacromolecule, biomacromolecular complex or ligand, made up of one or more residues or parts of residues. Specifically, the term ''input motif'' refers to the individual molecule being validated, together with its surroundings (i.e., atoms from neighboring residues, within two bonds of any atom of the validated molecule). Each ''input motif'' is assigned a unique motif identifier during the validation based on its PDB entry of origin. On the other hand, the term ''validated motif'' (or ''validated molecule'') refers strictly to the subset of atoms in the ''input motif'' which were successfully mapped to atoms in the ''model''.
With respect to the chemistry of biomolecules, the term ''motif'' is used to refer to a well defined distribution of structural elements in a biomolecule or biomolecular complex, with characteristics
 
generally associated with a specific function. Within the '''MotiveValidator''' environment, a ''motif'' is generally a fragment of a biomacromolecule, biomacromolecular complex or ligand, made up of  
 
one or more residues or parts of residues. A ''motif'' can in principle be any fragment of a biomolecule. Nonetheless, '''MotiveValidator''' is focused on the validation of residues, thus here ''motif'' generally
 
refers to a fragment made up from the residue under study, together with its surroundings (i.e., atoms from neighboring residues). Note that the terms ''fragment'' and ''motif'' are used as synonyms in this manual.
 
  
We can generally say that, within the '''MotiveValidator''' environment, all ''residues'' can be thought of as ''motifs''. Therefore, different ''instances of the same residue'' (such as multiple arginine residues throughout the sequence of a protein, or copies of the same ligand in different monomers) can be considered and processed as different ''motifs'', making their identification straightforward and unambiguous.
 
  
We use the term model residue (or simply model) to refer to a particular structure that is known to be correct. This structure will then be used as reference template in the validation process, whereby a query residue with the same name (3-letter code) as the model will be compared to the model. Within the '''MotiveValidator''' environment, a model contains one residue. The origin of the model can be the wwPDB chemical component dictionary accessible via LigandExpo, or a custom model provided by the user.
+
'''Continue with reading about the [[ValidatorDB:Principles | ValidatorDB principles]], [[MotiveValidator:Principles | MotiveValidator functionality]], or return to the [[ValidatorDB:UserManual | ValidatorDB]], or [[MotiveValidator:UserManual | MotiveValidator]] manuals.'''
  
=Validation terminology=
 
* '''Model residue:''' Or simply ''model'' is a particular structure that is known to be correct. This structure is then used as reference template in the validation process, whereby a query residue with the same name (3-letter code) as the model will be compared to the model. Within the ''MotiveValidator'' environment, a model contains one residue. The origin of the ''model'' can be the wwPDB chemical component dictionary accessible via LigandExpo<ref name ="ligandexpo"/>, or a custom model provided by the user.
 
* '''Residue to be validated (validated residue):''' Residue of interest for validation.
 
* '''Input motif:''' Residues to be validated together with their immediate surroundings (i.e. atoms within one or two bonds of any atom of the residue to be validated).
 
* '''Validated motif:''' The subset of atoms from the input motif paired with atoms in the model residue.
 
  
 
=References=
 
=References=
 
<references>
 
<references>
<ref name="ligandexpo">Feng,Z., Chen,L., Maddula,H., Akcan,O., Oughtred,R., Berman,H.M. and Westbrook,J. (2004) [http://dx.doi.org/10.1093/bioinformatics/bth214 Ligand Depot: a data warehouse for ligands bound to macromolecules]. Bioinformatics, 20, 2153–5.</ref>
+
<ref name="Sen_2014">Sen,S., Young,J., Berrisford,J.M., Chen,M., Conroy,M.J., Dutta,S., Di Costanzo,L., Gao,G., Ghosh,S., Hudson,B.P., et al. (2014) [http://dx.doi.org/10.1093/database/bau116 Small molecule annotation for the Protein Data Bank]. Database (Oxford)., 2014, 1–11.</ref>
 
</references>
 
</references>

Latest revision as of 18:13, 10 August 2015

The key terms used within the MotiveValidator and ValidatorDB environments are defined below. It is important to establish these terms before moving on to the description of the ValidatorDB principles or the MotiveValidator functionality on which both validation tools are built.

Residue

We generally use the term residue to refer to any component of a biomacromolecule or a biomacromolecular complex. Within the validation environment, a collection of atoms bound by chemical bonds (covalent, coordinative or ionic) can be considered a residue as long as this fact is appropriately indicated in the input PDB file. Specifically, all the atoms that make up a residue should have the same residue annotation (3-letter code) and residue identifier (index internal to the PDB file).
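
The grouping rule above can be illustrated with a minimal Python sketch. It assumes the standard fixed-column layout of ATOM/HETATM records in the PDB format (atom name in columns 13-16, residue annotation in columns 18-20, chain in column 22, residue identifier in columns 23-26). The function names and the `make_record` helper are hypothetical and are not part of MotiveValidator or ValidatorDB; including the chain in the grouping key is also an assumption of this sketch.

```python
from collections import defaultdict


def make_record(record, serial, atom, res_name, chain, res_seq):
    """Hypothetical helper: build a truncated, column-aligned PDB atom record."""
    return f"{record:<6}{serial:>5} {atom:<4} {res_name:>3} {chain}{res_seq:>4}"


def group_atoms_by_residue(pdb_lines):
    """Group atom names by (chain, residue identifier, residue annotation)."""
    residues = defaultdict(list)
    for line in pdb_lines:
        # Only ATOM and HETATM records describe atoms.
        if line[:6].strip() not in ("ATOM", "HETATM"):
            continue
        atom_name = line[12:16].strip()   # columns 13-16
        res_name = line[17:20].strip()    # columns 18-20 (3-letter code)
        chain_id = line[21]               # column 22
        res_seq = int(line[22:26])        # columns 23-26
        residues[(chain_id, res_seq, res_name)].append(atom_name)
    return dict(residues)


if __name__ == "__main__":
    lines = [
        make_record("ATOM", 1, "N", "ALA", "A", 1),
        make_record("ATOM", 2, "CA", "ALA", "A", 1),
        make_record("HETATM", 3, "FE", "HEM", "A", 201),
    ]
    for key, atoms in group_atoms_by_residue(lines).items():
        print(key, atoms)
```

Atoms that share the same annotation and identifier end up in one group, which is what allows a ligand built from HETATM records to be handled as a residue.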

Non-standard residue

With respect to the building blocks of biomacromolecules, ValidatorDB denotes as standard residues the 5 standard nucleotides (A, C, G, T, U) together with their 5 common deoxy forms (DA, DC, DG, DT, DU), and the 20 standard amino acids. Validation results for these standard building blocks are not included in our database because many tools already cover them. Selenomethionine (MSE) is also considered a standard residue here, due to its extremely high occurrence in the Protein Data Bank (markedly higher than that of other ligands and non-standard residues) and its high incidence of circumstantial inclusion in biomacromolecules (to aid X-ray crystallography experiments). Therefore, ValidatorDB covers only non-standard residues, i.e., residues which cannot be denoted as standard by the above definition.
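
As an illustration of this definition, the following Python sketch collects the residue annotations listed above into a lookup set. The nucleotide codes and MSE are taken from the text; the 3-letter codes of the 20 standard amino acids are the conventional ones. The names `STANDARD_RESIDUES` and `is_non_standard` are hypothetical and do not reflect the actual ValidatorDB implementation.

```python
# Conventional 3-letter codes of the 20 standard amino acids.
STANDARD_AMINO_ACIDS = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}
# Nucleotide codes as listed in the definition above.
STANDARD_NUCLEOTIDES = {"A", "C", "G", "T", "U"}
DEOXY_NUCLEOTIDES = {"DA", "DC", "DG", "DT", "DU"}

# Selenomethionine (MSE) is treated as standard, as stated in the text.
STANDARD_RESIDUES = (
    STANDARD_AMINO_ACIDS | STANDARD_NUCLEOTIDES | DEOXY_NUCLEOTIDES | {"MSE"}
)


def is_non_standard(residue_annotation):
    """Return True if the annotation is not a standard residue by the definition above."""
    return residue_annotation.strip().upper() not in STANDARD_RESIDUES


if __name__ == "__main__":
    for code in ("ALA", "MSE", "DA", "HEM"):
        print(code, is_non_standard(code))
```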

Ligand

We use the term ligand to refer to a chemical compound which forms a complex with a biomacromolecule (e.g., sugar, drug, heme). Ions can also function as stand-alone ligands, or they can be part of a residue (such as Fe in heme). In the PDB format, a ligand has its own residue identifier and annotation (3-letter code), and is composed of HETATM records. The validation term residue thus fully covers ligands.

Molecule

In the validation, molecule is used as an umbrella term for ligands and non-standard residues. Therefore, all properties of ligands and non-standard residues apply to molecules as well (PDB entry of origin, residue annotation, residue identifier, number of heavy atoms, properties assigned after the validation, etc.). Moreover, a single occurrence of a ligand or non-standard residue is also a molecule.

It is essential to note that ValidatorDB contains validation results for all ligands and non-standard residues containing 7 or more heavy atoms. These are denoted as molecules relevant for validation, or simply molecules. The reason ValidatorDB focuses on these types of molecules is that they exhibit high diversity and nontriviality in their structure.
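
The size criterion can be sketched in Python as follows. The sketch assumes that a heavy atom is any atom other than hydrogen (deuterium, written as D, is also counted as hydrogen here); the text above does not spell this out, so treat it as an assumption. The function names are hypothetical.

```python
# Assumption of this sketch: heavy atoms are all non-hydrogen atoms,
# and deuterium ("D") is treated as hydrogen.
HYDROGEN_SYMBOLS = {"H", "D"}
MIN_HEAVY_ATOMS = 7  # threshold stated in the text


def count_heavy_atoms(elements):
    """Count element symbols that are not hydrogen."""
    return sum(1 for e in elements if e.strip().upper() not in HYDROGEN_SYMBOLS)


def is_relevant_for_validation(elements):
    """Return True if a ligand or non-standard residue has 7 or more heavy atoms."""
    return count_heavy_atoms(elements) >= MIN_HEAVY_ATOMS


if __name__ == "__main__":
    six_heavy = ["C"] * 3 + ["O"] * 3 + ["H"] * 8      # 6 heavy atoms
    twelve_heavy = ["C"] * 6 + ["O"] * 6 + ["H"] * 12  # 12 heavy atoms
    print(is_relevant_for_validation(six_heavy))
    print(is_relevant_for_validation(twelve_heavy))
```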

Model

We use the term model to refer to a particular structure that is known to be correct. This structure will then be used as reference in the validation process. A model is identified by its residue annotation (3-letter code). The origin of the models used both by MotiveValidator and ValidatorDB is the wwPDB Chemical Component Dictionary (wwPDB CCD)[1].

Input motif

The term motif is used here to refer to a fragment of a biomacromolecule, biomacromolecular complex or ligand, made up of one or more residues or parts of residues. Specifically, the term input motif refers to the individual molecule being validated, together with its surroundings (i.e., atoms from neighboring residues, within two bonds of any atom of the validated molecule). Each input motif is assigned a unique motif identifier during the validation, based on its PDB entry of origin. The term validated motif (or validated molecule), on the other hand, refers strictly to the subset of atoms in the input motif which were successfully mapped to atoms in the model.
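
The relationship between a molecule, its input motif and its validated motif can be sketched in Python as below. This is only an illustration under stated assumptions: atoms are represented by arbitrary hashable identifiers, bonds are given as pairs of such identifiers, and the atom-to-model mapping is supplied as a plain dictionary. It is not the algorithm used by MotiveValidator or ValidatorDB, and all function names are hypothetical.

```python
from collections import deque


def input_motif_atoms(molecule_atoms, bonds, max_bonds=2):
    """Return the molecule's atoms plus all atoms within max_bonds bonds of any of them."""
    adjacency = {}
    for a, b in bonds:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    # Multi-source breadth-first search starting from every atom of the molecule.
    distance = {atom: 0 for atom in molecule_atoms}
    queue = deque(molecule_atoms)
    while queue:
        atom = queue.popleft()
        if distance[atom] == max_bonds:
            continue  # do not expand beyond the bond limit
        for neighbor in adjacency.get(atom, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[atom] + 1
                queue.append(neighbor)
    return set(distance)


def validated_motif_atoms(motif_atoms, atom_to_model):
    """Return the subset of motif atoms that were mapped to an atom of the model."""
    return {atom for atom in motif_atoms if atom in atom_to_model}


if __name__ == "__main__":
    molecule = {"L1", "L2"}  # atoms of the validated molecule
    bonds = [("L1", "L2"), ("L2", "P1"), ("P1", "P2"), ("P2", "P3")]
    motif = input_motif_atoms(molecule, bonds)
    print(sorted(motif))
    # Hypothetical mapping: only L1 was paired with a model atom.
    print(sorted(validated_motif_atoms(motif, {"L1": "C1"})))
```

In this toy example, P1 and P2 are one and two bonds away from the molecule and therefore belong to the input motif, while P3 is three bonds away and is left out.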


Continue reading about the ValidatorDB principles or MotiveValidator functionality, or return to the ValidatorDB or MotiveValidator manuals.


References

  1. Sen,S., Young,J., Berrisford,J.M., Chen,M., Conroy,M.J., Dutta,S., Di Costanzo,L., Gao,G., Ghosh,S., Hudson,B.P., et al. (2014) Small molecule annotation for the Protein Data Bank. Database (Oxford), 2014, 1–11.