ValidatorDB:Terminology

3,984 bytes added, 00:43, 3 September 2014
'''ValidatorDB''' is a database of validation results for ligands and non-standard residues in the Protein Data Bank. Before moving on to more extensive descriptions of features, it is important to clearly establish the meaning of a few key terms and principles within the [[ValidatorDB]] environment.
=Residue=
We generally use the term ''residue'' to refer to any component of a biomacromolecule or a biomacromolecular complex. This includes amino acid residues and nucleotides, which are commonly referred to as residues as they form proteins and nucleic acids. Within the '''ValidatorDB''' environment, any collection of atoms bound by chemical bonds (covalent, coordinative or ionic) can be considered a residue as long as this fact is appropriately indicated in the input PDB file. Specifically, all the atoms that make up a residue should have the same ''residue annotation'' (3-letter code) and ''residue identifier'' (index internal to the input PDB file).
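As a rough illustration of the last point, the sketch below groups the atoms of a PDB file into residues by their annotation and identifier. It assumes the standard fixed-column layout of ATOM and HETATM records. The function name <code>group_atoms_by_residue</code> and the inclusion of the chain identifier in the key are illustrative assumptions, not part of '''ValidatorDB'''.

```python
from collections import defaultdict


def group_atoms_by_residue(pdb_text):
    """Group atom names by (residue annotation, chain, residue identifier).

    Hypothetical helper: relies only on the fixed columns of ATOM/HETATM records.
    """
    residues = defaultdict(list)
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue  # skip REMARK, TER, END and other record types
        atom_name = line[12:16].strip()    # columns 13-16
        annotation = line[17:20].strip()   # columns 18-20, the 3-letter code
        chain = line[21].strip()           # column 22
        identifier = int(line[22:26])      # columns 23-26
        residues[(annotation, chain, identifier)].append(atom_name)
    return dict(residues)


# Records are truncated after the residue identifier for brevity.
example = "\n".join([
    "REMARK   illustrative data only",
    "HETATM    1  C1  NAG A 401",
    "HETATM    2  C2  NAG A 401",
    "ATOM      3  CA  ARG A  12",
    "HETATM    4  C1  NAG A 402",
])

for key, atoms in group_atoms_by_residue(example).items():
    print(key, atoms)
```

The two NAG entries end up as separate residues because their identifiers differ, even though they share the same annotation.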
=Non-standard residue=
With respect to the building blocks of biomacromolecules, '''ValidatorDB''' denotes as ''standard residues'' the 5 standard nucleotides (A, C, G, T, U) together with their 5 common deoxy forms (DA, DC, DG, DT, DU), and the 20 standard amino acids. Validation results for these standard building blocks of biomacromolecules are not included in our database because many tools already cover them. Additionally, selenomethionine (MSE) is also considered a ''standard residue'' here due to its extremely high occurrence in the Protein Data Bank (markedly higher than other ligands and non-standard residues), and its high incidence of circumstantial inclusion in biomacromolecules (to aid X-ray crystallography experiments). Therefore, '''ValidatorDB''' covers only ''non-standard residues'', i.e., residues which cannot be denoted as ''standard'' by the above definition.
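The definition above can be expressed as a simple membership test. The following is a minimal sketch, assuming residues are identified by their PDB annotation. The names <code>STANDARD_RESIDUES</code> and <code>is_non_standard</code> are hypothetical and chosen only for this example.

```python
# 5 standard nucleotides and their 5 common deoxy forms
STANDARD_NUCLEOTIDES = {"A", "C", "G", "T", "U", "DA", "DC", "DG", "DT", "DU"}

# 20 standard amino acids
STANDARD_AMINO_ACIDS = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}

# Selenomethionine (MSE) is treated as standard, as described in the text.
STANDARD_RESIDUES = STANDARD_NUCLEOTIDES | STANDARD_AMINO_ACIDS | {"MSE"}


def is_non_standard(annotation):
    """Return True if the residue annotation is not a standard residue."""
    return annotation.strip().upper() not in STANDARD_RESIDUES


for code in ("ALA", "DA", "MSE", "HEM", "NAG"):
    print(code, is_non_standard(code))
```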
=Ligand=
We use the term ''ligand'' to refer to a chemical compound which forms a complex with a biomacromolecule (e.g., sugar, drug, heme). Ions can also function as self-standing ligands, or they can be part of a residue (such as Fe in heme). In the PDB format, a ligand has its own residue identifier and annotation (3-letter code), and is composed of HETATM records. The '''ValidatorDB''' term ''residue'' thus fully covers ligands, in addition to typical components like amino acids and nucleotides.
=Molecule=
'''ValidatorDB''' uses ''molecules'' as an umbrella term for ''ligands and non-standard residues''. Therefore, all properties of ligands and non-standard residues are valid for molecules as well (PDB entry of origin, residue annotation, residue identifier, number of heavy atoms, properties assigned after the validation, etc.). Moreover, a single occurrence of a ligand or non-standard residue is also a ''molecule''. Furthermore, it is essential to note that '''ValidatorDB''' contains validation results for all ligands and non-standard residues containing ''7 or more heavy atoms''. These are denoted as ''molecules relevant for validation'', or simply ''molecules''.
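The size criterion can be sketched as follows. This example assumes that a heavy atom is any atom other than hydrogen (deuterium is excluded as well), which is the usual convention but is not spelled out in the text. The function names are hypothetical, and the check for non-standard residues is left out to keep the example short.

```python
MIN_HEAVY_ATOMS = 7            # threshold stated in the text
HYDROGEN_SYMBOLS = {"H", "D"}  # assumption: hydrogen and deuterium are not heavy atoms


def count_heavy_atoms(elements):
    """Count the element symbols that are not hydrogen."""
    return sum(1 for e in elements if e.strip().upper() not in HYDROGEN_SYMBOLS)


def is_relevant_for_validation(elements):
    """Return True if the molecule has 7 or more heavy atoms."""
    return count_heavy_atoms(elements) >= MIN_HEAVY_ATOMS


# Glycerol, C3H8O3: 6 heavy atoms, below the threshold
glycerol = ["C"] * 3 + ["O"] * 3 + ["H"] * 8
# Glucose, C6H12O6: 12 heavy atoms, above the threshold
glucose = ["C"] * 6 + ["O"] * 6 + ["H"] * 12

print(count_heavy_atoms(glycerol), is_relevant_for_validation(glycerol))
print(count_heavy_atoms(glucose), is_relevant_for_validation(glucose))
```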
=Model=
We use the term ''model'' to refer to a particular structure that is known to be correct. This structure will then be used as reference in the validation process. A model is identified by its residue annotation (3-letter code). The origin of the models used by '''ValidatorDB''' is the wwPDB Chemical Component Dictionary (wwPDB CCD).
=Motif=
The term ''motif'' is used here for a fragment of a biomacromolecule, biomacromolecular complex or ligand, made up of one or more residues or parts of residues. Specifically, the term ''input motif'' refers to the individual molecule being validated, together with its surroundings (i.e., atoms from neighboring residues, within two bonds of any atom of the validated molecule). Each ''input motif'' in '''ValidatorDB''' is assigned a unique motif identifier based on its PDB entry of origin. On the other hand, the term ''validated motif'' (or ''validated molecule'') refers strictly to the subset of atoms in the ''input motif'' which were successfully mapped to atoms in the ''model''.
=Validation procedure=
'''ValidatorDB''' implements the ''validation of annotation'' approach, which consists of several steps. First, for each molecule under investigation, the ''input motif'' is extracted from the respective PDB entry. At the same time, the appropriate ''model'' is retrieved from wwPDB CCD. Then, the ''validated molecule'' (or ''validated motif'') is identified as the subset of atoms common to the ''model'' and the ''input motif''. Subsequently, the ''validated molecule'' is compared against the ''model'', atom by atom. All the validation analyses in '''ValidatorDB''' are based on this comparison of atom properties (presence, chirality, element symbol, PDB name, etc.). Other unusual aspects encountered during validation are reported as processing warnings (e.g., which conformer was validated if several conformers were present).
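The atom-by-atom comparison can be pictured with the simplified sketch below. It is not the '''ValidatorDB''' implementation: the data layout, the function name <code>compare_to_model</code> and, in particular, the precomputed <code>mapping</code> between model atoms and input motif atoms are assumptions made for illustration. How that mapping is obtained is outside the scope of the example, and chirality is not checked.

```python
def compare_to_model(model, input_motif, mapping):
    """Compare an input motif against its model, atom by atom.

    model:       {model atom name: element symbol}
    input_motif: {atom serial: {"name": str, "element": str}}
    mapping:     {model atom name: atom serial in the input motif}
                 (hypothetical result of a separate atom-matching step)
    """
    validated, missing, differences = [], [], []
    for model_name, model_element in model.items():
        serial = mapping.get(model_name)
        if serial is None or serial not in input_motif:
            missing.append(model_name)  # no counterpart: atom is absent
            continue
        atom = input_motif[serial]
        validated.append(serial)        # atom belongs to the validated motif
        if atom["element"] != model_element:
            differences.append((model_name, "element", atom["element"]))
        if atom["name"] != model_name:
            differences.append((model_name, "name", atom["name"]))
    return {"validated": validated, "missing": missing, "differences": differences}


model = {"C1": "C", "C2": "C", "O1": "O"}
input_motif = {
    10: {"name": "C1", "element": "C"},
    11: {"name": "C2'", "element": "C"},   # same atom, different PDB name
    50: {"name": "ND2", "element": "N"},   # surrounding atom from a neighboring residue
}
mapping = {"C1": 10, "C2": 11}             # O1 has no counterpart in the input motif

report = compare_to_model(model, input_motif, mapping)
print(report)
```

Atom 50 is part of the input motif but not of the validated motif, because it was not mapped to any atom of the model.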
Refer to figure..................
=Validation analyses=
The validation analyses performed by ValidatorDB cover all main issues which have been observed in the topology (2D structure) and geometry (3D structure) of ligands and non-standard residues. These validation analyses, along with their respective results, can be classified into three categories, namely ''Completeness'', ''Chirality'' and ''Advanced'' analyses.
The ''Completeness'' analyses attempt to find which atoms are missing, whether these atoms are part of rings, or whether the structure is degenerate, i.e., the molecule contains very severe errors. These may refer to residues overlapping in 3D space, or atoms which are disconnected from the rest of the structure.
The ''Chirality'' analyses are performed only on complete structures, and aim to evaluate the chirality of each atom in the validated molecule. We distinguish between several types of chirality errors: on carbon atoms (C chirality), on metal atoms (Metal chirality), on atoms with 4 substituents in one plane (Planar chirality), on atoms connected to at least one substituent by a bond of higher order (High order chirality), and the remaining chirality issues (Other chirality).
The ''Advanced'' analyses are focused on issues which are not real chemical problems, but which can complicate further processing and exploration of data, and thus should be noted. The Substitution analysis reports the replacement of some atom by an atom of a different chemical element. The Foreign atom analysis detects atoms which originate from the neighborhood of the validated molecule (i.e., having a different PDB residue ID than the majority of the validated molecule), and generally marks sites of inter-molecular linkage.
The Different naming analysis identifies atoms whose name in PDB format differs from the standard convention for the validated molecule. The Zero RMSD analysis reports molecules whose structure is identical (root mean square deviation = 0 Å) to the model from wwPDB CCD. The Alternate conformations analysis reports the occurrence of alternate conformations in the validated PDB entry.
=Validation results=
Each molecule is evaluated depending on how it fares during the validation analyses described above. If no issues are found during the validation analyses, the molecule is marked as having ''complete structure and correct chirality''. Validated molecules exhibiting an error in at least one of the ''Completeness'' analyses are denoted as ''incomplete'', whereas the remaining molecules are reported as ''complete''. If no issues are detected during the ''Chirality'' analyses, the validated molecule is marked as having ''Correct chirality'', whereas the remaining molecules are marked as having ''Wrong chirality''.
Some types of chirality errors do not constitute real issues, but are artifacts of the automated chirality determination procedure. Specifically, an error in planar chirality may just mean that the chiral atom is situated slightly above or below the plane compared to its equivalent in the model from wwPDB CCD. Further, an error in high order chirality often marks the involvement of phosphate O atoms in salt or ester formation, or merely a different PDB format identification of phosphate O atoms of the validated molecule compared to the model. Therefore, if the validated molecule is found to have planar or high order chirality errors, but no other type of chirality issues, the molecule is marked as having ''Correct chirality (tolerant)''.
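The evaluation rules for completeness and chirality can be summarized in a few lines of code. This is only a sketch of the rules as stated in the text: the function name <code>classify</code> and the string labels used for error types (<code>"carbon"</code>, <code>"metal"</code>, <code>"planar"</code>, <code>"high_order"</code>, <code>"other"</code>) are hypothetical.

```python
# Chirality error types that are tolerated when they are the only ones present
TOLERATED = {"planar", "high_order"}


def classify(completeness_errors, chirality_errors):
    """Return (completeness label, chirality label) for one validated molecule.

    The chirality label is None for incomplete molecules, because the
    Chirality analyses are performed only on complete structures.
    """
    if completeness_errors:
        return "incomplete", None
    errors = set(chirality_errors)
    if not errors:
        chirality = "Correct chirality"
    elif errors <= TOLERATED:
        chirality = "Correct chirality (tolerant)"
    else:
        chirality = "Wrong chirality"
    return "complete", chirality


print(classify([], []))
print(classify([], ["planar"]))
print(classify([], ["planar", "carbon"]))
print(classify(["missing_atoms"], []))
```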
When issues are found during an ''Advanced'' analysis, a warning is reported: Substitution, Foreign atom, Different naming, Zero RMSD or Alternate conformations. While the results of the ''Advanced'' analyses have no bearing on the chemical soundness of the validated molecules, they indicate that further processing of these structures, especially automated processing, can be very problematic. Comparison between the structures of molecules with the same annotation (3-letter code) from different PDB entries might even be impossible in the presence of a substitution, as the corresponding atoms have different chemical elements. PDB atom names cannot be used straightforwardly, since even element symbols can differ and atoms can be formally included in neighboring residues.
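To make the warnings more concrete, the sketch below derives four of them from a list of paired atoms. It is an illustration under several assumptions: the atom pairing is given, the two structures are already superimposed before the RMSD is computed, and the names <code>rmsd</code> and <code>advanced_warnings</code> and the dictionary keys are hypothetical. The Alternate conformations warning is omitted because it depends on the whole PDB entry rather than on atom pairs.

```python
import math
from collections import Counter


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z).

    Assumes the structures are already superimposed.
    """
    squared = sum(
        (a - b) ** 2
        for p, q in zip(coords_a, coords_b)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(coords_a))


def advanced_warnings(pairs):
    """pairs: list of (model_atom, input_atom) dictionaries.

    Both have 'name', 'element' and 'xyz'; input atoms also have 'residue_id'.
    """
    warnings = []
    if any(m["element"] != a["element"] for m, a in pairs):
        warnings.append("Substitution")
    majority_id, _ = Counter(a["residue_id"] for _, a in pairs).most_common(1)[0]
    if any(a["residue_id"] != majority_id for _, a in pairs):
        warnings.append("Foreign atom")
    if any(m["name"] != a["name"] for m, a in pairs):
        warnings.append("Different naming")
    model_xyz = [m["xyz"] for m, _ in pairs]
    input_xyz = [a["xyz"] for _, a in pairs]
    if rmsd(model_xyz, input_xyz) == 0.0:
        warnings.append("Zero RMSD")
    return warnings


pairs = [
    ({"name": "C1", "element": "C", "xyz": (0.0, 0.0, 0.0)},
     {"name": "C1", "element": "C", "xyz": (0.0, 0.0, 0.0), "residue_id": 401}),
    ({"name": "O1", "element": "O", "xyz": (1.4, 0.0, 0.0)},
     {"name": "O1", "element": "N", "xyz": (1.4, 0.0, 0.0), "residue_id": 401}),
    ({"name": "C2", "element": "C", "xyz": (0.0, 1.5, 0.0)},
     {"name": "C2", "element": "C", "xyz": (0.0, 1.5, 0.0), "residue_id": 12}),
]

print(advanced_warnings(pairs))
```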
=Validation reports=
In '''ValidatorDB''', the results of the validation analyses are organized systematically:
* '''Validation overview for the entire PDB:''' summarizes the results of all validation analyses for all molecules
* Summary of validation results for sets of molecules sharing the same annotation
* Summary of validation results for sets of molecules originating from the same PDB entry
* Detailed validation report for a set of molecules sharing a particular annotation
* Detailed validation report for a particular PDB entry
* Detailed validation report for a particular molecule
* Custom validation report
Each type of validation report is accessible via different sections of the web interface (.....)
