Difference between revisions of "ValidatorDB:Database organization"
[[File:VDB_overview.png|thumb|right|500px|The '''Overview tab''' contains an overall validation report for all the ligands and non-standard residues in the Protein Data Bank. Each bar represents overall statistics for the [[ValidatorDB:Database_contents | results ]] of a certain type of [[ValidatorDB:Principles | validation analysis ]]. ]]
Revision as of 17:12, 3 September 2014
In ValidatorDB, the results of the validation analyses are organized systematically, ranging from detailed reports for single molecules, to a PDB-wide general summary, and fully customized reports. To facilitate interpretation, access is provided in graphical and tabular form via different sections of the web interface.
ValidatorDB is organized on two main levels, namely PDB-wide validation statistics (synopsis page), and detailed validation reports for specific molecules of interest (specifics page). We shall describe each level of the database in detail below.
The ValidatorDB synopsis page contains a brief description of ValidatorDB, along with information about the last database update (date and number of structures that have been processed during the validation). Specifically, in May 2014, over 100,000 PDB entries had been processed, containing over 230,000 molecules relevant for validation, and approximately 17,000 different models were used as reference during validation.
The synopsis page consists of six tabs: two provide support, and four give access to the database itself. The support tab Quick Help helps you get started, with basic information about how to get oriented in the web page. The Samples tab offers basic examples of database snippets, along with a brief interpretation of results. Various interactive guides are accessible via a green button at the top right corner of some tabs, and tool tips are available for most of the graphical elements. Examples of more complex analyses are available as case studies.
The ValidatorDB synopsis page further provides access to various data sets of PDB-wide validations via four different tabs, namely Overview, Details by molecule, Details by PDB entry and Custom search. A full description of each of these tabs is given in its respective section.
Additionally, the synopsis page gives access to the validation results for specific residues of interest via the Quick Lookup bar at the bottom of the page. Simply type a comma-separated list of residue annotations (3-letter codes) into the Quick Lookup bar, and you will be redirected to the specifics page containing validation results for the molecules you requested. If you specify a list of PDB IDs (4-letter codes) instead, the corresponding specifics page will contain validation results for all relevant molecules in the PDB entries you specified.
The Overview tab of the synopsis page provides a very general statistical evaluation of results across the entire PDB in graphical form. The graph summarizes the results for each type of validation analysis, for all validated molecules, irrespective of annotation or PDB entry. Each colored bar refers to the number of molecules which exhibited issues (missing atoms, wrong chirality, etc.), or fared well (complete structure, correct chirality, etc.) during a given analysis. The length of the bar corresponds to the percentage of the total number of molecules analyzed. A bar appears in the Overview graph only if it represents at least 0.5% of that total.
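The relation between counts, percentages and the 0.5% display threshold can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not the code used by ValidatorDB; the function name, the category labels and the counts are made up.

```python
MIN_PERCENT = 0.5  # display threshold stated in the text


def overview_bars(counts, total, min_percent=MIN_PERCENT):
    """Return {category: percentage} for the categories that would get a bar.

    `counts` maps a category label to a number of molecules and `total` is
    the number of molecules analyzed. Categories may overlap, so the
    percentages are not expected to add up to 100.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    bars = {}
    for category, n in counts.items():
        percent = 100.0 * n / total
        if percent >= min_percent:  # "at least 0.5%"
            bars[category] = round(percent, 2)
    return bars


# Invented numbers, for illustration only.
example_counts = {
    "complete structure": 9000,
    "missing atoms": 970,
    "wrong chirality": 30,
}
bars = overview_bars(example_counts, total=10000)
print(bars)
```

In this made-up example the third category accounts for 0.3% of the molecules, so it falls below the threshold and gets no bar.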
Results can be interpreted using the information in the Database contents. Tool tips are available for each element of the graph. The results for different types of validation analyses are labeled using the unified color scheme .......
Details by molecule
The Details by Residue tab contains an interactive table summarizing the results for each residue validated across the entire PDB. Each row corresponds to one residue, identified by its residue name (3-letter code). The information in the table is organized according to the validation result. The color coding for the table header and the font inside the table is the same as in the categories defined in the Overview tab. Each element of the table header is described in a tool tip, but note that here the term residue actually refers to an occurrence of a residue (a motif).
The table is interactive. Clicking on any element in the table header sorts the table entries according to that element. Click on any residue name to access the ValidatorDB specifics page with detailed validation results for that residue.
The filter at the top right corner retrieves the table row for a specific residue. Simply type the residue name into the filter. All results can be downloaded in .csv format using the download button at the top left corner.
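Once downloaded, the .csv file can be sorted and filtered offline in much the same way as the interactive table. The following is a minimal Python sketch; the column names (`Residue`, `Analyzed`, `MissingAtoms`, `WrongChirality`) and the numbers are assumptions made for illustration, so check the header of the actual file before adapting it.

```python
import csv
import io

# Stand-in for a downloaded file. Column names and values are invented.
SAMPLE_CSV = """Residue,Analyzed,MissingAtoms,WrongChirality
NAG,1200,35,48
MAN,800,12,40
HEM,500,5,0
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, converting numeric cells to int."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({k: int(v) if v.isdigit() else v for k, v in row.items()})
    return rows


def sort_by(rows, column, descending=True):
    """Mimic clicking a column header: order the rows by one column."""
    return sorted(rows, key=lambda r: r[column], reverse=descending)


def find_residue(rows, name):
    """Mimic the filter box: keep only the rows for one residue name."""
    name = name.strip().upper()
    return [r for r in rows if r["Residue"] == name]


rows = load_rows(SAMPLE_CSV)
by_chirality = sort_by(rows, "WrongChirality")
print([r["Residue"] for r in by_chirality])
print(find_residue(rows, "hem"))
```

For a real download, replace `io.StringIO(text)` with a file opened via `open(path, newline="")`.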
Details by PDB entry
The Details by PDB Entry tab contains an interactive table summarizing the results for all residues validated in each PDB entry. Each row corresponds to one PDB entry, identified by its PDB ID (4-letter code). The information in the table is organized according to the validation result. The color coding for the table header and the font inside the table is the same as in the categories defined in the Overview tab. Each element of the table header is described in a tool tip, but note that here the term residue actually refers to an occurrence of a residue (a motif).
The table is interactive. Clicking on any element in the table header sorts the table entries according to that element. Click on any PDB ID to access the ValidatorDB specifics page with detailed validation results for all residues in that PDB entry.
The filter at the top right corner retrieves the table rows containing a specific residue, or the table rows for selected PDB IDs. Simply type the residue name or PDB ID into the filter. All results can be downloaded in .csv format using the download button at the top left corner.
The Custom Search tab allows you to create your own view of the ligand validation results for the PDB. Simply paste a list of the desired ligands (3-letter codes) and/or PDB entries (4-letter codes) into the text boxes provided, separated by commas or newlines. This is particularly convenient when you need a validation report for a large number of structures, such as all glycosyltransferases or all NMR structures. Note that you can retrieve such lists by using the advanced search in the PDB. Also bear in mind that each custom search is assigned a unique permanent web address, so you can access the results later on. A list of your most recent custom searches is also provided for your convenience.
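Lists exported from other tools are often messy, with mixed separators, duplicates and inconsistent case. The Python sketch below shows one way to tidy such a list before pasting it into the text boxes. The helper names are hypothetical, and the sorting rule (3 characters means ligand, 4 characters means PDB entry) and the uppercasing are simplifications chosen for this sketch, not rules taken from ValidatorDB.

```python
import re


def parse_codes(text):
    """Split on commas and newlines, trim, uppercase, drop duplicates (order kept)."""
    seen = set()
    unique = []
    for token in re.split(r"[,\n]", text):
        code = token.strip().upper()
        if code and code not in seen:
            seen.add(code)
            unique.append(code)
    return unique


def split_by_length(codes):
    """Sort codes into ligands (3 characters), PDB IDs (4 characters) and the rest."""
    ligands = [c for c in codes if len(c) == 3]
    pdb_ids = [c for c in codes if len(c) == 4]
    rejected = [c for c in codes if len(c) not in (3, 4)]
    return ligands, pdb_ids, rejected


raw = "NAG, man\n1tqn,3D12\nNAG, heme5"
ligands, pdb_ids, rejected = split_by_length(parse_codes(raw))
print(", ".join(ligands))   # text for the ligand box
print(", ".join(pdb_ids))   # text for the PDB entry box
print(rejected)             # anything that needs a manual check
```

Keeping the rejected tokens in a separate list, rather than silently dropping them, makes typos in a long list easy to spot.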
The ValidatorDB specifics page is accessible from the synopsis page, either via the Quick Lookup bar, or via the residue names and PDB IDs in the interactive tables on the tabs Details by Residue and Details by PDB Entry, respectively. Depending on how it was accessed, the specifics page may show validation results for one or more residues, which is stated at the very top of the page.
The ValidatorDB specifics page provides a straightforward report of the validation results, including a summary and detailed information in both tabular and graphical form, along with a 3D structure visualizer for closer inspection of the problematic structures. These reports are accessible via several tabs on the specifics page, namely Overview, Summary, Details and Processing Warnings, which are described in detail in the sections below. Inspecting the tabular and graphical validation reports on the specifics page is the most convenient and effective way to evaluate the results. Additionally, you may use the JSON Data download button at the top right corner of the specifics page to download the complete validation reports and perform any additional analyses on your own.
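The layout of the downloaded JSON file is not described on this page, so a sensible first step for your own analyses is to inspect its structure. The Python sketch below prints an outline of any parsed JSON document without assuming particular field names; the function `outline` is a hypothetical helper, and the sample data uses placeholder keys that are not taken from ValidatorDB.

```python
import json


def outline(value, depth=0, max_depth=3):
    """Return indented lines describing the shape of parsed JSON data.

    Dictionaries are listed key by key with the type of each value;
    for lists, only the first item is described.
    """
    pad = "  " * depth
    lines = []
    if isinstance(value, dict):
        for key, child in value.items():
            lines.append(f"{pad}{key}: {type(child).__name__}")
            if depth + 1 < max_depth:
                lines.extend(outline(child, depth + 1, max_depth))
    elif isinstance(value, list) and value:
        lines.append(f"{pad}[{len(value)} items, first shown]")
        if depth + 1 < max_depth:
            lines.extend(outline(value[0], depth + 1, max_depth))
    return lines


# Placeholder document; a real report will have different keys.
sample = json.loads('{"a": [{"b": 1}], "c": "x"}')
shape = outline(sample)
print("\n".join(shape))
```

For a downloaded report, load the data with `json.load` on the opened file and pass the result to `outline`; increase `max_depth` to look further into nested records.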
To keep consistency with the synopsis page, the specifics page also allows visualization of general validation statistics for a selected number of residues via the Overview tab. This representation is entirely compatible with that of the Overview tab on the synopsis page, and in fact makes up a subset of that data set. All color coding conventions are kept, and tool tips provide descriptions of each graphical element.
Note that these statistics can be downloaded in .csv format by clicking 'CSV' in the bottom right corner of the infographic.
On the ValidatorDB specifics page, the first view of the results is available in the Summary tab. For each validated residue, the Summary tab provides an overview of potential issues encountered.
If more than one residue was validated in one run, a list of these residues appears at the top of the page. To examine the validation summary for a given residue, either click on that residue in the list, or scroll down the page until you reach it. Each validated residue is identified by its 3-letter code, as well as its chemical formula and common name. Validation statistics are given as absolute numbers and percentages over all the motifs that were processed for each residue.
The table with the validation report is organized into two main sections, referring to incomplete (Missing Atoms or Rings) and complete structures (With All Atoms and Rings) respectively. The formal distinction between ring atoms and non-ring atoms (simply denoted as atoms) is meant to allow a quick localization of potential issues in residues containing rings, especially where atom identifiers are not useful. Chirality is evaluated only for the complete structures, since the absence of some atoms makes it difficult to check the chirality of some of the remaining atoms. Further, the problematic atoms are highlighted, in order to better localize the problems in the structures.
Last, a 2D representation of the model residue, and a pie chart with the validation results are provided for visual representation purposes. You can download them via the small icon at the top right corner of the chart, and later use them in your presentations.
Whereas the Summary tab provides statistics of the issues over all validated motifs for each validated residue, the Details tab of the ValidatorDB specifics page allows you to inspect the issues in selected groups of motifs, and further in each individual motif. Note that you may also access the details of any particular group of motifs by clicking on a specific issue in any Summary tab table.
The Details tab is organized into a table where each row contains information regarding a single validated motif. The content of the table (i.e., which motifs are included, and what information is displayed) is dictated by the values of three selection fields at the top of the table. Click on the first field, and select the validated residue by its name (3-letter code) from the drop-down menu. Only the motifs that were matched to that residue name will be displayed in the table. Click on the second field and select the type of issue (e.g., wrong chirality) from the drop-down menu. Only the motifs which exhibit that type of issue will be displayed in the table. The number of motifs that fit each selection is given in brackets. If you want to make your selection even more specific, use the third selection field, Id filter.
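The combined effect of the three selection fields can be pictured as successive filters over a list of motif records. The Python sketch below illustrates that logic only; the record layout, the field names and the placeholder Ids (`m1`, `m2`, ...) are assumptions and do not reflect ValidatorDB's internal data or its real Id format.

```python
from collections import Counter

# Hypothetical motif records with placeholder Ids.
MOTIFS = [
    {"id": "m1", "residue": "NAG", "issues": ["wrong chirality"]},
    {"id": "m2", "residue": "NAG", "issues": ["missing atoms"]},
    {"id": "m3", "residue": "MAN", "issues": ["wrong chirality", "missing atoms"]},
    {"id": "m4", "residue": "NAG", "issues": []},
]


def select(motifs, residue=None, issue=None, id_filter=None):
    """Apply the three selections in turn; a value of None means no restriction."""
    result = []
    for motif in motifs:
        if residue is not None and motif["residue"] != residue:
            continue
        if issue is not None and issue not in motif["issues"]:
            continue
        if id_filter and id_filter.lower() not in motif["id"].lower():
            continue
        result.append(motif)
    return result


def issue_counts(motifs, residue):
    """Count the motifs per issue type for one residue (the numbers shown in brackets)."""
    counts = Counter()
    for motif in select(motifs, residue=residue):
        counts.update(motif["issues"])
    return dict(counts)


chosen = select(MOTIFS, residue="NAG", issue="wrong chirality")
print([m["id"] for m in chosen])
print(issue_counts(MOTIFS, "NAG"))
```

Because each filter only narrows the previous result, the order in which the fields are set does not change which motifs end up in the table.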
Which table columns are filled depends mostly on the type of issue selected in the filter. The most important columns are Id, Issues/Warnings, Missing atoms/rings, Atoms, and Processing warnings. The other columns give additional information, usually helpful in identifying the source of the error in the structure. Note that for complete structures, the columns with information about missing atoms are not shown. The column Id contains a unique identifier assigned to each motif in order to keep a transparent trace of the motif's origin: it includes the PDB ID, as well as the serial index of the first atom in the motif, as it appears in the original PDB entry. The column Issues/Warnings reports the number of issues or warnings found for each particular motif. The column Missing atoms/rings lists which atoms are missing in each validated motif, whereas Atoms shows the positions of incorrect chirality. Missing atoms are listed by their atom identifier in the model, whereas atoms with wrong chirality are listed by their identifier in the validated motif. Clicking on a column header sorts the motifs according to the property specified in the header.
The 3D viewer implemented in the ValidatorDB interface takes the analysis of each individual validated motif one step further, and is accessible via the Details tab on the specifics page. In the table, simply click on the Id of a motif of interest to open the 3D viewer, where you can inspect the structural inaccuracies more closely. Here you can view and manipulate the 3D representations of the validated motif and model residue, to help you better assess the position and relevance of the structural issues found during validation. Additionally, a 2D representation of the model is provided for clarity, which is especially helpful for larger motifs. Basic information about the validated motif is also given, along with a complete report of the validation results, where all the potential issues are listed.
The validation reports in ValidatorDB also mention various unusual aspects encountered during validation. Sometimes the processed PDB entries contain information that is ambiguous, conflicting, or which deviates strongly from the expected reference. ValidatorDB reports such events as processing warnings. This information can be found in the Processing Warnings tab on the specifics page. The selection field at the top of the page helps filter the warnings by residue, in case the validation report covers more than one. Simply click on the drop-down menu and select the category of warnings that you would like to explore. Processing warnings are issues that may cause incorrect validation, such as two residues being too close together (a misuse of the concept of alternate conformations) or unusual bond lengths given by the CONECT records. Make sure that negative validation results (e.g., missing atoms) are not in fact caused by some atoms being ignored in an ill-formed structure.
Overall, a processing warning may simply lead to a faulty atom being ignored, while the motif is still validated.