From WebChem Wiki
Revision as of 18:10, 9 April 2020 by Midlik (talk | contribs) (Selection of a template domain and obtaining its annotation)


This page describes the procedure for annotating SSEs in a whole protein family, using SecStrAnnotator Suite (SecStrAnnotator + supplementary scripts).

A protein family is understood as a set of structurally similar protein domains. A protein domain can be either a whole protein chain or only a part of it (in multidomain proteins).

Most of this procedure can be automated by the script described in SecStrAnnotator:Analysis. The only part that must be done manually is the step "Selection of a template domain and obtaining its annotation".



Most steps in the procedure are performed by scripts that are run with the Python 3 interpreter (pre-installed in some Linux distributions).


Preparing structural data

A list of PDB structures corresponding to a protein family can be obtained from the PDBe REST API. The protein family can be identified by a CATH code, such as 1.10.630.10 (CATH), or by a Pfam accession, such as PF00067 (Pfam):

python3 1.10.630.10 > family_from_cath.json

python3 PF00067 > family_from_pfam.json
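As a rough illustration of what such a query involves, the sketch below builds a request to the PDBe REST API mappings endpoint and collects the chains listed for each PDB entry. It is not the script from SecStrAnnotator Suite: the function names are hypothetical, and both the endpoint URL and the response layout (accession, then "PDB", then PDB ID, then a list of records with a chain_id field) are assumptions that should be checked against the current PDBe API documentation.

```python
import json
import urllib.request

# Assumed URL pattern of the PDBe REST API mappings endpoint.
PDBE_MAPPINGS_URL = "https://www.ebi.ac.uk/pdbe/api/mappings/{accession}"


def mappings_url(accession):
    """Build the query URL for a CATH code or a Pfam accession."""
    return PDBE_MAPPINGS_URL.format(accession=accession)


def chains_by_pdb(response, accession):
    """Collect the sorted chain IDs per PDB entry from a parsed API response.

    Assumes the layout {accession: {"PDB": {pdb_id: [{"chain_id": ...}, ...]}}}.
    """
    entries = response.get(accession, {}).get("PDB", {})
    return {
        pdb_id: sorted({record["chain_id"] for record in records})
        for pdb_id, records in entries.items()
    }


def fetch_family(accession):
    """Download and parse the mappings for one accession (needs network access)."""
    with urllib.request.urlopen(mappings_url(accession)) as reply:
        return chains_by_pdb(json.load(reply), accession)
```

Calling fetch_family("PF00067") would, under these assumptions, return a dictionary from PDB IDs to chain lists. The parsing is kept in a separate function so that it can be tried on a saved response without network access.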

The structures are then downloaded from PDBe:

python3 family_from_cath.json my_structure_directory

At this point, all necessary structures should be in the directory my_structure_directory.
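Whether the directory is really complete can be checked with a few lines of Python. The sketch below is only an illustration under stated assumptions: it supposes that each structure is stored as <pdb_id>.cif and that the files come from the PDBe "updated mmCIF" download URL. Neither the file naming nor the URL pattern is confirmed by this page, and the function names are hypothetical.

```python
import urllib.request
from pathlib import Path

# Assumed URL pattern for the updated mmCIF files served by PDBe.
PDBE_CIF_URL = "https://www.ebi.ac.uk/pdbe/entry-files/download/{pdb_id}_updated.cif"


def structure_url(pdb_id):
    """Build the assumed download URL for one PDB entry."""
    return PDBE_CIF_URL.format(pdb_id=pdb_id.lower())


def missing_structures(pdb_ids, directory):
    """Return the PDB IDs whose file <pdb_id>.cif is not in the directory."""
    directory = Path(directory)
    return [
        pdb_id for pdb_id in pdb_ids
        if not (directory / f"{pdb_id.lower()}.cif").is_file()
    ]


def download_missing(pdb_ids, directory):
    """Fetch every missing structure (needs network access)."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    for pdb_id in missing_structures(pdb_ids, directory):
        target = directory / f"{pdb_id.lower()}.cif"
        urllib.request.urlretrieve(structure_url(pdb_id), str(target))
```

Running missing_structures on the list of PDB IDs and my_structure_directory after the download should give an empty list. A non-empty result names the entries that still need to be fetched.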

Selection of a template domain and obtaining its annotation

In this step, one of the domains from the protein family is selected as the template domain. Suppose that we have selected the domain located on chain A of PDB entry 2nnj.

If an SSE annotation of the template domain is available in the literature, it can be converted into the SecStrAnnotator format and used as the template annotation.

If no annotation is available, it must be created from scratch. The easiest way of creating an annotation file is to perform secondary structure assignment (SSA) by SecStrAnnotator:

dotnet SecStrAnnotator.dll --onlyssa my_structure_directory 2nnj,A

This will create the file my_structure_directory/2nnj-detected.sses.json. Rename this file to 2nnj-template.sses.json and optionally refine the annotation in it:

  • Remove unnecessary SSEs which you don't want to annotate.
  • Add SSEs which were not detected but should be annotated.
  • Rename the SSEs according to a transparent scheme (e.g. helices A, B, C, D... instead of default H0, H1, H4, H6...).
  • If you add or remove β-strands, remember to update the beta_connectivity section as well.
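The renaming and removal steps above can also be scripted rather than edited by hand. The following is a minimal sketch, not part of SecStrAnnotator Suite. It assumes that the annotation file is a JSON object keyed by PDB ID, with a secondary_structure_elements list whose items carry a label field, and a beta_connectivity list whose items start with the two strand labels. These key names (apart from beta_connectivity, which is mentioned above) and the function name relabel_sses are assumptions, so compare them with your own 2nnj-detected.sses.json before use.

```python
import copy


def relabel_sses(annotation, pdb_id, new_labels):
    """Return a copy of the annotation with SSEs renamed and unlisted SSEs removed.

    new_labels maps old labels to new ones, e.g. {"H0": "A", "H1": "B"}.
    Assumed layout: annotation[pdb_id]["secondary_structure_elements"] is a list
    of dicts with a "label" key, and annotation[pdb_id]["beta_connectivity"] is
    a list of items of the form [label1, label2, ...].
    """
    result = copy.deepcopy(annotation)
    entry = result[pdb_id]
    # Keep only the SSEs named in new_labels and give them their new label.
    entry["secondary_structure_elements"] = [
        {**sse, "label": new_labels[sse["label"]]}
        for sse in entry["secondary_structure_elements"]
        if sse["label"] in new_labels
    ]
    # Keep only connections whose strands were both kept, and rename them too.
    entry["beta_connectivity"] = [
        [new_labels[first], new_labels[second], *rest]
        for first, second, *rest in entry.get("beta_connectivity", [])
        if first in new_labels and second in new_labels
    ]
    return result
```

A typical use would load the detected file with json.load, call relabel_sses(annotation, "2nnj", {"H0": "A", "H1": "B"}), and write the result to 2nnj-template.sses.json with json.dump. SSEs that were not detected still have to be added by hand.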

Running the annotation algorithm on each member of the family

python3 my_structure_directory 2nnj,A family_from_cath.json
