SecStrAnnotator Suite provides scripts (Python, R, and bash) for batch annotation of the whole family and analysis of the annotation results.
* formatting into [[SecStrAnnotator:SecStrAPI#SecStrAPI_format | SecStrAPI format]],
* formatting into TSV format for further analyses.
The whole pipeline can be executed by <code>scripts/</code>
Example usage:
<code>python3 scripts/ scripts/SecStrAPI_pipeline_settings.json --resume</code>
Before running, modify the settings in <code>SecStrAPI_pipeline_settings.json</code> to set your family of interest, annotation template, data directory, options, etc. (see <code>README.txt</code> for more details). Unwanted steps of the pipeline can be commented out in the MAIN PIPELINE section.
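Because a syntax error in a hand-edited settings file would only show up once the pipeline starts, it can help to check the file first. The following is a minimal sketch, not part of SecStrAnnotator Suite: the helper name <code>load_settings</code> is hypothetical, and it assumes only that the settings file is a JSON object. It makes no assumption about which keys the pipeline expects; see <code>README.txt</code> for those.

```python
import json
import sys
from pathlib import Path


def load_settings(path):
    """Parse a JSON settings file and return its top-level object as a dict.

    Hypothetical helper for illustration; it checks syntax only and knows
    nothing about the keys the pipeline actually requires.
    """
    text = Path(path).read_text(encoding="utf-8")
    try:
        settings = json.loads(text)
    except json.JSONDecodeError as err:
        # Report the position of the first syntax error in the file.
        raise ValueError(
            f"{path}: invalid JSON at line {err.lineno}, "
            f"column {err.colno}: {err.msg}"
        ) from err
    if not isinstance(settings, dict):
        raise ValueError(f"{path}: expected a JSON object at the top level")
    return settings


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print the top-level keys so they can be compared with README.txt.
    for key in sorted(load_settings(sys.argv[1])):
        print(key)
```

Saved under a name of your choice, the script takes the path to the settings file as its only argument and prints the top-level keys it found.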
===Data analysis===
For the Cytochrome P450 family, structures of 1855 protein domains are available, located in 1012 PDB entries (updated on 7 July 2020). The analysis was performed on a non-redundant subset containing 183 protein domains.
The data are available [https://isdoi.muniorg/ 3939133 here] (structural files not included because of their size).
===Occurrence of SSEs===

