CrocoBLAST:Technical details

CrocoBLAST is free to use within the conditions of the licence, and has been available for download since July 2016 at http://webchem.ncbr.muni.cz/CrocoBLAST. No login is required to download or run CrocoBLAST.

Software requirements

CrocoBLAST runs on Windows and Linux, and all results are provided in the typical BLAST output format. CrocoBLAST is distributed as a .zip archive containing all necessary files, so you will need a program to unpack the archive (e.g., unzip or 7zip). Such a program is most likely already installed on your computer. You will also need to have BLAST available on your computer before you can run CrocoBLAST. If you don't already have BLAST, please get it from the NCBI website. The command line utility of CrocoBLAST has no further requirements, either for running calculations or for inspecting the results. The graphical user interface requires Java, which is likely already installed on your computer. If not, please visit https://java.com/en/download/.
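
The prerequisites above can be checked quickly before installing. The following is a minimal sketch, not part of CrocoBLAST: it assumes the tools are reachable through your PATH, and it uses blastn as a stand-in for BLAST (blastn is one of the BLAST+ programs; substitute whichever BLAST program you intend to run). The function name find_missing_tools is hypothetical. Java is only needed for the graphical user interface.

```python
import shutil

def find_missing_tools(tools=("blastn", "java", "unzip")):
    """Return the names in `tools` that cannot be found on PATH.

    The default tool names are assumptions for illustration:
    'blastn' stands in for BLAST, 'java' is needed only for the
    CrocoBLAST GUI, and 'unzip' is one possible archive tool.
    """
    return [name for name in tools if shutil.which(name) is None]

if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

A tool reported as missing may still be installed in a location that is not on your PATH, so treat the result as a hint rather than a verdict.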

Hardware requirements

Because of the nature of the data being processed, we recommend that your computer has at least 200 MB of RAM per core. It is still possible to run CrocoBLAST on big data files if less memory is available, but you will need to specify this during job submission. Furthermore, if you need to analyze NGS data, the input and output files involved in such calculations can be quite large, so you will need sufficient free space on your hard disk. NCBI databases range from 10 MB to 500 GB (whole genomes). Depending on the type of sequencing experiment you ran, your input files may range from a few kB to 100 GB. If you don't specify any BLAST options, the size of the output file may be up to 1300 times the size of the input file. In the typical use case, however, requesting relevant BLAST options (e.g., reporting only the first 20 hits) will greatly reduce the size of the output file.
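
The figures above translate into a rough back-of-the-envelope estimate. The sketch below is only an illustration of that arithmetic, not a CrocoBLAST feature: the function names are hypothetical, and the disk estimate assumes the worst case (no BLAST options, output 1300 times the input) with the database, input, and output all stored on the same disk.

```python
RAM_PER_CORE_MB = 200      # recommended minimum stated above
MAX_OUTPUT_FACTOR = 1300   # worst-case output/input size ratio stated above

def recommended_ram_mb(cores):
    """Recommended RAM in MB for a machine using `cores` cores."""
    return cores * RAM_PER_CORE_MB

def worst_case_disk_gb(input_gb, database_gb, output_factor=MAX_OUTPUT_FACTOR):
    """Upper bound on disk space in GB: database + input + largest output."""
    return database_gb + input_gb + input_gb * output_factor

if __name__ == "__main__":
    print(recommended_ram_mb(8))        # 8 cores * 200 MB = 1600
    print(worst_case_disk_gb(1, 10))    # 10 + 1 + 1300 = 1311
```

For example, an 8-core machine would want about 1600 MB of RAM, and a 1 GB input searched against a 10 GB database could need up to 1311 GB of disk in the worst case. Passing a smaller output_factor models the effect of restricting the output through BLAST options; the actual ratio depends on your data and the options chosen.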


Limitations

As mentioned above, all results are provided in the typical BLAST output format, which is a text file. CrocoBLAST does not currently offer facilities for graphical visualization and analysis of the BLAST results, partly because it targets big data, and partly because many great tools are already available for such purposes. We recommend that you obtain the alignments using CrocoBLAST, and then select the best alignments to view and analyze in specialized software (e.g., MEGAN).

CrocoBLAST does not currently parallelize the BLAST calculation over a network. This may be addressed in a future version of CrocoBLAST, once we have gathered sufficient information regarding the most common use cases for network-distributed calculations. Your feedback is greatly valued.

Finally, while CrocoBLAST will run on most versions of Windows (XP or newer) and Linux, it will not run on OS X. This is unlikely to change in the immediate future, but do check back with us just in case.

Troubleshooting

CrocoBLAST typically checks that the necessary files and permissions exist before starting the demanding BLAST calculation. Furthermore, a specific CrocoBLAST function is available for fixing data consistency errors automatically. If you get into trouble while trying to run CrocoBLAST, please check the error messages, which are quite informative and should help you overcome the most common issues you are likely to encounter. If you experience further issues, please contact us and describe the problem in detail.
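
If you want to rule out the most basic file and permission problems yourself before contacting us, a check along the following lines may help. This is a minimal sketch of the general idea only; it is not CrocoBLAST's own check, and the function name preflight_problems and its two parameters are hypothetical.

```python
import os

def preflight_problems(query_path, output_dir):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    # The input file must exist and be readable.
    if not os.path.isfile(query_path):
        problems.append(f"input file not found: {query_path}")
    elif not os.access(query_path, os.R_OK):
        problems.append(f"input file is not readable: {query_path}")
    # The output directory must exist and be writable.
    if not os.path.isdir(output_dir):
        problems.append(f"output directory not found: {output_dir}")
    elif not os.access(output_dir, os.W_OK):
        problems.append(f"output directory is not writable: {output_dir}")
    return problems
```

Calling preflight_problems with the path of your input file and the directory where results should be written returns an empty list when both look usable. It does not check free disk space or the BLAST database itself.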

