
CrocoBLAST:Job management

1,377 bytes added, 04:13, 25 July 2016
=Create BLAST jobs=
As already mentioned, BLAST takes an input file with unknown sequences and aligns each such sequence against a database of known sequences. To create a job, you must first specify the BLAST program you plan to use, which depends on the nature of the unknown sequences in your input file and the nature of the sequences in the reference database. Then, you need to specify the name of the ''database'' listed in the CrocoBLAST index (more details on that below) that contains the reference sequences you wish to use. Finally, provide the input file and the location where you want CrocoBLAST to place the output files. Keep in mind that the output files may be quite large. In addition, if you want to change the default BLAST settings, you can do so by specifying the names and values of the BLAST options of interest.
Note that, when you create a BLAST job, CrocoBLAST automatically assigns it a unique job ID and updates the CrocoBLAST queue (more on this later).
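
As a minimal sketch of how the four required parts of a job fit together, the helper below assembles a submission command using the <code>-add_to_queue</code> syntax listed under Administration on this page. The function name <code>make_job_command</code> and all example values (<code>blastn</code>, <code>my_nucleotide_db</code>, <code>reads.fasta</code>, <code>results/</code>) are hypothetical placeholders, not names taken from CrocoBLAST itself. The script only prints the command, so you can review it before running it.

```shell
#!/bin/sh
# Sketch: assemble a CrocoBLAST job submission from its four required parts.
# The helper name and every example value are hypothetical placeholders.

make_job_command() {
    if [ "$#" -ne 4 ]; then
        echo "usage: make_job_command blast_program database input_file output_folder" >&2
        return 1
    fi
    blast_program=$1   # depends on the sequence types of the input and the database
    database=$2        # name of the database as listed in the CrocoBLAST index
    input_file=$3      # file containing the unknown sequences
    output_folder=$4   # where CrocoBLAST should place the (possibly large) output

    printf 'CrocoBLAST -add_to_queue %s %s %s %s\n' \
        "$blast_program" "$database" "$input_file" "$output_folder"
}

# Print the command for review instead of executing it.
make_job_command blastn my_nucleotide_db reads.fasta results/
```

Printing rather than executing is a deliberate choice here: it lets you check the program, database name, and paths before the job is assigned an ID and placed in the queue.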
=Manage databases=
To submit a BLAST job, you must specify which database you wish to align against. The first time you indicate a database for a BLAST job, CrocoBLAST will remember it and add it to its index, so that in the future it is easier for you to access this database. You can see which databases are already indexed in CrocoBLAST:
If you want to remove a database from the CrocoBLAST index (for example, because it has become obsolete), you need to specify the type of '''sequences''' it holds and the name of the database:

<code>CrocoBLAST -remove_database '''nucleotide''' <span style="color:blue">database_name</span><br>CrocoBLAST -remove_database '''protein''' <span style="color:blue">database_name</span></code>

There are two ways to add a new database to the CrocoBLAST index. In both cases, you should provide a simple name for each new database, so that you can later refer to it easily whenever you need to run a BLAST job.
==Retrieve database from NCBI servers==
<code>CrocoBLAST -add_database --sequence_file '''protein''' <span style="color:green">fastq_file</span> <span style="color:orange">database_name</span> <span style="color:green">output_folder</span></code>
=Manage CrocoBLAST queue=
The efficiency of CrocoBLAST lies in its ability to parallelize the execution of your BLAST jobs. It does this by breaking each big calculation into smaller pieces and then organizing the execution of the pieces. Having smaller pieces means that you need less memory to run each piece, and if you can analyze several pieces at once, you reduce the total calculation time. CrocoBLAST takes care of these things for you.
Checking the state of the queue provides information about which jobs are queued, with full details regarding the job ID and BLAST setup, as well as a description of the progress of the alignment. The progress of each job is described in three stages: fragmentation of the input file, alignment, and assembly of results.

==Administration==
If you want to change anything about the queue (say, pause one job and start another, or change the order of the jobs in the queue), you first need to pause or stop the current run. Subsequently, you may perform operations like adding, removing, or reordering jobs in the queue:

<code>CrocoBLAST -add_to_queue <span style="color:blue">blast_program database</span> <span style="color:green">input_file output_folder</span><br>CrocoBLAST -remove_from_queue <span style="color:blue">job_id</span><br>CrocoBLAST -remove_from_queue <span style="color:blue">job_id_1 job_id_2 ...</span><br>CrocoBLAST -move_top_queue <span style="color:blue">job_id</span><br>CrocoBLAST -move_top_queue <span style="color:blue">job_id_pos_1 job_id_pos_2 ...</span></code>

Note that, once a job is added to the queue, you may perform operations on it (remove, reorder) by referring to its job ID. You can obtain the job IDs by checking the current state of the queue:

<code>CrocoBLAST -status</code>
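
The queue operations above can be combined into a short script. The following is a dry-run sketch only: the wrapper function <code>croco</code>, the <code>DRY_RUN</code> variable, and the job IDs <code>3</code>, <code>5</code>, and <code>7</code> are hypothetical, and the format of real job IDs should be taken from the output of <code>CrocoBLAST -status</code>. The script assumes the current run has already been paused or stopped, as required before changing the queue.

```shell
#!/bin/sh
# Dry-run sketch of a queue update. With DRY_RUN=1 (the default here) each
# CrocoBLAST command is printed instead of executed.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: print the command in dry-run mode, otherwise run it.
croco() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "CrocoBLAST $*"
    else
        CrocoBLAST "$@"
    fi
}

# Hypothetical job IDs; look up the real ones with `CrocoBLAST -status`.
OBSOLETE_JOBS="3 5"
URGENT_JOB="7"

croco -status                             # inspect the queue before editing it
croco -remove_from_queue $OBSOLETE_JOBS   # unquoted on purpose: one argument per ID
croco -move_top_queue "$URGENT_JOB"       # move the urgent job to the top
croco -status                             # confirm the new order
```

Setting <code>DRY_RUN=0</code> would execute the commands for real, so it is worth reviewing the printed commands first.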
