As already mentioned, BLAST takes an input file with unknown sequences and aligns each of them against a database of known sequences. Therefore, to submit a BLAST job, you must specify which database to align against. The first time you indicate a database for a BLAST job, CrocoBLAST remembers it and adds it to its index, so that the database is easier to access in the future. You can give each database a simple name, which you can then use to refer to it whenever you run a BLAST job. There are two ways to add a new database to the CrocoBLAST index.
Retrieve from NCBI
In the most typical scenario, you will use the established reference sequence databases maintained by NCBI. CrocoBLAST allows you to specify the name of such a database, and will download or update the database for you:
CrocoBLAST -add_database --ncbi_download ncbi_database_name output_folder
When adding or updating a database in this manner, you need not worry about the format of the database, as NCBI provides pre-formatted database files.
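As an illustration, the command above could be filled in as follows. This is only a sketch under stated assumptions: nt is used as an example of an NCBI database name, and the folder under the home directory is a hypothetical location; check which database names your CrocoBLAST version accepts and substitute your own values. Creating the folder beforehand is a precaution and may not be required.

```shell
# Assumed example values; replace with your own.
db_name="nt"                              # example NCBI database name (assumption)
out_dir="$HOME/blast_databases/$db_name"  # hypothetical output folder

# Create the output folder in case it does not exist yet.
mkdir -p "$out_dir"

# Same invocation as shown above, with the placeholders filled in.
# The command is only run if CrocoBLAST is available on the PATH.
if command -v CrocoBLAST >/dev/null 2>&1; then
    CrocoBLAST -add_database --ncbi_download "$db_name" "$out_dir"
else
    echo "CrocoBLAST not found on PATH; nothing was downloaded." >&2
fi
```

NCBI reference databases can be very large, so the output folder should be on a disk with plenty of free space.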