Learn how to use blastdb alias tool to combine, limit, and convert BLAST databases

sourselacur1983


How to Download and Use Blastdb Alias Tool




If you are a bioinformatics researcher or a BLAST user, you have probably needed to search several databases together, or to search a specific subset of sequences within an existing database. A convenient way to run such searches is to create a virtual BLAST database. A virtual BLAST database does not store sequences itself; it is defined by an alias file that points to one or more existing BLAST databases. This saves disk space and time, because you avoid duplicating or splitting large databases.


One tool that can help you create and manage virtual BLAST databases is blastdb_aliastool, which is part of the BLAST+ suite of command-line applications developed by the National Center for Biotechnology Information (NCBI). In this article, we will show you how to download and use blastdb_aliastool to perform various tasks related to virtual BLAST databases.







What is Blastdb Alias Tool?




A tool to create virtual BLAST databases




Blastdb_aliastool is a command-line tool that creates alias files for virtual BLAST databases. An alias file is a small text file that records the title of the virtual database and the names or paths of the underlying BLAST databases. It can also reference a list of GIs (numerical IDs) or accessions that limits the search to a subset of sequences within those databases. Alias files use the extension .nal for nucleotide databases and .pal for protein databases.
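

Because an alias file is plain text, it can be inspected without any BLAST tooling. The Python sketch below assumes the common layout of comment lines starting with # followed by keyword lines such as TITLE and DBLIST; the sample content and the parse_alias_file helper are made up for illustration and are not part of BLAST+.

```python
import shlex


def parse_alias_file(text):
    """Parse the text of a BLAST alias file (.nal or .pal) into a dict.

    Assumes one 'KEYWORD value' pair per line and '#' comment lines.
    """
    fields = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        key = parts[0].upper()
        value = parts[1] if len(parts) > 1 else ""
        if key == "DBLIST":
            # Database names may or may not be quoted; shlex handles both.
            fields[key] = shlex.split(value)
        else:
            fields[key] = value.strip()
    return fields


# Hand-written sample content, for illustration only.
example = """\
#
# Alias file written by hand for illustration
#
TITLE Nematode nucleotide database
DBLIST "ncbi_nematode" "wormbase_nematode"
"""

info = parse_alias_file(example)
print(info["TITLE"])
print(info["DBLIST"])
```

Reading the DBLIST line this way is a quick check that an alias file points at the databases you expect before you run a search against it.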


The types of tasks it can perform




Blastdb_aliastool can perform three types of tasks to assist in creating and managing virtual BLAST databases:


  • It can build an alias file to transparently combine searches of different databases. For example, you can create an alias file that combines two nematode nucleotide databases into one virtual database.



  • It can build an alias file that limits a search based on a list of GIs or accessions. For example, you can create an alias file that limits the search to a subset of RefSeq mRNAs from C. elegans within a larger nematode mRNA database.



  • It can convert a list of GIs or accessions to a more efficient binary format that improves the search performance. For example, you can convert a list of accessions to a binary format that only contains accessions in the SwissProt database.



How to Download Blastdb Alias Tool?




Downloading BLAST+ executables




Blastdb_aliastool is included in the BLAST+ suite of command-line applications, which are freely available from NCBI.


Downloading NCBI BLAST databases




If you want to use blastdb_aliastool with NCBI BLAST databases, you need to download them from NCBI. These are the same databases available via the public BLAST web service.


How to Use Blastdb Alias Tool?


Creating an alias file to combine multiple databases




To create an alias file that combines multiple databases, use the -dblist option of blastdb_aliastool, which takes a space-separated list of database names or paths (quote the list so the shell passes it as one argument). Specify the name of the alias file with the -out option and its title with the -title option, and indicate the type of the databases with -dbtype, which is either nucl for nucleotide or prot for protein. All databases in the list must be of the same type. For example, to combine two nematode nucleotide databases, ncbi_nematode and wormbase_nematode, into one virtual database called nematode, you can use the following command:


blastdb_aliastool -dblist "ncbi_nematode wormbase_nematode" -dbtype nucl -out nematode -title "Nematode nucleotide database"


This command creates one file, nematode.nal, the alias file that records the title and the list of underlying databases. No sequence data is copied, so the original database files must stay where the alias file can find them. You can then pass the virtual database name, nematode, to the -db option of any BLAST+ search application.
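

If you create alias files from a script, the same call can be assembled as an argument list. The following is a minimal sketch, assuming the alias name is passed with -out as in recent BLAST+ releases; combine_command and run_if_available are illustrative helper names, not part of BLAST+.

```python
import shlex
import shutil
import subprocess


def combine_command(databases, alias_name, title, dbtype="nucl"):
    """Build the blastdb_aliastool argument list for combining databases."""
    if dbtype not in ("nucl", "prot"):
        raise ValueError("dbtype must be 'nucl' or 'prot'")
    if not databases:
        raise ValueError("at least one database is required")
    return [
        "blastdb_aliastool",
        "-dblist", " ".join(databases),  # one space-separated argument
        "-dbtype", dbtype,
        "-out", alias_name,
        "-title", title,
    ]


def run_if_available(cmd):
    """Run the command only when the executable is on PATH."""
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)
    return True


cmd = combine_command(
    ["ncbi_nematode", "wormbase_nematode"],
    "nematode",
    "Nematode nucleotide database",
)
# Print a shell-safe rendering of the command for review.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Passing the arguments as a list avoids shell quoting problems with the multi-word -dblist and -title values. Calling run_if_available(cmd) would execute the tool when BLAST+ is installed and the named databases exist.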


Creating an alias file to limit a search by GIs or accessions




To create an alias file that limits a search by GIs or accessions, use the -gilist or -seqidlist option of blastdb_aliastool. The -gilist option takes a file that lists GIs, one per line; the -seqidlist option takes a file that lists accessions, one per line. Provide the name or path of the database that contains the sequences with the -db option, the name of the alias file with -out, its title with -title, and the database type with -dbtype (nucl or prot). For example, to limit the search to a subset of RefSeq mRNAs from C. elegans within a larger nematode mRNA database called ncbi_nematode_mrna, you can use the following command:


blastdb_aliastool -db ncbi_nematode_mrna -seqidlist c_elegans_refseq_mrna.txt -dbtype nucl -out c_elegans_refseq_mrna -title "C. elegans RefSeq mRNA subset"


This command creates one file, c_elegans_refseq_mrna.nal, the alias file that records the original database and the filtering list. You can then pass the virtual database name, c_elegans_refseq_mrna, to the -db option of any BLAST+ search application.
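

The list file itself is easy to get wrong when identifiers come from a spreadsheet or another tool. The sketch below shows one way to write a clean one-identifier-per-line file; the write_id_list helper and the identifiers ACC0001.1 and ACC0002.1 are placeholders invented for illustration, not real accessions.

```python
import os
import tempfile


def write_id_list(identifiers, path):
    """Write identifiers to path, one per line.

    Strips surrounding whitespace, drops empty entries, and removes
    duplicates while keeping the first-seen order.
    """
    seen = set()
    cleaned = []
    for item in identifiers:
        ident = item.strip()
        if not ident or ident in seen:
            continue
        seen.add(ident)
        cleaned.append(ident)
    with open(path, "w") as handle:
        for ident in cleaned:
            handle.write(ident + "\n")
    return cleaned


# Demonstration in a temporary directory with placeholder identifiers.
with tempfile.TemporaryDirectory() as workdir:
    list_path = os.path.join(workdir, "c_elegans_refseq_mrna.txt")
    cleaned = write_id_list(
        [" ACC0001.1", "", "ACC0002.1", "ACC0001.1 "], list_path
    )
    with open(list_path) as handle:
        content = handle.read()

print(cleaned)
```

The resulting file is the kind of input described above for -seqidlist; a GI list for -gilist can be prepared the same way.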


Converting a GI or accession list to binary format




To convert a GI list to binary format, use the -gi_file_in and -gi_file_out options of blastdb_aliastool: -gi_file_in takes a text file with one GI per line, and -gi_file_out names the binary file to write. Recent BLAST+ releases provide an analogous pair for accession lists, -seqid_file_in and -seqid_file_out, together with -seqid_db and -seqid_dbtype to name the database the accessions are checked against; run blastdb_aliastool -help to confirm which of these your version supports. For example, to convert a list of accessions in swissprot_accessions.txt to binary format in swissprot_accessions.bin against the swissprot database, you can use the following command:


blastdb_aliastool -seqid_file_in swissprot_accessions.txt -seqid_db swissprot -seqid_dbtype prot -seqid_file_out swissprot_accessions.bin


This command creates one file, swissprot_accessions.bin, a binary representation of the accessions in swissprot_accessions.txt that are present in the SwissProt database. This file can be passed to the -seqidlist option of blastdb_aliastool or of a BLAST+ search application.
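

When a script handles both kinds of list, it has to decide which conversion to request. The sketch below treats a list as a GI list when every entry is purely numeric and as an accession list otherwise. It assumes the -gi_file_in/-gi_file_out and -seqid_file_in/-seqid_file_out option pairs found in recent BLAST+ releases; conversion_command is an illustrative helper, so confirm the option names with blastdb_aliastool -help before relying on it.

```python
def conversion_command(identifiers, list_path, out_path, seqid_db=None):
    """Build a blastdb_aliastool call that converts an ID list to binary.

    Numeric-only lists are treated as GI lists, anything else as
    accession lists. Option names are assumed from recent BLAST+ releases.
    """
    ids = [item.strip() for item in identifiers if item.strip()]
    if not ids:
        raise ValueError("the identifier list is empty")
    if all(ident.isdigit() for ident in ids):
        return ["blastdb_aliastool",
                "-gi_file_in", list_path,
                "-gi_file_out", out_path]
    cmd = ["blastdb_aliastool",
           "-seqid_file_in", list_path,
           "-seqid_file_out", out_path]
    if seqid_db is not None:
        # Optional database to check the accessions against.
        cmd += ["-seqid_db", seqid_db]
    return cmd


# Placeholder identifiers, for illustration only.
gi_cmd = conversion_command(["12345", "67890"], "ids.txt", "ids.bin")
acc_cmd = conversion_command(
    ["ACC0001.1", "ACC0002.1"],
    "swissprot_accessions.txt",
    "swissprot_accessions.bin",
    seqid_db="swissprot",
)
print(gi_cmd)
print(acc_cmd)
```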


Advantages of Blastdb Alias Tool




Saving disk space and time




One of the main advantages of blastdb_aliastool is that it saves disk space and time by creating virtual BLAST databases. With alias files, you avoid copying or splitting large databases, which is slow and consumes a lot of disk space. Instead, the alias file, which is small and quick to create, references the original databases. Because it stores only references, you can update the underlying databases without recreating the alias file, as long as their names and locations stay the same.


Customizing searches and databases




Another advantage of blastdb_aliastool is that it lets you customize your searches and databases according to your needs. With alias files, you can combine different databases into one virtual database, which is useful for searching multiple organisms or taxonomic groups at once. You can also limit your search to a subset of sequences within a database, which is useful for searching specific genes or proteins of interest. Note that alias files only reference existing BLAST databases: to search novel or unpublished sequences, first build a database from your sequence files with makeblastdb, then reference it in an alias file.


Improving search efficiency and performance




A third advantage of blastdb_aliastool is that it can help you improve your search efficiency and performance by converting GI or accession lists to binary format. By using binary format, you can reduce the size and complexity of the filtering criteria, which can speed up the search process and reduce the memory usage. You can also ensure that your filtering criteria only contain valid GIs or accessions in the database, which can avoid errors or inconsistencies in the search results.


Conclusion




Blastdb_aliastool is a versatile tool for creating and managing virtual BLAST databases. With it, you can save disk space and time, customize your searches and databases, and improve search efficiency and performance. Blastdb_aliastool is part of the BLAST+ suite of command-line applications, which are freely available from NCBI. To learn more about blastdb_aliastool and other BLAST+ tools, consult the BLAST+ documentation from NCBI.


FAQs




What is a virtual BLAST database?




A virtual BLAST database is a collection of sequences that is not stored as its own set of files, but is defined by an alias file that points to one or more existing BLAST databases.


What is an alias file?




An alias file is a small text file that records the title of the virtual database and the names or paths of the underlying BLAST databases. It can also reference a list of GIs or accessions that limits the search to a subset of sequences within those databases.


How do I create an alias file?




You can use blastdb_aliastool to create an alias file by specifying the -dblist option with a list of database names or paths, the -out and -title options with the name and title of the alias file, and the -dbtype option with the type of the database (nucl or prot). To limit the search to a subset instead, name the source database with -db and supply a list of GIs or accessions with -gilist or -seqidlist.


How do I use an alias file?




You can use an alias file as an argument for the -db option of any BLAST+ application. For example, if you have an alias file called nematode.nal that combines two nematode nucleotide databases, you can use it as follows:


blastn -query query.fasta -db nematode


How do I convert a GI or accession list to binary format?




You can use blastdb_aliastool to convert a GI list to binary format by specifying the -gi_file_in option with a text file that contains one GI per line and the -gi_file_out option with the name of the output binary file. Recent BLAST+ releases offer the analogous -seqid_file_in and -seqid_file_out options for accession lists; check blastdb_aliastool -help for your version.


 
 
 
