BLAST Parser
About
This is a Perl script that parses BLAST output and converts it into a more compact form (example below). The purpose of this conversion is to save storage space and enable faster downstream analysis; however, all alignments are lost.
The parser was tested with the default pairwise output format (-outfmt 0) of blastn, blastx and tblastx from the BLAST+ 2.2.25 package, and should probably work with other versions. Please report any issues or incompatibilities.
The space saving from using this parser varies greatly depending on the nature of the search (average alignment length, number of hits per database sequence, database sequence lengths, etc.). Typically, about a 5-fold reduction in output size can be expected, although in the worst cases it is only 3-fold (e.g., a blastx search against the NR database). With blastn searches the saving is often 10-fold or better.
Note: For most uses BLAST Compressor is a better choice, as it has a more complete parser and better compression.
News
2015-11-19 – Version 1.1.6. Improved compatibility with recent BLAST versions.
2012-06-27 – Version 1.1.5. Improved compatibility with BLAST Compressor (with decompressed output).
2012-06-06 – Version 1.1.4. Improved compatibility with BLASTP and with ancient versions of BLAST. Added reporting the total number of hits.
2012-06-06 – Version 1.1.3. Improved compatibility with misformatted BLAST output.
2012-01-12 – Version 1.1.2 uploaded. Minor update, cosmetic changes.
Download
(Distributed under the zlib/libpng license, see the source file for details)
Usage
perl blast_parser.pl <blastoutput.txt >blastparsed.txt
You can also parse the BLAST output on the fly as it is generated, saving an enormous amount of disk space (but losing all alignments). Just append
| perl blast_parser.pl >blastparsed.txt
to the end of your search command instead of specifying the output file.
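As a rough sketch of the on-the-fly usage described above, a complete command might look like the following. The query file name (query.fasta), database name (nt) and output file name are placeholders chosen for illustration, not values from this page. The guard around the pipeline is also an addition, included so that the snippet does nothing harmful when blastn or the parser script is not present.

```shell
# Placeholder names for illustration only; adjust to your own setup.
QUERY=query.fasta
DB=nt
OUT=blastparsed.txt

# Run the search only if blastn and the parser script are available.
if command -v blastn >/dev/null 2>&1 && [ -f blast_parser.pl ]; then
    # -outfmt 0 is the default pairwise format the parser was tested with.
    # No -out option is given, so blastn writes to standard output,
    # which is piped straight into the parser.
    blastn -query "$QUERY" -db "$DB" -outfmt 0 | perl blast_parser.pl >"$OUT"
else
    echo "blastn or blast_parser.pl not found; skipping search" >&2
fi
```

Because the raw BLAST output never touches the disk, only the compact parsed file is stored.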
Note: BLAST Parser is designed to work with local BLAST search output. Please don't try to use it on HTML output of online searches.