So MEGAN can be a bit annoying at times, as can the BLAST outputs it requires. Because I’m particularly interested in the alignments produced, and want to create a consensus sequence for designing primers, I need to rerun the blast I did the other day.
This time, I’ve already identified that the top hit is sufficient for me to make decisions (and culled any sequences that didn’t have an NCBI hit at any stage). So I am able to use -num_descriptions 1 and -num_alignments 1 to keep only the top hit.
This works for -m values of 0-4. If you are using the local BLAST+ programs (as I am), -m is not an option you can use; its replacement is -outfmt. As I am putting the data into MEGAN, I just use the default. Tabular output (-m 8 and -m 9 in the old blastall, -outfmt 6 and 7 in BLAST+) can also be very useful, but does not retain the alignment data.
/ncbi-blast-2.2.31+/bin/blastn -query PP.blastn.all.fasta -db /ncbi-blast-2.2.31+/db/flavivirus-nt-custom-db/flavivirus.nogaps.noempty.db -out PP.blastn.all.fasta.blast -num_threads 2 -num_descriptions 1 -num_alignments 1
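If the tabular output is wanted as well, a minimal sketch of the equivalent BLAST+ call might look like the following. The paths are copied from the command above; the .tab output name and the tabular_cmd helper are my own placeholders. Because -num_descriptions and -num_alignments only apply to formats 0-4, this assumes -max_target_seqs 1 as the nearest equivalent, which keeps one target per query and is not guaranteed to be identical to the top hit shown in the default report. The script only prints the command so it can be checked before running.

```shell
#!/bin/sh
# Paths taken from the blastn command in the post; adjust to your install.
BLASTN="/ncbi-blast-2.2.31+/bin/blastn"
DB="/ncbi-blast-2.2.31+/db/flavivirus-nt-custom-db/flavivirus.nogaps.noempty.db"

# Print (not run) a blastn command that writes tabular output.
# -outfmt 6 is the plain tabular format in BLAST+.
# -max_target_seqs 1 keeps one target sequence per query (assumption:
# used here in place of -num_descriptions/-num_alignments).
# The ".tab" suffix is a placeholder output name.
tabular_cmd() {
    query="$1"
    echo "$BLASTN -query $query -db $DB -out $query.tab -outfmt 6 -max_target_seqs 1 -num_threads 2"
}

tabular_cmd PP.blastn.all.fasta
```

Once the printed line looks right, it can be pasted into the terminal or piped into sh.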
In this case, I have so far decided that I am more interested in comparing the three areas than in the area as a whole (any more, and I’d be giving away my paper!). It also means each blast run will be a little shorter, because several small files are quicker to get through than one big one.
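Running the same command once per area is easy to script. This is only a sketch: it assumes each area has a query file named like the PP one (PREFIX.blastn.all.fasta), and the area prefixes are passed in as arguments rather than hard-coded, since PP is the only one named here. The blastn_cmd helper is hypothetical, and the script prints the commands instead of executing them.

```shell
#!/bin/sh
# Paths taken from the blastn command in the post; adjust to your install.
BLAST_BIN="/ncbi-blast-2.2.31+/bin"
DB="/ncbi-blast-2.2.31+/db/flavivirus-nt-custom-db/flavivirus.nogaps.noempty.db"

# Print the top-hit-only blastn command for one query file.
blastn_cmd() {
    query="$1"
    echo "$BLAST_BIN/blastn -query $query -db $DB -out $query.blast -num_threads 2 -num_descriptions 1 -num_alignments 1"
}

# One command per area prefix given on the command line.
# Assumes the naming pattern PREFIX.blastn.all.fasta for every area.
for area in "$@"; do
    blastn_cmd "$area.blastn.all.fasta"
done
```

Saved as, say, run_areas.sh (a made-up name), it would be called with the area prefixes as arguments, and the output piped into sh once the printed commands look correct.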
Unfortunately, I also needed to run a blastx on my data, and so I’ll need to redo that one as well. It should be quicker, because I am only using sequences I have already identified as interesting. Then again, blastx is always extremely slow. The command I am using there is
/ncbi-blast-2.2.31+/bin/blastx -query PP.blastx.all.fasta -db /ncbi-blast-2.2.31+/db/flavivirus-aminoacid-custom-db/flavivirus.protein.db -out PP.blastx.all.fasta.blast -num_threads 2 -num_descriptions 1 -num_alignments 1
A note on this is that MEGAN will let you import more than one blast output file at a time – but they have to be in the same directory. I haven’t actually been able to test this yet, because I want to know the differences between the files. I’ll test this current lot out anyway…
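Since the files have to sit in the same directory, something like the following could gather them up first. It is a rough sketch that assumes the outputs all end in .blast, as mine do; the function name and both directory arguments are placeholders. The destination should be outside the source tree, and files with the same name in different subdirectories would overwrite each other.

```shell
#!/bin/sh
# Copy every *.blast file found under a source directory into a single
# destination directory, ready for a multi-file import.
# Usage: collect_blast_outputs SOURCE_DIR DEST_DIR
collect_blast_outputs() {
    src="$1"
    dest="$2"
    mkdir -p "$dest"
    # -type f skips directories; only names ending in .blast are copied.
    find "$src" -type f -name '*.blast' -exec cp {} "$dest"/ \;
}
```

For example, collect_blast_outputs ./results ./megan_input would leave copies of all the .blast files in ./megan_input without touching the originals.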