{"id":400,"date":"2015-11-16T02:08:17","date_gmt":"2015-11-16T02:08:17","guid":{"rendered":"http:\/\/www.abyteofcommonsense.com\/?p=400"},"modified":"2015-11-16T02:08:17","modified_gmt":"2015-11-16T02:08:17","slug":"local-ncbi-blast-problems-some-solutions-for-the-unexperienced-bioinformatician","status":"publish","type":"post","link":"http:\/\/www.abyteofcommonsense.com\/?p=400","title":{"rendered":"Local NCBI BLAST problems? Some solutions for the unexperienced bioinformatician"},"content":{"rendered":"<p>Is local blast driving you nuts? Blast is a super powerful tool if you download it onto your own computer because you can blast more than one sequence at once, and not have to worry about the server dying on you. Having first set it up with the benefit of a senior bioinformatics miracle worker&#8217;s experience (2012), I needed to go through the process again with the new Blast+. Here are some problems I encountered, along with fixes that might help you deal with them too.<\/p>\n<p>&nbsp;<\/p>\n<p>My Blast seems to be incredibly picky about where things are. In the end, I just specified where EVERYTHING was with a full, hard-coded path.<\/p>\n<p><strong>\/Users\/JohnSmith\/ncbi-blast-2.2.31+\/bin\/blastn -query .\/yourfile.fasta -db \/Users\/JohnSmith\/ncbi-blast-2.2.31+\/db\/nt -out .\/yourfile.fasta.blast.txt -num_threads 2<\/strong><\/p>\n<p>That bit of text on the end there, &#8216;<strong>-num_threads<\/strong>&#8217;, may work for you, or it might not. It can significantly speed up your Blast run, but it will also majorly slow down your computer. I&#8217;m using a machine with 8 cores, and running two Blast jobs with &#8216;-num_threads 2&#8217; each slows it to a standstill. Even writing this post is taxing things. It is said to be quicker if you split your .fasta file into several smaller files and run each one on a single thread. 
Or if you are like me, you can just let it run over the weekend.<\/p>\n<p>&nbsp;<\/p>\n<p><strong>Warning: lcl|Query_20167 contig20167: Warning: Could not calculate ungapped Karlin-Altschul parameters due to an invalid query sequence or its translation. Please verify the query sequence(s) and\/or filtering options.<\/strong><\/p>\n<p>Make sure that you are using the appropriate type of database for your content. If your .fasta sequences are nucleotides, make sure you are using a nucleotide database such as nt.<\/p>\n<p>Occasionally, Blast threw up this error for me because there were empty lines (shock, horror) in the sequence .fasta file I presented it with. To get around this, I just removed the empty lines using this command<\/p>\n<p><strong>grep . yourfile.fasta &gt; yourfile.nospaces.fasta<\/strong><\/p>\n<p>and voila! It was fixed. Maybe not the most professional of work-arounds, but it worked for me.<\/p>\n<p>Finally, if it keeps giving you that error, don&#8217;t despair! It is only a warning, so you can just ignore it and go check the sequences it names later. Don&#8217;t exit your Blast session because of it, otherwise you&#8217;ll have to start all over again. Even if it looks like Blast is doing nothing, trust me, it is still working.<\/p>\n<p>&nbsp;<\/p>\n<p>Are you tempted to copy and paste my instructions? Paste them into a plain-text editor such as Notepad or TextWrangler first, because otherwise you could carry over formatting marks that cause yet more errors for you. Alternatively, type them out by hand, but you&#8217;ll probably get sick of that quickly. I keep a notepad open with all the commands I am currently using for fast reference.<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Is local blast driving you nuts? 
Blast is a super powerful tool if you download it onto your own computer because you can blast more&#8230;<\/p>\n<div class=\"more-link-wrapper\"><a class=\"more-link\" href=\"http:\/\/www.abyteofcommonsense.com\/?p=400\">Continue reading<span class=\"screen-reader-text\">Local NCBI BLAST problems? Some solutions for the unexperienced bioinformatician<\/span><\/a><\/div>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","jetpack_publicize_message":"","jetpack_is_tweetstorm":false},"categories":[69],"tags":[70,72,71],"jetpack_featured_media_url":"","jetpack_publicize_connections":[],"jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p6nzXS-6s","jetpack-related-posts":[{"id":465,"url":"http:\/\/www.abyteofcommonsense.com\/?p=465","url_meta":{"origin":400,"position":0},"title":"Blast(ed) MEGAN Round 2 - or what to do when you're running yet another blast","date":"April 5, 2016","format":false,"excerpt":"So MEGAN can be a bit annoying at times. As can Blast outputs required by MEGAN. Because I'm particularly interested in the alignments produced, and want to create a consensus sequence for designing primers, I need to run blast again from what I did the other day. This time, I've\u2026","rel":"","context":"In &quot;Bioinformatics&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":410,"url":"http:\/\/www.abyteofcommonsense.com\/?p=410","url_meta":{"origin":400,"position":1},"title":"Creating a custom local Blast database from .fasta","date":"November 23, 2015","format":false,"excerpt":"All these commands are pretty simple to use, but I couldn't find really straight forward answers for why the hell I was getting errors, so here is my quick guide. I hope you find it useful. 
When you are creating a custom NCBI blast database to use, there's a couple\u2026","rel":"","context":"In &quot;Bioinformatics&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":412,"url":"http:\/\/www.abyteofcommonsense.com\/?p=412","url_meta":{"origin":400,"position":2},"title":"Using the mac OSX command line","date":"November 30, 2015","format":false,"excerpt":"The most sensible way to set up your working environment in the Mac command line when you want to do the same thing in multiple folders is to make sure you have labelled everything in the same way.\u00a0A handy hint here is that if you have terminal open, and you\u2026","rel":"","context":"In &quot;Bioinformatics&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":403,"url":"http:\/\/www.abyteofcommonsense.com\/?p=403","url_meta":{"origin":400,"position":3},"title":"How to deal with Blast output and Fasta to get what you need for MEGAN (using Qiime)","date":"November 18, 2015","format":false,"excerpt":"So... You have some sequence data from Illumina sequencing.\u00a0These are the two sets of files I have, one is forward reads, and the other is reverse. PP2_S1_L001_R2_001.fastq.gz PP2_S1_L001_R1_001.fastq.gz ... it's in fastaq.gz format. You want to upzip it, simple right? You want to use this command: gunzip --keep PP2_S1_L001_R2_001.fastq.gz But\u2026","rel":"","context":"In &quot;Bioinformatics&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":414,"url":"http:\/\/www.abyteofcommonsense.com\/?p=414","url_meta":{"origin":400,"position":4},"title":"How to monitor CPU usage on your MAC","date":"January 6, 2017","format":false,"excerpt":"top -F -R -o cpu \u00a0Type that in the Terminal and you\u2019ll get a more efficient usage of top that uses less CPU itself, thanks to the flags. Here\u2019s an explanation of the flags: -F Do not calculate statistics on shared libraries, also known as frameworks. 
-R Do not traverse\u2026","rel":"","context":"In &quot;Bioinformatics&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":457,"url":"http:\/\/www.abyteofcommonsense.com\/?p=457","url_meta":{"origin":400,"position":5},"title":"Converting Gene Names to GO IDs or GO terms","date":"January 14, 2016","format":false,"excerpt":"This is to prevent frustration when doing a beginner's task of annotating genes with GO IDs, or Gene Ontologies. This is useful to visualise large datasets of genes. First, convert\u00a0your gene names to a format recognised by UniProtKB. The tool you can use is DAVID. Don't try and use the\u2026","rel":"","context":"In &quot;Bioinformatics&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"http:\/\/www.abyteofcommonsense.com\/index.php?rest_route=\/wp\/v2\/posts\/400"}],"collection":[{"href":"http:\/\/www.abyteofcommonsense.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.abyteofcommonsense.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.abyteofcommonsense.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/www.abyteofcommonsense.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=400"}],"version-history":[{"count":2,"href":"http:\/\/www.abyteofcommonsense.com\/index.php?rest_route=\/wp\/v2\/posts\/400\/revisions"}],"predecessor-version":[{"id":402,"href":"http:\/\/www.abyteofcommonsense.com\/index.php?rest_route=\/wp\/v2\/posts\/400\/revisions\/402"}],"wp:attachment":[{"href":"http:\/\/www.abyteofcommonsense.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=400"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.abyteofcommonsense.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=400"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.abyteofcommonsense.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=400"}]
,"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}