Word by word in Terminal
One of the annoying things about the OS X Terminal is that moving through a line of text word by word requires typing ‘esc-b’ and ‘esc-f’. This post on macromates, the TextMate blog, explains how...
Installing ABySS
To install ABySS on a system running an older version of gcc, use the following commands. > ./configure --enable-maxk=96 --disable-openmp \ CPPFLAGS=-I<path to google-sparsehash...
Bowtie2 output as BAM
Bowtie2 is a short-read aligner optimized for reads of 50 bp or longer. I’ve been playing around with it and was initially puzzled by the fact that it only outputs...
Python: Interleave Paired-End Reads
Here’s a simple script for interleaving paired-end FASTQ files. You’ll need to do this if you want to create input files for Velvet. It requires Python 2.7. #!/usr/bin/env python # encoding: utf-8...
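The script itself is cut off in this excerpt. As a rough illustration of what interleaving means (not the original script), here is a minimal Python 3 sketch. It assumes plain four-line FASTQ records and that both files list the pairs in the same order; the names `read_fastq` and `interleave` are made up for this example.

```python
def read_fastq(handle):
    """Yield one FASTQ record at a time as a tuple of four stripped lines."""
    while True:
        record = [handle.readline() for _ in range(4)]
        if not record[0]:
            return  # end of file
        yield tuple(line.rstrip("\n") for line in record)


def interleave(handle1, handle2, out):
    """Write read 1, then its mate from read 2, for every pair."""
    # zip() stops at the shorter input, so unpaired trailing reads are dropped.
    for rec1, rec2 in zip(read_fastq(handle1), read_fastq(handle2)):
        out.write("\n".join(rec1) + "\n")
        out.write("\n".join(rec2) + "\n")
```

To use it, open the two read files and an output file with `open()` and pass the three handles to `interleave`.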
Python: Multiprocessing large files
I’ve been working with a lot of very large files, and it has become increasingly obvious that using a single processor core is a major bottleneck to getting my data processed in a timely fashion. A...
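The approach taken in the full post is not visible in this excerpt. One common way to spread a large line-oriented file across cores is to cut it into byte ranges that end on line boundaries and hand each range to a worker. The Python 3 sketch below does that with the standard `multiprocessing` module, using line counting as a stand-in for real work; `chunk_offsets`, `count_lines`, and `parallel_line_count` are hypothetical names for this example.

```python
import multiprocessing
import os


def chunk_offsets(path, n_chunks):
    """Split a file into about n_chunks (start, end) byte ranges ending on line boundaries."""
    size = os.path.getsize(path)
    step = max(1, size // n_chunks)
    offsets = []
    with open(path, "rb") as handle:
        start = 0
        while start < size:
            handle.seek(min(start + step, size))
            handle.readline()  # move forward to the end of the current line
            end = handle.tell()
            offsets.append((start, end))
            start = end
    return offsets


def count_lines(task):
    """Worker: count the lines inside one byte range (placeholder for real processing)."""
    path, start, end = task
    count = 0
    with open(path, "rb") as handle:
        handle.seek(start)
        while handle.tell() < end:
            handle.readline()
            count += 1
    return count


def parallel_line_count(path, n_workers=4):
    """Process every chunk in a pool of worker processes and combine the results."""
    tasks = [(path, start, end) for start, end in chunk_offsets(path, n_workers)]
    with multiprocessing.Pool(n_workers) as pool:
        return sum(pool.map(count_lines, tasks))
```

On platforms that start workers by spawning a fresh interpreter, call `parallel_line_count` from inside an `if __name__ == "__main__":` block.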
Python: Adding Read Group (@RG) tags to BAM or SAM files
The SAM specification now requires @RG tags to be included in all SAM/BAM alignments. If you are using GATK you have probably noticed that it will not run without them. Since @RG tags weren’t standard...
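The script from the full post is not shown in this excerpt. To illustrate what adding a read group involves, here is a minimal Python 3 sketch for text SAM input only: it adds one `@RG` header line and appends an `RG:Z:` tag to every alignment. It assumes no alignment already carries an `RG` tag, and it writes only the `ID` and `SM` fields; downstream tools may expect more fields than that. The function name `add_read_group` is hypothetical.

```python
def add_read_group(lines, rg_id, sample):
    """Yield SAM lines with an @RG header added and RG:Z:<rg_id> on every alignment."""
    rg_header = "@RG\tID:{}\tSM:{}".format(rg_id, sample)
    header_written = False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("@"):
            yield line  # existing header line, passed through unchanged
            continue
        if not header_written:
            # First alignment reached: emit @RG as the last header line.
            yield rg_header
            header_written = True
        yield line + "\tRG:Z:" + rg_id
    if not header_written:
        yield rg_header  # header-only input
```

BAM files are binary, so they would need to be converted to SAM text first (or handled with a BAM-aware library) before a line-based approach like this applies.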
Filtering contigs/chromosomes from a multi-fasta file
A colleague needed to remove some individual FASTA records from a multi-fasta file. Googling didn’t reveal a canned way to do it, so I hacked up this script. 8.29.12 – As Jason Gallant pointed out, if your...
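The script is not included in this excerpt. A minimal Python 3 sketch of the general idea, assuming each record's ID is the first whitespace-delimited token after `>` and that the IDs to drop are supplied as a set, might look like this (`filter_fasta` is a hypothetical name, not the original script):

```python
def filter_fasta(lines, exclude):
    """Yield the lines of every FASTA record whose ID is not in `exclude`."""
    keep = True
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            keep = seq_id not in exclude
        if keep:
            yield line
```

Because it works on any iterable of lines, an open file handle can be passed in directly and the result written out with `writelines()`.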
Designing qPCR primers from just a GTF/GFF file and a genome sequence
I recently had to design qPCR primers for some genes. I had a genome and an annotated GTF file derived from Cufflinks. Since I wanted the primers to span introns, to prevent the amplification of...
Getting started with Ultra Conserved Elements
From Faircloth et al 2012. If you attended Evolution 2013, you probably heard quite a lot of chatter about ultra conserved elements. Essentially, ultra conserved elements (UCEs) are parts of the genome...
Sorting BED files with headers.
I’m trying to keep most of my genome stats in BED format. This makes it pretty easy to mix and match analyses. However, one task that seems to come up over and over is how to sort the files when they...
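The solution from the full post is cut off here. One way to handle it, sketched in Python 3 under the assumption that header lines start with `#`, `track`, or `browser` and that records are tab-delimited, is to set the header lines aside, sort the remaining records by chromosome name and numeric start, and put the header back on top. The name `sort_bed` is hypothetical.

```python
def sort_bed(lines):
    """Return header lines in their original order, followed by sorted BED records."""
    header, records = [], []
    for line in lines:
        if line.startswith(("#", "track", "browser")):
            header.append(line)
        elif line.strip():  # skip blank lines
            records.append(line)
    # Chromosome names sort as text, start positions sort as numbers.
    records.sort(key=lambda rec: (rec.split("\t")[0], int(rec.split("\t")[1])))
    return header + records
```

Chromosome names are compared as plain strings here, so `chr10` sorts before `chr2`; a natural-order key would be needed if that matters for the analysis.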