Gene sequencing has become a routine and cost-effective research method in many life sciences fields. However, the dominant sequencing technology today generates millions of short sequences, each consisting of 100-300 bases (the building blocks of DNA). These short “reads” have to be assembled in the right order to make sense of the data, much as a box of jigsaw pieces yields a picture only when the pieces are put together correctly. Longer reads are possible, but they have higher error rates and more “noise,” leaving researchers to choose between short, high-quality reads that leave gaps and long but noisy data.
Dr. Inanc Birol of the British Columbia Cancer Agency is a world leader in genome assembly. He now proposes to develop specialized software to assemble and analyze long sequence reads quickly, accurately, and efficiently. The new tools, which will be freely available online, will allow teams around the world to make faster progress on diverse projects.