Russell's Blog

New. Improved. Stays crunchy in milk.

Fun with de Bruijn graphs

Posted by Russell on October 29, 2010 at 4:34 a.m.
One of the projects I'm working on right now involves searching for better approaches to assembling short-read metagenomic data. Many of the popular short-read assembly algorithms rely on a mathematical object called a de Bruijn graph. I wanted to play around with these things without having to rummage around in the guts of a real assembler. Real assemblers have to be designed with speed and memory conservation in mind, or at least they ought to be. So, I decided to write my own. My implementation is written in pure Python, so it's probably not going to win any points for speed (I may add some optimization later). However, it is pretty useful if all you want is to tinker around with de Bruijn graphs.

Anyway, here is the de Bruijn graph for the sequence gggctagcgtttaagttcga projected into 4-mer space:
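For anyone who wants to follow along, here is a minimal sketch of the construction in plain Python. It is not the code behind the figure. It assumes one common convention, where every k-mer is a node and an edge joins each k-mer to the one that follows it in the sequence; the function names `kmers` and `de_bruijn` are made up for this illustration.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every overlapping k-mer in seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def de_bruijn(seq, k):
    """Map each k-mer to the set of k-mers that directly follow it in seq.

    Assumed convention: nodes are k-mers, and an edge joins two k-mers
    that sit next to each other in the sequence (they overlap by k-1 bases).
    """
    graph = defaultdict(set)
    words = list(kmers(seq, k))
    for left, right in zip(words, words[1:]):
        graph[left].add(right)
    if words:
        # Touch the last k-mer so it shows up as a node with no successors.
        graph[words[-1]]
    return dict(graph)


graph = de_bruijn("gggctagcgtttaagttcga", 4)
for node in sorted(graph):
    print(node, "->", ", ".join(sorted(graph[node])) or "(end)")
```

The sequence is 20 bases long, so it contains 17 overlapping 4-mers. None of them repeats, which means this sketch produces a simple chain of 17 nodes.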

This is the de Bruijn graph in 32-mer space for a longer sequence (it happens to be a 16S rRNA sequence for a newly discovered, soon-to-be-announced species of Archaea).

It looks like a big scribble because it's folded up to fit into the viewing box. Topologically, it's actually just two long strands: one for the forward sequence, and one for its reverse complement. There are only four termini, and if you follow them around the scribble, you won't find any branching.
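The "two strands, four termini, no branching" claim is easy to check in code. The sketch below is a rough illustration, not the implementation used for the figure. It assumes that both the sequence and its reverse complement are fed into the same graph, that a terminus is a node with exactly one neighbor, and that a branch is a node with more than one successor or more than one predecessor. The helper names are hypothetical, and since the 16S sequence is not given here, the short sequence from the first example stands in with k = 12.

```python
from collections import defaultdict

# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def two_strand_graph(seq, k):
    """Build successor and predecessor maps over both strands of seq."""
    successors = defaultdict(set)
    predecessors = defaultdict(set)
    for strand in (seq, reverse_complement(seq)):
        for i in range(len(strand) - k):
            left = strand[i:i + k]
            right = strand[i + 1:i + k + 1]
            successors[left].add(right)
            predecessors[right].add(left)
    nodes = set(successors) | set(predecessors)
    return nodes, successors, predecessors


def summarize(seq, k):
    """Return (branching nodes, termini) for the two-strand graph."""
    nodes, succ, pred = two_strand_graph(seq, k)
    branches = sorted(
        n for n in nodes
        if len(succ.get(n, ())) > 1 or len(pred.get(n, ())) > 1
    )
    termini = sorted(
        n for n in nodes
        if len(succ.get(n, ())) + len(pred.get(n, ())) == 1
    )
    return branches, termini


branches, termini = summarize("gggctagcgtttaagttcga", 12)
print("branching nodes:", branches)
print("termini:", termini)
```

With k = 12 the stand-in sequence gives two separate strands with four termini and no branching nodes. Drop k to 4 and the picture changes: the 4-mer ttaa is its own reverse complement, so both strands pass through it and it becomes a branch point. Long k-mers are much less likely to collide like this.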

