Monday, September 5, 2011

ViSpA: haplotype reconstruction

Inferring viral quasispecies spectra from 454 pyrosequencing reads

Astrovskaya et al., 2011, BMC Bioinformatics.

The authors propose a graph-based method for reconstructing viral haplotypes from deep sequencing data. The introduction lists several other software tools for the job and gives a coherent argument for why viral haplotype reconstruction differs from other deep sequencing problems: de novo assemblers are designed to reconstruct a single sequence, not a large number of closely related sequences; haplotype assembly methods "do not easily extend"; population phasing requires additional data; and metagenomic studies require large differences between species and do not try to reconstruct genomes.

The method finds overlapping sections of reads and assembles a minimal path through the resulting graph. The probability that two overlapping reads belong to the same haplotype is a function of the length of the overlap and the probability of a mismatch over that length.
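
To make the overlap idea concrete for myself, here is a rough Python sketch. It is not the paper's formula or code: I assume reads are already aligned to a reference (so each read is a start position plus a sequence), and I score an overlap with a plain binomial model using a made-up per-position mismatch rate. The names `overlap`, `same_haplotype_score`, `build_edges`, `p_mismatch`, `min_overlap`, and `max_mismatches` are all my own placeholders.

```python
from math import comb

def overlap(a_start, a_seq, b_start, b_seq):
    """Return (overlap length, mismatch count) for two reference-aligned reads."""
    lo = max(a_start, b_start)
    hi = min(a_start + len(a_seq), b_start + len(b_seq))
    if hi <= lo:
        return 0, 0  # the reads do not overlap
    sa = a_seq[lo - a_start: hi - a_start]
    sb = b_seq[lo - b_start: hi - b_start]
    mismatches = sum(x != y for x, y in zip(sa, sb))
    return hi - lo, mismatches

def same_haplotype_score(n, k, p_mismatch=0.01):
    """Binomial probability of exactly k mismatches in n overlapping positions.

    Assumption: mismatches come only from sequencing error, independently
    at each position, with rate p_mismatch. This is an illustrative model.
    """
    return comb(n, k) * p_mismatch ** k * (1 - p_mismatch) ** (n - k)

def build_edges(reads, min_overlap=3, max_mismatches=0, p_mismatch=0.01):
    """Connect read i to read j when j starts later and the overlap is consistent."""
    edges = {}
    for i, (si, qi) in enumerate(reads):
        for j, (sj, qj) in enumerate(reads):
            if si >= sj:
                continue  # only link a read to one that starts strictly later
            n, k = overlap(si, qi, sj, qj)
            if n >= min_overlap and k <= max_mismatches:
                edges[(i, j)] = same_haplotype_score(n, k, p_mismatch)
    return edges

# Toy reads as (start position, sequence); reads 1 and 2 differ at one base.
reads = [(0, "ACGTAC"), (3, "TACGGA"), (3, "TTCGGA")]
edges = build_edges(reads)
```

With these toy reads, read 0 links to read 1 (three matching overlap positions) but not to read 2, which mismatches once in the overlap. Longer clean overlaps give stronger evidence than short ones, which is the intuition I take from the paper; finding paths through the resulting graph is a separate step not sketched here.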

They do not provide read error correction.

I don't see a clear advantage of their approach over Shorah until the point where they use Shorah to correct their reads. Then their algorithm shows large improvements.

On the other hand, I don't know how it will compare to the "global constraint" version of Shorah.
