Inferring viral quasispecies spectra from 454 pyrosequencing reads
Astrovskaya et al., 2011, BMC Bioinformatics.
The authors propose a graph-based method for reconstructing viral haplotypes from deep-sequencing data. The introduction lists several other software tools for the job and gives a coherent argument as to why viral haplotype reconstruction differs from other deep-sequencing problems
(de novo assemblers are designed to reconstruct a single sequence, not a large number of closely related sequences; haplotype assembly methods "do not easily extend"; population phasing requires additional data; metagenomic studies require large differences between species and do not try to reconstruct genomes).
The method finds overlapping sections of reads and assembles a minimal path through the resulting graph. The probability that two overlapping reads belong to the same haplotype is a function of the length of the overlap and the probability of a mismatch over that length.
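To make the idea concrete for myself, here is a minimal sketch of how I picture it. It is not the authors' implementation. Everything named here is my own assumption: the `Read` type, the per-base error rate `eps`, the divergence `div` between distinct haplotypes, the two-hypothesis scoring in `same_haplotype_cost`, and reading "minimal path" as a shortest path with edge cost equal to the negative log probability that two reads share a haplotype.

```python
import math
from typing import NamedTuple, Optional


class Read(NamedTuple):
    start: int  # 0-based position of the read on the reference (assumed known)
    seq: str


def overlap(a: Read, b: Read) -> Optional[tuple]:
    """Return (overlap length, mismatches) if b starts inside a and extends past it."""
    a_end = a.start + len(a.seq)
    b_end = b.start + len(b.seq)
    if not (a.start < b.start < a_end < b_end):
        return None
    n = a_end - b.start
    a_part = a.seq[b.start - a.start:]
    b_part = b.seq[:n]
    k = sum(x != y for x, y in zip(a_part, b_part))
    return n, k


def same_haplotype_cost(n: int, k: int, eps: float = 0.01, div: float = 0.05) -> float:
    """-log P(same haplotype | k mismatches in n bases), toy model with equal priors.

    Same haplotype: mismatches occur at rate eps (sequencing error only).
    Different haplotypes: mismatches occur at rate eps + div.
    Both rates are hypothetical values chosen for illustration.
    """
    q = eps + div
    # d = log-likelihood(different) - log-likelihood(same)
    d = k * math.log(q / eps) + (n - k) * math.log((1 - q) / (1 - eps))
    # -log(1 / (1 + exp(d))) computed in a numerically stable way
    return max(d, 0.0) + math.log1p(math.exp(-abs(d)))


def cheapest_path(reads: list, source: int, sink: int) -> Optional[list]:
    """Lowest-cost chain of overlapping reads from source to sink (indices into reads)."""
    # Edges only go from a smaller start to a strictly larger start,
    # so sorting by start gives a valid topological order.
    order = sorted(range(len(reads)), key=lambda i: reads[i].start)
    cost = {source: 0.0}
    prev = {}
    for i in order:
        if i not in cost:
            continue
        for j in order:
            ov = overlap(reads[i], reads[j])
            if ov is None:
                continue
            c = cost[i] + same_haplotype_cost(*ov)
            if c < cost.get(j, math.inf):
                cost[j] = c
                prev[j] = i
    if sink not in cost:
        return None
    path = [sink]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Two toy haplotypes differing at positions 6 and 14.
h1 = "ACGTACGTACGTACGTACGT"
h2 = h1[:6] + "T" + h1[7:14] + "A" + h1[15:]
reads = [
    Read(0, h1[0:10]),
    Read(5, h1[5:15]),
    Read(5, h2[5:15]),
    Read(10, h1[10:20]),
]
path = cheapest_path(reads, source=0, sink=3)
```

In this toy data the chain through the second read has no mismatches in either overlap, so it is preferred over the chain through the third read, which comes from the other haplotype. Under this scoring a longer mismatch-free overlap also gives a lower cost, which matches the intuition that longer agreement is stronger evidence for a shared haplotype.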
They do not provide read error correction.
I don't see a clear advantage of their approach over ShoRAH until the point where they use ShoRAH to correct their reads. From then on, their algorithm shows large improvements.
On the other hand, I don't know how it will compare to the "global constraint" version of ShoRAH.