Wednesday, December 14, 2011

Molecular Clocks and the puzzle of RNA virus origins --Holmes 2003

Journal of Virology

Dr. Holmes investigates why phylogenetic dating of viral speciation implies that the major RNA viruses originated no more than 50,000 years ago, even though their hosts speciated many millions of years in the past.

One problem is the rate of substitution, which is roughly 10^-3 subs/site/year (21). That implies any two sequences can only be dated reliably if they diverged less than about 500 years ago: at that rate a single lineage averages one substitution per site after 1000 years, and two diverging lineages together reach that saturation in half the time. Better to look at non-synonymous sites, assume the rate is 10^-5, and put the divergence at 50K years ago. Voilà.
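The arithmetic behind those two figures can be sketched as follows. This is only a rough illustration: it assumes a strict molecular clock, takes saturation to mean about one substitution per site, and uses a function name (`divergence_time_years`) that is made up for this note rather than taken from the paper.

```python
def divergence_time_years(distance, rate):
    """Years since two lineages split, assuming a strict molecular clock.

    distance: substitutions per site separating the two sequences
    rate:     substitutions per site per year along a single lineage
    The factor of 2 reflects that both lineages accumulate changes.
    """
    return distance / (2.0 * rate)


# Assumption for illustration: saturation at about one substitution per site.
SATURATION = 1.0

overall = divergence_time_years(SATURATION, 1e-3)  # rate across all sites
nonsyn = divergence_time_years(SATURATION, 1e-5)   # non-synonymous sites only

print(f"overall rate:        ~{overall:,.0f} years")
print(f"non-synonymous rate: ~{nonsyn:,.0f} years")
```

A rate that is 100 times slower pushes the dating horizon back 100-fold, which is the whole trick of switching to non-synonymous sites.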

So, do viruses change their mutation rate? Is it because, once adapted to their host species, they don't drift very much? No. Adaptation does not give an RNA virus a repair mechanism, and mutation at synonymous sites doesn't slow down.

Perhaps different parts of the genome mutate at different speeds? Likely:
An important evolutionary by-product of these high mutation rates is a cap on genome size; genomes larger than ∼15 kb are rarely produced because of the “error threshold,” the generation of a prohibitive number of deleterious mutations (11). Since viral genome sizes are limited, sequence regions will encode multiple functions and individual mutations will often have pleiotropic effects, such as those influencing both cell tropism and immune evasion (1). This, in turn, may mean that there are relatively few evolutionary pathways that can be followed by RNA viruses; otherwise, at least one key function will be disrupted, so that mutations preferentially accumulate at that small proportion of sites that are free to vary. Supportive evidence for such a model is the frequency with which convergent evolution is observed for RNA viruses (4, 7, 13), as expected if only a limited number of evolutionary pathways are viable, and the evidence that RNA (37) and protein secondary structure (22) can act as constraints against sequence change.

It is helpful to use a (skewed) gamma distribution to allow the substitution rate to vary among sites along the genome.
low α values (i.e., <1) mean that the sequence alignment is composed of both very quickly and very slowly evolving sites, and this appears to be true in most cases.
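To see why a small α matters, here is a minimal sketch of how a gamma correction inflates an estimated distance. It assumes the Jukes-Cantor model purely for illustration (these notes do not say which substitution model the paper used), and the observed proportion of differing sites `p` is a made-up value.

```python
import math


def jc_distance(p):
    """Jukes-Cantor distance from the observed proportion of differing sites p."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc_gamma_distance(p, alpha):
    """Jukes-Cantor distance with gamma-distributed rates among sites.

    alpha is the gamma shape parameter; small alpha means strong rate variation.
    """
    return 0.75 * alpha * ((1.0 - 4.0 * p / 3.0) ** (-1.0 / alpha) - 1.0)


p = 0.2  # hypothetical observed proportion of differing sites

print(f"no gamma:     d = {jc_distance(p):.3f}")
for alpha in (10.0, 1.0, 0.34, 0.1):
    print(f"alpha = {alpha:<5} d = {jc_gamma_distance(p, alpha):.3f}")
```

For the same observed difference, the corrected distance grows as α shrinks, because a few fast sites are assumed to have been hit many times while most sites barely changed. With a very large α the gamma version converges back to the uncorrected distance.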


the three groups of flaviviruses, the mean d at these sites, corrected for multiple substitutions but without a gamma distribution, is ∼0.25 and is similar to the nonsynonymous distance estimated previously. The maximum likelihood estimate for the shape parameter of the gamma distribution for these data is highly skewed (α = 0.34). As expected, evolutionary distances increase if they are now estimated using this gamma model (mean d = 0.43), although not sufficiently to make a major difference to estimated divergence times, which only increase to a little over 20,000 years (again assuming a rate of 10−5 substitutions/site/year). However, more dramatic results are obtained if an even more skewed gamma distribution is used. If α = 0.1, then d increases to 2.3, so that maximum divergence times will be in the region of 100,000 years ago
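As a quick check on the numbers in that passage, the quoted distances can be turned into divergence times. The sketch assumes the simple relation t = d / (2 × rate), which is not spelled out in the quote but appears consistent with the figures it gives.

```python
RATE = 1e-5  # substitutions/site/year, the rate assumed in the quoted passage

# Mean distances quoted in the passage, keyed by how rate variation was modeled.
distances = {
    "no gamma": 0.25,
    "gamma, alpha = 0.34": 0.43,
    "gamma, alpha = 0.1": 2.3,
}

# Assumed relation: both lineages accumulate changes, hence the factor of 2.
times = {label: d / (2 * RATE) for label, d in distances.items()}

for label, years in times.items():
    print(f"{label:>20}: ~{years:,.0f} years")
```

Under that assumption, d = 0.43 gives roughly 21,500 years ("a little over 20,000") and d = 2.3 gives roughly 115,000 years ("in the region of 100,000"). Even the most skewed gamma still leaves the dates far short of the millions of years suggested by host speciation.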

