


**TODO List**
 * REF: Eigen and Schuster (1979) here versus Eigen et al. (1989) on the previous page seem inconsistent

=[|Quasispecies theory]=

Eigen and Schuster (1979) formulated the quasispecies model to study whether a few key ingredients of evolution would allow selection to arise among non-living RNA molecules, and whether this could lead to a [|bootstrapping] process that amplifies complexity (or yields bigger molecules). By choosing RNA, Eigen and Schuster took the best-case auto-replicator. Furthermore, they assumed a sequence-independent replication rate.

As we saw on the previous page, the minimal requirements for evolution are: (1) generic replicators, (2) independent synthesis and decay, (3) mutation, and (4) competition.

Eigen tried to capture these requirements in his __replicator equations.__ These equations describe the abundances of different replicators (e.g. strains, indexed with //"i"//) in a [|chemostat]. The equations are:

//dX_i/dt = A_i*Q_i*X_i - d_i*X_i + Sum_j(w_ij * X_j) - O_i//

with //O_i = (X_i / Sum_j(X_j)) * Sum_j((A_j - d_j)*X_j)//

Here, //X_i// is the abundance of "strain" //i//, //Q_i// the quality (fidelity) of its replication, //A_i// its growth or replication rate, and //d_i// its death or decay rate. The first term thus represents successful replication of //X_i//, which again forms individuals of type //i//, while the second term is the decay of //X_i//. "Unsuccessful" replications of //X_i// yield mutants that belong to other strains //j//; this happens at a rate //A_i*(1-Q_i)*X_i//. The third term represents the formation of //X_i// through replication and mutation of other types //X_j// (the sum runs over all //j// other than //i//). Here, //w_ij// is the rate at which replication of type //j// produces mutants with genotype //i//; it takes into account the rate and quality of replication of //j// (i.e. //A_j// and //Q_j//) and the mutational distance between //i// and //j//. Lastly, //O_i// is a dilution term: it decreases the concentration of //X_i// at a rate proportional to the frequency of //X_i// in the population (first factor) and to the total net growth of the population (second factor). This term represents the **chemostat assumption**: it ensures that the total sum //Sum_j(X_j)// remains constant, i.e. that the total concentration of replicators does not change.
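To make the four terms concrete, the equations can be integrated numerically. The following is only a minimal sketch: the three types, the parameter values, the explicit Euler scheme and the helper {{make_w}} (which spreads the erroneous copies of each type evenly over the other types) are all assumptions made for illustration, not part of the original model description.

```python
def make_w(A, Q):
    """Hypothetical mutation rates: the erroneous copies of type j, produced
    at rate A[j] * (1 - Q[j]), are spread evenly over the other types."""
    n = len(A)
    return [[0.0 if i == j else A[j] * (1.0 - Q[j]) / (n - 1)
             for j in range(n)] for i in range(n)]


def euler_step(X, A, Q, d, w, dt):
    """One explicit Euler step of the replicator equations."""
    n = len(X)
    total = sum(X)
    # Total net production: the second factor of the dilution term O_i.
    net_growth = sum((A[j] - d[j]) * X[j] for j in range(n))
    new_X = []
    for i in range(n):
        replication = A[i] * Q[i] * X[i]            # error-free copies of i
        decay = d[i] * X[i]                         # loss of i
        mutation_in = sum(w[i][j] * X[j] for j in range(n) if j != i)
        dilution = (X[i] / total) * net_growth      # O_i
        new_X.append(X[i] + dt * (replication - decay + mutation_in - dilution))
    return new_X


# Illustrative parameters: type 0 replicates twice as fast as types 1 and 2.
A = [2.0, 1.0, 1.0]
Q = [0.9, 0.9, 0.9]
d = [0.1, 0.1, 0.1]
w = make_w(A, Q)

X = [0.01, 0.495, 0.495]   # start with the fast replicator rare
for _ in range(20000):
    X = euler_step(X, A, Q, d, w, dt=0.01)

print("final abundances:", X)
print("total concentration:", sum(X))
```

With these assumed values the total concentration stays at its initial value, as the chemostat assumption requires, and the fast replicator becomes the most abundant type while the slower types persist at low frequency.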

The main result from this equation is that over evolutionary time the system converges to the normalized eigenvector corresponding to the largest eigenvalue of matrix //W// (the mutation interactions, with the terms //w_ij// off the diagonal and the net self-replication terms //A_i*Q_i - d_i// on the diagonal). This eigenvector describes the final frequency distribution of all "strains". What we then see is a **quasispecies**: a distribution of replicators ("strains") that arise together. The evolutionary process maximizes the total growth rate, and hence selects for the //quasispecies// (i.e. a collection of genotypes) that grows fastest. This means that evolution does not necessarily lead to the fittest RNA, but to the fittest combination of RNAs, which is a cloud of closely related mutants.
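The eigenvector statement can be checked numerically for a small example. The sketch below uses power iteration on a hypothetical 3x3 matrix //W//. It assumes that //W// has the mutation rates //w_ij// off the diagonal and //A_i*Q_i - d_i// on the diagonal, with //A = (2, 1, 1)//, //Q = 0.9//, //d = 0.1// and mutants spread evenly over the other two types; these values are illustrative only.

```python
def dominant_eigenvector(W, iterations=1000):
    """Power iteration: returns the eigenvector (normalized to sum 1)
    and the eigenvalue belonging to the largest eigenvalue of W."""
    n = len(W)
    v = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]
        s = sum(v)
        v = [x / s for x in v]
    # Because sum(v) == 1, the sum of W*v equals the eigenvalue.
    Wv = [sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]
    return v, sum(Wv)


# Hypothetical W: diagonal A_i*Q_i - d_i, off-diagonal A_j*(1 - Q_j)/2.
W = [[1.70, 0.05, 0.05],
     [0.10, 0.80, 0.05],
     [0.10, 0.05, 0.80]]

v, lam = dominant_eigenvector(W)
print("quasispecies distribution:", v)
print("mean net growth rate:", lam)
```

In this example the slow types keep a nonzero frequency in the eigenvector even though their own replication rate is lower: they are continuously regenerated by mutation from the fast type.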

The first surprising insight from this model is that certain replicators are present in the population not because they are "optimal" (i.e. have a high growth rate), but because they are mutationally close to a replicator with a high replication rate. This illustrates an important point: __just observing a certain phenotype in the population does **not** mean that this phenotype is evolutionarily optimized!__

A second important result is that whether evolution leads to the fittest RNA in this model depends on the mutation rate. When the mutation rate is too high, the fittest replicator (in terms of replication rate) might no longer be able to survive. This is best illustrated with a simpler model, described below.

//Master/mutant simplification//
In the original quasispecies model fitness can be defined a priori as the rate of replication. Let's assume a [|Dirac-delta] like fitness function, in which one particular replicator has a high replication rate (//a1//), while all other replicators have the same, lower replication rate (//a2//). We can then simplify the model to two equations: a **master equation** (the fittest) and a collective **mutant equation** (all others):

//dx/dt = a1*Q*x - d1*x - x*((a1 - d1)*x + (a2 - d2)*y)//
//dy/dt = a2*y + a1*(1-Q)*x - d2*y - y*((a1 - d1)*x + (a2 - d2)*y)//
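As a quick check of these two equations, they can be integrated with an explicit Euler scheme. This is a sketch under assumptions: the decay rate of the master is taken to be //d1//, and the function name {{integrate_master_mutant}}, the parameter values, the step size and the starting point are all chosen purely for illustration.

```python
def integrate_master_mutant(a1, a2, d1, d2, Q, x0=0.01, dt=0.01, steps=50000):
    """Explicit Euler integration of the master (x) and mutant (y) equations."""
    x, y = x0, 1.0 - x0
    for _ in range(steps):
        # Shared dilution factor: total net growth of the population.
        phi = (a1 - d1) * x + (a2 - d2) * y
        dx = a1 * Q * x - d1 * x - x * phi
        dy = a2 * y + a1 * (1.0 - Q) * x - d2 * y - y * phi
        # Update both variables from the same old state.
        x, y = x + dt * dx, y + dt * dy
    return x, y


# Illustrative values: the master replicates twice as fast, Q = 0.9.
x, y = integrate_master_mutant(a1=2.0, a2=1.0, d1=0.1, d2=0.1, Q=0.9)
print("master:", x, "mutants:", y, "sum:", x + y)
```

The sum //x + y// stays at 1 because the dilution terms remove exactly the total net production. With these assumed values the master settles at a frequency of about 0.8, and the remaining 0.2 consists of its own mutants.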

with //x// the fittest strain (i.e. the **master sequence**) and //y// all the less fit mutants, and //x + y = 1//. Because //x// is only one specific "strain", and //y// represents all other mutants, we neglect back-mutations from //y// to //x//. We can use this set of equations to investigate under which conditions the master, //x//, will be present in the system. For this we use the invasion criterion, which asks whether a small amount of //x// is able to invade a population of //y//. For //x~0//, the term of order //x^2// can be neglected and //dx/dt// simplifies into:

//dx/dt = a1*Q*x - d1*x - (a2 - d2)*x*y = x*(a1*Q - d1 - (a2 - d2)*y).//

Since //x + y = 1// and //x~0//, //y~1//. Hence, we see that //x// can invade (//dx/dt > 0//) if and only if:

//a1*Q - d1 > a2 - d2//

Assuming //d1 = d2//, i.e. the difference in fitness is only given by the growth rates //a1// and //a2//, this simplifies into

//Q > a2/a1//

and since //a1/a2// is the selection coefficient, //σ//, we can simply write

//Q > 1 / σ.//

This is the __error threshold condition__. If the replication quality //Q// is not large enough, //x// cannot invade a population of //y//. This means that the master cannot sustain itself in the population, and hence that we do not see survival of the fittest.
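The invasion criterion is easy to evaluate for concrete numbers. The sketch below uses two hypothetical helper functions, {{invasion_rate}} and {{error_threshold}}, and illustrative values //a1 = 2//, //a2 = 1//, //d1 = d2 = 0.1//, so that //σ = 2// and the threshold lies at //Q = 0.5//.

```python
def invasion_rate(a1, a2, d1, d2, Q):
    """Per-capita growth rate of a rare master (x ~ 0, y ~ 1)."""
    return a1 * Q - d1 - (a2 - d2)


def error_threshold(a1, a2):
    """Minimal replication quality 1/sigma; only valid when d1 == d2."""
    return a2 / a1


a1, a2, d = 2.0, 1.0, 0.1
print("threshold Q =", error_threshold(a1, a2))

can_invade = {}
for Q in (0.4, 0.6, 0.9):
    rate = invasion_rate(a1, a2, d, d, Q)
    can_invade[Q] = rate > 0
    print("Q =", Q, "invasion rate =", rate, "master invades:", can_invade[Q])
```

Below the threshold the invasion rate is negative and a rare master is diluted out; above it the master increases from rarity. The decay rate drops out of the comparison only because both types are assumed to decay equally fast.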

The next question we should ask is whether the error threshold poses a problem at realistic parameter values. For this we need to determine what sequence lengths can be maintained at particular mutation rates.
