rna

Prev: Constructive evolution Next: Evolution on neutral networks


**TODO List**
 * REF: in evolved RNA structures all positions in the sequence are essential / non-neutral??
 * REF: This suggests that a lot can change in secondary structure even in real RNA that comes out of experiments, and that the genotype-phenotype mapping is rugged
 * REF: Moreover, RNA evolution studies show that two functionally different RNAs can be separated by only one mutation, which causes all [|helices] to fold differently

=RNA evolution=

In [|in vitro] experiments it is possible to evolve a great variety of RNA [|ligases] for particular functions. In such experiments RNA population sizes are typically about 10^10, which is **much smaller** than the total sequence search space of 4^120 (roughly 10^72). Nonetheless, ligases can be evolved rapidly: most evolved structures are found within about 10 mutational steps. Moreover, in the RNA structures that evolve, all positions in the sequence turn out to be essential, i.e. they are non-neutral (REF).
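To get a feeling for the numbers, the following small Python sketch compares the population size with the size of the sequence space. The constant names are only illustrative; the values (10^10, length 120, four bases) are the ones quoted above.

```python
import math

POPULATION_SIZE = 10**10   # typical in vitro population size quoted in the text
SEQUENCE_LENGTH = 120      # sequence length implied by the 4^120 figure
ALPHABET_SIZE = 4          # A, U, G, C

# Total number of distinct sequences of this length.
search_space = ALPHABET_SIZE ** SEQUENCE_LENGTH

# Fraction of the space that a single population could cover at best,
# assuming (optimistically) that every molecule is a different sequence.
sampled_fraction = POPULATION_SIZE / search_space

print(f"search space: about 10^{math.log10(search_space):.1f} sequences")
# prints: search space: about 10^72.2 sequences
print(f"fraction covered: about 10^{math.log10(sampled_fraction):.1f}")
# prints: fraction covered: about 10^-62.2
```

So even under the most optimistic assumption, one population samples a vanishingly small part of the space.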

Q: //Given that the search space is so large, how can we explain such rapid evolution?//

To answer this question and understand how such evolution is possible, we need to understand the **adaptive landscape** of RNA.

[|Adaptive landscapes]
A useful conceptual tool is the adaptive landscape, which can be visualized as a landscape of hills and can be roughly categorized into two forms:
 * **a [|smooth landscape]** (i.e. a single hill): this represents the case of a simple genotype-phenotype mapping (smooth [|epistasis]).
 * **a [|rugged landscape]** (i.e. many peaks): this represents a more complex genotype-phenotype mapping (rugged epistasis).

Fitness landscapes are a very important concept for understanding **how** evolution can proceed, and the shape of the landscape depends on the coding structure. For instance, intuitively it would appear that evolutionary optimization is difficult in a rugged landscape because peaks are isolated (i.e. a population gets stuck in local optima). However, genotype space has **n dimensions**, not just 1 or 2, so a point that looks like an isolated peak in one or two dimensions may still have uphill neighbours along other dimensions. Moreover, the genetic operators, or **mutational operators**, also play an important role in defining the mutational neighbourhood in the landscape.
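As an illustration of what "mutational neighbourhood" means when point mutation is the only operator, here is a minimal Python sketch. The function names and the example sequence are made up for this illustration. Every sequence of length n has 3n one-mutant neighbours, which is why a picture with one or two dimensions understates how many directions are available.

```python
ALPHABET = "AUGC"

def point_mutants(sequence):
    """Return all sequences that differ from `sequence` at exactly one position."""
    mutants = []
    for i, original in enumerate(sequence):
        for base in ALPHABET:
            if base != original:
                mutants.append(sequence[:i] + base + sequence[i + 1:])
    return mutants

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

seq = "GGGAAAUCCC"              # arbitrary example sequence
neighbours = point_mutants(seq)
print(len(neighbours))           # 3 * len(seq) = 30
```

A different operator (for instance insertions, deletions or recombination) would define a different neighbourhood on the same set of sequences, and hence a different landscape.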

RNA [|secondary structure]
RNA secondary structure has proved to be a useful paradigm for studying **natural coding structure**. In such studies a structure that is not predefined is chosen as an external fitness criterion for selection. RNA of course carries a code (its own sequence), and RNA folded into a secondary or tertiary structure can function as an enzyme. In this sense RNA is ideal for such studies. Moreover, at present it is the only //computable// natural genotype-phenotype mapping available (the Elephant is still a bit beyond our means!). The mapping is from the RNA sequence to its secondary structure using minimum-energy configurations, a computation that is still very difficult for proteins. Therefore, using this RNA mapping and adding mutations on the genotype and selection on the phenotype, we can study how evolution proceeds within the RNA adaptive landscape as defined by the genotype-phenotype mapping.
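To make the idea of a computable sequence-to-structure mapping concrete, here is a minimal Python sketch. Note that it is **not** the minimum-energy folding referred to above: it uses simple base-pair maximization (a Nussinov-style dynamic program) as a much cruder stand-in, and the allowed pairs, the minimum loop size of 3 and the function name are assumptions made for this illustration.

```python
# Allowed base pairs: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # minimum number of unpaired bases enclosed by a pair

def fold(sequence):
    """Return a dot-bracket structure with the maximum number of base pairs."""
    n = len(sequence)
    # best[i][j] = maximum number of pairs in sequence[i..j] (inclusive)
    best = [[0] * n for _ in range(n)]
    for span in range(MIN_LOOP + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]                  # case: j is unpaired
            for k in range(i, j - MIN_LOOP):        # case: j pairs with k
                if (sequence[k], sequence[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score

    # Trace back one optimal solution.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= MIN_LOOP:
            continue                                 # too short to contain a pair
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - MIN_LOOP):
            if (sequence[k], sequence[j]) in PAIRS:
                left = best[i][k - 1] if k > i else 0
                if best[i][j] == left + 1 + best[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return "".join(structure)

print(fold("GGGAAAACCC"))  # prints: (((....)))
```

The point of the sketch is only that the phenotype (a dot-bracket string) is obtained from the genotype (the sequence) by a deterministic computation, so mutants can be folded and compared automatically.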

One can obtain all mutants and their fitness for every possible sequence, and so can study the local shape of the adaptive landscape. However, given the high dimensionality of the space, one needs a good method for visualization. This can be done by showing, for each sequence, the frequency distribution of the degree of change in secondary structure over all //n//-point mutants (see Figure to left), a so-called **correlation landscape** ([|Huynen et al. 1993]). So by looking locally, but then averaging over all positions, one obtains an impression of the total landscape. This shows that:
 * as the number of mutations increases, there is **convergence on an average distance in secondary structure**, i.e. beyond a small number of mutations, additional mutations do not change the structure much further (saturation in change).
 * **the landscape is rugged**: a single point mutation can reduce fitness as much as a moderate or large number of mutations. For single mutations there is often no change, but sometimes a total change.
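The procedure described above can be sketched in Python as follows. This is only a rough illustration under strong assumptions: `toy_fold` is a hypothetical stand-in that pairs the two ends of the sequence inward for as long as they are complementary (it is not a real folding algorithm), structure distance is taken as the number of differing dot-bracket positions, and n-point mutants are sampled at random rather than enumerated exhaustively.

```python
import random
from collections import Counter

ALPHABET = "AUGC"
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def toy_fold(sequence, min_loop=3):
    """Hypothetical stand-in fold: pair the ends inward while they are complementary."""
    structure = ["."] * len(sequence)
    i, j = 0, len(sequence) - 1
    while j - i > min_loop and COMPLEMENT[sequence[i]] == sequence[j]:
        structure[i], structure[j] = "(", ")"
        i, j = i + 1, j - 1
    return "".join(structure)

def structure_distance(a, b):
    """Number of positions at which two dot-bracket strings differ."""
    return sum(x != y for x, y in zip(a, b))

def mutate(sequence, n_mutations, rng):
    """Change exactly `n_mutations` distinct positions to a different base."""
    bases = list(sequence)
    for p in rng.sample(range(len(bases)), n_mutations):
        bases[p] = rng.choice([b for b in ALPHABET if b != bases[p]])
    return "".join(bases)

def distance_distribution(sequence, n_mutations, samples, rng, fold=toy_fold):
    """Frequency of each structure distance among sampled n-point mutants."""
    reference = fold(sequence)
    counts = Counter(
        structure_distance(reference, fold(mutate(sequence, n_mutations, rng)))
        for _ in range(samples)
    )
    return {d: c / samples for d, c in sorted(counts.items())}

rng = random.Random(0)          # fixed seed so that runs are repeatable
seq = "GGGCAAAAGCCC"            # arbitrary example; toy_fold gives ((((....))))
for n in (1, 2, 4):
    print(n, distance_distribution(seq, n, 1000, rng))
```

Any other folding function with the same signature can be passed in through the `fold` argument; the shape of the distributions depends entirely on which mapping is used.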

Real RNA
The above results were for computed secondary structures of given sequences, but what about real RNA structures? In [|eukaryotic] [|mRNA], which has leader sequences, it was shown that mutations in different regions have different effects. This suggests that a lot can change in secondary structure even in real RNA that comes out of experiments, and that the genotype-phenotype mapping is rugged (REF). Moreover, RNA evolution studies show that two functionally different RNAs can be separated by only one mutation, which causes all [|helices] to fold differently (REF). Both results contribute to an impression of rugged epistasis.

Many-to-One Genotype-Phenotype Mapping
What is apparent in the RNA genotype-phenotype landscape is a many-to-one mapping, i.e. many genotypes give the same phenotype. Part of the reason for this is the loss of degrees of freedom as one converts sequence into structure. While there are four variants per position for the genotype (A, U, G, C), there are only three for the phenotype in bracket notation: "(", "." and ")". The phenotype is also constrained, since parentheses need to open and close. As a consequence **the number of genotypes is much greater than the number of phenotypes**. In the landscape there is also a great deal of **redundancy**, in that almost all sequences fold into a **typical shape**. Nonetheless, **only a small fraction of shapes is typical**, i.e. there is a large diversity of shapes. For sequences 30 nucleotides long, these characteristics translate to the following statistics:
 * 1.07 * 10^9 sequences (this equals 2^30, i.e. the size of a two-letter sequence space of length 30, not the full 4^30)
 * 218830 shapes (secondary structures)
 * 22718 typical shapes (10%)
 * 93.4% of sequences fold into a typical shape ([|degeneracy])
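The claim that there are far fewer possible structures than sequences can be checked with a simple counting recurrence. The Python sketch below counts all well-formed dot-bracket strings of a given length in which every pair encloses at least `min_loop` positions (taken to be 3 here, as an assumption). This is a purely combinatorial count of possible shapes; it says nothing about how many of them are actually realized by folding, so it should not be expected to reproduce the statistics above.

```python
def count_structures(max_length, min_loop=3):
    """counts[n] = number of secondary structures on n positions.

    Recurrence: the last position is either unpaired, or it pairs with
    position k, leaving k positions to the left of the pair and
    n - 2 - k positions enclosed by it (at least `min_loop`).
    """
    counts = [1]  # the empty sequence has exactly one (empty) structure
    for n in range(1, max_length + 1):
        total = counts[n - 1]                       # last position unpaired
        for k in range(0, n - 1 - min_loop):        # last position pairs with k
            total += counts[k] * counts[n - 2 - k]
        counts.append(total)
    return counts

counts = count_structures(30)
print(counts[:8])                 # prints: [1, 1, 1, 1, 1, 2, 4, 8]
print("possible structures of length 30:", counts[30])
print("four-letter sequences of length 30:", 4 ** 30)
```

For short lengths the values can be verified by hand, e.g. for length 5 the only options are the open chain and a single pair between the first and last position.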
