

TODO List
  • REF: populations evolve to smoother (less rugged) parts of the landscape (REF)
  • REF: initially selected new variants are always very unrobust with respect to wildtypes (REF)
  • See whether we can combine this with the next three pages: "integrating pattern formation", "which code is chosen?" and "what special code evolves?"



Coding structures


In the previous section we discussed neutral paths and neutral networks and their role in allowing populations to discover novel structures. However, is there more going on? That is, where do you go on an RNA neutral path? And how neutral is a neutral path? In other words: which code evolves?

One thing that becomes apparent in RNA evolution is that populations evolve to smoother (less rugged) parts of the landscape (REF). So despite neutrality there is still evolution! This is also seen in actual biological systems, in particular in Drosophila evolution, where initially selected new variants are always much less robust than wildtypes (REF), i.e. they are not yet on a flat part of the landscape.

So why does this happen? Well, evolution to higher neutrality allows for more neutral neighbours. In simulations starting from wildtype tRNA, much more robust structures evolve within a short time span when secondary structure is the selective pressure. Apparently tRNA itself is therefore not that robust, probably because it needs to make conformational changes.



References


van Nimwegen E, Crutchfield JP (2000) Metastable evolutionary dynamics: crossing fitness barriers or escaping via neutral paths? Bull Math Biol 62(5): 799-848
van Nimwegen E, Crutchfield JP, Huynen MA (1999) Neutral evolution of mutational robustness. Proc Natl Acad Sci U S A 96: 9716-20
Kauffman SA (1993) The origin of order. Oxford Univ. Press, 709 pp





Given code, which evolution
Given evolution, which code
Pred Prey in RNA
Mutation rates and coding organisation
AVIDA


One question to ask is: where do you go on an RNA neutral path, and how neutral is a neutral path?
OR: WHICH CODE EVOLVES?

- what is apparent is that populations evolve to smoother (less rugged) parts of the landscape
- so despite neutrality there is still evolution!
(see Nimwegen or Huynen)

- example from biology:
In Drosophila evolution it is clear that initially selected new variants are always very unrobust with respect to wildtypes

- evolution to higher neutrality allows for more neutral neighbours:
- simulation with wildtype tRNA: just in short time span much more robust structure when sec structure is the selective pressure
- apparently tRNA therefore shouldn't be too robust: it needs to make conformational changes



So which code is chosen?

- well evolution to flatter parts of landscape:
- mutational robustness
- high connectivity of neutral networks
- max eigenvector of connection matrix (van Nimwegen 2000)
- this is not max robustness: why?
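The "max eigenvector" point can be made concrete on a toy network. A minimal sketch, assuming the neutral network is given as an undirected graph: the six-node graph below is invented for illustration, and the function names are hypothetical. The dominant eigenvalue of the connection (adjacency) matrix is estimated by plain power iteration, and the matching eigenvector shows where the weight concentrates.

```python
# Toy neutral network (made up for illustration): a 4-clique with a 2-node tail.
network = {
    0: [1, 2, 3],
    1: [0, 2, 3],
    2: [0, 1, 3],
    3: [0, 1, 2, 4],
    4: [3, 5],
    5: [4],
}


def mean_degree(adj):
    """Average number of neutral neighbours over all nodes of the network."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def spectral_radius(adj, iterations=2000):
    """Dominant eigenvalue and eigenvector of the adjacency matrix.

    Power iteration is run on A + I (identity added) so that it also
    converges on bipartite graphs; the eigenvalue of A is recovered by
    subtracting 1 at the end.
    """
    v = {node: 1.0 / len(adj) for node in adj}
    growth = 1.0
    for _ in range(iterations):
        w = {node: v[node] + sum(v[m] for m in adj[node]) for node in adj}
        growth = sum(w.values())  # valid because sum(v) == 1
        v = {node: x / growth for node, x in w.items()}
    return growth - 1.0, v


lam, weights = spectral_radius(network)
print("mean degree      :", round(mean_degree(network), 3))
print("dominant eigenval:", round(lam, 3))
print("max degree       :", max(len(n) for n in network.values()))
```

On this toy graph the dominant eigenvalue lies above the mean degree but below the maximum degree, which is one way to read "mutational robustness, but not MAX".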

Consider numbering codes:
- with roman numerals: need to add a lot of digits to get new meaning
- with binary code: more meanings with fewer digits due to coding
RNA code:
- 4 bases giving folding giving landscape
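The numbering-code comparison can be checked directly. A small sketch that counts how many symbols each code needs for the numbers 1 to 3999 (the range of standard Roman numerals); the helper names are illustrative only.

```python
def to_roman(n):
    """Standard Roman numeral for 1 <= n <= 3999."""
    symbols = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    out = []
    for value, sym in symbols:
        while n >= value:
            out.append(sym)
            n -= value
    return "".join(out)


def to_binary(n):
    """Binary representation without prefix."""
    return format(n, "b")


# Longest representation needed to cover every number from 1 to 3999.
longest_roman = max(len(to_roman(n)) for n in range(1, 4000))
longest_binary = max(len(to_binary(n)) for n in range(1, 4000))
print(longest_roman, longest_binary)  # 15 12
```

The Roman code uses seven different symbols and still needs longer strings than the two-symbol binary code, because it adds symbols for new meanings instead of reusing positions.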

van Nimwegen 1999 PNAS:
- evolution towards mutational robustness (but not MAX)
- walk along neutral path not neutral
- evolved coding structure not typical
- how fast go there depends on lambda and mutation rate
- and population size and mutation rate should be large enough

Ant comparison
Blind ant:
- does steps in arbitrary directions
- if wrong then steps back
this ant is on average on an average connectivity of the network: high connected have more ways to get out, low connected harder to get out
Myopic ant:
- look ahead and choose direction of step (on neutral path)
- but always makes step
this ant more easily finds highly connected nodes in network, sees local landscape properties
this is what populations do
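The two ants can be simulated as random walks on a toy neutral network. A rough sketch under stated assumptions: the six-node graph and the total number of possible mutations per genotype (`N_MUTATIONS = 6`) are invented for illustration. The blind ant picks any mutation and only moves if it is neutral; the myopic ant always moves to a randomly chosen neutral neighbour.

```python
import random

# Toy neutral network (made up): a 4-clique with a 2-node tail.
network = {
    0: [1, 2, 3],
    1: [0, 2, 3],
    2: [0, 1, 3],
    3: [0, 1, 2, 4],
    4: [3, 5],
    5: [4],
}
N_MUTATIONS = 6  # assumed total number of one-mutant neighbours per genotype


def blind_ant(adj, n_mutations, steps, rng):
    """Pick any mutation; move only if it stays on the network, else step back."""
    node, total = next(iter(adj)), 0
    for _ in range(steps):
        k = rng.randrange(n_mutations)
        if k < len(adj[node]):  # the chosen mutation is neutral
            node = adj[node][k]
        total += len(adj[node])
    return total / steps


def myopic_ant(adj, steps, rng):
    """Look ahead and always step to a randomly chosen neutral neighbour."""
    node, total = next(iter(adj)), 0
    for _ in range(steps):
        node = rng.choice(adj[node])
        total += len(adj[node])
    return total / steps


rng = random.Random(1)
blind_avg = blind_ant(network, N_MUTATIONS, 200_000, rng)
myopic_avg = myopic_ant(network, 200_000, rng)
print("blind ant, average connectivity seen :", round(blind_avg, 2))
print("myopic ant, average connectivity seen:", round(myopic_avg, 2))
```

The blind ant visits all nodes equally often, so it sees the average connectivity of the network (16/6 here). The myopic ant visits nodes in proportion to their connectivity, so it sees a higher average (48/16 = 3 here).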



What is the special code that evolves?

Hairpin evolution study:
- fitness is the length of the hairpin
One hairpin:
- in evolution there is evolution to more robustness: which is achieved by long stretches of pyrimidines and purines (which will bind together)
- this happens for all mutation rates
WHY?: go to larger robustness by making long stretches of pyr/pur which force hairpin structure (clever!)

Two hairpins: fitness criterion is the product of the hairpin lengths
- now find different solution: on one hairpin the same solution as above
- on the other: oligo stretches of alternating pyr/pur stretches
- these two hairpins will therefore not interfere with each other, maintaining the two hairpins: clever!

This study with a real coding structure makes the issue more tangible
- what sequence looks like depends on drift to more robust structure
- sequences you might find might be the result of drift to more robust structure: evolutionary signature



So far we have looked at RNA structure as a paradigm for genotype - phenotype mapping:
- mainly because it is the most / only computable mapping

By looking at the high dimensional space we have seen:
- ruggedness
- neutrality: both local and global (more flat parts of landscape)

In this sense RNA is a nice example because:
- important molecule
- computable mapping

However: DISCLAIMER
e.g. 16s RNA and 23s RNA: minimum energy fold is quite far from conserved fold (Hofacker et al 2002)
- most helices are there
- but major differences in global basepairing: folding is actually more local than in minimum energy as we compute it

So how to do better?
- recognise global-local mistake
- study sequential folding
- more local oriented folding
- get phylogenetic information to find out what is most likely structure: most consistent folding over several species, energy contribution averaged over aligned sequences, compensatory mutations signal basepairing

How important is RNA world?
- recent advances show increasingly expanding role for RNA
- mapping of functional RNAs in phylogenies shows that they expand during evolution (Bompfünewerer et al?)
- not really replaced by proteins as such
- in metazoa: more and more rnas added, repertoires of rnas becomes larger (Hertel et al)
- 4% of DNA is functional RNAs



Constructed landscapes and landscapes as metaphor

Originally the landscape metaphor was nice to give an intuitive idea of what was going on (Sewall Wright), but it is also misleading
- misleading in 2D: no neutrality
- misleading in multi-D and only point mutations: there are many more mutational operators which make "nearness" in landscape hard to define.
What is closeness with respect to: crossover
Moreover duplications lead to a change in dimensions while a landscape is in a fixed number of dimensions
- constant fitness: in landscape metaphor the evolutionary pressures are kept constant, however normally they would change (SEA SCAPE doesn't really help as metaphor)



Now we want to make the connection:
CODING STRUCTURE to SPATIAL PATTERN FORMATION
both of which can leave evolutionary signatures in their own right.

With the main question: what kind of traces do evolutionary processes leave on evolved entities, which are:
- not there because they are functional
- not there for biochemical reasons
- but only because it is evolved in a certain process / way



Landscape studies:

NK landscapes (Kauffman): studied the effect of ruggedness and how it affects evolution
Main result: shouldn't be too rugged
(the results were previous to finding on neutrality)
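For reference, a minimal sketch of an NK landscape, assuming the common construction in which each of the N loci contributes a random value that depends on its own state and on the next K loci (cyclic neighbourhood). The seed, the sizes and the function names are arbitrary choices for illustration. Counting local optima by brute force shows how K tunes ruggedness.

```python
import itertools
import random


def make_nk(n, k, seed=0):
    """Return a fitness function for a random NK landscape on bit tuples."""
    rng = random.Random(seed)
    # One lookup table per locus: contribution for every state of the
    # locus itself plus its k neighbours.
    tables = [
        {bits: rng.random() for bits in itertools.product((0, 1), repeat=k + 1)}
        for _ in range(n)
    ]

    def fitness(genome):
        total = 0.0
        for i in range(n):
            key = tuple(genome[(i + j) % n] for j in range(k + 1))
            total += tables[i][key]
        return total / n

    return fitness


def count_local_optima(n, fitness):
    """Count genomes with no fitter one-bit mutant (exhaustive, small n only)."""
    count = 0
    for genome in itertools.product((0, 1), repeat=n):
        f = fitness(genome)
        mutants = (genome[:i] + (1 - genome[i],) + genome[i + 1:] for i in range(n))
        if all(fitness(m) <= f for m in mutants):
            count += 1
    return count


N = 8
smooth = count_local_optima(N, make_nk(N, 0))
rugged = count_local_optima(N, make_nk(N, N - 1))
print("local optima, K=0  :", smooth)
print("local optima, K=N-1:", rugged)
```

With K=0 every locus contributes independently, so there is a single peak; with K=N-1 the landscape is effectively random and has many local optima.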

Generally: such constructed landscapes are made with an a priori idea of what to study (in contrast to taking RNA and studying its landscape)



ROYAL ROAD

this landscape was developed in Holland's group to study the building block hypothesis:
- building blocks were considered ideal to evolution
- smooth landscape (no local optima only one): non-deceptive / getting stuck
- incorporated building blocks which help to give big jumps in fitness through cross-over
Generally genetic algorithms used cross-over and the building block idea to get fast evolution and spread of innovations
And cross-over seen as main mechanism to achieve this

Here: genotype is a bit string with block that only give fitness when they have a certain configuration
Made to prove effectiveness of cross-over (Melanie Mitchell)
main result: CROSS OVER DOESN'T HELP!
main reason: population always so converged that cross-over doesn't really make that much difference except at initial stages of evolution
(they didn't realise neutrality in this landscape)
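A minimal sketch of a Royal Road style fitness function, assuming that the rewarded configuration of a block is "all ones" and that every complete block contributes 1 (both are simplifying assumptions; the example genome is made up). The second function makes the neutrality in this landscape visible: it counts which single-bit mutations leave fitness unchanged.

```python
def royal_road_fitness(bits, block_size):
    """Number of blocks that are completely set to 1."""
    blocks = (bits[i:i + block_size] for i in range(0, len(bits), block_size))
    return sum(1 for block in blocks if all(block))


def neutral_fraction(bits, block_size):
    """Fraction of single-bit mutants with the same fitness as the original."""
    base = royal_road_fitness(bits, block_size)
    neutral = 0
    for i in range(len(bits)):
        mutant = bits[:i] + [1 - bits[i]] + bits[i + 1:]
        if royal_road_fitness(mutant, block_size) == base:
            neutral += 1
    return neutral / len(bits)


# Three blocks of four bits: one complete, one half filled, one empty.
genome = [1, 1, 1, 1,  1, 1, 0, 0,  0, 0, 0, 0]
print(royal_road_fitness(genome, 4))  # 1
print(neutral_fraction(genome, 4))    # 8 of the 12 point mutations are neutral
```

Only mutations in the completed block change fitness here; everything that happens inside incomplete blocks is invisible to selection until a block is finished.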

Nice example of a constructed landscape which was made to prove a point: i.e. the easiest case for the cross-over building block hypothesis
Therefore all the more powerful that this proves that it doesn't work!



But we can use the landscape to study "WHAT IS NEUTRAL"

van Nimwegen: population size and mutation rates in the Royal Road
- epochal evolution
- increase mutation: max fitness does go up, over information threshold: cannot keep fittest, but keep part of string (as far as it can get given mutation rate)
- goes as far as can with Darwinian selection: but Darwinian selection does work as far as possible
- also lower pop size: more stochasticity, earlier information threshold
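A back-of-the-envelope sketch of "goes as far as it can given the mutation rate". Assumptions, all mine and not taken from the notes: a string with n correct blocks has fitness 1 + n, blocks have K bits, the per-bit mutation rate is q, back mutations are ignored and the population is treated as infinite. A string keeps all its n correct blocks with probability (1-q)^(nK), so holding one block more only pays off if (n+2)(1-q)^K > n+1.

```python
def block_retention(q, block_size):
    """Probability that one correct block is copied without any mutation."""
    return (1.0 - q) ** block_size


def highest_stable_epoch(q, block_size, n_blocks):
    """Largest number of blocks that selection can still maintain.

    Rough criterion: going from n to n + 1 blocks is worthwhile only if
    (n + 2) * retention > (n + 1), i.e. the fitness gain outweighs the
    extra copying loss of one more block.
    """
    retention = block_retention(q, block_size)
    n = 0
    while n < n_blocks and (n + 2) * retention > (n + 1):
        n += 1
    return n


for q in (0.0, 0.01, 0.05, 0.1):
    print(q, highest_stable_epoch(q, block_size=10, n_blocks=20))
```

In this sketch a higher mutation rate lowers the number of blocks that can be kept, but the population still keeps part of the string. The effect of small population size (earlier threshold through stochasticity) is not captured by this infinite-population criterion.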

So what is being neutral: i.e. what fitness difference is neutral?
- neutral is being over the information threshold

Instead of flat is neutral => what does natural selection see as flat
- what fitness differences are not visible for selection
- what can be selected
- selection defines its own neutrality with respect to mutation rate etc



So in RNA evol we have seen: lambda increases and robustness comes for free
- robustness is an evolutionary signature
- So how can we see that increase in robustness in the population structure of natural populations?
Well, we should expect high variability per position for a robust solution (more neutral cases)
Is this true?

AVIDA (Adami et al) studies self-replicating programs in an attempt to define an artificial world for open-ended evolution.
- i.e. so far we have taken the process of replication for granted
- HERE: actually evolve process of replication in terms of copying computer code which can copy and evolve
- One drawback is that it is as hard to study as living systems: a lot of change and need to devise good observational techniques to see what is happening.

Here observed system by viewing variation in lines of code in the population over time (kind of 3D diagram).
What is seen:
- fitness increases over time
So what about variation:
- well population variability goes down during increases in fitness, i.e. selection and bottlenecks
- and only variation increases on neutral path after selection
- as fitness contributing locations increase: more code gets meaning and get less neutral (functional info)
- there is increasing robustness while being neutral
- so both reduction in variation due to selection and increase in variation due to neutrality are taking place together!
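One simple way to quantify "variation per position" is the Shannon entropy of each site across the population. A minimal sketch, assuming all sequences have equal length (which real AVIDA genomes need not have) and using a made-up four-sequence population; this is not AVIDA's own tooling.

```python
from collections import Counter
from math import log2


def per_site_entropy(population):
    """Shannon entropy (in bits) of each position across the population."""
    size = len(population)
    entropies = []
    for pos in range(len(population[0])):
        counts = Counter(seq[pos] for seq in population)
        entropies.append(-sum((c / size) * log2(c / size) for c in counts.values()))
    return entropies


# Made-up population: the first two sites are fixed, the last two vary.
population = ["ACGU", "ACGA", "ACCU", "ACCA"]
site_entropy = per_site_entropy(population)
print(site_entropy)
```

Sites with entropy near zero are candidates for fitness-contributing (functional) positions; sites with high entropy are candidates for neutral positions. Tracking these values over time gives a picture of variation shrinking during selective sweeps and growing again on the neutral path.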



CASE study of Multiple Coding

From viruses to mammals one sees multiple coding of biochemically totally unrelated functions:
example: tRNA
- extra constrained regions on 2 helices are conserved over phylogeny
- only in eukaryotes are they used also for polymerase binding => why? makes use of that conservation?

Model: 2 genes where gene A needs certain target sequence and gene B needs to just bind to A.
First evolve A towards the target, but with only a certain CRUCIAL stretch which is more important for fitness
This crucial stretch is moved in different simulations to make sure it is not a question of particular code.
Then start evolving B to bind to A.

Results:
Generally: binding to A evolves to be centered on crucial stretch: i.e. preferential evolution for certain coding, i.e. here most conserved.
But different results for different mutation rates!

High mutation rates:
- binding sequence as short as possible: small parts on both genes overlapping on crucial part
- this is multiple coding: even increases selection pressure on that code and hence, effective mutation rate lower, improves information capacity

Low mutation rates:
- matching on less crucial and conserved area: this has higher effective mutation rate in gene A relative to crucial stretch
- this allows higher effective mutation rate: faster matching, which is sought out by evolution!

So:
- shouldn't have too high mutation
- but also not too low: no optimization, and drift plays a larger role (Sewall Wright)
- we see that coding of information develops in such a way as to be fairly close to the information threshold (cf mutation rates of genomes of different lengths): here we see why it happens, coding adapts to mutation rate to affect mutation rate.

So what is happening in an evolutionary process?
- not just selection to get better
- but also adjusting coding in order to get better: coding length, selection pressure: adjust how to get as high as possible on Royal Road
- optimal mutation rate: higher for smaller sequence length: population size, selection pressure


(Note: Muller's ratchet is a consideration of the information threshold when population finiteness plays a role in mutations)



Breaking down the landscape

Mutational operators
- landscape requires clear nearness concept:
what happens with non-local mutations: cross-over?
what are the mutational dynamics? (attractors, domains of attraction, side-effects?)
so no a priori idea of dynamics but just defined as a dynamical system: outcome is not trivial in the sense of going somewhere: dynamical system gives some structure: what is it?

(continued later)



Breaking down the landscape (2)

Timescales
- walk on landscape was on a static landscape (i.e. evolution assumed to be on long time scale)

What we see in evolutionary simulation:
- millions of time steps!
- in biology in such timescales amazing changes occur!
climate change, geological periods, speciation
- so simulations appear out of sync

speciation: the newest mammal, the polar bear, is very recent, but several mammals have also remained very much the same.

When we think of a co-evolutionary process (e.g. host pathogen) time scale must be very short!
Only perhaps some processes don't depend on external environment that much (e.g. intrinsic process)
However in most cases the fitness landscape can't be the right metaphor!

In RNA world there is an intermediate concept: primary structure to secondary structure: where fitness may change.
So although fitness landscape concept does not apply: the RNA structure / energy landscape does apply. This in itself is nice in order to know how conclusions based on landscape apply when landscape is not there.



Experiment: RNA co-evolution

No secondary structure defined fitness, but two populations of RNAs:
host: needs to not match parasite
parasite: needs to match host
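The antagonistic fitness criteria can be sketched as follows. Note the assumption: "match" is implemented here as the fraction of identical positions between two equal-length sequences, purely as a stand-in; the notes do not say how matching was scored in the experiment. Function names and example sequences are hypothetical.

```python
def match(a, b):
    """Stand-in match score: fraction of positions where a and b agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def parasite_fitness(parasite, hosts):
    """A parasite does well if it matches at least one host closely."""
    return max(match(parasite, h) for h in hosts)


def host_fitness(host, parasites):
    """A host does well if no parasite matches it closely."""
    return 1.0 - max(match(host, p) for p in parasites)


hosts = ["ACGU", "GGGG"]
parasites = ["ACGA", "UUUU"]
for p in parasites:
    print("parasite", p, parasite_fitness(p, hosts))
for h in hosts:
    print("host    ", h, host_fitness(h, parasites))
```

Because each fitness depends on the current state of the other population, there is no fixed target and no fixed landscape: whatever scores well now is exactly what the other side is selected to escape from or catch up with.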

Outcome:
- red queen dynamics of coevolution
- also evolution to non-robustness! (in contrast to external fitness criterion)
why?
- parasite just wants to change as much as possible to be different: mutations should have large impact
- host needs to follow, but can't be as non-robust because needs to match somewhat (needs some info).
So neutrality decreases!

What does this mean for the biological context?
- robustness comes for free if optimum does not change in time
- but with need for change: opposite coding
- and we can get both in one genome



Sexual and Asexual "prey" in RNA

We saw that in the Royal Road cross-over doesn't help
So why crossover (or sex):
- much debated issue, which is not resolved: why add foreign genome

One thing clear: Hamilton
- crossover might be sensible for fast evolution (co-evolution)

So when RNA string co-evol placed in space with cross-over:
- at least sex remains in population, but only in co-evolution context, not in fixed external situation.


Integrating pattern formation, coding structure and evolution

In CA we see spatial pattern formation, which affects ecological and evolutionary processes. Higher order patterns arise by themselves and feed back on the evolution of replicators: they "enslave" the evolution of replicators. This is a two-way process whereby replicators interact to generate patterns which then feed back on the replicators.

On the other hand we have seen that replicators have a coding structure which leads to a particular genotype-phenotype landscape.
In organisms there is therefore a complex transformation from a code to an organism.

We can ask:
Q1: what is the best code? (smooth landscape?)
Q2: Given a code, how does evolution proceed, i.e. what evolutionary process? In RNA a smooth landscape is false, but appears much better for the evolutionary process, i.e. redundant coding and neutral paths.
Q3: given an evolutionary process, what kind of code evolves? i.e. can evolution select for certain kinds of coding because properties of the landscape are not the same everywhere? i.e. select a place in genotype space such that the landscape has certain properties. Moreover, is it true that evolutionary processes leave an evolutionary signature? i.e. something that does not affect the properties of a system but is merely a signal that shows it has evolved.

Evolution is mostly visualized as a process of point mutations, BUT duplication is actually a major factor in evolution (at least to the same extent as point mutations). Therefore the landscape metaphor breaks down with such mutations (and also with co-evolution, where fitness is relative to the system at a certain time point).

So what code evolves:
- evolution to fixed target: evolutionary process goes to robust solution
- coevolution: to where mutations have extra effect

Can we generalize on that idea by connecting:
- Pattern formation
- Coding structure (how they do what they do, not just what they do)
- Evolution

i.e. study CODING AND COPYING.