genpro


 * **__NOTE: THE INFORMATION ON THIS PAGE IS NO LONGER PART OF THE COURSE (removed from main wiki, 2014-2015)__**

Prev: Gene regulation networks

TODO LIST
 * REFS John Holland's attempts
 * CLARIFY: Koza's list

=Genetic Programming=

So far, most of the evolutionary simulations we have looked at are still quite constrained in their evolutionary potential, so a more flexible coding is desirable. For this purpose Koza (1992) developed genetic programming. We know from the RNA world that the genotype-phenotype mapping is highly complex, and we have shown the importance of neutral evolutionary dynamics. In genetic algorithms genes have a predefined meaning and fitness consequences, but the meaning of a gene duplication is undefined. Genetic programming therefore strives for a system that is implemented without too much predefined meaning, but with an interesting genotype-phenotype mapping.

Genetic programming (Koza 1992) represents genomes as computer code and allows mutational operators to expand and contract this code, which gives a great deal of flexibility. For this Koza used the LISP programming language developed by John McCarthy. This language is special because:
 * it has the power of a general programming language (generality and power)
 * there is no a priori distinction between the program and the data the program is run on
 * a program is a nested list: evaluating the list (+ 7 5) gives 12. Each list is delimited by parentheses and contains an operator and atoms (or further lists), and this nesting can be represented as a tree structure
 * it has one requirement: each list should do something, so (% 7 0) needs to be defined in order to get closure of the system
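The nested-list structure and the closure requirement can be sketched in Python (not LISP), with nested tuples standing in for lists. This is only an illustration: the names `evaluate`, `protected_div` and `FUNCTIONS` are hypothetical, and letting `%` return 1.0 on division by zero is an assumption made here (one common convention for a "protected" division), not something stated above.

```python
import operator

def protected_div(a, b):
    # Assumption: division by zero returns 1.0 so that every list evaluates
    # to a value (closure). Other conventions are possible.
    return a / b if b != 0 else 1.0

# Operators allowed at the head of a list (all binary in this sketch).
FUNCTIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "%": protected_div,
}

def evaluate(expr, env=None):
    """Evaluate a nested-tuple expression such as ("+", 7, 5)."""
    env = env or {}
    if isinstance(expr, tuple):        # a list: operator followed by arguments
        op, *args = expr
        return FUNCTIONS[op](*(evaluate(arg, env) for arg in args))
    if isinstance(expr, str):          # an atom naming a variable
        return env[expr]
    return expr                        # a numeric atom

print(evaluate(("+", 7, 5)))                                   # 12
print(evaluate(("%", 7, 0)))                                   # 1.0
print(evaluate(("*", ("+", "x", 1), ("%", 7, 0)), {"x": 2}))   # 3.0
```

Because the program is itself a plain data structure (a tuple of tuples), the same object can be evaluated as code or manipulated as data, which is the property the genetic operators rely on.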

So in genetic programming, LISP gives the primary coding structure. This is expanded by adding genetic operators: point mutations, branch swapping, branch copying and deletion.
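The four operators can be sketched on nested-tuple trees as follows. This is a minimal illustration under stated assumptions, not Koza's implementation: the function and terminal sets, all helper names, and the reading of "branch swapping" as an exchange of branches between two genomes are choices made here.

```python
import random

FUNCTIONS = ["+", "-", "*", "%"]   # assumed function set, all binary
TERMINALS = ["x", "y", 1, 2]       # assumed terminal set

def subtree_paths(tree, path=()):
    """Return the index path to every node of a nested-tuple tree."""
    paths = [path]
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            paths.extend(subtree_paths(child, path + (i,)))
    return paths

def get(tree, path):
    """Return the subtree found by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def point_mutation(tree, rng):
    """Change one operator or one atom; the shape of the tree is unchanged."""
    path = rng.choice(subtree_paths(tree))
    node = get(tree, path)
    if isinstance(node, tuple):
        new = (rng.choice(FUNCTIONS),) + node[1:]   # new operator, same children
    else:
        new = rng.choice(TERMINALS)
    return replace(tree, path, new)

def branch_swap(tree_a, tree_b, rng):
    """Exchange one random branch between two genomes."""
    pa = rng.choice(subtree_paths(tree_a))
    pb = rng.choice(subtree_paths(tree_b))
    return (replace(tree_a, pa, get(tree_b, pb)),
            replace(tree_b, pb, get(tree_a, pa)))

def branch_copy(tree, rng):
    """Copy one branch over another position; the genome may grow."""
    src = rng.choice(subtree_paths(tree))
    dst = rng.choice(subtree_paths(tree))
    return replace(tree, dst, get(tree, src))

def branch_delete(tree, rng):
    """Replace a random branch by a single atom; the genome may shrink."""
    path = rng.choice(subtree_paths(tree))
    return replace(tree, path, rng.choice(TERMINALS))

rng = random.Random(0)
genome = ("+", ("*", "x", 2), ("%", "y", 1))
print(point_mutation(genome, rng))
print(branch_copy(genome, rng))
print(branch_delete(genome, rng))
```

Note that every operator returns a tree built from the same functions and atoms, so the result is always a valid program. Only branch copying and deletion change the genome length.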

==History==
In the 1950s John Holland tried the same approach, but it was not feasible given the computer speeds of the time, so in the 1960s he came up with genetic algorithms. In the 1990s Koza (1992) was able to implement genetic programming thanks to increased computer speed. He focused on "classic" genetic algorithm (artificial intelligence) problems and approached them with genetic programming.

One example is the classic AI problem of two intertwined spirals (one drawn with squares, one with circles). The evolved code has to identify the pattern, i.e. decide to which spiral each point belongs. This is an incredibly difficult problem for neural networks in AI. In genetic programming, if one starts out with sin X and sin Y functions and random genomes, and applies mutation and selection, then recognition evolves quickly, though with some caveats:
 * at generation 5 the system classifies points using stripes, not spirals
 * later: the stripes become wiggly, indicating that the match is becoming better
 * at the end: all points are correctly classified!
 * the evolved LISP code is very long (not simple)
 * the code recognizes points, but doesn't really recognize spirals
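To make the selection criterion concrete, the classification task can be sketched as a fitness function that scores a candidate program by the fraction of points it assigns to the right spiral. Everything specific here is an assumption for illustration: the spiral shape, the number of points, the sign convention for the two classes, and the `stripes` classifier, which stands in for an early genome built from sin X.

```python
import math

def two_spirals(n_per_spiral=97, turns=3.0):
    """Points on two interleaved spirals; the second is the first rotated by 180 degrees."""
    points = []
    for i in range(n_per_spiral):
        angle = turns * 2 * math.pi * i / (n_per_spiral - 1)
        radius = 0.5 + angle                      # radius grows with the angle
        x, y = radius * math.cos(angle), radius * math.sin(angle)
        points.append((x, y, +1))                 # e.g. the "circles" spiral
        points.append((-x, -y, -1))               # e.g. the "squares" spiral
    return points

def fitness(classifier, points):
    """Fraction of points whose spiral is identified correctly.

    A positive classifier output is read as spiral +1, anything else as spiral -1.
    """
    correct = sum(1 for x, y, label in points
                  if (classifier(x, y) > 0) == (label > 0))
    return correct / len(points)

def stripes(x, y):
    # Hypothetical early genome: depends on x only, so it cuts the plane
    # into vertical stripes instead of following the spirals.
    return math.sin(x)

points = two_spirals()
print(len(points), fitness(stripes, points))
```

A fitness of this kind only rewards getting the given points right. Nothing in it rewards recognizing the spiral as such, which is consistent with the last caveat in the list above.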

So although the evolutionary challenge has been solved, it has been solved with a horribly complex function, even after sanitizing redundant rules (i.e. leaving out rules like + 0.00000000000000001). However, when such rules are removed during evolution, evolution is less successful at finding a solution! In other words, the redundancy provides some neutrality, which plays a role in evolution. Moreover, evolution produces an **overfitted** solution. This is the general type of solution when evolution is directed at a fixed problem.
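What "sanitizing" a redundant rule means can be sketched on nested-tuple expressions. This is an illustrative assumption about one kind of redundancy only (adding a negligible constant); the threshold `EPS` and the names `sanitize`, `is_negligible` and `size` are hypothetical.

```python
EPS = 1e-12   # assumption: constants smaller than this count as redundant

def is_negligible(node):
    """True for a numeric atom too small to matter."""
    return isinstance(node, (int, float)) and abs(node) < EPS

def sanitize(expr):
    """Remove additions of negligible constants, e.g. ("+", e, 1e-17) -> e."""
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr                       # binary operators only
    left, right = sanitize(left), sanitize(right)
    if op == "+":
        if is_negligible(right):
            return left
        if is_negligible(left):
            return right
    return (op, left, right)

def size(expr):
    """Number of nodes (operators and atoms) in the expression."""
    if not isinstance(expr, tuple):
        return 1
    return 1 + sum(size(arg) for arg in expr[1:])

genome = ("*", ("+", "x", 1e-17), ("+", ("+", "y", 0.0), 2))
clean = sanitize(genome)
print(clean)                      # ('*', 'x', ('+', 'y', 2))
print(size(genome), size(clean))  # 9 5
```

The two genomes compute essentially the same function, so a mutation that adds or removes such a rule is neutral for selection. The longer genome nevertheless offers more positions where later mutations can act, which is one way to read the observation that sanitizing during evolution makes the search less successful.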

==Koza's list==
From his study of genetic programming, Koza came up with a list of issues he found to be important:
 * Correctness: how correct is correct?
   * to solve correctly
   * what level of correctness
   * adding redundant features is, for all practical purposes, still correct. For organisms the issue is not correctness but survival, and redundant features do not affect their survival. (What is neutral for selection determines what is neutral relative to the information threshold.)
 * Consistency: contradictory approaches (?)
 * Certainty: anything can happen, nothing is guaranteed!
 * Orderliness: uncoordinated process, no supervision (?)
 * Parsimony:
   * (vs evolvability) evolved systems are not usually parsimonious?
   * theoretical biology: we need to solve things in a simple way, but organisms need not solve things in a simple way! This is a real challenge for theoretical biology. So what to do? We can mimic organisms, but then there is no hope of making extra predictions. Or we can use simple rules and assume that by studying paradigm systems we can get to know something.
   * this example shows that even simple rules give results that are not parsimonious, so the discrepancy between theoretical and experimental parsimony arises here as well.
 * Decisiveness: termination