Dr. Mark Humphrys

School of Computing. Dublin City University.

Home      Blog      Teaching      Research      Contact

My big idea: Ancient Brain


CA114      CA170

CA668      CA669      Projects

Encoding the Genotype

Recall our rough Genetic Algorithm and our definitions. We need to fill in some items.

First, how should the genotype be encoded so we can subject it to mutation and crossover? Binary bitstrings? Program code?

What is the definition of a random genotype? A random program?

Binary Chromosome

In nature, DNA is a "string" written in an alphabet of 4 "characters" A, T, G and C (the 4 nucleobases). Other mechanisms of inheritance may have existed before DNA, and other mechanisms may exist elsewhere in the universe.

Simplest chromosome for machine evolution: String of bits 0 or 1.

Remember, it is 1-way: We decode a chromosome to a number (or something else). We never encode a number as a chromosome.


This genotype can be interpreted as representing an integer in range INT_MIN to INT_MAX.
Or it can be interpreted as representing a real number in range REAL_MIN to REAL_MAX.
Or it can be interpreted as representing text (program code).

Decoding to minus infinity to plus infinity

Given a real number y from 0 to 1, we can normalise this to any real number x from minus infinity to plus infinity by taking the inverse of a very linear sigmoid function (which encodes plus/minus infinity to 0-1).

For example setting:

  y = 1 / (1 + exp(-x/5))
and solving for x.

Q. An actual straight line is no good for this. Why?

Gray codes

Say we encode numbers as binary:

0 000
1 001
2 010
3 011
4 100
5 101
6 110
7 111
Note that to get from 3 to 4 we need a massive mutation (of all 3 bits). This can lead to a "barrier" that it is difficult for the GA to pass over.

Gray codes are an attempt to encode numbers such that to go to the next number always entails only 1 bit change. This rule works: "Start with all 0's for 0. For each successive number, successively flip the rightmost bit that produces a new string."

0 000
1 001
2 011
3 010
4 110
5 111
6 101
7 100
You could say now, a single bit change will change from 7 to 0. But we had before anyway a single bit change from 4 to 0. And indeed 7 to 3, 8 to 0, etc.

Anyway, a small mutation leading to a large (and probably bad) change is not such a problem as a small (desirable) continuous change needing a large mutation.

Jumping barriers

Evolution has a way of "jumping" barriers anyway. e.g. In original coding, 3 can't get to 4, but it jumps to 7, and then declines to 4.

3 can get to 4 via 1-bit mutations in 3 generations.

Or can it? What if 7 is really unfit? i.e. Fit is 3 and 4 only. Then the 7 mutations die immediately. It could get stuck on local minimum, but often you have to be unlucky not to be able to jump barrier somehow.

Bitstrings v. Regular language implementation

Evolving bitstrings is popular, but not compulsory.

Consider where a system is defined by 100 parameters:

You can have two approaches:

  1. Store in regular language implementations.
    • 20 real numbers. Need to define randomise method so they start in 0 to 1. Mutation means (say) plus/minus a real number between 0 and 0.1. Have to check for out of bounds.
    • 20 real numbers. Need to define randomise method so they start in 0 to 1000. Mutation means (say) plus/minus a real number between 0 and 100. Have to check for out of bounds.
    • 20 integers. Need to define randomise method so they start in 0 to 1000. Mutation means (say) plus/minus an integer between 0 and 100. Have to check for out of bounds.
    • and so on
    Mutation is flexible: Different amounts for different parameters.
    Crossover does not generate new parameters: Crossover would probably take some parameters from one parent, rest of parameters from other parent. Would not cut in middle of a parameter. Rely on mutation for new parameters.

  2. Store bitstrings.
    All parameters, say, stored back-to-back on one bitstring. Need method to decode a bitstring to the 100 parameters (not easy). Will involve all the bounds-checking above.
    Once we have defined that method, other methods are easy:
    • Randomise method - bitstring is random 0s and 1s.
    • Mutation method - flip some percentage of the bits.
    • Crossover method - start of one bitstring, end of another. (Maybe multiple cuts.)
    Mutation is not flexible: Same rate for whole genotype.
    Crossover can generate new parameters: Crossover might cut in middle of a parameter. Get a quite different child parameter.

Either approach has something to recommend it.
People evolve many different data structures.

ancientbrain.com      w2mind.org      humphrysfamilytree.com

On the Internet since 1987.

Wikipedia: Sometimes I link to Wikipedia. I have written something In defence of Wikipedia. It is often a useful starting point but you cannot trust it. Linking to it is like linking to a Google search. A starting point, not a destination. I automatically highlight in red all links to Wikipedia and Google search and other possibly-unreliable user-generated content.