Encoding the Genotype
We need to fill in some items.
First, how should the genotype be encoded so we can subject it to mutation and crossover?
What is the definition of a random genotype?
A random program?
is a "string" written in an alphabet of 4 "characters"
A, T, G and C (the 4 nucleobases).
Other mechanisms of inheritance
may have existed before DNA,
and other mechanisms may exist elsewhere in the universe.
Simplest chromosome for machine evolution: String of bits 0 or 1.
If chromosome is binary string of length n,
this can encode an integer from 0 to 2n-1.
Divide by 2n-1
to get real number x in the interval 0 to 1.
We can normalise previous number x to real number in any interval a to b
a + x (b-a).
Can decode it to any real number in range
REAL_MIN to REAL_MAX.
- Or could use standard
- IEEE floating point encoding.
(Different implementations use different numbers of bits.)
- Graphics showing different IEEE float implementations
Remember, it is 1-way:
We decode a chromosome to a number (or something else).
We never encode a number as a chromosome.
This genotype can be interpreted
as representing an integer in range INT_MIN to INT_MAX.
Or it can be interpreted
as representing a real number in range REAL_MIN to REAL_MAX.
Or it can be interpreted
as representing text (program code).
Decoding to minus infinity to plus infinity
Given a real number y from 0 to 1,
we can normalise this to any real number x from minus infinity to plus infinity
by taking the inverse of a very linear
(which encodes plus/minus infinity to 0-1).
For example setting:
y = 1 / (1 + exp(-x/5))
and solving for x.
Q. An actual straight line is no good for this. Why?
Say we encode numbers as binary:
Note that to get from 3 to 4 we need a massive mutation (of all 3 bits).
This can lead to a "barrier" that it is difficult for the GA to pass over.
are an attempt to encode numbers such that to go to the next number
always entails only 1 bit change.
This rule works:
"Start with all 0's for 0.
For each successive number, successively flip the rightmost bit that produces
a new string."
You could say now, a single bit change will change from 7 to 0.
But we had before anyway a single bit change from 4 to 0.
And indeed 7 to 3, 8 to 0, etc.
Anyway, a small mutation leading to a large (and probably bad) change
is not such a problem as a small (desirable) continuous change
needing a large mutation.
Evolution has a way of "jumping" barriers anyway.
e.g. In original coding, 3 can't get to 4,
but it jumps to 7, and then declines to 4.
3 can get to 4 via 1-bit mutations
in 3 generations.
Or can it? What if 7 is really unfit?
i.e. Fit is 3 and 4 only.
Then the 7 mutations die immediately.
It could get stuck on local minimum, but often you have to be unlucky not to be able to jump barrier somehow.
Bitstrings v. Regular language implementation
Evolving bitstrings is popular,
but not compulsory.
Consider where a system is defined by 100 parameters:
- 20 parameters are real numbers 0 to 1.
- 20 parameters are real numbers 0 to 1000.
- 20 parameters are integers 0 to 1000.
- and so on
You can have two approaches:
- Store in regular language implementations.
Mutation is flexible:
Different amounts for different parameters.
- 20 real numbers. Need to define randomise method so they start in 0 to 1. Mutation means (say) plus/minus a real number between 0 and 0.1. Have to check for out of bounds.
- 20 real numbers. Need to define randomise method so they start in 0 to 1000. Mutation means (say) plus/minus a real number between 0 and 100. Have to check for out of bounds.
- 20 integers. Need to define randomise method so they start in 0 to 1000. Mutation means (say) plus/minus an integer between 0 and 100. Have to check for out of bounds.
- and so on
Crossover does not generate new parameters:
would probably take some parameters from one parent, rest of parameters from other parent.
Would not cut in middle of a parameter.
Rely on mutation for new parameters.
- Store bitstrings.
All parameters, say, stored back-to-back on one bitstring.
Need method to decode a bitstring to the 100 parameters (not easy).
Will involve all the bounds-checking above.
Once we have defined that method, other methods are easy:
Mutation is not flexible:
Same rate for whole genotype.
- Randomise method - bitstring is random 0s and 1s.
- Mutation method - flip some percentage of the bits.
- Crossover method - start of one bitstring, end of another.
(Maybe multiple cuts.)
Crossover can generate new parameters:
Crossover might cut in middle of a parameter.
Get a quite different child parameter.
Either approach has something to recommend it.
People evolve many different data structures.