GAGP Tutorial 2

icon

3

pages

icon

English

icon

Documents

Écrit par

Publié par

Le téléchargement nécessite un accès à la bibliothèque YouScribe Tout savoir sur nos offres

icon

3

pages

icon

English

icon

Documents

Le téléchargement nécessite un accès à la bibliothèque YouScribe Tout savoir sur nos offres

NAT Tutorial 31. [Schema theorem] A population consists of the following strings. The probability of crossover is 0.75 and the probability of mutation is 0.1. How many instances of the schema *0***0 would you expect in the next population? [From 2005 exam paper.]String Fitness100100 20001000 20110111 30100101 20100010 10Answer: Here, we should apply the schema theorem, as given in Lecture 4. For *0***0, there are 3 instances, so m(H,t) = 3. The average fitness of these two i s50/3, i.e. 16.6. So we get: E(m(*0***0,t+1)) = 4 x 16.6/20 = 5/6 Note that this does not yet include the disruptive effects of crossover and mutation, so this only covers the first bit of the formula. Defining length is 4, so there is a chance of 0.75 (1-4/5) that the schema survives crossover. Order is 2, so with probability (1-0.1)^2 it survives mutation. Taken together we have E(m(*0***0,t+1)) = 3 *65*/0.15 *0.81 = 0.3Moral: The schema theorem predicts that this schema may not be present in the new generation. Note, however, that the schema theorem gives only a lower bound. The recombination may occur between individuals that both carry the schema or there is a chance that the fourth individual is mutated into this schema. Therefore, probably there will be one or two instances of the schema left. The most drastic effect is here by the crossover probability and the relatively high order of the schema.Note also that high fitness schemata like the ones in the third ...
Voir icon arrow

Publié par

Langue

English

NAT Tutorial 3
1. [Schema theorem] A population consists of the following strings. The probability of
crossover is 0.75 and the probability of mutation is 0.1. How many instances of the
schema *0***0 would you expect in the next population? [From 2005 exam paper.]
String
Fitness
100100
20
001000
20
110111
30
100101
20
100010
10
Answer:
Here, we should apply the schema theorem, as given in Lecture 4.
For *0***0, there are 3 instances, so m(H,t) = 3. The average fitness of these two is
50/3, i.e. 16.6. So we get:
E(m(*0***0,t+1)) = 4 x 16.6/20 = 5/6
Note that this does not yet include the disruptive effects of crossover and mutation, so
this only covers the first bit of the formula. Defining length is 4, so there is a chance of
0.75 (1-4/5) that the schema survives crossover. Order is 2, so with probability (1-
0.1)^2 it survives mutation. Taken together we have
E(m(*0***0,t+1)) = 3 *5/6*0.15 *0.81 = 0.3
Moral: The schema theorem predicts that this schema may not be present in the new
generation. Note, however, that the schema theorem gives only a lower bound. The
recombination may occur between individuals that both carry the schema or there is a
chance that the fourth individual is mutated into this schema. Therefore, probably there
will be one or two instances of the schema left. The most drastic effect is here by the
crossover probability and the relatively high order of the schema.
Note also that high fitness schemata like the ones in the third individual realising there
chances on the cost of the other ones and spread quickly through the population, low
fitness ones are quickly destroyed.
2. If
f(****) = e
0
f(***
1
) = e
0
+ e
1
f(**
1
*) = e
0
+ e
2
f(**
11
) = e
0
+ e
1
+ e
2
+ e
3
(these equations DEFINE
e
0
, . . ., e
3
), then what is
f(**
01
)
in terms of
e
0
, . . ., e
3
?
Answer:
here, f is the average fitness of the schema over all actual members of the set.
So:
f(***1) = (f(**11) + f(**01)) / 2
Why is this?
Because (from Q3) ***1 (8 members) is the union of **11 (4 members) and **01 (4
members), so the average fitness of ***1 is the average of the average fitnesses of **11
and **01.
Rearrange this equation:
2 x f(***1) = f(**11) + f(**01)
So:
f(**01) = 2 x f(***1) - f(**11)
= 2.e0 + 2.e1 - (e0 + e1 + e2 + e3)
= e0 + e1 - e2 - e3
3. [Selection mechanisms] Investigate whether binary tournament selection (i.e.
tournament size 2) is equivalent to linear ranking selection (i.e. selection in which the
fittest of
N
gets
N
chances, the next-fittest gets (
N-
1) chances, etc., and the least fit gets
one chance). Tournament selection is selection where
m
individuals are chosen
randomly from the population and the best
n
of those m are selected. So in binary
tournament selection
m =
2 and
n =
1.
Answer:
Possible argument, courtesy of Peter Ross: in tournament selection, how
would it pick the
K
-th ranked? By picking that one (at random) and also picking one
that is
K
-th *or lower* ranked. Consider the chance of that: what you get is exactly
linear ranking selection.
Check this argument over by trying out this real example:
C1: fitness = 4
C2: fitness = 3
C3: fitness = 2
C4: fitness = 1
There are 16 possible pairs: (C1,C1), (C1,C2), (C1,C3), (C1,C4),
(C2,C1), (C2,C2), (C2,C3), (C2,C4),
(C3,C1), (C3,C2), (C3,C3), (C3,C4),
(C4,C1), (C4,C2), (C4,C3), (C4,C4).
Of these, 7 pick select C1, 5 select C2, 3 select C3 and 1 selects C4. So, this is a linear
ranking selection, but the number of chances is in fact equal to 2R-1 where R is the
position of the candidate in reverse rank order (least fit = 1, next = 2,... most fit =
N
).
4. [Baldwin effect] It has been observed that some organisms seem to pass on behaviours
learned during their lifetime to their offspring. Lamarck’s hypothesis was that traits
acquired during the lifetime of an individual could somehow be passed on genetically to
the individual’s children. However, since there is no obvious biological mechanism for
this, Lamarck’s hypothesis is universally rejected.
One proposal for a non-Lamarckian mechanism explaining the passing on of learned
behaviours was given by Baldwin, who pointed out that if learning helps survival, then
the organisms best able to learn will have the most offspring. Further, if the
environment remains constant, so that the best things to learn remain constant, then this
can lead, via selection, to a genetic encoding of a trait that previously had to be learned.
Describe how you could use evolutionary computation as a model system to
demonstrate the truth (or otherwise) of Baldwin’s hypothesis.
Answer:
In the lecture the paper by Nowlan and Hinton was mentioned with gives an
example (with random search instead of a GA, but this does not make a difference in
the all-ones problem as we have seen before). In addition, any GA that includes local
search can be interpreted as an example for Baldwinian evolution if the result of the
local search is used to edit the individual's genetic code.
5. (Mitchell) Design a fitness function (in terms of schemas, as in
R
1
, see 4. Lecture,
1/10/10) on which you believe the GA should outperform RMHC. Test your hypothesis
numerically.
Answer:
You need to make sure that if there are some individuals with several good
blocks then there is still a chance for new block to be discovered and to survive (i.e.
compete well with the hitchhikers), e.g. a royal road could be built here by a fitness
function that assigns to two blocks less than twice the fitness of one block. In terms of
R
1
this would require actually to slightly punish double or triple blocks.
A trivial solution would be to run RMHC as a local heuristic to the GA and to hope that
not only the parallelism but also the recombination in the GA leads to a clear
advantange (i.e. a speed-up even beyond the linear speed-up due to the population size).
6. (Mitchell) Design a three bit fully deceptive fitness function. “Fully deceptive” means
that the average fitness of every schema indicates that the complement of the global
optimum is actually the global optimum. For example, if 111 is the global optimum, any
schema containing 000 should have the highest fitness in its partition.
Answer:
One possibility is (1-x)+(1-y)+(1-z)+4*xyz, where x, y, z are the three bits.
Except if there are only ones switching a one to a zero increases the fitness. Consider
the ability to fly as an example: In order to fly the individual needs simultaneously:
wings, a good brain for control, specific muscles, some stiffening of the body etc. All
these features have a cost (negative fitness points), only if all of them are there and the
phenotype can actually lift off then a fitness increase is reached. Evolution will thus
have to use properties that have already proved useful (but perhaps in a different
context) instead of waiting for a very lucky mutation of several bits at the same time.
This is an important point in constructing fitness functions in GA.
7. How can elitism be achieved in an application of GA to multi-objective optimisation?
Visualise the situation for the case of two fitness functions. What issues can be expected
to arise for a large numbers of objectives?
Answer:
The simple answer is to spare the first front. If the algorithm is already in the
converging phase this might slow down the progress (no “killer instinct”) such that one
may rather keep only one a few individuals from the first front. The algorithm discussed
in the lecture, assigns a distance function to the individuals which may be used for this
subsection from the first front, i.e. only those with the largest distances will survive. On
the other hand, does this particular algorithm keep the full parent generation together
with a (mutated) children population and performs selection on this double population
such that elitism is already included.
For a large number of objectives there may be very many (most) individuals on the first
Pareto front (which is of dimension
m
-1 for
m
fitness functions) such that there is not
much competition. Either one has to work with very large populations or one has to
think about selecting a few fitness functions for the non-dominating search while the
other ones are combined by some idea about there priority. Discuss parallels to natural
evolution.
Voir icon more
Alternate Text