GA rules in Lisp

From: Newbie
Subject: GA rules in Lisp
Date: Tue, 20 Feb 2001 02:17:23 +0000
Message-ID: <96sk3k$cal$1@usenet01.srv.cis.pitt.edu>

Hi folks,

I'm working on a GA homework that deals with learning rules from real-valued
data, and I'm writing the program in Lisp. I'd like to be able to represent
my entire gene as a string of bits. This means that I need to convert from a
Lisp float, probably using something like one's complement.

1) Is this the right way to solve this problem?
2) If it is, are there existing functions/libraries that I should know about
that would help to do this? (Seems like an awfully silly thing to code up by
hand.)

Thanks,
JT

Re: GA rules in Lisp Bob Whitefield
- Re: GA rules in Lisp Steve McGrew
  - Re: GA rules in Lisp Bob Whitefield
    - Re: GA rules in Lisp Steve McGrew
      - Re: GA rules in Lisp Bob Whitefield
        Re: GA rules in Lisp Steve McGrew
Re: GA rules in Lisp Andrew Cooke
Re: GA rules in Lisp Eugene Zaikonnikov

From: Bob Whitefield
Subject: Re: GA rules in Lisp
Date: Tue, 20 Feb 2001 03:58:23 +0000
Message-ID: <3A91EB5F.9C299141@modeldesign.com>

JT,

Early GA implementations used bit string representation, which makes
crossover and mutation simple, but there's little else to recommend it.
Use a native float representation for your gene values and you'll avoid
the issues of conversion, how many bits to use, Gray codes, etc. It's
also far more efficient.

Crossover and mutation operators are a bit more complex for native
representation. For crossover, pick a random cut point within the gene,
then blend the two parent gene values to produce the offspring value. If
p1 and p2 are two float values from the parents, then the child value
will be:

c = f * p1 + (1-f) * p2

where f is a random fraction [0, 1) representing the cut point. This
basically averages the parent values, weighted according to the position
of the cut point.

For mutation, calculate a random float in the range [-m, m], where m is
the maximum mutation value, and add it to the gene value. An m value of
one tenth the overall gene range usually works well. Ensure that the
resulting value falls within the gene range.

Bob

Newbie wrote:
> 
> Hi folks,
> 
> I'm working on a GA homework that deals with learning rules from real-valued
> data, and I'm writing the program in Lisp. I'd like to be able to represent
> my entire gene as a string of bits. This means that I need to convert from a
> Lisp float, probably using something like one's complement.
> 
> 1) Is this the right way to solve this problem?
> 2) If it is, are there existing functions/libraries that I should know about
> that would help to do this? (Seems like an awfully silly thing to code up by
> hand.)
> 
> Thanks,
> JT

From: Steve McGrew
Subject: Re: GA rules in Lisp
Date: Tue, 20 Feb 2001 14:16:42 +0000
Message-ID: <3a927b7c.1362598@nntp.iea.com>

On Mon, 19 Feb 2001 22:58:23 -0500, Bob Whitefield
<···@modeldesign.com> wrote:

[snip]
>Crossover and mutation operators are a bit more complex for native
>representation. For crossover, pick a random cut point within the gene,
>then blend the two parent gene values to produce the offspring value. If
>p1 and p2 are two float values from the parents, then the child value
>will be:
>
>c = f * p1 + (1-f) * p2
>
>where f is a random fraction [0, 1) representing the cut point. This
>basically averages the parent values, weighted according to the position
>of the cut point.
>
>For mutation, calculate a random float in the range [-m, m], where m is
>the maximum mutation value, and add it to the gene value. An m value of
>one tenth the overall gene range usually works well. Ensure that the
>resulting value falls within the gene range.

There are other approaches as well with their own advantages (and, I
suppose, disadvantages).  Generator, for example, does not do
recombination *inside* a gene; it picks crossover points at the
boundaries of genes so only whole genes are exchanged in
recombination/crossover.  Gene values change only via mutation.
Generator also uses a mutation operator that is quasi-Gaussian: it is
most likely to choose new gene values that are close to the current
value, but the width of the Gaussian peak can be selected by the user.

Steve

From: Bob Whitefield
Subject: Re: GA rules in Lisp
Date: Tue, 20 Feb 2001 21:47:28 +0000
Message-ID: <3A92E5F0.A5FEC645@modeldesign.com>

Steve McGrew wrote:
> Generator, for example, does not do
> recombination *inside* a gene; it picks crossover points at the
> boundaries of genes so only whole genes are exchanged in
> recombination/crossover.  Gene values change only via mutation.

Let me give an example where using a blending crossover within genes can
make a significant difference. Find the minimum of this function:

f(x,y) = x sin(4x) + 1.1 y sin(2y)
where 0 <= x <= 10, 0 <= y <= 10

This is a very bumpy function with many local minima. To solve it, I
created a chromosome with two genes encoded as floats, with values
constrained between 0 and 10. Fitness is the function above; the
objective is to minimize it. Population size is 500.

First, I solved it without a blending algorithm, i.e., crossover cuts
were allowed only between gene values. Here are the results averaged
over ten runs:

Evaluations	770642
Crossovers	731649
Mutations	38493
Improvements	18.7
Elapsed time	62.7 sec

Evaluations is the number of times the fitness function was executed.
Improvements is the number of times a new best member was created via
crossover or mutation during the run.

The overall best fitness was -18.5547210729247, found on the second run.
These results are fairly good, accurate to 8-10 significant digits
compared to the correct value. Average error was 1.947E-07.

Next I enabled the blending algorithm, so crossover cuts can now occur
within a gene. Average results over ten runs:

Evaluations	17267
Crossovers	15918
Mutations	849
Improvements	37.2
Elapsed time	1.35 sec

Best fitness for all ten runs were identical, correct to 15 significant
digits (-18.5547210773827). Blending crossover improved accuracy by 5-7
significant figures. It also executed 45 times faster on average.

The reason blending crossover works so much better is because it
produces offspring with new gene values. Without blending, crossover
only recombines alleles already present in the population. Without new
alleles, the population quickly converges, leaving only blind mutation
to introduce new values, a very slow process.

Some people mistakenly assume binary encoding works better than
continuous parameters, because they don't use an appropriate crossover
operator. Binary crossover does at least provide intra-gene blending,
but conversion and Gray coding overhead can kill performance.

Of course this was an extreme example, a chromosome with only two genes.
For problems with many genes, and/or genes with a limited alphabet, the
advantages of blending crossover are less dramatic.

I would be interested in results others might get for this problem.

Bob

Steve McGrew wrote:
> 
> On Mon, 19 Feb 2001 22:58:23 -0500, Bob Whitefield
> <···@modeldesign.com> wrote:
> 
> [snip]
> >Crossover and mutation operators are a bit more complex for native
> >representation. For crossover, pick a random cut point within the gene,
> >then blend the two parent gene values to produce the offspring value. If
> >p1 and p2 are two float values from the parents, then the child value
> >will be:
> >
> >c = f * p1 + (1-f) * p2
> >
> >where f is a random fraction [0, 1) representing the cut point. This
> >basically averages the parent values, weighted according to the position
> >of the cut point.
> >
> >For mutation, calculate a random float in the range [-m, m], where m is
> >the maximum mutation value, and add it to the gene value. An m value of
> >one tenth the overall gene range usually works well. Ensure that the
> >resulting value falls within the gene range.
> 
> There are other approaches as well with their own advantages (and, I
> suppose, disadvantages).  Generator, for example, does not do
> recombination *inside* a gene; it picks crossover points at the
> boundaries of genes so only whole genes are exchanged in
> recombination/crossover.  Gene values change only via mutation.
> Generator also uses a mutation operator that is quasi-Gaussian: it is
> most likely to choose new gene values that are close to the current
> value, but the width of the Gaussian peak can be selected by the user.
> 
> Steve

From: Steve McGrew
Subject: Re: GA rules in Lisp
Date: Wed, 21 Feb 2001 14:15:23 +0000
Message-ID: <3a93cdcc.30061440@nntp.iea.com>

Bob,
	Thank you for including enough information about your test
problem and results to do a direct comparison!

	I ran your problem on Generator and got very fast results:

population 100

directional hillclimb mutation 5 steps max

mutation size 100% of range, with range broken into cells of size
 .01 x .01

crossover activated

	After 20 generations (less than 25000 evaluations) the best
fitness achieved was  -18.55458459,    with x = 9.04 and y = 8.67

	I don't think this proves much, even though this result was
slightly better than yours.  All it shows is that there are many paths
to the peak of the mountain, and the view is pretty much the same once
you get there.

Steve


On Tue, 20 Feb 2001 16:47:28 -0500, Bob Whitefield
<···@modeldesign.com> wrote:

>Steve McGrew wrote:
>> Generator, for example, does not do
>> recombination *inside* a gene; it picks crossover points at the
>> boundaries of genes so only whole genes are exchanged in
>> recombination/crossover.  Gene values change only via mutation.
>
>Let me give an example where using a blending crossover within genes can
>make a significant difference. Find the minimum of this function:
>
>f(x,y) = x sin(4x) + 1.1 y sin(2y)
>where 0 <= x <= 10, 0 <= y <= 10
>
>This is a very bumpy function with many local minima. To solve it, I
>created a chromosome with two genes encoded as floats, with values
>constrained between 0 and 10. Fitness is the function above; the
>objective is to minimize it. Population size is 500.
>
>First, I solved it without a blending algorithm, i.e., crossover cuts
>were allowed only between gene values. Here are the results averaged
>over ten runs:
>
>Evaluations	770642
>Crossovers	731649
>Mutations	38493
>Improvements	18.7
>Elapsed time	62.7 sec
>
>Evaluations is the number of times the fitness function was executed.
>Improvements is the number of times a new best member was created via
>crossover or mutation during the run.
>
>The overall best fitness was -18.5547210729247, found on the second run.
>These results are fairly good, accurate to 8-10 significant digits
>compared to the correct value. Average error was 1.947E-07.
>
>Next I enabled the blending algorithm, so crossover cuts can now occur
>within a gene. Average results over ten runs:
>
>Evaluations	17267
>Crossovers	15918
>Mutations	849
>Improvements	37.2
>Elapsed time	1.35 sec
>
>Best fitness for all ten runs were identical, correct to 15 significant
>digits (-18.5547210773827). Blending crossover improved accuracy by 5-7
>significant figures. It also executed 45 times faster on average.
>
>The reason blending crossover works so much better is because it
>produces offspring with new gene values. Without blending, crossover
>only recombines alleles already present in the population. Without new
>alleles, the population quickly converges, leaving only blind mutation
>to introduce new values, a very slow process.
>
>Some people mistakenly assume binary encoding works better than
>continuous parameters, because they don't use an appropriate crossover
>operator. Binary crossover does at least provide intra-gene blending,
>but conversion and Gray coding overhead can kill performance.
>
>Of course this was an extreme example, a chromosome with only two genes.
>For problems with many genes, and/or genes with a limited alphabet, the
>advantages of blending crossover are less dramatic.
>
>I would be interested in results others might get for this problem.

From: Bob Whitefield
Subject: Re: GA rules in Lisp
Date: Wed, 21 Feb 2001 17:03:32 +0000
Message-ID: <3A93F4E4.492B9009@modeldesign.com>

Steve McGrew wrote:
>         After 20 generations (less than 25000 evaluations) the best
> fitness achieved was  -18.55458459,    with x = 9.04 and y = 8.67
> 
>         I don't think this proves much, even though this result was
> slightly better than yours.

Steve,

I think you'll see my result is actually more accurate. The goal is to
find the minimum:

-18.55458459       (your result, x=9.04 y=8.67)
-18.5547210773827  (my result, x=9.0389916 y=8.66818896)
  0.0001364873827  (difference)

I'm not trying to criticize Generator--it's a fine product--I just
wanted to show how powerful a real-valued GA can be when you use a
blending crossover. Also note that my implementation doesn't use
hill-climbing techniques, only selection, crossover and mutation.

Bob

p.s. You used a population of 100, while I used 500. Out of curiousity,
I just ran the problem again with 100 members. Here are the averages
over 20 runs:

Evaluations    167746
Mutations      8850
Improvements   74.05
Elapsed time   13.6 sec

In all runs, the best result was still accurate to 15 digits, but the
smaller population increased execution time by a factor of ten, on
average. I imagine Generator would also show better performance with a
larger population.


Steve McGrew wrote:
> 
> Bob,
>         Thank you for including enough information about your test
> problem and results to do a direct comparison!
> 
>         I ran your problem on Generator and got very fast results:
> 
> population 100
> 
> directional hillclimb mutation 5 steps max
> 
> mutation size 100% of range, with range broken into cells of size
>  .01 x .01
> 
> crossover activated
> 
>         After 20 generations (less than 25000 evaluations) the best
> fitness achieved was  -18.55458459,    with x = 9.04 and y = 8.67
> 
>         I don't think this proves much, even though this result was
> slightly better than yours.  All it shows is that there are many paths
> to the peak of the mountain, and the view is pretty much the same once
> you get there.
> 
> Steve
> 
> On Tue, 20 Feb 2001 16:47:28 -0500, Bob Whitefield
> <···@modeldesign.com> wrote:
> 
> >Steve McGrew wrote:
> >> Generator, for example, does not do
> >> recombination *inside* a gene; it picks crossover points at the
> >> boundaries of genes so only whole genes are exchanged in
> >> recombination/crossover.  Gene values change only via mutation.
> >
> >Let me give an example where using a blending crossover within genes can
> >make a significant difference. Find the minimum of this function:
> >
> >f(x,y) = x sin(4x) + 1.1 y sin(2y)
> >where 0 <= x <= 10, 0 <= y <= 10
> >
> >This is a very bumpy function with many local minima. To solve it, I
> >created a chromosome with two genes encoded as floats, with values
> >constrained between 0 and 10. Fitness is the function above; the
> >objective is to minimize it. Population size is 500.
> >
> >First, I solved it without a blending algorithm, i.e., crossover cuts
> >were allowed only between gene values. Here are the results averaged
> >over ten runs:
> >
> >Evaluations    770642
> >Crossovers     731649
> >Mutations      38493
> >Improvements   18.7
> >Elapsed time   62.7 sec
> >
> >Evaluations is the number of times the fitness function was executed.
> >Improvements is the number of times a new best member was created via
> >crossover or mutation during the run.
> >
> >The overall best fitness was -18.5547210729247, found on the second run.
> >These results are fairly good, accurate to 8-10 significant digits
> >compared to the correct value. Average error was 1.947E-07.
> >
> >Next I enabled the blending algorithm, so crossover cuts can now occur
> >within a gene. Average results over ten runs:
> >
> >Evaluations    17267
> >Crossovers     15918
> >Mutations      849
> >Improvements   37.2
> >Elapsed time   1.35 sec
> >
> >Best fitness for all ten runs were identical, correct to 15 significant
> >digits (-18.5547210773827). Blending crossover improved accuracy by 5-7
> >significant figures. It also executed 45 times faster on average.
> >
> >The reason blending crossover works so much better is because it
> >produces offspring with new gene values. Without blending, crossover
> >only recombines alleles already present in the population. Without new
> >alleles, the population quickly converges, leaving only blind mutation
> >to introduce new values, a very slow process.
> >
> >Some people mistakenly assume binary encoding works better than
> >continuous parameters, because they don't use an appropriate crossover
> >operator. Binary crossover does at least provide intra-gene blending,
> >but conversion and Gray coding overhead can kill performance.
> >
> >Of course this was an extreme example, a chromosome with only two genes.
> >For problems with many genes, and/or genes with a limited alphabet, the
> >advantages of blending crossover are less dramatic.
> >
> >I would be interested in results others might get for this problem.

From: Steve McGrew
Subject: Re: GA rules in Lisp
Date: Wed, 21 Feb 2001 18:25:49 +0000
Message-ID: <3a93f783.3277221@nntp.iea.com>

On Wed, 21 Feb 2001 12:03:32 -0500, Bob Whitefield
<···@modeldesign.com> wrote:

>I think you'll see my result is actually more accurate. The goal is to
>find the minimum:
>
>-18.55458459       (your result, x=9.04 y=8.67)
>-18.5547210773827  (my result, x=9.0389916 y=8.66818896)
>  0.0001364873827  (difference)

Oops.  You're right.   I think your program is set up to use
higher-precision gene values than Generator is.

>I'm not trying to criticize Generator--it's a fine product--I just
>wanted to show how powerful a real-valued GA can be when you use a
>blending crossover. Also note that my implementation doesn't use
>hill-climbing techniques, only selection, crossover and mutation.

>p.s. You used a population of 100, while I used 500. Out of curiousity,
>I just ran the problem again with 100 members. Here are the averages
>over 20 runs:
>
>Evaluations    167746
>Mutations      8850
>Improvements   74.05
>Elapsed time   13.6 sec
>
>In all runs, the best result was still accurate to 15 digits, but the
>smaller population increased execution time by a factor of ten, on
>average. I imagine Generator would also show better performance with a
>larger population.

Actually, Generator runs a lot faster with a smaller population but is
more likely to get hung up on a local optimum.  So I usually start
with a large population (100) and gradually shrink it down to about 5
or 10.  This tends to filter out the most likely sections of search
space and then focus on them.

Steve

From: Andrew Cooke
Subject: Re: GA rules in Lisp
Date: Tue, 20 Feb 2001 09:06:17 +0000
Message-ID: <3A923389.68D6C277@andrewcooke.free-online.co.uk>

You're using Lisp, not C, so why represent the gene as numeric values at
all - you can directly manipulate code.  Obviously this won't work in
every scenario, but it's a neat approach (IMHO).

For example, I looked at genetic algorithms for generating rhythms (as
in music).  I implemented a tiny language in Lisp that could generate
rhythms.  I then treated randomly generated programs in this language as
genes and "bred" them by intermingling them.  This sounds complex, but
in Lisp all you are doing is combining two lists to make a third - when
a language has no clear separation between code and data there's no
longer a clear separation between genetic algorithms and genetic
programming.

For more info see
http://www.andrewcooke.free-online.co.uk/jara/rytmo/index.html

Andrew

PS Happy to answer questions, but away on holiday soon! :-)

Newbie wrote:
> 
> Hi folks,
> 
> I'm working on a GA homework that deals with learning rules from real-valued
> data, and I'm writing the program in Lisp. I'd like to be able to represent
> my entire gene as a string of bits. This means that I need to convert from a
> Lisp float, probably using something like one's complement.
> 
> 1) Is this the right way to solve this problem?
> 2) If it is, are there existing functions/libraries that I should know about
> that would help to do this? (Seems like an awfully silly thing to code up by
> hand.)
> 
> Thanks,
> JT

From: Eugene Zaikonnikov
Subject: Re: GA rules in Lisp
Date: Tue, 20 Feb 2001 16:38:05 +0000
Message-ID: <6yy9v1ux76.fsf@viking.cit>

* "JT" == Newbie  <····@hotmail.com> writes:

JT>  I'd like to be able to represent my entire gene as a string of
JT>  bits. This means that I need to convert from a Lisp float,
JT>  probably using something like one's complement.

Storing bits in float is not very good idea.

(make-array 16 :element-type 'bit) will give you bit-vector of size
16, which is object to all standard sequence and array manipulations.

If you still need for some reasons to pack bits into numbers, use
integers. There are some standard functions and accessors defined over
integers which are useful for bitwise manipulations (LDB, DPB and
friends).

For further details consult the chapters 12, 15 and 17 of the HyperSpec.

-- 
  Eugene