From: charlieb
Subject: Simple optimisation question.
Date: 
Message-ID: <c14qoi$1enh6l$1@ID-208832.news.uni-berlin.de>
I've written a mickey mouse genetic algorithm and I'm now looking to 
optimise it for the purposes of learning about efficient lisp. I used 
metering.cl by Mark Kantrowitz to find that the roulette selection 
function is where most (~50%) of the time and the consing (~70%) is 
taking place.

(defun roulette-select (pop-scores &optional (pop-size (length pop-scores)))
   (let ((roulette (random 1.0))
	(running-total 0.0))
     (do ((i 0 (1+ i)))
	((or (>= (incf running-total (svref pop-scores i)) roulette)
	     (>= i (1- pop-size))) i))))

Just to give you an idea of my level I was already able to make this run 
~twice as fast by changing (running-total 0) to (running-total 0.0). 
Looking at the function I found the most of it's time is spent accessing 
the array pop-scores. I tried various accessors (including not accessing 
the array at all for reference (is that a useful thing to do?)) and the 
results are:
                                                 Cons
                   %     %                       Per      Total   Total
Function          Time  Cons  Calls  Sec/Call   Call     Time    Cons
-----------------------------------------------------------------------
no ref
ROULETTE-SELECT:  0.08  0.08  40000  0.000022    199  0.881267  7945716
svref
ROULETTE-SELECT:  0.51  0.71  40000  0.000276   5108  11.03586 204314176
aref
ROULETTE-SELECT:  0.53  0.71  40000  0.000291   5152  11.64674 206082000
elt
ROULETTE-SELECT:  0.52  0.70  40000  0.000294   5168  11.75690 206701328
rationalize
ROULETTE-SELECT:  0.58  0.57  40000  0.000360   2927  14.39069 117075972

pop-scores is an array of rationals that is let in another function
(temp-scores (make-array pop-size :element-type 'rational))
so I tried (let (roulette (rationalize (random 1.0)) ... and that helped 
the consing but not the time.

Any suggestions/comments (I'm not into reading machine or byte code (yet!))

Thanks
Charlieb

From: Joe Marshall
Subject: Re: Simple optimisation question.
Date: 
Message-ID: <isi1n6ad.fsf@ccs.neu.edu>
charlieb <··@privacy.net> writes:

> I've written a mickey mouse genetic algorithm and I'm now looking to
> optimise it for the purposes of learning about efficient lisp. I used
> metering.cl by Mark Kantrowitz to find that the roulette selection
> function is where most (~50%) of the time and the consing (~70%) is
> taking place.
>
> (defun roulette-select (pop-scores &optional (pop-size (length pop-scores)))
>    (let ((roulette (random 1.0))
> 	(running-total 0.0))
>      (do ((i 0 (1+ i)))
> 	((or (>= (incf running-total (svref pop-scores i)) roulette)
> 	     (>= i (1- pop-size))) i))))
>

> pop-scores is an array of rationals that is let in another function
> (temp-scores (make-array pop-size :element-type 'rational))
> so I tried (let (roulette (rationalize (random 1.0)) ... and that
> helped the consing but not the time.

This is consing because running-total is a floating point number and
pop-scores is an array of rationals.  At every iteration, the rational
is pulled out, converted to a float (which conses), and added to
running-total.

One possibility here is to try to avoid the conversion to floating
point by staying with rationals, but dealing with the numerator and
denominator separately.  This may avoid the consing, but it obviously
makes the algorithm considerably hairier.  What sort of rationals are
in the pop-scores?
From: charlieb
Subject: Re: Simple optimisation question.
Date: 
Message-ID: <c1599d$1eo9si$1@ID-208832.news.uni-berlin.de>
Joe Marshall wrote:

> charlieb <··@privacy.net> writes:
> 
> 
>>I've written a mickey mouse genetic algorithm and I'm now looking to
>>optimise it for the purposes of learning about efficient lisp. I used
>>metering.cl by Mark Kantrowitz to find that the roulette selection
>>function is where most (~50%) of the time and the consing (~70%) is
>>taking place.
>>
>>(defun roulette-select (pop-scores &optional (pop-size (length pop-scores)))
>>   (let ((roulette (random 1.0))
>>	(running-total 0.0))
>>     (do ((i 0 (1+ i)))
>>	((or (>= (incf running-total (svref pop-scores i)) roulette)
>>	     (>= i (1- pop-size))) i))))
>>
> 
> 
>>pop-scores is an array of rationals that is let in another function
>>(temp-scores (make-array pop-size :element-type 'rational))
>>so I tried (let (roulette (rationalize (random 1.0)) ... and that
>>helped the consing but not the time.
> 
> 
> This is consing because running-total is a floating point number and
> pop-scores is an array of rationals.  At every iteration, the rational
> is pulled out, converted to a float (which conses), and added to
> running-total.
> 
> One possibility here is to try to avoid the conversion to floating
> point by staying with rationals, but dealing with the numerator and
> denominator separately.  This may avoid the consing, but it obviously
> makes the algorithm considerably hairier.  What sort of rationals are
> in the pop-scores?

They tend to be quite small but they also tend to have the same 
denominator as they are normalised. It should have occurred to me before 
but that does suggest a small but significant modification. I could, as 
you say, do the entire thing integers (fixnums?). Thank-you for your 
suggestion.

Charlie.
From: charlieb
Subject: Re: Simple optimisation question.
Date: 
Message-ID: <c15bl8$1d981c$1@ID-208832.news.uni-berlin.de>
Result!

                                                 Cons
                   %     %                       Per      Total   Total
Function          Time  Cons  Calls  Sec/Call   Call     Time    Cons
-----------------------------------------------------------------------
ROULETTE-SELECT:  0.37  0.42  40000  0.000118   1315  4.716783  52595252

(defun roulette-select (pop-scores &optional
				   (pop-size (length (cdr pop-scores))))
   (let ((roulette (random (car pop-scores)))
	(scores (cdr pop-scores))
	(running-total 0))
     (do ((i 0 (1+ i)))
	((or (>= (incf running-total (svref scores i)) roulette)
	     (>= i (1- pop-size))) i))))

pop-scores was a simple array of normalised scores as rationals it's now 
a cons of the total of all the scores and a simple array of the 
non-normalised scores.

Cheers
Charlie.
From: Jorge Tavares
Subject: Re: Simple optimisation question.
Date: 
Message-ID: <c156ij$7d1$1@rena.mat.uc.pt>
charlieb wrote:

> I've written a mickey mouse genetic algorithm and I'm now looking to 
> optimise it for the purposes of learning about efficient lisp. I used 
> metering.cl by Mark Kantrowitz to find that the roulette selection 
> function is where most (~50%) of the time and the consing (~70%) is 
> taking place.

[...]

> Any suggestions/comments (I'm not into reading machine or byte code (yet!))

Yes, change from roulette selection to tournament selection. Results 
will improve as well as the algorithm speed.


Best regards,
Jorge

-- 
Jorge Tavares

"An engineer is someone who can do for ten shillings
  what any fool can do for a pound." - Neville Shute
From: charlieb
Subject: Re: Simple optimisation question.
Date: 
Message-ID: <c1597b$1dnvei$1@ID-208832.news.uni-berlin.de>
Jorge Tavares wrote:
> charlieb wrote:
> 
>> I've written a mickey mouse genetic algorithm and I'm now looking to 
>> optimise it for the purposes of learning about efficient lisp. I used 
>> metering.cl by Mark Kantrowitz to find that the roulette selection 
>> function is where most (~50%) of the time and the consing (~70%) is 
>> taking place.
> 
> 
> [...]
> 
>> Any suggestions/comments (I'm not into reading machine or byte code 
>> (yet!))
> 
> 
> Yes, change from roulette selection to tournament selection. Results 
> will improve as well as the algorithm speed.
> 
> 
> Best regards,
> Jorge
> 
Tournament selection? ...
Pick n genes and choose the two best to breed?
From: charlieb
Subject: Re: Simple optimisation question.
Date: 
Message-ID: <c1ctmb$1hm6fk$1@ID-208832.news.uni-berlin.de>
Jorge Tavares wrote:

> charlieb wrote:
> 
>> I've written a mickey mouse genetic algorithm and I'm now looking to 
>> optimise it for the purposes of learning about efficient lisp. I used 
>> metering.cl by Mark Kantrowitz to find that the roulette selection 
>> function is where most (~50%) of the time and the consing (~70%) is 
>> taking place.
> 
> 
> [...]
> 
>> Any suggestions/comments (I'm not into reading machine or byte code 
>> (yet!))
> 
> 
> Yes, change from roulette selection to tournament selection. Results 
> will improve as well as the algorithm speed.
> 
> 
> Best regards,
> Jorge
> 
Yes, again a significant improvement just using the quick hack:

(defun tournament-select (pop-scores &optional
			     (pop-size (length (cdr pop-scores)))
			     (tournament-size 5))
   (let ((best-i 0)
	(best-score 0)
	(scores (cdr pop-scores)))
     (dotimes (tour tournament-size best-i)
       (let* ((i (random pop-size))
	     (i-score (svref scores i)))
	(when (> i-score best-score)
	  (setq best-i i
		best-score i-score))))))

Thanks for the suggestion. I'm looking into other selection methods now 
but I think they will require me to re-jig my code a little as the 
selection functions I wrote return only one parent and e.g. Stochastic 
universal sampling returns a mating population.

Cheers
Charlie.
From: Jorge Tavares
Subject: Re: Simple optimisation question.
Date: 
Message-ID: <c1du0i$262$1@rena.mat.uc.pt>
Hi,

charlieb wrote:

> Yes, again a significant improvement just using the quick hack:
> 
> (defun tournament-select (pop-scores &optional
>                  (pop-size (length (cdr pop-scores)))
>                  (tournament-size 5))
>   (let ((best-i 0)
>     (best-score 0)
>     (scores (cdr pop-scores)))
>     (dotimes (tour tournament-size best-i)
>       (let* ((i (random pop-size))
>          (i-score (svref scores i)))
>     (when (> i-score best-score)
>       (setq best-i i
>         best-score i-score))))))
> 
> Thanks for the suggestion. I'm looking into other selection methods now 
> but I think they will require me to re-jig my code a little as the 
> selection functions I wrote return only one parent and e.g. Stochastic 
> universal sampling returns a mating population.

Sorry for the late response, I was away for the weekend.

Yes, tournament selection is what you've implemented. You fill your new 
population by running a tournament of size K (usually 3 or 5) with 
random selected individuals. The best one is copied into the new 
population where later recombination and mutation is applied.

I think this is a good method for two reasons: it is computational cheap 
and it helps preventing premature convergence. Of course this last 
reason is always dependent on your problem but from my experience (and 
others) tournament selection usually is a very good choice.

Another thing I've remembered while reading your code. I don't keep two 
lists, one with the individuals and another with the fitness values. I 
prefer to represent the individual as a structure and then I use a 
single array where I keep all of my individuals (the population).


Best Regards,
Jorge

-- 
Jorge Tavares

"An engineer is someone who can do for ten shillings
  what any fool can do for a pound." - Neville Shute
From: charlieb
Subject: Re: Simple optimisation question.
Date: 
Message-ID: <c1f91b$1gi08t$1@ID-208832.news.uni-berlin.de>
Jorge Tavares wrote:
> Hi,
> 
> charlieb wrote:
> 
<snip>
> 
> Sorry for the late response, I was away for the weekend.
Me too, but you were away so you wouldn't know that ;)
> 
> Yes, tournament selection is what you've implemented. You fill your new 
> population by running a tournament of size K (usually 3 or 5) with 
> random selected individuals. The best one is copied into the new 
> population where later recombination and mutation is applied.
I'm slightly confused by this. Maybe I should post more code ...
You seem to be implying that I should be creating some kind of 
intermediate population of selected individuals before I breed/mutate 
them. This isn't what I'm doing (explicitly). I pick two tournament 
winners and immediatelly breed/mutate them.

(defun select-and-breed (pop scores
			     &optional (mutate-prob 0.01)
					(breed-prob 0.75))
   (let ((parent-1 (aref pop (tournament-select scores)))
	(parent-2 (aref pop (tournament-select scores))))
;    (dbg :select "~%Selected ~S, ~S" parent-1 parent-2)
     (when (<= (random 1.0) breed-prob)
       (multiple-value-bind (child-1 child-2)
	  (crossover parent-1 parent-2)
	(setq parent-1 (mutate child-1 mutate-prob)
	      parent-2 (mutate child-2 mutate-prob))))
     (values parent-1 parent-2)))


> 
> I think this is a good method for two reasons: it is computational cheap 
> and it helps preventing premature convergence. Of course this last 
> reason is always dependent on your problem but from my experience (and 
> others) tournament selection usually is a very good choice.
> 
> Another thing I've remembered while reading your code. I don't keep two 
> lists, one with the individuals and another with the fitness values. I 
> prefer to represent the individual as a structure and then I use a 
> single array where I keep all of my individuals (the population).
> 
I decided that passing indexes around was OK
e.g. (aref pop (tournament-select scores))

> 
> Best Regards,
> Jorge
> 

Cheers,
Charlieb
From: Jorge Tavares
Subject: Re: Simple optimisation question.
Date: 
Message-ID: <c1gkng$hmn$1@rena.mat.uc.pt>
Hi!

>> Yes, tournament selection is what you've implemented. You fill your 
>> new population by running a tournament of size K (usually 3 or 5) with 
>> random selected individuals. The best one is copied into the new 
>> population where later recombination and mutation is applied.
> 
> I'm slightly confused by this. Maybe I should post more code ...
> You seem to be implying that I should be creating some kind of 
> intermediate population of selected individuals before I breed/mutate 
> them. This isn't what I'm doing (explicitly). I pick two tournament 
> winners and immediatelly breed/mutate them.

Well, this is the way I normally do but it doesn't mean yours is 
incorrect or mine is a better one. It depends more on what you want to 
do with your algorithm. If the intention is just to implement a GA, use 
and it's done, then your approach might be better because it is simpler 
to code. In a way, we can say it is tuned to the problem at hand.

I prefer working with intermediate populations because this way I can 
further explore the evolutionary algorithm I am developing. For example, 
  it is simpler to implement different crossover operators and test 
them. Imagine that you want a multi-parent crossover, where the number 
of parents is > 2, or if you want to apply a different number of 
mutation operators, or if you want to apply more techniques like 
learning, or simply to introduce mechanisms to study population 
diversity in different stages, etc. I find the algorithm (code) to be 
more flexible this way, which is crucial for my research.

As an example, a few months ago I started to work on a problem and I had 
some ideas in how to use genetic algorithms with it. In a single 
afternoon I developed a simple GA framework in Lisp and before dinner I 
  was able to explore 5 different GAs (coding and running some 
experiments) and decided which one should I explore in more detail. :)


>> Another thing I've remembered while reading your code. I don't keep 
>> two lists, one with the individuals and another with the fitness 
>> values. I prefer to represent the individual as a structure and then I 
>> use a single array where I keep all of my individuals (the population).
>>
> I decided that passing indexes around was OK
> e.g. (aref pop (tournament-select scores))

It is the same reasoning as before. For example, it is simpler to 
transform a GA into a Self-Adaptive GA, or to use more complex 
representations, if the individuals are structures (or classes). But 
really, it all depends on what you want to do. :)

Cya,
Jorge

-- 
Jorge Tavares

"An engineer is someone who can do for ten shillings
  what any fool can do for a pound." - Neville Shute
From: charlieb
Subject: Re: Simple optimisation question.
Date: 
Message-ID: <c1keg3$1jqfpa$1@ID-208832.news.uni-berlin.de>
Thank-you for your interesting and detailed reply. Are there any books 
on this subject that you would recommend to a beginner?

Cheers
Charlieb.
From: Jorge Tavares
Subject: Re: Simple optimisation question.
Date: 
Message-ID: <c1n9sh$cn8$1@rena.mat.uc.pt>
Hi,

charlieb wrote:
> Thank-you for your interesting and detailed reply. Are there any books 
> on this subject that you would recommend to a beginner?

To the best of my knowledge I think the only book that talks about Lisp 
and Evolutionary Algorithms is "Genetic Programming" by John Koza, but 
like the title says, it is only about Genetic Programming.

EC books for beginners I would recommed "Introduction to Evolutionary 
Computation", by A. E. Eiben and J. E. Smith. In spite of it's age, 
"Genetic Algorithms in Search, Optimization and Machine Learning", by 
David Goldberg is still a must.

Books about Lisp is a common subject in this newsgroup, so you can 
easily find good recomendations if you google for previous posts.


Jorge

-- 
Jorge Tavares

"An engineer is someone who can do for ten shillings
  what any fool can do for a pound." - Neville Shute
From: charlieb
Subject: Re: Simple optimisation question.
Date: 
Message-ID: <c1ng9j$1j1c0i$1@ID-208832.news.uni-berlin.de>
Jorge Tavares wrote:

> Books about Lisp is a common subject in this newsgroup, so you can 
> easily find good recomendations if you google for previous posts.

Thanks, I did mean just evolutionary computing. I already have the lisp 
books I need.

Cheers
Charlie.