From: Peter Dudey
Subject: Free program: GALE (long--includes description)
Date:
Message-ID: <C89u92.GI5@willamette.edu>
The program explained below is now available. Send email to ·······@sfu.ca if
you are interested.
--
! Peter Dudey, 11 kyu, Lisp SubGuru and Purveyor of Fine Cheeses >FINGER ME< !
! Tobacco is the leading cause of death in the U.S. | More mass transit! !
! Darwin was right. | There is no valid use for handguns or nuclear weapons. !
! Please mail me plastic spaceships: 257 NE 13th Street, Salem, OR 97301 !
An Evolutionary Program to Mimic Language Acquisition:
GALE (Genetic Acquisition of a Language of Exchange).
-------------------------------------------------------------------------------
It's free, and you can do what you want with it, but:
1) Keep my name on it, and
2) I'm not responsible for anything you do to or with it. Don't sue me, I
don't have any money.
-Peter Dudey
-------------------------------------------------------------------------------
GALE was written by Peter Dudey in CLISP. CLISP was written by Bruno Haible and
Michael Stoll, and is available via anonymous ftp from
ma2s2.mathematik.uni-karlsruhe.de. The source code accompanies this file. It
will, of course, run much faster if compiled; since compiled code will depend
on the platform, that is left to the user.
This was in response to a brief thread that emerged on comp.ai.genetic about
modeling human language evolution after an idea by Paul Harrald.
On a vt100 monitor GALE displays a grid, each element of the grid can
contain either a ``resource'' or a ``Critter.'' There are three types of
resource, they are represented by the non-alphanumeric objects of a comma (,),
a bar (_), or a dot (.). A ``Critter'' is one of the main actors in GALE.
Critters are small neural networks. They are represented by alphabetic
characters, either upper or lower case or at-signs (@). The Critters move
randomly to adjacent
grid points (in Manhattan fashion). If the grid point contains a resource, the
Critter picks it up. If there is another Critter adjacent they may attempt to
communicate. The reason for communication is that the individual resources are
of no value in themselves, but a combination of a comma, bar and dot can be
used to produce one `widget' (what else?). A widget endows a Critter with 50
units of `lifespan'---it can exist for 50 more moves. The resources are
`manna from heaven' in that they appear randomly in unoccupied grid points
from time to time.
When two Critters meet they attempt to communicate by uttering a sound from
a fixed vocabulary. The sound they utter is intended to convey two things:
what they have that is available and what they want in exchange. They listen
to one another, try to guess the intention of the other, and then offer what
they think the other wants---literally present the item. It
might be complained that the Critters can simply see what the other has, and
hence a possible trade is manifest: no need for language. We have in mind that
carry around the resources is very arduous, and that they are hidden in a
secret archive known only to the Critters. They then wander around, bumping
into the others. Once they guess at the meaning of the other's utterance, and
if they think there is a trade to be made, they go to their respective
archives and retrieve what they think is appropriate (if they think there is
the possibility of trade). When they reconvene, they then expectantly present
what they have, and are either pleased or disappointed by what the other has.
If they have successfully interpreted one another (and have something to
trade) a trade takes place, otherwise it does not.
As the critters move around they are trying to associate a word utterance
with an idea. There are six basic ``ideas'' the Critters would like to convey;
``I will give you a dot for a comma''
``I will give you a dot for a bar''
``I will give you a comma for a dot''
``I will give you a comma for a bar''
``I will give you a bar for a dot''
``I will give you a bar for a comma''
There are six sounds that a Critter can utter:
``Fnord'' (See Robert Anton Wilson's _Illuminatus!_ trilogy)
``Flugblogs'' (See damn near anything by Bob French)
``Bar'' (The rest are standard LISP gibberish variables)
``Baz''
``Foo''
``Quux''
Each critter assigns 36 integer-valued weights between the six ideas and the
six words. When the critter wishes to convey a certain idea, it utters the
word connected to that idea with greatest weight. Similarly, when a Critter
hears a particular word, it associates an idea on the basis of greatest
weight.
The Critters alter the weights as experience dictates. Periodically two
Critters, when adjacent will mate, producing a genetic crossbreed Critter that
is placed at a random, empty point.
GALE keeps track of many statistics to illustrate the evolution of common
interpretations of words (or not)---later.
Representing the Critters
The agents in GALE are comprised of two parts: a `brain' and `genes.' The
brain and genes are stored as an array, the 36 weights between the ideas and
words. The brain can `learn' by increasing weights in response to correct
interpretation, and decreasing weights in response to an incorrect
interpretation.
Initially, and at birth, a critter's brain and it's genes are identical.
Over time, the brain learns via interaction, but the genes (naturally) remain
the same.
Movement and interaction.
The initial grid is randomly seeded with critters and resources, randomly
placed in the ``world.'' Each critter moves to one of the adjacent grid
points: if the critter is located at grid point (x,y) it will move to grid
points , (x,y+1), (x+1,y), (x-1,y), (x,y-1), with equal probability.
Three things can happen when a critter takes up a new position: it can
find a resource, it can find nothing, or it can become adjacent to another
critter. Naturally, the first two are mutually exclusive events. A critter who
finds a resource that enables it to produce a widget does so immediately.
When two critters become adjacent, they ``interact.'' The nature of an
interaction depends upon the critters' current stores of resources. A critter
with two identical resources (and nothing else) will want to trade one of them
for either of the two resources it does not have in its possession. This
applies to a critter with a resource of one type and two of another, in which
case, however, it will want to acquire the third type. Since critters always
manufacture (and consume) widgets immediately, they can have at most two
resource types in their possession. There is no limit on the quantity of
resources the critters can hoard. Generally, then, a critter has a particular
quantity of one resource, and a particular quantity of some other. If either
of these quantities is greater than two, the critter would like to trade.
Attempts to trade
It is highly unlikely that trade will take place unless there is
successful communication. When two critters meet, and if they want to make a
trade, they must choose a word to represent their desires. An idea is
represented as a dotted pair, of the form (resource1 . resource2). The
notation (resource1 . resource2) indicates a desire to swap a unit of
resource1 for a unit of resource2, this constitutes an `idea'. There is no
quibbling about prices. A critter who does not wish to trade simply says
nothing. Should either critter do this, that is the end of the interaction.
This leaves the case in which both critters have uttered a word. Recall, the
word that is uttered is the one with greatest link to the idea that the
critter wishes to convey. Having exchanged this initial utterance, the
critters then must decide on an interpretation of what the other said---again
based on which idea has greatest weight associated with the word the other
chose. If a critter interprets the other's utterance as the converse of its
own (that is, it thinks they can trade), it will make the same utterance
again. If the critter thinks (given its interpretation of the other) that the
other does not want to trade, then a new offer is made---a new word, if the
critter has a secondary trade it would like to make. Both critters now make
the offers indicated by their second utterances. If they match, there is a
trade.
Double-Quadruple-coincidence
It can be seen that trades will be quite a rare occurrence. A prerequisite is
the incidence of double-coincidence of wants in the traditional sense. With
only three goods, this is not all that unlikely. Notice that there can be only
one feasible trade between any two critters with two resources, but if one
critter has more than one unit of just one resource, then there are two
possible trades with another critter with more than one unit of the other two
resources. In order that a trade take place the critters must then use the
same words to indicate the two ideas to be conveyed. A further coincidence is
required: the critters must be adjacent before anything can happen. Given
this, we would not expect to see communication adapt very quickly. If GALE
illustrates one thing, it is that if language is an emergent phenomenon, it
certainly was not very rapid in its emergence. Yet, by evolutionary standards,
we have developed an incredibly complex means of verbal communication in very
short order. This still remains a puzzle for me.
[The critters in GALE don't get to learn directly from their parents in
anthing resembling school. Also unlike humans, they all just happen to speak
the same "mental" language, i.e., have the same (finite) set of ideas--it's
just a question of correlation. It's more of a model of learning /foreign/
languages. Hmmm, maybe if each critter was understood to represent a nation or
culture . . .
Also see Richard Dawkins' _The Selfish Gene_, for the concept of memes.
-Peter Dudey]
Screen output GALE and internal operations.
In this section I show some sample screens from GALE as it is running to help
explain what is possible when two critters bump into one another, and also
discuss several other features of GALE. Consider the following:
_ UJ . ,Q
N , ,I , _
S .E .
. , , Y
., _ ,
.
,BW Z M D
, , A_
F C
_ . O_ R_ _ _T _ K
G . _ V ,X ,
_ ,
. . _H . . _
, _ P
L , , ,
Command? [QOTHKNDFEL]
This is the start-up screen. The capital letters represent critters, which have
been randomly positioned. In this run, there are 26 critters represented by
the lettters of the alphabet. Resources have been randomly positioned in free
grid-points. The menu, `Command? [QOTHKNDFEL]' offers the user the the
following options: Q= Quit, O= one epoch, T= ten epochs, H= one hundred
epochs, K= 1000 epochs, D= calculation of population diversity, F= an
approximate
diversity calculation (F for `fast diversity'), L = a list of statistics,
yielding the following:
STATISTICS
Word: (./,) (./_) (,/.) (,/_) (_/.) (_/,)
Foo 0/0 0/0 0/0 0/0 0/0 0/0
Bar 0/0 0/0 0/0 0/0 0/0 0/0
Quux 0/0 0/0 0/0 0/0 0/0 0/0
Baz 0/0 0/0 0/0 0/0 0/0 0/0
Fnord 0/0 0/0 0/0 0/0 0/0 0/0
Flugblogs 0/0 0/0 0/0 0/0 0/0 0/0
Number of times used/correctly interpreted
(Hit any key to continue)
The root menu, [QOTHKNDFEL], appears after each menu choice is completed. An
`epoch' consists of one move per critter, and of course, any associated
interactions. If two critters meet, and neither of them wishes to trade, they
make no attempt at speech. This happens a lot early on as resource
acquisitions are limited. As the critters acquire resources, the potential for
trade increases, as does the complexity of interactions. Consider two
critters, one of which wishes to acquire a COMMA, the other a DOT.
GALE shows one possible interaction below:
Critter 3 (C) at (6 . 6) moving. Lifespan: 538 Goodies: (DOT BAR BAR BAR BAR)
C: Bar? (meaning (BAR . COMMA), interpreted as (COMMA . DOT))
S: Baz? (meaning (BAR . COMMA), interpreted as (BAR . DOT))
S: Baz. (meaning (DOT . COMMA), interpreted as (BAR . DOT))
It worthwhile decomposing these events, I think. Critter C has the hoard (DOT
BAR BAR BAR BAR), and naturally wishes to acquire a COMMA, and would like to
swap a BAR for a COMMA.
Footnote: If critter C thought it would be very easy to acquire a DOT in the
future, then it might consider trading its DOT for a COMMA. This kind of
reasoning is not yet embodied in GALE.
Although not shown, critter S also wants to acquire a COMMA, and is also
going to offer a BAR. These critters do not actually have a trade to make, but
they are not sure of this. To convey the idea `(BAR . COMMA)' critter C has
selected the word `BAR' which critter S has interpreted as `(COMMA . DOT).'
Critter S does not want a DOT, and makes an offer to swap its BAR for C's
COMMA, which of course C doesn't have. After the first attempt at speech, the
two critters have different ideas about what has happened. Critter C thinks
the following sequence of events has occurred: 1) I offered him my BAR for his
COMMA. 2) His response was to offer me his BAR for my DOT. Critter C then
waits for S to make a second offer, which may differ from the first offer. All
that happens is that S makes the same offer again (in A's estimation). Critter
A ends this interaction by saying nothing, meaning there is no trade to take
place.
Now let us see these events as critter S interpreted them. The first event
was 1) The other critter offers me a COMMA in exchange for a DO}, then 2) I
offer BAR in exchange for a COMMA, 3) The other critter did not make an
attempt to trade, so I make the new offer, 4) I offer, as an alternative, a
DOT, in exchange for a COMMA, 5) nothing doing, apparently.
It may seem odd that critter S uses the same word, Baz, to convey two different
ideas. This, however, will happen from time to time. It so
happens that, for critter S, the meanings (BAR . COMMA) and (DOT . COMMA)
are both expressed the same way. Every critter (at any moment) has
exactly one word to express a given meaning, and exactly one
interpretation of a given word it hears, but the converses are not true.
A word may be used to express several different meanings, and an idea may
be the interpretation of several different (heard) words: thanks Peter!
Note that neither critter can be sure that the other has misunderstood any
utterance in this interaction. In this regard, the interaction illustrated
does not alter the brains of the critters.
Consider a more fruitful encounter:
Critter 12 (L) at (18 . 12) moving. Lifespan: 990 Goodies: (COMMA...
L: Flugblogs? (meaning (BAR . DOT), interpreted as (DOT . COMMA))
F: Foo? (meaning (COMMA . BAR), interpreted as (DOT . BAR))
L: Flugblogs. (meaning (BAR . DOT), interpreted as (DOT . COMMA))
F: Foo. (meaning (DOT . BAR), interpreted as (DOT . BAR))
L trades BAR for F's DOT
In this encounter, only the final statement of F is properly understood, and it
happens to correspond to a feasible trade, which takes place. For both of
these critters the weight associated between the idea (DOT . BAR) and the word
Foo is increased, the weight between (DOT . BAR) and all other words is
decreased, as is the weight between Foo and all other ideas. the amount of
increase and decrease is stored in a variable in GALE. Two critters could make
a mistake in expecting they have a compatible trade to make, only to
subsequently discover they had misinterpreted each other. This has the obvious
effects on the weights between ideas and words.
The option E in the root menu allows the user to observe, and edit if
desired, a critter's brain. For example, critter A's brain might look like
this initially:
Critter A
Word: (./,) (./_) (,/.) (,/_) (_/.) (_/,)
Foo -4 -9 -9 -2 -6 -10
Bar -5 -8 -3 -5 -2 -5
Quux -1 -9 0 -2 -8 -2
Baz -1 -9 -5 -2 -1 -2
Fnord -8 0 -9 -4 -2 -1
Flugblogs -5 -10 -3 -1 -9 -6
Option? [QCS]
}
The sub-menu here allows the user to go back to the main screen, Q, perhaps to
quit altogether or to carry on with some more epochs, to ``clear'' this
critter's brains and genes (reset all links to zero), or to set any one of the
links displayed. Links (initially) range from +10 to -10.
The option D calculates an overall ``diversity'' calculation. For GALE
diversity is based on actual performance, not (as yet) the size of
links. It is an average of the `diversity' between the brains of all critters
compared pairwise (but not with themselves). The critters can react to 6 types
of `input:' each one of the ideas. by ``react'' I mean generate an
interpretation. There are also the converse words used by critters to
indicate each of the 6 ideas. In this respect, critters can either agree or
disagree on 12 different counts, the 6 interpretations associated with the six
words, and the six words associated with the 6 ideas. This number is
calculated for each pair of critters, and 1 added to it to ensure safe
division. This calculation is then averaged for all critters. (In the saved
statistics, the diversity is only calculated for critters with more than
100 units of lifespan (or some other number if desired) so that children don't
create spurious jumps in the diversity calculation. `F' makes a `fast'
diversity population by sampling the population randomly.
The option N creates a new world populated by a fresh new-born population.
There are some other details. The initial population of 26 critters are given
the names A through Z, and are indexed by 1 through 26. Children are given the
names a through z sequentially, and after that they always appear as @'s.
However, indexes are incremented sequentially.
In some runs of GALE there are many births. When two critters meet if both
their lifespans are ``large'' they give birth with a probability equal to a
specifiable constant, *fecundity*. If two critters breed, they still interact
linguistically, and perhaps make a trade.
If the user is interested only in the development of neural networks, it is
possible to set *fecundity* equal to zero, and ensure initial lifespans are
large enough. However, it may be of interest to watch a genetic algorithm-like
evolution too, in which case birth is desirable. It can be tricky to set
lifespans and *fecundity* in a particular relationship that maintains a
reasonable sized population, and overcrowding can be a problem. This
simulation is not designed to mimic adaption to life-threatening situations,
although clearly it could be adapted to do so. We have found that a method of
`stark-fist' removal helps. Each epoch a grid-point is randomly selected. If a
critter is unfortunate enough to be residing in that grid point then that
critter loses a certain amount of lifespan, which may be fatal.
Variables and constants in Lisp are declared at the top of the code for
gale. Some of the ones a user might want to alter are:
*save-critter-files*: GALE can save statistics related to each critter,
described in detail in the next section. If *save-critter-files*T then these
statistics are saved, if *save-critter-files*=nil they are not.
*world-width* and *world-height* are the dimensions of the grid inhabited by
the critters. The example here show a 35 by 15 grid. Changing the height may
well mess up the message-writing mechanisms, so it is best to only twiddle
width.
*learning-rate* is the magnitude of the change in the links between words
and ideas when there is any change.
*initial-resources* is the number of initial resources placed in the world.
Whatever this number is, the choice of resource type is random when a resource
is placed in a grid-point. Each epoch a number *resources-per-epoch* are
placed in the grid, if there is room. If there is not, GALE will hang!
*fecundity*, is the probability of a birth when two `fit' critters meet.
*delay* is the amount of time, in seconds (approximately) that GALE will pause
after it displays the details of an interaction.
*smite* is the number of units of lifespan removed by the stark fist. The
stronger is the invisible hand, the less important will be the stark fist.
File output.
In its current form GALE will save some statistics related to its current run,
all of which are reported for a particular run in the next section. Always
stored is a file called `stats' are contains 110 variables, defined as
follows:
The words and ideas are given as follows:
Foo = Word 1
Bar = Word 2
Quux = Word 3
Baz = Word 4
Fnord = Word 5
Flugblogs = Word 6
(dot . comma) = Idea 1
(dot . bar) = Idea 2
(comma . dot) = Idea 3
(comma . bar) = Idea 4
(bar . dot) = Idea 5
(bar . comma) = Idea 6
The 110 columns in stats are:
Column data
1 Epoch
2 population diversity
3 Number of times word 1 used as idea 1 (this epoch)
4 Number of times word 1 used as idea 2
9 Number of times word 1 used as idea 6
10 Number of times word 2 used as idea 1
38 Number of times word 1 correctly interpreted as idea 1
75 Number of critters using word 1 as idea 1
110 Number of critters using word 6 as idea 6
Optionally, files store statistics on individual critters too, in files of the
form critter#.stats. the file critter3.stats would contain the following:
Column data
1 Epoch
2 lifespan
3 Total number of interactions so far
4 Number of trades of idea 1 type so far
9 Number of trades of idea 6 type so far
10 (Block metric) distance from critter 1
11 Diversity with critter 1
12 (Block metric) distance from critter 2
13 Diversity with critter 2
14 (Block metric) distance from critter 4
15 Diversity with critter 4
The number of columns will depend upon the number of critters in existence;
there will be 9 + (2 * <number of critters minus 1>), the -1 because a critter
is not compared to itself. The number of columns will vary with population
size!