There has been an on going discussion of what languages
are effective for creating self-modifying code in the comp.ai
newsgroup. I will summarize some of the langauges I have
identified thus far. I am distributing this note to several
different language groups because I am interested in hearing
of other languages that are also effective for creating self-
modifying code.
When discussing languages for creating self-modifying
code we must first identify why we want such languages. If
you have such tools then you can write a program that itself
will write a program. This can be considered a form of
learning where a program learns a procedure or function that
can be used to do a task.
Thus the main focus is to be able to create a function or
procedure and then run it. We also want to be able to create a
function or procedure and then be able to modify it easily.
For the first function created might not do the task we want it
to do, thus we would want our program to automatically
modify the created function or procedure until it is modified
into a form that will execute the task effectively.
What is the most effective representation of a function
that can be modified? If we represent a function as a string it
is not as easily modified and manipulated as when it is
represented as a structure of symbols. Thus to have an
effective language for the creation of self-modifying code it is
best to have a language that manipulates symbols and
structures effectively. Examples of languages that fit into this
category are: Lisp, Prolog, Pop, Smalltalk, Icon, and many
others. C, Pascal, Ada and like languages do not have as
effective built-in tools for the creation and manipulation of
structures and symbols.
We would also need tools that can then execute the
symbol-structure representations of functions that we have
created. Preferrably we would want a tool that could execute
the structure on different sets of input values. This tool would
resemble the apply tool that exists in some implementations of
Lisp. Apply takes two arguments. The first argument is a
structure that represents a function. The second argument is
a list or structure of arguments to be passed to the function.
There are a number of programming languages that allow the
creation of such 'apply' tools fairly easily. The languages that I
have identified so far are: Lisp; Prolog; Pop-11; and Smalltalk.
In Common Lisp apply is a built function that operates not on a
structure but rather on a function type. Thus we need to
name our 'apply' tool differently. We name it 'apply_to' and
define it as follows. (Scheme and other Lisp dialects probably
have similar tools.)
> (defun apply_to (funlist arglist)
(apply (eval (list 'function funlist)) arglist))
We execute apply_to as follows:
> (apply_to '(lambda (a b) (+ a b)) '(4 5))
9
In Pop-11 the name apply is also already taken, thus again we
use 'apply_to' and define it as follows: (Developed in
discussions with A. Sloman.):
: define apply_to(funlist, arglist);
popval(funlist) (explode (arglist));
enddefine;
We call it as follows:
apply_to([procedure(A, B); A+B =>; endprocedure], [4 5]);
** 9
Prolog's syntax is significantly different from Lisp and Pop-11.
Predicates (functions) are broken into several parts called
clauses. Thus we need a structure that can represent several
clauses. In order to represent a predicate in a structureal
form we will use a structure that has the name 'lambda' and
has one argument which itself is a list of nameless clauses. We
will also use a lambda structure of two arguments to represent
a single clause. The first argument being a list of parameters
for this nameless clause and the second argument being the
body. Apply is then defined as follows:
apply(lambda([Clause1|_]), Args) :-
Function =.. [lambda|Clause1], apply(Function, Args).
apply(lambda([_|Rest], Args) :- apply(lambda(Rest), Args).
apply(lambda(Args, Body), Args) :- Body.
Apply can then be used as follows: (Note that a third argument
'C-Result' is needed for a return value.)
?- apply(lambda([[[A,B,C], (C is A + B)]]), [4, 5, Result]).
C=Result=9
For a more detailed discussion of apply in Prolog see my article
in "Computer Languages" 1993, Volume 18, # 1.
Defining apply in Smalltalk is slightly more complex.
There are no tools for 'apply' on structures. There are only
tools for the evaluation of strings. Likewise unnamed
functions are in the form of blocks which differ from the
structures that exist in smalltalk. Thus there are a number of
conversions that must be made to apply a method (function)
in the form of a structure. The below defined apply is limited
because of all of the ambiguities of translations of structures
to strings and then to blocks. The below apply only takes numbers as
arguments to the unnamed function being passed to apply and
only allows three parameters at most.
apply: lambda arguments:args
^Compiler evaluate:
(((lambda printString) replaceFrom: 1 to: 1
with: '[' startingAt:1)
replaceFrom: ((lambda printString) size) to:
((lambda printString) size)
with: ']' startingAt:1)
, ((args collect: [:each | ' value:',
(each printString)])
inject: '' into: [:string :next | string , next])
We would use the above method as follows:
^Compiler apply: #(:a :b | a+b) arguments: #(4 5)
9
To the best of my knowledge other languages such as
Icon, and Forth don't have tools such as 'apply' although they
do have tools to create and manipulate symbols and
structures. Many people have commented that we could
create a program that wrote a string that represented a
program to a file and then exectued that file. Icon reportedly
has the power to do this. While this of course is possible it
adds a number of complicating steps in the process of creating
self-modifying code, and thus makes these languages less
suited for this purpose. With Icon some of the compilcations
include: translating a structure of symbols into a string;
writing a file; managing files for there may be an apply within
an apply; executing a file; and accessing the value outputed
from the execution of the file. With other languages such as C
the same compilcations exist, however we have a couple of
additional difficulties: the development of a set of tools for
structures; and a set of tools for representing symbols that can
act as key words or variable names in the functions that are
created by the self-modifying code.
I am very interested in receiving any email or news
responses. In particular demonstrations of 'apply' in other
langauges.
H. Justin Coven, Ph.D.
Computer Science Dept.
Bellarmine College
Louisville, KY 40206
········@ulkyvx.louisville.edu
From: Tom Kramer
Subject: Re: Languages for self-modifying code
Date:
Message-ID: <23528@cosmos.cme.nist.gov>
H. Justin Coven recently inquired about ideas for self-modifying code,
particularly by using "apply" in LISP. The question was raised in the
context of systems that learn. What follows is (only) my opinion,
based on a lot of writing LISP programs that write other programs, but
only a little work on learning systems.
In the context of learning systems, "apply", I think, is handy for
simple situations, but awkward to use in complex situations. It is
useful to think specifically about the best form in which to use the
modified code, and there are usually many options, including using the
modified code in a different process from that which generated it.
This gets us quickly to thinking generally about how to deal with
programs generated by other programs.
The best form for programs generated by other programs depends upon
the circumstances in which the generated program is to be used. The
range of circumstances is large, from compilation to transient
generation and use. Deciding what the actual circumstances are may be
harder than it appears to be at first glance; there are apt to be
more choices than you originally think you have.
One issue is whether the generated program is to be in the same
language as the generator. If the generated program is in a different
language, it is usually necessary (especially if the generated program
is source code which is to be compiled) to write it to a file. It may
be useful to generate the entire program as a structure in active
memory before printing any of it, however. I have used LISP and C to
write RS274-D programs, for example, and find it useful to proceed
this way. I sometimes use LISP to write C, which I then compile, for a
second example.
A second issue is whether the generated program is to run in the same
process as the generator. In this case the generated program must be
in the same language as the generator or in a language (tcl, for
example) for which the generating language has an interpreter.
Three criteria for making a decision on how to implement when there is
a choice are programming difficulty, timing, and program persistence.
1. Difficulty - If generating the program is difficult, implement the
easiest way first. Reconsider other choices after you've got the easiest
method working.
2. Timing - If the generated program is to be used many times or if it
is apt to take a long time run once, it may be a good idea to compile
it to minimize the time it takes to run it. Of course LISP is great in
this regard, since you can write a program, compile it, load it, and
run it - bang, bang, bang. On the other hand, if you want a small fast
program as output, generating in a lower level language (C, for
example) may be a good idea. I recently wrote a LISP program which
read data from a file and ran a long time (many days in some runs);
call it program A. I then then wrote a LISP program which read the
data used by program A and generated C-language structures; I wrote by
hand the C-language code which used the structures. The structures and
the hand-written code compiled into a program (call it B) which did
exactly the same thing as program A. Program B (even with all the data
compiled in) required about 1/20 of the memory used by program A, and
B ran about 40 times as fast as A.
3. Persistence - If the generated program is to be kept around for
repeated use, it is usually useful to write it to a file.
Tom Kramer
······@cme.nist.gov
From: Fernando Mato Mira
Subject: Re: Languages for self-modifying code
Date:
Message-ID: <2dvrto$7h9@disuns2.epfl.ch>
In article <·····@cosmos.cme.nist.gov>, ······@cme.nist.gov (Tom Kramer) writes:
> exactly the same thing as program A. Program B (even with all the data
> compiled in) required about 1/20 of the memory used by program A, and
> B ran about 40 times as fast as A.
How much time did it take to make this port? Using a tree-shaker should
save a lot of memory in a snap. Optimizing to run "only" 10 times slower
should be easy, but getting down to 2X might be a lot trickier, but worthwile
if the program is big and sophisticated.
--
Fernando D. Mato Mira
Computer Graphics Lab
Swiss Federal Institute of Technology (EPFL) Phone : +41 (21) 693 - 5248
CH-1015 Lausanne FAX : +41 (21) 693 - 5328
Switzerland E-mail : ········@epfl.ch
"
From: Tom Kramer
Subject: Re: Languages for self-modifying code
Date:
Message-ID: <23533@cosmos.cme.nist.gov>
In article <··········@disuns2.epfl.ch>, ········@di.epfl.ch (Fernando Mato Mira) writes:
I posted an article which included the following:
exactly the same thing as program A. Program B (even with all the data
compiled in) required about 1/20 of the memory used by program A, and
B ran about 40 times as fast as A.
Program A is in LISP. Program B is in C.
Fernando D. Mato Mira asks:
> How much time did it take to make this port? Using a tree-shaker should
> save a lot of memory in a snap. Optimizing to run "only" 10 times slower
> should be easy, but getting down to 2X might be a lot trickier, but worthwile
> if the program is big and sophisticated.
This was a small (less than 2 pages of code in both languages,
excluding data) but sophisticated search program. The LISP I am using
starts up with about 9 Mb of memory and grows to about 12 Mb when the
program is running and the data has been read. The C program takes
about 0.5 Mb and does not grow. The port took less than a day. I did
not try to optimize the LISP other than compiling for maximum speed
and minimum safety. I would guess that a getting the speedup Fernando
Mira suggests is feasible. I do not see an easy way to get the size of
the LISP down other than getting a size minimizer from the vendor, and
the vendor claims at most 50% size reduction by using it; that is
still 10 times the size of the C program.
In article <·····@cosmos.cme.nist.gov>, ······@cme.nist.gov (Tom Kramer) writes:
|> In article <··········@disuns2.epfl.ch>, ········@di.epfl.ch (Fernando Mato Mira) writes:
|>
|> I posted an article which included the following:
|>
|> exactly the same thing as program A. Program B (even with all the data
|> compiled in) required about 1/20 of the memory used by program A, and
|> B ran about 40 times as fast as A.
|>
|> Program A is in LISP. Program B is in C.
|>
|> Fernando D. Mato Mira asks:
|>
|> > How much time did it take to make this port? Using a tree-shaker should
|> > save a lot of memory in a snap. Optimizing to run "only" 10 times slower
|> > should be easy, but getting down to 2X might be a lot trickier, but worthwile
|> > if the program is big and sophisticated.
|>
|> This was a small (less than 2 pages of code in both languages,
|> excluding data) but sophisticated search program. The LISP I am using
|> starts up with about 9 Mb of memory and grows to about 12 Mb when the
|> program is running and the data has been read. The C program takes
|> about 0.5 Mb and does not grow. The port took less than a day. I did
|> not try to optimize the LISP other than compiling for maximum speed
|> and minimum safety. I would guess that a getting the speedup Fernando
|> Mira suggests is feasible. I do not see an easy way to get the size of
|> the LISP down other than getting a size minimizer from the vendor, and
|> the vendor claims at most 50% size reduction by using it; that is
|> still 10 times the size of the C program.
|>
Clearly, any program that is only 2 pages of code is not the type of program
suited for Common Lisp if speed is very important.
It's like driving your car to your next door neighbor's house --
you'll get there faster and cheaper if you just walk (or run if you're in a hurry).
but if you need to go to the grocery store, I'd keep your car.
--
-- Kelly Murray (···@prl.ufl.edu)
University of Florida Parallel Research Lab :: 96-node KSR1, 64-node nCUBE
-------------------------------------------------------------------------
> A second issue is whether the generated program is to run in the same
> process as the generator. In this case the generated program must be
> in the same language as the generator or in a language (tcl, for
> example) for which the generating language has an interpreter.
Or -compiler- in the case of POPLOG which generates native code for a range of
languages. It will also support linking, relinking and loading of external
functions into itself. A simple example is:
/*
cfn is a POP-11 procedure which takes as arguments -f-, the name of an
function in the C language, and -string-, its definition. It compiles
the C code, links it into POPLOG and defines a POP-11 procedure
with the same name which calls the external code.
*/
define cfn(f,string);
lvars string,i,
rep = discout('temp.c'), ;;; Make output file
;
for i from 1 to datalength(string) do ;;; Copy string to file
rep(string(i));
endfor;
rep(termin); ;;; Close file
sysobey('gcc -c temp.c'); ;;; Call GNU-C compiler
lvars Exptr = "Exptr_"<>f; ;;; Name of pointer to external code
popval([exload ^f ['temp.o'] ;;; Link in the compiled code
(language C)
^Exptr(1):int <- ^f; ;;; With -f- code in Exptr
endexload
]);
popval([define ^f(x); exacc ^Exptr(x); ;;; Define POP-11 procedure
enddefine]);
enddefine;
;;; Define a C-function "twice", compile it and incrementally link it to
;;; POPLOG
cfn("twice", 'int twice(int x) {return x*2;}');
;;; twice is now a POP-11 procedure which uses the compiled C - call it and
;;; print out the answer
twice(23)=>
** 46