From: David Steuber
Subject: Reflections on Trusting Defmacro*
Date: 
Message-ID: <87fyqafkyq.fsf@david-steuber.com>
I've been told on several occasions by different people that no one
writes a Common Lisp compiler from scratch anymore.  From a certain
point of view, I can understand why.  Lisp is a complex language and
there is more to Lisp than just the compiler.  There is also the
development environment.  On the other hand, modern processors seem to
do all sorts of funky things to try and run code as fast as possible
while optimizing compilers try to produce code that runs as fast as
possible on those same CPUs.  I sometimes wonder if at some level a
rewrite would not be a waste of time.

Naturally the best language to use to write a Lisp compiler would be
Lisp itself.  One rather nice thing about Lisp is that there are many
parts that you don't have to write again.  Presumably LOOP can be
grabbed from an appropriate implementation and stuffed into another
without too much pain.  Stepping up in complexity there is CLOS and
MOP (as defined in the AMOP book).  So a complete rewrite would really
be a bottom up face lift.

If only it were that simple.  This is something I think about on and
off again when I go to bed whenever.  Here is one of the sabot that
gets tossed into the mental machinery.  My favorite free Lisp
implementation for non-numerical heavy use is OpenMCL for Darwin.  One
of the critical macros defined in the OpenMCL sources is DEFMACRO:

(defmacro defmacro  (name arglist &body body &environment env)
  (unless (symbolp name)(signal-program-error $XNotSym name))
  (unless (listp arglist) (signal-program-error "~S is not a list." arglist))
  (multiple-value-bind (lambda-form doc)
                       (parse-macro-1 name arglist body env)
    (let* ((normalized (normalize-lambda-list arglist t t))
           (body-pos (position '&body normalized))
           (argstring (let ((temp nil))
                        (dolist (arg normalized)
                          (if (eq arg '&aux)
                            (return)
                            (push arg temp)))
                        (format nil "~:a" (nreverse temp)))))
      (if (and body-pos (memq '&optional normalized)) (decf body-pos))
      `(progn
         (eval-when (:compile-toplevel)
           (define-compile-time-macro ',name ',lambda-form ',env))
         (eval-when (:load-toplevel :execute)
           (%macro 
            (nfunction ,name ,lambda-form)
            '(,doc ,body-pos . ,argstring))
           ',name)))))

Like \v in the C compiler, DEFMACRO had to be written in some other
form at an earlier time in OpenMCL's history.  To this day, the way
OpenMCL is developed prevents you from grabbing an arbitrary CVS
snapshot and compiling OpenMCL with itself.  I have no idea how to
compile OpenMCL with another Lisp.  It is either not possible or a
heck of a lot of work.  Doing so is not supported in the
documentation.

SBCL has a different bootstrapping process.  For the most part, an
ANSI Lisp compiler will have no problem building a random CVS snapshot
of SBCL.

I've seen other chicken and egg examples.  I seem to recall someone
posting an implementation of CONS that when something like this:

(defun cons (car cdr)
  (cons car cdr))

I must confess that these things boggle me.  Why does the original
implementation have to go away?  Is binary dependence really a good
thing?

I don't have an answer to that.  I'm lining up a much simpler project
to write in Lisp.  I am hoping that an assembler is no big deal.  I've
never written one and I think I would feel more like a real programmer
if I did.

[*] Ken Thompson's Essay "Reflections on Trusting Trust"
    http://www.acm.org/classics/sep95/

-- 
http://www.david-steuber.com/
The UnBlog: An island of conformity in a sea of quirks.
http://www.david-steuber.com/snippets/Boycott_Sony/
----------------------------------------------------------------------

From: Brian Downing
Subject: Re: Reflections on Trusting Defmacro*
Date: 
Message-ID: <7Egbf.546875$xm3.85046@attbi_s21>
In article <··············@david-steuber.com>,
David Steuber  <·····@david-steuber.com> wrote:
> I've seen other chicken and egg examples.  I seem to recall someone
> posting an implementation of CONS that when something like this:
> 
> (defun cons (car cdr)
>   (cons car cdr))
> 
> I must confess that these things boggle me.  Why does the original
> implementation have to go away?  Is binary dependence really a good
> thing?

This is probably not saying, "Use an old CONS to make a new CONS."
What this is saying is, "CONS is implemented (probably inline) by
compiler magic, but we need to make an actual CONS function so that
people can FUNCALL it."  So the real implementation of CONS is
there, but it's in the compiler somewhere.

-bcd
-- 
*** Brian Downing <bdowning at lavos dot net> 
From: Kaz Kylheku
Subject: Re: Reflections on Trusting Defmacro*
Date: 
Message-ID: <1131303176.616629.154800@g14g2000cwa.googlegroups.com>
David Steuber wrote:
> I've been told on several occasions by different people that no one
> writes a Common Lisp compiler from scratch anymore.  From a certain
> point of view, I can understand why.  Lisp is a complex language and
> there is more to Lisp than just the compiler.  There is also the
> development environment.  On the other hand, modern processors seem to
> do all sorts of funky things to try and run code as fast as possible
> while optimizing compilers try to produce code that runs as fast as
> possible on those same CPUs.  I sometimes wonder if at some level a
> rewrite would not be a waste of time.
>
> Naturally the best language to use to write a Lisp compiler would be
> Lisp itself.

s/Lisp compiler/any compiler/

> Like \v in the C compiler, DEFMACRO had to be written in some other
> form at an earlier time in OpenMCL's history.

Not necessarily. Lisp implementation bootstrapping is a little bit
different from C implementation bootstrapping. If you have a C compiler
implemented as a monolithic, binary executable on your operating
system, either that compiler understands the \v character escape
sequence in string and character literals, or it does not. If it does
not, the only way you ``teach'' it (in Ken Thompson's words) is to hack
the source code of that compiler, and re-compile it with itself. Then
you can use that \v syntax in the source code of that compiler, so long
as you don't lose the new version of the compiler binary that
understands it.  In the development of a C compiler, these round trips
are done manually over and over again, until you have the full
language, which is written in pretty much the full language.

In a Lisp system, these multiple boostrapping stages can be easily
condensed into simple loading of files in the right order. An example
of this approach is CLISP. CLISP has a kernel written in C. Once that
is built, you have a primordial CLISP image which is capable of loading
the remaining modules and dumping a new image.  The remaining modules
implement many stages of bootstrapping. Each loaded module adds a
language feature that the remaining ones can use. For instance, there
is a module that provides backquotes. Until that module is loaded, it's
not possible to use the backquote syntax. But, of course, the backquote
syntax module is never replaced with an implementation that uses the
backquote syntax.

Lisp's run-time image modification is what makes it possible, and the
fact that the language is really a library: by loading new functions,
one single instance of the growing Lisp system modifies itself step by
step. The ever larger library gives a bigger language for subsequently
loaded libary features to rely on.

Eventually, the Lisp system is big enough to support the compiler.
That's just another module that is loaded. Once everything is loaded,
you can dump an image. CLISP calls this interpreted.mem, because
everything in there was loaded but not compiled.

There are additional steps that essentially repeat what was done, but
compile everything now that a compiler is available. First a
halfcompiled.mem is dropped, if memory serves me correctly: Just enough
is compiled in there to make the compiler run fast, without taking too
much time compiling those parts using interpreted code. Then the
half-compiled image can compile everything and produce the full image.

And so, CLISP doesn't rely on anything other than the C compiler binary
and surrounding build environment. No Lisp material has to be present.