Re: Portable LISP subsets?

From: Jeff Dalton
Subject: Re: Portable LISP subsets?
Date: Wed, 11 Apr 1990 16:09:01 +0000
Message-ID: <2205@skye.ed.ac.uk>

In article <····················@ibmpcug.co.uk> ····@ibmpcug.co.uk (Bill Birch) writes:

>         Can anyone refer me to a paper or book about PORTABLE LISP?
>I want to know if there is a subset of LISP that will work on most
>dialects of Common LISP, Cambridge LISP, Standard LISP, XLISP, Scheme
>etc. etc.  Perhaps somebody has done a survey to find out what the 
>lowest-common-denominator is? I hope it doesn't turn out to be Pure LISP!

Or greatest common divisor...

Sometime in 1985, I looked into this question detail for the main Lisp
dialects in use in the UK at the time, namely Franz Lisp, UCI Lisp,
Cambridge Lisp, and Common Lisp.  

If you do this sort of comparison, it soon becomes clear that the
common subset that will work without any change or support code in all
of these Lisps is fairly small.  For example, there isn't a common way
to define functions.  Franz Lisp and Common Lisp use DEFUN, but UCI
Lisp uses DE.  (At least one version of Cambridge Lisp allows both,
but that doesn't solve the problem in general.)  The same sort of
thing will happen with your list of dialects.  For example, Scheme
uses yet another form: DEFINE.

On the other hand, if you're willing to write in a somewhat neutral
dialect and to write some support code to "adapt" each of the dialects
you want your code to run in, you can do much better.  For example,
you could use DEFUN to define functions provided that you wrote a
DEFUN macro for each of the Lisps that use something else.  Note that
the "neutral dialect" might actually be a subset of some actual
dialect, if you are willing to pick a favorite.

A problem with this approach, however, is that it is really _too_
powerful.  Since the support code (macros and functions that define
everything needed by the "portable" neutral-dialect code) can be
arbitrarily complex, it's hard to establish clear limits on that the
neutral dialect can be.  You have to be careful not to be too
ambitious, for otherwise you will spend too much time on the support
code and may also find that it's difficult to debug anything on the
importing side.

So this is why I've been calling the portable dialect "neutral": it
has to avoid things that work well in only some of the dialects and
require elaborate support in all the others.  Exactly when support
becomes too elaborate is something you'll have to decide.  You'll
also have to decide whether you really want to include all of the
dialects you listed.  For example, it may be easier if you leave
out Scheme.  

With those points in mind, here are some of the things that may
cause problems:

 * Functions.

   It's hard to think of any functions that are exactly the same in
   every Lisp.  Even CAR and CDR can be different, because some Lisps
   define them for the empty list while others do not.  EQ is another
   function that can vary in its details.  You can pretty much rely on
   its behavior for symbols and conses, but not for numbers and some
   other things.  There are many other examples of this sort.

   Even if we look instead for functions that have most of their
   behavior in common, we have a small list, because (for example)
   Scheme has different names for such things as EQ.  (In Scheme 
   it's "EQ?".)

 * Name conflicts.

   Since it's not always safe to redefine built-in functions, you
   have to avoid giving your own functions that same name as any
   of the built-in function in any of the dialects involved.

 * Datatypes.

   Different Lisps support different datatypes.  Almost all Lisps will
   have symbols, conses, and integers up to 16 bits; after that it
   varies.  Strings, vectors, and integers up to the size of the
   address field of the pointers used by the Lisp (let's say this is
   usually at least 24 bits) are usually available too.  [The latter
   rule about integers means, more or less, that you can use them to
   count objects, index vectors, and things like that, but can't rely
   on such things as 32-bit arithmetic.]  Bignums (arbitrarily large
   integers) are fairly common.

 * Overlap, disjointness, etc.

   Some things can turn out to be the same.  For example, in some
   Common Lisps VECTORP is true of the objects defined by DEFSTRUCT.
   Some predicates, such as ATOM, can vary quite widely.  Consequently,
   you have to be careful about such things as the order of tests in 
   a COND.

   One of the most important things of this sort is the varying
   treatment of NIL, false, and the empty list.  They might all be
   different (as they are in some dialects of Scheme) or all be the
   same (as in Common Lisp).

   Another important case is that CONSP (or the equivalent) may be
   true of the functions that result from evaluating lambda-expressions.

 * Scope and extent.

   Are variable instances (sometimes called bindings) visible
   everywhere (indefinite scope) or only within the textual scope
   of the construct that introduced them (lexical scope)?  Are
   references valid forever (indefinite extent) or only until the
   construct exits (dynamic extent)?  Are lambda-expressions 
   processed with respect to the context in which they occur or
   as completely separate functions?

   Since the answers can vary, and since it's difficult to change
   them very much using support code, neutral code has to try to
   avoid all of the cases where any of them makes a difference.
   This pretty much means that free variables can't be used.
   References to formal parameters that occur within the body of
   the same function are ok unless they appear inside a lambda-
   expression, unless the lambda-expression is the car of a
   function call.

   Lambda-expressions in the car of a form are usually treated
   differently from other ones; they're usually treated as would
   be the equivalent LET.  In other contexts lambda-expressions
   may have to have (FUNCTION ...) around them.  So you may want
   to define a macro, called %LAMBDA or something, that expands
   into the right thing.

 * Function and value environments.

   Common Lisp and a number of other Lisps allow identifiers to act as
   variables and function names independently.  Other Lisps, such as
   Cambridge Lisp and Scheme do not.  Lisps are sometimes referred to
   as being a Lisp-2 or a Lisp-1, depending on whether function names
   and variables are different or not, respectively.

   For example, in Common Lisp

      (let ((f 10)) (flet ((f (x) (* x x))) (f f)))  ==>  100

   Moreover, in most Lisp-2s F in (F X) will always be interpreted
   as a function name.  So if F is a function-valued paramater it
   has ot be called by something like (FUNCALL F X).

   Another related issue is whether the car of a fucntion application
   can be any function-valued expression or whether it has to be a
   name or a lambda-expression.

   Neutral code has to avoid anything that relies on function names
   and variables being different, has to call function-valued
   variables by using FUNCALL (which can then be defined in those
   Lisps that don't have it), and can't have arbitrary expressions
   in the function position (ie, the car) of a function call.

 * Definition syntax, multiple arguments, etc.

   The syntax for optional and "rest" arguments varies or may have
   to be built out of lower-level things (eq, LEXPRS).  Keyword
   parameters are pretty much confied to Common Lisp and Lisps
   that try to approximate CL subsets, such as XLISP.

 * Control structures.

   Things like IF, COND, AND, OR, and LET are available everywhere or
   can easily be defined.  Not so things like CATCH, THROW, CALL/CC,
   and MULTIPLE-VALUE-BIND.

 * Macros

   Some Lisps don't have macros, and there isn't yet a standard macro
   mechanism for Scheme.

Jeff Dalton,                      JANET: ········@uk.ac.ed             
AI Applications Institute,        ARPA:  ·················@nsfnet-relay.ac.uk
Edinburgh University.             UUCP:  ...!ukc!ed.ac.uk!J.Dalton