History of Common Lisp's multidimensional arrays?

From: Peter Seibel
Subject: History of Common Lisp's multidimensional arrays?
Date: Sun, 31 Aug 2003 22:41:39 +0000
Message-ID: <m365kdsb5y.fsf@javamonkey.com>

Does anyone have any pointers to information about the history of the
design of Common Lisp's multidimensional arrays? For example:
precursors in other languages, documentation of issues considered
during the standardization, etc? Or personal knowledge they'd like to
share?

I'm just trying to get a sense of why things are the way they are--the
various facilities: access by row-major-aref, adjustability,
displacement, add up to a lot of flexibility but I'm feeling a bit
fuzzy on what problems they were designed to address and how they were
intended to be combined.

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp

Re: History of Common Lisp's multidimensional arrays? Christopher C. Stacy
- Re: History of Common Lisp's multidimensional arrays? Peter Seibel
  - Re: History of Common Lisp's multidimensional arrays? Christopher C. Stacy
    - Re: History of Common Lisp's multidimensional arrays? Steven M. Haflich
      - Re: History of Common Lisp's multidimensional arrays? Peter Seibel
        Re: History of Common Lisp's multidimensional arrays? Steven M. Haflich
        Re: History of Common Lisp's multidimensional arrays? Peter Seibel
      - Re: History of Common Lisp's multidimensional arrays? Christian Lynbech
Re: History of Common Lisp's multidimensional arrays? David Golden
Re: History of Common Lisp's multidimensional arrays? Kent M Pitman
- Re: History of Common Lisp's multidimensional arrays? james anderson

From: Christopher C. Stacy
Subject: Re: History of Common Lisp's multidimensional arrays?
Date: Mon, 01 Sep 2003 02:10:45 +0000
Message-ID: <uk78ti7h6.fsf@dtpq.com>

They were for the Lisp Machine, so we could point 
into device buffers for the network and display hardware.

From: Peter Seibel
Subject: Re: History of Common Lisp's multidimensional arrays?
Date: Mon, 01 Sep 2003 03:25:48 +0000
Message-ID: <m3smnhqjfz.fsf@javamonkey.com>

······@dtpq.com (Christopher C. Stacy) writes:

> They were for the Lisp Machine, so we could point 
> into device buffers for the network and display hardware.

Interesting. Beyond just having multidimensional arrays, are there any
other specific features of Common Lisp's arrays (displacement,
adjustability, etc) that were rooted in those same requirements?

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp

From: Christopher C. Stacy
Subject: Re: History of Common Lisp's multidimensional arrays?
Date: Mon, 01 Sep 2003 04:58:27 +0000
Message-ID: <uekz1w1e4.fsf@dtpq.com>

>>>>> On Mon, 01 Sep 2003 03:25:48 GMT, Peter Seibel ("Peter") writes:

 Peter> ······@dtpq.com (Christopher C. Stacy) writes:
 >> They were for the Lisp Machine, so we could point 
 >> into device buffers for the network and display hardware.

 Peter> Interesting. Beyond just having multidimensional arrays, are there any
 Peter> other specific features of Common Lisp's arrays (displacement,
 Peter> adjustability, etc) that were rooted in those same requirements?

I was actually thinking of displacement when I said that.

From: Steven M. Haflich
Subject: Re: History of Common Lisp's multidimensional arrays?
Date: Mon, 01 Sep 2003 17:35:37 +0000
Message-ID: <JrL4b.255$kv2.28768429@newssvr13.news.prodigy.com>

Christopher C. Stacy wrote:

>  Peter> ······@dtpq.com (Christopher C. Stacy) writes:
>  >> They were for the Lisp Machine, so we could point 
>  >> into device buffers for the network and display hardware.
> 
>  Peter> Interesting. Beyond just having multidimensional arrays, are there any
>  Peter> other specific features of Common Lisp's arrays (displacement,
>  Peter> adjustability, etc) that were rooted in those same requirements?
> 
> I was actually thinking of displacement when I said that.

I'll add to Chris' remark.

Let's see.  From my personal experience, multidimensional arrays in
computer languages go back at least 40 years.  (This was, if I remember
correctly, before Lisp had any arrays at all.)  Mathematicians have had
them at least an order of magnitude longer (although I can't vouch for
this with personal experience) so it isn't very surprising that Lisp
has multidimensional arrays.  It is a little more surprising that
languages like C and Java do not have them, except that C was designed
to be a magnifier and abstractifier [neologism alert!] for the
assembly-language expressiveness of typical hardware.  (I can't explain
the lack in Java, except that the good points of Java took C as a
progenitor, and the bad points were swept along in the flush water.)

As for features such as adjustibility and displacement, these and other
features were somewhat confused language design.  This confusion cannot
be explained in a single sentence -- because when confusion is simple,
it isn't robust enough to survive.  The issue around which the confusion
gyrates still plagues the language.  Here's a stab at its description.

Lisp lives with a tension between data type abstraction and data processing
efficiency.  On one hand, Lisp is a high level language, and Common Lisp
carefully presents mathematical integers as an abstraction.  A typical
implementation will support integer sizes far larger than native hardware
sizes, and far outside realms needed by nearly all implementations.  C (and
also Java, to a considerable extent) provide in place if integers the
mathematically-bizarre substitute of 2's complement mod-32-bit arithmetic.
The trade offs between easily-predictable hardware execution speed and the
possibility of airplanes falling out of the sky are familiar to most readers
of this list and do not need further discussion.  There are of course
appropriate places for both designs.

Of course Lisp applications sometimes need easily-predictable hardware
execution speed, so alternative abstractions like fixnum are provided
(although the meaning is left up to the implementation).  But using these
abstractions requires extra declaration -- clearly the default bias in
Common Lisp data types and computation is slanted toward the abstract
mathematical entities, not the hardware implementational entities.

FORTRAN programmers may think it bizarre that CL has a ratio data type.
(Pythagoras 2500 years ago would not have thought this remarkable at all,
although he would have had trouble with floating point and possibly also
with the inclusion of integer zero. :-)

Things get a little less consistent with floats, and a lot more of the
underlying (e.g. IEEE) implementation peeks out through the language
standard.  Try (apropos "NORMALIZED" :common-lisp) to see one way the
bits and fields of floats can become important to user code performing
routine practical calculations.  But numerical calculation and the
problems of maintaining guaranteed significance are routine for
numerical analysts and practitioners are quite accustomed to dealing
with these things.

The most pernicious failing of the ANA in my opinion is the refusal to
admit the reality that computers deal in numeric bytes and characters
with specific formats and codings.  A portable ANSI CL program cannot
write binary information to a socket or disk file that would be portably
readable by another language platform, even a different Common Lisp, and
even on the same system.  A portable CL program cannot write characters
and/or binary numbers of varying format to a single socket or stream.  C
and Java have no such problems, and portable programs can interact (modulo
a little care about endianness byte order).  That typical CL programs
would be happy typing characters on a `console' to demonstrate AI is a
worn-out idea from the age of Eliza.  Today a typical CL program might
contain an entire http server as an incidental component...

The Lisp Machine was one major force in changing the nature of Lisp as a
abstract AI language that didn't need to interface much to the outside
world.  The Lisp implemented nearly the entire operating system (please,
no religious arguments about whether the LM actually had an operating
system) in Lisp.  There was no underlying system language to fall back
upon to manage hardware and IO and move data around, and no suite of
utility programs like cat and tar and mkfs to do these mundane tasks.
The LM put some of low level stuff in microcode, but otherwise, the Lisp
language had to deal with everything.

To return to the issue of arrays, it is easy to see how the basic CL
concept of array follows the abstract mathematical notion of n-dimension
matrix.  Even zero-rank arrays are provided.  But there remained the need
to do array operations with hardware efficiency.

The ability to deal with vectors of objects is fundamental to system
programming.  I may be misremembering details, but the Lisp Machine did
not implement separate simple and non-simple arrays.  All arrays had
headers, and all arrays could be displaced and adjusted.  There was little
cost for this generality, at least on speed, since most of the header
processing and and dereferencing was assisted by the microcode.

Now, given the LM underlying microcode, and the nature of the Lisp
language, it may be that array headers and displacement and adjustability
were appropriate ways to attain something near-hardware efficiency when it
is needed.  It is less clear that this is the right solution on stock
hardware, and so we see definite motion towards streams that operate in
bytes (supporting both characters and byte integers) with exposed buffers
that are implemented from unsigned-byte-8 vectors.  It is diffifcult to
reconcile this kind of C-style data-type punning with Lisp's strict
run-time data-typing, but there does seem to be a real need in modern
applications.

Lisp solutions to these hardware bits and bytes problems sometimes don't
succeed.  Look at the `typed' defstruct mess still hanging on the language.
Someone felt the need to expose something near native data types in an
_abstract_ and structured way, and a way that supports single inheritance.
It's a mess.  C structs do it more simply and elegantly (but in a way that
wouldn't translate directly to Lisp, alas).

It's easy to forget this now, but Common Lisp was designed to be a single
common langage that would be reasonably and efficiently implementable on
both hardware LMs and on stock hardware.  There are many provisions in the
language definition that bend over backwards to satisfy real or imagined
requirements of each community.  Locatives and forwarding tricks are
omitted because they are inefficient without microcode support.  Simple
arrays are in because they are necessary for efficiency on stock hardware,
but there is no requirement that an implementation provide arrays that
are actually simple.

If the language were redesigned today, I would like to see the definition
get a lot closer to the realities of underlying platforms -- recognize that
8-bit-bytes have an important reality unshared by 9-bit bytes, recognize
that real streams have endianess and need to have arbitrary varying
datatypes passed through them, deal with the realities of filenames with
less elegance and more recognizance of the situation on the ground.  And
then there's that little issue of case sensitivity.  I don't mean any of
this to launch the usual religious flame war -- these are just my opinions.

By the way, row-major-aref was added to the ANS quite late.  There were
implementation-specific analogues on various platforms, but X3J13 felt that
this was something worth exposing portably.

From: Peter Seibel
Subject: Re: History of Common Lisp's multidimensional arrays?
Date: Mon, 01 Sep 2003 18:35:58 +0000
Message-ID: <m3k78sqrvh.fsf@javamonkey.com>

"Steven M. Haflich" <·················@alum.mit.edu> writes:

> Christopher C. Stacy wrote:
> 
> >  Peter> ······@dtpq.com (Christopher C. Stacy) writes:
> >  >> They were for the Lisp Machine, so we could point  >> into
> > device buffers for the network and display hardware.
> >  Peter> Interesting. Beyond just having multidimensional arrays, are
> > there any
> >  Peter> other specific features of Common Lisp's arrays (displacement,
> >  Peter> adjustability, etc) that were rooted in those same requirements?
> > I was actually thinking of displacement when I said that.
> 
> I'll add to Chris' remark.

[snip lots of interesting history]

Thank you! That's exactly the kind of thing that I was looking for.

One question about something you said in passing:

> A portable ANSI CL program cannot write binary information to a
> socket or disk file that would be portably readable by another
> language platform, even a different Common Lisp, and even on the
> same system. A portable CL program cannot write characters and/or
> binary numbers of varying format to a single socket or stream.

Well, sockets are obviously a problem in portable ANSI CL, but for
files, what's wrong with:

  (defun save-my-byte (b) 
    (with-open-file (out "my-byte.binary" :direction :output :element-type '(unsigned-byte 8) :if-exists :supersede)
      (write-byte b stream)))

Assuming that reliably writes a file with one byte that could be read
by a Java or C program, I could build up the rest of the data types in
whatever format by writing them one byte at a time. Or is the problem
that I'm assuming WRITE-BYTE to a stream with an (unsigned-byte 8)
element-type actually writes the data out in some rational format and
the standard doesn't actually require that?

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp

From: Steven M. Haflich
Subject: Re: History of Common Lisp's multidimensional arrays?
Date: Mon, 01 Sep 2003 21:04:13 +0000
Message-ID: <3F53B45D.1050804@alum.mit.edu>

Peter Seibel wrote:

> "Steven M. Haflich" <·················@alum.mit.edu> writes:

>>A portable ANSI CL program cannot write binary information to a
>>socket or disk file that would be portably readable by another
>>language platform, even a different Common Lisp, and even on the
>>same system. A portable CL program cannot write characters and/or
>>binary numbers of varying format to a single socket or stream. 
> 
> Well, sockets are obviously a problem in portable ANSI CL, but for
> files, what's wrong with:
> 
>   (defun save-my-byte (b) 
>     (with-open-file (out "my-byte.binary" :direction :output :element-type '(unsigned-byte 8) :if-exists :supersede)
>       (write-byte b stream)))

Nothing.  You wrote an 8-bit byte, you can read it back, and presuming the
implementation chose the obvious external representation, you could even
communicate with programs written in other languages.

What you can't do is this:

    unsigned byte b;
    long int i;
    double f;
    ...
    write(sock,&b,sizeof b);
    write(sock,&i,sizeof i);
    write(sock,&f,sizeof f);

(Hope I got the code right -- I haven't written much C in the last couple
decades and my syntax is rusty.)

What's more, you can't do this idiomatic C stuff:

   struct foo { byte b; int i; double f; } myfoo;
   ...
   write(sock, &myfoo, sizeof myfoo);

With some conditionalization in a .h file, it is even possible to abstract
these datatypes so that they will be similar on different hardware
platforms, with some limitations and byte order issues, of course.

> Assuming that reliably writes a file with one byte that could be read
> by a Java or C program, I could build up the rest of the data types in
> whatever format by writing them one byte at a time. Or is the problem
> that I'm assuming WRITE-BYTE to a stream with an (unsigned-byte 8)
> element-type actually writes the data out in some rational format and
> the standard doesn't actually require that?

Anatole France wrote that the law forbids the rich as well as the poor to
sleep under bridges.  Analogously, C and Lisp programmers are provided
approximately equal abilities to spend their time writing marshalling
code for all their datatype abstractions.  But only the Lisp programmer
needs to.

From: Peter Seibel
Subject: Re: History of Common Lisp's multidimensional arrays?
Date: Mon, 01 Sep 2003 22:58:58 +0000
Message-ID: <m37k4sqfp6.fsf@javamonkey.com>

"Steven M. Haflich" <·················@alum.mit.edu> writes:

> Peter Seibel wrote:
> 
> > "Steven M. Haflich" <·················@alum.mit.edu> writes:
> 
> >>A portable ANSI CL program cannot write binary information to a
> >>socket or disk file that would be portably readable by another
> >>language platform, even a different Common Lisp, and even on the
> >>same system. A portable CL program cannot write characters and/or
> >> binary numbers of varying format to a single socket or stream.
> > Well, sockets are obviously a problem in portable ANSI CL, but for
> > files, what's wrong with:
> >   (defun save-my-byte (b)     (with-open-file (out "my-byte.binary"
> > :direction :output :element-type '(unsigned-byte 8) :if-exists
> > :supersede)
> >       (write-byte b stream)))
> 
> Nothing.  You wrote an 8-bit byte, you can read it back, and presuming the
> implementation chose the obvious external representation, you could even
> communicate with programs written in other languages.
> 
> What you can't do is this:
> 
>     unsigned byte b;
>     long int i;
>     double f;
>     ...
>     write(sock,&b,sizeof b);
>     write(sock,&i,sizeof i);
>     write(sock,&f,sizeof f);
> 
> (Hope I got the code right -- I haven't written much C in the last couple
> decades and my syntax is rusty.)
> 
> What's more, you can't do this idiomatic C stuff:
> 
>    struct foo { byte b; int i; double f; } myfoo;
>    ...
>    write(sock, &myfoo, sizeof myfoo);
> 
> With some conditionalization in a .h file, it is even possible to abstract
> these datatypes so that they will be similar on different hardware
> platforms, with some limitations and byte order issues, of course.
> 
> > Assuming that reliably writes a file with one byte that could be read
> > by a Java or C program, I could build up the rest of the data types in
> > whatever format by writing them one byte at a time. Or is the problem
> > that I'm assuming WRITE-BYTE to a stream with an (unsigned-byte 8)
> > element-type actually writes the data out in some rational format and
> > the standard doesn't actually require that?
> 
> Anatole France wrote that the law forbids the rich as well as the poor to
> sleep under bridges.  Analogously, C and Lisp programmers are provided
> approximately equal abilities to spend their time writing marshalling
> code for all their datatype abstractions.  But only the Lisp programmer
> needs to.

Well, I was never much of a C programmer. But my impression was that
while the C code you wrote above is probably would allow communication
between two C programs running on the same architecture and using the
same C compiler, beyond that all bets are probably off. (For instance,
are C structs allowed to be padded in memory for allignment purposes.)

In Java, all the portable marshalling is done on top of octet streams
in what are essentially user-level libraries. Look at the source or
java.io.DataOutputStream. So maybe I'm just used to it. But I don't
know that "portable marshalling" and "dumping the contents of memory
to as stream" are ever going to be compatible. If anything, I think
the C guys get screwed because they forget that, and their naive
marshalling usually works because they test it between similar
machines with similar compilers. (Or maybe I'm just too paranoid--as I
said, I'm no C wizard.)

Apropos marshalling, I found this url somewhere the other day:

 <http://www.pointnclick.com/lijos/>

This guy has implemented the whole Java Object Serialization standard
in Common Lisp. So the marshalling code is already written.

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp

From: Christian Lynbech
Subject: Re: History of Common Lisp's multidimensional arrays?
Date: Thu, 04 Sep 2003 07:28:29 +0000
Message-ID: <offzjdovvm.fsf@situla.ted.dk.eu.ericsson.se>

>>>>> "Steven" == Steven M Haflich <·················@alum.mit.edu> writes:

Steven> Look at the `typed' defstruct mess still hanging on the language.

I am curious about what you mean by this.

If you are talking about the ability to do 

   (defstruct (ship (:type list)) pos speed cargo)

giving you a bunch of accessors for list of three elements, I find
that very nifty. 

I find it  greatly enhances readability by enabling `(ship-cargo x)'
over `(nth 2 x)', increases representation robustness, decreases
off-by-1 errors and is much better when doing things such as
`(sort xs (function <) :key (function ship-speed))' as opposed to
`(sort xs (function <) :key (function cadr))'.


------------------------+-----------------------------------------------------
Christian Lynbech       | christian ··@ defun #\. dk
------------------------+-----------------------------------------------------
Hit the philistines three times over the head with the Elisp reference manual.
                                        - ·······@hal.com (Michael A. Petonic)

From: David Golden
Subject: Re: History of Common Lisp's multidimensional arrays?
Date: Mon, 01 Sep 2003 10:55:21 +0000
Message-ID: <235b265c.0309010255.24bff155@posting.google.com>

Peter Seibel <·····@javamonkey.com> wrote in message news:<··············@javamonkey.com>...
> Does anyone have any pointers to information about the history of the
> design of Common Lisp's multidimensional arrays? For example:
> precursors in other languages, documentation of issues considered
> during the standardization, etc? Or personal knowledge they'd like to
> share?
>

Most languages with multidimensional arrays borrow heavily from a fun
language
called "APL", standing for "A Programming Language", which deals with
mathematical functions over large multidimensional arrays in a quite
nice way.  In its pure form, APL requires its own character set, since
it is written using a sort of adapted mathematical notation, though
many derivatives such as J and K use an ugly ASCII mapping (I find APL
in APL character sets relatively readable, but the "modern" ASCII APLs
really do look like line noise).  I always wanted my own APL keyboard,
just for sheer pose value :-)

Common Lisp array functions seem like a very verbose version
of APL.   (Alternatively, APL might be seen as extremely terse and
cryptic...)

I would imagine the Lisp community was at least aware of APL when
implementing
multidimensional arrays and various mathematical funcitons in lisp.

From: Kent M Pitman
Subject: Re: History of Common Lisp's multidimensional arrays?
Date: Tue, 02 Sep 2003 19:08:00 +0000
Message-ID: <sfwznhnxb3j.fsf@shell01.TheWorld.com>

Peter Seibel <·····@javamonkey.com> writes:

> Does anyone have any pointers to information about the history of the
> design of Common Lisp's multidimensional arrays? For example:
> precursors in other languages, documentation of issues considered
> during the standardization, etc? Or personal knowledge they'd like to
> share?
> 
> I'm just trying to get a sense of why things are the way they are--the
> various facilities: access by row-major-aref, adjustability,
> displacement, add up to a lot of flexibility but I'm feeling a bit
> fuzzy on what problems they were designed to address and how they were
> intended to be combined.

These same issues come up in FORTRAN.

As an implementor, you have to choose whether to use column major layout
or row major layout, whether or not you publish that fact.  In memory,
arrays are (obviously) linear.

In FORTRAN, you can use COMMON to create overlaid arrays of other
arities and declared types.  For example, some libraries did not trust
compilers, to be adequately efficient with double precision and
complex in some or all circumstances, and so they didn't hesitate to use
overlays to regular REAL arrays and manipulated the parts of the words
individually themselves.

MACLISP had multiple arity arrays, which is why the LispM had them.  CL
descends from both of these.  MACLISP had two operators LISTARRAY and 
FILLARRAY that operated on elements of an array, regardless of arity, 
exploiting their row major layout.  There was an X3J13 Cleanup Issue 
that introduced ROW-MAJOR-AREF that you can look to in order to see the
precise ways in which this led to the introduction of ROW-MAJOR-AREF.

Displacement was not available in MACLISP, that I recall, and was new in
the LispM.  However, it was in FORTRAN.  Using one array displaced into 
another, I vaguely recall that one could make it possible to use negative
array indices in Fortran, and other tricks.

What was new with the LispM, and which did not carry into CL (perhaps
because of the lack of microcode assist for speed) was conformally
displaced arrays (I think you said :displaced-conformally t, or some
such) in which case you got a displaced region of the original square
(cube, etc) rather than a region of the linearized storage.  This was
useful for displacing to screen memory, especially since the LispM
used DMA (direct memory access) display from a raster array that was,
I think, specially known by the screen to mean "this array's memor IS
the screen" and doing a SETF of AREF into that special array made
something appear on the screen.  All windows had conformally displaced
indirect arrays that represented their part of the screen.

From: james anderson
Subject: Re: History of Common Lisp's multidimensional arrays?
Date: Tue, 02 Sep 2003 20:19:46 +0000
Message-ID: <3F54FAD2.D7FDB355@setf.de>

Kent M Pitman wrote:
> 
> Peter Seibel <·····@javamonkey.com> writes:
> 
> > Does anyone have any pointers to information about the history of the
> > design of Common Lisp's multidimensional arrays? For example:
> > precursors in other languages, documentation of issues considered
> > during the standardization, etc? Or personal knowledge they'd like to
> > share?
> >...
> 
> These same issues come up in FORTRAN.
> 
> As an implementor, you have to choose whether to use column major layout
> or row major layout, whether or not you publish that fact.  In memory,
> arrays are (obviously) linear.
> 

until one ventures into the world of distributed shared memory.

my recollection from writing simulations (mostly in c) on the bbn-butterfly
was that this was one aspect of its architecture. which i would think would
have made its way into butterfly common lisp and scheme. i went looking for
documents, but the pointers hit a dead end in the cmu ai-repository. in the
process, however, i did notice a thesis, AITR-2002-005 in
www.ai.mit.edu/publications/pubsDB/pubs.html, which discusses the problem in general.

...