List diagrams -- Siebel and Touretzky draw them differently

From: ·············@gmail.com
Subject: List diagrams -- Siebel and Touretzky draw them differently
Date: Mon, 19 May 2008 15:39:27 +0000
Message-ID: <0267e20b-0be8-4acb-9087-13494ed3b8b9@m45g2000hsb.googlegroups.com>

Hi,

It was recently pointed out to me that my understanding of lists is
incorrect.  Looking further into the issue I found two types of
diagrams being drawn.  One is by Peter Seibel, who places the list
elements in the car of the cons cell, while the other is Touretzky,
who places them outside of the car, with a pointer from the car to the
element.

Is one of them grossly in error, or are the differences minor?

(It was also pointed out that the "car points to element" way explains
how an element can be a member of more than one list.  That would
suggest that Seibel's diagrams are in error. )

The reason I am asking as that I still want to submit a lisp cookbook
recipe on growing the list by the tail, and the list representation is
central to explaining the recipe.

Thanks,

Mirko

Re: List diagrams -- Siebel and Touretzky draw them differently Rainer Joswig
- Re: List diagrams -- Siebel and Touretzky draw them differently Rob Warnock
  - Re: List diagrams -- Siebel and Touretzky draw them differently ······@corporate-world.lisp.de
    - Re: List diagrams -- Siebel and Touretzky draw them differently John Thingstad
      - Re: List diagrams -- Siebel and Touretzky draw them differently Edi Weitz
    - Re: List diagrams -- Siebel and Touretzky draw them differently Rainer Joswig
      - Re: List diagrams -- Siebel and Touretzky draw them differently Duane Rettig
        Re: List diagrams -- Siebel and Touretzky draw them differently Rainer Joswig
        Re: List diagrams -- Siebel and Touretzky draw them differently Duane Rettig
        Re: List diagrams -- Siebel and Touretzky draw them differently Thomas F. Burdick
        Re: List diagrams -- Siebel and Touretzky draw them differently Duane Rettig
        Re: List diagrams -- Siebel and Touretzky draw them differently Rob Warnock
        Re: List diagrams -- Siebel and Touretzky draw them differently Waldek Hebisch
        Re: List diagrams -- Siebel and Touretzky draw them differently Juanjo
        Re: List diagrams -- Siebel and Touretzky draw them differently Pascal J. Bourguignon
Re: List diagrams -- Siebel and Touretzky draw them differently Pascal J. Bourguignon
- Re: List diagrams -- Siebel and Touretzky draw them differently ·············@gmail.com
  - Re: List diagrams -- Siebel and Touretzky draw them differently Ken Tilton
    - Re: List diagrams -- Siebel and Touretzky draw them differently ·············@gmail.com
      - Re: List diagrams -- Siebel and Touretzky draw them differently Ken Tilton
        Re: List diagrams -- Siebel and Touretzky draw them differently Sohail Somani
        Re: List diagrams -- Siebel and Touretzky draw them differently ·············@gmail.com
        Re: List diagrams -- Siebel and Touretzky draw them differently Grant Rettke
- Re: List diagrams -- Siebel and Touretzky draw them differently ·······@eurogaran.com
  - Re: List diagrams -- Siebel and Touretzky draw them differently Grant Rettke
    - Re: List diagrams -- Siebel and Touretzky draw them differently John Thingstad
      - Re: List diagrams -- Siebel and Touretzky draw them differently Pascal J. Bourguignon
- Re: List diagrams -- Siebel and Touretzky draw them differently Grant Rettke
- Re: List diagrams -- Siebel and Touretzky draw them differently Rainer Joswig
  - Re: List diagrams -- Siebel and Touretzky draw them differently John Thingstad
  - Re: List diagrams -- Siebel and Touretzky draw them differently Duane Rettig
Re: List diagrams -- Siebel and Touretzky draw them differently Pertti Kellomäki

From: Rainer Joswig
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Mon, 19 May 2008 16:08:13 +0000
Message-ID: <joswig-BFC58B.18081319052008@news-europe.giganews.com>

In article 
<····································@m45g2000hsb.googlegroups.com>,
 ·············@gmail.com wrote:

> Hi,
> 
> It was recently pointed out to me that my understanding of lists is
> incorrect.  Looking further into the issue I found two types of
> diagrams being drawn.  One is by Peter Seibel, who places the list
> elements in the car of the cons cell,

Not always. Other cons cells and strings
are pointed to in his diagrams.

> while the other is Touretzky,
> who places them outside of the car, with a pointer from the car to the
> element.
> 
> Is one of them grossly in error, or are the differences minor?

Usually I would think of cons cells being a record of
two pointers.

Lisp has two different types of objects:

* objects that have an identity (cons cells, arrays, ...)
* objects where the 'same' object is possibly not the
  same in memory. numbers are such a case. the number
  5 is usually not represented by one object for the number 5
  and everywhere this number is used there is a pointer to this number.
  That's not the case.

So sometimes a Lisp system can optimize the storage
of an object like a number such that it stores the object
directly into the slot of the datastructure.

But I don't think this is usually done for cons cells on the heap.

> (It was also pointed out that the "car points to element" way explains
> how an element can be a member of more than one list.  That would
> suggest that Seibel's diagrams are in error. )
> 
> The reason I am asking as that I still want to submit a lisp cookbook
> recipe on growing the list by the tail, and the list representation is
> central to explaining the recipe.
> 
> Thanks,
> 
> Mirko

-- 
http://lispm.dyndns.org/

From: Rob Warnock
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Tue, 20 May 2008 02:25:23 +0000
Message-ID: <ONSdnaxOYOgOq6_VnZ2dnUVZ_rzinZ2d@speakeasy.net>

Rainer Joswig  <······@lisp.de> wrote:
+---------------
| Lisp has two different types of objects:
| * objects that have an identity (cons cells, arrays, ...)
| * objects where the 'same' object is possibly not the
|   same in memory. numbers are such a case. the number
|   5 is usually not represented by one object for the number 5
|   and everywhere this number is used there is a pointer to this number.
|   That's not the case.
+---------------

[Aside: SIOD Scheme is an interesting mix of these, since while all
numbers are heap-allocated, a few small integers (typically 2048) are
allocated only once at startup and "interned", as it were, so that
*any* subsequent arithmetic operation which results in one of those
"interned" values gets a pointer to the "interned" (shared) instance
of that value, whereas all other result values are freshly allocated
on the heap each time they are generated as a arithmetic result.
In CL syntax (and assuming 2048 such "inums" were defined), executing
READ-FROM-STRING("(0 1 2046 2047 2048 2048 2049 2049)") would only
allocate four new integers -- but two copies each of 2048 & 2049!]

+---------------
| So sometimes a Lisp system can optimize the storage of an object
| like a number such that it stores the object directly into the slot
| of the datastructure.
| But I don't think this is usually done for cons cells on the heap.
+---------------

Actually, for fixnums & chars [and perhaps also for some other "small"
types like NIL, <unbound>, or <GC-fwd-ptr>] it almost always *IS* done!!
The trick is that the type is encoded into the same amount of space
as a Lisp pointer, and distinguished from a pointer with a few "tag"
bits stolen from the lowest (usually) few bits of the "pointer".[1]

Since such "immediate" values takes the same space as a pointer,
they are stored into other Lisp objects -- including cons cells --
exactly the same as actual pointers to heap objects.


-Rob

[1] For byte-addressed machines, if one chooses to force heap objects
    to be allocated & aligned on (say) 8-byte boundaries [which is how
    big a cons cell would be for a 32-bit machine], then it's convenient
    to allocate the least-significant 3 bits of each Lisp value to the
    "lowtag", since *if* it's a pointer-type value than the machine
    address of the associated heap object can always be found by masking
    off (or subtracting out "for free" using address arithmetic in the
    instruction set) the low three bits. E.g., CMUCL currently uses
    these lowtags [note that fixnums use *two* lowtag codepoints, which
    allows fixnums to be signed 30-bit numbers instead of 29-bit],:

    000 even fixnum
    001 function pointer
    010 other-immediate-0 (header-words, chars, symbol-value trap value, etc.)
    011 list pointer (cons cell or NIL)
    100 odd fixnum
    101 CLOS instance or structure pointer
    110 other-immediate-1 (overflow from other-immediate-0)
    111 other-pointer (heap data-blocks other than functions, conses, or
		       structs/CLOS_objs -- symbols, strings, arrays, etc.)

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607

From: ······@corporate-world.lisp.de
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Tue, 20 May 2008 03:49:42 +0000
Message-ID: <a291bb3e-62f1-4024-bea9-5bd75f6778c6@z72g2000hsb.googlegroups.com>

On May 20, 4:25 am, ····@rpw3.org (Rob Warnock) wrote:
> Rainer Joswig  <······@lisp.de> wrote:
> +---------------
> | Lisp has two different types of objects:
> | * objects that have an identity (cons cells, arrays, ...)
> | * objects where the 'same' object is possibly not the
> |   same in memory. numbers are such a case. the number
> |   5 is usually not represented by one object for the number 5
> |   and everywhere this number is used there is a pointer to this number.
> |   That's not the case.
> +---------------
>
> [Aside: SIOD Scheme is an interesting mix of these, since while all
> numbers are heap-allocated, a few small integers (typically 2048) are
> allocated only once at startup and "interned", as it were, so that
> *any* subsequent arithmetic operation which results in one of those
> "interned" values gets a pointer to the "interned" (shared) instance
> of that value, whereas all other result values are freshly allocated
> on the heap each time they are generated as a arithmetic result.
> In CL syntax (and assuming 2048 such "inums" were defined), executing
> READ-FROM-STRING("(0 1 2046 2047 2048 2048 2049 2049)") would only
> allocate four new integers -- but two copies each of 2048 & 2049!]
>
> +---------------
> | So sometimes a Lisp system can optimize the storage of an object
> | like a number such that it stores the object directly into the slot
> | of the datastructure.
> | But I don't think this is usually done for cons cells on the heap.
> +---------------
>
> Actually, for fixnums & chars [and perhaps also for some other "small"
> types like NIL, <unbound>, or <GC-fwd-ptr>] it almost always *IS* done!!

But it would not be for the full fixnum range, but usually something
smaller than fixnum?

> The trick is that the type is encoded into the same amount of space
> as a Lisp pointer, and distinguished from a pointer with a few "tag"
> bits stolen from the lowest (usually) few bits of the "pointer".[1]
>
> Since such "immediate" values takes the same space as a pointer,
> they are stored into other Lisp objects -- including cons cells --
> exactly the same as actual pointers to heap objects.
>
> -Rob
>
> [1] For byte-addressed machines, if one chooses to force heap objects
>     to be allocated & aligned on (say) 8-byte boundaries [which is how
>     big a cons cell would be for a 32-bit machine], then it's convenient
>     to allocate the least-significant 3 bits of each Lisp value to the
>     "lowtag", since *if* it's a pointer-type value than the machine
>     address of the associated heap object can always be found by masking
>     off (or subtracting out "for free" using address arithmetic in the
>     instruction set) the low three bits. E.g., CMUCL currently uses
>     these lowtags [note that fixnums use *two* lowtag codepoints, which
>     allows fixnums to be signed 30-bit numbers instead of 29-bit],:
>
>     000 even fixnum
>     001 function pointer
>     010 other-immediate-0 (header-words, chars, symbol-value trap value, etc.)
>     011 list pointer (cons cell or NIL)
>     100 odd fixnum
>     101 CLOS instance or structure pointer
>     110 other-immediate-1 (overflow from other-immediate-0)
>     111 other-pointer (heap data-blocks other than functions, conses, or
>                        structs/CLOS_objs -- symbols, strings, arrays, etc.)
>
> -----
> Rob Warnock                     <····@rpw3.org>
> 627 26th Avenue                 <URL:http://rpw3.org/>
> San Mateo, CA 94403             (650)572-2607

From: John Thingstad
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Tue, 20 May 2008 07:06:44 +0000
Message-ID: <op.ubfx1ir0ut4oq5@pandora.alfanett.no>

P� Tue, 20 May 2008 05:49:42 +0200, skrev ······@corporate-world.lisp.de  
<······@corporate-world.lisp.de>:

>>
>> Actually, for fixnums & chars [and perhaps also for some other "small"
>> types like NIL, <unbound>, or <GC-fwd-ptr>] it almost always *IS* done!!
>
> But it would not be for the full fixnum range, but usually something
> smaller than fixnum?
>

The whole idea of fixnum is that it fits in the same space as a pointer.
fixnums have two or three tag bits and leave the rest for the number.

In LispWorks on a 32 bit machine.

CL-USER 4 > most-positive-fixnum
536870911

CL-USER 5 > (log * 2)
29.0

--------------
John Thingstad

From: Edi Weitz
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Tue, 20 May 2008 07:13:49 +0000
Message-ID: <ufxsd1m8i.fsf@agharta.de>

On Tue, 20 May 2008 09:06:44 +0200, "John Thingstad" <·······@online.no> wrote:

> The whole idea of fixnum is that it fits in the same space as a
> pointer.  fixnums have two or three tag bits and leave the rest for
> the number.
>
> In LispWorks on a 32 bit machine.
>
> CL-USER 4 > most-positive-fixnum
> 536870911
>
> CL-USER 5 > (log * 2)
> 29.0

BTW:

  http://www.lispworks.com/documentation/HyperSpec/Body/f_intege.htm

Edi.

-- 

Lisp is not dead, it just smells funny.

Real email: (replace (subseq ·········@agharta.de" 5) "edi")

From: Rainer Joswig
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Tue, 20 May 2008 09:09:39 +0000
Message-ID: <joswig-AA5673.11093920052008@news-europe.giganews.com>

In article 
<····································@z72g2000hsb.googlegroups.com>,
 ·······@corporate-world.lisp.de" <······@corporate-world.lisp.de> 
 wrote:

> On May 20, 4:25 am, ····@rpw3.org (Rob Warnock) wrote:
> > Rainer Joswig  <······@lisp.de> wrote:
> > +---------------
> > | Lisp has two different types of objects:
> > | * objects that have an identity (cons cells, arrays, ...)
> > | * objects where the 'same' object is possibly not the
> > |   same in memory. numbers are such a case. the number
> > |   5 is usually not represented by one object for the number 5
> > |   and everywhere this number is used there is a pointer to this number.
> > |   That's not the case.
> > +---------------
> >
> > [Aside: SIOD Scheme is an interesting mix of these, since while all
> > numbers are heap-allocated, a few small integers (typically 2048) are
> > allocated only once at startup and "interned", as it were, so that
> > *any* subsequent arithmetic operation which results in one of those
> > "interned" values gets a pointer to the "interned" (shared) instance
> > of that value, whereas all other result values are freshly allocated
> > on the heap each time they are generated as a arithmetic result.
> > In CL syntax (and assuming 2048 such "inums" were defined), executing
> > READ-FROM-STRING("(0 1 2046 2047 2048 2048 2049 2049)") would only
> > allocate four new integers -- but two copies each of 2048 & 2049!]
> >
> > +---------------
> > | So sometimes a Lisp system can optimize the storage of an object
> > | like a number such that it stores the object directly into the slot
> > | of the datastructure.
> > | But I don't think this is usually done for cons cells on the heap.
> > +---------------
> >
> > Actually, for fixnums & chars [and perhaps also for some other "small"
> > types like NIL, <unbound>, or <GC-fwd-ptr>] it almost always *IS* done!!
> 
> But it would not be for the full fixnum range, but usually something
> smaller than fixnum?

Okay, answering my own question after rereading tag usage in Clozure CL
and CMUCL and than rereading Rob's mail. ;-)

The cons will just be two 32 bit words (in a 32 bit Lisp) and
can encode the usual data. The cons itself does not need a tag
to encode its type - so no space is wasted there. Instead the usual
trick is to encode the information into the pointer to such a cons via a special
tag. It's not a pointer, but a pointer to a cons. Other datastructures
that don't have the luxury of a tag for themselves (the number of possible tags
is limited - often 2**3 in 32bit Lisp and 2**4 in 64bit Lisp) needs to be encoded
differently (using a header, ...). So, there is some built-in 'trickery'
in most Lisps to represent conses in an efficient way.

So going back to the original question. Peter Seibel's box diagrams
indicate that some 'primitive' data (fixnums) can be stored 'inside'
the cons cell, while other data (other cons cells, strings, bignums, ...)
will be pointed to.

Thanks, Rob.

> 
> > The trick is that the type is encoded into the same amount of space
> > as a Lisp pointer, and distinguished from a pointer with a few "tag"
> > bits stolen from the lowest (usually) few bits of the "pointer".[1]
> >
> > Since such "immediate" values takes the same space as a pointer,
> > they are stored into other Lisp objects -- including cons cells --
> > exactly the same as actual pointers to heap objects.
> >
> > -Rob
> >
> > [1] For byte-addressed machines, if one chooses to force heap objects
> >     to be allocated & aligned on (say) 8-byte boundaries [which is how
> >     big a cons cell would be for a 32-bit machine], then it's convenient
> >     to allocate the least-significant 3 bits of each Lisp value to the
> >     "lowtag", since *if* it's a pointer-type value than the machine
> >     address of the associated heap object can always be found by masking
> >     off (or subtracting out "for free" using address arithmetic in the
> >     instruction set) the low three bits. E.g., CMUCL currently uses
> >     these lowtags [note that fixnums use *two* lowtag codepoints, which
> >     allows fixnums to be signed 30-bit numbers instead of 29-bit],:
> >
> >     000 even fixnum
> >     001 function pointer
> >     010 other-immediate-0 (header-words, chars, symbol-value trap value, etc.)
> >     011 list pointer (cons cell or NIL)
> >     100 odd fixnum
> >     101 CLOS instance or structure pointer
> >     110 other-immediate-1 (overflow from other-immediate-0)
> >     111 other-pointer (heap data-blocks other than functions, conses, or
> >                        structs/CLOS_objs -- symbols, strings, arrays, etc.)
> >
> > -----
> > Rob Warnock                     <····@rpw3.org>
> > 627 26th Avenue                 <URL:http://rpw3.org/>
> > San Mateo, CA 94403             (650)572-2607

-- 
http://lispm.dyndns.org/

From: Duane Rettig
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Tue, 20 May 2008 15:02:53 +0000
Message-ID: <o0lk25dnmq.fsf@gemini.franz.com>

Rainer Joswig <······@lisp.de> writes:

> In article 
> <····································@z72g2000hsb.googlegroups.com>,
>  ·······@corporate-world.lisp.de" <······@corporate-world.lisp.de> 
>  wrote:
>
>> On May 20, 4:25 am, ····@rpw3.org (Rob Warnock) wrote:
>> > Rainer Joswig  <······@lisp.de> wrote:
>> > +---------------
>> > | Lisp has two different types of objects:
>> > | * objects that have an identity (cons cells, arrays, ...)
>> > | * objects where the 'same' object is possibly not the
>> > |   same in memory. numbers are such a case. the number
>> > |   5 is usually not represented by one object for the number 5
>> > |   and everywhere this number is used there is a pointer to this number.
>> > |   That's not the case.
>> > +---------------
>> >
>> > [Aside: SIOD Scheme is an interesting mix of these, since while all
>> > numbers are heap-allocated, a few small integers (typically 2048) are
>> > allocated only once at startup and "interned", as it were, so that
>> > *any* subsequent arithmetic operation which results in one of those
>> > "interned" values gets a pointer to the "interned" (shared) instance
>> > of that value, whereas all other result values are freshly allocated
>> > on the heap each time they are generated as a arithmetic result.
>> > In CL syntax (and assuming 2048 such "inums" were defined), executing
>> > READ-FROM-STRING("(0 1 2046 2047 2048 2048 2049 2049)") would only
>> > allocate four new integers -- but two copies each of 2048 & 2049!]
>> >
>> > +---------------
>> > | So sometimes a Lisp system can optimize the storage of an object
>> > | like a number such that it stores the object directly into the slot
>> > | of the datastructure.
>> > | But I don't think this is usually done for cons cells on the heap.
>> > +---------------
>> >
>> > Actually, for fixnums & chars [and perhaps also for some other "small"
>> > types like NIL, <unbound>, or <GC-fwd-ptr>] it almost always *IS* done!!
>> 
>> But it would not be for the full fixnum range, but usually something
>> smaller than fixnum?
>
> Okay, answering my own question after rereading tag usage in Clozure CL
> and CMUCL and than rereading Rob's mail. ;-)
>
> The cons will just be two 32 bit words (in a 32 bit Lisp) and
> can encode the usual data. The cons itself does not need a tag
> to encode its type - so no space is wasted there. Instead the usual
> trick is to encode the information into the pointer to such a cons via a special
> tag. It's not a pointer, but a pointer to a cons. Other datastructures
> that don't have the luxury of a tag for themselves (the number of possible tags
> is limited - often 2**3 in 32bit Lisp and 2**4 in 64bit Lisp) needs to be encoded
> differently (using a header, ...). So, there is some built-in 'trickery'
> in most Lisps to represent conses in an efficient way.

Some lisps actually encode a cons cell with more than two natural
words.  But most rely on the tag of the cons' "pointer" to identify it
as a cons cell.  Each of the car and the cdr are indeed large enough
to hold a complete tagged "pointer" to a lisp object.

> So going back to the original question. Peter Seibel's box diagrams
> indicate that some 'primitive' data (fixnums) can be stored 'inside'
> the cons cell, while other data (other cons cells, strings, bignums, ...)
> will be pointed to.

I have always used (and have on occasion been criticized for using)
the studly-caps name "LispVal" to describe the tagged sometimes-pointer
that represents a lisp value.  I use an ex-lisp naming style for it
because the name is a meta-name in more of a sense than any of the
other meta-concepts in Lisp - a name like "lisp-val" doesn't stand out
enough as a meta-name - it just looks like a lisp name, and there could
indeed be a lisp name named lisp-val.  So I use LispVal to describe
these.

LispVals are not pointers.  They are always tagged, the tag bits
usually being the least significant 3 bits (in a 32-bit lisp) or 4
bits (in a 64-bit lisp).  The tag then determines how the remaining
bits in the word are interpreted; a character tag, for example, might
cause the rest of the bits to be interpreted as the char-code of the
character.  Fixnums tend to be special; on most lisps that use this
approach two tags are used: 0 and 4 for 32-bit and 0 and 8 for 64-bit,
with the 0 representing the even fixnums and the 4/8 representing the
odd fixnums.  The rest of the bits are effectively the value of the
fixnum truncated by 2, so the whole fixnum can be translated to a
machine integer by doing an arithmetic shift right by one bit less
than the width of the tag.  This setup is chosen because when doing
array accesses on arrays whose element types are natural-word-sized,
the need for an extra shift is eliminated.

For other kinds of objects, the tag causes the other bits to be
interpreted as pointers, each of which points to a block of data in the
Lisp heap which I just call a Lisp Object.  These are always allocated
on 64-bit boundaries (for a 32-bit lisp) or 128-bit boudaries (for a
64-bit lisp) and the rest of the bits in a LispVal point to the Lisp
object - to get to the Lisp Object's start in C code, just take the
LispVal as a char * and subtract off the tag value.  This is what
happens in the case of cons cells, symbols, and any other object type.
There are some variations on this, but the concept is a simple one.

So back to the original question;  yes, both representations are
right, and both representations are wrong.  Because CL doesn't specify
what objects are immediate and what objects are heap-allocated, the
diagrams cannot be arranged to be completely correct for all
implementations.  But as far as I'm concerned, both styles are useful,
if one doesn't take them too literally.

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182

From: Rainer Joswig
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Tue, 20 May 2008 17:13:54 +0000
Message-ID: <joswig-603141.19135320052008@news-europe.giganews.com>

In article <··············@gemini.franz.com>,
 Duane Rettig <·····@franz.com> wrote:

> Rainer Joswig <······@lisp.de> writes:
> 
> > In article 
> > <····································@z72g2000hsb.googlegroups.com>,
> >  ·······@corporate-world.lisp.de" <······@corporate-world.lisp.de> 
> >  wrote:
> >
> >> On May 20, 4:25 am, ····@rpw3.org (Rob Warnock) wrote:
> >> > Rainer Joswig  <······@lisp.de> wrote:
> >> > +---------------
> >> > | Lisp has two different types of objects:
> >> > | * objects that have an identity (cons cells, arrays, ...)
> >> > | * objects where the 'same' object is possibly not the
> >> > |   same in memory. numbers are such a case. the number
> >> > |   5 is usually not represented by one object for the number 5
> >> > |   and everywhere this number is used there is a pointer to this number.
> >> > |   That's not the case.
> >> > +---------------
> >> >
> >> > [Aside: SIOD Scheme is an interesting mix of these, since while all
> >> > numbers are heap-allocated, a few small integers (typically 2048) are
> >> > allocated only once at startup and "interned", as it were, so that
> >> > *any* subsequent arithmetic operation which results in one of those
> >> > "interned" values gets a pointer to the "interned" (shared) instance
> >> > of that value, whereas all other result values are freshly allocated
> >> > on the heap each time they are generated as a arithmetic result.
> >> > In CL syntax (and assuming 2048 such "inums" were defined), executing
> >> > READ-FROM-STRING("(0 1 2046 2047 2048 2048 2049 2049)") would only
> >> > allocate four new integers -- but two copies each of 2048 & 2049!]
> >> >
> >> > +---------------
> >> > | So sometimes a Lisp system can optimize the storage of an object
> >> > | like a number such that it stores the object directly into the slot
> >> > | of the datastructure.
> >> > | But I don't think this is usually done for cons cells on the heap.
> >> > +---------------
> >> >
> >> > Actually, for fixnums & chars [and perhaps also for some other "small"
> >> > types like NIL, <unbound>, or <GC-fwd-ptr>] it almost always *IS* done!!
> >> 
> >> But it would not be for the full fixnum range, but usually something
> >> smaller than fixnum?
> >
> > Okay, answering my own question after rereading tag usage in Clozure CL
> > and CMUCL and than rereading Rob's mail. ;-)
> >
> > The cons will just be two 32 bit words (in a 32 bit Lisp) and
> > can encode the usual data. The cons itself does not need a tag
> > to encode its type - so no space is wasted there. Instead the usual
> > trick is to encode the information into the pointer to such a cons via a special
> > tag. It's not a pointer, but a pointer to a cons. Other datastructures
> > that don't have the luxury of a tag for themselves (the number of possible tags
> > is limited - often 2**3 in 32bit Lisp and 2**4 in 64bit Lisp) needs to be encoded
> > differently (using a header, ...). So, there is some built-in 'trickery'
> > in most Lisps to represent conses in an efficient way.
> 
> Some lisps actually encode a cons cell with more than two natural
> words.

Any reason for that?

>  But most rely on the tag of the cons' "pointer" to identify it
> as a cons cell.  Each of the car and the cdr are indeed large enough
> to hold a complete tagged "pointer" to a lisp object.
> 
> > So going back to the original question. Peter Seibel's box diagrams
> > indicate that some 'primitive' data (fixnums) can be stored 'inside'
> > the cons cell, while other data (other cons cells, strings, bignums, ...)
> > will be pointed to.
> 
> I have always used (and have on occasion been criticized for using)
> the studly-caps name "LispVal" to describe the tagged sometimes-pointer
> that represents a lisp value.  I use an ex-lisp naming style for it
> because the name is a meta-name in more of a sense than any of the
> other meta-concepts in Lisp - a name like "lisp-val" doesn't stand out
> enough as a meta-name - it just looks like a lisp name, and there could
> indeed be a lisp name named lisp-val.  So I use LispVal to describe
> these.
> 
> LispVals are not pointers.  They are always tagged, the tag bits
> usually being the least significant 3 bits (in a 32-bit lisp) or 4
> bits (in a 64-bit lisp).  The tag then determines how the remaining
> bits in the word are interpreted; a character tag, for example, might
> cause the rest of the bits to be interpreted as the char-code of the
> character.  Fixnums tend to be special; on most lisps that use this
> approach two tags are used: 0 and 4 for 32-bit and 0 and 8 for 64-bit,
> with the 0 representing the even fixnums and the 4/8 representing the
> odd fixnums.  The rest of the bits are effectively the value of the
> fixnum truncated by 2, so the whole fixnum can be translated to a
> machine integer by doing an arithmetic shift right by one bit less
> than the width of the tag.  This setup is chosen because when doing
> array accesses on arrays whose element types are natural-word-sized,
> the need for an extra shift is eliminated.
> 
> For other kinds of objects, the tag causes the other bits to be
> interpreted as pointers, each of which points to a block of data in the
> Lisp heap which I just call a Lisp Object.  These are always allocated
> on 64-bit boundaries (for a 32-bit lisp)

Is there some potential to waste space when dealing with lots of
32bit objects?

> or 128-bit boudaries (for a
> 64-bit lisp) and the rest of the bits in a LispVal point to the Lisp
> object - to get to the Lisp Object's start in C code, just take the
> LispVal as a char * and subtract off the tag value.  This is what
> happens in the case of cons cells, symbols, and any other object type.
> There are some variations on this, but the concept is a simple one.
> 
> So back to the original question;  yes, both representations are
> right, and both representations are wrong.  Because CL doesn't specify
> what objects are immediate and what objects are heap-allocated, the
> diagrams cannot be arranged to be completely correct for all
> implementations.  But as far as I'm concerned, both styles are useful,
> if one doesn't take them too literally.

Thanks for the good explanation!

-- 
http://lispm.dyndns.org/

From: Duane Rettig
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Tue, 20 May 2008 20:40:31 +0000
Message-ID: <o0d4ngemkg.fsf@gemini.franz.com>

Rainer Joswig <······@lisp.de> writes:

> In article <··············@gemini.franz.com>,
>  Duane Rettig <·····@franz.com> wrote:
>
>> Rainer Joswig <······@lisp.de> writes:
>> 
>> > In article 
>> > <····································@z72g2000hsb.googlegroups.com>,
>> >  ·······@corporate-world.lisp.de" <······@corporate-world.lisp.de> 
>> >  wrote:
>> >
>> >> On May 20, 4:25 am, ····@rpw3.org (Rob Warnock) wrote:
>> >> > Rainer Joswig  <······@lisp.de> wrote:
>> >> > +---------------
>> >> > | Lisp has two different types of objects:
>> >> > | * objects that have an identity (cons cells, arrays, ...)
>> >> > | * objects where the 'same' object is possibly not the
>> >> > |   same in memory. numbers are such a case. the number
>> >> > |   5 is usually not represented by one object for the number 5
>> >> > |   and everywhere this number is used there is a pointer to this number.
>> >> > |   That's not the case.
>> >> > +---------------
>> >> >
>> >> > [Aside: SIOD Scheme is an interesting mix of these, since while all
>> >> > numbers are heap-allocated, a few small integers (typically 2048) are
>> >> > allocated only once at startup and "interned", as it were, so that
>> >> > *any* subsequent arithmetic operation which results in one of those
>> >> > "interned" values gets a pointer to the "interned" (shared) instance
>> >> > of that value, whereas all other result values are freshly allocated
>> >> > on the heap each time they are generated as a arithmetic result.
>> >> > In CL syntax (and assuming 2048 such "inums" were defined), executing
>> >> > READ-FROM-STRING("(0 1 2046 2047 2048 2048 2049 2049)") would only
>> >> > allocate four new integers -- but two copies each of 2048 & 2049!]
>> >> >
>> >> > +---------------
>> >> > | So sometimes a Lisp system can optimize the storage of an object
>> >> > | like a number such that it stores the object directly into the slot
>> >> > | of the datastructure.
>> >> > | But I don't think this is usually done for cons cells on the heap.
>> >> > +---------------
>> >> >
>> >> > Actually, for fixnums & chars [and perhaps also for some other "small"
>> >> > types like NIL, <unbound>, or <GC-fwd-ptr>] it almost always *IS* done!!
>> >> 
>> >> But it would not be for the full fixnum range, but usually something
>> >> smaller than fixnum?
>> >
>> > Okay, answering my own question after rereading tag usage in Clozure CL
>> > and CMUCL and than rereading Rob's mail. ;-)
>> >
>> > The cons will just be two 32 bit words (in a 32 bit Lisp) and
>> > can encode the usual data. The cons itself does not need a tag
>> > to encode its type - so no space is wasted there. Instead the usual
>> > trick is to encode the information into the pointer to such a cons via a special
>> > tag. It's not a pointer, but a pointer to a cons. Other datastructures
>> > that don't have the luxury of a tag for themselves (the number of possible tags
>> > is limited - often 2**3 in 32bit Lisp and 2**4 in 64bit Lisp) needs to be encoded
>> > differently (using a header, ...). So, there is some built-in 'trickery'
>> > in most Lisps to represent conses in an efficient way.
>> 
>> Some lisps actually encode a cons cell with more than two natural
>> words.
>
> Any reason for that?

I don't know.  Presumably header information for gc maintenance.
Perhaps someone with such a lisp can comment.

>>  But most rely on the tag of the cons' "pointer" to identify it
>> as a cons cell.  Each of the car and the cdr are indeed large enough
>> to hold a complete tagged "pointer" to a lisp object.
>> 
>> > So going back to the original question. Peter Seibel's box diagrams
>> > indicate that some 'primitive' data (fixnums) can be stored 'inside'
>> > the cons cell, while other data (other cons cells, strings, bignums, ...)
>> > will be pointed to.
>> 
>> I have always used (and have on occasion been criticized for using)
>> the studly-caps name "LispVal" to describe the tagged sometimes-pointer
>> that represents a lisp value.  I use an ex-lisp naming style for it
>> because the name is a meta-name in more of a sense than any of the
>> other meta-concepts in Lisp - a name like "lisp-val" doesn't stand out
>> enough as a meta-name - it just looks like a lisp name, and there could
>> indeed be a lisp name named lisp-val.  So I use LispVal to describe
>> these.
>> 
>> LispVals are not pointers.  They are always tagged, the tag bits
>> usually being the least significant 3 bits (in a 32-bit lisp) or 4
>> bits (in a 64-bit lisp).  The tag then determines how the remaining
>> bits in the word are interpreted; a character tag, for example, might
>> cause the rest of the bits to be interpreted as the char-code of the
>> character.  Fixnums tend to be special; on most lisps that use this
>> approach two tags are used: 0 and 4 for 32-bit and 0 and 8 for 64-bit,
>> with the 0 representing the even fixnums and the 4/8 representing the
>> odd fixnums.  The rest of the bits are effectively the value of the
>> fixnum truncated by 2, so the whole fixnum can be translated to a
>> machine integer by doing an arithmetic shift right by one bit less
>> than the width of the tag.  This setup is chosen because when doing
>> array accesses on arrays whose element types are natural-word-sized,
>> the need for an extra shift is eliminated.
>> 
>> For other kinds of objects, the tag causes the other bits to be
>> interpreted as pointers, each of which points to a block of data in the
>> Lisp heap which I just call a Lisp Object.  These are always allocated
>> on 64-bit boundaries (for a 32-bit lisp)
>
> Is there some potential to waste space when dealing with lots of
> 32bit objects?

Yes.  For a vector, for example, if the header size times the vector
size is an odd number then there will be an unused "slot" at the end
of the vector.  The odds are, though, that half of your vectors will
be space efficient and half will waste one 32-bit word (on a 32-bit
Lisp).

>> or 128-bit boudaries (for a
>> 64-bit lisp) and the rest of the bits in a LispVal point to the Lisp
>> object - to get to the Lisp Object's start in C code, just take the
>> LispVal as a char * and subtract off the tag value.  This is what
>> happens in the case of cons cells, symbols, and any other object type.
>> There are some variations on this, but the concept is a simple one.
>> 
>> So back to the original question;  yes, both representations are
>> right, and both representations are wrong.  Because CL doesn't specify
>> what objects are immediate and what objects are heap-allocated, the
>> diagrams cannot be arranged to be completely correct for all
>> implementations.  But as far as I'm concerned, both styles are useful,
>> if one doesn't take them too literally.
>
> Thanks for the good explanation!

You're welcome.

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182

From: Thomas F. Burdick
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Tue, 20 May 2008 20:58:56 +0000
Message-ID: <d4b700c4-827a-4084-b351-ecc939a7978a@y22g2000prd.googlegroups.com>

On May 20, 10:40 pm, Duane Rettig <·····@franz.com> wrote:
> Rainer Joswig <······@lisp.de> writes:
> > In article <··············@gemini.franz.com>,
> >  Duane Rettig <·····@franz.com> wrote:
>
> >> Rainer Joswig <······@lisp.de> writes:
>
> >> For other kinds of objects, the tag causes the other bits to be
> >> interpreted as pointers, each of which points to a block of data in the
> >> Lisp heap which I just call a Lisp Object.  These are always allocated
> >> on 64-bit boundaries (for a 32-bit lisp)
>
> > Is there some potential to waste space when dealing with lots of
> > 32bit objects?
>
> Yes.  For a vector, for example, if the header size times the vector
> size is an odd number then there will be an unused "slot" at the end
> of the vector.  The odds are, though, that half of your vectors will
> be space efficient and half will waste one 32-bit word (on a 32-bit
> Lisp).

I'd also add to this that with 3 bits of tag you're going to lose the
ability to address 7/8 of the heap one way or another; either you get
64-bit precision or you can address every octet in 1/8 of the heap.  I
suppose it would be possible to have a wacky hybrid approach where for
example you can address every 32-bit aligned word in 1/2 the heap.
But given that you won't have any heap objects exactly 32 bits in
size--so you avoid the worst-case scenario of 50% bloat--it seems
pretty clear to me that 64 bit alignment is the best trade off.

From: Duane Rettig
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Tue, 20 May 2008 23:22:08 +0000
Message-ID: <o0r6bwef33.fsf@gemini.franz.com>

"Thomas F. Burdick" <········@gmail.com> writes:

> On May 20, 10:40�pm, Duane Rettig <·····@franz.com> wrote:
>> Rainer Joswig <······@lisp.de> writes:
>> > In article <··············@gemini.franz.com>,
>> > �Duane Rettig <·····@franz.com> wrote:
>>
>> >> Rainer Joswig <······@lisp.de> writes:
>>
>> >> For other kinds of objects, the tag causes the other bits to be
>> >> interpreted as pointers, each of which points to a block of data in the
>> >> Lisp heap which I just call a Lisp Object. �These are always allocated
>> >> on 64-bit boundaries (for a 32-bit lisp)
>>
>> > Is there some potential to waste space when dealing with lots of
>> > 32bit objects?
>>
>> Yes. �For a vector, for example, if the header size times the vector
>> size is an odd number then there will be an unused "slot" at the end
>> of the vector. �The odds are, though, that half of your vectors will
>> be space efficient and half will waste one 32-bit word (on a 32-bit
>> Lisp).
>
> I'd also add to this that with 3 bits of tag you're going to lose the
> ability to address 7/8 of the heap one way or another; either you get
> 64-bit precision or you can address every octet in 1/8 of the heap.

Gotta think outside the box - why are you limiting yourself to one
value to perform this addressing?  But first, because of the use of
two tags for fixnums, you actually get 1/4 of the heap rather than
1/8.  But add to this the fact that every word is accounted for, and
you get a mapping of fixnums onto every natural-word-aligned address
in memory: fixnum 0 is of course 0, fixnum 1 is 4, fixnum 1000 is
4000, fixnum -1 is #xfffffffc, and most-negative-fixnum and
most-positive-fixnum are, respectively, #x80000000 and #x1ffffffc.
So every word in memory is covered by a fixnum, and every fixnum
covers a word.  In Allegro CL we call this punning "aligned pointers".
Aligned pointers are supported in more and more functions, and it
leads to less and less consing of bignums.  But we've always had such
pointers to a certain extent; sys:memref is a low-level peek-like
function that allows you to access any address in memory; it uses a
secondary offset argument (actually, there are two) which allows you
to access any byte within the requested object.  Normally, the first
argument to sys:memref is a lisp object.  But in fact, sys:memref
accepts any LispVal, which can include a fixnum; in this case the
fixnum acts as an aligned memory location 1/4 the size of its value
(with two's complement adjustments for negative values).  So with any
fixnum and another offset of less than 4, you can access any location
in memory.

  I
> suppose it would be possible to have a wacky hybrid approach where for
> example you can address every 32-bit aligned word in 1/2 the heap.
> But given that you won't have any heap objects exactly 32 bits in
> size--so you avoid the worst-case scenario of 50% bloat--it seems
> pretty clear to me that 64 bit alignment is the best trade off.

Agreed.

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182

From: Rob Warnock
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Wed, 21 May 2008 08:25:13 +0000
Message-ID: <RcWdnbs3Mp_0Qa7VnZ2dnUVZ_orinZ2d@speakeasy.net>

Duane Rettig  <·····@franz.com> wrote:
+---------------
| Rainer Joswig <······@lisp.de> writes:
| > Duane Rettig <·····@franz.com> wrote:
| >> Some lisps actually encode a cons cell with more than
| >> two natural words.
| >
| > Any reason for that?
| 
| I don't know.  Presumably header information for gc maintenance.
+---------------

Or for sometimes just for the simplicity of not having to deal
with decoding lowtag types everywhere type discrimination is done.

+---------------
| Perhaps someone with such a lisp can comment.
+---------------

SIOD Scheme is one such. *All* SIOD objects are a single C struct
type which contains a GC mar, a type field, and a union of all
of the possible subtypes [actually, the first two words of them].
Edited slightly:

    struct obj
    {short gc_mark;
     short type;
     union {struct {struct obj * car;
		    struct obj * cdr;} cons;
	    struct {double data;} flonum;
	    struct {char *pname;
		    struct obj * vcell;} symbol;
	    struct {char *name;
		    struct obj * (*f)(void);} subr0;
	    struct {char *name;
		    struct obj * (*f)(struct obj *);} subr1;
	    struct {char *name;
		    struct obj * (*f)(struct obj *, struct obj *);} subr2;
	    ...[and so on for subr3, subr4, subr5]...
	    ...[other types]...
	    struct {struct obj *env;
		    struct obj *code;} closure;
	    struct {long dim;
		    long *data;} long_array;
	    ...[and so on for other specialized array types]...
	    struct {long dim;
		    struct obj **data;} lisp_array;
	    struct {FILE *f;
		    char *name;} c_file;}
     storage_as;};

So in SIOD a cons is 3 machine words! [(car p) =~= p->storage_as.cons.car]

Another example is MzScheme. [Caveat: I don't know what the current code
looks like; this is valid for MzScheme-103 and earlier.] MzScheme uses
only one lowtag bit, distinguishing only between fixnums [low bit set]
and word-aligned pointers [low bit clear]. All heap objects start with
a short type field, then a short "hash key extension" field, then either
a union of several other types or a dedicated struct definition [edited
slightly for brevity]:

    typedef struct Scheme_Object
    {
      Scheme_Type type; /* Anything that starts with a type field
			   can be a Scheme_Object */
      short keyex;
      union
	{
	  struct { char *string_val; int tag_val; } str_val;
	  struct { void *ptr1, *ptr2; } two_ptr_val;
	  struct { int int1; int int2; } two_int_val;
	  struct { void *ptr; int pint; } ptr_int_val;
	  struct { void *ptr; long pint; } ptr_long_val;
	  struct { struct Scheme_Object *car, *cdr; } pair_val;
	  struct { struct Scheme_Env *env;
		   struct Scheme_Object *code; } closure_val;
	  struct { short len; short *vec; } svector_val;
	} u;
    } Scheme_Object;
    ...
    /* A floating-point number: */
    typedef struct {
      Scheme_Type type;
      MZ_HASH_KEY_EX
      double double_val;
    } Scheme_Double;
    ...
    typedef struct Scheme_Symbol {
      Scheme_Type type;
      MZ_HASH_KEY_EX
      int len;
      char s[4]; /* Really, a number of chars to match `len' */
    } Scheme_Symbol;

So here again a cons is three machine words. [(car p) == p->u.pair_val.car]


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607

From: Waldek Hebisch
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Wed, 21 May 2008 18:08:42 +0000
Message-ID: <g11ofa$kn$1@z-news.pwr.wroc.pl>

Duane Rettig <·····@franz.com> wrote:
> Rainer Joswig <······@lisp.de> writes:
> 
> > In article <··············@gemini.franz.com>,
> >  Duane Rettig <·····@franz.com> wrote:
> >
> >> 
> >> Some lisps actually encode a cons cell with more than two natural
> >> words.
> >
> > Any reason for that?
> 
> I don't know.  Presumably header information for gc maintenance.
> Perhaps someone with such a lisp can comment.
> 

I belive that main reason is implementation simplicity.  Current
releases of gcl, ECL and Poplog Common Lisp all use three word
conses.   Developement (unstable) version of gcl uses two word
conses.  Next relase of ECL is going to support two word conses
us an option.

Another reason is that two word conses are a tradeoff: quite a
lot of code have to dispatch on types, two word conses add a
few instructions to the dispatch.  So one saves data space
but loses on code space and number of instructions to execute.
ECL benchmarks indicate that one still gets reduced execution
time (most likely due to cache effects), but a priori effect
on speed is not so obvious.

Also, there is question if one should really optimize cons
cells: in many applications other data structures are more
efficient.  For example for sequences vectors are likely
to save both time and storage.  In Poplog 2 element vector
needs 4 machine words, 3 element vector needs 5 machine
words, so even for relatively short sequences vectors
take less storage.  Another popular use of cons cells is
to build trees.  Again using vectors or structures as
tree nodes is likely to save both time and space.

In Poplog there is extra reason: Poplog supports four languages,
Pop11, Common Lisp, Prolog and SML.  In Prolog and SML lists
are much less special than in Common Lisp, so optimizing
cons cells may be viewed as giving more favour to one
language than to another.  Also, in Poplog cons cells are
just one of many possible structures with two fields
-- currently the user can define own type for list nodes
and expect the same efficiency as from built-in lists.

Garbage collector plays a role too.  Main Poplog garbage collector
is a relatively dumb copying two-semispace collector.  Due to
simplicity speed of copying is in fact quite good.  Complicationg
collector to support two word conses risks slowing down the
collector. 

As extra information, in Poplog fixnums and short floats are
represented directly, while all other types are represented
as pointers -- it is enough to check lowest bit to decide if
there is a header.  In Pop11 characters are represented by
small integers.  Currently Lisp characters are heap allocated
structures (since Poplog Lisp supports only 1 byte characters
all characters are pre-allocated).

-- 
                              Waldek Hebisch
·······@math.uni.wroc.pl

From: Juanjo
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Mon, 26 May 2008 19:36:45 +0000
Message-ID: <f396097a-572d-4e01-bd0c-34309864aad3@x35g2000hsb.googlegroups.com>

On May 21, 8:08 pm, Waldek Hebisch <·······@math.uni.wroc.pl> wrote:
> Duane Rettig <·····@franz.com> wrote:
> > Rainer Joswig <······@lisp.de> writes:
>
> > > In article <··············@gemini.franz.com>,
> > >  Duane Rettig <·····@franz.com> wrote:
>
> > >> Some lisps actually encode a cons cell with more than two natural
> > >> words.
>
> > > Any reason for that?
>
> > I don't know.  Presumably header information for gc maintenance.
> > Perhaps someone with such a lisp can comment.
>
> I belive that main reason is implementation simplicity.  Current
> releases of gcl,ECLand Poplog Common Lisp all use three word
> conses.   Developement (unstable) version of gcl uses two word
> conses.  Next relase ofECLis going to support two word conses
> us an option.
>
> Another reason is that two word conses are a tradeoff: quite a
> lot of code have to dispatch on types, two word conses add a
> few instructions to the dispatch.  So one saves data space
> but loses on code space and number of instructions to execute.
> ECLbenchmarks indicate that one still gets reduced execution
> time (most likely due to cache effects), but a priori effect
> on speed is not so obvious.

Some history about this. Implementations derived from KCL have a
similar philosophy when implementing lisp objects. A lisp object is
represented by a computer word. In the least significant bits one
stores a tag that marks that word as an integer or as a character, and
if those bits are zero, it is interpreted as a pointer to a structure
where further type information is stored. That simplifies a lot the C
code to access a lisp object since most of the time one is just
dereferencing a C structure, as in object->cons.car, or object-
>string.fillp

It also simplifies a lot the code for the garbage collector, since one
can quickly identify the object by the first bytes of memory where the
type information is stored. Since there ar also free bits in that type
tag, one can also store a mark for objects that have been already
examined by the garbage collector, free objects, etc.

If you want to store a CONS as a two word object, then you cannot
store that type information in the two words, but in the computer
word. That means you no longer access conses with simple C structures:
pointer arithmetic is needed. Furthermore, the object NIL, which is
both a symbol and a LIST can be no longer implemeted so easily.
However, if you write your implementation carefully, only a few lines
have to be changed to get one or another behavior.
Real complications come from garbage collector, though. ECL solves
them by using the Boehm-Weiser garbage collector instead of the old
KCL-style one.

As for efficiency, using two-words for a CONS does two things for you.
On the one hand, it saves memory. However, that rarely makes your
program run faster, unless you really run over small lists that fit in
the cache. What really makes the implementation _a lot_ faster, is
that you can deduce that an object is a CONS simply by inspecting the
word that represents it. That is, the least significant bits of that
word tell you, hey, I am an encoded pointer to a cons! There is no
need to dereference that pointer and read a type tag to guess the
object type.

That is a huge gain, of about 20% in the case of ECL, and it actually
simplifies the assembler code generated for reading the CAR and CDR of
a cons, since type errors are quickly detected: the same code that has
to be used to decode the pointer already do the type checking.

CMUCL and SBCL go even further. Instead of using just two bits for
storing type information, they use 3 or more (do not remember the
exact amount) That means you can now detect other objects, such as
symbols, arrays, strings, etc, more efficiently. The side effect is
that for this to work, object addressing is a bit more complicated (I
am talking about the perspective of an implementor in C) and objects
have to be two-words aligned -- a bit inconvenient in some cases.

I hope this helps you,

Juanjo

From: Pascal J. Bourguignon
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Wed, 21 May 2008 07:30:29 +0000
Message-ID: <7czlqkp10q.fsf@pbourguignon.anevia.com>

Rainer Joswig <······@lisp.de> writes:
>> Some lisps actually encode a cons cell with more than two natural
>> words.
>
> Any reason for that?

For example, clisp has a compilation option to add an application
accessible word to each cons.  It can be used to identify the conses,
for example to implement persistent memory. 

-- 
__Pascal Bourguignon__

From: Pascal J. Bourguignon
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Mon, 19 May 2008 16:13:14 +0000
Message-ID: <7cprris25h.fsf@pbourguignon.anevia.com>

·············@gmail.com writes:

> Hi,
>
> It was recently pointed out to me that my understanding of lists is
> incorrect.  Looking further into the issue I found two types of
> diagrams being drawn.  One is by Peter Seibel, who places the list
> elements in the car of the cons cell, while the other is Touretzky,
> who places them outside of the car, with a pointer from the car to the
> element.
>
> Is one of them grossly in error, or are the differences minor?

The difference is minor.

That is, at a certain level of abstraction, there is no difference.

Lisp hides the use of pointers.  Internally, an implementation may
either store the bits of the object, or a pointer to the bits of the
object indiscriminaly.  Well, sometimes you can discriminate it with EQ.


That is, as long as there is enough bits in the CAR slot to store the
bits of the object directly, the implementation can do so, and not use
a pointer.  When the object needs more bits, then space for them will
be allocated on the heap, and a pointer to that heap block will be
stored in the CAR slot.  Same for the CDR, and actually for any other
slots (arrays, CLOS objects, variables, etc).


> (It was also pointed out that the "car points to element" way explains
> how an element can be a member of more than one list.  That would
> suggest that Seibel's diagrams are in error. )

Again, no.

This is because when an object has little enough bits to be stored in
place of the pointer, it's a kind of object whose bits don't change.
For example, a fixnum.  The fixnum 42 could have these bits:

   00000000000000000000000010101001

(assuming fixnums are represented with 30-bit and a 2-bit typetag in
the least significant bits) and the fixnum (+ 40 2) would have these
bits:

   00000000000000000000000010101001

and since they have the same bits, in this implementation, (EQ 42 (+ 40 2)).  



That is, all the copies of  42 are actually the same identical object.
Like if we had copies of  a pointer, only since the bits don't change,
and they're less  than those of pointer, we just copy  the bits of the
object instead of a pointer.



> The reason I am asking as that I still want to submit a lisp cookbook
> recipe on growing the list by the tail, and the list representation is
> central to explaining the recipe.


You can go either all arrows:

+---------------------------------------------+
| (42 "abc" DEF)                              |
|                                             |
| +---+---+   +---+---+   +---+---+   +-----+ |
| | * | * |-->| * | * |-->| * | * |-->| NIL | |
| +---+---+   +---+---+   +---+---+   +-----+ |
|   |           |           |                 |
|   v           v           v                 |
| +----+      +-------+   +-----+             |
| | 42 |      | "abc" |   | DEF |             |
| +----+      +-------+   +-----+             |
+---------------------------------------------+


or all boxes:

+--------+-------------------------------+
|        |  +---------+---------------+  |
|        |  |         | +-----+-----+ |  |
|   42   |  |  "abc"  | | DEF | NIL | |  |
|        |  |         | +-----+-----+ |  |
|        |  +---------+---------------+  |
+--------+-------------------------------+


Or a mix of them.


All are valid representations.  However if you want to show structure
sharing, using arrows and external boxes will be easier:

+-------------------------------------------------+
| (let ((x '(1 2 3 4))) (values x (copy-list x))) |
|                                                 |
| +---+---+   +---+---+   +---+---+   +---+---+   |
| | * | * |-->| * | * |-->| * | * |-->| * |NIL|   |
| +---+---+   +---+---+   +---+---+   +---+---+   |
|   |           |           |           |         |
|   v           v           v           v         |
| +---+       +---+       +---+       +---+       |
| | 1 |       | 2 |       | 3 |       | 4 |       |
| +---+       +---+       +---+       +---+       |
|   ^           ^           ^           ^         |
|   |           |           |           |         |
| +---+---+   +---+---+   +---+---+   +---+---+   |
| | * | * |-->| * | * |-->| * | * |-->| * |NIL|   |
| +---+---+   +---+---+   +---+---+   +---+---+   |
+-------------------------------------------------+


Even if in most implementations, this representation is closer to what
is stored in memory:
                                                   
+-------------------------------------------------+
| (let ((x '(1 2 3 4))) (values x (copy-list x))) |
|                                                 |
| +---+---+   +---+---+   +---+---+   +---+---+   |
| | 1 | * |-->| 2 | * |-->| 3 | * |-->| 4 | * |   |
| +---+---+   +---+---+   +---+---+   +---+---+   |
|                                           |     |
|                                           v     |
|                                         +---+   |
|                                         |NIL|   |
|                                         +---+   |
|                                           ^     |
|                                           |     |
| +---+---+   +---+---+   +---+---+   +---+---+   |
| | 1 | * |-->| 2 | * |-->| 3 | * |-->| 4 | * |   |
| +---+---+   +---+---+   +---+---+   +---+---+   |
+-------------------------------------------------+

since fixnums are usually stored instead of pointers, and the symbol
NIL is a structure allocated on the heap like any other.
                                            

-- 
__Pascal Bourguignon__

From: ·············@gmail.com
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Mon, 19 May 2008 16:56:38 +0000
Message-ID: <fa2661ea-2311-499d-9af3-0dd6a5b1a0de@59g2000hsb.googlegroups.com>

On May 19, 12:13 pm, ····@informatimago.com (Pascal J. Bourguignon)
wrote:
> ·············@gmail.com writes:
> > Hi,
>
> > It was recently pointed out to me that my understanding of lists is
> > incorrect.  Looking further into the issue I found two types of
> > diagrams being drawn.  One is by Peter Seibel, who places the list
> > elements in the car of the cons cell, while the other is Touretzky,
> > who places them outside of the car, with a pointer from the car to the
> > element.
>
> > Is one of them grossly in error, or are the differences minor?
>
> The difference is minor.
>
> That is, at a certain level of abstraction, there is no difference.
>
> Lisp hides the use of pointers.  Internally, an implementation may
> either store the bits of the object, or a pointer to the bits of the
> object indiscriminaly.  Well, sometimes you can discriminate it with EQ.
>
> That is, as long as there is enough bits in the CAR slot to store the
> bits of the object directly, the implementation can do so, and not use
> a pointer.  When the object needs more bits, then space for them will
> be allocated on the heap, and a pointer to that heap block will be
> stored in the CAR slot.  Same for the CDR, and actually for any other
> slots (arrays, CLOS objects, variables, etc).
>
> > (It was also pointed out that the "car points to element" way explains
> > how an element can be a member of more than one list.  That would
> > suggest that Seibel's diagrams are in error. )
>
> Again, no.
>
> This is because when an object has little enough bits to be stored in
> place of the pointer, it's a kind of object whose bits don't change.
> For example, a fixnum.  The fixnum 42 could have these bits:
>
>    00000000000000000000000010101001
>
> (assuming fixnums are represented with 30-bit and a 2-bit typetag in
> the least significant bits) and the fixnum (+ 40 2) would have these
> bits:
>
>    00000000000000000000000010101001
>
> and since they have the same bits, in this implementation, (EQ 42 (+ 40 2)).
>
> That is, all the copies of  42 are actually the same identical object.
> Like if we had copies of  a pointer, only since the bits don't change,
> and they're less  than those of pointer, we just copy  the bits of the
> object instead of a pointer.
>
> > The reason I am asking as that I still want to submit a lisp cookbook
> > recipe on growing the list by the tail, and the list representation is
> > central to explaining the recipe.
>
> You can go either all arrows:
>
> +---------------------------------------------+
> | (42 "abc" DEF)                              |
> |                                             |
> | +---+---+   +---+---+   +---+---+   +-----+ |
> | | * | * |-->| * | * |-->| * | * |-->| NIL | |
> | +---+---+   +---+---+   +---+---+   +-----+ |
> |   |           |           |                 |
> |   v           v           v                 |
> | +----+      +-------+   +-----+             |
> | | 42 |      | "abc" |   | DEF |             |
> | +----+      +-------+   +-----+             |
> +---------------------------------------------+
>
> or all boxes:
>
> +--------+-------------------------------+
> |        |  +---------+---------------+  |
> |        |  |         | +-----+-----+ |  |
> |   42   |  |  "abc"  | | DEF | NIL | |  |
> |        |  |         | +-----+-----+ |  |
> |        |  +---------+---------------+  |
> +--------+-------------------------------+
>
> Or a mix of them.
>
> All are valid representations.  However if you want to show structure
> sharing, using arrows and external boxes will be easier:
>
> +-------------------------------------------------+
> | (let ((x '(1 2 3 4))) (values x (copy-list x))) |
> |                                                 |
> | +---+---+   +---+---+   +---+---+   +---+---+   |
> | | * | * |-->| * | * |-->| * | * |-->| * |NIL|   |
> | +---+---+   +---+---+   +---+---+   +---+---+   |
> |   |           |           |           |         |
> |   v           v           v           v         |
> | +---+       +---+       +---+       +---+       |
> | | 1 |       | 2 |       | 3 |       | 4 |       |
> | +---+       +---+       +---+       +---+       |
> |   ^           ^           ^           ^         |
> |   |           |           |           |         |
> | +---+---+   +---+---+   +---+---+   +---+---+   |
> | | * | * |-->| * | * |-->| * | * |-->| * |NIL|   |
> | +---+---+   +---+---+   +---+---+   +---+---+   |
> +-------------------------------------------------+
>
> Even if in most implementations, this representation is closer to what
> is stored in memory:
>
> +-------------------------------------------------+
> | (let ((x '(1 2 3 4))) (values x (copy-list x))) |
> |                                                 |
> | +---+---+   +---+---+   +---+---+   +---+---+   |
> | | 1 | * |-->| 2 | * |-->| 3 | * |-->| 4 | * |   |
> | +---+---+   +---+---+   +---+---+   +---+---+   |
> |                                           |     |
> |                                           v     |
> |                                         +---+   |
> |                                         |NIL|   |
> |                                         +---+   |
> |                                           ^     |
> |                                           |     |
> | +---+---+   +---+---+   +---+---+   +---+---+   |
> | | 1 | * |-->| 2 | * |-->| 3 | * |-->| 4 | * |   |
> | +---+---+   +---+---+   +---+---+   +---+---+   |
> +-------------------------------------------------+
>
> since fixnums are usually stored instead of pointers, and the symbol
> NIL is a structure allocated on the heap like any other.
>
> --
> __Pascal Bourguignon__

Will there ever be an end to this lisp learning process?  (I know the
answer)

Thanks to all.

Mirko

From: Ken Tilton
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Mon, 19 May 2008 17:07:38 +0000
Message-ID: <4831b3da$0$25046$607ed4bc@cv.net>

·············@gmail.com wrote:
> On May 19, 12:13 pm, ····@informatimago.com (Pascal J. Bourguignon)
> wrote:
> 
>>·············@gmail.com writes:
>>
>>>Hi,
>>
>>>It was recently pointed out to me that my understanding of lists is
>>>incorrect.  Looking further into the issue I found two types of
>>>diagrams being drawn.  One is by Peter Seibel, who places the list
>>>elements in the car of the cons cell, while the other is Touretzky,
>>>who places them outside of the car, with a pointer from the car to the
>>>element.
>>
>>>Is one of them grossly in error, or are the differences minor?
>>
>>The difference is minor.
>>
>>That is, at a certain level of abstraction, there is no difference.
>>
>>Lisp hides the use of pointers.  Internally, an implementation may
>>either store the bits of the object, or a pointer to the bits of the
>>object indiscriminaly.  Well, sometimes you can discriminate it with EQ.
>>
>>That is, as long as there is enough bits in the CAR slot to store the
>>bits of the object directly, the implementation can do so, and not use
>>a pointer.  When the object needs more bits, then space for them will
>>be allocated on the heap, and a pointer to that heap block will be
>>stored in the CAR slot.  Same for the CDR, and actually for any other
>>slots (arrays, CLOS objects, variables, etc).
>>
>>
>>>(It was also pointed out that the "car points to element" way explains
>>>how an element can be a member of more than one list.  That would
>>>suggest that Seibel's diagrams are in error. )
>>
>>Again, no.
>>
>>This is because when an object has little enough bits to be stored in
>>place of the pointer, it's a kind of object whose bits don't change.
>>For example, a fixnum.  The fixnum 42 could have these bits:
>>
>>   00000000000000000000000010101001
>>
>>(assuming fixnums are represented with 30-bit and a 2-bit typetag in
>>the least significant bits) and the fixnum (+ 40 2) would have these
>>bits:
>>
>>   00000000000000000000000010101001
>>
>>and since they have the same bits, in this implementation, (EQ 42 (+ 40 2)).
>>
>>That is, all the copies of  42 are actually the same identical object.
>>Like if we had copies of  a pointer, only since the bits don't change,
>>and they're less  than those of pointer, we just copy  the bits of the
>>object instead of a pointer.
>>
>>
>>>The reason I am asking as that I still want to submit a lisp cookbook
>>>recipe on growing the list by the tail, and the list representation is
>>>central to explaining the recipe.
>>
>>You can go either all arrows:
>>
>>+---------------------------------------------+
>>| (42 "abc" DEF)                              |
>>|                                             |
>>| +---+---+   +---+---+   +---+---+   +-----+ |
>>| | * | * |-->| * | * |-->| * | * |-->| NIL | |
>>| +---+---+   +---+---+   +---+---+   +-----+ |
>>|   |           |           |                 |
>>|   v           v           v                 |
>>| +----+      +-------+   +-----+             |
>>| | 42 |      | "abc" |   | DEF |             |
>>| +----+      +-------+   +-----+             |
>>+---------------------------------------------+
>>
>>or all boxes:
>>
>>+--------+-------------------------------+
>>|        |  +---------+---------------+  |
>>|        |  |         | +-----+-----+ |  |
>>|   42   |  |  "abc"  | | DEF | NIL | |  |
>>|        |  |         | +-----+-----+ |  |
>>|        |  +---------+---------------+  |
>>+--------+-------------------------------+
>>
>>Or a mix of them.
>>
>>All are valid representations.  However if you want to show structure
>>sharing, using arrows and external boxes will be easier:
>>
>>+-------------------------------------------------+
>>| (let ((x '(1 2 3 4))) (values x (copy-list x))) |
>>|                                                 |
>>| +---+---+   +---+---+   +---+---+   +---+---+   |
>>| | * | * |-->| * | * |-->| * | * |-->| * |NIL|   |
>>| +---+---+   +---+---+   +---+---+   +---+---+   |
>>|   |           |           |           |         |
>>|   v           v           v           v         |
>>| +---+       +---+       +---+       +---+       |
>>| | 1 |       | 2 |       | 3 |       | 4 |       |
>>| +---+       +---+       +---+       +---+       |
>>|   ^           ^           ^           ^         |
>>|   |           |           |           |         |
>>| +---+---+   +---+---+   +---+---+   +---+---+   |
>>| | * | * |-->| * | * |-->| * | * |-->| * |NIL|   |
>>| +---+---+   +---+---+   +---+---+   +---+---+   |
>>+-------------------------------------------------+
>>
>>Even if in most implementations, this representation is closer to what
>>is stored in memory:
>>
>>+-------------------------------------------------+
>>| (let ((x '(1 2 3 4))) (values x (copy-list x))) |
>>|                                                 |
>>| +---+---+   +---+---+   +---+---+   +---+---+   |
>>| | 1 | * |-->| 2 | * |-->| 3 | * |-->| 4 | * |   |
>>| +---+---+   +---+---+   +---+---+   +---+---+   |
>>|                                           |     |
>>|                                           v     |
>>|                                         +---+   |
>>|                                         |NIL|   |
>>|                                         +---+   |
>>|                                           ^     |
>>|                                           |     |
>>| +---+---+   +---+---+   +---+---+   +---+---+   |
>>| | 1 | * |-->| 2 | * |-->| 3 | * |-->| 4 | * |   |
>>| +---+---+   +---+---+   +---+---+   +---+---+   |
>>+-------------------------------------------------+
>>
>>since fixnums are usually stored instead of pointers, and the symbol
>>NIL is a structure allocated on the heap like any other.
>>
>>--
>>__Pascal Bourguignon__
> 
> 
> Will there ever be an end to this lisp learning process? 


Actually I was going to suggest that you stop trying to understand Lisp 
and go the Zen route, Just Do It. The Mind Is A Terrible Thing period 
when we hold the reins, but let it go and it can do wonderful things, 
the key is to let go, get out of the way.

ie, Starting with a cookbook entry is an invitation to disaster, you are 
trying to think your way to being an expert. Instead...

...on what application are you working?

k

-- 
http://smuglispweeny.blogspot.com/
http://www.theoryyalgebra.com/
ECLM rant: 
http://video.google.com/videoplay?docid=-1331906677993764413&hl=en
ECLM talk: 
http://video.google.com/videoplay?docid=-9173722505157942928&q=&hl=en

From: ·············@gmail.com
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Mon, 19 May 2008 17:49:25 +0000
Message-ID: <3310f861-fc9b-4ce2-940b-ee6a78d1d74b@b1g2000hsg.googlegroups.com>

On May 19, 1:07 pm, Ken Tilton <···········@optonline.net> wrote:
> ·············@gmail.com wrote:
> > On May 19, 12:13 pm, ····@informatimago.com (Pascal J. Bourguignon)
> > wrote:
>
> >>·············@gmail.com writes:
>
> >>>Hi,
>
> >>>It was recently pointed out to me that my understanding of lists is
> >>>incorrect.  Looking further into the issue I found two types of
> >>>diagrams being drawn.  One is by Peter Seibel, who places the list
> >>>elements in the car of the cons cell, while the other is Touretzky,
> >>>who places them outside of the car, with a pointer from the car to the
> >>>element.
>
> >>>Is one of them grossly in error, or are the differences minor?
>
> >>The difference is minor.
>
> >>That is, at a certain level of abstraction, there is no difference.
>
> >>Lisp hides the use of pointers.  Internally, an implementation may
> >>either store the bits of the object, or a pointer to the bits of the
> >>object indiscriminaly.  Well, sometimes you can discriminate it with EQ.
>
> >>That is, as long as there is enough bits in the CAR slot to store the
> >>bits of the object directly, the implementation can do so, and not use
> >>a pointer.  When the object needs more bits, then space for them will
> >>be allocated on the heap, and a pointer to that heap block will be
> >>stored in the CAR slot.  Same for the CDR, and actually for any other
> >>slots (arrays, CLOS objects, variables, etc).
>
> >>>(It was also pointed out that the "car points to element" way explains
> >>>how an element can be a member of more than one list.  That would
> >>>suggest that Seibel's diagrams are in error. )
>
> >>Again, no.
>
> >>This is because when an object has little enough bits to be stored in
> >>place of the pointer, it's a kind of object whose bits don't change.
> >>For example, a fixnum.  The fixnum 42 could have these bits:
>
> >>   00000000000000000000000010101001
>
> >>(assuming fixnums are represented with 30-bit and a 2-bit typetag in
> >>the least significant bits) and the fixnum (+ 40 2) would have these
> >>bits:
>
> >>   00000000000000000000000010101001
>
> >>and since they have the same bits, in this implementation, (EQ 42 (+ 40 2)).
>
> >>That is, all the copies of  42 are actually the same identical object.
> >>Like if we had copies of  a pointer, only since the bits don't change,
> >>and they're less  than those of pointer, we just copy  the bits of the
> >>object instead of a pointer.
>
> >>>The reason I am asking as that I still want to submit a lisp cookbook
> >>>recipe on growing the list by the tail, and the list representation is
> >>>central to explaining the recipe.
>
> >>You can go either all arrows:
>
> >>+---------------------------------------------+
> >>| (42 "abc" DEF)                              |
> >>|                                             |
> >>| +---+---+   +---+---+   +---+---+   +-----+ |
> >>| | * | * |-->| * | * |-->| * | * |-->| NIL | |
> >>| +---+---+   +---+---+   +---+---+   +-----+ |
> >>|   |           |           |                 |
> >>|   v           v           v                 |
> >>| +----+      +-------+   +-----+             |
> >>| | 42 |      | "abc" |   | DEF |             |
> >>| +----+      +-------+   +-----+             |
> >>+---------------------------------------------+
>
> >>or all boxes:
>
> >>+--------+-------------------------------+
> >>|        |  +---------+---------------+  |
> >>|        |  |         | +-----+-----+ |  |
> >>|   42   |  |  "abc"  | | DEF | NIL | |  |
> >>|        |  |         | +-----+-----+ |  |
> >>|        |  +---------+---------------+  |
> >>+--------+-------------------------------+
>
> >>Or a mix of them.
>
> >>All are valid representations.  However if you want to show structure
> >>sharing, using arrows and external boxes will be easier:
>
> >>+-------------------------------------------------+
> >>| (let ((x '(1 2 3 4))) (values x (copy-list x))) |
> >>|                                                 |
> >>| +---+---+   +---+---+   +---+---+   +---+---+   |
> >>| | * | * |-->| * | * |-->| * | * |-->| * |NIL|   |
> >>| +---+---+   +---+---+   +---+---+   +---+---+   |
> >>|   |           |           |           |         |
> >>|   v           v           v           v         |
> >>| +---+       +---+       +---+       +---+       |
> >>| | 1 |       | 2 |       | 3 |       | 4 |       |
> >>| +---+       +---+       +---+       +---+       |
> >>|   ^           ^           ^           ^         |
> >>|   |           |           |           |         |
> >>| +---+---+   +---+---+   +---+---+   +---+---+   |
> >>| | * | * |-->| * | * |-->| * | * |-->| * |NIL|   |
> >>| +---+---+   +---+---+   +---+---+   +---+---+   |
> >>+-------------------------------------------------+
>
> >>Even if in most implementations, this representation is closer to what
> >>is stored in memory:
>
> >>+-------------------------------------------------+
> >>| (let ((x '(1 2 3 4))) (values x (copy-list x))) |
> >>|                                                 |
> >>| +---+---+   +---+---+   +---+---+   +---+---+   |
> >>| | 1 | * |-->| 2 | * |-->| 3 | * |-->| 4 | * |   |
> >>| +---+---+   +---+---+   +---+---+   +---+---+   |
> >>|                                           |     |
> >>|                                           v     |
> >>|                                         +---+   |
> >>|                                         |NIL|   |
> >>|                                         +---+   |
> >>|                                           ^     |
> >>|                                           |     |
> >>| +---+---+   +---+---+   +---+---+   +---+---+   |
> >>| | 1 | * |-->| 2 | * |-->| 3 | * |-->| 4 | * |   |
> >>| +---+---+   +---+---+   +---+---+   +---+---+   |
> >>+-------------------------------------------------+
>
> >>since fixnums are usually stored instead of pointers, and the symbol
> >>NIL is a structure allocated on the heap like any other.
>
> >>--
> >>__Pascal Bourguignon__
>
> > Will there ever be an end to this lisp learning process?
>
> Actually I was going to suggest that you stop trying to understand Lisp
> and go the Zen route, Just Do It. The Mind Is A Terrible Thing period
> when we hold the reins, but let it go and it can do wonderful things,
> the key is to let go, get out of the way.
>
> ie, Starting with a cookbook entry is an invitation to disaster, you are
> trying to think your way to being an expert. Instead...
>
> ...on what application are you working?
>
> k
>
> --http://smuglispweeny.blogspot.com/http://www.theoryyalgebra.com/
> ECLM rant:http://video.google.com/videoplay?docid=-1331906677993764413&hl=en
> ECLM talk:http://video.google.com/videoplay?docid=-9173722505157942928&q=&hl=en

Oh, I am doing it.  I use (n)lisp for numerical analysis and modeling
(together with gsll and several other packages).

But I thought that having benefited from advice, I could at least
write put it to the benefits of others as well.

And I am not seriously despairing or complaining.  (the above
statement was more like a self-depreciating joke).  I will have to
learn this stuff sooner or later...

Mirko

From: Ken Tilton
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Mon, 19 May 2008 18:09:34 +0000
Message-ID: <4831c25e$0$11606$607ed4bc@cv.net>

·············@gmail.com wrote:
> On May 19, 1:07 pm, Ken Tilton <···········@optonline.net> wrote:
> 
>>·············@gmail.com wrote:
>>
>>>On May 19, 12:13 pm, ····@informatimago.com (Pascal J. Bourguignon)
>>>wrote:
>>
>>>>·············@gmail.com writes:
>>
>>>>>Hi,
>>
>>>>>It was recently pointed out to me that my understanding of lists is
>>>>>incorrect.  Looking further into the issue I found two types of
>>>>>diagrams being drawn.  One is by Peter Seibel, who places the list
>>>>>elements in the car of the cons cell, while the other is Touretzky,
>>>>>who places them outside of the car, with a pointer from the car to the
>>>>>element.
>>
>>>>>Is one of them grossly in error, or are the differences minor?
>>
>>>>The difference is minor.
>>
>>>>That is, at a certain level of abstraction, there is no difference.
>>
>>>>Lisp hides the use of pointers.  Internally, an implementation may
>>>>either store the bits of the object, or a pointer to the bits of the
>>>>object indiscriminaly.  Well, sometimes you can discriminate it with EQ.
>>
>>>>That is, as long as there is enough bits in the CAR slot to store the
>>>>bits of the object directly, the implementation can do so, and not use
>>>>a pointer.  When the object needs more bits, then space for them will
>>>>be allocated on the heap, and a pointer to that heap block will be
>>>>stored in the CAR slot.  Same for the CDR, and actually for any other
>>>>slots (arrays, CLOS objects, variables, etc).
>>
>>>>>(It was also pointed out that the "car points to element" way explains
>>>>>how an element can be a member of more than one list.  That would
>>>>>suggest that Seibel's diagrams are in error. )
>>
>>>>Again, no.
>>
>>>>This is because when an object has little enough bits to be stored in
>>>>place of the pointer, it's a kind of object whose bits don't change.
>>>>For example, a fixnum.  The fixnum 42 could have these bits:
>>
>>>>  00000000000000000000000010101001
>>
>>>>(assuming fixnums are represented with 30-bit and a 2-bit typetag in
>>>>the least significant bits) and the fixnum (+ 40 2) would have these
>>>>bits:
>>
>>>>  00000000000000000000000010101001
>>
>>>>and since they have the same bits, in this implementation, (EQ 42 (+ 40 2)).
>>
>>>>That is, all the copies of  42 are actually the same identical object.
>>>>Like if we had copies of  a pointer, only since the bits don't change,
>>>>and they're less  than those of pointer, we just copy  the bits of the
>>>>object instead of a pointer.
>>
>>>>>The reason I am asking as that I still want to submit a lisp cookbook
>>>>>recipe on growing the list by the tail, and the list representation is
>>>>>central to explaining the recipe.
>>
>>>>You can go either all arrows:
>>
>>>>+---------------------------------------------+
>>>>| (42 "abc" DEF)                              |
>>>>|                                             |
>>>>| +---+---+   +---+---+   +---+---+   +-----+ |
>>>>| | * | * |-->| * | * |-->| * | * |-->| NIL | |
>>>>| +---+---+   +---+---+   +---+---+   +-----+ |
>>>>|   |           |           |                 |
>>>>|   v           v           v                 |
>>>>| +----+      +-------+   +-----+             |
>>>>| | 42 |      | "abc" |   | DEF |             |
>>>>| +----+      +-------+   +-----+             |
>>>>+---------------------------------------------+
>>
>>>>or all boxes:
>>
>>>>+--------+-------------------------------+
>>>>|        |  +---------+---------------+  |
>>>>|        |  |         | +-----+-----+ |  |
>>>>|   42   |  |  "abc"  | | DEF | NIL | |  |
>>>>|        |  |         | +-----+-----+ |  |
>>>>|        |  +---------+---------------+  |
>>>>+--------+-------------------------------+
>>
>>>>Or a mix of them.
>>
>>>>All are valid representations.  However if you want to show structure
>>>>sharing, using arrows and external boxes will be easier:
>>
>>>>+-------------------------------------------------+
>>>>| (let ((x '(1 2 3 4))) (values x (copy-list x))) |
>>>>|                                                 |
>>>>| +---+---+   +---+---+   +---+---+   +---+---+   |
>>>>| | * | * |-->| * | * |-->| * | * |-->| * |NIL|   |
>>>>| +---+---+   +---+---+   +---+---+   +---+---+   |
>>>>|   |           |           |           |         |
>>>>|   v           v           v           v         |
>>>>| +---+       +---+       +---+       +---+       |
>>>>| | 1 |       | 2 |       | 3 |       | 4 |       |
>>>>| +---+       +---+       +---+       +---+       |
>>>>|   ^           ^           ^           ^         |
>>>>|   |           |           |           |         |
>>>>| +---+---+   +---+---+   +---+---+   +---+---+   |
>>>>| | * | * |-->| * | * |-->| * | * |-->| * |NIL|   |
>>>>| +---+---+   +---+---+   +---+---+   +---+---+   |
>>>>+-------------------------------------------------+
>>
>>>>Even if in most implementations, this representation is closer to what
>>>>is stored in memory:
>>
>>>>+-------------------------------------------------+
>>>>| (let ((x '(1 2 3 4))) (values x (copy-list x))) |
>>>>|                                                 |
>>>>| +---+---+   +---+---+   +---+---+   +---+---+   |
>>>>| | 1 | * |-->| 2 | * |-->| 3 | * |-->| 4 | * |   |
>>>>| +---+---+   +---+---+   +---+---+   +---+---+   |
>>>>|                                           |     |
>>>>|                                           v     |
>>>>|                                         +---+   |
>>>>|                                         |NIL|   |
>>>>|                                         +---+   |
>>>>|                                           ^     |
>>>>|                                           |     |
>>>>| +---+---+   +---+---+   +---+---+   +---+---+   |
>>>>| | 1 | * |-->| 2 | * |-->| 3 | * |-->| 4 | * |   |
>>>>| +---+---+   +---+---+   +---+---+   +---+---+   |
>>>>+-------------------------------------------------+
>>
>>>>since fixnums are usually stored instead of pointers, and the symbol
>>>>NIL is a structure allocated on the heap like any other.
>>
>>>>--
>>>>__Pascal Bourguignon__
>>
>>>Will there ever be an end to this lisp learning process?
>>
>>Actually I was going to suggest that you stop trying to understand Lisp
>>and go the Zen route, Just Do It. The Mind Is A Terrible Thing period
>>when we hold the reins, but let it go and it can do wonderful things,
>>the key is to let go, get out of the way.
>>
>>ie, Starting with a cookbook entry is an invitation to disaster, you are
>>trying to think your way to being an expert. Instead...
>>
>>...on what application are you working?
>>
>>k
>>
>>--http://smuglispweeny.blogspot.com/http://www.theoryyalgebra.com/
>>ECLM rant:http://video.google.com/videoplay?docid=-1331906677993764413&hl=en
>>ECLM talk:http://video.google.com/videoplay?docid=-9173722505157942928&q=&hl=en
> 
> 
> Oh, I am doing it.  I use (n)lisp for numerical analysis and modeling
> (together with gsll and several other packages).
> 
> But I thought that having benefited from advice, I could at least
> write put it to the benefits of others as well.
> 
> And I am not seriously despairing or complaining.  (the above
> statement was more like a self-depreciating joke).  I will have to
> learn this stuff sooner or later...
> 

True, and they say the best way to learn something is to teach it.

kt

-- 
http://smuglispweeny.blogspot.com/
http://www.theoryyalgebra.com/
ECLM rant: 
http://video.google.com/videoplay?docid=-1331906677993764413&hl=en
ECLM talk: 
http://video.google.com/videoplay?docid=-9173722505157942928&q=&hl=en

From: Sohail Somani
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Mon, 19 May 2008 18:30:27 +0000
Message-ID: <7HjYj.3579$KB3.276@edtnps91>

Ken Tilton wrote:

> True, and they say the best way to learn something is to teach it.

Unfortunately, those who teach don't always know what they are teaching!

From: ·············@gmail.com
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Mon, 19 May 2008 19:24:52 +0000
Message-ID: <2f2a9877-7b12-4818-b77a-52867243f6ec@t54g2000hsg.googlegroups.com>

On May 19, 2:30 pm, Sohail Somani <······@taggedtype.net> wrote:
> Ken Tilton wrote:
> > True, and they say the best way to learn something is to teach it.
>
> Unfortunately, those who teach don't always know what they are teaching!

True.  I see myself as a "compiler".  That is why I ask here before
putting up it as a recipe.

From: Grant Rettke
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Mon, 19 May 2008 19:37:28 +0000
Message-ID: <5eb46c93-ee3a-4be7-b727-3e78cfbe096f@b1g2000hsg.googlegroups.com>

On May 19, 1:30 pm, Sohail Somani <······@taggedtype.net> wrote:
> Ken Tilton wrote:
> > True, and they say the best way to learn something is to teach it.
>
> Unfortunately, those who teach don't always know what they are teaching!

Unfortunately, those who do, don't always know what they are doing!

From: ·······@eurogaran.com
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Mon, 19 May 2008 18:42:44 +0000
Message-ID: <34c9e69d-3efa-46b3-a287-79d3edd9ca46@l42g2000hsc.googlegroups.com>

Pascal is an artist!

From: Grant Rettke
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Mon, 19 May 2008 19:31:29 +0000
Message-ID: <293e9912-cd83-41f9-bc38-98d88ee44d75@d1g2000hsg.googlegroups.com>

> Pascal is an artist!

Emacs artist-mode?

From: John Thingstad
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Mon, 19 May 2008 20:23:26 +0000
Message-ID: <op.ube39cmkut4oq5@pandora.alfanett.no>

P� Mon, 19 May 2008 21:31:29 +0200, skrev Grant Rettke <·······@gmail.com>:

>> Pascal is an artist!
>
> Emacs artist-mode?
>

picture-mode

--------------
John Thingstad

From: Pascal J. Bourguignon
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Tue, 20 May 2008 07:32:22 +0000
Message-ID: <7c1w3xsa61.fsf@pbourguignon.anevia.com>

"John Thingstad" <·······@online.no> writes:

> P� Mon, 19 May 2008 21:31:29 +0200, skrev Grant Rettke <·······@gmail.com>:
>
>>> Pascal is an artist!
>>
>> Emacs artist-mode?
>>
>
> picture-mode

Not only that, but also:

http://darcs.informatimago.com/lisp/common-lisp/cons-to-ascii.lisp

I should add options to draw data in the cells, and embedded cells.

-- 
__Pascal Bourguignon__

From: Grant Rettke
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Mon, 19 May 2008 19:29:35 +0000
Message-ID: <cfee589b-2cdb-45f0-8828-2c569c0229b8@e39g2000hsf.googlegroups.com>

> using arrows and external boxes will be easier

+1

From: Rainer Joswig
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Mon, 19 May 2008 17:23:01 +0000
Message-ID: <joswig-0B215B.19230019052008@news-europe.giganews.com>

In article <··············@pbourguignon.anevia.com>,
 ···@informatimago.com (Pascal J. Bourguignon) wrote:

> All are valid representations.  However if you want to show structure
> sharing, using arrows and external boxes will be easier:
> 
> +-------------------------------------------------+
> | (let ((x '(1 2 3 4))) (values x (copy-list x))) |
> |                                                 |
> | +---+---+   +---+---+   +---+---+   +---+---+   |
> | | * | * |-->| * | * |-->| * | * |-->| * |NIL|   |
> | +---+---+   +---+---+   +---+---+   +---+---+   |
> |   |           |           |           |         |
> |   v           v           v           v         |
> | +---+       +---+       +---+       +---+       |
> | | 1 |       | 2 |       | 3 |       | 4 |       |
> | +---+       +---+       +---+       +---+       |
> |   ^           ^           ^           ^         |
> |   |           |           |           |         |
> | +---+---+   +---+---+   +---+---+   +---+---+   |
> | | * | * |-->| * | * |-->| * | * |-->| * |NIL|   |
> | +---+---+   +---+---+   +---+---+   +---+---+   |
> +-------------------------------------------------+
> 
> 
> Even if in most implementations, this representation is closer to what
> is stored in memory:
>                                                    
> +-------------------------------------------------+
> | (let ((x '(1 2 3 4))) (values x (copy-list x))) |
> |                                                 |
> | +---+---+   +---+---+   +---+---+   +---+---+   |
> | | 1 | * |-->| 2 | * |-->| 3 | * |-->| 4 | * |   |
> | +---+---+   +---+---+   +---+---+   +---+---+   |
> |                                           |     |
> |                                           v     |
> |                                         +---+   |
> |                                         |NIL|   |
> |                                         +---+   |
> |                                           ^     |
> |                                           |     |
> | +---+---+   +---+---+   +---+---+   +---+---+   |
> | | 1 | * |-->| 2 | * |-->| 3 | * |-->| 4 | * |   |
> | +---+---+   +---+---+   +---+---+   +---+---+   |
> +-------------------------------------------------+
> 
> since fixnums are usually stored instead of pointers, and the symbol
> NIL is a structure allocated on the heap like any other.

Which implementation would store fixnums inside cons cells?

-- 
http://lispm.dyndns.org/

From: John Thingstad
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Mon, 19 May 2008 20:22:30 +0000
Message-ID: <op.ube37sm7ut4oq5@pandora.alfanett.no>

P� Mon, 19 May 2008 19:23:01 +0200, skrev Rainer Joswig <······@lisp.de>:

>>
>> since fixnums are usually stored instead of pointers, and the symbol
>> NIL is a structure allocated on the heap like any other.
>
> Which implementation would store fixnums inside cons cells?
>

Almost all I would think. I am not sure what you mean. The compiler sees a  
number it decides it is small enough to be a fixnum and places in the cons  
cell as a value with a tag. It doesn't care if it on the heap or on the  
stack. List's of small integers occur regularly and this saves space and  
inproves performance.

--------------
John Thingstad

From: Duane Rettig
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Tue, 20 May 2008 15:06:32 +0000
Message-ID: <o0hcctdngn.fsf@gemini.franz.com>

Rainer Joswig <······@lisp.de> writes:

>> since fixnums are usually stored instead of pointers, and the symbol
>> NIL is a structure allocated on the heap like any other.
>
> Which implementation would store fixnums inside cons cells?

Most do.  Hopefully my explanation elsewhere on this thread helps.

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182

From: Pertti Kellomäki
Subject: Re: List diagrams -- Siebel and Touretzky draw them differently
Date: Mon, 19 May 2008 16:11:58 +0000
Message-ID: <g0s8se$u6r$1@news.cc.tut.fi>

·············@gmail.com wrote:
> One is by Peter Seibel, who places the list
> elements in the car of the cons cell, while the other is Touretzky,
> who places them outside of the car, with a pointer from the car to the
> element.
> 
> Is one of them grossly in error, or are the differences minor?

FWIW, I find the latter style more illustrative. The differences
are minor, however, as long as the things that are drawn inside
cons cells (either in the car part or the cdr part) are immutable.
If two cons cells contain the integer one in the car field, there
is no way to determine if they share the same one or if there are
two ones involved. If the datum is mutable, it is possible to observe
sharing by mutating via one route and looking if the mutation is
visible via the other route as well.
-- 
Pertti