From: Kadir Solid Gold Suleyman
Subject: wacky PPC lisp os fundamentals
Date: 
Message-ID: <1108762151.198200.177830@g14g2000cwa.googlegroups.com>
Let's say you have a 64-bit CPU such as the MPC970 that has an MMU that
allows you to manage 16MByte pages (instead of the
usual 4KB) and you want to write a bootable Lisp-OS.  You are less
concerned with making Linus and Bill live out of cardboard refrigerator
boxes under the freeway overpass than getting the thing to work (ie
making something that boots) and extending it.

Using a cons consisting of 24-bit CAR
and 24-bit CDR pointers would allow you to address
everything inside of each 16MB page, and make basic lisp functions
simple to implement (important when booting).  I think having 24-bit
CDR would be easier than CDR coding because CDR coding would create a
lot of overhead.  58 bit immediates would be possible with a tag
setting.  10-bit segment field would give 1024 pages per process, with
pointer indirection.  Even if some pages were shared for inter-process
communication it would still give up to nearly 16GB available to each
process.  Of course the full virtual space 42bits for 970? would be
allocable by the system.

 tag SEGMENT  CAR    CDR
|3F |3FF     |FFFFFF|FFFFFF|

Garbage collection would be a copying collector, setting up barriers on
a 16MB page basis.

Would this be the way you would set up lisp primitives if you were
going to implement lisp as an operating system on the MPC970 (and
beyond) and not concerned with being compatible with CADR and its
children?

From: Wade Humeniuk
Subject: Re: wacky PPC lisp os fundamentals
Date: 
Message-ID: <mOtRd.7$9a3.4@edtnps91>
Kadir Solid Gold Suleyman wrote:

> Using a cons consisting of 24-bit CAR
> and 24-bit CDR pointers would allow you to address
> everything inside of each 16MB page, and make basic lisp functions
> simple to implement (important when booting).  I think having 24-bit
> CDR would be easier than CDR coding because CDR coding would create a
> lot of overhead.  58 bit immediates would be possible with a tag
> setting.  10-bit segment field would give 1024 pages per process, with
> pointer indirection.  Even if some pages were shared for inter-process
> communication it would still give up to nearly 16GB available to each
> process.  Of course the full virtual space 42bits for 970? would be
> allocable by the system.
> 
>  tag SEGMENT  CAR    CDR
> |3F |3FF     |FFFFFF|FFFFFF|
> 

I am not sure, but what if the CAR and CDR were in different segments?
How would that be represented?  I know you said everything would
be inside a one segment, but Lisp images already occupy > 16M spaces.
A simple thing like an image array or DB could easily
surpass 16MB.

Wade
From: Kadir Solid Gold Suleyman
Subject: Re: wacky PPC lisp os fundamentals
Date: 
Message-ID: <1108769184.657227.164190@o13g2000cwo.googlegroups.com>
Wade,

What about an indirection pointer
say the CDR points to a location in the same page as the CAR.

tag seg          tag seg                      tag seg
XXX|0x1|CAR CDR->ind|0x2|car CDR--/new pg/--->xxx|0x2|car cdr|
         |                                                 |
       same page                                          blah

The address pointed to by the CDR has a tag as "indirection" and has a
segment listing that is on another page.  During GC obviously you would
want to scan the indirection tags and try to bring them into the same
page if possible.  For image arrays it would probably be better to have
a tag that denotes "image" and points to a structure that defines how
many pages it takes, for example.  with 2^6 bits you have a lot of
different tags you can use.

-SGS

Wade Humeniuk wrote:

> I am not sure, but what if the CAR and CDR were in different
segments?
> How would that be represented?  I know you said everything would
> be inside a one segment, but Lisp images already occupy > 16M spaces.
> A simple thing like an image array or DB could easily
> surpass 16MB.
> 
> Wade
From: Kaz Kylheku
Subject: Re: wacky PPC lisp os fundamentals
Date: 
Message-ID: <1108828753.164152.115360@z14g2000cwz.googlegroups.com>
Kadir Solid Gold Suleyman wrote:
> Let's say you have a 64-bit CPU such as the MPC970 that has an MMU
that
> allows you to manage 16MByte pages (instead of the
> usual 4KB) and you want to write a bootable Lisp-OS.  You are less
> concerned with making Linus and Bill live out of cardboard
refrigerator
> boxes under the freeway overpass than getting the thing to work (ie
> making something that boots) and extending it.

If you are just interested in getting something to work, why worry
about optimization? Just make proper conses that are two 64 bit cells
back to back.

> Using a cons consisting of 24-bit CAR
> and 24-bit CDR pointers would allow you to address
> everything inside of each 16MB page, and make basic lisp functions

You don't need 24 bits to address each aligned 64 byte cell in a 16
megabyte page, but only 21 bits.

> simple to implement (important when booting).  I think having 24-bit
> CDR would be easier than CDR coding because CDR coding would create a

CDR coding saves space without introducing address space limitations.
It's "non-lossy", whereas your scheme is "lossy", with respect to
address space. So you can't entirely present them as fully qualified
alternatives for saving space.

> lot of overhead.  58 bit immediates would be possible with a tag
> setting.

But a cons cell can be called upon to hold a /pair/ of immediates.
Oops! Your cells would only be able to hold boxed representations of
these 58 bit quantities.

Oops!

> 10-bit segment field would give 1024 pages per process, with
> pointer indirection.

Of course, if the two pointer fields are only 21 bits wide instead of
24, you can use 16 bits there, and have 65536 pages.
From: Kadir Solid Gold Suleyman
Subject: Re: wacky PPC lisp os fundamentals
Date: 
Message-ID: <1108855385.582043.124920@c13g2000cwb.googlegroups.com>
Kaz Kylheku wrote:
> If you are just interested in getting something to work, why worry
> about optimization? Just make proper conses that are two 64 bit cells
> back to back.

I am was looking for another book in the library and I happened on John
Allen's Anatomy of Lisp.  He goes through the steps to create a
dynamically scoped lisp interpreter (and later lisp compiler) using
PDP-10-like primitives.  One of the things he did was to have tha car
and cdr in the same word (36-bits wide).  If I decide to pursue it
further I would use the dynamically scoped interpreter I write to
bootstrap a lexically scoped compiler.

> > Using a cons consisting of 24-bit CAR
> > and 24-bit CDR pointers would allow you to address
> > everything inside of each 16MB page, and make basic lisp functions
>
> You don't need 24 bits to address each aligned 64 byte[sic] cell in a
16
> megabyte page, but only 21 bits.

You are correct, however not every cell may point to 64-bit aligned
data, right?  If 80% of your data is 32-bit aligned, isn't that a
waste?

> CDR coding saves space without introducing address space limitations.
> It's "non-lossy", whereas your scheme is "lossy", with respect to
> address space. So you can't entirely present them as fully qualified
> alternatives for saving space.

I didn't know I was presenting my scheme as an alternative for saving
space.  RAM is cheap right?  Allowing all cells in an MMU page to be
able to be accessed by a cons in the same page was the idea.  I was
also thinking about an easy virtual machine implementation.  According
to Joe Marshall:http://home.comcast.net/~prunesquallor/kmachine.htm
only 1/5 of
storage was allocated in lists...

> > lot of overhead.  58 bit immediates would be possible with a tag
> > setting.
>
> But a cons cell can be called upon to hold a /pair/ of immediates.
> Oops! Your cells would only be able to hold boxed representations of
> these 58 bit quantities.
>
> Oops!

I'm not following you here.  As I understand it, the cons points to two
lisp objects that are on the heap or, can point to one object and an
immediate not on the heap or a pair of immediates not on the heap.  Is
this not correct?

> > 10-bit segment field would give 1024 pages per process, with
> > pointer indirection.
>
> Of course, if the two pointer fields are only 21 bits wide instead of
> 24, you can use 16 bits there, and have 65536 pages.

that is true.  hmm maybe 32 bit aligned accesses?  Anyway, thanks for
the suggestions.
From: Kaz Kylheku
Subject: Re: wacky PPC lisp os fundamentals
Date: 
Message-ID: <1108933090.191096.222660@o13g2000cwo.googlegroups.com>
Kadir Solid Gold Suleyman wrote:

> > > lot of overhead.  58 bit immediates would be possible with a tag
> > > setting.
> >
> > But a cons cell can be called upon to hold a /pair/ of immediates.
> > Oops! Your cells would only be able to hold boxed representations
of
> > these 58 bit quantities.
> >
> > Oops!
>
> I'm not following you here.  As I understand it, the cons points to
two
> lisp objects that are on the heap or, can point to one object and an
> immediate not on the heap or a pair of immediates not on the heap.
Is
> this not correct?

A cons is conceptually two atoms stuck together; like a tiny structure
or vector consisting of two slots.

In a system where values are 64 bit quantities, a cons cell wants to be
128 bits.

E.g. what does the object (1000 . 1000) potentially look like?

Each of the atoms is a 58 bit fixnum integer, with a 6 bit tag. The
cons cell has enough space to hold these two representations back to
back.

How does your Lisp system know that it's dealing with a cons? The type
tag for the cons is stored in the cons /reference/ rather than in the
cons itself. That is, a cons is some immediate 64 bit value with a tag
which says ``this is a cons reference'', and remaining bits which point
to the two-element vector.

If you want a cons to fit into only 64 bits, then the object (1000 .
1000) cannot be self-contained. You need a cons pointing to two boxed
integers.

And of course your Lisp implementation has to handle all the
conversions that it requires. For instance, what if a fixnum is
assigned to the CDR field of a cell?

  (setf (cdr some-cell) 42)

The 42 is an unboxed 64 bit value. The CDR field only has 24 bits. So
it has to be converted to some narrower type of fixnum that's designed
to fit into the 24 bit spaces in conses, or else it has to be boxed up,
a cell has to be allocated (in the same page as the cons!), and the 24
bit offset stored there.

Another question. What if an object living in one page is assigned into
a field of a cons in another page?

I think you've made your life a little bit difficult there (which is
even contrary to your explicit project goal to get Lisp up and running
on this machine with little effort!).

Now if you have a simple 128 bit cons, what do you do? You take the 64
bit value representing 42, copy it into the second half of the cons,
and you are done!

If you are going to optimize the data representation to save space, you
have to think of all the operations that might become cumbersome (more
code, more complexity, more cycles), and also consider the situations
in which more space might be required.
From: David Steuber
Subject: Re: wacky PPC lisp os fundamentals
Date: 
Message-ID: <87650lx3ul.fsf@david-steuber.com>
"Kaz Kylheku" <···@ashi.footprints.net> writes:

> If you are going to optimize the data representation to save space, you
> have to think of all the operations that might become cumbersome (more
> code, more complexity, more cycles), and also consider the situations
> in which more space might be required.

So why aren't cons cells three words instead of two?  With three, you
could put the type information in the first word, the car and cdr
could go in the second and third words.  With 64 bit words you could
then have unboxed double-floats.

-- 
An ideal world is left as an excercise to the reader.
   --- Paul Graham, On Lisp 8.1
From: Kaz Kylheku
Subject: Re: wacky PPC lisp os fundamentals
Date: 
Message-ID: <1109028379.757337.278800@c13g2000cwb.googlegroups.com>
David Steuber wrote:
> "Kaz Kylheku" <···@ashi.footprints.net> writes:
>
> > If you are going to optimize the data representation to save space,
you
> > have to think of all the operations that might become cumbersome
(more
> > code, more complexity, more cycles), and also consider the
situations
> > in which more space might be required.
>
> So why aren't cons cells three words instead of two?  With three, you
> could put the type information in the first word, the car and cdr
> could go in the second and third words.  With 64 bit words you could
> then have unboxed double-floats.

My intuitive reaction would be that it's not necessary for a cons to be
able to efficiently store a pair of double floats, because it's
probably an uncommon requirement which can be met by the use of a
two-element vector. Lisp vectors have the bells and whistles for
various tradeoffs.
From: Peter Seibel
Subject: Re: wacky PPC lisp os fundamentals
Date: 
Message-ID: <m3ekf9ydkh.fsf@javamonkey.com>
David Steuber <·····@david-steuber.com> writes:

> "Kaz Kylheku" <···@ashi.footprints.net> writes:
>
>> If you are going to optimize the data representation to save space, you
>> have to think of all the operations that might become cumbersome (more
>> code, more complexity, more cycles), and also consider the situations
>> in which more space might be required.
>
> So why aren't cons cells three words instead of two?  With three, you
> could put the type information in the first word, the car and cdr
> could go in the second and third words.  With 64 bit words you could
> then have unboxed double-floats.

The following is an excellent paper on the tradeoffs of various data
representation strategies for Lisp:

  <http://citeseer.ist.psu.edu/gudeman93representing.html>

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp