From: Tim Bradshaw
Subject: ARRAY-TOTAL-SIZE-LIMIT
Date: 
Message-ID: <ey3k885ptg7.fsf@cley.com>
I've recently noticed that in several lisp implementations on 32bit
hardware, including two commercial ones, ARRAY-TOTAL-SIZE-LIMIT is
2^24, which is 4 or 5 bits shorter than the fixnum size.  This only
gives about 17*10^6 elements, or equivalently a 4096x4096 array, which
is, for instance, probably not enough to hold a reasonably high
resolution scanned image.

Given that this limit is the same on a number (well, 3) of independent
implementations, I think I must be missing some implementation trick
which makes good use of these bits.  Does anyone know what it is?

Note this isn't a hidden complaint or anything -- I don't intend to
do any big array bashing...

--tim

From: Duane Rettig
Subject: Re: ARRAY-TOTAL-SIZE-LIMIT
Date: 
Message-ID: <4y9wliqh4.fsf@beta.franz.com>
Tim Bradshaw <···@cley.com> writes:

> I've recently noticed that in several lisp implementations on 32bit
> hardware, including two commercial ones, ARRAY-TOTAL-SIZE-LIMIT is
> 2^24, which is 4 or 5 bits shorter than the fixnum size.  This only
> gives about 17*10^6 elements, or equivalently a 4096x4096 array, which
> is, for instance, probably not enough to hold a reasonably high
> resolution scanned image.
> 
> Given that this limit is the same on a number (well, 3) of independent
> implementations, I think I must be missing some implementation trick
> which makes good use of these bits.  Does anyone know what it is?

The requirements for header information in a simple-array of one
dimension are its type (typically one byte) and a size.  To pack this
information so as to only lose one word per array, the 32-bit word
can be divided into 8 bits for type + 24 bits for size.

Of course, any other combinations are possible; an implementation may
choose to allow for larger arrays by taking up two words in the header
(the three bytes in the first word might thus be wasted, or perhaps used
for a hash code for fast sxhashing).  The down side to this tradeoff
is that on the average every array suffers the extra size, for the
sake of those arrays that will be larger than the 24-bit size limit.

Going to 64-bits, there are 2^56 bits available for a one-word header
minus the type byte.  And although we haven't yet taken advantage of
this in our 64-bit ports, we will eventually.

> Note this isn't a hidden complaint or anything -- I don't intend to
> do any big array bashing...

Various things can be done to circumvent the array size limitation.
Because the size field refers to elements and not bytes, word arrays
can be punned into larger-looking byte arrays.  Also, I believe most
lisps have a peek/poke type of menchanism (ours is called sys:memref
which is setf'able).  One can use something like mmap to allocate a
huge array, and the peek/poke mechanism to access elements.

-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: Barry Margolin
Subject: Re: ARRAY-TOTAL-SIZE-LIMIT
Date: 
Message-ID: <WOs66.45$UJ5.148@burlma1-snr2>
In article <·············@beta.franz.com>,
Duane Rettig  <·····@franz.com> wrote:
>Tim Bradshaw <···@cley.com> writes:
>
>> I've recently noticed that in several lisp implementations on 32bit
>> hardware, including two commercial ones, ARRAY-TOTAL-SIZE-LIMIT is
>> 2^24, which is 4 or 5 bits shorter than the fixnum size.  This only
>> gives about 17*10^6 elements, or equivalently a 4096x4096 array, which
>> is, for instance, probably not enough to hold a reasonably high
>> resolution scanned image.
>> 
>> Given that this limit is the same on a number (well, 3) of independent
>> implementations, I think I must be missing some implementation trick
>> which makes good use of these bits.  Does anyone know what it is?
>
>The requirements for header information in a simple-array of one
>dimension are its type (typically one byte) and a size.  To pack this
>information so as to only lose one word per array, the 32-bit word
>can be divided into 8 bits for type + 24 bits for size.
>
>Of course, any other combinations are possible; an implementation may
>choose to allow for larger arrays by taking up two words in the header
>(the three bytes in the first word might thus be wasted, or perhaps used
>for a hash code for fast sxhashing).  The down side to this tradeoff
>is that on the average every array suffers the extra size, for the
>sake of those arrays that will be larger than the 24-bit size limit.

How about having a bit in the type byte that indicates whether the size is
immediate or in the next word?  Or maybe a distinguished size value (all
bits on, perhaps) that indicates this.  This way, you would only have a
2-word header on large arrays, and the wastage would be negligible compared
to the data portion.

-- 
Barry Margolin, ······@genuity.net
Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
From: Duane Rettig
Subject: Re: ARRAY-TOTAL-SIZE-LIMIT
Date: 
Message-ID: <4u279ik52.fsf@beta.franz.com>
Barry Margolin <······@genuity.net> writes:

> In article <·············@beta.franz.com>,
> Duane Rettig  <·····@franz.com> wrote:
> >Of course, any other combinations are possible; an implementation may
> >choose to allow for larger arrays by taking up two words in the header
> >(the three bytes in the first word might thus be wasted, or perhaps used
> >for a hash code for fast sxhashing).  The down side to this tradeoff
> >is that on the average every array suffers the extra size, for the
> >sake of those arrays that will be larger than the 24-bit size limit.
> 
> How about having a bit in the type byte that indicates whether the size is
> immediate or in the next word?  Or maybe a distinguished size value (all
> bits on, perhaps) that indicates this.  This way, you would only have a
> 2-word header on large arrays, and the wastage would be negligible compared
> to the data portion.

Yes, of course, this is a possibility.  However, based on tradeoffs
within the type byte I would opt for stealing the extra bit or value from
the size field itself.  I'd rather preserve the usage of a bit in the type
byte (or in fact a copy of the range of type codes for arrays of various
specializations) for the purposes of optimizing something else, like for
example, two dimensional simple arrays, or fill-pointer arrays using a
simple-vector-like implementation, or even more specializations.

Stealing the bit for the size field would tend to have aref efficiency
issues, because there would now be two different offsets for an aref
(though judicious placement of the bit with the type byte could result
in a 9-bit pseudo-type byte to do a table-dispatch on and set the offset,
instead of a normal 8-bit type dispatch).  Open-coding arefs would
probably be done differently depending on whether the array were declared
to be (simple-array t (100)) or (simple-array t (#.most-positive-fixnum)),
both of which could be efficient open codings, or (simple-array t (*)),
which would require a slightly more complex open code sequence.

-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: Lieven Marchand
Subject: Re: ARRAY-TOTAL-SIZE-LIMIT
Date: 
Message-ID: <m34rz8mxgc.fsf@localhost.localdomain>
Tim Bradshaw <···@cley.com> writes:

> I've recently noticed that in several lisp implementations on 32bit
> hardware, including two commercial ones, ARRAY-TOTAL-SIZE-LIMIT is
> 2^24, which is 4 or 5 bits shorter than the fixnum size.  This only
> gives about 17*10^6 elements, or equivalently a 4096x4096 array, which
> is, for instance, probably not enough to hold a reasonably high
> resolution scanned image.
> 
> Given that this limit is the same on a number (well, 3) of independent
> implementations, I think I must be missing some implementation trick
> which makes good use of these bits.  Does anyone know what it is?
> 
> Note this isn't a hidden complaint or anything -- I don't intend to
> do any big array bashing...
> 

Lispworks/Linux seems to have ARRAY-TOTAL-SIZE-LIMIT = 2^21 -
size-of-header, which is even smaller.

-- 
Lieven Marchand <···@village.uunet.be>
Gla�r ok reifr skyli gumna hverr, unz sinn b��r bana.
From: Pekka P. Pirinen
Subject: Re: ARRAY-TOTAL-SIZE-LIMIT
Date: 
Message-ID: <ixsnmrpj8k.fsf@harlequin.co.uk>
Lieven Marchand <···@village.uunet.be> writes:
> Tim Bradshaw <···@cley.com> writes:
> > I've recently noticed that in several lisp implementations on 32bit
> > hardware, including two commercial ones, ARRAY-TOTAL-SIZE-LIMIT is
> > 2^24, which is 4 or 5 bits shorter than the fixnum size.  [...]
> 
> Lispworks/Linux seems to have ARRAY-TOTAL-SIZE-LIMIT = 2^21 -
> size-of-header, which is even smaller.

Yes and no.  Yes, it's smaller than 2^24.  No, it's only three bits
shorter than fixnum size, which is 24 bits on the PC platforms (Linux
and Windows).  On other Unix platforms, LW has it better in both
senses: ARRAY-TOTAL-SIZE-LIMIT = MOST-POSITIVE-FIXNUM = 2^29-1.

I was involved in designing the PC array layouts back in the 80's.
PCs used to be much smaller than Unix workstations then.  I'm afraid
we didn't have the foresight to expect the implementation to be used
to process multi-MB arrays.
-- 
Pekka P. Pirinen
Adaptive Memory Management Group, Harlequin Limited
Quidquid latine dictum sit, altum viditur.
From: Lieven Marchand
Subject: Re: ARRAY-TOTAL-SIZE-LIMIT
Date: 
Message-ID: <m3y9wj1fs2.fsf@localhost.localdomain>
·····@harlequin.co.uk (Pekka P. Pirinen) writes:

> Yes and no.  Yes, it's smaller than 2^24.  No, it's only three bits
> shorter than fixnum size, which is 24 bits on the PC platforms (Linux
> and Windows).  On other Unix platforms, LW has it better in both
> senses: ARRAY-TOTAL-SIZE-LIMIT = MOST-POSITIVE-FIXNUM = 2^29-1.
> 

What are the eight remaining tag bits used for on PC?

-- 
Lieven Marchand <···@village.uunet.be>
Gla�r ok reifr skyli gumna hverr, unz sinn b��r bana.
From: Pekka P. Pirinen
Subject: Re: ARRAY-TOTAL-SIZE-LIMIT
Date: 
Message-ID: <ix4rz0d727.fsf@zaphod.cam.harlequin.co.uk>
Lieven Marchand <···@village.uunet.be> writes:
> ·····@harlequin.co.uk (Pekka P. Pirinen) writes:
> > Yes and no.  Yes, it's smaller than 2^24.  No, it's only three bits
> > shorter than fixnum size, which is 24 bits on the PC platforms (Linux
> > and Windows).  [...]
>
> What are the eight remaining tag bits used for on PC?

The lowest three bits are used for tags on all types of data (all
allocations are 8-byte aligned).  One of those tag values (000) is for
the "short" types; the next five bits discriminate between different
short types, and each then has upto 24 data bits.  Short types include
fixnums, characters and various internal markers.  The design is
oriented more towards faster type checking than minimizing tag size.
-- 
Pekka P. Pirinen, Adaptive Memory Management Group, Harlequin Limited
 A dullard is someone who can open a dictionary or encyclopedia and
 read only what they'd planned to.