From: Camm Maguire
Subject: Favorite C technique to Lisp... How to?
Date: 
Message-ID: <54d62nai9z.fsf@intech19.enhanced.com>
Greetings!  I have a very frequently used technique for accessing and
modifying very large formatted files from within C programs, and am
wondering if there is a 'natural' way to accomplish the same within
lisp. 

Files that exceed available RAM can be mmapped MAP_SHARED, at least
under Linux, and the OS will only page into memory sections actually
read or written by the program.  I very frequently cast the mmapped
beginning pointer (r1) to a pointer to a given structure type, and
then set a similar pointer (re) to the end of the mmapped region.
Presuming the records in the file are sorted according to some
function of the structure elements, a pointer to any structure of
interest can be efficiently obtained via bsearch, called with an
assembled key or template record, r1, re-r1, sizeof(*r1), and the
comparison function.  Modifying the file at this location is simply
assigning a value to the structure element referenced by this
pointer.  In all this, only a few pages of memory are used, and two
pointers are initialized.
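
The technique described above might look roughly like the following in C. This is a minimal sketch, not the poster's actual code: the `struct record` layout, the helper names `cmp_record` and `update_record`, and the assumption that the file is a packed array of records sorted by ascending `key` are all hypothetical.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical fixed-size record; the file is assumed to be a packed
   array of these, sorted by ascending key. */
struct record {
    uint32_t key;
    uint32_t value;
};

/* Comparison function for bsearch: orders records by key. */
static int cmp_record(const void *a, const void *b)
{
    const struct record *x = a, *y = b;
    return (x->key > y->key) - (x->key < y->key);
}

/* Map the file MAP_SHARED, locate the record with the given key, and
   overwrite its value in place.  Returns 0 on success, -1 on any
   error or if no record has that key. */
int update_record(const char *path, uint32_t key, uint32_t new_value)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;

    void *base = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (base == MAP_FAILED)
        return -1;

    struct record *r1 = base;                     /* first record */
    struct record *re = r1 + len / sizeof *r1;    /* one past the last */
    struct record tmpl = { .key = key, .value = 0 };

    struct record *hit = bsearch(&tmpl, r1, (size_t)(re - r1),
                                 sizeof *r1, cmp_record);
    int rc = -1;
    if (hit != NULL) {
        hit->value = new_value;     /* this store modifies the file */
        rc = 0;
    }

    msync(base, len, MS_SYNC);      /* flush dirty pages before unmapping */
    munmap(base, len);
    return rc;
}
```

A call such as `update_record("records.bin", 30, 99)` (file name hypothetical) touches only the pages that bsearch probes and the one page that is written, however large the file is.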

So one thinks of an array of structures in lisp.  However, in lisp,
structures are objects distinct from the 'structure body', i.e. they
also contain the structure definition, etc.  A structure body itself,
the appropriate concept to share structure with the large mmapped file,
is not a definable array element type; rather, the type is 'upgraded'
to object, meaning that one would have to initialize pointers for each
record in the file, and then initialize each structure element in each
record via a pointer to the relevant location in the file.  This is
a lot more allocation and overhead on startup.

Any ideas?

Take care,

-- 
Camm Maguire			     			····@enhanced.com
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah

From: Carl Shapiro
Subject: Re: Favorite C technique to Lisp... How to?
Date: 
Message-ID: <ouysmbh7rtu.fsf@panix3.panix.com>
Camm Maguire <····@enhanced.com> writes:

> Files that exceed available RAM can be mmapped MAP_SHARED, at least
> under Linux, and the OS will only page into memory sections actually
> read or written by the program.  I very frequently cast the mmapped
> beginning pointer (r1) to a pointer to a given structure type, and
> then set a similar pointer (re) to the end of the mmapped region.
> Presuming the records in the file are sorted according to some
> function of the structure elements, a pointer to any structure of
> interest can be efficiently obtained via bsearch, called with an
> assembled key or template record, r1, re-r1, sizeof(*r1), and the
> comparison function.  Modifying the file at this location is simply
> assigning a value to the structure element referenced by this
> pointer.  In all this, only a few pages of memory are used, and two
> pointers are initialized.

The right way to manipulate record-oriented data in Common Lisp is by
overlaying source data with byte-arrays which are in turn overlaid
with displaced arrays of the correct type for the structure member
data.  This is the strategy which the Lisp Machine uses deep within
the file system internals as well as in other situations where
pointer-oriented structures are not an appropriate abstraction for
modeling domain data.  If you have control over your implementation's
array primitives you can easily clobber the data pointer for an
array-header object to reuse the base record and spare yourself some
otherwise gratuitous copying.  It's "a veritable bos grunniens of
hair", as some would have said, but at the same time, unequivocally
effective.
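
For readers coming from the C side, the overlay idea can be approximated by treating the file as a plain byte array and reading each field through a typed accessor at a fixed offset. The sketch below is only a rough C analogue, not Lisp and not the Lisp Machine code mentioned above; the record layout, the offsets, and every function name are hypothetical, and fields are assumed to be stored in host byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout, described by byte offsets instead of a C struct:
   bytes 0..3 of each record hold a key, bytes 4..7 hold a value. */
enum { RECORD_SIZE = 8, KEY_OFFSET = 0, VALUE_OFFSET = 4 };

/* Read a 32-bit field out of the byte array without aliasing problems. */
static uint32_t load_u32(const unsigned char *bytes, size_t offset)
{
    uint32_t v;
    memcpy(&v, bytes + offset, sizeof v);
    return v;
}

/* Write a 32-bit field back into the byte array. */
static void store_u32(unsigned char *bytes, size_t offset, uint32_t v)
{
    memcpy(bytes + offset, &v, sizeof v);
}

/* Typed accessors: each one is a "view" of one member across all records. */
uint32_t record_key(const unsigned char *bytes, size_t index)
{
    return load_u32(bytes, index * RECORD_SIZE + KEY_OFFSET);
}

uint32_t record_value(const unsigned char *bytes, size_t index)
{
    return load_u32(bytes, index * RECORD_SIZE + VALUE_OFFSET);
}

void set_record_value(unsigned char *bytes, size_t index, uint32_t v)
{
    store_u32(bytes, index * RECORD_SIZE + VALUE_OFFSET, v);
}
```

No per-record object is allocated here: the only state is the base pointer to the bytes, which is the property the overlay approach is after.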

From: Edi Weitz
Subject: Re: Favorite C technique to Lisp... How to?
Date: 
Message-ID: <87iscdhkfk.fsf@bird.agharta.de>
On 24 Jul 2004 04:05:49 -0400, Carl Shapiro <·············@panix.com> wrote:

> The right way to manipulate record-oriented data in Common Lisp is
> by overlaying source data with byte-arrays which are in turn
> overlaid with displaced arrays of the correct type for the structure
> member data.  This is the strategy which the Lisp Machine uses deep
> within the file system internals as well as in other situations
> where pointer-oriented structures are not an appropriate abstraction
> for modeling domain data.  If you have control over your
> implementation's array primitives you can easily clobber the data
> pointer for an array-header object to reuse the base record and
> spare yourself some otherwise gratuitous copying.  It's "a veritable
> bos grunniens of hair", as some would have said, but at the same
> time, unequivocally effective.

That sounds interesting. Do you have an example for this technique
using a non-LispM implementation like AllegroCL, LW, or CMUCL?

Thanks,
Edi.

-- 

"Lisp doesn't look any deader than usual to me."
(David Thornley, reply to a question older than most languages)

Real email: (replace (subseq ·········@agharta.de" 5) "edi")