From: Mike G.
Subject: Saving Lisp data
Date: 
Message-ID: <b41bc628-548e-4cb6-8b1d-063c2becce55@p25g2000hsf.googlegroups.com>
Can someone show me how to save/load binary Lisp data? I want to avoid
the conversion to ASCII if possible. I've been fooling w/ SB-FASL, but
when I LOAD the resulting fasl, I get an error about the fasl stack
not being empty when it should be.

I do:

(let ((F (sb-fasl:open-fasl-output #p"foo1.fasl" nil)))
   (sb-fasl:dump-object FOO F)
   (sb-fasl:close-fasl-output F nil))

But I cannot do (load "foo1.fasl")

What am I missing?

-M

From: Zach Beane
Subject: Re: Saving Lisp data
Date: 
Message-ID: <m3skyxvbh5.fsf@unnamed.xach.com>
"Mike G." <···············@gmail.com> writes:

> Can someone show me how to save/load binary Lisp data? I want to avoid
> the conversion to ASCII if possible.

Why?

Zach
From: Mike G.
Subject: Re: Saving Lisp data
Date: 
Message-ID: <6523f41c-e011-4345-b6f7-1b1ae59ee766@m3g2000hsc.googlegroups.com>
On Mar 11, 5:47 am, Zach Beane <····@xach.com> wrote:
> "Mike G." <···············@gmail.com> writes:
> > Can someone show me how to save/load binary Lisp data? I want to avoid
> > the conversion to ASCII if possible.
>
> Why?
>
> Zach

My objects are large and I expect to be saving out lots of stuff.

-M
From: Pascal J. Bourguignon
Subject: Re: Saving Lisp data
Date: 
Message-ID: <7c8x0przii.fsf@pbourguignon.anevia.com>
"Mike G." <···············@gmail.com> writes:

> On Mar 11, 5:47 am, Zach Beane <····@xach.com> wrote:
>> "Mike G." <···············@gmail.com> writes:
>> > Can someone show me how to save/load binary Lisp data? I want to avoid
>> > the conversion to ASCII if possible.
>>
>> Why?
>>
>> Zach
>
> My objects are large and I expect to be saving out lots of stuff.

Binary format often takes more space than ASCII.

The reason is simple to understand: any number will take usually 32
bits in binary, but most numbers are smaller than 100, and therefore
only take 3 characters.
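
A rough sketch of that arithmetic in Lisp, assuming decimal output with
one separator character after each number and a fixed 4-byte binary
field. The function names are made up for illustration:

```lisp
(defun ascii-size (numbers)
  "Characters needed to print NUMBERS in decimal, one separator after each."
  (reduce #'+ numbers
          :key (lambda (n) (1+ (length (format nil "~D" n))))))

(defun binary-size (numbers &optional (bytes-per-number 4))
  "Bytes needed to store NUMBERS as fixed-width binary fields."
  (* bytes-per-number (length numbers)))

;; Small values favour text:
;;   (ascii-size  '(0 7 42 99))  ; => 10
;;   (binary-size '(0 7 42 99))  ; => 16
```

For values in the millions the comparison flips, since each number then
needs eight characters against four bytes.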

Of course it depends on the kind of data you have.  My experience is
that geographical data (road coordinates) takes half as much space in
ASCII as in binary.


Finally, ASCII or binary, if your data is big and you want to save
mem<->disk bandwidth, you can always compress it.
http://www.cliki.net/admin/search?words=gzip


-- 
__Pascal Bourguignon__
From: Mike G.
Subject: Re: Saving Lisp data
Date: 
Message-ID: <7e54983c-a85b-427a-a78c-c1b3aa946377@u10g2000prn.googlegroups.com>
On Mar 11, 12:33 pm, ····@informatimago.com (Pascal J. Bourguignon)
wrote:

> Binary format often takes more space than ASCII.

I'm also concerned with the time it takes to cook the data for output.
Preliminary tests w/ CL-STORE vs. READ/PRINT are favorable. The sum of
the save time and load time w/ CL-STORE is less than w/ READ/PRINT.
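
For reference, the READ/PRINT side of such a comparison can be written
portably along these lines. This is only a sketch: SAVE-READABLY and
LOAD-READABLY are names invented here, and it assumes the object has a
readable printed representation.

```lisp
(defun save-readably (object pathname)
  "Print OBJECT to PATHNAME so that READ can reconstruct it."
  (with-open-file (out pathname :direction :output
                                :if-exists :supersede)
    ;; Standard syntax makes the output independent of the current
    ;; printer settings (base, float format, *print-readably*, ...).
    (with-standard-io-syntax
      (prin1 object out)))
  pathname)

(defun load-readably (pathname)
  "Read back one object written by SAVE-READABLY."
  (with-open-file (in pathname :direction :input)
    (with-standard-io-syntax
      ;; Do not let the data file run code via #. while reading.
      (let ((*read-eval* nil))
        (read in)))))
```

Wrapping each call in TIME, for example
(time (save-readably *data* "/tmp/data.sexp")), gives the numbers to
compare against the CL-STORE calls.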

> The reason is simple to understand: any number will take usually 32
> bits in binary, but most numbers are smaller than 100, and therefore
> only take 3 characters.

You're forgetting the SPACES needed to separate your ASCII numbers. :)

But I'll grant you that, in general, ASCII can be better than binary.
In binary you can only use base-2. In ASCII you can use base-64, for
instance.

> Of course it depends on the kind of data you have.  My experience is
> that geographical data (road coordinates) takes half as much space in
> ASCII as in binary.

I'm processing large arrays of 32-bit pixel data. I want to use
CL-STORE to save the intermediate results of certain image transforms
so that I don't have to re-transform from scratch if I want to tweak a
filter parameter. At the end, I'll pump the array through a JPEG
processor.

> Finally, ASCII or binary, if your data is big and you want to save
> mem<->disk bandwidth, you can always compress it.
> http://www.cliki.net/admin/search?words=gzip

Thanks for the link! I don't know that I'll use it in this project,
but setting up a stream to pipe through gzip is a neat idea :)

-M
From: Pascal Bourguignon
Subject: Re: Saving Lisp data
Date: 
Message-ID: <87d4q0yjjk.fsf@thalassa.informatimago.com>
"Mike G." <···············@gmail.com> writes:

> On Mar 11, 12:33 pm, ····@informatimago.com (Pascal J. Bourguignon)
> wrote:
>
>> Binary format often takes more space than ASCII.
>
> I'm also concerned with the time it takes to cook the data for output.
> Preliminary tests w/ CL-STORE vs. READ/PRINT are favorable. The sum of
> the save time and load time w/ CL-STORE is less than w/ READ/PRINT.

Processing time may be what you have plenty of.  I/O bandwidth is much
more limited on current hardware.  That's why you may even be able to
compress the data.

>> The reason is simple to understand: any number will take usually 32
>> bits in binary, but most numbers are smaller than 100, and therefore
>> only take 3 characters.
>
> You're forgetting the SPACES needed to separate your ASCII numbers. :)

No. (= (1+ (log 100 10)) 3)


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

CONSUMER NOTICE: Because of the "uncertainty principle," it is
impossible for the consumer to simultaneously know both the precise
location and velocity of this product.
From: Mike G.
Subject: Re: Saving Lisp data
Date: 
Message-ID: <6e61f8a2-0eff-42d8-8b93-424e84832967@13g2000hsb.googlegroups.com>
On Mar 11, 6:37 pm, Pascal Bourguignon <····@informatimago.com> wrote:
> "Mike G." <···············@gmail.com> writes:
> > On Mar 11, 12:33 pm, ····@informatimago.com (Pascal J. Bourguignon)
> > wrote:
>
> >> Binary format often takes more space than ASCII.
>
> > I'm also concerned with the time it takes to cook the data for output.
> > Preliminary tests w/ CL-STORE vs. READ/PRINT are favorable. The sum of
> > the save time and load time w/ CL-STORE is less than w/ READ/PRINT.
>
> Processing time may be what you have plenty of.  I/O bandwidth is much
> more limited on current hardware.  That's why you may even be able to
> compress the data.

If compressing / decompressing gets my data to me in a usable fashion
in less real time, I'm there. We'll see.
>
> >> The reason is simple to understand: any number will take usually 32
> >> bits in binary, but most numbers are smaller than 100, and therefore
> >> only take 3 characters.
>
> > You're forgetting the SPACES needed to separate your ASCII numbers. :)
>
> No. (= (1+ (log 100 10)) 3)
>
> --
> __Pascal Bourguignon__                    http://www.informatimago.com/
>
> CONSUMER NOTICE: Because of the "uncertainty principle," it is
> impossible for the consumer to simultaneously know both the precise
> location and velocity of this product.
From: Mike G.
Subject: Re: Saving Lisp data
Date: 
Message-ID: <77b55ab8-aa69-454d-adc1-6b2e1f3717ea@y77g2000hsy.googlegroups.com>
On Mar 11, 6:37 pm, Pascal Bourguignon <····@informatimago.com> wrote:
> > You're forgetting the SPACES needed to separate your ASCII numbers. :)
>
> No. (= (1+ (log 100 10)) 3)

I guess my comparison is unfair. You have to store a space if you
express the numbers naturally. I.e. 100 and 10, not 100 and 010. But
32-bit int binary data does the latter.

When I think of outputting in ASCII though, I think of "rich" human-
readable formats. Like sexps. Or even HTML.

-M
From: Pascal Bourguignon
Subject: Re: Saving Lisp data
Date: 
Message-ID: <878x0oxvdp.fsf@thalassa.informatimago.com>
"Mike G." <···············@gmail.com> writes:

> On Mar 11, 6:37 pm, Pascal Bourguignon <····@informatimago.com> wrote:
>> > You're forgetting the SPACES needed to separate your ASCII numbers. :)
>>
>> No. (= (1+ (log 100 10)) 3)
>
> I guess my comparison is unfair. You have to store a space if you
> express the numbers naturally. I.e. 100 and 10, not 100 and 010. But
> 32-bit int binary data does the latter.

Numbers smaller than 100 have 2 digits: from 00 to 99.

I counted a space, because variable length format is in general more
compact, because in general, we have a lot of 0 or other small numbers.

But you're right, if you know beforehand the range of your numbers,
you can do without the space. For example, a timestamp can be written
as:        "20080312081734"
instead of "2008 3 12 8 17 34"
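
A small sketch of the fixed-width form, using FORMAT's zero-padding
directives. The two function names are hypothetical, and the field
widths (4 digits for the year, 2 for everything else) are the assumption
that makes the separators unnecessary:

```lisp
(defun compact-timestamp (year month day hour minute second)
  "Return the fields as one zero-padded, separator-free string."
  (format nil "~4,'0D~2,'0D~2,'0D~2,'0D~2,'0D~2,'0D"
          year month day hour minute second))

(defun parse-compact-timestamp (string)
  "Split a 14-character timestamp back into its six integer fields."
  (loop for (start end) in '((0 4) (4 6) (6 8) (8 10) (10 12) (12 14))
        collect (parse-integer string :start start :end end)))

;; (compact-timestamp 2008 3 12 8 17 34)      ; => "20080312081734"
;; (parse-compact-timestamp "20080312081734") ; => (2008 3 12 8 17 34)
```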


> When I think of outputting in ASCII though, I think of "rich" human-
> readable formats. Like sexps. Or even HTML.

HTML and its ilk are more expensive in space.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

"Specifications are for the weak and timid!"
From: Alex Mizrahi
Subject: Re: Saving Lisp data
Date: 
Message-ID: <47d653cc$0$90275$14726298@news.sunsite.dk>
 MG> Can someone show me how to save/load binary Lisp data?

you mean storing arbitrary objects in more-or-less efficient way?
try cl-store 
From: Marco Antoniotti
Subject: Re: Saving Lisp data
Date: 
Message-ID: <76652e55-5861-46e5-9f1d-f23e0e9fadfc@13g2000hsb.googlegroups.com>
On Mar 11, 10:41 am, "Alex Mizrahi" <········@users.sourceforge.net>
wrote:
>  MG> Can someone show me how to save/load binary Lisp data?
>
> you mean storing arbitrary objects in more-or-less efficient way?
> try cl-store

+1

I second that.  cl-store has been one of the most pleasant surprises I
used recently.  There are some pitfalls, but you get what you pay for.

Cheers
--
Marco
From: Mike G.
Subject: Re: Saving Lisp data
Date: 
Message-ID: <51c1b5b6-efe6-4638-9157-e16f6560801c@e60g2000hsh.googlegroups.com>
On Mar 11, 9:01 am, Marco Antoniotti <·······@gmail.com> wrote:
> On Mar 11, 10:41 am, "Alex Mizrahi" <········@users.sourceforge.net>
> wrote:
>
> >  MG> Can someone show me how to save/load binary Lisp data?
>
> > you mean storing arbitrary objects in more-or-less efficient way?
> > try cl-store
>
> +1
>
> I second that.  cl-store has been one of the most pleasant surprises I
> used recently.  There are some pitfalls, but you get what you pay for.
>
> Cheers
> --
> Marco

Thanks. I just installed this and it looks pretty cool. This is
exactly what I want!

Care to give me a brief outline of the pitfalls you've experienced?

-M
From: ··············@gmail.com
Subject: Re: Saving Lisp data
Date: 
Message-ID: <54607523-c7b2-4436-849f-b8edeadb6693@d62g2000hsf.googlegroups.com>
> > you mean storing arbitrary objects in more-or-less efficient way?
> > try cl-store
>
> +1
>
> I second that.  cl-store has been one of the most pleasant surprises I
> used recently.  There are some pitfalls, but you get what you pay for.

we used cl-store for a while, but it was slow, difficult to customize
and generated bigger output than it should have. therefore
cl-serializer was born: http://common-lisp.net/project/cl-serializer/

it's faster, produces smaller output and conses less than cl-store
(for example numbers are not fixed length, etc). also it supports a
byte-array based interface which makes it much faster in real-life due
to not calling the stream api for each byte written/read.
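
To illustrate the two ideas in general terms (variable-length numbers
and writing into a byte array instead of a stream), here is a minimal
sketch. It is not cl-serializer's actual format or API; the function
names and the 7-bits-per-byte encoding are assumptions chosen for the
example.

```lisp
(defun make-byte-buffer ()
  "An adjustable byte vector that grows as bytes are pushed onto it."
  (make-array 16 :element-type '(unsigned-byte 8)
                 :adjustable t :fill-pointer 0))

(defun write-varint (n buffer)
  "Append non-negative integer N to BUFFER, 7 bits per byte, low bits first.
The high bit of each byte says whether more bytes follow."
  (check-type n (integer 0))
  (loop
    (let ((low (logand n #x7F)))
      (setf n (ash n -7))
      (cond ((zerop n)
             (vector-push-extend low buffer)
             (return buffer))
            (t
             (vector-push-extend (logior low #x80) buffer))))))

(defun read-varint (buffer &optional (start 0))
  "Decode one integer from BUFFER at START; return the value and next index."
  (let ((value 0) (shift 0) (index start))
    (loop
      (let ((byte (aref buffer index)))
        (incf index)
        (setf value (logior value (ash (logand byte #x7F) shift)))
        (incf shift 7)
        (when (zerop (logand byte #x80))
          (return (values value index)))))))
```

With this scheme any value below 128 costs one byte, and the finished
buffer can be handed to WRITE-SEQUENCE in a single call rather than
going through the stream one byte at a time.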

but as usual, ymmv...

- attila
From: Mike G.
Subject: Re: Saving Lisp data
Date: 
Message-ID: <f770bdc5-439b-4baf-b010-3c124e625f21@13g2000hsb.googlegroups.com>
On Mar 13, 9:44 am, ···············@gmail.com"
<··············@gmail.com> wrote:
> > > you mean storing arbitrary objects in more-or-less efficient way?
> > > try cl-store
>
> > +1
>
> > I second that.  cl-store has been one of the most pleasant surprises I
> > used recently.  There are some pitfalls, but you get what you pay for.
>
> we used cl-store for a while, but it was slow, difficult to customize
> and generated bigger output than it should have. therefore
> cl-serializer was born: http://common-lisp.net/project/cl-serializer/
>
> it's faster, produces smaller output and conses less than cl-store
> (for example numbers are not fixed length, etc). also it supports a
> byte-array based interface which makes it much faster in real-life due
> to not calling the stream api for each byte written/read.
>
> but as usual, ymmv...
>
> - attila

Thanks for the link - I'll check this out too. I did notice that
CL-STORE is producing massive output. A 1000x1000 double-float matrix
takes 47M (!) which is about 5 times what it should be. CL-STORE's
load+save time is still less than READ/PRINT, though, so it's a win -
but I'd like to see what your serializer can do.
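
For scale, the raw payload of such a matrix is easy to work out,
assuming 8 bytes per double-float and no header or per-element tags
(RAW-DOUBLE-FLOAT-BYTES is a name made up for this sketch):

```lisp
(defun raw-double-float-bytes (dimensions)
  "Bytes needed for the bare elements of a double-float array."
  (* 8 (reduce #'* dimensions)))

;; (raw-double-float-bytes '(1000 1000)) ; => 8000000
```

That is 8,000,000 bytes, about 7.6 MiB, so a 47M file is closer to six
times the bare data.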

-M
From: David Golden
Subject: Re: Saving Lisp data
Date: 
Message-ID: <3RhCj.24825$j7.452814@news.indigo.ie>
Mike G. wrote:


> Thanks for the link - I'll check this out too. I did notice that
> CL-STORE is producing massive output. A 1000x1000 double-float matrix
> takes 47M (!) which is about 5 times what it should be. CL-STORE's
> load+save time is still less than READ/PRINT, though, so it's a win -
> but I'd like to see what your serializer can do.
> 
> -M


I hesitate to mention it, but with SBCL on posix there is a fast, simple
way that you can use if you don't care too much about ever so
hifalutin' things like portability and abstraction and safety and
avoiding brittle data (non)formats and whatnot - just mmap a file in
and access it as an alien array with the ffi.

mmaping is a convenient technique fairly often used for transient
restart files in numerical simulation, though in real life I'm always
telling people to use a proper file format even for their restart files
(some binary format libraries may use mmapping underneath so the hit
there often isn't too bad, and there are many advantages to having
restart files in a well-defined format)

You might find that even if you prefer to work with real lisp-side
arrays, copying the values over to an mmapped alien array at the end to
save them can still be a fair bit faster than writing to a stream,
though YMMV.

N.B. And you can also crash your sbcl session if you make one false
move, so watch out - this is bypassing all normal safeties and munging
the process' memory map.

Simple/limited example code (run at your own risk!):


(defconstant +plat-page-size+ 4096) ; sysconf not in sb-posix...

(defun dump-double-array (filename array)
  (declare (type (array double-float) array))
  (let* ((fd (sb-posix:open filename
                            (logior sb-posix:o-creat sb-posix:o-rdwr)
                            (logior sb-posix:s-irusr sb-posix:s-iwusr)))
         (byte-size (* (array-total-size array) 8))
         (page-rounded (* +plat-page-size+
                          (ceiling byte-size +plat-page-size+))))
    (unwind-protect
         (progn
           (sb-posix:ftruncate fd byte-size)
           (let ((addr (sb-posix:mmap nil page-rounded
                                      (logior sb-posix:prot-read
                                              sb-posix:prot-write)
                                      sb-posix:map-shared
                                      fd 0)))
             (unwind-protect
                  (let ((ptr
                         (sb-alien:sap-alien addr
                                             (* double-float))))
                    (dotimes (i (array-total-size array))
                      (setf (sb-alien:deref ptr i) (row-major-aref array i))))
               (sb-posix:munmap addr page-rounded))))
      (sb-posix:close fd))))

(defun restore-double-array (filename dimensions)
  (let* ((fd (sb-posix:open filename sb-posix:o-rdonly)))
    (unwind-protect
         (let* ((byte-size (sb-posix:stat-size (sb-posix:fstat fd)))
                (page-rounded (* +plat-page-size+
                                 (ceiling byte-size +plat-page-size+)))
                (array-size (/ byte-size 8))
                (array (make-array dimensions :element-type 'double-float)))
           (when (/= array-size (array-total-size array))
             (error "File size vs. dimensions mismatch"))
           (let ((addr (sb-posix:mmap nil page-rounded
                                      sb-posix:prot-read
                                      sb-posix:map-private
                                      fd 0)))
             (unwind-protect
                  (let ((ptr
                         (sb-alien:sap-alien addr
                                             (* double-float))))
                    (dotimes (i array-size)
                      (setf (row-major-aref array i) (sb-alien:deref ptr i))))
               (sb-posix:munmap addr page-rounded)))
           array)
      (sb-posix:close fd))))

;;;; try it.
;;; you may need to increase sbcl --dynamic-space-size to fit..
;;; each array here is ~ 290 MByte.  Remember 32-bit platforms
;;; can only address 4GB...
;;;
;; (prog1 t (defparameter *a* 
;;              (make-array '(8765 4321) :element-type 'double-float)))
;;
;; (dotimes (i 8765)
;;   (dotimes (j 4321)
;;    (setf (aref *a* i j) (coerce (+ i j) 'double-float))))
;;
;; (dump-double-array "/tmp/test1.dat" *a*)
;;
;;; Later...
;;
;; (prog1 t (defparameter *b* 
;;         (restore-double-array "/tmp/test1.dat" '(8765 4321))))
;;
;; (dotimes (i 8765)
;;      (dotimes (j 4321)
;;        (unless (eql (aref *b* i j) (coerce (+ i j) 'double-float))
;;          (error "Uhoh, compare failed under eql"))))
From: Holger Schauer
Subject: Re: Saving Lisp data
Date: 
Message-ID: <yxzr6edlm27.fsf@gmx.de>
On 5307 September 1993, David Golden wrote:
> You might find that even if you prefer to work with real lisp-side
> arrays, copying the values over to an mmapped alien array at the end to
> save them can still be a fair bit faster than writing to a stream,
> though YMMV.

That's an interesting idea. I wonder if such an approach could be
generalized. For instance, I have a ternary search tree library, but
saving such a tree to disc is horribly inefficient in comparison to
the C version -- in which you would just dump the memory part in
question. You just gave me an idea. 

Ah, and btw., the resulting data might not be portable, but using a
FFI wrapper such as CFFI, at least the lisp side access might be more
or less portable across lisp implementations.

Holger

-- 
---          http://hillview.bugwriter.net/            ---
Fachbegriffe der Informatik - Einfach erklärt
163: SMD
       Schwer Montierbare Dinger (Holger Köpke)
From: Nicolas Neuss
Subject: Re: Saving Lisp data
Date: 
Message-ID: <87d4pyu2wv.fsf@ma-patru.mathematik.uni-karlsruhe.de>
···············@gmail.com" <··············@gmail.com> writes:

>> I second that.  cl-store has been one of the most pleasant surprises I
>> used recently.  There are some pitfalls, but you get what you pay for.
>
> we used cl-store for a while, but it was slow, difficult to customize
> and generated bigger output than it should have. therefore
> cl-serializer was born: http://common-lisp.net/project/cl-serializer/

A related question: I would like to store a large amount of data consisting
of blocks of floating point numbers (the size of these blocks can vary
between 1..10000 double floats) as efficiently as possible in a database,
(especially, I would like to avoid converting to ASCII, because the numbers
usually will not have a nice ASCII representation).  Does anyone have any
experience how one could approach this?  Which database, which interface,
which method to store the double-float arrays?

Thanks,
Nicolas
From: Mike G.
Subject: Re: Saving Lisp data
Date: 
Message-ID: <167f18d7-9b27-4707-8804-3f5e0473bb40@c33g2000hsd.googlegroups.com>
On Mar 13, 10:14 am, Nicolas Neuss <········@math.uni-karlsruhe.de>
wrote:
> ···············@gmail.com" <··············@gmail.com> writes:
> >> I second that.  cl-store has been one of the most pleasant surprises I
> >> used recently.  There are some pitfalls, but you get what you pay for.
>
> > we used cl-store for a while, but it was slow, difficult to customize
> > and generated bigger output than it should have. therefore
> > cl-serializer was born: http://common-lisp.net/project/cl-serializer/
>
> A related question: I would like to store a large amount of data consisting
> of blocks of floating point numbers (the size of these blocks can vary
> between 1..10000 double floats) as efficiently as possible in a database,
> (especially, I would like to avoid converting to ASCII, because the numbers
> usually will not have a nice ASCII representation).  Does anyone have any
> experience how one could approach this?  Which database, which interface,
> which method to store the double-float arrays?
>
> Thanks,
> Nicolas

Are you locked into a DB solution? I'd probably approach this with
something like CL-STORE, or CL-SERIALIZER and a hash-table.

-M
From: George Neuner
Subject: Re: Saving Lisp data
Date: 
Message-ID: <f3ljt3934t5lr1cuufqrbrgkmdkdi6uc5v@4ax.com>
On Thu, 13 Mar 2008 15:14:24 +0100, Nicolas Neuss
<········@math.uni-karlsruhe.de> wrote:

>···············@gmail.com" <··············@gmail.com> writes:
>
>>> I second that.  cl-store has been one of the most pleasant surprises I
>>> used recently.  There are some pitfalls, but you get what you pay for.
>>
>> we used cl-store for a while, but it was slow, difficult to customize
>> and generated bigger output than it should have. therefore
>> cl-serializer was born: http://common-lisp.net/project/cl-serializer/
>
>A related question: I would like to store a large amount of data consisting
>of blocks of floating point numbers (the size of these blocks can vary
>between 1..10000 double floats) as efficiently as possible in a database,
>(especially, I would like to avoid converting to ASCII, because the numbers
>usually will not have a nice ASCII representation).  Does anyone have any
>experience how one could approach this?  Which database, which interface,
>which method to store the double-float arrays?

In general any data can be stored as a BLOB.  I can't guide you much
as to libraries ... I've done this from C++ but not from Lisp.

If you need to serialize the data for portability, use an
implementation that can (un)serialize directly to a byte array (or
alternatively, make sure streams can be opened on byte arrays).  If
you have no portability worries and need speed, you can read/write the
BLOB directly from the buffer of a typed array.
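
In plain Common Lisp, the closest portable analogue of dumping a typed
array is WRITE-SEQUENCE on a binary stream. The sketch below assumes the
implementation supports (unsigned-byte 32) as a file stream element
type; the byte order on disk is implementation-dependent, so the file is
only meant to be read back by the same Lisp on the same kind of machine.
SAVE-U32-VECTOR and LOAD-U32-VECTOR are hypothetical names.

```lisp
(defun save-u32-vector (vector pathname)
  "Write a vector of 32-bit unsigned integers to PATHNAME in one call."
  (with-open-file (out pathname :direction :output
                                :element-type '(unsigned-byte 32)
                                :if-exists :supersede)
    (write-sequence vector out))
  pathname)

(defun load-u32-vector (pathname)
  "Read the whole file at PATHNAME back into a fresh specialized vector."
  (with-open-file (in pathname :direction :input
                               :element-type '(unsigned-byte 32))
    ;; For a binary stream, FILE-LENGTH counts stream elements, not octets.
    (let ((vector (make-array (file-length in)
                              :element-type '(unsigned-byte 32))))
      (read-sequence vector in)
      vector)))
```

A multi-dimensional array A can be written the same way through a
displaced vector, e.g. (make-array (array-total-size a) :element-type
'(unsigned-byte 32) :displaced-to a), provided A has that element type.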

George
--
for email reply remove "/" from address
From: Pascal Bourguignon
Subject: Re: Saving Lisp data
Date: 
Message-ID: <87y78lrej6.fsf@thalassa.informatimago.com>
Nicolas Neuss <········@math.uni-karlsruhe.de> writes:

> ···············@gmail.com" <··············@gmail.com> writes:
>
>>> I second that.  cl-store has been one of the most pleasant surprises I
>>> used recently.  There are some pitfalls, but you get what you pay for.
>>
>> we used cl-store for a while, but it was slow, difficult to customize
>> and generated bigger output than it should have. therefore
>> cl-serializer was born: http://common-lisp.net/project/cl-serializer/
>
> A related question: I would like to store a large amount of data consisting
> of blocks of floating point numbers (the size of these blocks can vary
> between 1..10000 double floats) as efficiently as possible in a database,
> (especially, I would like to avoid converting to ASCII, because the numbers
> usually will not have a nice ASCII representation).  Does anyone have any
> experience how one could approach this?  Which database, which interface,
> which method to store the double-float arrays?

Depends on whether you want to do that portably or not.  I cannot help
you for implementation specific stuff.  Portably, you can use things
like:
http://darcs.informatimago.com/darcs/public/lisp/common-lisp/float-binio.lisp
or:
http://www.cliki.net/ieee-floats
also see this thread:
http://groups.google.com/group/comp.lang.lisp/browse_frm/thread/ed5fdab0db1a80a5/e1895651aca6fe1e?lnk=st&q=#e1895651aca6fe1e

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

HANDLE WITH EXTREME CARE: This product contains minute electrically
charged particles moving at velocities in excess of five hundred
million miles per hour.
From: vanekl
Subject: Re: Saving Lisp data
Date: 
Message-ID: <frdvr2$j5b$1@aioe.org>
Nicolas Neuss wrote:
snip
> A related question: I would like to store a large amount of data consisting
> of blocks of floating point numbers (the size of these blocks can vary
> between 1..10000 double floats) as efficiently as possible in a database,
> (especially, I would like to avoid converting to ASCII, because the numbers
> usually will not have a nice ASCII representation).  Does anyone have any
> experience how one could approach this?  Which database, which interface,
> which method to store the double-float arrays?
> 
> Thanks,
> Nicolas

If the data is primarily static, simply writing a binary file to disk
and storing the file name and path in the database is probably
the best solution. There's no sense in backing up the same data
every day if it's static. The ACID properties that databases
provide are of little use for static data. Indexing is probably
the best feature you can get, but you don't need to store the
actual doubles in the db to get this (unless of course, you want
to index the actual data, which most business solutions would
not require).

If the data is dynamic, then it's best to use a normalized schema.
You would want something like,

tblDoubles
auto_incr id [key]
integer   block_num
double    dbl

OR, if the block size never increases, and you are allowed to
lock the table before populating it with a new block:

tblDoubles
integer   id [key]
double    dbl

tblBlockRanges
auto_incr block_num [key]
integer   id_start
integer   id_end



But if your PHB tells you to cram it all into a db, and it must
be stored in the most efficient manner, George is correct: use
any database that supports blobs. You also have to make sure
your database driver that you use to connect with the language
of your choice supports blobs. Not all do.