Lisp vs. I/O

From: Bud Graziano
Subject: Lisp vs. I/O
Date: Tue, 07 Dec 2004 22:01:07 +0000
Message-ID: <56a5817f.0412071401.936511a@posting.google.com>

Hi,


Yesterday we started learning about Common Lisp and CLOS. My professor
said that Lisp is weak in I/O. Having a glance at
http://www.lisp.org/HyperSpec/Body/sec_the_streams_dictionary.html
reveals several stream functions and macros. Not to mention the file
section of the CLHS and the pathnames data-structure. So it seems to
me that Lisp is good at I/O. At least that's my impression. Can anyone
comment on it?

Thank you,
Bud.

Re: Lisp vs. I/O Jeff M.
- Re: Lisp vs. I/O Julian Stecklina
  - Re: Lisp vs. I/O Duane Rettig
  - Re: Lisp vs. I/O Jeff
    - Re: Lisp vs. I/O Pascal Bourguignon
Re: Lisp vs. I/O Paolo Amoroso
Re: Lisp vs. I/O Pascal Bourguignon
- Re: Lisp vs. I/O Adam Warner
Re: Lisp vs. I/O Harald Hanche-Olsen
- Re: Lisp vs. I/O Peter Seibel
  - Re: Lisp vs. I/O Harald Hanche-Olsen
    - Re: Lisp vs. I/O Duane Rettig
    - Re: Lisp vs. I/O Pascal Bourguignon
      - Re: Lisp vs. I/O Duane Rettig
        Re: Lisp vs. I/O Peter Seibel
        Re: Lisp vs. I/O Duane Rettig
      - Re: Lisp vs. I/O Russell McManus
        Re: Lisp vs. I/O Peter Seibel
        Re: Lisp vs. I/O David Steuber
        Re: Lisp vs. I/O Duane Rettig
        Re: Lisp vs. I/O Peter Seibel
        Re: Lisp vs. I/O Duane Rettig
        Re: Lisp vs. I/O Pascal Bourguignon
        Re: Lisp vs. I/O Duane Rettig
        Re: Lisp vs. I/O Pascal Bourguignon
        Re: Lisp vs. I/O Duane Rettig
        Re: Lisp vs. I/O Pascal Bourguignon
        Re: Lisp vs. I/O Russell McManus
        Re: Lisp vs. I/O Brian Downing
        Re: Lisp vs. I/O Pascal Bourguignon
      - Re: Lisp vs. I/O Harald Hanche-Olsen
        Re: Lisp vs. I/O rif
        Re: Lisp vs. I/O Duane Rettig
        Re: Lisp vs. I/O Mario S. Mommer
        Re: Lisp vs. I/O Pascal Bourguignon
        Re: Lisp vs. I/O Pascal Bourguignon
        Re: Lisp vs. I/O Charles Hixson
      - Re: Lisp vs. I/O Karl A. Krueger
    - Re: Lisp vs. I/O Peter Seibel
      - Re: Lisp vs. I/O Harald Hanche-Olsen
        Re: Lisp vs. I/O Peter Seibel
    - Re: Lisp vs. I/O Duane Rettig
    - Re: Lisp vs. I/O ···@gnu.org
    - Re: Lisp vs. I/O Paolo Amoroso
Re: Lisp vs. I/O David Sletten
Re: Lisp vs. I/O Adam Warner
- Re: Lisp vs. I/O Pascal Bourguignon
  - Re: Lisp vs. I/O Adam Warner
    - Re: Lisp vs. I/O Christopher C. Stacy
- Re: Lisp vs. I/O Adam Warner

From: Jeff M.
Subject: Re: Lisp vs. I/O
Date: Tue, 07 Dec 2004 22:15:16 +0000
Message-ID: <1102457716.159847.88180@c13g2000cwb.googlegroups.com>

I have never run into an I/O limitation using Lisp. There have been
many times, however, where I have been frustrated with C after using
Lisp for a while (especially with file I/O).

Bottom line: Lisp can handle I/O just fine and do whatever your
professor thinks that it can't. Go ahead, test us. Ask your professor
something that Lisp can't do and post it here :)

Jeff M.

From: Julian Stecklina
Subject: Re: Lisp vs. I/O
Date: Tue, 07 Dec 2004 22:20:16 +0000
Message-ID: <86fz2hoin3.fsf@goldenaxe.localnet>

"Jeff M." <·······@gmail.com> writes:

> I have never run into an I/O limitation using Lisp. There have been
> many times, however, where I have been frustrated with C after using
> Lisp for a while (especially with file I/O).

To back C a bit: You can easily mmap() a file into the application's
virtual memory and cast it into a structure pointer. Something which
is not straightforward in CL.

Regards,
-- 
                    ____________________________
 Julian Stecklina  /  _________________________/
  ________________/  /
  \_________________/  LISP - truly beautiful

From: Duane Rettig
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 02:28:49 +0000
Message-ID: <4is7dh6am.fsf@franz.com>

Julian Stecklina <··········@web.de> writes:

> "Jeff M." <·······@gmail.com> writes:
> 
> > I have never run into an I/O limitation using Lisp. There have been
> > many times, however, where I have been frustrated with C after using
> > Lisp for a while (especially with file I/O).
> 
> To back C a bit: You can easily mmap() a file into the application's
> virtual memory and cast it into a structure pointer. Something which
> is not straightforward in CL.

This should be easy enough using simple-streams.  In fact, as I
understand it, one of the major reasons why Paul Foley created the
original open-source version of simple-streams (now available in one
form or another on at least cmucl and sbcl) was to allow him to
do memory-mapped i/o.

The idea would be to open the file as a memory mapped file, and then
grab the buffer slot in the stream and treat it (along with any offset
that is needed) as an "aligned" pointer [1] on which one can overlay any
foreign type.  This can easily be done in Allegro CL.

[1] In Allegro CL terminology, an aligned pointer is a fixnum whose
actual bit representation is the address being considered.  This is
usually 4 or 8 times the actual value of the fixnum.  It does not
cons, because it is a fixnum, and it covers all natural-word-aligned
addresses in memory (negative fixnum values cover the top half of
memory).  Since a mapped file always aligns its workspace to a
page boundary, it is always thus at least aligned to a fixnum's
tag as well. The ff:fslot-value-typed accessor accepts an :aligned
allocation argument.

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182

From: Jeff
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 00:38:30 +0000
Message-ID: <a2std.204817$HA.117968@attbi_s01>

Julian Stecklina wrote:

> "Jeff M." <·······@gmail.com> writes:
> 
> > I have never run into an I/O limitation using Lisp. There have been
> > many times, however, where I have been frustrated with C after using
> > Lisp for a while (especially with file I/O).
> 
> To back C a bit: You can easily mmap() a file into the application's
> virtual memory and cast it into a structure pointer. Something which
> is not straightforward in CL.

That, I will admit, is a nice thing to be able to do.

Jeff M.

-- 
http://www.retrobyte.org
··············@gmail.com

From: Pascal Bourguignon
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 01:00:19 +0000
Message-ID: <87hdmxy57g.fsf@thalassa.informatimago.com>

"Jeff" <·······@gmail.com> writes:

> Julian Stecklina wrote:
> 
> > "Jeff M." <·······@gmail.com> writes:
> > 
> > > I have never run into an I/O limitation using Lisp. There have been
> > > many times, however, where I have been frustrated with C after using
> > > Lisp for a while (especially with file I/O).
> > 
> > To back C a bit: You can easily mmap() a file into the application's
> > virtual memory and cast it into a structure pointer. Something which
> > is not straightforward in CL.
> 
> That, I will admit, is a nice thing to be able to do.

You can easily do it. For example, in clisp:

    - write a little module/library with:
        #include <string.h>
        char* get_data(unsigned long address){return (char*)address;}
        void copy_data(unsigned long address,const char* d){strcpy((char*)address,d);}

    - (ffi:def-call-out get-data (:name "get_data")
         (:arguments (x ffi:ulong)) (:return-type ffi:string)
         (:library "libgetdata.so") (:language :stdc))
      (ffi:def-call-out put-data (:name "copy_data")
         (:arguments (x ffi:ulong) (s ffi:string)) (:return-type void)
         (:library "libgetdata.so") (:language :stdc))

then use read-from-string and write-to-string to avoid messing with
FFI any longer.

    (let ((a (susv3.mc3:mmap 0 4096
                     (+ susv3.mc3:PROT_READ susv3.mc3:PROT_WRITE)
                     MAP_SHARED fd 0)))
        (unwind-protect (read-from-string (get-data a))
            (susv3.mc3:munmap a)))

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
The world will now reboot; don't bother saving your artefacts.

From: Paolo Amoroso
Subject: Re: Lisp vs. I/O
Date: Tue, 07 Dec 2004 22:18:58 +0000
Message-ID: <87k6rt69bh.fsf@plato.moon.paoloamoroso.it>

·····@euromail.hu (Bud Graziano) writes:

> Yesterday we started learning about Common Lisp and CLOS. My professor
> said that Lisp is weak in I/O. Having a glance at

Show your professor the videos "Using a Symbolics Lisp Machine ..." at:

  http://lispm.dyndns.org

and ask him/her whether this qualifies as I/O.


Paolo
-- 
Why Lisp? http://alu.cliki.net/RtL%20Highlight%20Film
Recommended Common Lisp libraries/tools (see also http://clrfi.alu.org):
- ASDF/ASDF-INSTALL: system building/installation
- CL-PPCRE: regular expressions
- UFFI: Foreign Function Interface

From: Pascal Bourguignon
Subject: Re: Lisp vs. I/O
Date: Tue, 07 Dec 2004 22:34:27 +0000
Message-ID: <87brd5zqj0.fsf@thalassa.informatimago.com>

·····@euromail.hu (Bud Graziano) writes:

> Hi,
> 
> 
> Yesterday we started learning about Common Lisp and CLOS. My professor
> said that Lisp is weak in I/O. Having a glance at
> http://www.lisp.org/HyperSpec/Body/sec_the_streams_dictionary.html
> reveals several stream functions and macros. Not to mention the file
> section of the CLHS and the pathnames data-structure. So it seems to
> me that Lisp is good at I/O. At least that's my impression. Can anyone
> comment on it?

Yes, that's a problem with lisp: it is so STRONG, that when you look at a
particular point in it, that particular point looks weak, with respect
to the whole rest of lisp.  But you should be comparing lisp I/O
with eg. C I/O, not lisp I/O with lisp macros, or lisp I/O with lisp
list processing, or lisp I/O with lisp error handling, etc.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
The world will now reboot; don't bother saving your artefacts.

From: Adam Warner
Subject: Re: Lisp vs. I/O
Date: Tue, 07 Dec 2004 23:20:05 +0000
Message-ID: <pan.2004.12.07.23.20.01.946517@consulting.net.nz>

Hi Pascal Bourguignon,

> Yes, that's a problem with lisp: it is so STRONG, that when you look at a
> particular point in it, that particular point looks weak, with respect
> to the whole rest of lisp.  But you should be comparing lisp I/O
> with eg. C I/O, not lisp I/O with lisp macros, or lisp I/O with lisp
> list processing, or lisp I/O with lisp error handling, etc.

Pascal, if you compare ANSI Common Lisp IO with Java IO you're in for a
big shock. Standard Java IO is a lot like what you'd expect from a Lisp
implementation with lots of useful non-portable extensions. The Java
standard library is a big achievement.

Regards,
Adam

From: Harald Hanche-Olsen
Subject: Re: Lisp vs. I/O
Date: Tue, 07 Dec 2004 23:07:52 +0000
Message-ID: <pcopt1laerb.fsf@shuttle.math.ntnu.no>

+ ·····@euromail.hu (Bud Graziano):

| Yesterday we started learning about Common Lisp and CLOS. My professor
| said that Lisp is weak in I/O. Having a glance at
| http://www.lisp.org/HyperSpec/Body/sec_the_streams_dictionary.html
| reveals several stream functions and macros. Not to mention the file
| section of the CLHS and the pathnames data-structure. So it seems to
| me that Lisp is good at I/O.

It seems so to me too, but if we're looking for weaknesses one such
would be that it's hard in standard CL to do binary I/O with different
types of data objects in a single file.  I.e., write a couple of
floats, then an integer or two, etc., all in binary format, and read
them back in.
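
For the integer half of that, a stream of (unsigned-byte 8) is enough
in standard CL. The following is only a sketch: the names
WRITE-U32-BE, READ-U32-BE and U32-ROUND-TRIP are made up for
illustration, and the big-endian 32-bit layout is an arbitrary choice.
Floats need a separate encoding step that is not shown here.

```lisp
;; Hypothetical helpers: unsigned 32-bit integers, most significant byte first.
(defun write-u32-be (n stream)
  "Write N as four octets to a binary STREAM."
  (loop for shift from 24 downto 0 by 8
        do (write-byte (ldb (byte 8 shift) n) stream)))

(defun read-u32-be (stream)
  "Read four octets from a binary STREAM and return them as one integer."
  (let ((n 0))
    (loop repeat 4
          do (setf n (logior (ash n 8) (read-byte stream))))
    n))

(defun u32-round-trip (values path)
  "Write VALUES to the file PATH, then read them back as a list."
  (with-open-file (out path :direction :output
                            :if-exists :supersede
                            :element-type '(unsigned-byte 8))
    (dolist (v values)
      (write-u32-be v out)))
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (loop repeat (length values)
          collect (read-u32-be in))))
```

Because the stream carries plain octets, fields of other widths can
follow in the same file as long as reader and writer agree on the
layout.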

But google for "gray streams". ...

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- Debating gives most of us much more psychological satisfaction
  than thinking does: but it deprives us of whatever chance there is
  of getting closer to the truth.  -- C.P. Snow

From: Peter Seibel
Subject: Re: Lisp vs. I/O
Date: Tue, 07 Dec 2004 23:12:59 +0000
Message-ID: <m3is7dr9c4.fsf@javamonkey.com>

Harald Hanche-Olsen <······@math.ntnu.no> writes:

> + ·····@euromail.hu (Bud Graziano):
>
> | Yesterday we started learning about Common Lisp and CLOS. My professor
> | said that Lisp is weak in I/O. Having a glance at
> | http://www.lisp.org/HyperSpec/Body/sec_the_streams_dictionary.html
> | reveals several stream functions and macros. Not to mention the file
> | section of the CLHS and the pathnames data-structure. So it seems to
> | me that Lisp is good at I/O.
>
> It seems so to me too, but if we're looking for weaknesses one such
> would be that it's hard in standard CL to do binary I/O with different
> types of data objects in a single file.  I.e., write a couple of
> floats, then an integer or two, etc., all in binary format, and read
> them back in.

Hmmmm. It's no worse in Lisp than most languages. For one way to do it
see:

  <http://www.gigamonkeys.com/book/practical-parsing-binary-files.html>
  <http://www.gigamonkeys.com/book/practical-an-id3-parser.html>

I can say the code in those chapters was *a* *lot* less painful to
write than code to do similar things I've written in Java.

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp

From: Harald Hanche-Olsen
Subject: Re: Lisp vs. I/O
Date: Tue, 07 Dec 2004 23:27:41 +0000
Message-ID: <pcollc9adua.fsf@shuttle.math.ntnu.no>

+ Peter Seibel <·····@javamonkey.com>:

| > I.e., write a couple of
| > floats, then an integer or two, etc., all in binary format, and read
| > them back in.
| 
| Hmmmm. It's no worse in Lisp than most languages. For one way to do it
| see:
| 
|   <http://www.gigamonkeys.com/book/practical-parsing-binary-files.html>
|   <http://www.gigamonkeys.com/book/practical-an-id3-parser.html>

Nice.  But I see no floats there.  Not that I doubt it can be done,
but it's hard to beat write(fd,&some_float,sizeof(some_float)); and
the corresponding read() for brevity.  I think if I had to parse the
typical file of scientific data, stuffed full of floats, I'd use some
form of FFI.

| I can say the code in those chapters was *a* *lot* less painful to
| write than code to do similar things I've written in Java.

Uh, don't know Java.  I have two Java books on my bookshelf.  They sat
there since Java was supposed to revolutionize the world.  I even read
one, but haven't written a single line of Java in anger.

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- Debating gives most of us much more psychological satisfaction
  than thinking does: but it deprives us of whatever chance there is
  of getting closer to the truth.  -- C.P. Snow

From: Duane Rettig
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 02:37:45 +0000
Message-ID: <4eki1h5vq.fsf@franz.com>

Harald Hanche-Olsen <······@math.ntnu.no> writes:

> + Peter Seibel <·····@javamonkey.com>:
> 
> | > I.e., write a couple of
> | > floats, then an integer or two, etc., all in binary format, and read
> | > them back in.
> | 
> | Hmmmm. It's no worse in Lisp than most languages. For one way to do it
> | see:
> | 
> |   <http://www.gigamonkeys.com/book/practical-parsing-binary-files.html>
> |   <http://www.gigamonkeys.com/book/practical-an-id3-parser.html>
> 
> Nice.  But I see no floats there.  Not that I doubt it can be done,
> but it's hard to beat write(fd,&some_float,sizeof(some_float)); and
> the corresponding read() for brevity.  I think if I had to parse the
> typical file of scientific data, stuffed full of floats, I'd use some
> form of FFI.

Check out

http://www.franz.com/support/documentation/7.0/doc/operators/excl/write-vector.htm

This is a part of simple-streams, and it supports the writing of floats
directly, as well as vectors of floats.

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182

From: Pascal Bourguignon
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 00:16:05 +0000
Message-ID: <87653dzltm.fsf@thalassa.informatimago.com>

Harald Hanche-Olsen <······@math.ntnu.no> writes:

> + Peter Seibel <·····@javamonkey.com>:
> 
> | > I.e., write a couple of
> | > floats, then an integer or two, etc., all in binary format, and read
> | > them back in.
> | 
> | Hmmmm. It's no worse in Lisp than most languages. For one way to do it
> | see:
> | 
> |   <http://www.gigamonkeys.com/book/practical-parsing-binary-files.html>
> |   <http://www.gigamonkeys.com/book/practical-an-id3-parser.html>
> 
> Nice.  But I see no floats there.  Not that I doubt it can be done,
> but it's hard to beat write(fd,&some_float,sizeof(some_float)); and
> the corresponding read() for brevity.  

Always "brevity", "velocity"...  But what about correctness and
interchangeability (portability)?

It's hard to beat write(fd,&some_float,sizeof(some_float)); for bugs.

At least, in Common-Lisp, you can extract floating point number
properties portably and write them in an interchangeable format. See
DECODE-FLOAT and INTEGER-DECODE-FLOAT.
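
A minimal sketch of that idea, assuming the target format is IEEE-754
binary64 and that INTEGER-DECODE-FLOAT returns a 53-bit significand
for normalized doubles (true of the common implementations, but an
assumption all the same). ENCODE-DOUBLE and DOUBLE-TO-OCTETS are
made-up names, and denormals, infinities and NaNs are not handled.

```lisp
(defun encode-double (x)
  "Return the IEEE-754 binary64 bit pattern of the double-float X as an integer.
Only zero and normalized finite values are handled in this sketch."
  (multiple-value-bind (significand exponent sign) (integer-decode-float x)
    (let ((sign-bit (if (minusp sign) 1 0)))
      (if (zerop significand)
          (ash sign-bit 63)
          (progn
            ;; Assumption: a normalized double yields a 53-bit significand.
            (assert (= (integer-length significand) 53) ()
                    "Only normalized doubles are handled.")
            (logior (ash sign-bit 63)
                    ;; The exponent is relative to an integer significand,
                    ;; so add 52 before applying the bias of 1023.
                    (ash (+ exponent 52 1023) 52)
                    ;; Drop the implicit leading 1 bit.
                    (ldb (byte 52 0) significand)))))))

(defun double-to-octets (x)
  "Return the eight octets of X, most significant first."
  (let ((bits (encode-double x)))
    (loop for shift from 56 downto 0 by 8
          collect (ldb (byte 8 shift) bits))))
```

The resulting list can be handed to WRITE-SEQUENCE on a stream opened
with :element-type '(unsigned-byte 8), which fixes the byte order in
the file regardless of the host's endianness.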

> I think if I had to parse the
> typical file of scientific data, stuffed full of floats, I'd use some
> form of FFI.

Well, I've not seen a lot of scientific data files, but those I've
seen were all in pure ASCII.  (And by the way, one good reason to keep
data in ASCII is that it takes much less space than in binary).

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
The world will now reboot; don't bother saving your artefacts.

From: Duane Rettig
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 03:01:27 +0000
Message-ID: <4acsph4s8.fsf@franz.com>

Pascal Bourguignon <····@mouse-potato.com> writes:

> Harald Hanche-Olsen <······@math.ntnu.no> writes:
> 
> > + Peter Seibel <·····@javamonkey.com>:
> > 
> > | > I.e., write a couple of
> > | > floats, then an integer or two, etc., all in binary format, and read
> > | > them back in.
> > | 
> > | Hmmmm. It's no worse in Lisp than most languages. For one way to do it
> > | see:
> > | 
> > |   <http://www.gigamonkeys.com/book/practical-parsing-binary-files.html>
> > |   <http://www.gigamonkeys.com/book/practical-an-id3-parser.html>
> > 
> > Nice.  But I see no floats there.  Not that I doubt it can be done,
> > but it's hard to beat write(fd,&some_float,sizeof(some_float)); and
> > the corresponding read() for brevity.  
> 
> Always "brevity", "velocity"...  But what about correctness and
> interchangeability (portability)?

Well, sorry to wave the simple-streams flag _again_, but if it were to
become a pseudo-standard (e.g. via a CLRFI process) then it would be
correct and portable by definition.

> It's hard to beat write(fd,&some_float,sizeof(some_float)); for bugs.
> 
> At least, in Common-Lisp, you can extract floating point number
> properties portably and write them in an interchangeable format. See
> DECODE-FLOAT and INTEGER-DECODE-FLOAT.

Yes, but it is also an incredibly slow procedure to do that, one which
would have C advocates laughing at the poor support Lisp has for i/o :-)
Note that for double-floats these functions cons in a 32-bit lisp.

Here is the counter-example:

CL-USER(1): (time (write-vector 1.0d0 *standard-output*))
????????               < ---- control characters output
; cpu time (non-gc) 0 msec user, 0 msec system
; cpu time (gc)     0 msec user, 0 msec system
; cpu time (total)  0 msec user, 0 msec system
; real time  0 msec
; space allocation:
;  2 cons cells, 0 other bytes, 0 static bytes
12
CL-USER(2): 

The two cons cells were just part of the interpretation process.
But of course writing binary data to a character stream is not
as useful, so let's make it more so:

CL-USER(2): (defparameter *foo*
               (make-array 12 :element-type '(unsigned-byte 8)
                              :initial-element 0))
*FOO*
CL-USER(3): (with-output-to-buffer (s *foo*) (write-vector 1.0d0 s))
12
CL-USER(4): 

Almost as simple.  Now let's verify that the value got written
out:

CL-USER(4): *foo*
#(0 0 0 0 0 0 240 63 0 0 ...)
CL-USER(5): :i 1.0d0
A NEW double-float = 1.0d0 [#x3ff00000 00000000] @ #x1062bfd2
CL-USER(6): (format t "~x ~x" 63 240)
3f f0
NIL
CL-USER(7): 

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182

From: Peter Seibel
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 04:20:09 +0000
Message-ID: <m3fz2hpgjq.fsf@javamonkey.com>

Duane Rettig <·····@franz.com> writes:

> Here is the counter-example:
>
> CL-USER(1): (time (write-vector 1.0d0 *standard-output*))
> ????????               < ---- control characters output
> ; cpu time (non-gc) 0 msec user, 0 msec system
> ; cpu time (gc)     0 msec user, 0 msec system
> ; cpu time (total)  0 msec user, 0 msec system
> ; real time  0 msec
> ; space allocation:
> ;  2 cons cells, 0 other bytes, 0 static bytes
> 12
> CL-USER(2): 

How is it that 1.0d0 is understood to be a vector? Or are the docs you
pointed to for write-vector in another post incomplete?

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp

From: Duane Rettig
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 05:25:58 +0000
Message-ID: <4sm6hz7h5.fsf@franz.com>

Peter Seibel <·····@javamonkey.com> writes:

> Duane Rettig <·····@franz.com> writes:
> 
> > Here is the counter-example:
> >
> > CL-USER(1): (time (write-vector 1.0d0 *standard-output*))
> > ????????               < ---- control characters output
> > ; cpu time (non-gc) 0 msec user, 0 msec system
> > ; cpu time (gc)     0 msec user, 0 msec system
> > ; cpu time (total)  0 msec user, 0 msec system
> > ; real time  0 msec
> > ; space allocation:
> > ;  2 cons cells, 0 other bytes, 0 static bytes
> > 12
> > CL-USER(2): 
> 
> How is it that 1.0d0 is understood to be a vector? Or are the docs you
> pointed to for write-vector in another post incomplete?

I assume that you mean this link:

http://www.franz.com/support/documentation/7.0/doc/operators/excl/write-vector.htm

If you didn't see it the first time, look again.  Search specifically
with your browser for occurrences of "float" - they are all in the same
paragraph.

Note that in keeping with the "vector" aspect of write-vector
(and for read-vector, where it's mandatory), you can either write or
read series of floats from or into appropriate-sized specialized single-
or double-float vectors, and writing from a double-float vector of size
1 is indeed equivalent to writing from a double-float object.  It's
just more convenient not to have to create an intermediate vector
from which to write when one can do so directly from a constant or
a variable with the value already available.

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182

From: Russell McManus
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 01:40:02 +0000
Message-ID: <87sm6h36vh.fsf@thelonious.dyndns.org>

Pascal Bourguignon <····@mouse-potato.com> writes:

> Always "brevity", "velocity"...  But what about correctness and
> interchangeability (portability)?

OK, I'll bite.  Is there some way to recreate the value written out by
this program in portable common lisp?  The best I've come up with in
the past is to use the FFI of the implementation.

Here is the C program I am wondering about:

#include <fcntl.h>
#include <unistd.h>
int main(int argc,char*argv[]) {
  /* O_CREAT requires a mode argument */
  int fd=open("/var/tmp/foo", O_WRONLY|O_TRUNC|O_CREAT, 0644);
  double v = 1234.567;
  write(fd,(char*)&v,sizeof(v));
  close(fd);
  return 0;
}

Here are the resulting bits:

1010100 11100011 10100101 10011011 1000100 1001010 10010011 1000000

I got the bits from this program:

(with-open-file (s "/var/tmp/foo"
		   :direction :input
		   :element-type 'unsigned-byte)
  (loop for byte = (read-byte s nil s) then (read-byte s nil s)
    until (eql byte s)
    do (format t "~B " byte)))
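
One portable route, sketched under the assumption that the file holds
a little-endian IEEE-754 double (which is what the bits above look
like): rebuild the 64-bit pattern from the octets and scale the
significand with SCALE-FLOAT. OCTETS-TO-DOUBLE is a made-up name, and
denormals, infinities and NaNs are rejected rather than handled.

```lisp
(defun octets-to-double (octets)
  "Decode a sequence of 8 OCTETS, least significant first, as a binary64 value.
Only zero and normalized finite values are handled in this sketch."
  (let ((bits 0))
    ;; Reassemble the 64-bit pattern from the little-endian octets.
    (dotimes (i 8)
      (setf (ldb (byte 8 (* 8 i)) bits) (elt octets i)))
    (let ((sign (if (logbitp 63 bits) -1 1))
          (biased (ldb (byte 11 52) bits))
          (fraction (ldb (byte 52 0) bits)))
      (cond ((and (zerop biased) (zerop fraction))
             (* sign 0.0d0))
            ((or (zerop biased) (= biased 2047))
             (error "Denormals, infinities and NaNs are not handled."))
            (t
             ;; Restore the implicit leading 1, then apply the exponent:
             ;; bias 1023 plus 52 fraction bits gives 1075.
             (* sign
                (scale-float (float (logior fraction (ash 1 52)) 1.0d0)
                             (- biased 1075))))))))
```

Feeding it the eight octets printed above, for example after
READ-SEQUENCE into a vector from a stream opened with
:element-type '(unsigned-byte 8), should give back the value the C
program wrote.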

-russ

From: Peter Seibel
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 02:07:13 +0000
Message-ID: <m3wtvtpmpa.fsf@javamonkey.com>

Russell McManus <···············@yahoo.com> writes:

> Pascal Bourguignon <····@mouse-potato.com> writes:
>
>> Always "brevity", "velocity"...  But what about correctness and
>> interchangeability (portability)?
>
> OK, I'll bite.  Is there some way to recreate the value written out by
> this program in portable common lisp?  The best I've come up with in
> the past is to use the FFI of the implementation.

See the other post I just made in this thread for one way. If that
looks enticing you might want to take a look at chapters 24 and 25 at:

  <http://www.gigamonkeys.com/book/>

> Here is the C program I am wondering about:
>
> #include <fcntl.h>
> #include <unistd.h>
> int main(int argc,char*argv[]) {
>   /* O_CREAT requires a mode argument */
>   int fd=open("/var/tmp/foo", O_WRONLY|O_TRUNC|O_CREAT, 0644);
>   double v = 1234.567;
>   write(fd,(char*)&v,sizeof(v));
>   close(fd);
>   return 0;
> }

Note here that you're at the mercy of how doubles are represented in
memory. I don't believe that is specified by C but I don't know much
about C so I could be wrong.

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp

From: David Steuber
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 16:10:02 +0000
Message-ID: <87r7m0py91.fsf@david-steuber.com>

Peter Seibel <·····@javamonkey.com> writes:

> Note here that you're at the mercy of how doubles are represented in
> memory. I don't believe that is specified by C but I don't know much
> about C so I could be wrong.

The way the C program was written, the byte ordering would certainly
matter.  I think IEEE-754 specifies the bit pattern.  But I think you
are still stuck with endian issues.  Admittedly, I've only done this
sort of thing with integers.

In any case, Lisp can do all sorts of bit twiddling.  So far, the best
argument I've seen against Lisp IO is mmap's absence.  Pascal
Bourguignon did come up with an example for doing it, but it did use
FFI.  I suppose if the file in question was small enough, it could be
read into a bit vector.  That's not quite the same as what mmap does
though.

In a situation like this, storing IEEE-754 doubles in a file, isn't
the portability of the data more important than the code that works on
it?  Doesn't that imply that the bit pattern in the file has to be
specified?

-- 
An ideal world is left as an exercise to the reader.
   --- Paul Graham, On Lisp 8.1

From: Duane Rettig
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 16:43:13 +0000
Message-ID: <4eki0pwpq.fsf@franz.com>

David Steuber <·····@david-steuber.com> writes:

> Peter Seibel <·····@javamonkey.com> writes:
> 
> > Note here that you're at the mercy of how doubles are represented in
> > memory. I don't believe that is specified by C but I don't know much
> > about C so I could be wrong.
> 
> The way the C program was written, the byte ordering would certainly
> matter.  I think IEEE-754 specifies the bit pattern.  But I think you
> are still stuck with endian issues.  Admittedly, I've only done this
> sort of thing with integers.

The endianness issues don't have to be issues.  See an article I wrote
elsewhere in this thread giving an example of writing and reading a
float on two opposite-endian architectures.

> In any case, Lisp can do all sorts of bit twiddling.  So far, the best
> argument I've seen against Lisp IO is mmap's absence.  Pascal
> Bourguignon did come up with an example for doing it, but it did use
> FFI.  I suppose if the file in question was small enough, it could be
> read into a bit vector.  That's not quite the same as what mmap does
> though.

I also had a description of using it, though it may not have been clear.
In Allegro CL (or in any fully implemented simple-streams implementation)
you can use OPEN with a :mapped t keyword argument specification.  The
stream you get back has a "buffer" slot which is really not a buffer; it
is the workspace which has been mmapped in from the file.

> In a situation like this, storing IEEE-754 doubles in a file, isn't
> the portability of the data more important than the code that works on
> it?  Doesn't that imply that the bit pattern in the file has to be
> specified?

Yes, but it need not be nonportable.  In the read-vector/write-vector
parlance, specifying :endian-swap :network-order to both is sufficient
to guarantee an order that can be communicated portably.

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182

From: Peter Seibel
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 17:42:52 +0000
Message-ID: <m3r7m0ofdv.fsf@javamonkey.com>

Duane Rettig <·····@franz.com> writes:

> I also had a description of using it, though it may not have been
> clear. In Allegro CL (or in any fully implemented simple-streams
> implementation) you can use OPEN with a :mapped t keyword argument
> specification. The stream you get back has a "buffer" slot which is
> really not a buffer; it is the workspace which has been mmapped in
> from the file.

So does the user then grab that buffer out of the stream and frob it
directly? I.e. I don't want a stream interface, I want a random access
interface. And does the buffer map the whole file or just a chunk of
it at a time?

>> In a situation like this, storing IEEE-754 doubles in a file, isn't
>> the portability of the data more important than the code that works
>> on it? Doesn't that imply that the bit pattern in the file has to
>> be specified?
>
> Yes, but it need not be nonportable.  In the read-vector/write-vector
> parlance, specifying :endian-swap :network-order to both is sufficient
> to guarantee an order that can be communicated portably.

Are all the other aspects of float formats (such as the number of bits
used for mantissa and exponent) specified by the language standard? I
thought the standard left a lot of room for implementations to choose
their floating point representations.

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp

From: Duane Rettig
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 19:09:34 +0000
Message-ID: <44qiwppxt.fsf@franz.com>

Peter Seibel <·····@javamonkey.com> writes:

> Duane Rettig <·····@franz.com> writes:
> 
> > I also had a description of using it, though it may not have been
> > clear. In Allegro CL (or in any fully implemented simple-streams
> > implementation) you can use OPEN with a :mapped t keyword argument
> > specification. The stream you get back has a "buffer" slot which is
> > really not a buffer; it is the workspace which has been mmapped in
> > from the file.
> 
> So does the user then grab that buffer out of the stream and frob it
> directly? I.e. I don't want a stream interface, I want a random access
> interface.

Perhaps I wasn't clear in my original response on this thread:

! The idea would be to open the file as a memory mapped file, and then
! grab the buffer slot in the stream and treat it (along with any offset
! that is needed) as an "aligned" pointer [1] on which one can overlay any
! foreign type.  This can easily be done in Allegro CL.

The part about grabbing the buffer slot is obvious; it is almost
word-for-word the same.  What might not be obvious is the implication
of what it means to "overlay any foreign type" over the workspace.  This
means _any_ foreign type, including foreign structs, arrays, bytes,
words, longs, floats, etc.  All random-access.  See 

http://www.franz.com/support/documentation/7.0/doc/ftype.htm

> And does the buffer map the whole file or just a chunk of
> it at a time?

The former.  I found the response which I had given to David Steuber
elsewhere on this thread:

! I also had a description of using it, though it may not have been clear.
! In Allegro CL (or in any fully implemented simple-streams implementation)
! you can use OPEN with a :mapped t keyword argument specification.  The
! stream you get back has a "buffer" slot which is really not a buffer; it
! is the workspace which has been mmapped in from the file.

I guess the implication that the whole mapping occurs at once isn't
obvious, though if you know how mmap works it's hard to imagine it
any other way (i.e. why do buffering when you can read and write directly
to the file)?  So to be clear, a single mmap call is performed to grab
the complete file.  If the file is too large to fit into memory, it is
possible to subclass the mapped-file-simple-stream class, and to give
it a different device-open method to map the file in pieces, and then to
specialize device-read/device-write to allow for extending and/or remapping
the file as necessary, but that question didn't seem to be part of the
requirement in the original question about mmapping a file; I took the
original question to be about mmapping a whole file.

> >> In a situation like this, storing IEEE-754 doubles in a file, isn't
> >> the portability of the data more important than the code that works
> >> on it? Doesn't that imply that the bit pattern in the file has to
> >> be specified?
> >
> > Yes, but it need not be nonportable.  In the read-vector/write-vector
> > parlance, specifying :endian-swap :network-order to both is sufficient
> > to guarantee an order that can be communicated portably.
> 
> Are all the other aspects of float formats (such as the number of bits
> used for mantissa and exponent) specified by the language standard. I
> thought the standard left a lot of room for implementations to choose
> their floating point representations.

Yes, of course, this assumes IEEE-754 floats.  That's a pretty good
assumption, nowadays, since it is an almost completely accepted
standard format.  Exceptions are

 1. IBM 360/370 formats, which feature a radix 16 exponent,

 2. Cray XMP/YMP/2 formats, which are of their own size and style

 3. Vax style, still available on Alphas.

Alpha architectures define both Vax and IEEE-754 formats.  I have not
looked at the IBM ISA since we dropped our UTS/370 port over 10
years ago.  I don't know if newer versions of the ISA add IEEE-754
support.

I guess I should indeed modify my statements to fit a "mostly portable"
description.  99% ain't bad...
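
Whether a given implementation's double-float has the shape of IEEE-754
binary64 can be probed from portable code.  The following is only a
sketch: the function name is made up for illustration, and passing the
check makes binary64 likely rather than certain, since it looks at the
radix, precision and largest finite value and nothing else.

```lisp
(defun double-float-looks-like-binary64-p ()
  "True if DOUBLE-FLOAT has the radix, precision and range of IEEE-754 binary64.
Hypothetical helper, not part of any library."
  (and (= (float-radix 1d0) 2)            ; binary significand
       (= (float-digits 1d0) 53)          ; 52 stored bits plus the hidden bit
       ;; The largest finite binary64 value is (2^53 - 1) * 2^971.
       (= (rational most-positive-double-float)
          (* (1- (expt 2 53)) (expt 2 971)))))
```

A program that exchanges binary floats could call this once at startup
and refuse to run (or fall back to a text format) when it returns NIL.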

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182

From: Pascal Bourguignon
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 18:55:16 +0000
Message-ID: <87y8g8vcvf.fsf@thalassa.informatimago.com>

Peter Seibel <·····@javamonkey.com> writes:

> Duane Rettig <·····@franz.com> writes:
> 
> > I also had a description of using it, though it may not have been
> > clear. In Allegro CL (or in any fully implemented simple-streams
> > implementation) you can use OPEN with a :mapped t keyword argument
> > specification. The stream you get back has a "buffer" slot which is
> > really not a buffer; it is the workspace which has been mmapped in
> > from the file.
> 
> So does the user then grab that buffer out of the stream and frob it
> directly? I.e. I don't want a stream interface, I want a random access
> interface. And does the buffer map the whole file or just a chunk of
> it at a time?

Well, the difficulty comes from the fact that Lisp works with
references more than by copying values.  Usually only simple scalar
values such as fixnum and characters are stored directly into a
structure or an array slot.  In most cases, it's a pointer to the
value that is stored.

So to work in the mmap style, you have to change all your lisp habits,
and explicitly place your structure or array in the buffer, and
explicitly copy slots between this structure or array and the normal
lisp heap.  You have to manage the depth of the copy too.

Therefore, in all cases you end up doing some kind of FFI, unless you
just need the most basic unboxed array of ieee floats or char, int or
long.

Well, I guess the same problems occur even in C when you have complex
data structures in mmap: you have to use offsets instead of pointers
too, and to explicitly copy too.
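
The offsets-instead-of-pointers point can be illustrated without any
mmap at all, using a linked list laid out in a plain octet vector.  The
two-byte node layout (value, then offset of the next node, with 0
meaning end of list) and the names below are assumptions made for this
sketch only.

```lisp
(defun walk-offset-list (buffer start)
  "Collect node values from BUFFER, following next-offsets from START.
Each node is two octets: a value, then the offset of the next node.
Offset 0 terminates the list, so no node may live at offset 0."
  (loop for offset = start then (aref buffer (1+ offset))
        until (zerop offset)
        collect (aref buffer offset)))

;; Nodes at offsets 2, 6 and 4, linked in that order.
(defparameter *buffer*
  (make-array 8 :element-type '(unsigned-byte 8)
                :initial-contents '(0 0 10 6 30 0 20 4)))

(walk-offset-list *buffer* 2)   ; => (10 20 30)
```

Because the links are offsets into the buffer and not machine addresses,
the same octets stay valid wherever the buffer is loaded or mapped.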

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
The world will now reboot; don't bother saving your artefacts.

From: Duane Rettig
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 19:39:49 +0000
Message-ID: <4zn0oo9yy.fsf@franz.com>

Pascal Bourguignon <····@mouse-potato.com> writes:

> Peter Seibel <·····@javamonkey.com> writes:
> 
> > Duane Rettig <·····@franz.com> writes:
> > 
> > > I also had a description of using it, though it may not have been
> > > clear. In Allegro CL (or in any fully implemented simple-streams
> > > implementation) you can use OPEN with a :mapped t keyword argument
> > > specification. The stream you get back has a "buffer" slot which is
> > > really not a buffer; it is the workspace which has been mmapped in
> > > from the file.
> > 
> > So does the user then grab that buffer out of the stream and frob it
> > directly? I.e. I don't want a stream interface, I want a random access
> > interface. And does the buffer map the whole file or just a chunk of
> > it at a time?
> 
> Well, the difficulty comes from the fact that Lisp works with
> references more than by copying values.  Usually only simple scalar
> values such as fixnum and characters are stored directly into a
> structure or an array slot.  In most cases, it's a pointer to the
> value that is stored.

Yes, this is true.

> So to work in the mmap style, you have to change all your lisp habits,
> and explicitly place your structure or array in the buffer, and
> explicitly copy slots between this structure or array and the normal
> lisp heap.  You have to manage the depth of the copy too.

Explicit copying isn't necessary.  Given a good foreign types interface,
the interpretation and/or copying of bits will be done for you.  See
the example below.

> Therefore, in all cases you end up doing some kind of FFI, unless you
> just need the most basic unboxed array of ieee floats or char, int or
> long.

This is more likely true, although most lisps supply some kind of
peek/poke facility as well (in Allegro CL it is sys:memref or
sys:memref-int).

> Well, I guess the same problems occur even in C when you have complex
> data structures in mmap: you have to use offsets instead of pointers
> too, and to explicitly copy too.

No, C allows you to overlay types onto raw data, and Allegro CL
has a similar C-like facility for this as well, which is what I've
been describing in previous articles in this thread.  No explicit
copying or offsetting is required; it is only required that one describe
the data.  For example:

CL-USER(1): (with-open-file (out "~/foo" :direction :output
                                 :if-does-not-exist :create
                                :if-exists :supersede)
              (write-vector 1234.567d0 out)
              (write-vector 7654.321 out))
4
CL-USER(2): (setq foo (open "~/foo" :mapped t))
; Autoloading for class MAPPED-FILE-SIMPLE-STREAM:
; Fast loading from bundle code/streamm.fasl.
#<MAPPED-FILE-SIMPLE-STREAM
  #P"/net/gemini/home/duane/foo" mapped for input pos 0 @ #x1063a43a>
CL-USER(3): (setq buf (slot-value foo 'excl::buffer))
168285184
CL-USER(4): (ff:def-foreign-type mystruct
               (:struct (x :double) (y :float)))
#<FOREIGN-FUNCTIONS::FOREIGN-STRUCTURE MYSTRUCT>
CL-USER(5): (ff:fslot-value-typed 'mystruct :aligned buf 'y)
7654.321
CL-USER(6): (ff:fslot-value-typed 'mystruct :aligned buf 'x)
1234.567d0
CL-USER(7): (ff:fslot-value-typed 'mystruct :aligned buf 'x)
1234.567d0
CL-USER(8): (ff:fslot-value-typed 'mystruct :aligned buf 'y)
7654.321
CL-USER(9): 

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182

From: Pascal Bourguignon
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 13:43:59 +0000
Message-ID: <87brd4ykf4.fsf@thalassa.informatimago.com>

Russell McManus <···············@yahoo.com> writes:

> Pascal Bourguignon <····@mouse-potato.com> writes:
> 
> > Always "brevity", "velocity"...  But what about correctness and
> > interchangeability (portability)?
> 
> OK, I'll bite.  Is there some way to recreate the value written out by
> this program in portable common lisp?  The best I've come up with in
> the past is to use the FFI of the implementation.
> 
> Here is the C program I am wondering about:
> 
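> #include <fcntl.h>
> #include <unistd.h>
> int main(int argc,char*argv[]) {
>   /* O_CREAT requires a mode argument; 0644 is an assumed choice. */
>   int fd=open("/var/tmp/foo", O_WRONLY|O_TRUNC|O_CREAT, 0644);
>   double v = 1234.567;
>   write(fd,(char*)&v,sizeof(v));
>   return 0;
> }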

(loop with s = (format nil "~64,'0B~%"(float-64-to-ieee-754 1234.567d0)) 
      for i from 56 downto 0 by 8 
      do (princ (subseq s i (+ 8 i))) (princ " "))

01010100 11100011 10100101 10011011 01000100 01001010 10100011 01000000 
 
> Here are the resulting bits:
> 
> 1010100 11100011 10100101 10011011 1000100 1001010 10010011 1000000
 01010100 11100011 10100101 10011011 01000100 01001010 10100011 01000000 

Yeah! Another victory for lisp!


You can write the binary file with:

    (with-open-file (out "value.ieee-754-double" :direction :output
                         :if-does-not-exist :create
                         :if-exists :supersede
                         :element-type '(unsigned-byte 64))
        (write-byte (float-64-to-ieee-754 1234.567d0) out))


Note that the file is written with the host byte sex, that is it is
wrong.  But that bug was *specified* by the C programmer...


Now, let's return the challenge and ask the C programmer to write the
same IEEE-754 double in network byte order, so that the file is portable
across platforms.  Something like:

    (with-open-file (out "value.ieee-754-double" :direction :output
                         :if-does-not-exist :create
                         :if-exists :supersede
                         :element-type '(unsigned-byte 8))
       (loop with v = (float-64-to-ieee-754 1234.567d0)
             for i from 56 downto 0 by 8
             do  (write-byte (ldb (byte 8 i) v) out)))


    (with-open-file (s "value.ieee-754-double"
               :direction :input
               :element-type 'unsigned-byte)
      (loop for byte = (read-byte s nil s) then (read-byte s nil s)
        until (eql byte s)
        do (format t "~8,'0B " byte)))
    
==> 01000000 10100011 01001010 01000100 10011011 10100101 11100011 01010100 


Of course, the difficulty lies in float-64-to-ieee-754.  Have the same
C source  work on any bytesex platform, AND on platforms that don't
use IEEE-754 as native floating-point format.



(defmacro gen-ieee-encoding (name type exponent-bits mantissa-bits)
  `(progn
    (defun ,(with-standard-io-syntax 
             (intern (format nil "~A-TO-IEEE-754" name)))  (float)
      (multiple-value-bind (mantissa exponent sign) 
          (integer-decode-float float)
        (dpb (if (minusp sign) 1 0)
             (byte 1 ,(1- (+ exponent-bits mantissa-bits)))
             (dpb (+ ,(+ (1- (expt 2 (1- exponent-bits))) mantissa-bits)
                     exponent)
                  (byte ,exponent-bits ,(1- mantissa-bits))
                  (ldb (byte ,(1- mantissa-bits) 0) mantissa)))))
    (defun ,(with-standard-io-syntax 
             (intern (format nil "IEEE-754-TO-~A" name)))  (ieee)
      (let ((aval (scale-float (coerce
                                (dpb 1 (byte 1 ,(1- mantissa-bits))
                                     (ldb (byte ,(1- mantissa-bits) 0) ieee))
                                type)
                               (- (ldb (byte ,exponent-bits ,(1- mantissa-bits))
                                       ieee) 
                                  ,(1- (expt 2 (1- exponent-bits)))
                                  ,(1- mantissa-bits)))))
        (if (zerop (ldb (byte 1 ,(1- (+ exponent-bits mantissa-bits))) ieee))
            aval
            (- aval))))));;gen-ieee-encoding


(gen-ieee-encoding float-32 'single-float  8 24)
(gen-ieee-encoding float-64 'double-float 11 53)


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
The world will now reboot; don't bother saving your artefacts.

From: Duane Rettig
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 16:29:13 +0000
Message-ID: <4is7cpxd2.fsf@franz.com>

Pascal Bourguignon <····@mouse-potato.com> writes:

> Russell McManus <···············@yahoo.com> writes:
> 
> > Pascal Bourguignon <····@mouse-potato.com> writes:
> > 
> > > Always "brevity", "velocity"...  But what about correctness and
> > > interchangeability (portability)?
> > 
> > OK, I'll bite.  Is there some way to recreate the value written out by
> > this program in portable common lisp?  The best I've come up with in
> > the past is to use the FFI of the implementation.
> > 
> > Here is the C program I am wondering about:
> > 
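> > #include <fcntl.h>
> > #include <unistd.h>
> > int main(int argc,char*argv[]) {
> >   /* O_CREAT requires a mode argument; 0644 is an assumed choice. */
> >   int fd=open("/var/tmp/foo", O_WRONLY|O_TRUNC|O_CREAT, 0644);
> >   double v = 1234.567;
> >   write(fd,(char*)&v,sizeof(v));
> >   return 0;
> > }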
> 
> (loop with s = (format nil "~64,'0B~%"(float-64-to-ieee-754 1234.567d0)) 
>       for i from 56 downto 0 by 8 
>       do (princ (subseq s i (+ 8 i))) (princ " "))

 [ ... ]

> You can write the binary file with:
> 
>     (with-open-file (out "value.ieee-754-double" :direction :output
>                          :if-does-not-exist :create
>                          :if-exists :supersede
>                          :element-type '(unsigned-byte 64))
>         (write-byte (float-64-to-ieee-754 1234.567d0) out))
> 
> 
> Note that the file is written with the host byte sex, that is it is
> wrong.  But that bug was *specified* by the C programmer...

I don't know that this is the case.  The C programmer tends not to care
about endianness if working on one machine; it is networking that
forces a programmer to consider byte-swapping.  This is true in any
language.  However....

> Now, let's return the challenge and ask the C programmer to write the
> same IEEE-754 double in network byte order, so that the file is portable
> across platforms.  Something like:

 [ ... example, 13 LOC ... ]

> Of course, the difficulty lies in float-64-to-ieee-754.  Have the same
> C source  work on any bytesex platform, AND on platforms that don't
> use IEEE-754 as native floating-point format.

 [ ... largish example of 27 LOC ... ]

This is too large.  network ordering should be an integral part of
a good i/o system, and although write-vector/read-vector are not
standard, they could easily be so, and will do the whole job for
you.

If you are familiar with an old television show called "Name That Tune",
you know that the famous phrase is "I can name that tune in X notes".
Well in the spirit of "name that tune", I can write that code in 7 LOC!

On one machine, with some system-identifying calls, over an NFS LAN:

CL-USER(1): (software-type t)
"FreeBSD"
CL-USER(2): (machine-type)
"x86"
CL-USER(3): (with-open-file (out "~/ieee-754-double" :direction :output
                                 :if-does-not-exist :create
                                :if-exists :supersede)
              (write-vector 1234.567d0 out :endian-swap :network-order))
12
CL-USER(4): 

and on another machine on the same LAN:

CL-USER(1): (software-type t)
"Solaris"
CL-USER(2): (machine-type)
"SPARC"
CL-USER(3): (defparameter *foo* (make-array 1 :element-type 'double-float))
*FOO*
CL-USER(4): (with-open-file (in "~/ieee-754-double")
              (read-vector *foo* in :endian-swap :network-order))
8
CL-USER(5): *foo*
#(1234.567d0)
CL-USER(6): 

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182

From: Pascal Bourguignon
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 18:46:13 +0000
Message-ID: <873bygwruy.fsf@thalassa.informatimago.com>

Duane Rettig <·····@franz.com> writes:
>  [ ... largish example of 27 LOC ... ]
> 
> This is too large.  network ordering should be an integral part of
> a good i/o system, and although write-vector/read-vector are not
> standard, they could easily be so, and will do the whole job for
> you.

Good. We can agree that Common Lisp lacks a good I/O system, but that
it does not prevent _implementers_ or _users_ from writing a good I/O
system _over_ it.  Possibly a better I/O system than what is possible
or usually done over C.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
The world will now reboot; don't bother saving your artefacts.

From: Russell McManus
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 15:46:31 +0000
Message-ID: <87k6rs3i94.fsf@thelonious.dyndns.org>

Pascal Bourguignon <····@mouse-potato.com> writes:

> Yeah! Another victory for lisp!
>
> (defmacro gen-ieee-encoding (name type exponent-bits mantissa-bits)
>   `(progn
>     (defun ,(with-standard-io-syntax 
>              (intern (format nil "~A-TO-IEEE-754" name)))  (float)
>       (multiple-value-bind (mantissa exponent sign) 
>           (integer-decode-float float)
>         (dpb (if (minusp sign) 1 0)
>              (byte 1 ,(1- (+ exponent-bits mantissa-bits)))
>              (dpb (+ ,(+ (1- (expt 2 (1- exponent-bits))) mantissa-bits)
>                      exponent)
>                   (byte ,exponent-bits ,(1- mantissa-bits))
>                   (ldb (byte ,(1- mantissa-bits) 0) mantissa)))))
>     (defun ,(with-standard-io-syntax 
>              (intern (format nil "IEEE-754-TO-~A" name)))  (ieee)
>       (let ((aval (scale-float (coerce
>                                 (dpb 1 (byte 1 ,(1- mantissa-bits))
>                                      (ldb (byte ,(1- mantissa-bits) 0) ieee))
>                                 type)
>                                (- (ldb (byte ,exponent-bits ,(1- mantissa-bits))
>                                        ieee) 
>                                   ,(1- (expt 2 (1- exponent-bits)))
>                                   ,(1- mantissa-bits)))))
>         (if (zerop (ldb (byte 1 ,(1- (+ exponent-bits mantissa-bits))) ieee))
>             aval
>             (- aval))))));;gen-ieee-encoding
>
>
> (gen-ieee-encoding float-32 'single-float  8 24)
> (gen-ieee-encoding float-64 'double-float 11 53)

This is exactly what I was looking for.  You rule.

-russ

From: Brian Downing
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 14:01:16 +0000
Message-ID: <ANDtd.160909$5K2.17597@attbi_s03>

In article <··············@thalassa.informatimago.com>,
Pascal Bourguignon  <····@mouse-potato.com> wrote:
> Note that the file is written with the host byte sex, that is it is
> wrong.  But that bug was *specified* by the C programmer...

Okay, I'm all for purity, but I'm going to take issue with that in
general.  It's only a bug if the byte sex is unspecified.  If it's
specified, either in the file itself or (less usefully) in the file
format description, it's not a bug.  Not ideal perhaps, but not a bug.

In the embedded world you usually output whatever you can do fast and
let the client sort it out.  In my world that's /usually/ a native byte
sex (little for me) fixed point representation with the occasional
native byte sex single-float.
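
A byte order that is "specified in the file itself" can be as little as
a one-octet tag that the reader consults before decoding.  The sketch
below assumes a made-up convention (tag 0 means little-endian, tag 1
means big-endian) and made-up function names; it is not any real file
format.

```lisp
(defun decode-u32 (bytes start byte-order)
  "Assemble four octets of BYTES, beginning at START, into an integer.
BYTE-ORDER is :LITTLE or :BIG."
  (let ((value 0))
    (dotimes (i 4 value)
      (let ((shift (ecase byte-order
                     (:little (* 8 i))          ; first octet is least significant
                     (:big    (* 8 (- 3 i))))))  ; first octet is most significant
        (setf value (dpb (aref bytes (+ start i)) (byte 8 shift) value))))))

(defun read-tagged-u32 (bytes)
  "Decode a u32 preceded by a byte-order tag (0 = little, 1 = big).
The tag values are an assumption of this sketch."
  (decode-u32 bytes 1 (ecase (aref bytes 0) (0 :little) (1 :big))))

(read-tagged-u32 #(0 #x78 #x56 #x34 #x12))   ; => 305419896 (#x12345678)
(read-tagged-u32 #(1 #x12 #x34 #x56 #x78))   ; => 305419896 (#x12345678)
```

The writer stays free to emit whatever is fastest for it, and the client
sorts it out from the tag.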

-bcd
-- 
*** Brian Downing <bdowning at lavos dot net>

From: Pascal Bourguignon
Subject: Re: Lisp vs. I/O
Date: Wed, 15 Dec 2004 04:53:33 +0000
Message-ID: <87y8g0kvqq.fsf@thalassa.informatimago.com>

Pascal Bourguignon <····@mouse-potato.com> writes:
> (defmacro gen-ieee-encoding (name type exponent-bits mantissa-bits)

(defmacro gen-ieee-encoding (name type exponent-bits mantissa-bits)
  ;; Thanks to ivan4th (········@nat-msk-01.ti.ru) for correcting an off-by-1
  `(progn
    (defun ,(with-standard-io-syntax 
             (intern (format nil "~A-TO-IEEE-754" name)))  (float)
      (multiple-value-bind (mantissa exponent sign) 
          (integer-decode-float float)
        (dpb (if (minusp sign) 1 0)
             (byte 1 ,(1- (+ exponent-bits mantissa-bits)))
             (dpb (+ ,(+ (- (expt 2 (1- exponent-bits)) 2) mantissa-bits)
                     exponent)
                  (byte ,exponent-bits ,(1- mantissa-bits))
                  (ldb (byte ,(1- mantissa-bits) 0) mantissa)))))
    (defun ,(with-standard-io-syntax 
             (intern (format nil "IEEE-754-TO-~A" name)))  (ieee)
      (let ((aval (scale-float (coerce
                                (dpb 1 (byte 1 ,(1- mantissa-bits))
                                     (ldb (byte ,(1- mantissa-bits) 0) ieee))
                                ,type)
                               (- (ldb (byte ,exponent-bits ,(1- mantissa-bits))
                                       ieee) 
                                  ,(1- (expt 2 (1- exponent-bits)))
                                  ,(1- mantissa-bits)))))
        (if (zerop (ldb (byte 1 ,(1- (+ exponent-bits mantissa-bits))) ieee))
            aval
            (- aval))))));;gen-ieee-encoding


(gen-ieee-encoding float-32 'single-float  8 24)
(gen-ieee-encoding float-64 'double-float 11 53)
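
As a cross-check of the corrected bias, here is a direct, non-macro
encoder for the double-float case only.  It is a sketch under stated
assumptions: double-float is taken to be IEEE-754 binary64, only zero
and normalized finite values are handled (no denormals, infinities or
NaNs), and both function names are made up for illustration.

```lisp
(defun double-to-ieee-754-bits (x)
  "Return the binary64 bit pattern of the double-float X as an integer.
Handles zero and normalized finite values only."
  (if (zerop x)
      ;; Only the sign bit can be set for a zero.
      (if (minusp (float-sign x)) (ash 1 63) 0)
      (multiple-value-bind (mantissa exponent sign) (integer-decode-float x)
        ;; MANTISSA is a 53-bit integer, so the value is MANTISSA * 2^EXPONENT.
        ;; Biased exponent = EXPONENT + 52 (hidden-bit position) + 1023 (bias).
        (logior (if (minusp sign) (ash 1 63) 0)
                (ash (+ exponent 52 1023) 52)
                (ldb (byte 52 0) mantissa)))))

(defun bits-to-octets-le (bits)
  "List the eight octets of BITS, least significant first."
  (loop for position from 0 below 64 by 8
        collect (ldb (byte 8 position) bits)))

(format nil "~16,'0X" (double-to-ieee-754-bits 1234.567d0))
;; => "40934A449BA5E354"
```

The least-significant-first octets of that value are 54 E3 A5 9B 44 4A
93 40, which is the byte sequence the C program's output showed: the
seventh octet is #x93, not #xA3.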

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
Cats meow out of angst
"Thumbs! If only we had thumbs!
We could break so much!"

From: Harald Hanche-Olsen
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 00:27:22 +0000
Message-ID: <pco653dbpn9.fsf@shuttle.math.ntnu.no>

+ Pascal Bourguignon <····@mouse-potato.com>:

| Harald Hanche-Olsen <······@math.ntnu.no> writes:
| 
| > Nice.  But I see no floats there.  Not that I doubt it can be done,
| > but it's hard to beat write(fd,&some_float,sizeof(some_float)); and
| > the corresponding read() for brevity.  
| 
| Always "brevity", "velocity"...  But what about correctness and
| interchangeability (portability)?
| 
| It's hard to beat write(fd,&some_float,sizeof(some_float)); for bugs.

Of course, of course.  That is why you find me here and not on a C
language newsgroup.

| Well, I've not seen a lot of scientific data files, but those I've
| seen were all in pure ASCII.

I have a colleague who regularly works with SAR (synthetic aperture
radar) pictures.  I think each picture (unprocessed) may be a couple
hundred megabytes' worth of binary data.

| (And by the way, one good reason to keep data in ASCII is that it
| takes much less space than in binary).

Really?

CL-USER> pi                          ; 8 bytes
3.141592653589793d0
CL-USER> (coerce pi 'single-float)   ; 4 bytes
3.1415927

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- Debating gives most of us much more psychological satisfaction
  than thinking does: but it deprives us of whatever chance there is
  of getting closer to the truth.  -- C.P. Snow

From: rif
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 02:57:17 +0000
Message-ID: <wj0u0qx5wfm.fsf@five-percent-nation.mit.edu>

As someone who works with CL all the time, and does mostly
floating-point numerical computation, I can say that both sides of
this debate have some merit.  The issue is that I often have a huge
collection of double-floats (say tens to hundreds of megabytes worth)
that I want to be able to easily read in and read out.  C makes this
both easy and fast, with fread and fwrite.  There is no portable CL
way to deal with it that is also fast.  You can parse the ASCII
numbers into double-floats, but that's slow.  You can put it into
binary and then use decode-float hacks to read and write it, but
that's slow too.

I use CMUCL and SBCL specifically (they are free as in beer and
compile to native code, which are two requirements for me).  I wrapped
fread and fwrite through the FFI.  Took a couple of hours.  Works great.

rif

From: Duane Rettig
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 03:21:57 +0000
Message-ID: <4653dh3u2.fsf@franz.com>

rif <···@mit.edu> writes:

> As someone who works with CL all the time, and does mostly
> floating-point numerical computation, I can say that both sides of
> this debate have some merit.  The issue is that I often have a huge
> collection of double-floats (say tens to hundreds of megabytes worth)
> that I want to be able to easily read in and read out.  C makes this
> both easy and fast, with fread and fwrite.  There is no portable CL
> way to deal with it that is also fast.  You can parse the ASCII
> numbers into double-floats, but that's slow.  You can put it into
> binary and then use decode-float hacks to read and write it, but
> that's slow too.
> 
> I use CMUCL and SBCL specifically (they are free as in beer and
> compile to native code, which are two requirements for me).  I wrapped
> fread and fwrite through the FFI.  Took a couple of hours.  Works great.

Obviously one can do anything in C that one can do in Lisp+FFI.  But
there are two problems with doing this for I/O:

1. It proves the point that is being made - Common Lisp's I/O is not
(as defined) as good as that functionality that had to be called
via FFI.  You're not using your Lisp's I/O, you're using C's I/O.

2. The fread/fwrite you use cannot be intermingled with Lisp functionality
without incurring buffering problems; you could probably open a file
using cl:open, but then you'd have to reach in and grab a file-descriptor
or FILE struct pointer and use that directly, and then if you used the
same stream for Lisp output as well as fwrite calls, you have to perform
finish-output after every write, not to mention fflush() calls on the
C/FFI side.

Alternatively, you could use the simple-streams write-vector function to
do the job directly.
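
For the pure octet case there is also a standard-only route that stays
inside Lisp's own buffering: WRITE-SEQUENCE and READ-SEQUENCE on an
(unsigned-byte 8) stream.  This is not write-vector, which is
Allegro-specific, and it moves octets, not floats, so any float encoding
still has to happen separately.  The function names below are made up
for this sketch.

```lisp
(defun save-octets (octets path)
  "Write the octet vector OCTETS to the file PATH, replacing any old contents."
  (with-open-file (out path :direction :output
                            :element-type '(unsigned-byte 8)
                            :if-exists :supersede
                            :if-does-not-exist :create)
    (write-sequence octets out))
  path)

(defun load-octets (path)
  "Read the whole file PATH into a fresh octet vector."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (let ((octets (make-array (file-length in)
                              :element-type '(unsigned-byte 8))))
      (read-sequence octets in)
      octets)))
```

Since only one stream touches the file at a time, there is no second
buffer on the C side to keep in step.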

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182

From: Mario S. Mommer
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 09:45:03 +0000
Message-ID: <fzis7d3yzk.fsf@germany.igpm.rwth-aachen.de>

rif <···@mit.edu> writes:
> As someone who works with CL all the time, and does mostly
> floating-point numerical computation, I can say that both sides of
> this debate have some merit.  The issue is that I often have a huge
> collection of double-floats (say tens to hundreds of megabytes worth)
> that I want to be able to easily read in and read out.  C makes this
> both easy and fast, with fread and fwrite.  There is no portable CL
> way to deal with it that is also fast.  You can parse the ASCII
> numbers into double-floats, but that's slow.  You can put it into
> binary and then use decode-float hacks to read and write it, but
> that's slow too.
>
> I use CMUCL and SBCL specifically (they are free as in beer and
> compile to native code, which are two requirements for me).  I wrapped
> fread and fwrite through the FFI.  Took a couple of hours.  Works great.

I use fasls for the same purpose. It implies being careful when
upgrading (as fasls usually are incompatible across versions), but
that is also easy (use ascii as intermediate format). Plus, you get to
store whatever you want efficiently, without bothering about how to
write/read the binary.

From: Pascal Bourguignon
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 13:50:40 +0000
Message-ID: <877jnsyk3z.fsf@thalassa.informatimago.com>

rif <···@mit.edu> writes:

> As someone who works with CL all the time, and does mostly
> floating-point numerical computation, I can say that both sides of
> this debate have some merit.  The issue is that I often have a huge
> collection of double-floats (say tens to hundreds of megabytes worth)
> that I want to be able to easily read in and read out.  C makes this
> both easy and fast, with fread and fwrite.  There is no portable CL
> way to deal with it that is also fast.  You can parse the ASCII
> numbers into double-floats, but that's slow.  You can put it into
> binary and then use decode-float hacks to read and write it, but
> that's slow too.

What about SAVE-IMAGE? ;-)

Implementations could provide partitioned heaps and a way to save and
load a part of a heap (using for example mmap).  The problem is that
consing occurs everywhere in Common-Lisp, it's not even specified when
and when not, and you would need operators to copy (shallow, deep, how
deep?) data to and from heaps.


> I use CMUCL and SBCL specifically (they are free as in beer and
> compile to native code, which are two requirements for me).  I wrapped
> fread and fwrite through the FFI.  Took a couple of hours.  Works great.
> 
> rif

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
The world will now reboot; don't bother saving your artefacts.

From: Pascal Bourguignon
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 00:53:02 +0000
Message-ID: <87llc9y5jl.fsf@thalassa.informatimago.com>

Harald Hanche-Olsen <······@math.ntnu.no> writes:
> | (And by the way, one good reason to keep data in ASCII is that it
> | takes much less space than in binary).
> 
> Really?
> 
> CL-USER> pi                          ; 8 bytes
> 3.141592653589793d0
> CL-USER> (coerce pi 'single-float)   ; 4 bytes
> 3.1415927

Obviously, it depends on the kind of data. "pi": 2 bytes.

For some geographic data, (records of 32-bit integers), the binary
format takes almost twice the size of the ASCII format.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
The world will now reboot; don't bother saving your artefacts.

From: Charles Hixson
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 19:52:55 +0000
Message-ID: <cp7m2p0302j@news3.newsguy.com>

Pascal Bourguignon wrote:
> Harald Hanche-Olsen <······@math.ntnu.no> writes:
> 
>>| (And by the way, one good reason to keep data in ASCII is that it
>>| takes much less space than in binary).
>>
>>Really?
>>
>>CL-USER> pi                          ; 8 bytes
>>3.141592653589793d0
>>CL-USER> (coerce pi 'single-float)   ; 4 bytes
>>3.1415927
> 
> 
> Obviously, it depends on the kind of data. "pi": 2 bytes.
> 
> For some geographic data, (records of 32-bit integers), the binary
> format takes almost twice the size of the ASCII format.
> 
If the data actually uses 32 bits worth of information, then an 
ASCII representation will be significantly longer.  Note that each 
CHARACTER will represent at least 7 bits...considerably more if one 
uses an extended character set. BCD is superior and convenient...but 
still loses storage compared to binary (each digit is stored in 4 
bits, signs can also be coded).

OTOH, if the data can be accessed via a table lookup (ala pi), then 
that encoding can usually be packed much more tightly than ordinary 
characters.

ASCII is NEVER (almost never) a space efficient way to store things. 
  That's why zip, gzip, and bzip2 were created.

This doesn't mean it can't be the most convenient way to store 
things, or the most useable way.  Space isn't everything.  But don't 
use space efficiency as the reason for choosing ASCII.
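
Both sides of the size argument can be put in numbers.  The sketch below
assumes non-negative integers, one separator octet per value in the text
form, and a fixed four octets per value in the binary form; the function
names are made up for illustration.

```lisp
(defun ascii-size (n)
  "Octets needed for the non-negative integer N as decimal digits plus one separator."
  (1+ (length (format nil "~D" n))))

(defun compare-sizes (numbers)
  "Return two values: total ASCII octets, and total octets as 32-bit binary."
  (values (reduce #'+ numbers :key #'ascii-size)
          (* 4 (length numbers))))

(compare-sizes '(7 42 1234))              ; => 10, 12
(compare-sizes '(4294967295 3000000000))  ; => 22, 8
```

Small values favor the text form and values that really use all 32 bits
favor binary, so the answer depends on the distribution of the data.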

From: Karl A. Krueger
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 04:14:59 +0000
Message-ID: <cp5v40$n5g$1@baldur.whoi.edu>

Pascal Bourguignon <····@mouse-potato.com> wrote:
> Harald Hanche-Olsen <······@math.ntnu.no> writes:
>> I think if I had to parse the typical file of scientific data,
>> stuffed full of floats, I'd use some form of FFI.
> 
> Well, I've not seen a lot of scientific data files, but those I've
> seen were all in pure ASCII.  (And by the way, one good reason to keep
> data in ASCII is that it takes much less space than in binary).

For what it's worth, around here, I think "typical files of scientific
data" are pretty frequently in NetCDF.  But I'm not in scientific
programming here, so I could well be wrong.  :)

-- 
Karl A. Krueger <········@example.edu> { s/example/whoi/ }

Every program has at least one bug and can be shortened by at least one line.
By induction, every program can be reduced to one line which does not work.

From: Peter Seibel
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 02:03:07 +0000
Message-ID: <m31xe1r1gl.fsf@javamonkey.com>

Harald Hanche-Olsen <······@math.ntnu.no> writes:

> + Peter Seibel <·····@javamonkey.com>:
>
> | > I.e., write a couple of
> | > floats, then an integer or two, etc., all in binary format, and read
> | > them back in.
> | 
> | Hmmmm. It's no worse in Lisp than most languages. For one way to do it
> | see:
> | 
> |   <http://www.gigamonkeys.com/book/practical-parsing-binary-files.html>
> |   <http://www.gigamonkeys.com/book/practical-an-id3-parser.html>
>
> Nice.  But I see no floats there.  Not that I doubt it can be done,
> but it's hard to beat write(fd,&some_float,sizeof(some_float)); and
> the corresponding read() for brevity. 

Well, with my binary data code, once you've defined the appropriate
binary type I think it will equal it for brevity. And, as Pascal
points out, it is more likely to correctly implement whatever external
float format you need. For instance, if I needed to read and write
Java floats in the format used by Java's DataInputStream and
DataOutputStream, I'd write something like this (untested):

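  (define-binary-type single-float ()
    (:reader (in)
      (let ((bits (read-value 'u4 in)))
        ;; COND rather than CASE: CASE does not evaluate its keys, so it
        ;; would compare against the constants' names, not their values.
        (cond
          ((= bits +single-float-not-a-number+) 'nan)
          ((= bits +single-float-positive-infinity+) 'positive-infinity)
          ((= bits +single-float-negative-infinity+) 'negative-infinity)
          (t (let* ((sign (ldb +single-float-sign-bit+ bits))
                    (exponent (ldb +single-float-exponent-bits+ bits))
                    (mantissa (ldb +single-float-mantissa-bits+ bits))
                    ;; Nonzero exponent: restore the implicit leading 1.
                    (significand (if (zerop exponent)
                                     mantissa
                                     (logior mantissa (ash 1 23)))))
               ;; Remove the bias (127) and the 23 mantissa bits.
               (* (if (zerop sign) 1 -1)
                  (scale-float (float significand 1f0)
                               (- (max exponent 1) 150))))))))
    (:writer (out value)
      (let ((bits
             (case value
               (nan +single-float-not-a-number+)
               (positive-infinity +single-float-positive-infinity+)
               (negative-infinity +single-float-negative-infinity+)
               (t
                (let ((bits 0))
                  (multiple-value-bind (mantissa exponent sign)
                      (integer-decode-float value)
                    ;; INTEGER-DECODE-FLOAT returns the sign as 1 or -1.
                    (setf (ldb +single-float-sign-bit+ bits)
                          (if (minusp sign) 1 0))
                    ;; Zero keeps an all-zero exponent and mantissa.
                    ;; Denormalized values are not handled here.
                    (unless (zerop mantissa)
                      (setf (ldb +single-float-exponent-bits+ bits)
                            (+ exponent 150))
                      (setf (ldb +single-float-mantissa-bits+ bits)
                            mantissa)))
                  bits)))))
        (write-value 'u4 out bits))))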

Obviously if you have a Lisp implementation that has constants for
the various infinities and NaNs you could use those instead of the
symbols I used to represent those values. Now I can read those floats
with this:

  (read-value 'single-float in)

and write them with this:

  (write-value 'single-float out value)

And I can use those types in my binary classes (also untested):

  (define-binary-class experimental-observation ()
    ((x single-float)
     (y single-float)
     (mass single-float)))

and get readers and writers for free.
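
For readers without the binary data macros at hand, here is a standalone
sketch of the bit-level conversion such a reader and writer would rest
on.  It assumes the implementation's SINGLE-FLOAT is IEEE 754 binary32
(24-bit significand) and that the value is finite and normalized, or
zero.  ENCODE-SINGLE-BITS and DECODE-SINGLE-BITS are hypothetical names,
not part of the code above.

```lisp
(defun encode-single-bits (value)
  "Return the 32-bit IEEE 754 pattern for the single-float VALUE.
Assumes VALUE is zero or a finite, normalized single-float."
  (multiple-value-bind (significand exponent sign)
      (integer-decode-float value)
    (let ((sign-bit (if (minusp sign) 1 0)))
      (if (zerop significand)
          (ash sign-bit 31)
          (logior (ash sign-bit 31)
                  ;; Biased exponent: 127, plus 23 because the
                  ;; significand is returned as a 24-bit integer.
                  (ash (+ exponent 150) 23)
                  ;; Keep 23 bits, dropping the implicit leading 1.
                  (ldb (byte 23 0) significand))))))

(defun decode-single-bits (bits)
  "Return the single-float encoded by the 32-bit integer BITS.
An exponent field of 255 (infinities, NaNs) is not handled."
  (let* ((sign (if (logbitp 31 bits) -1 1))
         (exponent (ldb (byte 8 23) bits))
         (fraction (ldb (byte 23 0) bits))
         (significand (if (zerop exponent)
                          fraction
                          (logior fraction (ash 1 23)))))
    (* sign (scale-float (float significand 1f0)
                         (- (max exponent 1) 150)))))

(encode-single-bits 1.0f0)        ; #x3F800000
(decode-single-bits #xC0200000)   ; -2.5
```

The resulting integer can then be written as four octets in whatever
byte order the external format requires.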

> I think if I had to parse the typical file of scientific data,
> stuffed full of floats, I'd use some form of FFI.

Why? What library would you use? And why would that be easier than
what I just wrote?

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp

From: Harald Hanche-Olsen
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 09:06:32 +0000
Message-ID: <pcod5xlnopz.fsf@shuttle.math.ntnu.no>

+ Peter Seibel <·····@javamonkey.com>:

| Well, with my binary data code, once you've defined the appropriate
| binary type I think it will equal it for brevity.

Yep.  Looks easier than I thought it would be, but not by a lot.  I do
wonder however, how efficient it would be on a truly massive data
file.  This is not something a person primarily interested in
crunching a lot of numbers would be likely to cook up, however.  I am
not saying that CL couldn't do it, only that it is a bit short on
ready-made methods to do it.  But thanks for the code, I'll squirrel
it away against the day I might need it.

| > I think if I had to parse the typical file of scientific data,
| > stuffed full of floats, I'd use some form of FFI.
| 
| Why? What library would you use? And why would that be easier than
| what I just wrote?

Partly ignorance, I guess.  Don't know what library, but there are
lots of them out there.  For simple cases, I'd just write a tiny C
wrapper to read() data into an array, then use whatever mechanisms the
FFI offers to make the result available to Lisp.  (Except now that I
have your code, I'd see if I couldn't adapt it instead.)

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- Debating gives most of us much more psychological satisfaction
  than thinking does: but it deprives us of whatever chance there is
  of getting closer to the truth.  -- C.P. Snow

From: Peter Seibel
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 16:54:20 +0000
Message-ID: <m33bygpw77.fsf@javamonkey.com>

Harald Hanche-Olsen <······@math.ntnu.no> writes:

> + Peter Seibel <·····@javamonkey.com>:
>
> | Well, with my binary data code, once you've defined the appropriate
> | binary type I think it will equal it for brevity.
>
> Yep. Looks easier than I thought it would be, but not by a lot. I do
> wonder however, how efficient it would be on a truly massive data
> file.

I guess there are two scenarios for this kind of thing. One is when
you need to externalize massive amounts of "native" data. I.e. your
program generates a vector of a couple million double floats and you
want to save them to a file and later read them back in. In that case
you don't particularly care what they look like on disk as long as you
get the same values when you read them back.

The other situation is when you need to parse a specific binary format
such as an ID3 tag or a Java class file. In that case you need to read
and write data in a specific format on disk and can't necessarily
count on that format being exactly the same format as the same data
would take in memory. So even in C you'd have to write code something
like what I wrote, except that in C, without proper macros, it'd be a
million times more painful.

In the first case, the difference between Common Lisp and C is not so
much about "I/O" as about C's feature of giving the programmer pretty
much raw access to arbitrary memory. Think of it this way: I can write
this in Common Lisp:

    (defun read-buffer (in size)
      (let ((buffer (make-array size :element-type '(unsigned-byte 8))))
        (read-sequence buffer in)
        buffer))

Hopefully the implementation will compile this down to something
nearly as efficient as the analogous code in C. After I call this
function the I/O part is done; now it's just a question of converting
the data in that array of bytes into other kinds of objects. In C I
can take an address in that buffer and cast it to the type of value I
expect to find there and C will happily oblige. And as long as the
data was written by another C program on the same hardware (and maybe
compiled with the same C compiler) that'll probably work. But that
doesn't really--I don't think--have much to do with I/O. But maybe
that's just being pedantic. However, it is the case that other non-C
languages, such as Java, have to approach things much the way I did
with my Common Lisp code.
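
The conversion step described above, turning octets from the buffer into
other kinds of objects, can be sketched in portable code.  This is only
an illustration: OCTETS-TO-UNSIGNED is a hypothetical helper, and the
byte order is made an explicit argument precisely because Lisp cannot
cast a buffer address the way C does.

```lisp
(defun octets-to-unsigned (buffer start size &key (big-endian t))
  "Combine SIZE octets of BUFFER, beginning at START, into one integer."
  (let ((value 0))
    (dotimes (i size value)
      (let ((octet (aref buffer (+ start i))))
        (if big-endian
            ;; Earlier octets are more significant.
            (setf value (logior (ash value 8) octet))
            ;; Earlier octets are less significant.
            (setf value (logior value (ash octet (* 8 i)))))))))

(let ((buffer (make-array 4 :element-type '(unsigned-byte 8)
                            :initial-contents '(#x12 #x34 #x56 #x78))))
  (list (octets-to-unsigned buffer 0 4)                   ; #x12345678
        (octets-to-unsigned buffer 0 4 :big-endian nil))) ; #x78563412
```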

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp

From: Duane Rettig
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 15:58:34 +0000
Message-ID: <4mzwopys5.fsf@franz.com>

Ingvar <······@hexapodia.net> writes:

> Harald Hanche-Olsen <······@math.ntnu.no> writes:
> 
> > + Peter Seibel <·····@javamonkey.com>:
> > 
> > | > I.e., write a couple of
> > | > floats, then an integer or two, etc., all in binary format, and read
> > | > them back in.
> > | 
> > | Hmmmm. It's no worse in Lisp than most languages. For one way to do it
> > | see:
> > | 
> > |   <http://www.gigamonkeys.com/book/practical-parsing-binary-files.html>
> > |   <http://www.gigamonkeys.com/book/practical-an-id3-parser.html>
> > 
> > Nice.  But I see no floats there.  Not that I doubt it can be done,
> > but it's hard to beat write(fd,&some_float,sizeof(some_float)); and
> > the corresponding read() for brevity.  I think if I had to parse the
> > typical file of scientific data, stuffed full of floats, I'd use some
> > form of FFI.
> 
> And hope that the float format is sufficiently similar from the
> machine that produced the file and the machine that processes the
> file.
> 
> What happens if my 64-bit floats end up being written as "ABCD" and
> your floats end up being stored as "DCBA"? What happens if your 64-bit
> floats have a 48 bit mantissa and my 64-bit floats have a 46-bit
> mantissa? One of these cases is *way* less realistic than the other,
> but they'd be equally catastrophic.

One thing I forgot to mention in my previous articles where I mention
write-vector is that this function is specified to accept an :endian-swap
keyword.  You can specify it as any integer to get any swapping pattern
you wish, or you can use :byte-8 (no-swaps), :byte-16, :byte-32, :byte-64,
or, if you wish big-endian and little-endian machines to communicate,
you can specify :network-order.  Note the keyword description at 

http://www.franz.com/support/documentation/7.0/doc/operators/excl/write-vector.htm

and follow a link to

http://www.franz.com/support/documentation/7.0/doc/streams.htm#endian-swap-3

to see the description of the byte swapping feature of read-vector and
write-vector.

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182

From: ···@gnu.org
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 17:57:23 +0000
Message-ID: <1102528643.939766.156880@f14g2000cwb.googlegroups.com>

Harald Hanche-Olsen wrote:
> Nice.  But I see no floats there.  Not that I doubt it can be done,
> but it's hard to beat write(fd,&some_float,sizeof(some_float)); and
> the corresponding read() for brevity.  I think if I had to parse the
> typical file of scientific data, stuffed full of floats, I'd use some
> form of FFI.

CLISP offers
READ-FLOAT <http://clisp.cons.org/impnotes/stream-dict.html#bin-input>
WRITE-FLOAT
<http://clisp.cons.org/impnotes/stream-dict.html#bin-output>


--
Sam Steingold (http://www.podval.org/~sds) running w2k
<http://www.camera.org> <http://www.iris.org.il>
<http://www.memri.org/>
<http://www.mideasttruth.com/> <http://www.honestreporting.com>
Binaries die but source code lives forever.

From: Paolo Amoroso
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 12:54:45 +0000
Message-ID: <878y89j6ga.fsf@plato.moon.paoloamoroso.it>

Harald Hanche-Olsen <······@math.ntnu.no> writes:

> Nice.  But I see no floats there.  Not that I doubt it can be done,

See:

  http://www.labri.fr/Perso/~mraspaud/lisp-sources/read-bytes-standalone.lisp


Paolo
-- 
Why Lisp? http://alu.cliki.net/RtL%20Highlight%20Film
Recommended Common Lisp libraries/tools (see also http://clrfi.alu.org):
- ASDF/ASDF-INSTALL: system building/installation
- CL-PPCRE: regular expressions
- UFFI: Foreign Function Interface

From: David Sletten
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 03:20:41 +0000
Message-ID: <dqutd.3475$hd.2538@twister.socal.rr.com>

Bud Graziano wrote:
> Hi,
> 
> 
> Yesterday we started learning about Common Lisp and CLOS. My professor
> said that Lisp is weak in I/O. Having a glance at
> http://www.lisp.org/HyperSpec/Body/sec_the_streams_dictionary.html
> reveals several stream functions and macros. Not to mention the file
> section of the CLHS and the pathnames data-structure. So it seems to
> me that Lisp is good at I/O. At least that's my impression. Can anyone
> comment it?
> 
> Thank you,
> Bud.

Welcome to Lisp world, Bud.

You are in this professor's class because presumably he knows things 
that you don't. You should respect that. However, one of the chief 
reasons you are in college is to develop your ability to think for 
yourself. When someone tells you something which you know is wrong, and 
even better, when you can explain why it's wrong, you have no obligation 
to believe them simply because of their position or reputation. You seem 
confident and inquisitive enough to accept that not everything that 
comes out of your professor's mouth is to be absorbed without 
reflection. That is good. By the same token, don't simply take our word 
for it either. Other people have given you some interesting things to 
investigate. Weigh that against what evidence your professor has 
produced, and hopefully you can judge for yourself.

David Sletten

From: Adam Warner
Subject: Re: Lisp vs. I/O
Date: Tue, 07 Dec 2004 23:11:42 +0000
Message-ID: <pan.2004.12.07.23.11.39.620549@consulting.net.nz>

Hi Bud Graziano,

> Yesterday we started learning about Common Lisp and CLOS. My professor
> said that Lisp is weak in I/O. Having a glance at
> http://www.lisp.org/HyperSpec/Body/sec_the_streams_dictionary.html
> reveals several stream functions and macros. Not to mention the file
> section of the CLHS and the pathnames data-structure. So it seems to
> me that Lisp is good at I/O. At least that's my impression. Can anyone
> comment it?

It has omissions in the area of binary streams. For example there appears
to be no portable way to construct a binary stream analogous to
MAKE-STRING-OUTPUT-STREAM.

One can almost get there by opening a file as an octet stream, writing to
the file, closing it and reading it back in. What one can't do is the
analogy of GET-OUTPUT-STREAM-STRING. Once you close the file to read it
back in you can't write to the same stream again.

This is how MAKE-STRING-OUTPUT-STREAM works:

* (defparameter *s* (make-string-output-stream))

*s*
* (write-char #\A *s*)

#\A
* (get-output-stream-string *s*)

"A"
* (write-string "BCDEF" *s*)

"BCDEF"
* (get-output-stream-string *s*)

"BCDEF"

That is you can write to the string stream and then get what has been
written so far without having to close the string stream.

If the Common Lisp IO API was complete there would be an analogous
MAKE-BINARY-OUTPUT-STREAM that accepted a binary element type such as
(UNSIGNED-BYTE 8). If anyone can implement this using portable code I'd
like to see it. I'd also like to learn about implementation-specific
solutions.

Regards,
Adam

From: Pascal Bourguignon
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 00:38:54 +0000
Message-ID: <87y8g9y675.fsf@thalassa.informatimago.com>

Adam Warner <······@consulting.net.nz> writes:

> Hi Bud Graziano,
> 
> > Yesterday we started learning about Common Lisp and CLOS. My professor
> > said that Lisp is weak in I/O. Having a glance at
> > http://www.lisp.org/HyperSpec/Body/sec_the_streams_dictionary.html
> > reveals several stream functions and macros. Not to mention the file
> > section of the CLHS and the pathnames data-structure. So it seems to
> > me that Lisp is good at I/O. At least that's my impression. Can anyone
> > comment it?
> 
> It has omissions in the area of binary streams. For example there appears
> to be no portable way to construct a binary stream analogous to
> MAKE-STRING-OUTPUT-STREAM.

You mean like:

;; ----------------------------------------------------------------------

(DEFGENERIC BVSTREAM-POSITION (SELF POSITION))
(DEFGENERIC BVSTREAM-WRITE-BYTE (SELF BYTE))
(DEFGENERIC BVSTREAM-READ-BYTE (SELF))


(DEFCLASS BVSTREAM-OUT ()
  ((BYTES :READER GET-BYTES
          :WRITER SET-BYTES
          :ACCESSOR BYTE-VECTOR
          :INITFORM (MAKE-ARRAY '(1024)
                                :ELEMENT-TYPE '(UNSIGNED-BYTE 8)
                                :ADJUSTABLE T
                                :FILL-POINTER 0)
          :INITARG :BYTES)));;BVSTREAM-OUT


(DEFMETHOD BVSTREAM-POSITION ((SELF BVSTREAM-OUT) POSITION)
  (IF POSITION
    (SETF (FILL-POINTER (BYTE-VECTOR SELF))
          (MIN (ARRAY-DIMENSION (BYTE-VECTOR SELF) 0) (MAX 0 POSITION)))
    (FILL-POINTER (BYTE-VECTOR SELF))));;BVSTREAM-POSITION


(DEFMETHOD BVSTREAM-WRITE-BYTE ((SELF BVSTREAM-OUT) (BYTE INTEGER))
  (VECTOR-PUSH-EXTEND (LDB (BYTE 8 0) BYTE) 
                      (BYTE-VECTOR SELF)
                      (ARRAY-DIMENSION (BYTE-VECTOR SELF) 0)))


(DEFMACRO WITH-OUTPUT-TO-BYTE-VECTOR ((VAR &OPTIONAL BYTE-VECTOR-FORM 
                                           &KEY ELEMENT-TYPE) &BODY BODY)
  (declare (ignore element-type)) ;; TODO: Remove this parameter!
  `(LET ((,VAR (MAKE-INSTANCE 'BVSTREAM-OUT
                 ,@(WHEN BYTE-VECTOR-FORM `(:BYTES ,BYTE-VECTOR-FORM)))))
     (LET ((,VAR ,VAR)) ,@BODY)
     (GET-BYTES ,VAR)));;WITH-OUTPUT-TO-BYTE-VECTOR


(DEFCLASS BVSTREAM-IN ()
  ((BYTES :READER GET-BYTES :WRITER SET-BYTES
          :ACCESSOR BYTE-VECTOR
          :INITARG :BYTES)
   (POSITION :READER GET-POSITION
             :ACCESSOR BIS-POSITION 
             :INITARG :POSITION :INITFORM 0)
   (END :INITARG :END :INITFORM NIL))
  );;BVSTREAM-IN


(DEFMETHOD INITIALIZE-INSTANCE ((SELF BVSTREAM-IN) &REST ARGS)
  (DECLARE (IGNORE ARGS))
  (CALL-NEXT-METHOD)
  (LET ((LEN (LENGTH (BYTE-VECTOR SELF))))
    (SETF (SLOT-VALUE SELF 'END) (IF (SLOT-VALUE SELF 'END)
                                   (MIN (SLOT-VALUE SELF 'END) LEN) LEN)
          (BIS-POSITION SELF) (MAX 0 (MIN (BIS-POSITION SELF) LEN))))
  SELF);;INITIALIZE-INSTANCE
                                                

(DEFMETHOD BVSTREAM-POSITION ((SELF BVSTREAM-IN) POSITION)
  (IF POSITION
    ;; Clamp to [0, END] so that seeking forward works too.
    (SETF (BIS-POSITION SELF) (MIN (SLOT-VALUE SELF 'END) (MAX 0 POSITION)))
    (BIS-POSITION SELF)))


(DEFMETHOD BVSTREAM-READ-BYTE ((SELF BVSTREAM-IN))
  (IF (< (BIS-POSITION SELF) (SLOT-VALUE SELF 'END))
    (PROG1 (AREF (GET-BYTES SELF) (BIS-POSITION SELF))
      (INCF (BIS-POSITION SELF)))
    :EOF));;BVSTREAM-READ-BYTE


(DEFMACRO WITH-INPUT-FROM-BYTE-VECTOR ((VAR BYTE-VECTOR &KEY INDEX START END)
                                       &BODY BODY)
  ;; The initargs are spliced in flat, as keyword/value pairs.
  `(LET ((,VAR (MAKE-INSTANCE 'BVSTREAM-IN :BYTES ,BYTE-VECTOR
                              ,@(WHEN START `(:POSITION ,START))
                              ,@(WHEN END   `(:END ,END)))))
     (LET ((,VAR ,VAR)) ,@BODY)
     ,(WHEN INDEX `(SETF ,INDEX (GET-POSITION ,VAR)))
     (GET-POSITION ,VAR)));;WITH-INPUT-FROM-BYTE-VECTOR

;; ----------------------------------------------------------------------

(DEFMACRO ENCODE-BYTES (ENCODE BYTES LINE-WIDTH NEW-LINE)
  `(WITH-OUTPUT-TO-STRING (OUT)
     (WITH-INPUT-FROM-BYTE-VECTOR (IN ,BYTES)
       (LET ((COLUMN 0)) 
         (,ENCODE
          ;; read-byte:
          (LAMBDA () (LET ((BYTE (BVSTREAM-READ-BYTE IN)))
                  (IF (EQ :EOF BYTE) NIL BYTE)))
          ;; write-char
          (IF ,LINE-WIDTH
            (LAMBDA (CH) 
              (WRITE-CHAR CH OUT)
              (INCF COLUMN)
              (WHEN (<= ,LINE-WIDTH COLUMN)
                (SETF COLUMN 0)
                (PRINC ,NEW-LINE OUT)))
            (LAMBDA (CH)
              (WRITE-CHAR CH OUT))))
         (WHEN (AND ,LINE-WIDTH (/= 0 COLUMN))
           (PRINC ,NEW-LINE OUT))))));;ENCODE-BYTES


(DEFMACRO DECODE-BYTES (DECODE ENCODED IGNORE-CRLF IGNORE-INVALID-INPUT)
  `(WITH-OUTPUT-TO-BYTE-VECTOR (OUT)
     (WITH-INPUT-FROM-STRING (IN ,ENCODED)
       (,DECODE
        ;; read-char
        (IF ,IGNORE-CRLF
          (LAMBDA () (DO ((CH (READ-CHAR IN NIL NIL)(READ-CHAR IN NIL NIL)))
                    ((OR (NULL CH) (NOT (MEMBER (CHAR-CODE CH) '(10 13)))) 
                     CH)))
          (LAMBDA () (READ-CHAR IN NIL NIL)))
        ;; write-byte
        (LAMBDA (BYTE) (BVSTREAM-WRITE-BYTE OUT BYTE))
        :IGNORE-INVALID-INPUT ,IGNORE-INVALID-INPUT))));;DECODE-BYTES

;; ----------------------------------------------------------------------


Note that the above class is not a Gray stream and has no relation
to CL streams. It's unfortunate that primitives like cl:read-byte,
cl:write-byte, and cl:file-position are not generic functions, but that
does not prevent anybody from portably writing
WITH-OUTPUT-TO-BYTE-VECTOR and WITH-INPUT-FROM-BYTE-VECTOR.  In any
case, you have to specify how you read and write types other than
(unsigned-byte 8).


A HOST-BYTE-SIZE constant in COMMON-LISP would have been useful to
avoid assuming 8 bit/byte...


> One can almost get there by opening a file as an octet stream, writing to
> the file, closing it and reading it back in. What one can't do is the
> analogy of GET-OUTPUT-STREAM-STRING. Once you close the file to read it
> back in you can't write to the same stream again.

Note that Common-Lisp does not specify how binary data is read or
written.  Indeed, this is a weakness: each implementation may choose
to use a different file format (byte-sex, header, padding, etc). That
means that applications that wish to do binary I/O MUST write their
own I/O anyway, using implementation-dependent primitives (for example
assuming the implementation's byte-sex and avoiding strange binary types
like BIT or non-word-multiples of UNSIGNED-BYTE or SIGNED-BYTE).
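
A minimal sketch of such application-level binary I/O follows.  It
assumes that an (unsigned-byte 8) file stream maps one element to one
file octet, which the standard does not guarantee but which holds on
common POSIX implementations.  WRITE-U32, READ-U32, and U32-ROUND-TRIP
are hypothetical names, and big-endian order is an arbitrary choice made
by the application rather than by the implementation.

```lisp
(defun write-u32 (value stream)
  "Write VALUE to an octet STREAM as four octets, most significant first."
  (loop for shift from 24 downto 0 by 8
        do (write-byte (ldb (byte 8 shift) value) stream))
  value)

(defun read-u32 (stream)
  "Read four octets from STREAM, most significant first."
  (let ((value 0))
    (dotimes (i 4 value)
      (setf value (logior (ash value 8) (read-byte stream))))))

(defun u32-round-trip (value path)
  "Write VALUE to the file PATH, then read it back and return it."
  (with-open-file (out path :direction :output
                            :element-type '(unsigned-byte 8)
                            :if-exists :supersede)
    (write-u32 value out))
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (read-u32 in)))
```

Because every octet is produced explicitly with LDB, the file layout no
longer depends on how the implementation would store a wider element
type.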


> If the Common Lisp IO API was complete there would be an analogous
> MAKE-BINARY-OUTPUT-STREAM that accepted a binary element type such as
> (UNSIGNED-BYTE 8). If anyone can implement this using portable code I'd
> like to see it. I'd also like to learn about implementation-specific
> solutions.

But then, you cannot use write(fd,&aFloat,sizeof(aFloat)) on non-POSIX
systems.  Try that on MVS!  Common-Lisp I/O API is designed to be
usable on a wider range of I/O systems.





So before specifying WITH-OUTPUT-TO-BYTE-VECTOR and
WITH-INPUT-FROM-BYTE-VECTOR, we would need to specify a standard
external format for binary I/O, and this could be done only on a per
system class basis. You'd have a binary I/O external format for POSIX
systems, a binary I/O external format for LispMachine systems, an I/O
external format for MVS systems or record based systems, another for
36-bit systems, etc.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
The world will now reboot; don't bother saving your artefacts.

From: Adam Warner
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 04:20:12 +0000
Message-ID: <pan.2004.12.08.04.20.10.235258@consulting.net.nz>

Hi Pascal Bourguignon,

>> It has omissions in the area of binary streams. For example there appears
>> to be no portable way to construct a binary stream analogous to
>> MAKE-STRING-OUTPUT-STREAM.
> 
> You mean like:

Thanks for the example but I certainly don't mean that.

(stream-element-type (make-instance 'bvstream-out))
=> debugger invoked on a TYPE-ERROR in thread 2494:
   The value #<BVSTREAM-OUT {90CF839}> is not of type STREAM.

(streamp (make-instance 'bvstream-out)) => NIL

This is the critical point: You have NOT constructed an object of type
STREAM let alone a STREAM of unsigned bytes.

(stream-element-type 
  (open "/tmp/filename" :direction :output
                        :element-type '(unsigned-byte 8)
                        :if-exists :supersede))
=> (UNSIGNED-BYTE 8)

The above is a binary stream. It's just not a suitable one as I have
already explained.

<http://www.lisp.org/HyperSpec/Body/syscla_built-in-class.html>
Not being able to make-instances of nor subclass built-in-classes can
be painful. I think a lot of us wish ANSI Common Lisp wasn't quite so
static.

Regards,
Adam

From: Christopher C. Stacy
Subject: Re: Lisp vs. I/O
Date: Wed, 08 Dec 2004 07:24:25 +0000
Message-ID: <ud5xlw8ut.fsf@news.dtpq.com>

Which Common Lisp implementations don't have the ability to 
support "simple streams" (or at least "gray streams")?

(I mean, which ones is it not possible to, at least,
load an existing library which gives you the ability
to create new streams?)
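
For illustration, here is what a Gray-stream version of an in-memory
octet output stream might look like.  This is an implementation-specific
sketch that assumes SBCL and its SB-GRAY package; other implementations
export similar classes under other package names.  OCTET-OUTPUT-STREAM
and GET-OUTPUT-STREAM-OCTETS are hypothetical names.

```lisp
;; SBCL-specific: relies on the SB-GRAY package (Gray streams).
(defclass octet-output-stream (sb-gray:fundamental-binary-output-stream)
  ((buffer :initform (make-array 16 :element-type '(unsigned-byte 8)
                                    :adjustable t :fill-pointer 0)
           :reader octet-buffer)))

;; CL:WRITE-BYTE dispatches to this method for Gray streams.
(defmethod sb-gray:stream-write-byte ((stream octet-output-stream) byte)
  (vector-push-extend byte (octet-buffer stream))
  byte)

(defmethod stream-element-type ((stream octet-output-stream))
  '(unsigned-byte 8))

(defun get-output-stream-octets (stream)
  "Return the octets written so far and reset the buffer,
in the manner of GET-OUTPUT-STREAM-STRING."
  (let ((octets (subseq (octet-buffer stream) 0)))
    (setf (fill-pointer (octet-buffer stream)) 0)
    octets))
```

An instance of this class satisfies STREAMP and accepts WRITE-BYTE, and
it can keep being written to after its contents have been fetched.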

From: Adam Warner
Subject: Re: Lisp vs. I/O
Date: Tue, 07 Dec 2004 23:53:47 +0000
Message-ID: <pan.2004.12.07.23.53.45.436180@consulting.net.nz>

> If the Common Lisp IO API was complete there would be an analogous
> MAKE-BINARY-OUTPUT-STREAM that accepted a binary element type such as
> (UNSIGNED-BYTE 8).

If it's not clear, the analogous object returned would be a simple array
of the same element type. That's easy to achieve so long as READ-BYTE
works upon the binary stream without having to close it first, as one
has to do with a file.

Regards,
Adam