Basic streams/formatting questions

From: rif
Subject: Basic streams/formatting questions
Date: Tue, 08 Jul 2003 15:27:18 +0000
Message-ID: <wj0fzlhkpih.fsf@five-percent-nation.mit.edu>

1.  I would like a format directive that gives me the full "string
    representation" of an arbitrary double-float, without the d0.  For
    instance, I have a double-precision float that displays in CMUCL
    as

0.012759357078948153d0

    and I want a format directive (or other method) to print.

0.012759357078948153d0

    The number of digits before/after the decimal point is *not* known
    in advance.

2.  What I actually have is a very large set of vectors of
    double-floats (each of length 100) that I want to print to a file
    in ASCII.  Is there anything that's going to be faster than
    calling format inside a dotimes loop for each vector?  This is not
    premature optimization --- the program is already written
    (although problem 1 above is not well-solved) and it's too slow
    right now.  (This is all in CMUCL, if that's relevant).

Cheers,

rif

From: Raymond Toy
Subject: Re: Basic streams/formatting questions
Date: Tue, 08 Jul 2003 15:57:17 +0000
Message-ID: <4n4r1x810i.fsf@edgedsp4.rtp.ericsson.se>

>>>>> "rif" == rif  <···@mit.edu> writes:

    rif> 1.  I would like a format directive that gives me the full "string
    rif>     representation" of an arbitrary double-float, without the d0.  For
    rif>     instance, I have a double-precision float that displays in CMUCL
    rif>     as

    rif> 0.012759357078948153d0

    rif>     and I want a format directive (or other method) to print.

    rif> 0.012759357078948153d0

Presumably, the "d0" here was not supposed to be printed.

    rif>     The number of digits before/after the decimal point is *not* known
    rif>     in advance.

What do you want if the number is 1d300?  1 followed by 300 zeroes?
What about 1d-300?  299 zeroes followed by 1?

"~w,df" comes close, but you need to select w and d carefully.

    rif> 2.  What I actually have is a very large set of vectors of
    rif>     double-floats (each of length 100) that I want to print to a file
    rif>     in ASCII.  Is there anything that's going to be faster than
    rif>     calling format inside a dotimes loop for each vector?  This is not

Not that I know of.

Ray
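
A minimal sketch of one option the thread does not mention: the bare ~F
directive.  With both w and d omitted, ~F prints in free format, never
emits an exponent marker, and adds no padding.  The standard does allow an
implementation to fall back to exponential notation when more than 100
digits would be needed, which covers the 1d300 case raised above.  The
exact digits printed for the value from the question depend on the
implementation's float printer.

```lisp
;; Bare ~F: free-format output, no "d0" suffix and no padding spaces.
(format nil "~F" 0.5d0)                    ; => "0.5"
(format nil "~F" 0.012759357078948153d0)   ; digits only, no exponent marker

;; ~,dF fixes the digits after the point; with w omitted there is
;; still no padding.
(format nil "~,3F" 3.14159d0)              ; => "3.142"
```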

From: rif
Subject: Re: Basic streams/formatting questions
Date: Tue, 08 Jul 2003 16:28:01 +0000
Message-ID: <wj0brw5kmpa.fsf@five-percent-nation.mit.edu>

>     rif>     and I want a format directive (or other method) to print.
> 
>     rif> 0.012759357078948153d0
> 
> Presumably, the "d0" here was not supposed to be printed.

Yes, my mistake.  Duh.

> 
>     rif>     The number of digits before/after the decimal point is *not* known
>     rif>     in advance.
> 
> What do you want if the number is 1d300?  1 followed by 300 zeroes?
> What about 1d-300?  299 zeroes followed by 1?
> 
> "~w,df" comes close, but you need to select w and d carefully.

I know that there will be between 0 and 10 digits to the left of the
decimal place.  I can get something reasonable with ~w,dF, but I also
have to accept a bunch of extra spaces in my file for this, yes?

>     rif> 2.  What I actually have is a very large set of vectors of
>     rif>     double-floats (each of length 100) that I want to print to a file
>     rif>     in ASCII.  Is there anything that's going to be faster than
>     rif>     calling format inside a dotimes loop for each vector?  This is not
> 
> Not that I know of.
> 

That is surprising and unfortunate (if true), because I need to do
this with some frequency.  This kind of thing is pretty fast (at a
rough guess, 10x faster?) in either C++ or perl, although I'm not sure
why.  Would it perhaps be faster to use a string-stream for sets of
100 numbers, and then flush them all to disk at once?  It seems that
the stream implementation is already doing some buffering, so I don't
see why that would help, but then again, I'm not clear why it's so
slow.

Cheers,

rif

From: Raymond Toy
Subject: Re: Basic streams/formatting questions
Date: Tue, 08 Jul 2003 18:21:17 +0000
Message-ID: <4nvfuc7uci.fsf@edgedsp4.rtp.ericsson.se>

>>>>> "rif" == rif  <···@mit.edu> writes:

    >> 
    >> What do you want if the number is 1d300?  1 followed by 300 zeroes?
    >> What about 1d-300?  299 zeroes followed by 1?
    >> 
    >> "~w,df" comes close, but you need to select w and d carefully.

    rif> I know that there will be between 0 and 10 digits to the left of the
    rif> decimal place.  I can get something reasonable with ~w,dF, but I also
    rif> have to accept a bunch of extra spaces in my file for this, yes?

How about ~G?  For numbers greater than 1, this probably does what you
want.  For numbers less than 1, it will eventually switch to
exponential notation if they're small enough.  Perhaps some
combination of ~w,dF and ~G would work.
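
One way such a combination could look, as a sketch only.  The helper name
FORMAT-DOUBLE and the magnitude thresholds are made up for illustration,
and it uses ~E rather than ~G for the out-of-range case so that the
exponent character can be forced to "e".

```lisp
;; Hypothetical helper; the name and the thresholds are illustrative.
(defun format-double (x)
  "Return X as a string with no d0 suffix: plain digits for moderate
magnitudes, e-notation otherwise."
  (if (or (zerop x) (<= 1d-3 (abs x) 1d10))
      ;; Free-format ~F: no exponent marker, no padding.
      (format nil "~F" x)
      ;; The seventh parameter of ~E is the exponent character.
      (format nil "~,,,,,,'eE" x)))
```

With this, (format-double 0.5d0) returns "0.5", and very large or very
small values come out in e-notation instead of with a d exponent.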

    rif> 2.  What I actually have is a very large set of vectors of
    rif> double-floats (each of length 100) that I want to print to a file
    rif> in ASCII.  Is there anything that's going to be faster than
    rif> calling format inside a dotimes loop for each vector?  This is not
    >> 
    >> Not that I know of.
    >> 

    rif> That is surprising and unfortunate (if true), because I need to do
    rif> this with some frequency.  This kind of thing is pretty fast (at a
    rif> rough guess, 10x faster?) in either C++ or perl, although I'm not sure
    rif> why.  Would it perhaps be faster to use a string-stream for sets of

Perhaps because CMUCL tries very hard to make sure the printed result
can be read back in to give exactly the same result.  CMUCL does a lot
of bignum arithmetic to make sure this happens, which probably
explains why it's so slow.

    rif> 100 numbers, and then flush them all to disk at once?  It seems that
    rif> the stream implementation is already doing some buffering, so I don't
    rif> see why that would help, but then again, I'm not clear why it's so
    rif> slow.

How about doing a foreign call to sprintf, and printing out the buffer
that sprintf filled with the desired number?

Ray

From: rif
Subject: Re: Basic streams/formatting questions
Date: Tue, 08 Jul 2003 19:03:10 +0000
Message-ID: <wj03chgyh75.fsf@five-percent-nation.mit.edu>

>     rif> 2.  What I actually have is a very large set of vectors of
>     rif> double-floats (each of length 100) that I want to print to a file
>     rif> in ASCII.  Is there anything that's going to be faster than
>     rif> calling format inside a dotimes loop for each vector?  This is not
>     >> 
>     >> Not that I know of.
>     >> 
> 
>     rif> That is surprising and unfortunate (if true), because I need to do
>     rif> this with some frequency.  This kind of thing is pretty fast (at a
>     rif> rough guess, 10x faster?) in either C++ or perl, although I'm not sure
>     rif> why.  Would it perhaps be faster to use a string-stream for sets of
> 
> Perhaps because CMUCL tries very hard to make sure the printed result
> can be read back in to give exactly the same result.  CMUCL does a lot
> of bignum arithmetic to make sure this happens, which probably
> explains why it's so slow.
> 
>     rif> 100 numbers, and then flush them all to disk at once?  It seems that
>     rif> the stream implementation is already doing some buffering, so I don't
>     rif> see why that would help, but then again, I'm not clear why it's so
>     rif> slow.
> 
> How about doing a foreign call to sprintf, and printing out the buffer
> that sprintf filled with the desired number?
> 
> Ray

This seems like it's worth investigating.  I'll try to look into this
approach soon.  Thanks!

Cheers,

rif

From: Marco Antoniotti
Subject: Re: Basic streams/formatting questions
Date: Thu, 10 Jul 2003 15:00:12 +0000
Message-ID: <3F0D7F7C.6010408@cs.nyu.edu>

Jan Rychter wrote:
> >>>>> "rif" == rif  <···@mit.edu> writes:
>
> [...]
>  rif> 2.  What I actually have is a very large set of vectors of
>  rif> double-floats (each of length 100) that I want to print to a file
>  rif> in ASCII.  Is there anything that's going to be faster than
>  rif> calling format inside a dotimes loop for each vector?  This is not
> 
> Raymond Toy:
>  >> Not that I know of.
> 
>  rif> That is surprising and unfortunate (if true), because I need to do
>  rif> this with some frequency.  This kind of thing is pretty fast (at a
>  rif> rough guess, 10x faster?) in either C++ or perl, although I'm not
>  rif> sure why.  Would it perhaps be faster to use a string-stream for
>  rif> sets of 100 numbers, and then flush them all to disk at once?  It
>  rif> seems that the stream implementation is already doing some
>  rif> buffering, so I don't see why that would help, but then again, I'm
>  rif> not clear why it's so slow.
> 
> You're not the only one bitten by this. I was also rather surprised to
> find out that format is very slow, and even more surprised to learn that
> there is no standard faster/lesser way to do I/O.
> 
> It seems what people need is a "simple I/O" library that would basically
> be a wrapper around fprintf/sprintf and friends. Preferably using UFFI,
> so that it can actually be reused. Hmm.

Or a standardized implementation of ACL SIMPLE-STREAMs?

Incidentally, I still need a loader/dumper of Matlab .mat files.


Cheers

--
Marco

From: Rene de Visser
Subject: Re: Basic streams/formatting questions
Date: Thu, 10 Jul 2003 15:35:45 +0000
Message-ID: <bek15t$7je$1@news1.wdf.sap-ag.de>

> 2.  What I actually have is a very large set of vectors of
>     double-floats (each of length 100) that I want to print to a file
>     in ASCII.  Is there anything that's going to be faster than
>     calling format inside a dotimes loop for each vector?  This is not
>     premature optimization --- the program is already written
>     (although problem 1 above is not well-solved) and it's too slow
>     right now.  (This is all in CMUCL, if that's relevant).

Can you create a compiled formatter object outside the loop and
use this inside the loop instead of format?

See the macro FORMATTER in the Hyperspec.

Rene.
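
A minimal sketch of this suggestion.  The function name
WRITE-DOUBLE-VECTOR and the "~F " control string are assumptions for
illustration.  Whether it helps in practice depends on where the time
goes: if most of it is spent in the float printer itself, hoisting the
control-string processing will not change much.

```lisp
(defun write-double-vector (vec stream)
  "Write the elements of VEC to STREAM on one line, separated by spaces."
  ;; FORMATTER converts the control string into a function at
  ;; macroexpansion time, so the string is not re-parsed per element.
  (let ((emit (formatter "~F ")))
    (dotimes (i (length vec))
      (funcall emit stream (aref vec i)))
    (terpri stream)))
```

For example, calling it on #(0.5d0 1.5d0) with a string output stream
produces the line "0.5 1.5 " followed by a newline.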

From: Kalle Olavi Niemitalo
Subject: Re: Basic streams/formatting questions
Date: Thu, 10 Jul 2003 17:35:51 +0000
Message-ID: <87brw2gu88.fsf@Astalo.kon.iki.fi>

"Rene de Visser" <··············@hotmail.de> writes:

> Can you create a compiled formatter object outside the loop and
> use this inside the loop instead of format?
>
> See the macro FORMATTER in the Hyperspec.

Would it be reasonable for the Lisp implementation to provide a
compiler macro like this:

  (define-compiler-macro format (&whole whole
                                 destination control-string &rest args)
    (if (stringp control-string)
        (case destination
          ((t) `(progn (funcall (formatter ,control-string)
                                *standard-output* ,@args)
                       nil))
          (otherwise `(format ,destination (formatter ,control-string)
                              ,@args)))
        whole))

Debian CMUCL 18e-4 doesn't, and I wonder why.
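
For user code, the standard leaves the consequences of defining a
compiler macro on a function in the COMMON-LISP package undefined, so the
same idea is sketched here on a wrapper instead.  FAST-FORMAT is a
hypothetical name, not something CMUCL provides.  It relies on FORMAT
accepting a function as its control argument.

```lisp
;; Hypothetical wrapper; behaves exactly like FORMAT when called normally.
(defun fast-format (destination control &rest args)
  (apply #'format destination control args))

;; When the control argument is a literal string, rewrite the call so the
;; string is processed by FORMATTER at compile time.
(define-compiler-macro fast-format (&whole whole
                                    destination control &rest args)
  (if (stringp control)
      `(format ,destination (formatter ,control) ,@args)
      whole))
```

A call such as (fast-format nil "~D-~D" 1 2) returns "1-2" whether or
not the compiler chooses to apply the compiler macro.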

From: Barry Margolin
Subject: Re: Basic streams/formatting questions
Date: Thu, 10 Jul 2003 17:51:46 +0000
Message-ID: <SIhPa.79$Vb2.47@news.level3.com>

In article <··············@Astalo.kon.iki.fi>,
Kalle Olavi Niemitalo  <···@iki.fi> wrote:
>Would it be reasonable for the Lisp implementation to provide a
>compiler macro like this:
>
>  (define-compiler-macro format (&whole whole
>                                 destination control-string &rest args)
>    (if (stringp control-string)
>        (case destination
>          ((t) `(progn (funcall (formatter ,control-string)
>                                *standard-output* ,@args)
>                       nil))
>          (otherwise `(format ,destination (formatter ,control-string)
>                              ,@args)))
>        whole))

That will call FORMATTER each time FORMAT would have been called, so it
doesn't really buy you much.  You need to pull the call to FORMATTER out of
the inner loop.

-- 
Barry Margolin, ··············@level3.com
Level(3), Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

From: Kalle Olavi Niemitalo
Subject: Re: Basic streams/formatting questions
Date: Thu, 10 Jul 2003 19:04:13 +0000
Message-ID: <87he5udx02.fsf@Astalo.kon.iki.fi>

Barry Margolin <··············@level3.com> writes:

> That will call FORMATTER each time FORMAT would have been called, so it
> doesn't really buy you much.  You need to pull the call to FORMATTER out of
> the inner loop.

Remember that FORMATTER is a macro.  CMUCL defines it as:

  (defmacro formatter (control-string)
    `#',(%formatter control-string))

If the lambda expression returned by FORMAT::%FORMATTER doesn't
close over any lexical bindings, which should be the case, then
the expansion ought to be compiled as a reference to a constant.

From: Alexey Dejneka
Subject: Re: Basic streams/formatting questions
Date: Thu, 10 Jul 2003 19:02:28 +0000
Message-ID: <m3fzleurwb.fsf@comail.ru>

Kalle Olavi Niemitalo <···@iki.fi> writes:

> "Rene de Visser" <··············@hotmail.de> writes:
> 
> > Can you create a compiled formatter object outside the loop and
> > use this inside the loop instead of format?
> >
> > See the macro FORMATTER in the Hyperspec.
> 
> Would it be reasonable for the Lisp implementation to provide a
> compiler macro like this:
> 
>   (define-compiler-macro format (&whole whole
>                                  destination control-string &rest args)
>     (if (stringp control-string)
>         (case destination
>           ((t) `(progn (funcall (formatter ,control-string)
>                                 *standard-output* ,@args)
>                        nil))
>           (otherwise `(format ,destination (formatter ,control-string)
>                               ,@args)))
>         whole))
> 
> Debian CMUCL 18e-4 doesn't, and I wonder why.

How did you check it? Did you try

(disassemble (compile nil '(lambda (x)
                            (declare (optimize (speed 3) (space 0)))
                            (format t "~D" x))))

-- 
Regards,
Alexey Dejneka

From: Kalle Olavi Niemitalo
Subject: Re: Basic streams/formatting questions
Date: Thu, 10 Jul 2003 19:11:15 +0000
Message-ID: <87el0ydwoc.fsf@Astalo.kon.iki.fi>

Alexey Dejneka <········@comail.ru> writes:

> How did you check it?

* (compiler-macro-function 'format)

NIL

So there is no compiler macro defined for the FORMAT function.

> Did you try
>
> (disassemble (compile nil '(lambda (x)
>                             (declare (optimize (speed 3) (space 0)))
>                             (format t "~D" x))))

No, I didn't.  This appears to call WRITE directly.  How does that work?

From: Raymond Toy
Subject: Re: Basic streams/formatting questions
Date: Thu, 10 Jul 2003 20:10:26 +0000
Message-ID: <4nsmpe5ej1.fsf@edgedsp4.rtp.ericsson.se>

>>>>> "Kalle" == Kalle Olavi Niemitalo <···@iki.fi> writes:

    Kalle> Alexey Dejneka <········@comail.ru> writes:
    >> How did you check it?

    Kalle> * (compiler-macro-function 'format)

    Kalle> NIL

    Kalle> So there is no compiler macro defined for the FORMAT function.

    >> Did you try
    >> 
    >> (disassemble (compile nil '(lambda (x)
    >>                             (declare (optimize (speed 3) (space 0)))
    >>                             (format t "~D" x))))

    Kalle> No, I didn't.  This appears to call WRITE directly.  How does that work?

Because there's a CMUCL-internal deftransform that transforms
functions into another form.  I think this is more powerful than
compiler macros because deftransforms give you access to everything
the compiler knows at the point of the call.

Ray

From: Kalle Olavi Niemitalo
Subject: CMUCL's deftransform (was: Basic streams/formatting questions)
Date: Thu, 10 Jul 2003 21:01:18 +0000
Message-ID: <87brw2drkx.fsf_-_@Astalo.kon.iki.fi>

Raymond Toy <···@rtp.ericsson.se> writes:

> Because there's a CMUCL-internal deftransform that transforms
> functions into another form.

Thank you for explaining this.  I must have been confusing
deftransforms with VOPs.

> I think this is more powerful than compiler macros because
> deftransforms give you access to everything the compiler knows
> at the point of the call.

Indeed, deftransform format (in compiler/srctran.lisp) can easily
check whether the destination parameter is known to evaluate to a
stream or T, whereas my compiler macro would only look for a
literal T.

I am somewhat surprised by the syntax though.  Instead of

  (deftransform format ((tee control &rest args)
                        ((member t) function &rest t)
                        ...)
    ...)

I'd have used the more CLOS-like

  (deftransform format (((tee (member t))
                         (control function)
                         &rest args)
                        ...)
    ...)

Of course, there would be no point in changing it now.

From: Raymond Toy
Subject: Re: CMUCL's deftransform (was: Basic streams/formatting questions)
Date: Thu, 10 Jul 2003 22:10:46 +0000
Message-ID: <4nk7aq58yh.fsf@edgedsp4.rtp.ericsson.se>

>>>>> "Kalle" == Kalle Olavi Niemitalo <···@iki.fi> writes:

    Kalle> I am somewhat surprised by the syntax though.  Instead of

    Kalle>   (deftransform format ((tee control &rest args)
    Kalle>                         ((member t) function &rest t)
    Kalle>                         ...)
    Kalle>     ...)

    Kalle> I'd have used the more CLOS-like

    Kalle>   (deftransform format (((tee (member t))
    Kalle>                          (control function)
    Kalle>                          &rest args)
    Kalle>                         ...)
    Kalle>     ...)

I assume that deftransforms predate CLOS.  But I don't really know
that; I'm just guessing.

Ray