From: ·····@uncommon-sense.net
Subject: fast io of numbers in ascii
Date: 
Message-ID: <1180188643.360363.173690@k79g2000hse.googlegroups.com>
Hi,

I have a program calculating eigenvalues of large matrices, and I want
to write out the matrices and eigenvectors directly in a postscript
format.

Apart from headers and page breaks, the output is basically a long
list of the form

DOUBLE-FLOAT DOUBLE-FLOAT DOUBLE-FLOAT DOUBLE-FLOAT arrow
DOUBLE-FLOAT DOUBLE-FLOAT DOUBLE-FLOAT DOUBLE-FLOAT arrow
DOUBLE-FLOAT DOUBLE-FLOAT DOUBLE-FLOAT DOUBLE-FLOAT arrow
DOUBLE-FLOAT DOUBLE-FLOAT DOUBLE-FLOAT DOUBLE-FLOAT arrow
...

the resulting files get large very quickly (easily 10MB to 100MB), and
run time is dominated completely by writing the files.

I wondered a bit about this, and tested the following in CMUCL:

;;
;; file: io-test.lisp
;;
(declaim (optimize (speed 3) (safety 0)))

(defun test-write1 ()
  (with-open-file (s "/tmp/test1" :direction :output :if-
exists :supersede)
    (let ((m 1000000))
      (declare (type fixnum m))
      (dotimes (i m)
        (declare (type fixnum i))
        (format s "~d~%" i)))
    nil))

CL-USER> (compile-file "io-test")
CL-USER> (load "io-test")
CL-USER> (time (test-write1))
; Compiling LAMBDA
NIL:
; Compiling Top-Level
Form:

; Evaluation
took:
;   5.83f0 seconds of real
time
;   3.936402f0 seconds of user run
time
;   0.507923f0 seconds of system run
time
;   11,669,926,800 CPU
cycles
;   [Run times include 1.27f0 seconds GC run
time]
;   0 page faults
and
;   584,963,680 bytes
consed.
;
NIL

Now, the C equivalent (?) I wrote is:

/* file: test-write.c */
#include <stdio.h>

int main()
{
  FILE* f = fopen("/tmp/testC", "w");
  int m = 1000000;
  int i;
  for(i = 0; i < m; ++i)
    fprintf(f, "%d\n", i);
  fclose(f);
}

$ gcc test-write.c
$ time ./test-write

real    0m0.190s
user    0m0.182s
sys     0m0.007s

This is a factor of about 30 faster.  I want to write out numbers this
fast, but in Lisp.

I also tried 3 other versions appended below (compiled with (speed 3)
(safety 0)), but they all gave nearly identical results.  It's not
only the runtime, but also the consing which I find excessive
(although I don't know how much the C version mallocs).

Since I actually want to output double-floats, not integers, I would
also appreciate an idea on how to write out long lists of floats fast.

Thanks for any help on this.

Regards,
Boris

----------------------------------------------------------------------

(defun test-write2 ()
  (with-open-file (s "/tmp/test2" :direction :output :if-
exists :supersede)
    (let ((m 1000000))
      (declare (type fixnum m))
      (dotimes (i m)
        (declare (type fixnum i))
        (prin1 i s)
        (terpri s)))
    nil))

(defun test-write3 ()
  (let* ((n 10000000)
         (m 1000000)
         (buf (make-array n :element-type 'base-char :fill-pointer
0)))
    (declare (type fixnum n m))
    (with-open-file (s "/tmp/test3" :direction :output :if-
exists :supersede)
      (with-output-to-string (ss buf)
        (dotimes (i m)
          (declare (type fixnum i))
          (format ss "~d~%" i)))
      (write-sequence buf s))
    nil))

(defun test-write4 ()
  (let* ((n 10000000)
         (m 1000000)
         (buf (make-array n :element-type 'base-char :fill-pointer
0)))
    (declare (type fixnum n m))
    (with-open-file (s "/tmp/test4" :direction :output :if-
exists :supersede)
      (with-output-to-string (ss buf)
        (dotimes (i m)
          (declare (type fixnum i))
          (prin1 i ss)
          (terpri ss)))
      (write-sequence buf s))
    nil))

CL-USER> (time (test-write2))
; Compiling LAMBDA
NIL:
; Compiling Top-Level
Form:

; Evaluation
took:
;   5.55f0 seconds of real
time
;   3.457474f0 seconds of user run
time
;   0.477927f0 seconds of system run
time
;   11,114,732,130 CPU
cycles
;   [Run times include 1.22f0 seconds GC run
time]
;   0 page faults
and
;   584,993,120 bytes
consed.
;
NIL
CL-USER> (time (test-write3))
; Compiling LAMBDA
NIL:
; Compiling Top-Level
Form:

; Evaluation
took:
;   6.86f0 seconds of real
time
;   4.6013f0 seconds of user run
time
;   0.502924f0 seconds of system run
time
;   13,732,357,600 CPU
cycles
;   [Run times include 1.3f0 seconds GC run
time]
;   0 page faults
and
;   602,980,328 bytes
consed.
;
NIL
CL-USER> (time (test-write4))
; Compiling LAMBDA
NIL:
; Compiling Top-Level
Form:

; Evaluation
took:
;   6.41f0 seconds of real
time
;   4.176365f0 seconds of user run
time
;   0.481926f0 seconds of system run
time
;   12,838,301,630 CPU
cycles
;   [Run times include 1.23f0 seconds GC run
time]
;   0 page faults
and
;   603,038,512 bytes
consed.
;
NIL

From: Matthias Benkard
Subject: Re: fast io of numbers in ascii
Date: 
Message-ID: <1180199371.246062.220950@q66g2000hsg.googlegroups.com>
Hi,

> (defun test-write2 ()
>   (with-open-file (s "/tmp/test2" :direction :output :if-
> exists :supersede)
>     (let ((m 1000000))
>       (declare (type fixnum m))
>       (dotimes (i m)
>         (declare (type fixnum i))
>         (prin1 i s)
>         (terpri s)))
>     nil))

I don't all that much about optimising code for speed, but...

1. You could try changing (LET ((M ...)) ...) into (LET ((M ...)
(*PRINT-PRETTY* NIL)) ...).  On my system, this reduces runtime from
around four seconds to less than a second (using PRIN1, not FORMAT).

2. I don't think CMUCL is dumb enough that it makes DECLAREing those
FIXNUMs there worthwile.

Bye-bye,
Matthias
From: ·····@uncommon-sense.net
Subject: Re: fast io of numbers in ascii
Date: 
Message-ID: <1180202627.504717.299110@q66g2000hsg.googlegroups.com>
On 26 Mai, 19:09, Matthias Benkard <··········@gmail.com> wrote:
> I don't all that much about optimising code for speed, but...
>
> 1. You could try changing (LET ((M ...)) ...) into (LET ((M ...)
> (*PRINT-PRETTY* NIL)) ...).  On my system, this reduces runtime from
> around four seconds to less than a second (using PRIN1, not FORMAT).

This helps much, thanks.  However, for printing floats the effect is
not nearly as big as for integers.  See my John Thingstad's reply and
my reply to him.

> 2. I don't think CMUCL is dumb enough that it makes DECLAREing those
> FIXNUMs there worthwile.

Yes.  They don't really change anything.

Thanks,
Boris
From: John Thingstad
Subject: Re: fast io of numbers in ascii
Date: 
Message-ID: <op.tsxx1pvepqzri1@pandora.upc.no>
On Sat, 26 May 2007 16:10:43 +0200, <·····@uncommon-sense.net> wrote:

> Hi,
>
> I have a program calculating eigenvalues of large matrices, and I want
> to write out the matrices and eigenvectors directly in a postscript
> format.
>
> Apart from headers and page breaks, the output is basically a long
> list of the form
>
> DOUBLE-FLOAT DOUBLE-FLOAT DOUBLE-FLOAT DOUBLE-FLOAT arrow
> DOUBLE-FLOAT DOUBLE-FLOAT DOUBLE-FLOAT DOUBLE-FLOAT arrow
> DOUBLE-FLOAT DOUBLE-FLOAT DOUBLE-FLOAT DOUBLE-FLOAT arrow
> DOUBLE-FLOAT DOUBLE-FLOAT DOUBLE-FLOAT DOUBLE-FLOAT arrow
> ...
>
> the resulting files get large very quickly (easily 10MB to 100MB), and
> run time is dominated completely by writing the files.
>
> I wondered a bit about this, and tested the following in CMUCL:
>
> ;;
> ;; file: io-test.lisp
> ;;
> (declaim (optimize (speed 3) (safety 0)))
>
> (defun test-write1 ()
>   (with-open-file (s "/tmp/test1" :direction :output :if-
> exists :supersede)
>     (let ((m 1000000))
>       (declare (type fixnum m))
>       (dotimes (i m)
>         (declare (type fixnum i))
>         (format s "~d~%" i)))
>     nil))
>

Well for one thing the integer optimation won't help much so
don't clutter up the code with it.
Secondly make sure to turn off pretty printing.
so set *print-petty* to nil.

(defun test-write ()
   (let ((*print-pretty* nil))
     (with-open-file (s "test" :direction :output :if-exists :supersede)
       (dotimes (i 1000000)
         (print i s))
       nil)))

CL-USER 9 > (time (test-write))
Timing the evaluation of (TEST-WRITE)

User time    =        1.781
System time  =        0.046
Elapsed time =        1.829
Allocation   = 7236 bytes
0 Page faults
NIL

(17 seconds and 12 Mb allocated with *print-pretty* to t.)

That is about as fast as it is going to get I guess.
2 Seconds is fast enough anyhow.

print works for double-float as well.

(defun test-write ()
   (let ((*print-pretty* nil))
     (with-open-file (s "test" :direction :output :if-exists :supersede)
       (loop repeat 1000000 do
             (print (random 1.0) s))
       nil)))

CL-USER 12 > (time (test-write))
Timing the evaluation of (TEST-WRITE)

User time    =       11.531
System time  =        0.046
Elapsed time =       11.812
Allocation   = 81378808 bytes
0 Page faults
NIL

So not terribly fast.
Lots of allocation..

(time on Intel 820 with 1 Gb RAM and SATA drive)

-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
From: ·····@uncommon-sense.net
Subject: Re: fast io of numbers in ascii
Date: 
Message-ID: <1180201263.424412.41010@m36g2000hse.googlegroups.com>
On 26 Mai, 17:45, "John Thingstad" <··············@chello.no> wrote:
> Secondly make sure to turn off pretty printing.
> so set *print-petty* to nil.

Many thanks.  On my system this reduces runtime to 0.8 seconds and
almost no consing.  This is excellent.  However, as you already noted,
printing double-floats is still slow and conses a lot.

On my system:

(defun test-write-float1 ()
  (with-open-file (s "/tmp/test1" :direction :output :if-
exists :supersede)
    (let ((*print-pretty* nil))
      (dotimes (i 1000000)
        (print (random 1.0) s)))
    nil))

(time (test-write-float1))

; Evaluation
took:
;   8.83 seconds of real
time
;   7.96079 seconds of user run
time
;   0.215967 seconds of system run
time
;   17,681,512,720 CPU
cycles
;   [Run times include 0.42 seconds GC run
time]
;   0 page faults
and
;   242,969,504 bytes consed.

With *print-pretty* set to t this takes 17s instead of 8.83s and it
conses about three times as much.This is much better.  However, the C
version below runs in only 0.7 seconds.

#include <stdio.h>
#include <stdlib.h>

int main()
{
  FILE* f = fopen("/tmp/testC", "w");
  int m = 1000000;
  int i;
  for(i = 0; i < m; ++i)
  {
    double d = ((double)rand()) / (RAND_MAX+1.0);
    fprintf(f, "%lf\n", d);
  }
  fclose(f);
}

Regards,
Boris
From: Alexander Schmolck
Subject: Re: fast io of numbers in ascii
Date: 
Message-ID: <yfsy7jas71s.fsf@gmail.com>
·····@uncommon-sense.net writes:

> On 26 Mai, 17:45, "John Thingstad" <··············@chello.no> wrote:
>> Secondly make sure to turn off pretty printing.
>> so set *print-petty* to nil.
>
> Many thanks.  On my system this reduces runtime to 0.8 seconds and
> almost no consing.  This is excellent.  However, as you already noted,
> printing double-floats is still slow and conses a lot.

Random uninformed guess: printing double-floats in cmucl might involve
bignums, which would explain the consing and presumably doesn't help (BTW
how's the speed if you only print 0s?), especially given that cmucl at least
used to have a pretty bad bignum implementation -- I also seem to remember
that CMUCL was none too sophisticated about reading and printing floats, but I
might be wrong -- have a look at the code and if it seems you'd be better of
rolling your own maybe this paper will be useful (also contains code in
scheme):

<http://citeseer.ist.psu.edu/rd/3022765%2C28233%2C1%2C0.25%2CDownload/http://citeseer.ist.psu.edu/cache/papers/cs/2654/http:zSzzSzwww.cs.indiana.eduzSz%7EburgerzSzFP-Printing-PLDI96.pdf/printing-floating-point-numbers.pdf>

I'm sure the cmucl developers would welcome a patch if it indeed turns out
that cmucl's float printing code is suboptimal.

'as
From: ·····@uncommon-sense.net
Subject: Re: fast io of numbers in ascii
Date: 
Message-ID: <1180429483.250836.157850@p77g2000hsh.googlegroups.com>
On 27 Mai, 15:57, Alexander Schmolck <··········@gmail.com> wrote:
>
> Random uninformed guess: printing double-floats in cmucl might involve
> bignums, which would explain the consing and presumably doesn't help (BTW
> how's the speed if you only print 0s?), especially given that cmucl at least
> used to have a pretty bad bignum implementation --

For doing (print 0d0) and (print 0) one million times, the results
are:

double-float:
  3.81 seconds of real time
  184,183,488 bytes consed.

integer:
  0.3 seconds of real time
  2,360 bytes consed.

> I also seem to remember
> that CMUCL was none too sophisticated about reading and printing floats, but I
> might be wrong -- have a look at the code and if it seems you'd be better of
> rolling your own maybe this paper will be useful (also contains code in
> scheme):
>
> <http://citeseer.ist.psu.edu/rd/3022765%2C28233%2C1%2C0.25%2CDownload/...>
>
> I'm sure the cmucl developers would welcome a patch if it indeed turns out
> that cmucl's float printing code is suboptimal.

Thanks for the pointer, I'll have a look at it.

Regards,
Boris
From: Alex Mizrahi
Subject: Re: fast io of numbers in ascii
Date: 
Message-ID: <465855dd$0$90270$14726298@news.sunsite.dk>
(message (Hello ······@uncommon-sense.net)
(you :wrote  :on '(26 May 2007 07:10:43 -0700))
(

 b> I have a program calculating eigenvalues of large matrices, and I want
 b> to write out the matrices and eigenvectors directly in a postscript
 b> format.

 b>         (format s "~d~%" i)))

do not do format this way -- it needs to parse it's arguments each time. 
either use (format s (formatter "~d~%") i) -- formatter will create a 
compiled optimized function for formatting. or better use PRINC function.

)
(With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
"I am everything you want and I am everything you need") 
From: Edi Weitz
Subject: Re: fast io of numbers in ascii
Date: 
Message-ID: <ubqg7y3ad.fsf@agharta.de>
On Sat, 26 May 2007 18:44:21 +0300, "Alex Mizrahi" <········@users.sourceforge.net> wrote:

>  b>         (format s "~d~%" i)))
>
> do not do format this way -- it needs to parse it's arguments each
> time.

Really?  Shouldn't a sufficiently smart compiler be able to figure out
that the format string is constant and (using a compiler macro) act
accordingly?

-- 

Lisp is not dead, it just smells funny.

Real email: (replace (subseq ·········@agharta.de" 5) "edi")
From: Kalle Olavi Niemitalo
Subject: Re: fast io of numbers in ascii
Date: 
Message-ID: <874plzo78f.fsf@Astalo.kon.iki.fi>
Edi Weitz <········@agharta.de> writes:

> Shouldn't a sufficiently smart compiler be able to figure out
> that the format string is constant and (using a compiler macro)
> act accordingly?

FWIW, (compiler-macro-function 'cl:format) returns NIL in SBCL
0.9.5.50, but the compiler generates calls to WRITE and TERPRI
anyway, if suitable optimization qualities have been declared;
this has been implemented with (deftransform format ...) forms
in src/compiler/srctran.lisp.

(I'm aware there are newer versions.)