From: Gabe Garza
Subject: I/O optimization in CMUCL
Date: 
Message-ID: <FPMG6.29884$Jh5.26175626@news1.rdc1.sfba.home.com>
Hello,

    For a class, I'm writing a ray tracer; I've chosen CL as the
implementation language. 

    I'm having efficiency problems writing the output to a file.
For a 400x400 rendering of a sphere ray-traced with lambertian illumination,
it takes my machine (266 MHz PII, FreeBSD, CMUCL) about 3 seconds to
compute the image and 19 seconds(!) to write the ppm file as output.
I'm at a loss as to how to speed up the file-writing routine.

    I've read relevant sections of CMUCL's manual, the Deja News archive,
and CLtL2 and have a good handle on how to write fast numerical code,
but haven't really found much about how to improve something like this.

    I'd be very grateful for any advice on how to speed this up or pointers
to such advice.

Thanks,

Gabe Garza

Here is the function that writes the file:

(defun write-ppm-file (buffer file-name)
  (declare (optimize (speed 3) (compilation-speed 0)))
  (declare (type (array double-float (* * 3)) buffer))
  (declare (string file-name))
  (let
      ((file (open file-name :direction :output :if-exists :new-version)))
    (cond
     ((not (null file))
      (format file "P3") (terpri file)
      (format file "~D ~D" x-resolution y-resolution) (terpri file)
      (format file "255") (terpri file)
      (loop for i of-type fixnum from 0 to (1- x-resolution) do
	    (loop for j of-type fixnum from 0 to (1- y-resolution) do
		  (let ((red (values (round (* (aref buffer i j 0) 254.0))))
			(green (values (round (* (aref buffer i j 1) 254.0))))
			(blue  (values (round (* (aref buffer i j 2) 254.0)))))
		    (declare (type (integer 0 255) red green blue))
		    (declare (stream file))
		    (format file "~D ~D ~D" red green blue) (terpri file))))
      (close file))
     (t (format nil "Error: Unable to open file ~S for writing." file-name)))))

Here's the output of (profile:report-time) after ray-tracing the above sphere:

  Seconds  |  Consed     |  Calls  |  Sec/Call  |  Name:
------------------------------------------------------
    19.940 | 271,937,968 |       1 |   19.93998 | WRITE-PPM-FILE
     2.626 | 32,935,528  | 306,474 |    0.00001 | RAY-FROM-POINTS
     0.250 |         0   | 160,000 |    0.00000 | SPHERE-INTERSECTION
     0.099 | 7,721,760   |  73,237 |    0.00000 | RED-LAMBERT
     0.023 | 13,190,264  | 306,474 |    0.00000 | MAKE-RAY
     0.000 |         0   |  73,237 |    0.00000 | GET-COLOR
     0.000 | 4,463,192   |  73,237 |    0.00000 | SURFACE-ORIGIN
     0.000 | 14,010,568  | 366,185 |    0.00000 | INTERSECTION-POINT
     0.000 | 1,810,152   | 160,000 |    0.00000 | MIN-INTERSECTION
     0.000 |         0   | 160,000 |    0.00000 | GET-INTERSECTION
     0.000 | 13,186,944  |       1 |    0.00000 | RAY-TRACE
     0.000 | 4,755,272   | 233,237 |    0.00000 | SPHERE-MIN-ROOT
     0.000 |         0   |  73,237 |    0.00000 | SPHERE-COLOR
------------------------------------------------------
    22.938 | 364,011,648 | 1,985,320 |            | Total

Estimated total profiling overhead: 59.56 seconds

From: Raymond Wiker
Subject: Re: I/O optimization in CMUCL
Date: 
Message-ID: <86k8446v05.fsf@raw.grenland.fast.no>

        A couple suggestions: 

        - use a single loop in conjunction with row-major-aref instead
of your 2.5 nested loops (i, j, and the unrolled loop over the colour
components :-)

        - use (formatter "~D ~D ~D") instead of just "~D ~D ~D"

        No guarantees that this will help much, though...
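
A minimal sketch of these two suggestions taken together. WRITE-PIXELS
is a made-up name, the samples are assumed to lie in [0,1], and the
scale factor is 255 here rather than the 254.0 of the original post.
It has not been timed on CMUCL.

```lisp
;; Sketch only: WRITE-PIXELS is a hypothetical name, and every sample
;; in BUFFER is assumed to lie in [0,1].
(defun write-pixels (buffer stream)
  "Write BUFFER, an (n m 3) array of double-floats, as one `r g b' line per pixel."
  (declare (type (array double-float (* * 3)) buffer))
  (let ((emit (formatter "~D ~D ~D~%")))   ; control string processed once
    ;; ROW-MAJOR-AREF treats the array as a flat vector, so the three
    ;; components of one pixel sit at indices k, k+1 and k+2.
    (loop for k of-type fixnum from 0 below (array-total-size buffer) by 3
          do (funcall emit stream
                      (round (* (row-major-aref buffer k) 255))
                      (round (* (row-major-aref buffer (+ k 1)) 255))
                      (round (* (row-major-aref buffer (+ k 2)) 255))))))
```

The flat index visits the pixels in the same order as the nested i/j
loops, because the last array dimension varies fastest.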

        //Raymond.

-- 
Raymond Wiker
·············@fast.no
From: Christophe Rhodes
Subject: Re: I/O optimization in CMUCL
Date: 
Message-ID: <sqk843lo4e.fsf@lambda.jesus.cam.ac.uk>
Raymond Wiker <·············@fast.no> writes:

>         A couple suggestions: 
> 
>         - use a single loop in conjunction with row-major-aref instead
> of your 2.5 nested loops (i, j, and the unrolled loop over the colour
> components :-)
> 
>         - use (formatter "~D ~D ~D") instead of just "~D ~D ~D"
> 
>         No guarantees that this will help much, though...
> 
>         //Raymond.

A couple more suggestions:

* consider using WITH-OPEN-FILE for better style and error-checking;

* I would guess that accumulating a string (pre-made with MAKE-STRING;
  you know how big it needs to be, right?)  and then writing that to
  file with WRITE-SEQUENCE would be faster.

Again, I haven't tested these things.
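
A rough, equally untested sketch of both points. PIXELS-TO-STRING and
WRITE-PPM-TEXT are hypothetical names; the buffer is assumed to be a
(height width 3) array with samples in [0,1], scaled by 255. The digit
conversion still goes through PRINC-TO-STRING and therefore conses, so
this only shows the "one big string, one WRITE-SEQUENCE" part.

```lisp
;; Sketch only: names are made up; samples are assumed to lie in [0,1].
(defun pixels-to-string (buffer)
  "Return a pre-sized string holding the pixel lines of BUFFER, and the length used."
  (let* ((samples (array-total-size buffer))
         ;; at most 3 digits plus 1 separator per sample
         (text (make-string (* 4 samples)))
         (pos 0))
    (declare (type fixnum pos))
    (dotimes (k samples)
      (let ((digits (princ-to-string
                     (round (* (row-major-aref buffer k) 255)))))
        (replace text digits :start1 pos)
        (incf pos (length digits)))
      ;; newline after every third sample (end of a pixel), space otherwise
      (setf (schar text pos) (if (= (mod k 3) 2) #\Newline #\Space))
      (incf pos))
    (values text pos)))

(defun write-ppm-text (buffer path)
  "Write BUFFER to PATH as a plain (P3) PPM file."
  ;; WITH-OPEN-FILE closes the stream even if an error unwinds the stack.
  (with-open-file (out path :direction :output :if-exists :supersede)
    (format out "P3~%~D ~D~%255~%"
            (array-dimension buffer 1)    ; width  (assumed second index)
            (array-dimension buffer 0))   ; height (assumed first index)
    (multiple-value-bind (text end) (pixels-to-string buffer)
      (write-sequence text out :end end))))
```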

Cheers,

Christophe
-- 
Jesus College, Cambridge, CB5 8BL                           +44 1223 524 842
http://www-jcsu.jesus.cam.ac.uk/~csr21/                  (defun pling-dollar 
(str schar arg) (first (last +))) (make-dispatch-macro-character #\! t)
(set-dispatch-macro-character #\! #\$ #'pling-dollar)
From: Peter Van Eynde
Subject: Re: I/O optimization in CMUCL
Date: 
Message-ID: <86g0eqv86n.fsf@mustyr-host.hq.fitit.be>
A bit late I fear..

I've played a bit in the past with making I/O fast[1], so what I would do
is: 

(in addition to their hints)

Christophe Rhodes <·····@cam.ac.uk> writes:

> >         A couple suggestions: 
> > 
> >         - use a single loop in conjunction with row-major-aref instead
> > of your 2.5 nested loops (i, j, and the unrolled loop over the colour
> > components :-)
> > 
> >         - use (formatter "~D ~D ~D") instead of just "~D ~D ~D"

- pre-calculate a vector with the 256 possible strings, and 
  just write them out. This will save on the consing but also speed up
  the function in general as format is _very_ expensive.

> * consider using WITH-OPEN-FILE for better style and error-checking;
> 
> * I would guess that accumulating a string (pre-made with MAKE-STRING;
>   you know how big it needs to be, right?)  and then writing that to
>   file with WRITE-SEQUENCE would be faster.

- why do you have (values (round ....))?

- the (declare (string file)) is not needed.

- try to use a binary file format

- use write-sequence to write out the pre-calculated strings

- use print, princ prin1 to print constants, try to avoid format.

- use *var* for special variables

- use ARRAY-DIMENSIONS to get the x and y resolution! There is no need
  for these extra parameters.

- use documentation strings
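
A small sketch of the lookup-table hint from the list above, assuming
samples have already been scaled to integers in 0..255. The names
*SAMPLE-STRINGS* and WRITE-PIXEL are invented for the example.

```lisp
;; Sketch only: *SAMPLE-STRINGS* and WRITE-PIXEL are hypothetical names.
(defvar *sample-strings*
  (let ((table (make-array 256)))
    (dotimes (n 256 table)
      (setf (svref table n) (princ-to-string n))))
  "Decimal text for every possible sample value, built once at load time.")

(defun write-pixel (stream red green blue)
  "Write one `r g b' line using the precomputed strings; no FORMAT involved."
  (declare (type (integer 0 255) red green blue))
  (write-string (svref *sample-strings* red) stream)
  (write-char #\Space stream)
  (write-string (svref *sample-strings* green) stream)
  (write-char #\Space stream)
  (write-string (svref *sample-strings* blue) stream)
  (terpri stream))
```

The 256 strings are consed once, so the per-pixel work is three table
lookups and a handful of stream calls.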

All in all it was a lot better than some commercial code I've seen :-)

Groetjes, Peter

1: Wrote a Lisp-based OODB, speed was of the essence.

-- 
It's logic Jim, but not as we know it. | ········@debian.org
"God, root, what is difference?" - Pitr|
"God is more forgiving." - Dave Aronson| http://cvs2.cons.org/~pvaneynd/
From: Sam Steingold
Subject: Re: I/O optimization in CMUCL
Date: 
Message-ID: <u66fmp5r4.fsf@xchange.com>
> * In message <··············@mustyr-host.hq.fitit.be>
> * On the subject of "Re: I/O optimization in CMUCL"
> * Sent on 30 Apr 2001 20:27:17 +0200
> * Honorable Peter Van Eynde <········@debian.org> writes:
>
>   ... as format is _very_ expensive.
> - use print, princ prin1 to print constants, try to avoid format.

many people made this point.
why is format expensive?!
when the format string is constant,
        (format stream string args...)
should compile to
        (funcall #.(formatter string) stream args)
i.e.,
        (format t "~d" n)
should compile to something like
        (sys::%print-decimal n ...)
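
The rewrite described above can be imitated in user code. This is only
an illustration, not how CMUCL or any other implementation does it:
FORMAT-CONSTANT is a made-up macro, and it assumes STREAM evaluates to
a real stream rather than T or NIL.

```lisp
;; Sketch only: FORMAT-CONSTANT is a hypothetical macro.
(defmacro format-constant (stream control &rest args)
  "Like FORMAT, but hand a literal control string to FORMATTER at macroexpansion time."
  (if (stringp control)
      ;; literal string: FORMATTER turns it into a function once
      `(funcall (formatter ,control) ,stream ,@args)
      ;; anything else: fall back to ordinary FORMAT at run time
      `(format ,stream ,control ,@args)))
```

For example, (format-constant out "~D ~D" 1 2) expands into a FUNCALL
of the function that FORMATTER builds from "~D ~D".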

-- 
Sam Steingold (http://www.podval.org/~sds)
Only a fool has no doubts.
From: Geoff Summerhayes
Subject: Re: I/O optimization in CMUCL
Date: 
Message-ID: <terhjfsrmva092@corp.supernews.com>
"Sam Steingold" <···@gnu.org> wrote in message ··················@xchange.com...
> > * In message <··············@mustyr-host.hq.fitit.be>
> > * On the subject of "Re: I/O optimization in CMUCL"
> > * Sent on 30 Apr 2001 20:27:17 +0200
> > * Honorable Peter Van Eynde <········@debian.org> writes:
> >
> >   ... as format is _very_ expensive.
> > - use print, princ prin1 to print constants, try to avoid format.
>
> many people made this point.
> why is format expensive?!
> when the format string is constant,
>         (format stream string args...)
> should compile to
>         (funcall #.(formatter string) stream args)
> i.e.,
>         (format t "~d" n)
> should compile to something like
>         (sys::%print-decimal n ...)
>

For the same reason printf("%s",str); in C doesn't get altered into
puts(str); at compilation, implementing that kind of control in any
compiler is a pain in the butt. :-)
Even if it had a static format string there are no guarantees that the
optimization you suggest could be performed (consider ~:P, ~R, ~[, etc.),
it would take design resources better used somewhere else. Format is just
a handy, efficient (programming-time), inefficient (run-time) tool. Most
of the time that is all you need.
If the programmer needs speed, efficient functions for correct output
already exist and can be called directly.
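
To make the directives mentioned above concrete, here is a small
example (DESCRIBE-COUNT is a made-up name). What each directive prints
depends on the argument values at run time, so there is no single
output primitive the call could be reduced to.

```lisp
;; Sketch only: DESCRIBE-COUNT is a hypothetical name.
(defun describe-count (n)
  "Spell out N, pluralise a noun to match it, and pick a clause by number."
  ;; ~R spells the number, ~:P backs up one argument and adds an \"s\"
  ;; unless that argument is 1, ~[ selects a clause by the next argument.
  (format nil "~R file~:P, case ~[zero~;one~;other~]" n (min n 2)))
```

Here (describe-count 1) gives "one file, case one" and
(describe-count 2) gives "two files, case other".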

Geoff
From: Stig E. Sandoe
Subject: Re: I/O optimization in CMUCL
Date: 
Message-ID: <87sniqqf5i.fsf@palomba.bananos.org>
"Geoff Summerhayes" <·············@hNoOtSmPaAiMl.com> writes:

> "Sam Steingold" <···@gnu.org> wrote in message ··················@xchange.com...
> > > * In message <··············@mustyr-host.hq.fitit.be>
> > > * On the subject of "Re: I/O optimization in CMUCL"
> > > * Sent on 30 Apr 2001 20:27:17 +0200
> > > * Honorable Peter Van Eynde <········@debian.org> writes:
> > >
> > >   ... as format is _very_ expensive.
> > > - use print, princ prin1 to print constants, try to avoid format.
> >
> > many people made this point.
> > why is format expensive?!
> > when the format string is constant,
> >         (format stream string args...)
> > should compile to
> >         (funcall #.(formatter string) stream args)
> > i.e.,
> >         (format t "~d" n)
> > should compile to something like
> >         (sys::%print-decimal n ...)
> >
> 
> For the same reason printf("%s",str); in C doesn't get altered into
> puts(str); at compilation, implementing that kind of control in any
> compiler is a pain in the butt. :-)

Not to forget that printf("%s",str); isn't equivalent to puts(str);
That alone should stop such "optimisations". 

-- 
------------------------------------------------------------------
Stig Erik Sandoe     ····@ii.uib.no    http://www.ii.uib.no/~stig/
From: Geoff Summerhayes
Subject: Re: I/O optimization in CMUCL
Date: 
Message-ID: <terjukaolp7g70@corp.supernews.com>
"Stig E. Sandoe" <····@ii.uib.no> wrote in message ···················@palomba.bananos.org...
> "Geoff Summerhayes" <·············@hNoOtSmPaAiMl.com> writes:
>
> >
> > For the same reason printf("%s",str); in C doesn't get altered into
> > puts(str); at compilation, implementing that kind of control in any
> > compiler is a pain in the butt. :-)
>
> Not to forget that printf("%s",str); isn't equivalent to puts(str);
> That alone should stop such "optimisations".

lol, gri\n

Geoff
From: Frode Vatvedt Fjeld
Subject: Re: I/O optimization in CMUCL
Date: 
Message-ID: <2h3daq16zi.fsf@dslab7.cs.uit.no>
Sam Steingold <···@gnu.org> writes:

> when the format string is constant,
>         (format stream string args...)
> should compile to
>         (funcall #.(formatter string) stream args)

At least ACL appears to do some kind of pre-compilation of constant
formatting. That doesn't prevent it from consing quite a bit, though.

-- 
Frode Vatvedt Fjeld
From: Jochen Schmidt
Subject: Re: I/O optimization in CMUCL
Date: 
Message-ID: <9cgule$dpgla$1@ID-22205.news.dfncis.de>

FORMAT is a rather expensive I/O operation. If possible you should use
functions like WRITE-SEQUENCE, WRITE-CHAR, WRITE-BYTE.
I've not used FORMATTER yet (which Raymond Wiker suggested), but it may be
a good solution...

Regards,
Jochen Schmidt
From: Frode Vatvedt Fjeld
Subject: Re: I/O optimization in CMUCL
Date: 
Message-ID: <2hbspfpuzc.fsf@dslab7.cs.uit.no>
Two things: Do whatever your implementation requires to inline the
AREF and scaling operations, so you don't cons one (or even more)
double-floats for each color component. And consider hand-writing the
formatting.

Style issues: Use WITH-OPEN-FILE instead of OPEN, use "~%" rather than
TERPRI when you're doing a FORMAT anyway, merge several FORMATs into
one, use *x-resolution* and *y-resolution* (if those are indeed
special/free variables), and your use of VALUES is redundant given any
half-decent compiler.
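
A sketch of what this could look like; all three function names are
invented for the example. It assumes samples lie in [0,1] and that
type declarations are what the implementation needs to open-code AREF
and the multiplication, which has not been checked on CMUCL.

```lisp
;; Sketch only: WRITE-HEADER, SCALED-SAMPLE and WRITE-SAMPLE are
;; hypothetical names.
(defun write-header (stream width height)
  "Write the plain-PPM header with a single FORMAT call, using ~% for newlines."
  (format stream "P3~%~D ~D~%255~%" width height))

(defun scaled-sample (buffer i j c)
  "Return component C of pixel (I, J) as an integer in 0..255."
  (declare (optimize (speed 3))
           (type (simple-array double-float (* * 3)) buffer)
           (type fixnum i j c))
  ;; With the array and index types declared, a compiler has a chance
  ;; to keep the double-float unboxed. THE is a promise to the
  ;; compiler, valid only if the sample really is in [0,1].
  (the (integer 0 255) (round (* (aref buffer i j c) 255d0))))

(defun write-sample (stream n)
  "Write N (0..255) in decimal using nothing but WRITE-CHAR."
  (declare (type (integer 0 255) n))
  (when (>= n 100)
    (write-char (digit-char (floor n 100)) stream))          ; hundreds
  (when (>= n 10)
    (write-char (digit-char (mod (floor n 10) 10)) stream))  ; tens
  (write-char (digit-char (mod n 10)) stream))               ; units
```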

-- 
Frode Vatvedt Fjeld
From: Daniel Barlow
Subject: Re: I/O optimization in CMUCL
Date: 
Message-ID: <87n18zeddt.fsf@noetbook.telent.net>
Gabe Garza <······@kynopolis.org> writes:

> 		    (format file "~D ~D ~D" red green blue) (terpri file))))

1) As others have pointed out, try rewriting without use of FORMAT 

2) Have you considered using "raw" ppm format instead of "plain"?  

       - A  raster  of  Width * Height pixels, proceeding through
         the image in normal English reading order.   Each  pixel
         is  a  triplet  of red, green, and blue samples, in that
         order.  Each sample is represented  in  pure  binary  by
         either 1 or 2 bytes.  If the Maxval is less than 256, it
         is 1 byte.  Otherwise, it is 2 bytes.  The most significant
         byte is first.

(see the ppm(5) manual page for the rest)

You'd end up with significantly smaller files as well, I expect.
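
A minimal sketch of a raw ("P6") writer, one byte per sample.
PPM-RAW-OCTETS and WRITE-PPM-RAW are hypothetical names; the buffer is
assumed to be a (height width 3) array with samples in [0,1], and
CHAR-CODE is assumed to return ASCII codes for the header characters.

```lisp
;; Sketch only: names are made up; samples are assumed to lie in [0,1].
(defun ppm-raw-octets (buffer)
  "Return a vector of octets holding a raw (P6) PPM image for BUFFER."
  (let* ((height (array-dimension buffer 0))
         (width (array-dimension buffer 1))
         ;; the header stays text; only the raster is binary
         (header (map '(vector (unsigned-byte 8)) #'char-code
                      (format nil "P6~%~D ~D~%255~%" width height)))
         (samples (array-total-size buffer))
         (octets (make-array (+ (length header) samples)
                             :element-type '(unsigned-byte 8))))
    (replace octets header)
    (dotimes (k samples octets)
      (setf (aref octets (+ (length header) k))
            (round (* (row-major-aref buffer k) 255))))))

(defun write-ppm-raw (buffer path)
  "Write BUFFER to PATH as a raw PPM file with a single WRITE-SEQUENCE."
  (with-open-file (out path :direction :output
                            :element-type '(unsigned-byte 8)
                            :if-exists :supersede)
    (write-sequence (ppm-raw-octets buffer) out)))
```

Each sample takes exactly one byte here, against up to four characters
(three digits and a separator) in the plain format.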


-dan

-- 

  http://ww.telent.net/cliki/ - Link farm for free CL-on-Unix resources