From: David Wallis
Subject: Where did all the garbage come from?
Date: 
Message-ID: <417CEA78.810AA402@cpom.ucl.ac.uk>
I'm trying to understand why this function (rseq2) is so slow, and why
it's garbage collecting. What does it have to get rid of? A call to
make-array returns almost instantly. I'm running it in CMUCL. [Note:
setfq is just to stop setf printing the whole array. Is there a 'proper'
way to do this?]

; setf quiet
(defmacro setfq (x lst)
  `(progn (setf ,x ,lst) t))

; Makes an n x n array with rows ranging between a and b
(defun rseq2 (a b n &key (element-type 'double-float))
  (let ((r (make-array (list n n) :element-type element-type))
     (sp (/ (- b a) (1- n))))
        (dotimes (i n)
          (dotimes (j n)
             (setf (aref r i j) (+ (* i sp) a)))) r))

* (compile 'rseq2)
; Compiling LAMBDA (A B N &KEY (ELEMENT-TYPE #)):
; Compiling Top-Level Form:

RSEQ2
NIL
NIL
* (time (setfq x (rseq2 1d0 1000d0 1000)))
; Compiling LAMBDA NIL:
; Compiling Top-Level Form:
; [GC threshold exceeded with 25,240,024 bytes in use.  Commencing GC.]
; [GC completed with 16,241,552 bytes retained and 8,998,472 bytes
freed.]
; [GC will next occur when at least 28,241,552 bytes are in use.]
; [GC threshold exceeded with 28,241,936 bytes in use.  Commencing GC.]
; [GC completed with 16,241,192 bytes retained and 12,000,744 bytes
freed.]
; [GC will next occur when at least 28,241,192 bytes are in use.]
; [GC threshold exceeded with 28,241,936 bytes in use.  Commencing GC.]
; [GC completed with 16,241,208 bytes retained and 12,000,728 bytes
freed.]
; [GC will next occur when at least 28,241,208 bytes are in use.]

; Evaluation took:
;   1.64 seconds of real time
;   0.77 seconds of user run time
;   0.44 seconds of system run time
;   0 CPU cycles
;   [Run times include 0.54 seconds GC run time]
;   0 page faults and
;   40,008,312 bytes consed.
;
T

David

From: Raymond Toy
Subject: Re: Where did all the garbage come from?
Date: 
Message-ID: <sxdis8ysxyg.fsf@edgedsp4.rtp.ericsson.se>
>>>>> "David" == David Wallis <···@cpom.ucl.ac.uk> writes:

    David> I'm trying to understand why this function (rseq2) is so slow, and why
    David> it's garbage collecting. What does it have to get rid of? A call to
    David> make-array returns almost instantly. I'm running it in CMUCL. [Note:
    David> setfq is just to stop setf printing the whole array. Is there a 'proper'
    David> way to do this?]

    David> ; setf quiet
    David> (defmacro setfq (x lst)
    David>   `(progn (setf ,x ,lst) t))

    David> ; Makes an n x n array with rows ranging between a and b
    David> (defun rseq2 (a b n &key (element-type 'double-float))
    David>   (let ((r (make-array (list n n) :element-type element-type))
    David>      (sp (/ (- b a) (1- n))))
    David>         (dotimes (i n)
    David>           (dotimes (j n)
    David>              (setf (aref r i j) (+ (* i sp) a)))) r))


Because the compiler doesn't know anything about types of a, b, and n,
it has to use generic arithmetic to do all the computations.  Since it
doesn't know at compile-time what the type of the array r is, you
probably lose there too.

If it makes sense, declare a, b, and n with the correct types.

Ray
From: Pascal Bourguignon
Subject: Re: Where did all the garbage come from?
Date: 
Message-ID: <873c02vqg0.fsf@thalassa.informatimago.com>
David Wallis <···@cpom.ucl.ac.uk> writes:

> I'm trying to understand why this function (rseq2) is so slow, and why
> it's garbage collecting. What does it have to get rid of? A call to
> make-array returns almost instantly. I'm running it in CMUCL. [Note:
> setfq is just to stop setf printing the whole array. Is there a 'proper'
> way to do this?]
> 
> ; setf quiet
> (defmacro setfq (x lst)
>   `(progn (setf ,x ,lst) t))
> 
> ; Makes an n x n array with rows ranging between a and b
> (defun rseq2 (a b n &key (element-type 'double-float))
>   (let ((r (make-array (list n n) :element-type element-type))
>      (sp (/ (- b a) (1- n))))
>         (dotimes (i n)
>           (dotimes (j n)
>              (setf (aref r i j) (+ (* i sp) a)))) r))

It does not depend on the function, it depends on the data!

(defun f (n x) (let ((r 1)) (dotimes (i n r) (setf r (* r x)))))

        (f 40 0.6666) ==> almost no garbage.
        (f 40 2/3)    ==> 39 rationals in the garbage, some of them 
                          consisting of two bignums.

 
-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

Voting Democrat or Republican is like choosing a cabin in the Titanic.
From: Kalle Olavi Niemitalo
Subject: Re: Where did all the garbage come from?
Date: 
Message-ID: <87acuaalii.fsf@Astalo.kon.iki.fi>
David Wallis <···@cpom.ucl.ac.uk> writes:

> [Note: setfq is just to stop setf printing the whole array.
> Is there a 'proper' way to do this?]

Set *PRINT-ARRAY* to false.

If you want to prevent a form from returning an unwanted value,
then I think it is more customary to return no values than T.
From: Svein Ove Aas
Subject: Re: Where did all the garbage come from?
Date: 
Message-ID: <cll83k$rd9$2@services.kq.no>
Kalle Olavi Niemitalo wrote:

> David Wallis <···@cpom.ucl.ac.uk> writes:
> 
>> [Note: setfq is just to stop setf printing the whole array.
>> Is there a 'proper' way to do this?]
> 
> Set *PRINT-ARRAY* to false.
> 
> If you want to prevent a form from returning an unwanted value,
> then I think it is more customary to return no values than T.

Which is done by replacing the T with (values).
 
From: Adam Warner
Subject: Re: Where did all the garbage come from?
Date: 
Message-ID: <pan.2004.10.25.13.25.51.938369@consulting.net.nz>
Hi David Wallis,

> I'm trying to understand why this function (rseq2) is so slow, and why
> it's garbage collecting.

Add a (declare (optimize (speed 3))) declaration within the function and
use the compiler notes to start adding type declarations to eliminate
(some of) the generic arithmetic.

(declare (double-float a b) (fixnum n)) is probably a good start.

If you want to maintain the genericity of being able to input single or
double floats consider coercing a and b to double floats within the
function:

   (setf a (coerce a 'double-float))
   (setf b (coerce b 'double-float))

The compiler should then be able to deduce that a and b are subsequently
double floats. Unfortunately CMUCL appears to miss this obvious deduction.
SBCL doesn't.

   (defun rseq2 (a b n)
     (declare (optimize (speed 3)) (fixnum n))
     (setf a (coerce a 'double-float))
     (setf b (coerce b 'double-float))
     (let ((r (make-array (list n n) :element-type 'double-float))
           (sp (/ (- b a) (1- n))))
       (dotimes (i n)
         (dotimes (j n)
           (setf (aref r i j) (+ (* i sp) a)))) r))

In SBCL this only conses one-fifth as much and is much faster.

Regards,
Adam
From: Alan Crowe
Subject: Re: Where did all the garbage come from?
Date: 
Message-ID: <86oeipadph.fsf@cawtech.freeserve.co.uk>
David Wallis explained a detail of his code:
> [Note: setfq is just to stop setf printing the whole
> array. Is there a 'proper' way to do this?]

* (setf *print-array* nil) => NIL

* #(a b c) => #<(SIMPLE-VECTOR 3) {48539437}>

* (let ((*print-array* t))(print #(a b c)))
#(A B C) 
=> #<(SIMPLE-VECTOR 3) {48539B57}>

David WAllis also asked:
> I'm trying to understand why this function (rseq2) is so
> slow

Yup, its slow. On my old 166MHz Pentium using CMUCL 18d I
get:

(time (defparameter *a* (rseq2 1d0 1000d0 1000)))
Compiling LAMBDA NIL: 
Compiling Top-Level Form: 

Evaluation took:
  3.8 seconds of real time
  3.640304 seconds of user run time
  0.140172 seconds of system run time
  [Run times include 0.8 seconds GC run time]
  0 page faults and
  39973360 bytes consed.
*A*

After working on the code, putting it in a file, running the
file compiler and reading the efficiency notes

(declaim (optimize (speed 3)))
(defun rseq3 (a b n )
  (declare (type double-float a b))
  (declare (type (integer 2 30000) n))
  (let ((r (make-array (list n n) :element-type 'double-float))
	(sp (/ (- b a) (1- n))))
    (dotimes (i n)
      (declare (type (integer 0 30000) i ))
      (dotimes (j n)
	(declare (fixnum j))
	(setf (aref r i j) (+ (* i sp) a)))) r))

I get

(time (defparameter *a* (rseq3 1d0 1000d0 1000)))
Compiling LAMBDA NIL: 
Compiling Top-Level Form: 

Evaluation took:
  0.31 seconds of real time
  0.250095 seconds of user run time
  0.023496 seconds of system run time
  0 page faults and
  8000008 bytes consed.

8 million bytes is the minimum possible for a 1000x1000
array of double floats.

Things I learnt doing this.

1)Must be (defparameter *a* ...)
  not (defparameter a ...)

Explanation: defparameter a makes formal parameter a special
too, which obstructs optimization

                          V
2)(declare (type (integer 2 30000) n)) 
                          ^
Explanation: the optimiser noticed the potential division by
zero in 1/(n-1) when n=1 and was unhappy.

3) :element-type 'double-float

Explanation: CMUCL didn't do nearly as well when the type
was left until run time. I expected that. If you need
various types and maximum performance, I think your best
bet is to use a macro, so that the types get sorted out
before compilation. 

Warning: I'm not a compiler guy and don't actually know what
I'm doing. I just turn up SPEED to 3, watch the efficiency
notes sprout like mushrooms, and then I bodge in
declarations until they go away. Well, it gets me a fifteen
fold speed up.

Alan Crowe
Edinburgh
Scotland
From: Svein Ove Aas
Subject: Re: Where did all the garbage come from?
Date: 
Message-ID: <cllkhc$9ru$1@services.kq.no>
Alan Crowe wrote:

> Warning: I'm not a compiler guy and don't actually know what
> I'm doing. I just turn up SPEED to 3, watch the efficiency
> notes sprout like mushrooms, and then I bodge in
> declarations until they go away. Well, it gets me a fifteen
> fold speed up.
> 
Sounds familiar...

If you turn safety down as well (to 0, even), you'll get another speed
boost.