From: Nils Goesche
Subject: Newbie consing to hell
Date: 
Message-ID: <86zozcu5d9.fsf@goesche.b.uunet.de>
Hi!

I am trying to learn Common Lisp.  I wrote several versions of this
function: 

(defun print-primes (bound lower)
  (let ((prime-list (make-array
		     (floor (/ (* 2 bound) (log bound)))))
	(listsize 1))
    (setf (svref prime-list 0) 2)
    (do ((i 3 (+ i 2)))
	((> i bound) t)
      (when (do* ((index 0 (1+ index))
		  (prime 2 (svref prime-list index))
		  (upper (floor (sqrt i))))
		((> prime upper) index)
	      (if (zerop (mod i prime))
		  (return nil)))
	(if (> i lower)
	    (format t "~A~%" i))
	(setf (svref prime-list listsize) i)
	(setf listsize (1+ listsize))))))

In my first version, prime-list was just a real list and I made
append, or nconc to add new primes to it.  I made the changes
because I keep getting these annoying messages from CMU Lisp:

[GC threshold exceeded with 5,428,360 bytes in use.  Commencing GC.]
[GC completed with 1,516,680 bytes retained and 3,911,680 bytes freed.]
[GC will next occur when at least 5,516,680 bytes are in use.]

Don't tell me to (setf *gc-verbose* nil) :-)

When I took an array instead of a list, I thought the messages would
go away.  But they didn't.  Why is that?  Isn't all memory allocated
once and for all?  Compiling the function helped a lot, it seemed, but
there is still a lot of garbage produced.  Where?

Remark: I wrote a real monster version of this function in scheme.
For each i, I called a function which checked for primality which
called itself recursively with the cdr of the prime-list (and I used
append, no nconc :-) xscheme had no problems with it, the function
ran incredibly fast.
-- 
Nils Goesche
----------------------------------------
Fatal Error # 0x155 : Windows not found.
(C)heer  (P)arty  (D)ance

From: Gareth McCaughan
Subject: Re: Newbie consing to hell
Date: 
Message-ID: <86hflky73d.fsf@g.local>
Nils Goesche wrote:

> I am trying to learn Common Lisp.  I wrote several versions of this
> function: 
> 
> (defun print-primes (bound lower)
>   (let ((prime-list (make-array
> 		     (floor (/ (* 2 bound) (log bound)))))
> 	(listsize 1))
>     (setf (svref prime-list 0) 2)
>     (do ((i 3 (+ i 2)))
> 	((> i bound) t)
>       (when (do* ((index 0 (1+ index))
> 		  (prime 2 (svref prime-list index))
> 		  (upper (floor (sqrt i))))
> 		((> prime upper) index)
> 	      (if (zerop (mod i prime))
> 		  (return nil)))
> 	(if (> i lower)
> 	    (format t "~A~%" i))
> 	(setf (svref prime-list listsize) i)
> 	(setf listsize (1+ listsize))))))
[and it conses a lot, and Nils wants to know why]

1. I hope you're compiling the function. If not, you'll get
   lots of consing as a result of interpreter overhead.

2. Replacing (floor (sqrt i)) with (isqrt i) drastically
   reduces the amount of consing. It would appear that the
   compiler is "boxing" the result of calling SQRT (i.e.,
   not just sticking it in a floating-point register, but
   allocating memory for it and using a pointer to it); I
   conjecture that it's failed to notice that I is always
   positive, so it has to consider the possibility that
   (SQRT I) might be complex instead of real. <hack hack>
   No, that's not it -- if I explicitly tell it I is >=0
   it still conses a lot. Even if I also tell it that I is
   not very large, so you'd think it could open-code the SQRT.
   Strange.

-- 
Gareth McCaughan  ················@pobox.com
sig under construction
From: Paul Meurer
Subject: Re: Newbie consing to hell
Date: 
Message-ID: <37c7bc08.4000421@nntp.uib.no>
On 28 Aug 1999 03:25:26 +0100, Gareth McCaughan
<················@pobox.com> wrote:

>Nils Goesche wrote:
>
>> I am trying to learn Common Lisp.  I wrote several versions of this
>> function: 
>> 
>> (defun print-primes (bound lower)
>>   (let ((prime-list (make-array
>> 		     (floor (/ (* 2 bound) (log bound)))))
>> 	(listsize 1))
>>     (setf (svref prime-list 0) 2)
>>     (do ((i 3 (+ i 2)))
>> 	((> i bound) t)
>>       (when (do* ((index 0 (1+ index))
>> 		  (prime 2 (svref prime-list index))
>> 		  (upper (floor (sqrt i))))
>> 		((> prime upper) index)
>> 	      (if (zerop (mod i prime))
>> 		  (return nil)))
>> 	(if (> i lower)
>> 	    (format t "~A~%" i))
>> 	(setf (svref prime-list listsize) i)
>> 	(setf listsize (1+ listsize))))))
>[and it conses a lot, and Nils wants to know why]
>
>1. I hope you're compiling the function. If not, you'll get
>   lots of consing as a result of interpreter overhead.
>
>2. Replacing (floor (sqrt i)) with (isqrt i) drastically
>   reduces the amount of consing. 

Why use sqrt at all? You can drastically reduce consing (and save some
cycles) by slightly changing the algorithm: replace the do* loop by:

(do* ((index 0 (1+ index))
                  (prime 2 (svref prime-list index)))
                 ((> (* prime prime) i) index)
              (if (zerop (mod i prime))
                  (return nil)))

- Paul

Paul Meurer at HIT UiB no
Humanities Information Technologies,
University of Bergen
Allegt. 27
5007 Bergen, Norway
From: Nils Goesche
Subject: Re: Newbie consing to hell
Date: 
Message-ID: <86u2pkdsro.fsf@goesche.b.uunet.de>
···········@hit.uib.no (Paul Meurer) writes:

> On 28 Aug 1999 03:25:26 +0100, Gareth McCaughan
> <················@pobox.com> wrote:
> 
> >Nils Goesche wrote:
> >
> >> I am trying to learn Common Lisp.  I wrote several versions of this
> >> function: 
> >> 

[snip]

> >[and it conses a lot, and Nils wants to know why]
> >
> >1. I hope you're compiling the function. If not, you'll get
> >   lots of consing as a result of interpreter overhead.
> >
> >2. Replacing (floor (sqrt i)) with (isqrt i) drastically
> >   reduces the amount of consing. 
> 
> Why use sqrt at all? You can drastically reduce consing (and save some
> cycles) by slightly changing the algorithm: replace the do* loop by:
> 
> (do* ((index 0 (1+ index))
>                   (prime 2 (svref prime-list index)))
>                  ((> (* prime prime) i) index)
>               (if (zerop (mod i prime))
>                   (return nil)))

Ah.  Thank you very much.  Both hints, isqrt and (* prime prime), turn
out to be much better than my version.  I ran
(print-primes 10000100 10000000) and both took about 2:15 minutes.  No
garbage collection.

I had expected the isqrt version to be faster because isqrt is only
called once for each i.  Probably it is still somewhat expensive, and
because most i's are checked only a few times until a divisor is
found, the few (* prime prime) are not really an overhead.
-- 
Nils Goesche
----------------------------------------
Fatal Error # 0x155 : Windows not found.
(C)heer  (P)arty  (D)ance
From: Gareth McCaughan
Subject: Re: Newbie consing to hell
Date: 
Message-ID: <86wvugw3o8.fsf@g.local>
Paul Meurer wrote:

[I said:]
>> 2. Replacing (floor (sqrt i)) with (isqrt i) drastically
>>   reduces the amount of consing. 
> 
> Why use sqrt at all? You can drastically reduce consing (and save some
> cycles) by slightly changing the algorithm: replace the do* loop by:
[SNIP]

Because he asked "what's causing the consing?", not "how can I find
prime numbers quickly?". (If what you really want is fast prime-finding,
there are much better ways: Eratosthenes, priority queues, primality
tests other than trial division, etc. But it's clear that the real
purpose of the PRINT-PRIMES function was to explore CL, not to find
prime numbers as efficiently as possible.

-- 
Gareth McCaughan  ················@pobox.com
sig under construction
From: Raymond Toy
Subject: Re: Newbie consing to hell
Date: 
Message-ID: <4n1zcmplnq.fsf@rtp.ericsson.se>
>>>>> "Gareth" == Gareth McCaughan <················@pobox.com> writes:

    Gareth> 2. Replacing (floor (sqrt i)) with (isqrt i) drastically
    Gareth>    reduces the amount of consing. It would appear that the
    Gareth>    compiler is "boxing" the result of calling SQRT (i.e.,
    Gareth>    not just sticking it in a floating-point register, but
    Gareth>    allocating memory for it and using a pointer to it); I
    Gareth>    conjecture that it's failed to notice that I is always
    Gareth>    positive, so it has to consider the possibility that
    Gareth>    (SQRT I) might be complex instead of real. <hack hack>

Yes, the compiler is not smart enough to figure out I is positive, so
decides that (sqrt i) is (or (complex single-float) (single-float
0.0)).

    Gareth>    No, that's not it -- if I explicitly tell it I is >=0
    Gareth>    it still conses a lot. Even if I also tell it that I is
    Gareth>    not very large, so you'd think it could open-code the SQRT.
    Gareth>    Strange.

I believe it does open-code SQRT, but I can't tell in this case.

However, the consing could be coming from printing the numbers.  CMUCL 
conses quite a bit in printing numbers.

Ray
From: Gareth McCaughan
Subject: Re: Newbie consing to hell
Date: 
Message-ID: <86n1vadnu4.fsf@g.local>
Raymond Toy wrote:

[me:]
>     Gareth> 2. Replacing (floor (sqrt i)) with (isqrt i) drastically
>     Gareth>    reduces the amount of consing. It would appear that the
>     Gareth>    compiler is "boxing" the result of calling SQRT (i.e.,
>     Gareth>    not just sticking it in a floating-point register, but
>     Gareth>    allocating memory for it and using a pointer to it); I
>     Gareth>    conjecture that it's failed to notice that I is always
>     Gareth>    positive, so it has to consider the possibility that
>     Gareth>    (SQRT I) might be complex instead of real. <hack hack>
> 
> Yes, the compiler is not smart enough to figure out I is positive, so
> decides that (sqrt i) is (or (complex single-float) (single-float
> 0.0)).

It's not even as simple as that. If you wrap the reference to I
in (the (integer 0 1000000)) or something, that doesn't stop the
consing.

>     Gareth>    No, that's not it -- if I explicitly tell it I is >=0
>     Gareth>    it still conses a lot. Even if I also tell it that I is
>     Gareth>    not very large, so you'd think it could open-code the SQRT.
>     Gareth>    Strange.
> 
> I believe it does open-code SQRT, but I can't tell in this case.
> 
> However, the consing could be coming from printing the numbers.  CMUCL 
> conses quite a bit in printing numbers.

Nope. Even if you call it with (= BOUND LOWER) it still conses
quite a bit.

-- 
Gareth McCaughan  ················@pobox.com
sig under construction
From: Raymond Toy
Subject: Re: Newbie consing to hell
Date: 
Message-ID: <4nk8qdnvyh.fsf@rtp.ericsson.se>
>>>>> "Gareth" == Gareth McCaughan <················@pobox.com> writes:

    >> Yes, the compiler is not smart enough to figure out I is positive, so
    >> decides that (sqrt i) is (or (complex single-float) (single-float
    >> 0.0)).

    Gareth> It's not even as simple as that. If you wrap the reference to I
    Gareth> in (the (integer 0 1000000)) or something, that doesn't stop the
    Gareth> consing.

That's odd.  When I do this (with speed 3), I no longer get compiler
notes about boxing up numbers.


    Gareth> Nope. Even if you call it with (= BOUND LOWER) it still conses
    Gareth> quite a bit.

The only compiler note that explicitly says anything about consing is
the array bounds check for (setf (svref prime-list listsize) i).  A
number is boxed up.  This might account for some of the consing.

Ray
From: Gareth McCaughan
Subject: Re: Newbie consing to hell
Date: 
Message-ID: <86zoz9b3t3.fsf@g.local>
Raymond Toy wrote:

>>> Yes, the compiler is not smart enough to figure out I is positive, so
>>> decides that (sqrt i) is (or (complex single-float) (single-float
>>> 0.0)).
> 
>     Gareth> It's not even as simple as that. If you wrap the reference to I
>     Gareth> in (the (integer 0 1000000)) or something, that doesn't stop the
>     Gareth> consing.
> 
> That's odd.  When I do this (with speed 3), I no longer get compiler
> notes about boxing up numbers.

I get 2 fewer compiler notes, but it still conses the same
amount. I'm using 18a on FreeBSD/x86.

>     Gareth> Nope. Even if you call it with (= BOUND LOWER) it still conses
>     Gareth> quite a bit.
> 
> The only compiler note that explicitly says anything about consing is
> the array bounds check for (setf (svref prime-list listsize) i).  A
> number is boxed up.  This might account for some of the consing.

For me, it says "Doing signed word to integer coercion"; does that
necessarily mean any consing happens? If the integer is always
actually inrange, won't the result always be a fixnum and therefore
not require any consing?

-- 
Gareth McCaughan  ················@pobox.com
sig under construction
From: Raymond Toy
Subject: Re: Newbie consing to hell
Date: 
Message-ID: <4nu2phlzvh.fsf@rtp.ericsson.se>
>>>>> "Gareth" == Gareth McCaughan <················@pobox.com> writes:

    Gareth> For me, it says "Doing signed word to integer coercion"; does that
    Gareth> necessarily mean any consing happens? If the integer is always
    Gareth> actually inrange, won't the result always be a fixnum and therefore
    Gareth> not require any consing?

I think you're right.

Actually, it's worse than that.  I just added the few declarations
necessary to make CMUCL not produce any compiler notes (except one for 
the make-array).

Consing doesn't change at all.  The only place I can see any consing
is the array, but its size isn't close to the amount of consing I'm
seeing.

Clueless,

Ray