From: Richard J. Fateman
Subject: factorial of largish numbers
Date: 
Message-ID: <e10udn$2jgi$1@agate.berkeley.edu>
I invite comments on a paper posted at
http://www.cs.berkeley.edu/~fateman/papers/factorial.pdf

on computing factorial  of, say, 20,000 or more.
Programs in lisp and some comparison with Maple and
Mathematica. are included.

I've used (Allegro 7.0) Lisp + GMP for best times, but maybe
you have other ideas.
Thanks
RJF

From: Jon Harrop
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <44342248$0$23204$ed2e19e4@ptn-nntp-reader04.plus.net>
Richard J. Fateman wrote:
> I invite comments on a paper posted at
> http://www.cs.berkeley.edu/~fateman/papers/factorial.pdf
> 
> on computing factorial  of, say, 20,000 or more.
> Programs in lisp and some comparison with Maple and
> Mathematica. are included.
> 
> I've used (Allegro 7.0) Lisp + GMP for best times, but maybe
> you have other ideas.

Just when I thought I'd read the last paper on factorial functions written
in Lisp...

Well, the first criticism has to be: why write about computing 20,000! when
you could write about something a little more useful? The next criticism
has to be why use Lisp when other languages are as fast and more concise?
Finally, you seem to have written the most verbose and least efficient
factorial function possible in Mathematica:

  f[n_] := Block[{p=1}, Do[p*=i, {i, n}]; p]

when you could have written any of:

  f[0] := 1
  f[n_] := n f[n-1]

  f[n_] := ·····@@(Range[n])

  f[n_] := Product[i, {i, n}]

  f[n_] := Fold[Times, 1, Range[n]]

This is all academic, of course, because you would just write "n!" in
Mathematica and it would run much faster. Indeed, that's the whole point...

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/ocaml_for_scientists/chapter1.html
From: ····@axiom-developer.org
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <1144275433.752025.9500@u72g2000cwu.googlegroups.com>
I suggest a different algorithm using binary arithmetic.
Call it the "trailing zeros" algorithm.

1 = 1
2 = 10 = 1 * 2^1 ==> 2! = 2
3 = 11 = 3 * 2^0 ==> 3! = 3*1*2^1 = 110 = 6
4 = 100 = 1 * 2^2 ==> 4! = 3 * 2^3 = 11000 = 24
5 = 101 = 5 * 2^2 ==> 5! = 5 * 3 * 2^3 = 1111000 = 120
6 = 110 = 3 * 2^1 ==> 6! = 3 * 5 * 3 * 2^4 = 10 1101 0000 = 720
7 = 111 = 7 * 2^0 ==> 7! = 7 * 3 * 5 * 3 * 2^4 = 1 0011 1011 0000 =
5040
8 = 1000 = 1 * 2^3 ==> 8! = 7 * 3 * 5 * 3 * 2^7 = 1 0011 1011 0000 0000
= 40320

notice that we're using shifting to rid the world of the large trailing
zeros
and that certain computations can be see speedup because we're only
keeping significant factor bits. 7! and 8! obviously are the same
computation.

further speedups are possible with table lookups of values of products
with
certain hex values, rather like special multiplication tables for hex
products
of, say, 7*3, etc.

if you "sieve out" all of the factors that have no trailing zeros you
can get
the "content" of the number without the "fat".

there is also "fat" content with the intermediate zeros (e.g.
100....0001)
which could be removed by dealing with the number first in hex
segments,
then in word segments, then doubleword, quadword, etc.

I'm sure this exists somewhere but cannot find it in a google search.

Tim Daly
From: Geoffrey Summerhayes
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <oXcZf.2632$sh3.181396@news20.bellglobal.com>
<····@axiom-developer.org> wrote in message 
···························@u72g2000cwu.googlegroups.com...
>I suggest a different algorithm using binary arithmetic.
> Call it the "trailing zeros" algorithm.
>
> 1 = 1
> 2 = 10 = 1 * 2^1 ==> 2! = 2
> 3 = 11 = 3 * 2^0 ==> 3! = 3*1*2^1 = 110 = 6
> 4 = 100 = 1 * 2^2 ==> 4! = 3 * 2^3 = 11000 = 24
> 5 = 101 = 5 * 2^2 ==> 5! = 5 * 3 * 2^3 = 1111000 = 120
> 6 = 110 = 3 * 2^1 ==> 6! = 3 * 5 * 3 * 2^4 = 10 1101 0000 = 720
> 7 = 111 = 7 * 2^0 ==> 7! = 7 * 3 * 5 * 3 * 2^4 = 1 0011 1011 0000 =
> 5040
> 8 = 1000 = 1 * 2^3 ==> 8! = 7 * 3 * 5 * 3 * 2^7 = 1 0011 1011 0000 0000
> = 40320
>
> notice that we're using shifting to rid the world of the large trailing
> zeros
> and that certain computations can be see speedup because we're only
> keeping significant factor bits. 7! and 8! obviously are the same
> computation.

If I understand this right, you're thinking along the lines of:

(defstruct big-fact-num
  mantissa
  power-of-2)

(defun convert(n)
  ;; Obviously need something more efficent here
  (do ((x n (floor x 2))
       (power 0 (incf power)))
      ((oddp x) (values x power))))

(defun fact(n)
  ;; Again, simple algorithm
  (if (< n 2)
      (make-big-fact-num :mantissa 1 :power-of-2 0)
    (let ((fact-so-far (fact (1- n))))
      (multiple-value-bind (mant-n pow-n)
          (convert n)
        (incf (big-fact-num-power-of-2 fact-so-far) pow-n)
        (setf (big-fact-num-mantissa fact-so-far)
              (* (big-fact-num-mantissa fact-so-far) mant-n)))
      fact-so-far)))

CL-USER > (fact 8)
#S(BIG-FACT-NUM :MANTISSA 315 :POWER-OF-2 7)

CL-USER > (* 315 (expt 2 7))
40320

???
----
Geoff
From: Richard Fateman
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <D2fZf.45241$_S7.42334@newssvr14.news.prodigy.com>
The algorithms given by Peter Luschny cover a lot of ground.

I think Tim's observation is basically that if k(n,1) == n!, then

k(n,m) := if even(n) then 2^n*k(n/2,m)*k(n-m,m)   // rule for (2*n)!
                   else n*k(n-1,m)           // where n-1 will be even.

Note that multiplication by 2^n for integer n can be written as an
arithmetic shift in languages that permit it.  In lisp, (ash x n) .

There are similar rules for other factorials of numbers
with other divisors, e.g. if n is divisible by 3.

Multiplying x by 3 can be done by  (+ x (ash x 1)) though mult by
powers of 3 are not as fast as binary shifts.

There is a potential for speedup by observing that the answer can always be 
factored,
but then if you want to multiply it through to get a single bignum, you
have to figure out the right sequence of multiplies to save time.
Interleaving a prime number sieve with the computation can win, but
only by about a factor of 2, according to Luschny.

For GMP-based systems you should try to get equal-sized inputs to multiply.
For systems which do not use 'advanced' bignum multiply, different
heuristics apply: for them it may be faster to have inputs that are
smallnum X bignum.


There is perhaps a simpler way of expressing some of the
idea in that Geoff's version, eg.

;; (h2 n) is factorial of n, but faster.
(defun hf2(n);; n is positive integer
  (if (< n 2) 1
    (if (evenp n)
 (let ((q (ash n -1)))
     (ash (* (hf2 q)(f (- n 1) 2)) q))
      (* n (hf2 (1- n))))))

(defun f (n m) (if (<= n 1) 1 (* n (f (- n m) m)))) ;; could be hacked, too.

 
From: Alexander Schmolck
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <yfsvetn6fdg.fsf@oc.ex.ac.uk>
"Richard J. Fateman" <·······@eecs.berkeley.edu> writes:

> I invite comments on a paper posted at
> http://www.cs.berkeley.edu/~fateman/papers/factorial.pdf

The reason for using lisp is somewhat bogus. For example this python program

    def k(n, m=1):
        if n <= m: return n
        else:      return k(n,2*m) * k(n-m,2*m)

will compute k(20000) just fine in any non-ancient version of python (it is in
fact several orders of magnitude faster than compiled cmucl which I eventually
^C'ed and seems to be about the same speed as clisp).

'as
From: Richard J. Fateman
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <e1141a$2lga$1@agate.berkeley.edu>
OK, Python now does bignums too.   thanks.

CMUCL may benefit if you set the optimization to (speed 3) and
also (declare (fixnum n m)).  It is unreasonable to compute factorials
of numbers that are not fixnums.  Python and CLISP presumably both
use GMP arithmetic.
RJF



Alexander Schmolck wrote:
> "Richard J. Fateman" <·······@eecs.berkeley.edu> writes:
> 
> 
>>I invite comments on a paper posted at
>>http://www.cs.berkeley.edu/~fateman/papers/factorial.pdf
> 
> 
> The reason for using lisp is somewhat bogus. For example this python program
> 
>     def k(n, m=1):
>         if n <= m: return n
>         else:      return k(n,2*m) * k(n-m,2*m)
> 
> will compute k(20000) just fine in any non-ancient version of python (it is in
> fact several orders of magnitude faster than compiled cmucl which I eventually
> ^C'ed and seems to be about the same speed as clisp).
> 
> 'as
From: Jon Harrop
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <44342201$0$23204$ed2e19e4@ptn-nntp-reader04.plus.net>
Richard J. Fateman wrote:
> OK, Python now does bignums too.   thanks.

SML too:

  fun f n : IntInf.int = if n=0 then 1 else n*f(n-1)

http://en.wikipedia.org/wiki/Standard_ML#Arbitrary-precision_factorial_function_.28libraries.29

> CMUCL may benefit if you set the optimization to (speed 3) and
> also (declare (fixnum n m)).  It is unreasonable to compute factorials
> of numbers that are not fixnums.  Python and CLISP presumably both
> use GMP arithmetic.

This is an awful performance benchmark, it stresses only fixnum*bignum
performance, which is very specific and rarely used in practice. It would
be interesting to compare code size instead but this single function is too
small to be useful even in that respect.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/ocaml_for_scientists/chapter1.html
From: Richard J. Fateman
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <44342B74.1080901@eecs.berkeley.edu>
Actually, YOUR program does fixnum*bignum, after about 13 calls.
The program I gave does about half  fixnum* fixnum and half
bignum*bignum. I assume it could be written in SML in about twice
the size of the code you suggest.

I agree that as a performance benchmark it is measuring
something that may not be particularly important to you, or to me.
That's probably par for the course in benchmarks.
I am not aware of any special use for exact bignum factorials
  of very large numbers.

Oh, I tried Python; it took about 0.267 CPU seconds, compared
to the Lisp+GMP version of about 0.027 CPU seconds  (2.59GH Pentium 4)

And about Mathematica:  I gave the timing for 20000! first.

The timing for Product[i,{i,20000}] is the same, perhaps because
it is first transformed into 20000! ? The expression using "Do"
is the first that occurred to me that definitively expressed an
operation that was not going to be simplified away.

It had not occurred to me that a reasonable way to compute
factorial of 20000 is to create a list of numbers {1, 2, ..., 20000}
and then apply "times" to it. But Mathematica does that rather
fast, in 0.047 seconds, so it is indeed faster that the Do loop.
The program with "Fold" is 10X slower than the program with @@.



Jon Harrop wrote:

> Richard J. Fateman wrote:
> 
>>OK, Python now does bignums too.   thanks.
> 
> 
> SML too:
> 
>   fun f n : IntInf.int = if n=0 then 1 else n*f(n-1)
> 
> http://en.wikipedia.org/wiki/Standard_ML#Arbitrary-precision_factorial_function_.28libraries.29
> 
> 
>>CMUCL may benefit if you set the optimization to (speed 3) and
>>also (declare (fixnum n m)).  It is unreasonable to compute factorials
>>of numbers that are not fixnums.  Python and CLISP presumably both
>>use GMP arithmetic.
> 
> 
> This is an awful performance benchmark, it stresses only fixnum*bignum
> performance, which is very specific and rarely used in practice. It would
> be interesting to compare code size instead but this single function is too
> small to be useful even in that respect.
> 
From: Jon Harrop
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <44342f31$0$23159$ed2e19e4@ptn-nntp-reader04.plus.net>
Richard J. Fateman wrote:
> Actually, YOUR program does fixnum*bignum, after about 13 calls.
> The program I gave does about half  fixnum* fixnum and half
> bignum*bignum.

By number of calls, yes, but by time?

> I assume it could be written in SML in about twice 
> the size of the code you suggest.

The compiler might do that for you. I'm not sure.

> I agree that as a performance benchmark it is measuring
> something that may not be particularly important to you, or to me.
> That's probably par for the course in benchmarks.
> I am not aware of any special use for exact bignum factorials
>   of very large numbers.
> 
> Oh, I tried Python; it took about 0.267 CPU seconds, compared
> to the Lisp+GMP version of about 0.027 CPU seconds  (2.59GH Pentium 4)
> 
> And about Mathematica:  I gave the timing for 20000! first.
> 
> The timing for Product[i,{i,20000}] is the same, perhaps because
> it is first transformed into 20000! ?

Good point.

> The expression using "Do" 
> is the first that occurred to me that definitively expressed an
> operation that was not going to be simplified away.
> 
> It had not occurred to me that a reasonable way to compute
> factorial of 20000 is to create a list of numbers {1, 2, ..., 20000}
> and then apply "times" to it. But Mathematica does that rather
> fast, in 0.047 seconds, so it is indeed faster that the Do loop.
> The program with "Fold" is 10X slower than the program with @@.

I get similar relative timings on MMA 5.2 on AMD64.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/ocaml_for_scientists/chapter1.html
From: Jens Axel Søgaard
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <44342e89$0$38646$edfadb0f@dread12.news.tele.dk>
Jon Harrop wrote:

> This is an awful performance benchmark, it stresses only fixnum*bignum
> performance, which is very specific and rarely used in practice. 

Did you miss the abstract?

-- 
Jens Axel S�gaard
From: Jon Harrop
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <44343049$0$23159$ed2e19e4@ptn-nntp-reader04.plus.net>
Jens Axel S�gaard wrote:
> Jon Harrop wrote:
>> This is an awful performance benchmark, it stresses only fixnum*bignum
>> performance, which is very specific and rarely used in practice.
> 
> Did you miss the abstract?

Why take the time and effort to write a paper that starts off by stating
that it is silly and useless? Perhaps it would be better to concentrate on
revolutionary work...

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/ocaml_for_scientists/chapter1.html
From: Alexander Schmolck
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <yfsirpn6avb.fsf@oc.ex.ac.uk>
"Richard J. Fateman" <·······@eecs.berkeley.edu> writes:

> OK, Python now does bignums too.   thanks.

As do ruby, smalltalk, J etc. On the other hand some popular scheme
implementations (e.g. bigloo, chicken and stalin) either don't have bignums at
all or require you to jump through hoops to get them -- so I think it is more
a question of high-levelness vs. focus on speed than algol vs lisp-like
syntax.
 
> Python and CLISP presumably both use GMP arithmetic.

Despite the developer overlap for clisp/GMP neither does IIRC; I haven't
checked again though.

'as
From: Deon Garrett
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <87ek0bix9h.fsf@clapton.csm.astate.edu>
Alexander Schmolck <··········@gmail.com> writes:

> "Richard J. Fateman" <·······@eecs.berkeley.edu> writes:
>
>> I invite comments on a paper posted at
>> http://www.cs.berkeley.edu/~fateman/papers/factorial.pdf
>
> The reason for using lisp is somewhat bogus. For example this python program
>
>     def k(n, m=1):
>         if n <= m: return n
>         else:      return k(n,2*m) * k(n-m,2*m)
>
> will compute k(20000) just fine in any non-ancient version of python (it is in
> fact several orders of magnitude faster than compiled cmucl which I eventually
> ^C'ed and seems to be about the same speed as clisp).
>

Putting aside the utility of the benchmark...

I don't know how to time something in python, but using my watch, it took
about 5 seconds to compute k(20000).  The same code in sbcl (compiled and
with fixnum declarations where appropriate...)

* (defun k (n m)       
    (declare (fixnum n m))
    (if (<= n m)
        n   
        (* (k n (* 2 m)) (k (- n m) (* 2 m)))))

* (defun fact (n)
    (k n 1))

* (declaim (optimize (speed 3) (safety 0)))

* (compile 'k)

* (compile 'fact)

* (time (fact 20000))

Evaluation took:
  0.19 seconds of real time
  0.187972 seconds of user run time
  0.0 seconds of system run time
  0 page faults and
  538,400 bytes consed.

With CMUCL, I got 0.11 seconds.  Are you sure you compiled it?
From: Richard J. Fateman
Subject: Re: factorial of largish numbers / Timing Python
Date: 
Message-ID: <443430C5.1040207@eecs.berkeley.edu>
I'm not a Python expert, but what worked for
me was

import cProfile
cProfile.run('k(20000)');

        40001 function calls (3 primitive calls) in 0.267 CPU seconds

(on 2.59GH Pentium 4)

Thanks for more info on SBCL / CMUCL.

RJF

Deon Garrett wrote:

> Alexander Schmolck <··········@gmail.com> writes:
> 
> 
>>"Richard J. Fateman" <·······@eecs.berkeley.edu> writes:
>>
>>
>>>I invite comments on a paper posted at
>>>http://www.cs.berkeley.edu/~fateman/papers/factorial.pdf
>>
>>The reason for using lisp is somewhat bogus. For example this python program
>>
>>    def k(n, m=1):
>>        if n <= m: return n
>>        else:      return k(n,2*m) * k(n-m,2*m)
>>
>>will compute k(20000) just fine in any non-ancient version of python (it is in
>>fact several orders of magnitude faster than compiled cmucl which I eventually
>>^C'ed and seems to be about the same speed as clisp).
>>
> 
> 
> Putting aside the utility of the benchmark...
> 
> I don't know how to time something in python, but using my watch, it took
> about 5 seconds to compute k(20000).  The same code in sbcl (compiled and
> with fixnum declarations where appropriate...)
> 
> * (defun k (n m)       
>     (declare (fixnum n m))
>     (if (<= n m)
>         n   
>         (* (k n (* 2 m)) (k (- n m) (* 2 m)))))
> 
> * (defun fact (n)
>     (k n 1))
> 
> * (declaim (optimize (speed 3) (safety 0)))
> 
> * (compile 'k)
> 
> * (compile 'fact)
> 
> * (time (fact 20000))
> 
> Evaluation took:
>   0.19 seconds of real time
>   0.187972 seconds of user run time
>   0.0 seconds of system run time
>   0 page faults and
>   538,400 bytes consed.
> 
> With CMUCL, I got 0.11 seconds.  Are you sure you compiled it?
> 
From: Marko Kocic
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <1144278029.401912.67040@j33g2000cwa.googlegroups.com>
With a little more type settings (sbcl) I got:
CL-USER> (defun k (n m)
		   (declare (optimize (speed 3)(debug 0)(safety 0)))
		   (declare (fixnum n m))
		   (if (<= n m)
			   n
			   (the fixnum (* (k n (the fixnum (* 2 m))) (k (the fixnum (- n m))
(the fixnum (* 2 m)))))))
STYLE-WARNING: redefining K in DEFUN
K
CL-USER> (compile 'k)
K
NIL
NIL
CL-USER> (time (progn (fact 20000) nil))
Evaluation took:
  0.001 seconds of real time
  0.0 seconds of user run time
  0.0 seconds of system run time
  0 page faults and
  0 bytes consed.
NIL

Without (the fixnum ...) it took 0.228s
From: Richard Fateman
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <I7bZf.12586$tN3.7903@newssvr27.news.prodigy.net>
I wouldn't go too far with this solution, simply because you
are not producing the right answer.  Your program always
produces a result which is a fixnum, but factorial of 20,000
has about 77,300 decimal digits.  I am not away of any
Lisp implementation for which this is a fixnum!

RJF

"Marko Kocic" <···········@gmail.com> wrote in message 
····························@j33g2000cwa.googlegroups.com...
> With a little more type settings (sbcl) I got:
> CL-USER> (defun k (n m)
>    (declare (optimize (speed 3)(debug 0)(safety 0)))
>    (declare (fixnum n m))
>    (if (<= n m)
>    n
>    (the fixnum (* (k n (the fixnum (* 2 m))) (k (the fixnum (- n m))
> (the fixnum (* 2 m)))))))
> STYLE-WARNING: redefining K in DEFUN
> K
> CL-USER> (compile 'k)
> K
> NIL
> NIL
> CL-USER> (time (progn (fact 20000) nil))
> Evaluation took:
>  0.001 seconds of real time
>  0.0 seconds of user run time
>  0.0 seconds of system run time
>  0 page faults and
>  0 bytes consed.
> NIL
>
> Without (the fixnum ...) it took 0.228s
> 
From: Christophe Rhodes
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <sqek0bac8p.fsf@cam.ac.uk>
Deon Garrett <·····@acm.org> writes:

> I don't know how to time something in python, but using my watch, it took
> about 5 seconds to compute k(20000).  The same code in sbcl (compiled and
> with fixnum declarations where appropriate...)
>
> * (defun k (n m)       
>     (declare (fixnum n m))
>     (if (<= n m)
>         n   
>         (* (k n (* 2 m)) (k (- n m) (* 2 m)))))
>
> * (defun fact (n)
>     (k n 1))
>
> * (declaim (optimize (speed 3) (safety 0)))
>
> * (compile 'k)
>
> * (compile 'fact)
>
> * (time (fact 20000))

Be careful of what you are measuring.  This time, you're fine from the
printing point of view, but in this SBCL transcript, assuming it is in
fact what you did (you don't quote return values, so it's obviously
not a simple cut-and-paste) the code you are executing was compiled
with default optimization qualities, not with your request for
optimized code [ (speed 3) (safety 0) ].

In this instance, this may not make a difference in this instance, as
the type checks are cheap compared to the actual work done.

Christophe
From: Alexander Schmolck
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <yfsbqvf639a.fsf@oc.ex.ac.uk>
Deon Garrett <·····@acm.org> writes:

>> Alexander Schmolck <··········@gmail.com> writes:
> 
> > "Richard J. Fateman" <·······@eecs.berkeley.edu> writes:
> >
> >> I invite comments on a paper posted at
> >> http://www.cs.berkeley.edu/~fateman/papers/factorial.pdf
> >
> > The reason for using lisp is somewhat bogus. For example this python program
> >
> >     def k(n, m=1):
> >         if n <= m: return n
> >         else:      return k(n,2*m) * k(n-m,2*m)
> >
> > will compute k(20000) just fine in any non-ancient version of python (it is in
> > fact several orders of magnitude faster than compiled cmucl which I eventually
> > ^C'ed and seems to be about the same speed as clisp).
> >
> 
> Putting aside the utility of the benchmark...
> 
> I don't know how to time something in python, but using my watch, it took
> about 5 seconds to compute k(20000).  

Have you maybe printed the result in python but not in lisp (5 secs is what I
get if I print it); I think I made a similar mistake with my version of cmucl
which does really badly at printing bignums (consing madly), so I somewhat
prematurely switched to clisp which is traditionally said to be faster for
bignum stuff after cmucl still hadn't finished long after the time it took to
get the result printed in python. I still think python is faster than my
version of cmucl, but not orders of magnitude.

try this:

12:27AM> cat ~/py/fact.py
def k(n, m=1):
    if n <= m: return n
    else:     return k(n,2*m) * k(n-m,2*m)
k(50000)

12:28AM> time python ~/py/fact.py
python ~/py/fact.py  0.83s user 0.00s system 95% cpu 0.868 total

(the overhead of starting python doesn't really matter much here, but normally
you'd use e.g. python -mtimeit -s 'from fact import k' 'k(50000)')

For the k right out of the article (no declarations):

* (load "fact")
; Loading #p"[...]/fact.x86f".
* (time (and (k 50000 1) 'done))
[...]
; Evaluation took:
;   1.84f0 seconds of real time
[...]
DONE

So python is 2x as fast for 50000.

the respective versions are:

Python 2.4.1 (#2, Mar 30 2005, 21:51:10) [GCC 3.3.5 (Debian 1:3.3.5-8ubuntu2)] 

CMU Common Lisp CVS release-19a 19a-release-20040728 + minimal debian patches

running on a centrino notebook

> * (declaim (optimize (speed 3) (safety 0)))

I think it makes more sense not to do this for this comparison (at the very
least you'd have to compare it against a python version that uses psyco, weave
startkiller or whatever).

> 
> * (compile 'k)
> 
> * (compile 'fact)
> 
> * (time (fact 20000))
> 
> Evaluation took:
>   0.19 seconds of real time
>   0.187972 seconds of user run time
>   0.0 seconds of system run time
>   0 page faults and
>   538,400 bytes consed.
> 
> With CMUCL, I got 0.11 seconds.  Are you sure you compiled it?

Yes. See the fasl file it loads above.

'as
From: Juho Snellman
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <slrne38kt7.4pe.jsnell@sbz-30.cs.Helsinki.FI>
Alexander Schmolck <··········@gmail.com> wrote:
> CMU Common Lisp CVS release-19a 19a-release-20040728 + minimal debian patches

You might want to try a newer CMUCL. IIRC, some changes I made to the
bignum multiplier just after 19a was released should speed up this
benchmark a lot.

-- 
Juho Snellman
From: Jens Axel Søgaard
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <44342e64$0$38630$edfadb0f@dread12.news.tele.dk>
Alexander Schmolck wrote:
> "Richard J. Fateman" <·······@eecs.berkeley.edu> writes:
> 
> 
>>I invite comments on a paper posted at
>>http://www.cs.berkeley.edu/~fateman/papers/factorial.pdf
> 
> 
> The reason for using lisp is somewhat bogus. 

Why? Luschny is into factorial progams and uses Java: 
<http://www.luschny.de/math/factorial/index.html>. It makes
sense to choose a language with bignums to showcase factorial
algorithms. The ones mentioned in the paper haven't got builtin
bignums and Lisp does.

-- 
Jens Axel S�gaard
From: Christophe Rhodes
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <sqirpnackh.fsf@cam.ac.uk>
[ note followups ]

Alexander Schmolck <··········@gmail.com> writes:

> will compute k(20000) just fine in any non-ancient version of python
> (it is in fact several orders of magnitude faster than compiled
> cmucl which I eventually ^C'ed and seems to be about the same speed
> as clisp).

Be careful of what you are measuring.  You don't quote how you made
your observations, but you might be running into a performance problem
in cmucl, where the factorial is in fact computed acceptably fast, but
printing of return value (by the REPL) takes a very long time.

Christophe
From: Rob Warnock
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <VN-dnUjhNO2s0anZRVn-gw@speakeasy.net>
Christophe Rhodes  <·····@cam.ac.uk> wrote:
+---------------
| Alexander Schmolck <··········@gmail.com> writes:
| > will compute k(20000) just fine in any non-ancient version of python
| > (it is in fact several orders of magnitude faster than compiled
| > cmucl which I eventually ^C'ed and seems to be about the same speed
| > as clisp).
| 
| Be careful of what you are measuring.  You don't quote how you made
| your observations, but you might be running into a performance problem
| in cmucl, where the factorial is in fact computed acceptably fast, but
| printing of return value (by the REPL) takes a very long time.
+---------------

No, I tried his example in CMUCL, and observed that the printout
from TIME occurred *before* all the GC'ing started to print the value.
So he was indeed measuring what he thought he did.

That is, (TIME (DEFPARAMETER *F20K* (FACT 20000))) didn't GC at all
[though it did cons just over 900KB] and reported the same runtimes
as (TIME (FACT 20000)), whereas simply typing *F20K* to the REPL
GC'd *numerous* times before finally printing the value.  ;-}


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Alexander Schmolck
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <yfs7j63634q.fsf@oc.ex.ac.uk>
Christophe Rhodes <·····@cam.ac.uk> writes:

> [ note followups ]
> 
> Alexander Schmolck <··········@gmail.com> writes:
> 
> > will compute k(20000) just fine in any non-ancient version of python
> > (it is in fact several orders of magnitude faster than compiled
> > cmucl which I eventually ^C'ed and seems to be about the same speed
> > as clisp).
> 
> Be careful of what you are measuring.  You don't quote how you made
> your observations, but you might be running into a performance problem
> in cmucl, where the factorial is in fact computed acceptably fast, but
> printing of return value (by the REPL) takes a very long time.

Yes, sorry (see my other post). I still think python is quite competitive with
unoptimized but compiled cmucl for this task on my machine, but it isn't
orders of magnitude faster (although python and clisp seem to be much better
at printing bignums).

'as
From: Björn Lindberg
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <hcsr74anxsa.fsf@my.nada.kth.se>
Alexander Schmolck <··········@gmail.com> writes:

> Christophe Rhodes <·····@cam.ac.uk> writes:
> 
> > [ note followups ]
> > 
> > Alexander Schmolck <··········@gmail.com> writes:
> > 
> > > will compute k(20000) just fine in any non-ancient version of python
> > > (it is in fact several orders of magnitude faster than compiled
> > > cmucl which I eventually ^C'ed and seems to be about the same speed
> > > as clisp).
> > 
> > Be careful of what you are measuring.  You don't quote how you made
> > your observations, but you might be running into a performance problem
> > in cmucl, where the factorial is in fact computed acceptably fast, but
> > printing of return value (by the REPL) takes a very long time.
> 
> Yes, sorry (see my other post). I still think python is quite
> competitive with unoptimized but compiled cmucl for this task on my
> machine, but it isn't orders of magnitude faster (although python
> and clisp seem to be much better at printing bignums).

I don't know which magnitude you are using, but CMUCL did it in about
a tenth of the time Python took when I tried it.


Bj�rn
From: Peter Luschny
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <e11bom$s6g$1@online.de>
Richard J. Fateman wrote:

 > I invite comments on a paper posted at
 > http://www.cs.berkeley.edu/~fateman/papers/factorial.pdf
 > on computing factorial  of, say, 20,000 or more.

It's nice to see you contemplating this seemingly
trivial problem. It is not trivial at all!

And it can be used to study implementations of arithmetic,
computational models and computational complexity in detail.

For example Arnold Schoenhage studied in his book
'Fast Algorithms: a Multitape Turing Machine Implementation'
(with A.Grotefeld and E.Vetter) his routines for integer
arithmetic comparing different implementations of n!.
See chapter 6.5 'How to compute n!'.

The running time of his algorithm is asymptotically bounded
by the same order of growth as the time for integer
multiplication for binary numbers of length log(n!),
improving a result of P.B. Borwein 'On the complexity
of calculating factorials' by a factor of order log log n.

On the other hand Schoenhage et alia wrote:
"As it stands, it is a bit disappointing to see [this
algorithm] merely better by a factor of two - in
contrast to the theoretical improvement by a factor
'of order' log n .."

Having implemented his algorithm and many others
in a completely different computational environment
(namely Java), I made the same observation.
For those who care to look:

http://www.luschny.de/math/factorial/FastFactorialFunctions.htm

Regrettably I have no implementations in Lisp. I do would like to
include some. Perhaps someone in this group could give me a
helping hand and show me the Lisp implementation of the algorithm
which I like best? I put it in an appendix, written in a C-like language.

Regards Peter

Integer Factorial(int n)
{
     if(n < 2) return 1;

     Integer p = 1, r = 1;
     N = 1;

     int h = 0, shift = 0, high = 1;
     int log2n = Floor(Log2(n));

     while(h != n)
     {
         shift += h;
         h = n >> log2n--;
         int len = high;
         high = (h & 1) == 1 ? h : h - 1;
         len = (high - len) / 2;

         if(len > 0)
         {
             p *= Product(len);
             r *= p;
         }
     }
     return r << shift;
}

int N;

Integer Product(int n)
{
     int m = n / 2;
     if( m == 0 ) return new Integer(N += 2);
     if( n == 2 ) return new Integer((N += 2)*(N += 2));
     return Product(n - m) * Product(m);
}
From: Richard J. Fateman
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <e11c8i$2ooq$1@agate.berkeley.edu>
Peter Luschny wrote:
 > Perhaps someone in this group could give me a
> helping hand and show me the Lisp implementation of the algorithm
> which I like best? I put it in an appendix, written in a C-like language.
.......

here it is in lisp


(defun sprfact(n)
   (cond ((< n 0)(error "bad arg ~s to factorial" n))
	((< n 2) 1)
	(t
	 (let ((p 1) (r 1) (N 1) (log2n (floor (log n 2)))
	       (h 0) (shift 0) (high 1) (len 0))
	   (declare (special N))
	   (while (/= h n)
	     (incf shift h)
	     (setf h (ash n  (- log2n)))
	     (decf log2n)
	     (setf len high)
	     (setf high (if (oddp h) h (1- h)))
	     (setf len (ash (- high len) -1))
	     (cond ((> len 0)
		    (setf p (* p (prod len)))
		    (setf r (* r p)))))
	   (ash r shift)))))

(defun prod(n)
   (declare (special N))
   (let ((m (ash n -1)))
     (cond ((= m 0) (incf N 2))  ((= n 2) (* (incf N 2)(incf N 2)))
	  (t (* (prod (- n m)) (prod m))))))
From: Rob Warnock
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <ULWdnVUQaeznzKnZ4p2dnA@speakeasy.net>
Richard J. Fateman <·······@eecs.berkeley.edu> wrote:
+---------------
| Peter Luschny wrote:
|  > Perhaps someone in this group could give me a
| > helping hand and show me the Lisp implementation of the algorithm
| 
| here it is in lisp
| 
| (defun sprfact(n)
|    (cond ((< n 0)(error "bad arg ~s to factorial" n))
| 	((< n 2) 1)
| 	(t
| 	 (let ((p 1) (r 1) (N 1) (log2n (floor (log n 2)))
| 	       (h 0) (shift 0) (high 1) (len 0))
| 	   (declare (special N))
| 	   (while (/= h n)
| 	     (incf shift h)
| 	     (setf h (ash n  (- log2n)))
| 	     (decf log2n)
| 	     (setf len high)
| 	     (setf high (if (oddp h) h (1- h)))
| 	     (setf len (ash (- high len) -1))
| 	     (cond ((> len 0)
| 		    (setf p (* p (prod len)))
| 		    (setf r (* r p)))))
| 	   (ash r shift)))))
| 
| (defun prod(n)
|    (declare (special N))
|    (let ((m (ash n -1)))
|      (cond ((= m 0) (incf N 2))  ((= n 2) (* (incf N 2)(incf N 2)))
| 	  (t (* (prod (- n m)) (prod m))))))
+---------------

This will not run in Common Lisp as as written, because of the
WHILE, and even when that is corrected will give incorrect answers
[e.g., (SPRFACT 5) => 3] because the default READTABLE-CASE in
Common Lisp is :UPCASE. If you put this line at the top of your
program:

    (setf (readtable-case *readtable*) :invert)

and then define WHILE as follows [yeah, yeah, I know, using DO
would be safer]:

    (defmacro while (bool &body body)
      `(loop while ,bool do ,@body))

and only *THEN* type [or paste] the above definitions for SPRFACT/PROD,
then it will work and give correct answers.

But on CMUCL it's more than twice as slow [0.21f0 s versus 0.1f0 s]
than the previous K/FACT version mentioned in this thread. [And, yes,
I did compile it.]  I suspect that if you rewrite it without using
special variables it might improve [e.g., perhaps by making PROD be
an FLET inside the LET binding within SPRFACT].


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Peter Luschny
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <e12hi4$74s$1@online.de>
Rob Warnock wrote:

> This will not run in Common Lisp as as written, because of the
> WHILE, and even when that is corrected will give incorrect answers
> [e.g., (SPRFACT 5) => 3] because the default READTABLE-CASE in
> Common Lisp is :UPCASE. If you put this line at the top of your
> program:
>
>     (setf (readtable-case *readtable*) :invert)
>
> and then define WHILE as follows [yeah, yeah, I know, using DO
> would be safer]:
>
>     (defmacro while (bool &body body)
>       `(loop while ,bool do ,@body))
>
> and only *THEN* type [or paste] the above definitions for SPRFACT/PROD,
> then it will work and give correct answers.

Thank you! Yes, I ran into trouble with ‘while’, as you expected.

LispWorks said: Error: ‘Undefined function WHILE called with
arguments (T 0 0 2 1 -1 -1 NIL)’.

On the other hand Allegro CL said: Error: ‘Attempt to make a
function definition for the name while as a macro. This name
is in the EXCL package and redefining it is a violation for
portable programs.’

But with your hints above things now work.

> But on CMUCL it's more than twice as slow [0.21f0 s versus 0.1f0 s]
> than the previous K/FACT version mentioned in this thread. [And, yes,
> I did compile it.]

Therefore I compared things with Java. For some largish numbers.

                       SPRFACT          K/Fact
N =    32,000          0.2              0.3
N =   256,000          3.9              6.4
N = 1,024,000         20.1             34.0
N = 2,048,000         44.8             76.0

All times in seconds.

For more benchmark results, also showing a relative
ranking including the more powerful prime algorithms:
http://www.luschny.de/math/factorial/FactorialBenchmark.html

> I suspect that if you rewrite it without using
> special variables it might improve [e.g., perhaps by making PROD be
> an FLET inside the LET binding within SPRFACT].

Thank you. Regrettably I am not proficient enough with Lisp to try
to improve on these fine points.

But yes, K/FACT is an enjoyable tiny recursion I haven't seen before.
I will include it in my collection of factorial algorithms.

But as far as I can see, it has not the asymptotic power of the
split recursion algorithm. For a better understanding of the latter
I add the trace of a small example (perhaps to prompt a more
Lisp-like implementation of the idea?) :

Compute 30!

init: p=1, r=1, then loop:

p_1          = 3
p = p * p_1  = 3
r = r * p    = 3

p_2          = 5 * 7
p = p * p_2  = 105
r = r * p    = 315

p_3          = 9 * 11 * 13 * 15
p = p * p_3  = 2027025
r = r * p    = 638512875

p_4          = 17 * 19 * 21 * 23 * 25 * 27 * 29
p = p * p_4  = 6190283353629375
r = r * p    = 3952575621190533915703125

return r * 2^26 = 265252859812191058636308480000000

Do the computation of the partial products p_n by binary
splitting (i.e. like a binary tree) to ensure a balanced load.

Regards Peter
From: Frank Buss
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <1ciqm5ho7uwjz.1ud4h7h0uqsrt$.dlg@40tude.net>
Rob Warnock wrote:

> perhaps by making PROD be
> an FLET inside the LET binding within SPRFACT].

you need LABELS, because PROD is a recursive function.

(defun sprfact (n)
  (cond ((< n 0) (error "bad arg ~s to sprfact" n))
	((< n 2) 1)
	(t
	 (let ((p 1)
               (r 1)
               (nn 1)
               (log2n (floor (log n 2)))
	       (h 0)
               (shift 0)
               (high 1)
               (len 0))
           (labels ((prod (n)
                      (let ((m (ash n -1)))
                        (cond ((= m 0) (incf nn 2))
                              ((= n 2) (* (incf nn 2)
                                          (incf nn 2)))
                              (t (* (prod (- n m))
                                    (prod m)))))))
             (loop while (/= h n) do
                   (incf shift h)
                   (setf h (ash n  (- log2n)))
                   (decf log2n)
                   (setf len high)
                   (setf high (if (oddp h) h (1- h)))
                   (setf len (ash (- high len) -1))
                   (cond ((> len 0)
                          (setf p (* p (prod len)))
                          (setf r (* r p)))))
             (ash r shift))))))

But it still doesn't look very Lisp-like. I didn't understand the
algorithm, so I can't write a better implementation :-)

-- 
Frank Buss, ··@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
From: Peter Luschny
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <e12qh7$noh$1@online.de>
Frank Buss wrote:

 > you need LABELS, because PROD is a recursive function.
 > (defun sprfact (n)

Thank you, Frank. As far as I can judge this runs fine!

 > But it still does not look very Lisp-like.

Well, I have been told Lisp is hard, isn't it? ;-)

 > I didn't understand the
 > algorithm, so I can't write a better implementation :-)

Look at the trace I gave in my other posting.

I am now learning Lisp since 32 hours. However, I do not
think this qualifies me to suggest an implementation
to this group ;-)

Regards Peter
From: Richard Fateman
Subject: Re: factorial of largish numbers / more stylish Lisp
Date: 
Message-ID: <ytQZf.504$Lm5.295@newssvr12.news.prodigy.com>
Peter, Frank  (and others who seem to have been caught by this
problem)...

Here is a more lisp-like version of the split-recursive factorial

(defun fact (n)
  (let ((shift 0))
    (labels
 ((kg1 (n m)
    (cond ((and (evenp n)(> m 1))
    (incf shift (ash n -1))
    (kg1 (ash n -1) (ash m -1)))
   ((<= n m) n)
   (t (* (kg1 n (ash m 1))
         (kg1 (- n m)(ash m 1)))))))
      (ash (kg1 n 1) shift)  )))

 It runs at the same speed, pretty much, as the unrolled program posted 
earlier.  I recommend that
 you set the optimization flags to (speed 3)(safety 0)   and declare that 
n,m,shift  are fixnums.

To make it run competitively in Allegro CL, I changed it to use the GMP 
library,
but then had to change it more so it used a small collection of preallocated 
GMP numbers
instead of madly allocating small GMP numbers just to toss them out after 
one or
two arithmetic operations. A lisp system using GMP "natively" might be able 
to
use the above program.  Discussion at 
http://www.cs.berkeley.edu/~fateman/papers/factorial.pdf
has been updated.
Thanx
RJF




"Peter Luschny" <·········@luschny.de> wrote in message 
·················@online.de...
> Frank Buss wrote:
>
> > But it still does not look very Lisp-like.
>
> Well, I have been told Lisp is hard, isn't it? ;-)
> 
From: Peter Luschny
Subject: Re: factorial of largish numbers / more stylish Lisp
Date: 
Message-ID: <e192ef$puo$1@online.de>
Richard Fateman wrote:

 > Peter, Frank  (and others who seem to have been caught by this
 > problem)...

Thanks Richard.

I contribute the secret formula for the binary splitting factorial.

I also wrote a new implementation in a C-like language, more Lisp-
oriented (at least I hope so), no more 'while's necessary ;-)

http://www.luschny.de/math/factorial/binarysplitfact.html

Regards Peter
From: Peter Luschny
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <e11i1e$413$1@online.de>
 > Richard J. Fateman schrieb:
>> Peter Luschny wrote:

>> Perhaps someone in this group could give me a
>> helping hand and show me the Lisp implementation of the algorithm
>> which I like best? 

> here it is in lisp

Superbe! Merci beaucoup!
From: Alexander Schmolck
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <yfs3bgr62vi.fsf@oc.ex.ac.uk>
Peter Luschny <·········@luschny.de> writes:

>  > Richard J. Fateman schrieb:
> >> Peter Luschny wrote:
> 
> >> Perhaps someone in this group could give me a
> >> helping hand and show me the Lisp implementation of the algorithm
> >> which I like best?
> 
> 
> > here it is in lisp

I think you will also need something like (untested)

(defacro while (condition &body body)
  `(loop while ,condition do (progn ,@body)))


'as
From: Alexander Schmolck
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <yfsy7yj4o7w.fsf@oc.ex.ac.uk>
Alexander Schmolck <··········@gmail.com> writes:

> I think you will also need something like (untested)
> 
> (defacro while (condition &body body)
(defmacro
From: Pascal Bourguignon
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <87vetnbsnv.fsf@thalassa.informatimago.com>
"Richard J. Fateman" <·······@eecs.berkeley.edu> writes:

> I invite comments on a paper posted at
> http://www.cs.berkeley.edu/~fateman/papers/factorial.pdf
>
> on computing factorial  of, say, 20,000 or more.
> Programs in lisp and some comparison with Maple and
> Mathematica. are included.
>
> I've used (Allegro 7.0) Lisp + GMP for best times, but maybe
> you have other ideas.
> Thanks
> RJF

I would write fact5 as:

(defun fact5 (x)
  (if (typep x '(integer 0 12))
      (aref #(1 1 2 6 24 120 720 5040 40320 362880 3628800 39916800 479001600)
            x)
      (error "~S is not a valid argument to FACT5" x)))


clisp already  uses GMP.  On  my Athlon 1200  MHz, here is  the native
factorial time:

[290]> (time (defparameter *e* (ext:! 20000)))
Real time: 0.070124 sec.
Run time: 0.072005 sec.
Space: 696224 Bytes
*E*


For reference, here's the maxfact time:


[306]> (ext:gc)
1588278
[307]> (time (defparameter *e* (ext:! 20000)))
Real time: 0.069836 sec.
Run time: 0.068004 sec.
Space: 696224 Bytes
*E*
[308]> (mapcar 'compile '(maxfact cloop loopprod))
(MAXFACT CLOOP LOOPPROD)
[309]> (ext:gc)
1587970
[310]> (time (defparameter *m* (maxfact 20000)))
Real time: 0.113327 sec.
Run time: 0.112007 sec.
Space: 3504744 Bytes
GC: 2, GC time: 0.032002 sec.
*M*
[311]> (assert (= *m* *e*))
NIL
[312]> 


------------------------------------------------------------------------
;; This, taken from the pdf file:

(defun cloop (n) 
  (let* ((start (list 1)) (end start))
    (dotimes (i (1- n) (setf (cdr end) start))
      (setq start (cons (+ 2 i) start)))))

(setf *print-circle* t)

(defun loopprod (l)
 "efficient product of all the approximately equal numbers in the loop
trying to keep the sizes of inputs approximately balanced. the
circular loop l is destroyed in the process. "
 (cond ((eq (cdr l) l)(car l))
       (t (setf (car l) (* (car l) (cadr l)))
          (setf (cdr l) (cddr l)) 
          (loopprod (cdr l)))))

(defun maxfact (n) 
  (let* ((z (if (< n 100) 5 100))
         (aloop (cloop z)))
    (loop for i from (1+ z) to n
       do (setf (car aloop) (* (car aloop) i))
         (setf aloop (cdr aloop)))
    (loopprod aloop)))

;; gives:
(maxfact 2) --> 120
------------------------------------------------------------------------


Otherwise, it seems to me that in addition to the cloop, it would be
worthwhile to extract the even.  For 20000!, we'll have 10000
doubles, so we can easily avoid 10000 bits in the bignum
multiplications, and just shift the last product.
We can even avoid some more bits for the multiples of other powers of two.

So, I'd try to use a cloop of  a power of two instead of 5 or 100, and
instead of  multiplying i, I'd multiply  (ash i -2^p) to the cloop and
increment p to a counter.


Something like:

(defun cloop (n)    ; #1=( 1 1 1 ... 1 . #1# )
  (let ((result (make-list n :initial-element 1)))
    (setf (cdr (last result)) result)))


(defmacro unroll (z aloop pot i &optional (n nil n-p))
  ;; Should be a macrolet inside maxfact, 
  ;; but it's easier to debug as a defmacro.
  (let ((end (gensym)))
    `(,(if n-p 'tagbody 'progn)
       ,@(loop
            :for u :from 1 :to z        ; =i[z]
            :for p = (loop
                        :for p :from 0
                        :for x = u :then (truncate x 2)
                        :while (evenp x)
                        :finally (return p)) ; i\2^p
            :if n-p
              :collect `(when (> ,i ,n) (go ,end))
            :end
            :collect (if (plusp p)
                         `(setf (car ,aloop) (* (car ,aloop) (ash ,i ,(- p)))
                                ,pot (+ ,pot ,p))
                         `(setf (car ,aloop) (* (car ,aloop) ,i)))
            :collect `(setf ,aloop (cdr ,aloop) ,i (1+ ,i)))
       ,@(when n-p `(,end)))))


(defun maxfact (n)
  (macrolet ((compute (z)
               `(let ((aloop (cloop ,z))
                      (pot 0)
                      (m (1+ (* ,z (truncate n ,z)))))
                  ;; (print `(m = ,m))
                  (loop 
                     :with  i = 1
                     :while (< i m)
                     :do (unroll ,z aloop pot i))
                  ;;(print aloop) (print `(pot = ,pot)) (print `(i = ,i))
                  (unroll ,z aloop pot m n)
                  ;; (print aloop) (print `(pot = ,pot)) (print `(m = ,m))
                  (ash (loopprod aloop) pot))))
    (if (< n 128)
        (compute 4)
        (compute 128))))



[296]> (ext:gc)
1597184
[297]> (time (defparameter *e* (ext:! 20000)))
Real time: 0.070717 sec.
Run time: 0.072005 sec.
Space: 696224 Bytes
*E*
[298]> (mapcar 'compile '(maxfact cloop loopprod))
(MAXFACT CLOOP LOOPPROD)
[299]> (ext:gc)
1589146
[300]> (time (defparameter *m* (maxfact 20000)))
Real time: 0.081703 sec.
Run time: 0.080005 sec.
Space: 2667600 Bytes
GC: 1, GC time: 0.016001 sec.
*M*
[301]> (assert (= *m* *e*))
NIL

[316]> (dotimes (i 512) (assert (= (ext:! i) (maxfact i))))
NIL

So I reach almost the internal #+clisp ext:! time, but still have some
bignum consing that the internal ext:! is able to avoid using directly
GMP.


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

READ THIS BEFORE OPENING PACKAGE: According to certain suggested
versions of the Grand Unified Theory, the primary particles
constituting this product may decay to nothingness within the next
four hundred million years.
From: Robert Israel
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <e122tl$ite$1@nntp.itservices.ubc.ca>
In article <·············@agate.berkeley.edu>,
Richard J. Fateman <·······@eecs.berkeley.edu> wrote:
>I invite comments on a paper posted at
>http://www.cs.berkeley.edu/~fateman/papers/factorial.pdf
>
>on computing factorial  of, say, 20,000 or more.
>Programs in lisp and some comparison with Maple and
>Mathematica. are included.
>
>I've used (Allegro 7.0) Lisp + GMP for best times, but maybe
>you have other ideas.
>Thanks
>RJF

I believe that what current versions of Maple use is the GMP factorial
function mpz_fac_ui: see <http://www.swox.com/gmp/gmp-man-4.2.pdf>.  
It's quite fast: 40000! took only 0.093 seconds on my machine (Pentium 
4, 3 GHz running Maple 10.03 Classic under Windows XP).  Section 
16.7.2 of that document tells about the algorithm used.

Robert Israel                                ······@math.ubc.ca
Department of Mathematics        http://www.math.ubc.ca/~israel 
University of British Columbia            Vancouver, BC, Canada
From: Richard Fateman
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <%e1Zf.52421$2O6.12630@newssvr12.news.prodigy.com>
I looked at the algorithm in gmp,  and timed calling it directly from Lisp. 
It is
somewhat faster than running a good factorial algorithm in Lisp (There is no
reason for it to be much faster... it is still doing the same big 
multiplies, and
doesn't really have a better algorithm).

 As I recall it makes some number of buckets and tries to accumulate
products in all of them of about equal size. (Like the version in Maxima.)
This is not too bad, but presumably not as good as Peter Luschny's best,
which requires interleaving a prime number sieve with the calculation, 
similar
to what Tim Daly suggests.

I tried running the version (in Lisp)of Lucshny's posted earlier, but 
modified to run
in Lisp + GMP.  It is good, but not as good as some others, probably
because it requires creating lots of  GMPZ numbers to return from the
"prod" routine, instead of storing those numbers destructively on top
of each other.

RJF


"Robert Israel" <······@math.ubc.ca> wrote in message 
·················@nntp.itservices.ubc.ca...
> In article <·············@agate.berkeley.edu>,
> Richard J. Fateman <·······@eecs.berkeley.edu> wrote:
>>I invite comments on a paper posted at
>>http://www.cs.berkeley.edu/~fateman/papers/factorial.pdf
>>
>>on computing factorial  of, say, 20,000 or more.
>>Programs in lisp and some comparison with Maple and
>>Mathematica. are included.
>>
>>I've used (Allegro 7.0) Lisp + GMP for best times, but maybe
>>you have other ideas.
>>Thanks
>>RJF
>
> I believe that what current versions of Maple use is the GMP factorial
> function mpz_fac_ui: see <http://www.swox.com/gmp/gmp-man-4.2.pdf>.
> It's quite fast: 40000! took only 0.093 seconds on my machine (Pentium
> 4, 3 GHz running Maple 10.03 Classic under Windows XP).  Section
> 16.7.2 of that document tells about the algorithm used.
>
> Robert Israel                                ······@math.ubc.ca
> Department of Mathematics        http://www.math.ubc.ca/~israel
> University of British Columbia            Vancouver, BC, Canada
>
> 
From: Klueless
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <lI0Zf.692514$qk4.227916@bgtnsc05-news.ops.worldnet.att.net>
"Richard J. Fateman" <·······@eecs.berkeley.edu> wrote in message ··················@agate.berkeley.edu...
> http://www.cs.berkeley.edu/~fateman/papers/factorial.pdf

     There is a lot here you should look at.
     <http://www.luschny.de/math/factorial/FastFactorialFunctions.htm>
     <http://www.luschny.de/math/factorial/conclusions.html>
From: Mark Lawton
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <1144325242.836200.295070@g10g2000cwb.googlegroups.com>
Hi Richard,

Not really a serious comment - just interesting to note that the
following Maxima function computes 20,000! correctly to 9 digits:

fact(x) := (y: sum(float(log(n)),n,1,x),
  y: float(y/log(10)),
    print(10^(y-fix(y)), "e", fix(y)),"")$

If I set floating point precision to 100, and rewrite the function as a
bigfloat, it computes it correctly to 92 digits:

fact(x) := (y: sum(bfloat(log(n)),n,1,x),
  y: bfloat(y/log(10)),
    print(10^(y-fix(y)), "e", fix(y)),"")$

So if you don't need your factorial's full precision, you can use this
method. I remembered this trick of obtaining factorials by summing the
logs from the early 1980s, when the only tool I had was my TI-58C
programmable calculator.

Richard J. Fateman wrote:
> I invite comments on a paper posted at
> http://www.cs.berkeley.edu/~fateman/papers/factorial.pdf
>
> on computing factorial  of, say, 20,000 or more.
> Programs in lisp and some comparison with Maple and
> Mathematica. are included.
>
> I've used (Allegro 7.0) Lisp + GMP for best times, but maybe
> you have other ideas.
> Thanks
> RJF
From: Mark Lawton
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <1144326661.400497.215160@i39g2000cwa.googlegroups.com>
I missed the obvious point that there is no need to use logs when using
a CAS - sorry.

fact(x) := (y: 1,
  for i: 2 thru x do
    y: bfloat(y * i),
y)$

Using the above Maxima function with floating point precision set to
100, the result of 20,000! is produced in 0.5 seconds (circa 2 ghz PC)
and all 100 digits produced are correct.

Mark Lawton wrote:
> Hi Richard,
>
> Not really a serious comment - just interesting to note that the
> following Maxima function computes 20,000! correctly to 9 digits:
>
> fact(x) := (y: sum(float(log(n)),n,1,x),
>   y: float(y/log(10)),
>     print(10^(y-fix(y)), "e", fix(y)),"")$
>
> If I set floating point precision to 100, and rewrite the function as a
> bigfloat, it computes it correctly to 92 digits:
>
> fact(x) := (y: sum(bfloat(log(n)),n,1,x),
>   y: bfloat(y/log(10)),
>     print(10^(y-fix(y)), "e", fix(y)),"")$
>
> So if you don't need your factorial's full precision, you can use this
> method. I remembered this trick of obtaining factorials by summing the
> logs from the early 1980s, when the only tool I had was my TI-58C
> programmable calculator.
From: Richard Fateman
Subject: Re: factorial of largish numbers
Date: 
Message-ID: <PAcZf.67072$dW3.29858@newssvr21.news.prodigy.com>
Thanks for all the comments!

The sparse recursive factorial algorithm as Peter L wrote in Java
does a lot of allocation of numbers (as bignums) when they
are really just small numbers.  A more efficient way of doing
could be to allocate two bignums for p and r and overwrite them. The
"API" offered by GMP allows you to do this overwriting, and the Lisp
version I provide allows access to these destructive versions.
(The Python implementation, according to at least one note I
found by Googling, did not provide such access.)  Anyway, here
is a Lisp version for sparse recursive, with a modification of
the "prod" routine to take another argument, ans.  If the old prod(n)
computed a value C(n), the new gprod(n,ans) produces a result
like ans:=C(n)*ans, destroying the old value of ans.
The ordering and size of multiplications is perhaps not as good
in this re-working.   I believe it is just an unrolling of the recurrence
f(n,m):=f(n,2*m)*f(n-m,2*m).

This sprfact is not quite as fast as the "k" program, presumably because
it doesn't do as good a job of keeping numbers equally small, and 
accumulating
their product "gently". It uses almost no storage except inside the GMP 
allocator.

This is the program (it won't run unless you have my gmp library). If
you are familiar with GMP, the programs mpz_.... are named the same as
the GMP entries. I've also changed the naming and the while, and labels to 
be
fully ANSI common lisp. I've also prefixed the fixnum operators with
cl::   just in case this might make it compile better. I've overloaded +, *, 
etc
with GMP generic functions.


(defun gsprfact(n);;works.  in :gmp package
  (declare (fixnum n) (optimize (speed 3)(safety 0)) )
  (let  ((p (into 1)) (r (into 1)) ;initialize by converting 1 into GMPZ 
numbers.
  (NN 1) (log2n (floor (log n 2)))
  (h 0) (shift 0) (high 1) (len 0))
    (declare (fixnum log2n h shift high len NN))
    (labels ((gprod(n ans)
     (declare (fixnum n NN))
     (let ((m (ash n -1)))
       (declare (fixnum m))
       (cond ((cl::= m 0)
       (mpz_mul_ui ans ans
     (cl::incf NN 2)))
      ((cl::= n 2)
       (mpz_mul_ui ans ans
     (cl::* (cl::incf NN 2)
            (cl::incf NN 2))))
      (t  (gprod (cl::- n m) (gprod m ans))))
       ans)))
      (cond ((< n 0)(error "bad arg ~s to factorial" n))
     ((< n 2) (gmp::into 1))
     (t
      (loop while (cl::/= h n) do
     (cl::incf shift h)
     (setq h (ash n (cl::- log2n)))
     (cl::decf log2n)
     (setq len high)
     (setq high (if (oddp h) h (cl::1- h)))
     (setq len (ash (cl::- high len) -1))
     (cond ((cl::> len 0)
     (gprod len (gmpz-z p));change p
     (mpz_mul (gmpz-z r)(gmpz-z r)(gmpz-z p)) )))
      (mpz_mul_2exp (gmpz-z r) (gmpz-z r) shift);; arithmetic shift
      r)))))

Here's another version, ANSI CL, but less hacking.

(defun sprfact(n)
  (declare (fixnum n))
  (let  ((p 1) (r 1) (NN 1) (log2n (floor (log n 2)))
  (h 0) (shift 0) (high 1) (len 0))
    (declare (fixnum log2n h shift high len NN))
    (labels ((prod(n)
        (declare (fixnum n))
  (let ((m (ash n -1)))
    (cond ((= m 0) (incf NN 2))
   ((= n 2) (* (incf NN 2)(incf NN 2)))
   (t  (* (prod (- n m)) (prod m)))))))

      (cond ((< n 0)(error "bad arg ~s to factorial" n))
     ((< n 2) 1)
     (t
      (loop while (/= h n) do
     (incf shift h)
     (setf h (ash n (- log2n)))
     (decf log2n)
     (setf len high)
     (setf high (if (oddp h) h (1- h)))
     (setf len (ash (- high len) -1))
     (cond ((> len 0)
     (setf p (* p (prod len)))
     (setf r (* r p)))))
      (ash r shift))))))