Ratio calculations speed

From: Vladimir Nesterovsky
Subject: Ratio calculations speed
Date: Fri, 28 Dec 2001 10:34:32 +0000
Message-ID: <tcio2ucso5eace5dc7cnbevpc2nqukgofh@4ax.com>

I observed great differences in ratio calculations speed between
various Lisp implementations. I was very surprised to find
CLISP perform _tens times_ faster then AllegroCL (under Win98)
in some cases.

I played with some code to calculate more then 16 digits of the 
number pi. The simplest series for pi/4 is 1 - 1/3 + 1/5 - 1/7 ... 
with Euler transformation applied on its partial sums series 
several times (as per SICP).

Of course denominators will grow very fast if we'd just
perform calculations in a straightforward manner which would
make them very slow, so I tried to "adjust" the ratio to having 
no more precision than I need (say 1000 digits), which greatly
improved the speed and made it possible to calculate much more
digits of pi in the same amount of time:

(defun adjust-ratio (r n)
  "adjust ratio's precision to n decimal places"
  (let ((base (expt 10 n)))
    (multiple-value-bind (quot rem)
                         (round r)
      (if (< (denominator rem) base)
        r                         
        (/ (round (* r base)) base)))))

I found that for bigger number of digits the overall time needed in 
Allegro could be _tens_ times that of CLISP, which seems strange. 
Did I do something wrong or some type declarations can change the 
picture, or is it just that CLISP is that much better in ratio 
arithmetics? Any comments are much welcomed.

Here's the code, if someone's interested:

;; sample calls (timings on PIII/550/Win98):
;;   make 20 euler transforms, pre-generate 41 terms 
;;   (2n+1 by default because each transform shortens the list by two)
;; 	(find-pi 20)     ; 0.3 sec CLISP // 5 sec ACL 
;;   make 20 euler transforms, pre-generate 100 terms
;; 	(find-pi 20 100) ; 0.3 sec CLISP // 6 sec ACL
;;   make 20 euler transforms, pre-generate 200 terms
;; 	(find-pi 20 200) ; 0.3 sec // 7 sec
;;   50 transforms, 101 pre-generated terms
;; 	(find-pi 50)     ; 2 sec // 37 sec (35 compiled) 
;;   1000 pre-generated (working only on 101 last) terms for 120 digits
;; 	(find-pi 50 1000 120)     ; 2 sec // 84 sec (80 compiled)
;;   10,000 pre-generated terms for better precision for 150 digits
;; 	(find-pi 50 10000 150)    ; 4.5 sec // ??
;;   find 1040 digits of the number pi
;; 	(find-pi 120 150000 1039) ; 6.5 min, CLISP // ?? ACL


(defvar *report-digits* 55)        ; digits to report
(defvar *interim-precision* 95)    ; decimal places of interim precision
(defvar *supress-interim-printout* nil)
(defvar *supress-interim-dot-printout* nil)

(defun adjust-ratio (r n)
  "adjust ratio's precision to n decimal places"
  (let ((base (expt 10 n)))
    (multiple-value-bind (quot rem)
                         (round r)
      (if (< (denominator rem) base)
        r                         
        (/ (round (* r base)) base)))))

;; a series for pi/4 = 
;;   = 1 - 1/3 + 1/5 - 1/7 + 1/9 - 1/11 + ....
(defun s-pi/4 ( m &optional (len-from-end m))
  "generate up to m partial sums for pi series"
  (do* ((i0 (- m len-from-end))
        (i  0   (+ i 1))
        (s  0   (adjust-ratio (+ s xi) *interim-precision*))
        (j  1   (- j))
        (n  1   (+ n 2))
        (xi 1   (adjust-ratio (/ j n) *interim-precision*))
        (sums ()  (if (> i i0) (cons s sums))))
       ((>= i m)  (format t " --DONE") (reverse sums))
    (if (not *supress-interim-dot-printout*)
      (if (and (zerop (rem i 1000)) (> i 0))
        (format t ".")))))

;; the Euler transformation of a partial sums series of alternating series
;; A[n] = S[n+1] - (S[n+1]-S[n])^2/(S[n-1]-2S[n]+S[n+1])
;;   returns (cons 'div-zero seq) on 0/0 (to stop further transformations)
(defun euler-trans (seq) 
  "perform a euler transformation on a sequence"
  (let ((a1 (car seq))
        (a2 (cadr seq)))
    (mapcar
      #'(lambda(a3)
         (let ((den (+ a1 (* -2 a2) a3))) 
           (if (zerop den) ; sould I use throw/catch here?
             (return-from euler-trans (cons 'div-zero seq)))
           (let ((r (- a3 (/ (sqr (- a3 a2)) den))))
             (psetq a1 a2
                    a2 a3)
             (adjust-ratio r *interim-precision*))))
      (cddr seq))))

;; main function to call
(defun find-pi (n &optional
                (m (+ (* 2 n) 1))
                (dig *report-digits*)
                (prec (+ dig 40))
                (sup *supress-interim-printout*))
  "(find-pi n [m d p]) to find d digits of the pi value by performing n euler
   transformations on m terms of pi series with p digits of interim precision."

  (let ((*report-digits* dig)
        (*interim-precision* prec)
        (*supress-interim-printout* sup))
    (if (< m (+ (* 2 n) 1))
      (setq m (+ (* 2 n) 1)))
    (format t "~&Generating ~A terms of pi/4 series " m)
    (do ((s   (s-pi/4 m (+ (* 2 n) 1))
                  (euler-trans s))
         (i   1   (+ i 1)))
        ((or (> i n) (eq 'div-zero (car s)))
         (format t "~& After ~A Euler transforms:~%" (- i 1))
         (n-digits (* 4 (car (last s)))
                   *report-digits*))
      (if (not *supress-interim-printout*)
        (format t "~&~A"
          (last-n-digits
            (n-digits (* 4 (car (last s)))
                      *report-digits*) 
            79))))))

(defun n-digits (v n)
  "return v with n digits after decimal point as integer"
  (multiple-value-bind (a b)
                       (round (* v (expt 10 n)))
    a))

(defun last-n-digits (v n)
  "return last n digits of long integer as integer"
  (multiple-value-bind (a b)
                       (floor v (expt 10 n))
    b))

(defun sqr(x)
  "find the square of number"
  (* x x))


#|
(find-pi 20 100) ; 55 digits
31415926535897932384626433832795028841971693993751058210
(find-pi 20 1000 78) ; 78 digits, 1000 terms
3141592653589793238462643383279502884197169399375105820974944592307816406286209
(find-pi 90 0 78 81) ; 181 terms --
       ; like SICP's "tableau" of streams --
       ; all in all slower because it produces bigger interim lists to be able
       ; to run much more transforms (although it exits in the middle on 0/0).
3141592653589793238462643383279502884197169399375105820974944592307816406286209
|#
---
Vlad   http://vnestr.tripod.com/

Re: Ratio calculations speed Kevin Rosenberg
Re: Ratio calculations speed Raymond Toy
- Re: Ratio calculations speed Vladimir Nesterovsky

From: Kevin Rosenberg
Subject: Re: Ratio calculations speed
Date: Fri, 28 Dec 2001 16:41:13 +0000
Message-ID: <slrna2p7qg.3jh.kevin@pal.med-info.com>

On Fri, 28 Dec 2001 12:34:32 +0200, 
Vladimir Nesterovsky <······@netvision.net.il> wrote:
>I observed great differences in ratio calculations speed between
>various Lisp implementations. I was very surprised to find
>CLISP perform _tens times_ faster then AllegroCL (under Win98)
>in some cases.
>[...]
>;;   10,000 pre-generated terms for better precision for 150 digits
>;; 	(find-pi 50 10000 150)    ; 4.5 sec // ??

Another data point: I get 71x faster for interpreted CLISP
vs. compiled ACL.  There's also a 67x difference in memory utilized.
It seems that Franz has significant potential for improvement
in this area.

Kevin


On a dual Athlon MP1800+ 2GB with Linux:
;; CLISP Interpreted
(find-pi 50 10000 150) 
  Real time: 1.199598 sec.          
  Run time: 1.12 sec.
  Space: 27196808 Bytes
  GC: 52, GC time: 0.06 sec.

;; ACL 6.1 compiled (speed 3) (compilation-speed 0) (safety 0)
(find-pi 50 10000 150) 
; cpu time (non-gc) 53,720 msec user, 560 msec system
; cpu time (gc)     19,000 msec user, 60 msec system
; cpu time (total)  72,720 msec (00:01:12.720) user, 620 msec system
; real time  79,629 msec (00:01:19.629)
; space allocation:
;  2,367 cons cells, 1,838,283,424 other bytes, 0 static bytes

From: Raymond Toy
Subject: Re: Ratio calculations speed
Date: Sat, 29 Dec 2001 20:41:45 +0000
Message-ID: <4ng05tkbcm.fsf@rtp.ericsson.se>

>>>>> "Vladimir" == Vladimir Nesterovsky <······@netvision.net.il> writes:

    Vladimir> I observed great differences in ratio calculations speed between
    Vladimir> various Lisp implementations. I was very surprised to find
    Vladimir> CLISP perform _tens times_ faster then AllegroCL (under Win98)
    Vladimir> in some cases.

It has been my experience that Clisp has VERY good bignum (and ratio)
arithmetics.  

    Vladimir> ;;   10,000 pre-generated terms for better precision for 150 digits
    Vladimir> ;; 	(find-pi 50 10000 150)    ; 4.5 sec // ??

On a 300 MHz Ultra 30 (Ultrasparc IIi), Clisp gives 25.7 sec run time.
CMUCL gives 59.86 sec for a ratio of just 2.3 times slower.  Not bad,
considering that all of CMUCL's bignum and ratio arithmetic is written
in Lisp.  (Clisp has its arithmetic in C.)  And CMUCL uses the simple
pencil-and-paper/schoolbook multiplication and division algorithms
whereas Clisp can use much faster methods (Karatsuba, maybe FFT), if
the numbers are large enough.

Ray

From: Vladimir Nesterovsky
Subject: Re: Ratio calculations speed
Date: Mon, 31 Dec 2001 02:27:29 +0000
Message-ID: <soiv2uk96i3biufamoaatihdn8v051dckk@4ax.com>

Thanks for the information and additional testing to all. :-)

On 29 Dec 2001 15:41:45 -0500, Raymond Toy <···@rtp.ericsson.se> wrote:

>>>>>> "Vladimir" == Vladimir Nesterovsky <······@netvision.net.il> writes:
>
>    Vladimir> I observed great differences in ratio calculations speed between
>    Vladimir> various Lisp implementations. I was very surprised to find
>    Vladimir> CLISP perform _tens times_ faster then AllegroCL (under Win98)
>    Vladimir> in some cases.
>
>It has been my experience that Clisp has VERY good bignum (and ratio)
>arithmetics.  
>
>    Vladimir> ;;   10,000 pre-generated terms for better precision for 150 digits
>    Vladimir> ;; 	(find-pi 50 10000 150)    ; 4.5 sec // ??
>
>On a 300 MHz Ultra 30 (Ultrasparc IIi), Clisp gives 25.7 sec run time.
>CMUCL gives 59.86 sec for a ratio of just 2.3 times slower.  Not bad,
>considering that all of CMUCL's bignum and ratio arithmetic is written
>in Lisp.  (Clisp has its arithmetic in C.)  And CMUCL uses the simple
>pencil-and-paper/schoolbook multiplication and division algorithms
>whereas Clisp can use much faster methods (Karatsuba, maybe FFT), if
>the numbers are large enough.
>
>Ray

Maybe it's also how the bignums storage is implemented that affects the 
efficiency. I haven't checked this thoroughly but it seems to "hit the wall" 
for certain length of denominator (== *interim-precision* in my code). 
Allegro may do this as soon as 150 digits, CLISP is fine up until 1000 and 
more. 

---
Vlad   http://vnestr.tripod.com/