Date: 5 Jun 91 13:32:11 GMT
From: ······@tansei.cc.u-tokyo.ac.jp (Akira Kurihara)
I am interested in the speed of a Common Lisp function isqrt
for extremely big bignums.
I added the results of running the test on a Symbolics machine. The
results for the builtin ISQRT are surprising. While I normally run
screaming from numeric code, I had a look at the source, and it uses one
of those clever bit-twiddling tricks that appears to calculate the
result in O( (INTEGER-LENGTH n) ) time. One of the "numerical recipes"
books probably has an explanation of it.
I compared three functions:
fast-isqrt-mid1.....given below
fast-isqrt-rec......given below
isqrt...............a Common Lisp function
These functions should return the same value.
I tried it on the following three environments:
MACL.........Macintosh Allegro Common Lisp 1.3.2 on Mac SE
accelerated with 25 MHz 68020
lucid........Lucid Sun Common Lisp 4 on SPARCstation IPC
akcl.........Austin Kyoto Common Lisp 1.530 on SPARCstation IPC
XL1200.......Symbolics XL1200 running Genera 8.0.2
I tried to calculate isqrt of:
(* 2 (expt 10 2000)).....to get 1000 digits of square root of 2
(* 2 (expt 10 6000)).....to get 3000 digits of square root of 2
(* 2 (expt 10 20000))....to get 10000 digits of square root of 2
Here is the result.
*** For 1000 digits ***
fast-isqrt-mid1 fast-isqrt-rec isqrt
MACL 1.250 sec 0.317 sec 174 sec
lucid 2.72 sec 0.63 sec 11.91 sec
akcl 2.400 sec 0.567 sec 2.483 sec
XL1200 0.757 sec 0.178 sec 0.00122 sec
*** For 3000 digits ***
fast-isqrt-mid1 fast-isqrt-rec isqrt
MACL 12.233 sec 2.450 sec 4503 sec
lucid 27.68 sec 5.45 sec 102.12 sec
akcl 24.567 sec 5.100 sec 24.567 sec
XL1200 7.257 sec 1.484 sec 0.00450 sec
*** For 10000 digits ***
fast-isqrt-mid1 fast-isqrt-rec isqrt
MACL 142 sec 35 sec very long
lucid 327.75 sec 80.31 sec 1126.10 sec
akcl 273.917 sec 64.367 sec 275.783 sec
XL1200 84.780 sec 20.910 sec 0.0120 sec
Looking at this table,
(1) 25 MHz 68020 Macintosh can be as twice as faster than
SPARCstation IPC as far as it is concerned with division of bignums
by bignums.
I have found that MCL (nee MACL) is a really nice second-generation
Common Lisp. In most of the cases I have tested, it does better than
the UNIX-based Lisps. Plus, you can get a decent development platform
for under $10K. The ISQRT implementation is evidently a disappointment,
though.
--David Gadbois
Here are the results I got on a DECsystem 3100 using Lucid Common Lisp.
By the way, when I compiled the code in production mode,
the isqrt time went to zero. I assume that the compiler "optimized" the
calculation out of existence, presumably since the value was not used.
(This probably also explains the Syumbolics results reported in close to
zero time.)
;;; Lucid Common Lisp/DECsystem
;;; Development Environment Version 4.0, 12 November 1990
;;; Copyright (C) 1985, 1986, 1987, 1988, 1989, 1990 by Lucid, Inc.
;;; All Rights Reserved
;;;
;;; This software product contains confidential and trade secret information
;;; belonging to Lucid, Inc. It may not be copied for any reason other than
;;; for archival and backup purposes.
;;;
;;; Lucid Common Lisp is a trademark of Lucid, Inc.
;;; DECsystem is a trademark of Digital Equipment Corp.
Lisp> (isqrt-time-comparison (* 2 (expt 10 2000)) 1 1)
"do nothing"
Elapsed Real Time = 0.03 seconds
Total Run Time = 0.00 seconds
User Run Time = 0.00 seconds
System Run Time = 0.00 seconds
Process Page Faults = 1
Dynamic Bytes Consed = 0
Ephemeral Bytes Consed = 840
"fast-isqrt-mid1"
Elapsed Real Time = 2.46 seconds
Total Run Time = 2.36 seconds
User Run Time = 2.34 seconds
System Run Time = 0.02 seconds
Process Page Faults = 2
Dynamic Bytes Consed = 0
Ephemeral Bytes Consed = 21,616
"fast-isqrt-rec"
Elapsed Real Time = 0.59 seconds
Total Run Time = 0.55 seconds
User Run Time = 0.55 seconds
System Run Time = 0.00 seconds
Process Page Faults = 1
Dynamic Bytes Consed = 0
Ephemeral Bytes Consed = 10,256
"isqrt"
Elapsed Real Time = 12.18 seconds
Total Run Time = 11.68 seconds
User Run Time = 11.54 seconds
System Run Time = 0.14 seconds
Process Page Faults = 57
Dynamic Bytes Consed = 0
Ephemeral Bytes Consed = 10,502,096
There were 20 ephemeral GCs
NIL
Lisp> (isqrt-time-comparison (* 2 (expt 10 6000)) 1 1)
"do nothing"
Elapsed Real Time = 0.00 seconds
Total Run Time = 0.00 seconds
User Run Time = 0.00 seconds
System Run Time = 0.00 seconds
Process Page Faults = 1
Dynamic Bytes Consed = 0
Ephemeral Bytes Consed = 2,496
"fast-isqrt-mid1"
Elapsed Real Time = 24.14 seconds
Total Run Time = 23.91 seconds
User Run Time = 23.87 seconds
System Run Time = 0.04 seconds
Process Page Faults = 1
Dynamic Bytes Consed = 0
Ephemeral Bytes Consed = 74,088
"fast-isqrt-rec"
Elapsed Real Time = 4.77 seconds
Total Run Time = 4.72 seconds
User Run Time = 4.71 seconds
System Run Time = 0.01 seconds
Process Page Faults = 0
Dynamic Bytes Consed = 0
Ephemeral Bytes Consed = 30,872
"isqrt"
Elapsed Real Time = 97.92 seconds (1 minute, 37.92 seconds)
Total Run Time = 96.18 seconds (1 minute, 36.18 seconds)
User Run Time = 95.85 seconds (1 minute, 35.85 seconds)
System Run Time = 0.33 seconds
Process Page Faults = 105
Dynamic Bytes Consed = 0
Ephemeral Bytes Consed = 93,533,248
There were 179 ephemeral GCs
NIL
Lisp> (isqrt-time-comparison (* 2 (expt 10 20000)) 1 1)
"do nothing"
Elapsed Real Time = 0.00 seconds
Total Run Time = 0.00 seconds
User Run Time = 0.00 seconds
System Run Time = 0.00 seconds
Process Page Faults = 0
Dynamic Bytes Consed = 0
Ephemeral Bytes Consed = 8,312
"fast-isqrt-mid1"
Elapsed Real Time = 291.34 seconds (4 minutes, 51.34 seconds)
Total Run Time = 285.04 seconds (4 minutes, 45.04 seconds)
User Run Time = 283.89 seconds (4 minutes, 43.89 seconds)
System Run Time = 1.16 seconds
Process Page Faults = 22
Dynamic Bytes Consed = 0
Ephemeral Bytes Consed = 262,072
There was 1 ephemeral GC
"fast-isqrt-rec"
Elapsed Real Time = 70.17 seconds (1 minute, 10.17 seconds)
Total Run Time = 69.63 seconds (1 minute, 9.63 seconds)
User Run Time = 69.54 seconds (1 minute, 9.54 seconds)
System Run Time = 0.09 seconds
Process Page Faults = 3
Dynamic Bytes Consed = 0
Ephemeral Bytes Consed = 113,392
"isqrt"
Elapsed Real Time = 1068.20 seconds (17 minutes, 48.20 seconds)
Total Run Time = 1055.11 seconds (17 minutes, 35.11 seconds)
User Run Time = 1052.86 seconds (17 minutes, 32.86 seconds)
System Run Time = 2.25 seconds
Process Page Faults = 381
Dynamic Bytes Consed = 223,568
Ephemeral Bytes Consed = 1,035,828,048
There were 1989 ephemeral GCs
NIL
Lisp>
In article <························@KISIN.ACA.MCC.COM>, David Gadbois writes:
> Date: 5 Jun 91 13:32:11 GMT
> From: ······@tansei.cc.u-tokyo.ac.jp (Akira Kurihara)
>
> I am interested in the speed of a Common Lisp function isqrt
> for extremely big bignums.
>
> I added the results of running the test on a Symbolics machine. The
> results for the builtin ISQRT are surprising. While I normally run
> screaming from numeric code, I had a look at the source, and it uses one
> of those clever bit-twiddling tricks that appears to calculate the
> result in O( (INTEGER-LENGTH n) ) time. One of the "numerical recipes"
> books probably has an explanation of it.
Actually, one of the compiler books probably has an explanation of it!
I did the same computations (but using Genera 8.0.1, in case it matters)
and got essentially the same results as you did.
I mailed the results directly to Akira -- with one change:
Apparently the symbolics knows that isqrt has no `real' side-effects and
optimizes it out -- the disassembled code reveals no call to isqrt --
and I doubt that isqrt is stashed in microcode!
To trick the compiler, I changed each timing body to
(setq result n)
(setq result (fast-isqrt-mid1 n))
...
(setq result (isqrt n))
and redid the tests. Everything was the same except the isqrt timings
which now are essentially the same as fast-isqrt-mid1 (which _is_ "one of
those clever bit-twiddling tricks" !).
bruce
······@cam.nist.gov
In article <··········@ARTEMIS.cam.nist.gov>, ······@FS1.cam.nist.gov
(Bruce R. Miller) writes:
|>
|> In article <························@KISIN.ACA.MCC.COM>, David
Gadbois writes:
|> > Date: 5 Jun 91 13:32:11 GMT
|> > From: ······@tansei.cc.u-tokyo.ac.jp (Akira Kurihara)
|> >
|> > I am interested in the speed of a Common Lisp function isqrt
|> > for extremely big bignums.
|> >
|> > I added the results of running the test on a Symbolics machine. The
|> > results for the builtin ISQRT are surprising. [...]
|>
|> [...]
|> Apparently the symbolics knows that isqrt has no `real' side-effects and
|> optimizes it out [...]
I'm relieved to hear this: I spent a few hours finding a method that runs
about twice as fast as fast-isqrt-rec and was disappointed that it was
rendered worse than obsolete.
I have a [lovely?, no!] proof, which unfortunately cannot fit in
this margin :-), for this algorithm. Essentially it works by noting that
this every recursive call of fast-isqrt-rec does two or three iterations
and that the last iteration is never useful (it only verifies the answer).
The three iteration case needs to be checked for but this only requires
a multiply of numbers only half as long.
For allegro on a DS3100, I get the comparison:
(I don't have the figures for the other versions, I sent them to Akira already)
fast-isqrt-rec faster-isqrt-rec
1000 .884 .517
3000 7.30 4.02
10000 107. 43.8
Here's the function. throw stones at it and make it break.
(Who knows, maybe it has subtle bugs?)
(It uses a lot of code from the old fast-isqrt-rec)
---------------------------------------------------------------------
(defun faster-isqrt-rec (n &aux n-len-quarter n-half n-half-isqrt
init-value q r iterated-value)
"argument n must be a non-negative integer"
(cond
((> n 24) ; theoretically (> n 7) ,i.e., n-len-quarter > 0
(setq n-len-quarter (ash (integer-length n) -2))
(setq n-half (ash n (- (ash n-len-quarter 1))))
(setq n-half-isqrt (faster-isqrt-rec n-half))
(setq init-value (ash (1+ n-half-isqrt) n-len-quarter))
(multiple-value-setq (q r) (floor n init-value))
(setq iterated-value (ash (+ init-value q) -1))
(if (eq (logbitp 0 q) (logbitp 0 init-value)) ; same sign
;; average is exact and we need to test the result
(let ((m (- iterated-value init-value)))
(if (> (* m m) r)
(- iterated-value 1)
iterated-value))
;; average was not exact, we take value
iterated-value))
((> n 15) 4)
((> n 8) 3)
((> n 3) 2)
((> n 0) 1)
((> n -1) 0)
(t nil)))
-------------------------------------------------------------------
John Boyland
·······@sequoia.berkeley.edu