I need to speed up the following function:
(defun double-float-escape-fn (c-real c-imag)
"Return iterations for c-real + c-imag i as floating-point numbers"
(declare (type double-float c-real c-imag)
(optimize (speed 3) (safety 0) (space 0)))
(loop for iterations below *max-iterations*
for period from 1
for z-imag = 0.0D0 then (+ (* 2.0D0 z-real z-imag)
c-imag)
for z-real = 0.0D0 then (+ (- z-real-squared
z-imag-squared) c-real)
for z-real-squared = 0.0D0 then (* z-real z-real)
for z-imag-squared = 0.0D0 then (* z-imag z-imag)
with a-imag = 0.0D0
with a-real = 0.0D0
with check = 3
unless (< (+ z-real-squared z-imag-squared) 4.0D0) return (the
fixnum iterations)
when (and (= a-imag z-imag) (= a-real z-real) (> iterations 1))
return (the fixnum *max-iterations*)
when (> period check) do (setf period 1 check (* 2 check) a-imag
z-imag a-real z-real)
finally (return (the fixnum iterations))))
The function calculates the escape value for the Mandelbrot set. I
use the returned value to decide what color to paint a pixel. I am
considering doing this function in assembler and using FFI. The only
problem with that is that it locks me into an architecture and
implementation. So before I go that route, I want to see what else
can be done in Lisp land.
Here is a sample timing of the function from SBCL on OS X running on a
PowerBook G4 866Mhz:
* (svref *real-array* 260)
-0.10488076384628028d0
* (svref *imag-array* 145)
0.9256099657646029d0
* *max-iterations*
2858197
* (time (double-float-escape-fn (svref *real-array* 260) (svref
*imag-array* 145)))
Evaluation took:
3.51 seconds of real time
2.35 seconds of user run time
1.07 seconds of system run time
0 page faults and
204,646,208 bytes consed.
2858197
This was one pixel in a 320x240 image.
Thanks.
--
An ideal world is left as an excercise to the reader.
--- Paul Graham, On Lisp 8.1
compile and disassemble are your friends.
I didn't benchmark, but adding types to all loop variables
as follows made Allegro 7.0beta compile all numeric operations
inline.
(defun double-float-escape-fn (c-real c-imag)
"Return iterations for c-real + c-imag i as floating-point numbers"
(declare (type double-float c-real c-imag)
(optimize (speed 3) (safety 0) (space 0)))
(loop for iterations of-type fixnum below *max-iterations*
for period of-type fixnum from 1
for z-imag of-type double-float = 0.0D0 then (+ (* 2.0D0 z-real z-imag)
c-imag)
for z-real of-type double-float = 0.0D0 then (+ (- z-real-squared
z-imag-squared) c-real)
for z-real-squared of-type double-float = 0.0D0 then (* z-real z-real)
for z-imag-squared of-type double-float = 0.0D0 then (* z-imag z-imag)
with a-imag of-type double-float = 0.0D0
with a-real of-type double-float = 0.0D0
with check of-type fixnum = 3
unless (< (+ z-real-squared z-imag-squared) 4.0D0) return (the
fixnum iterations)
when (and (= a-imag z-imag) (= a-real z-real) (> iterations 1))
return (the fixnum *max-iterations*)
when (> period check) do (setf period 1 check (* 2 check) a-imag
z-imag a-real z-real)
finally (return (the fixnum iterations))))
* David Steuber:
> I need to speed up the following function:
[snip]
Try this one with type declarations for all variables:
(defun double-float-escape-fn (c-real c-imag)
"Return iterations for c-real + c-imag i as floating-point numbers"
(declare (type double-float c-real c-imag)
(optimize (speed 3) (safety 0) (space 0)))
(loop for iterations fixnum below *max-iterations*
for period fixnum from 1
for z-imag double-float = 0.0D0 then (+ (* 2.0D0 z-real z-imag) c-imag)
for z-real double-float = 0.0D0 then (+ (- z-real-squared
z-imag-squared) c-real)
for z-real-squared double-float = 0.0D0 then (* z-real z-real)
for z-imag-squared double-float = 0.0D0 then (* z-imag z-imag)
with a-imag double-float = 0.0D0
with a-real double-float = 0.0D0
with check fixnum = 3
unless (< (+ z-real-squared z-imag-squared) 4.0D0)
return (the fixnum iterations)
when (and (= a-imag z-imag) (= a-real z-real) (> iterations 1))
return (the fixnum *max-iterations*)
when (> period check) do (setf period 1 check (* 2 check) a-imag
z-imag a-real z-real)
finally (return (the fixnum iterations))))
> Evaluation took:
> 3.51 seconds of real time
> 2.35 seconds of user run time
> 1.07 seconds of system run time
> 0 page faults and
> 204,646,208 bytes consed.
Note the last line. Over 200 MB of memory was allocated. With type
declariations for all variables this should be reduced to 0.
David Steuber <·····@david-steuber.com> writes:
> I need to speed up the following function:
>
> ...
Not quite your function (I did not understand why you do some stuff), but
much faster:
(setq *max-iterations* 2858197)
(defun double-float-escape-fn (c-real c-imag &key (max-iterations *max-iterations*))
"Return iterations for c-real + c-imag i as floating-point numbers"
(declare (type double-float c-real c-imag)
(type fixnum max-iterations)
(optimize (speed 3) (safety 0) (space 0)))
(loop for iterations of-type fixnum below max-iterations
for z-imag of-type double-float = 0.0D0 then (+ (* 2.0D0 z-real z-imag) c-imag)
for z-real of-type double-float = 0.0D0 then (+ (- z-real-squared z-imag-squared) c-real)
for z-real-squared of-type double-float = 0.0D0 then (* z-real z-real)
for z-imag-squared of-type double-float = 0.0D0 then (* z-imag z-imag)
until (>= (+ z-real-squared z-imag-squared) 4.0D0)
finally (return iterations)))
(time (double-float-escape-fn -0.10488076384628028d0 0.9256099657646029d0))
=>
; Evaluation took:
; 0.05f0 seconds of real time
; 0.06f0 seconds of user run time
; 0.0f0 seconds of system run time
; 132,102,896 CPU cycles
; 0 page faults and
; 0 bytes consed.
2858197
This paper might be useful, in case you are not already aware of it:
"Fast Floating-Point Processing in Common Lisp"
Richard J. Fateman, Kevin A. Broughan, Diane K. Willcock, Duane Rettig
I'm offline right now and I don't have a URL handy, but it should be
pretty easy to find it with Google.
Paolo
--
Why Lisp? http://alu.cliki.net/RtL%20Highlight%20Film
Recommended Common Lisp libraries/tools (Google for info on each):
- ASDF/ASDF-INSTALL: system building/installation
- CL-PPCRE: regular expressions
- UFFI: Foreign Function Interface
> I need to speed up the following function:
Most people have suggested adding types to your code. The compiler
can give you that hint by itself, too. CMUCL, for example will tell
you about places where it can use optimized calls (such as inline
addition instead of calling #'+) if you give it a narrower type. I
assume that SBCL will do the same. I've used this information fairly
extensively in some of my tight loops.
These are the compiler settings I had at the time:
(declaim (optimize (debug 3) (safety 3) (inhibit-warnings 3)))
(setq *compile-print* nil)
(setq *compile-verbose* nil)
I compiled your code on CMUCL here, and here's what I got:
* (compile 'double-float-escape-fn)
In: LAMBDA (C-REAL C-IMAG)
(LOOP FOR ITERATIONS BELOW *MAX-ITERATIONS* ...)
--> LET LET LET LET LET LET LET LET LET BLOCK ANSI-LOOP::LOOP-BODY TAGBODY WHEN
--> COND IF >= IF
==>
(< ITERATIONS #:G0)
Note: Unable to optimize due to type uncertainty:
The first argument is a REAL, not a SINGLE-FLOAT.
The second argument is a REAL, not a DOUBLE-FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a REAL, not a DOUBLE-FLOAT.
The second argument is a REAL, not a SINGLE-FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a REAL, not a FLOAT.
The second argument is a REAL, not a RATIONAL.
(+ Z-REAL-SQUARED Z-IMAG-SQUARED)
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a RATIONAL.
The second argument is a NUMBER, not a FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a FLOAT.
The second argument is a NUMBER, not a RATIONAL.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a SINGLE-FLOAT.
The second argument is a NUMBER, not a DOUBLE-FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a DOUBLE-FLOAT.
The second argument is a NUMBER, not a SINGLE-FLOAT.
(< (+ Z-REAL-SQUARED Z-IMAG-SQUARED) 4.0d0)
Note: Unable to optimize due to type uncertainty:
The first argument is a REAL, not a SINGLE-FLOAT.
(= A-IMAG Z-IMAG)
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a SINGLE-FLOAT.
The second argument is a NUMBER, not a DOUBLE-FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a DOUBLE-FLOAT.
The second argument is a NUMBER, not a SINGLE-FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a FLOAT.
The second argument is a NUMBER, not a RATIONAL.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a (COMPLEX SINGLE-FLOAT).
The second argument is a NUMBER, not a REAL.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a REAL.
The second argument is a NUMBER, not a (COMPLEX SINGLE-FLOAT).
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a (COMPLEX DOUBLE-FLOAT).
The second argument is a NUMBER, not a REAL.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a REAL.
The second argument is a NUMBER, not a (COMPLEX DOUBLE-FLOAT).
Note: Unable to open code because:
Operands might not be the same type.
(= A-REAL Z-REAL)
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a SINGLE-FLOAT.
The second argument is a NUMBER, not a DOUBLE-FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a DOUBLE-FLOAT.
The second argument is a NUMBER, not a SINGLE-FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a FLOAT.
The second argument is a NUMBER, not a RATIONAL.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a (COMPLEX SINGLE-FLOAT).
The second argument is a NUMBER, not a REAL.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a REAL.
The second argument is a NUMBER, not a (COMPLEX SINGLE-FLOAT).
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a (COMPLEX DOUBLE-FLOAT).
The second argument is a NUMBER, not a REAL.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a REAL.
The second argument is a NUMBER, not a (COMPLEX DOUBLE-FLOAT).
Note: Unable to open code because:
Operands might not be the same type.
(> ITERATIONS 1)
Note: Unable to optimize due to type uncertainty:
The first argument is a REAL, not a FLOAT.
(> PERIOD CHECK)
Note: Unable to optimize due to type uncertainty:
The first argument is a REAL, not a SINGLE-FLOAT.
The second argument is a REAL, not a DOUBLE-FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a REAL, not a DOUBLE-FLOAT.
The second argument is a REAL, not a SINGLE-FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a REAL, not a FLOAT.
The second argument is a REAL, not a RATIONAL.
(LOOP FOR ITERATIONS BELOW *MAX-ITERATIONS* ...)
--> LET LET LET LET LET LET LET LET LET BLOCK ANSI-LOOP::LOOP-BODY TAGBODY
--> ANSI-LOOP::LOOP-REALLY-DESETQ SETQ 1+
==>
(+ ITERATIONS 1)
Note: Unable to optimize due to type uncertainty:
The first argument is a REAL, not a FLOAT.
--> LET LET LET LET LET LET LET LET LET BLOCK ANSI-LOOP::LOOP-BODY TAGBODY WHEN
--> COND IF >= IF
==>
(< ITERATIONS #:G0)
Note: Unable to optimize due to type uncertainty:
The first argument is a REAL, not a SINGLE-FLOAT.
The second argument is a REAL, not a DOUBLE-FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a REAL, not a DOUBLE-FLOAT.
The second argument is a REAL, not a SINGLE-FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a REAL, not a FLOAT.
The second argument is a REAL, not a RATIONAL.
--> LET LET LET LET LET LET LET LET LET BLOCK ANSI-LOOP::LOOP-BODY TAGBODY
--> ANSI-LOOP::LOOP-REALLY-DESETQ SETQ 1+
==>
(+ PERIOD 1)
Note: Unable to optimize due to type uncertainty:
The first argument is a REAL, not a FLOAT.
(* 2.0d0 Z-REAL Z-IMAG)
==>
(* (* 2.0d0 Z-REAL) Z-IMAG)
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a RATIONAL.
The second argument is a NUMBER, not a FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a FLOAT.
The second argument is a NUMBER, not a RATIONAL.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a SINGLE-FLOAT.
The second argument is a NUMBER, not a DOUBLE-FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a DOUBLE-FLOAT.
The second argument is a NUMBER, not a SINGLE-FLOAT.
Note: Unable to convert x*2^k to shift due to type uncertainty:
The first argument is a NUMBER, not a INTEGER.
The second argument is a NUMBER, not a INTEGER.
Note: Unable to recode as shift and add due to type uncertainty:
The first argument is a NUMBER, not a (UNSIGNED-BYTE 32).
The second argument is a NUMBER, not a (UNSIGNED-BYTE 32).
The result is a NUMBER, not a (UNSIGNED-BYTE 32).
(+ (* 2.0d0 Z-REAL Z-IMAG) C-IMAG)
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a RATIONAL.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a SINGLE-FLOAT.
(- Z-REAL-SQUARED Z-IMAG-SQUARED)
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a RATIONAL.
The second argument is a NUMBER, not a FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a FLOAT.
The second argument is a NUMBER, not a RATIONAL.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a SINGLE-FLOAT.
The second argument is a NUMBER, not a DOUBLE-FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a DOUBLE-FLOAT.
The second argument is a NUMBER, not a SINGLE-FLOAT.
(+ (- Z-REAL-SQUARED Z-IMAG-SQUARED) C-REAL)
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a RATIONAL.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a SINGLE-FLOAT.
(* Z-REAL Z-REAL)
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a RATIONAL.
The second argument is a NUMBER, not a FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a FLOAT.
The second argument is a NUMBER, not a RATIONAL.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a SINGLE-FLOAT.
The second argument is a NUMBER, not a DOUBLE-FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a DOUBLE-FLOAT.
The second argument is a NUMBER, not a SINGLE-FLOAT.
Note: Unable to convert x*2^k to shift due to type uncertainty:
The first argument is a NUMBER, not a INTEGER.
The second argument is a NUMBER, not a INTEGER.
Note: Unable to recode as shift and add due to type uncertainty:
The first argument is a NUMBER, not a (UNSIGNED-BYTE 32).
The second argument is a NUMBER, not a (UNSIGNED-BYTE 32).
The result is a NUMBER, not a (UNSIGNED-BYTE 32).
(* Z-IMAG Z-IMAG)
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a RATIONAL.
The second argument is a NUMBER, not a FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a FLOAT.
The second argument is a NUMBER, not a RATIONAL.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a SINGLE-FLOAT.
The second argument is a NUMBER, not a DOUBLE-FLOAT.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a DOUBLE-FLOAT.
The second argument is a NUMBER, not a SINGLE-FLOAT.
Note: Unable to convert x*2^k to shift due to type uncertainty:
The first argument is a NUMBER, not a INTEGER.
The second argument is a NUMBER, not a INTEGER.
Note: Unable to recode as shift and add due to type uncertainty:
The first argument is a NUMBER, not a (UNSIGNED-BYTE 32).
The second argument is a NUMBER, not a (UNSIGNED-BYTE 32).
The result is a NUMBER, not a (UNSIGNED-BYTE 32).
(* 2.0d0 Z-REAL Z-IMAG)
--> * *
==>
(* C::Y 2.0d0)
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a RATIONAL.
Note: Unable to optimize due to type uncertainty:
The first argument is a NUMBER, not a SINGLE-FLOAT.
(* 2 CHECK)
==>
(* C::Y 2)
Note: Unable to optimize due to type uncertainty:
The first argument is a REAL, not a FLOAT.
Note: Unable to convert x*2^k to shift due to type uncertainty:
The first argument is a REAL, not a INTEGER.
Note: Unable to recode as shift and add due to type uncertainty:
The first argument is a REAL, not a (UNSIGNED-BYTE 32).
The result is a REAL, not a (UNSIGNED-BYTE 32).
(LOOP FOR ITERATIONS BELOW *MAX-ITERATIONS* ...)
--> LET LET LET LET LET LET LET LET LET BLOCK ANSI-LOOP::LOOP-BODY TAGBODY WHEN
--> COND IF >= IF
==>
(< ITERATIONS #:G0)
Note: Forced to do GENERIC-< (cost 10).
Unable to do inline float comparison (cost 3) because:
The first argument is a REAL, not a DOUBLE-FLOAT.
The second argument is a REAL, not a DOUBLE-FLOAT.
Unable to do inline float comparison (cost 3) because:
The first argument is a REAL, not a SINGLE-FLOAT.
The second argument is a REAL, not a SINGLE-FLOAT.
etc.
(+ Z-REAL-SQUARED Z-IMAG-SQUARED)
Note: Forced to do GENERIC-+ (cost 10).
Unable to do inline float arithmetic (cost 2) because:
The first argument is a NUMBER, not a DOUBLE-FLOAT.
The second argument is a NUMBER, not a DOUBLE-FLOAT.
The result is a REAL, not a DOUBLE-FLOAT.
Unable to do inline float arithmetic (cost 2) because:
The first argument is a NUMBER, not a SINGLE-FLOAT.
The second argument is a NUMBER, not a SINGLE-FLOAT.
The result is a REAL, not a SINGLE-FLOAT.
etc.
(< (+ Z-REAL-SQUARED Z-IMAG-SQUARED) 4.0d0)
Note: Forced to do GENERIC-< (cost 10).
Unable to do inline float comparison (cost 3) because:
The first argument is a REAL, not a DOUBLE-FLOAT.
(= A-IMAG Z-IMAG)
Note: Forced to do GENERIC-= (cost 10).
Unable to do inline float comparison (cost 3) because:
The first argument is a NUMBER, not a DOUBLE-FLOAT.
The second argument is a NUMBER, not a DOUBLE-FLOAT.
Unable to do inline float comparison (cost 3) because:
The first argument is a NUMBER, not a SINGLE-FLOAT.
The second argument is a NUMBER, not a SINGLE-FLOAT.
(= A-REAL Z-REAL)
Note: Forced to do GENERIC-= (cost 10).
Unable to do inline float comparison (cost 3) because:
The first argument is a NUMBER, not a DOUBLE-FLOAT.
The second argument is a NUMBER, not a DOUBLE-FLOAT.
Unable to do inline float comparison (cost 3) because:
The first argument is a NUMBER, not a SINGLE-FLOAT.
The second argument is a NUMBER, not a SINGLE-FLOAT.
(> ITERATIONS 1)
Note: Forced to do GENERIC-> (cost 10).
Unable to do inline fixnum comparison (cost 3) because:
The first argument is a REAL, not a FIXNUM.
Unable to do inline fixnum comparison (cost 4) because:
The first argument is a REAL, not a FIXNUM.
etc.
(> PERIOD CHECK)
Note: Forced to do GENERIC-> (cost 10).
Unable to do inline float comparison (cost 3) because:
The first argument is a REAL, not a DOUBLE-FLOAT.
The second argument is a REAL, not a DOUBLE-FLOAT.
Unable to do inline float comparison (cost 3) because:
The first argument is a REAL, not a SINGLE-FLOAT.
The second argument is a REAL, not a SINGLE-FLOAT.
etc.
(* 2 CHECK)
==>
(* C::Y 2)
Note: Forced to do GENERIC-* (cost 30).
Unable to do inline fixnum arithmetic (cost 3) because:
The first argument is a REAL, not a FIXNUM.
The result is a REAL, not a FIXNUM.
Unable to do inline fixnum arithmetic (cost 4) because:
The first argument is a REAL, not a FIXNUM.
The result is a REAL, not a FIXNUM.
etc.
(LOOP FOR ITERATIONS BELOW *MAX-ITERATIONS* ...)
--> LET LET LET LET LET LET LET LET LET BLOCK ANSI-LOOP::LOOP-BODY TAGBODY
--> ANSI-LOOP::LOOP-REALLY-DESETQ SETQ 1+
==>
(+ ITERATIONS 1)
Note: Forced to do GENERIC-+ (cost 10).
Unable to do inline fixnum arithmetic (cost 1) because:
The first argument is a REAL, not a FIXNUM.
The result is a REAL, not a FIXNUM.
Unable to do inline fixnum arithmetic (cost 2) because:
The first argument is a REAL, not a FIXNUM.
The result is a REAL, not a FIXNUM.
etc.
--> LET LET LET LET LET LET LET LET LET BLOCK ANSI-LOOP::LOOP-BODY TAGBODY WHEN
--> COND IF >= IF
==>
(< ITERATIONS #:G0)
Note: Forced to do GENERIC-< (cost 10).
Unable to do inline float comparison (cost 3) because:
The first argument is a REAL, not a DOUBLE-FLOAT.
The second argument is a REAL, not a DOUBLE-FLOAT.
Unable to do inline float comparison (cost 3) because:
The first argument is a REAL, not a SINGLE-FLOAT.
The second argument is a REAL, not a SINGLE-FLOAT.
etc.
--> LET LET LET LET LET LET LET LET LET BLOCK ANSI-LOOP::LOOP-BODY TAGBODY
--> ANSI-LOOP::LOOP-REALLY-DESETQ SETQ 1+
==>
(+ PERIOD 1)
Note: Forced to do GENERIC-+ (cost 10).
Unable to do inline fixnum arithmetic (cost 1) because:
The first argument is a REAL, not a FIXNUM.
The result is a REAL, not a FIXNUM.
Unable to do inline fixnum arithmetic (cost 2) because:
The first argument is a REAL, not a FIXNUM.
The result is a REAL, not a FIXNUM.
etc.
(* 2.0d0 Z-REAL Z-IMAG)
--> * *
==>
(* C::Y 2.0d0)
Note: Forced to do GENERIC-* (cost 30).
Unable to do inline float arithmetic (cost 3) because:
The first argument is a NUMBER, not a DOUBLE-FLOAT.
The result is a NUMBER, not a DOUBLE-FLOAT.
==>
(* (* 2.0d0 Z-REAL) Z-IMAG)
Note: Forced to do GENERIC-* (cost 30).
Unable to do inline float arithmetic (cost 3) because:
The first argument is a NUMBER, not a SINGLE-FLOAT.
The second argument is a NUMBER, not a SINGLE-FLOAT.
The result is a NUMBER, not a SINGLE-FLOAT.
Unable to do inline float arithmetic (cost 3) because:
The first argument is a NUMBER, not a DOUBLE-FLOAT.
The second argument is a NUMBER, not a DOUBLE-FLOAT.
The result is a NUMBER, not a DOUBLE-FLOAT.
etc.
(+ (* 2.0d0 Z-REAL Z-IMAG) C-IMAG)
Note: Forced to do GENERIC-+ (cost 10).
Unable to do inline float arithmetic (cost 2) because:
The first argument is a NUMBER, not a DOUBLE-FLOAT.
The result is a NUMBER, not a DOUBLE-FLOAT.
(- Z-REAL-SQUARED Z-IMAG-SQUARED)
Note: Forced to do GENERIC-- (cost 10).
Unable to do inline float arithmetic (cost 2) because:
The first argument is a NUMBER, not a DOUBLE-FLOAT.
The second argument is a NUMBER, not a DOUBLE-FLOAT.
The result is a NUMBER, not a DOUBLE-FLOAT.
Unable to do inline float arithmetic (cost 2) because:
The first argument is a NUMBER, not a SINGLE-FLOAT.
The second argument is a NUMBER, not a SINGLE-FLOAT.
The result is a NUMBER, not a SINGLE-FLOAT.
etc.
(+ (- Z-REAL-SQUARED Z-IMAG-SQUARED) C-REAL)
Note: Forced to do GENERIC-+ (cost 10).
Unable to do inline float arithmetic (cost 2) because:
The first argument is a NUMBER, not a DOUBLE-FLOAT.
The result is a NUMBER, not a DOUBLE-FLOAT.
(* Z-REAL Z-REAL)
Note: Forced to do GENERIC-* (cost 30).
Unable to do inline float arithmetic (cost 3) because:
The first argument is a NUMBER, not a SINGLE-FLOAT.
The second argument is a NUMBER, not a SINGLE-FLOAT.
The result is a NUMBER, not a SINGLE-FLOAT.
Unable to do inline float arithmetic (cost 3) because:
The first argument is a NUMBER, not a DOUBLE-FLOAT.
The second argument is a NUMBER, not a DOUBLE-FLOAT.
The result is a NUMBER, not a DOUBLE-FLOAT.
etc.
(* Z-IMAG Z-IMAG)
Note: Forced to do GENERIC-* (cost 30).
Unable to do inline float arithmetic (cost 3) because:
The first argument is a NUMBER, not a SINGLE-FLOAT.
The second argument is a NUMBER, not a SINGLE-FLOAT.
The result is a NUMBER, not a SINGLE-FLOAT.
Unable to do inline float arithmetic (cost 3) because:
The first argument is a NUMBER, not a DOUBLE-FLOAT.
The second argument is a NUMBER, not a DOUBLE-FLOAT.
The result is a NUMBER, not a DOUBLE-FLOAT.
etc.
(THE FIXNUM *MAX-ITERATIONS*)
Warning: Undefined variable: *MAX-ITERATIONS*
Warning: This variable is undefined:
*MAX-ITERATIONS*
Compilation unit finished.
2 warnings
82 notes
DOUBLE-FLOAT-ESCAPE-FN
T
NIL
*
Joel Ray Holveck <·····@juniper.net> writes:
> > I need to speed up the following function:
>
> Most people have suggested adding types to your code. The compiler
> can give you that hint by itself, too. CMUCL, for example will tell
> you about places where it can use optimized calls (such as inline
> addition instead of calling #'+) if you give it a narrower type. I
> assume that SBCL will do the same. I've used this information fairly
> extensively in some of my tight loops.
>
> These are the compiler settings I had at the time:
> (declaim (optimize (debug 3) (safety 3) (inhibit-warnings 3)))
> (setq *compile-print* nil)
> (setq *compile-verbose* nil)
>
> I compiled your code on CMUCL here, and here's what I got:
<snip>
SBCL also gives such notes. My big problem is not knowing what to do
about them. Being informed that OF-TYPE existed was a major help to
me and now I get fewer notes when compiling my code. In time, I hope
to understand things better so that I can satisfy the compiler.
OpenMCL has not given me any notes at all. It may be some setting I
have to turn on for that, I don't know.
It's all part of the process of learning how to make Lisp compile into
code that runs as fast as C. This experience has helped me see how to
do that. I'm sure Lisp would be a more popular language if more
people knew how to write Lisp programs that ran as fast as their C
equivalents but could be developed as quickly as something written in
Python or Perl.
A 48x improvement was pretty darn good. If I can learn to understand
Wade's code and it runs 4x faster than that, well, that will have me
very impressed. The thought of going 192x faster makes my head spin
in a good way.
--
An ideal world is left as an excercise to the reader.
--- Paul Graham, On Lisp 8.1
David Steuber <·····@david-steuber.com> writes:
> I need to speed up the following function:
[snip]
i saw you're using os x, if you're also using openmcl i had _great_
success with randall bear's fpc-ppc:
http://vorlon.cwru.edu/~beer/Software/FPC-PPC/FPC-PPC-DOC-0.2.txt
--
-Marco
Ring the bells that still can ring.
Forget your perfect offering.
There is a crack in everything.
That's how the light gets in.
-Leonard Cohen
Marco Baringer <··@bese.it> writes:
> i saw you're using os x, if you're also using openmcl i had _great_
> success with randall bear's fpc-ppc:
>
> http://vorlon.cwru.edu/~beer/Software/FPC-PPC/FPC-PPC-DOC-0.2.txt
I've been developing on Emacs + SLIME + OpenMCL and testing with SBCL
which has been running faster (and also giving me notes on things it
can't optimize, although I do not understand all things yet). The
above link is something I will look at though. Thanks.
Also thanks to the others for their replies. I need to look at Wade's
code. If I can gain another 4x improvement over a fully typed loop,
then that will be nearly 200 times faster than my original code. As
it is, OF-TYPE was the key. I was not familiar with this word in Lisp
and it made a tremendous improvement all by itself. With SBCL on my
Mac, here is the before and after:
From:
* (time (double-float-escape-fn -0.10488076384628028d0
0.9256099657646029d0))
Evaluation took:
3.51 seconds of real time
2.35 seconds of user run time
1.07 seconds of system run time
0 page faults and
204,646,208 bytes consed.
2858197
To:
* (time (double-float-escape-fn -0.10488076384628028d0
0.9256099657646029d0))
Evaluation took:
0.073 seconds of real time
0.07 seconds of user run time
0.0 seconds of system run time
0 page faults and
80 bytes consed.
2858197
* (/ 3.51 0.073)
48.08219
The real fun will be trying to speed up bignum code when the
resolution can't be represented by a double-float. At that point, the
iteration counts will also be much higher. Until I grok and test
Wade's version, here is how the function currently stands:
(defun double-float-escape-fn (c-real c-imag)
"Return iterations for c-real + c-imag i as floating-point numbers"
(declare (type double-float c-real c-imag)
(optimize (speed 3) (safety 0) (space 0)))
(loop for iterations of-type fixnum below *max-iterations*
for period of-type fixnum from 1
for z-imag of-type double-float = 0.0D0 then (+ (* 2.0D0
z-real z-imag) c-imag)
for z-real of-type double-float = 0.0D0 then (+ (-
z-real-squared z-imag-squared) c-real)
for z-real-squared of-type double-float = 0.0D0 then (* z-real
z-real)
for z-imag-squared of-type double-float = 0.0D0 then (* z-imag
z-imag)
with a-imag of-type double-float = 0.0D0
with a-real of-type double-float = 0.0D0
with check of-type fixnum = 3
unless (< (+ z-real-squared z-imag-squared) 4.0D0) return
iterations
when (and (= a-imag z-imag) (= a-real z-real) (> iterations 1))
return *max-iterations*
when (> period check) do (setf period 1 check (* 2 check) a-imag
z-imag a-real z-real)
finally (return iterations)))
--
An ideal world is left as an excercise to the reader.
--- Paul Graham, On Lisp 8.1
Marco Baringer <··@bese.it> wrote in message news:<··············@convey.it>...
> David Steuber <·····@david-steuber.com> writes:
>
> > I need to speed up the following function:
>
> [snip]
>
> i saw you're using os x, if you're also using openmcl i had _great_
> success with randall bear's fpc-ppc:
>
> http://vorlon.cwru.edu/~beer/Software/FPC-PPC/FPC-PPC-DOC-0.2.txt
I haven't followed the thread, but one should be aware that OS X and
trigonometric functions will automatically result in a desaster
provided one uses a G3 or G4 processor. A G3/G4 1000 MHz processor is
not better than a Celeron when speaking of cosine, sine, etc.
functions.
That should bet kept in mind.
Wade Humeniuk <····································@telus.net> writes:
> Here's the fastest version I can get on LWW
<snip>
> CL-USER 18 > (time (double-float-escape-fn -0.10488076384628028d0 0.9256099657646029d0))
>
> 0.1 seconds used.
> Standard Allocation 0 bytes.
> Fixlen Allocation 0 bytes.
> 2858197
>
> CL-USER 19 >
>
> It's about 4 times faster than a fully typed loop version. It's about
> as close to the bone as I can think of.
Thanks. I was looking for the big speed up, but it didn't happen for
me on SBCL :-(. Perhaps SBCL just has a really good LOOP macro or the
compiler is just that good. I made a couple minor changes when I
tested so that I could do a side by side apples to apples comparison.
I made max-iterations *max-iterations* to point to my global variable
and I renamed the function so that it could coexist with the one I
had. Here are my results:
* (setf *max-iterations* 2858197)
2858197
* (time (double-float-escape-fn -0.10488076384628028d0
0.9256099657646029d0))
Evaluation took:
0.072 seconds of real time
0.07 seconds of user run time
0.0 seconds of system run time
0 page faults and
80 bytes consed.
2858197
* (time (wade-double-float-escape-fn -0.10488076384628028d0
0.9256099657646029d0))
Evaluation took:
0.075 seconds of real time
0.07 seconds of user run time
0.0 seconds of system run time
0 page faults and
80 bytes consed.
2858197
I'd call it a wash, so I don't think I'll go with the extra
complexity. It is interesting to see C style GOTO functionality
though. Thanks.
--
An ideal world is left as an excercise to the reader.
--- Paul Graham, On Lisp 8.1