Fast numerical functions

From: ··········@gmail.com
Subject: Fast numerical functions
Date: Thu, 16 Feb 2006 23:42:23 +0000
Message-ID: <1140133343.369625.173630@g43g2000cwa.googlegroups.com>

Hello,

Suppose I have a purely numeric function, say
(defun foo (x y)
    (+ x (* 2 y)))

and I want it to execute quickly (no generic dispatches on arithmetic).
 If I knew the types of x,y, I could say
(defun foo (x y)
    (declare (type the-type x y))
    (+ x (* 2 y)))

But in that case, the function would justifiably signal an error when x
and y are not of type the-type or worse.  (I don't want to coerce)
Another approach would be to use generic functions and specialize them
for the types that I cared about, as well has having a 't' specializer
for the slow, unoptimized version.  However, in this case, I would have
to either have multiple copies of code, or have a macro which defines
specialized versions, which feels like reinventing the wheel.

This clearly feels like a 'sufficiently smart compiler could figure
this out and make multiple versions' type problem.  I've noticed that
SBCL handles these issues with deftransform, but that's an internal
method and feels too low level for me.  Is there some sort of an
optimization directive I could pass to make the compiler make
specialized copies of the method for me?  A built-in macro I could use?
 Or should I homebrew a macro to define multiple functions?

Thanks,
Alex

Re: Fast numerical functions rif
- Re: Fast numerical functions Richard J. Fateman
- Re: Fast numerical functions ··········@gmail.com
Re: Fast numerical functions Joe Marshall
Re: Fast numerical functions Thomas A. Russ

From: rif
Subject: Re: Fast numerical functions
Date: Thu, 16 Feb 2006 23:58:09 +0000
Message-ID: <wj0oe16rh4e.fsf@five-percent-nation.mit.edu>

··········@gmail.com writes:

> Hello,
> 
> Suppose I have a purely numeric function, say
> (defun foo (x y)
>     (+ x (* 2 y)))
> 
> and I want it to execute quickly (no generic dispatches on arithmetic).
>  If I knew the types of x,y, I could say
> (defun foo (x y)
>     (declare (type the-type x y))
>     (+ x (* 2 y)))
> 
> But in that case, the function would justifiably signal an error when x
> and y are not of type the-type or worse.  (I don't want to coerce)
> Another approach would be to use generic functions and specialize them
> for the types that I cared about, as well has having a 't' specializer
> for the slow, unoptimized version.  However, in this case, I would have
> to either have multiple copies of code, or have a macro which defines
> specialized versions, which feels like reinventing the wheel.
> 
> This clearly feels like a 'sufficiently smart compiler could figure
> this out and make multiple versions' type problem.  I've noticed that
> SBCL handles these issues with deftransform, but that's an internal
> method and feels too low level for me.  Is there some sort of an
> optimization directive I could pass to make the compiler make
> specialized copies of the method for me?  A built-in macro I could use?
>  Or should I homebrew a macro to define multiple functions?
> 
> Thanks,
> Alex

1.  If you want the fastest possible speed, you might want to set
    safety to 0.  On sbcl and cmucl, this diasbles the type checks.
    So instead of getting the "justifiable error" if you pass the
    wrong types, you can crash your image.  But hey, speed is speed.

2.  You probably DON'T want a generic, because then you have to pay
    for the dispatch.

3.  I would just do a bit of macrology for this.  A simple example I
    use is:

(defmacro make-typed-op (name function type)
  `(defmacro ,name (&rest args)
      `(the ,',type (,',function ,@(mapcar (lambda (a) `(the ,',type ,a)) args))
)))

(make-typed-op fix* * fixnum)
(make-typed-op fix+ + fixnum)
(make-typed-op fix- - fixnum)
(make-typed-op fix/ / fixnum)

(make-typed-op df* * double-float)
(make-typed-op df+ + double-float)
(make-typed-op df- - double-float)
(make-typed-op df/ / double-float)

4.  A "sufficiently smart" compiler could do a lot of nice things.  I
    join you in wishing for it.

Cheers,

rif

From: Richard J. Fateman
Subject: Re: Fast numerical functions
Date: Fri, 17 Feb 2006 02:20:32 +0000
Message-ID: <dt3btg$2o68$1@agate.berkeley.edu>

rif wrote:

> ··········@gmail.com writes:
> 
> 
>>Hello,
>>
>>Suppose I have a purely numeric function, say
>>(defun foo (x y)
>>    (+ x (* 2 y)))
>>
>>and I want it to execute quickly (no generic dispatches on arithmetic).
>> If I knew the types of x,y, I could say
>>(defun foo (x y)
>>    (declare (type the-type x y))
>>    (+ x (* 2 y)))
>>

Assuming for the moment that the-type is double-float, then
it seems to me that you might want to insert 2.0d0 in
there instead of 2. but a compiler might do that for you.

However, the more significant cost at runtime might be the cost
to put the result in a box, since the compiler does not
know who is calling foo, and cannot ordinarily trust that
the recipient is going to be happy receiving a raw double-float.

So if bar calls foo, bar  might benefit from an in-line declaration
so that foo's body is inserted in place of the call to foo.
or a more elaborate declaration in which the argument and return
types of foo are specified, so that a different (unboxed) linkage
can be set up between bar and foo.

If you have a particular compiler in mind, you can experiment,
especially if disassemble  works.
RJF

From: ··········@gmail.com
Subject: Re: Fast numerical functions
Date: Fri, 17 Feb 2006 02:25:03 +0000
Message-ID: <1140143103.089256.299450@z14g2000cwz.googlegroups.com>

rif wrote:
> ··········@gmail.com writes:
>
> > Hello,
> >
> > Suppose I have a purely numeric function, say
> > (defun foo (x y)
> >     (+ x (* 2 y)))
> >
> > and I want it to execute quickly (no generic dispatches on arithmetic).
> >  If I knew the types of x,y, I could say
> > (defun foo (x y)
> >     (declare (type the-type x y))
> >     (+ x (* 2 y)))
> >
> > But in that case, the function would justifiably signal an error when x
> > and y are not of type the-type or worse.  (I don't want to coerce)
> > Another approach would be to use generic functions and specialize them
> > for the types that I cared about, as well has having a 't' specializer
> > for the slow, unoptimized version.  However, in this case, I would have
> > to either have multiple copies of code, or have a macro which defines
> > specialized versions, which feels like reinventing the wheel.
> >
> > This clearly feels like a 'sufficiently smart compiler could figure
> > this out and make multiple versions' type problem.  I've noticed that
> > SBCL handles these issues with deftransform, but that's an internal
> > method and feels too low level for me.  Is there some sort of an
> > optimization directive I could pass to make the compiler make
> > specialized copies of the method for me?  A built-in macro I could use?
> >  Or should I homebrew a macro to define multiple functions?
> >
> > Thanks,
> > Alex
>
> 1.  If you want the fastest possible speed, you might want to set
>     safety to 0.  On sbcl and cmucl, this diasbles the type checks.
>     So instead of getting the "justifiable error" if you pass the
>     wrong types, you can crash your image.  But hey, speed is speed.

Agreed.  And hey, my program will finish running even faster!

> 2.  You probably DON'T want a generic, because then you have to pay
>     for the dispatch.

Well, my functions are complex enough that I'm willing to pay for the
dispatch overhead.  Ideally, I could use non-generics and the compiler
would be 'smart enough'(aaah, that magic word again) to figure out
which one to use.  However, that's hard and I'm willing to pay for the
overhead, at least for now.

> 3.  I would just do a bit of macrology for this.  A simple example I
>     use is:
>
> (defmacro make-typed-op (name function type)
>   `(defmacro ,name (&rest args)
>       `(the ,',type (,',function ,@(mapcar (lambda (a) `(the ,',type ,a)) args))
> )))
>
> (make-typed-op fix* * fixnum)
> (make-typed-op fix+ + fixnum)
> (make-typed-op fix- - fixnum)
> (make-typed-op fix/ / fixnum)
>
> (make-typed-op df* * double-float)
> (make-typed-op df+ + double-float)
> (make-typed-op df- - double-float)
> (make-typed-op df/ / double-float)

Sadly, this will not help the whole problem.  In particular (func (the
double-float arg)) will still be slow, because there will only be one
version of func, which uses generic arithmetic.  Maybe if func got
inlined, the compiler could recompile it to take advantage of the type
declarations.

> 4.  A "sufficiently smart" compiler could do a lot of nice things.  I
>     join you in wishing for it.
> 
> Cheers,
> 
> rif

Alex

From: Joe Marshall
Subject: Re: Fast numerical functions
Date: Fri, 17 Feb 2006 16:50:38 +0000
Message-ID: <1140195038.444048.193960@f14g2000cwb.googlegroups.com>

··········@gmail.com wrote:
>  If I knew the types of x,y, I could say
> (defun foo (x y)
>     (declare (type the-type x y))
>     (+ x (* 2 y)))
>
> But in that case, the function would justifiably signal an error when x
> and y are not of type the-type or worse.

Just a quick clarification.  An error might not be signalled at all;
the function may erroneously assume that X and Y are of the declared
type and just mangle the bits.

From: Thomas A. Russ
Subject: Re: Fast numerical functions
Date: Tue, 21 Feb 2006 17:44:09 +0000
Message-ID: <ymi64n8ehee.fsf@sevak.isi.edu>

··········@gmail.com writes:

> Hello,
> 
> Suppose I have a purely numeric function, say
> (defun foo (x y)
>     (+ x (* 2 y)))
> 
> and I want it to execute quickly (no generic dispatches on arithmetic).
>  If I knew the types of x,y, I could say
> (defun foo (x y)
>     (declare (type the-type x y))
>     (+ x (* 2 y)))
> 
> ...
> Another approach would be to use generic functions and specialize them
> for the types that I cared about, as well has having a 't' specializer
> for the slow, unoptimized version.  However, in this case, I would have
> to either have multiple copies of code, or have a macro which defines
> specialized versions, which feels like reinventing the wheel.

This latter approach won't really work.  OK, it will work, but it won't
give you the performance that you seek.  There are a couple of reasons
for that:

(1)  You can't (portably) specialize generic functions to the required
level of detail.  You can have INTEGER and FLOAT as type specifiers for
generic dispatch, but there is no FIXNUM or DOUBLE-FLOAT.  That is
because there is no requirement that there be more than one type of
integer or floating point data type.  So, if you write generic functions
you won't have the required level of granularity available to generate
efficient numeric code.

(2)  Instead of the overhead of generic arithmetic, you will now have
the overhead of generic function dispatch.  This could easily eat up an
savings in time, although if the internal computation were much more
complicated, this could still win.

Instead, you really want to a macro that defines specialized versions of
the code.  If you want to adopt a somewhat existing convention, I have
seen the suffix "&" used for fixnum and "$" for floating point
operations.  I believe at least the latter was from MacLisp days.  They
were available as part of the basic arithmetic functions:  +& +$ /$ *&
etc.

Now, I would venture that you are actually unlikely to have more than
two types of specialized functions required.  So it would be pretty easy
to write a macro that does what you want.  (untested):

(defmacro defnumfun (name lambda-list &body body)
  ;; OK, for this example, we'll assume a plain lambda list with
  ;; no use of &optional, &rest, &key.
  (let ((fixnum-name (intern (concatenate 'string (symbol-name name) "&")
                             (symbol-package name)))
        (float-name (intern (concatenate 'string (symbol-name name) "*")
                            (symbol-package name))))
     `(progn
         (defun ,name ,lambda-list "generic case"
            ,@body)
         (defun ,fixnum-name ,lambda-list "fixnum case"
            (declare (type fixnum ,@lambda-list))
            ,@body)
         (defun ,float-name ,lambda-list "fixnum case"
            (declare (type fixnum ,@lambda-list))
            ,@body))))

-- 
Thomas A. Russ,  USC/Information Sciences Institute