Avoiding work in closures?

From: Peter Seibel
Subject: Avoiding work in closures?
Date: Sat, 16 Oct 2004 23:56:38 +0000
Message-ID: <m33c0efcio.fsf@javamonkey.com>

Suppose I'm writing a function that returns a closure that is going to
be called many times. My natural inclination is to do any work that I
can outside the body of the closure in order to avoid doing it each
time the closure is called. For instance here's a function from some
code I've been working on for the book:

  (defun extractor (schema)
    (let ((names (mapcar #'name schema)))
      #'(lambda (row)
          (loop for c in column-names collect c collect (getf row c)))))

While the savings aren't huge, I figure the closure returned will do
less work than if I had written it like this:

  (defun extractor (schema)
    #'(lambda (row)
        (loop for c in (mapcar #'name schema) collect c collect (getf row c))))

Then I started worrying about a SSC turning the later into the
equivalent of the former (since the SCC would know that NAME is
side-effect free, etc.) and having the Lisp Compiler Guru's come out
of their lairs to flog me up if I claimed that one should prefer the
former over the later for performance reasons. Have I overclocked my
paranoia or do such SCCs exist?

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp

Re: Avoiding work in closures? Paul F. Dietz
- Re: Avoiding work in closures? Peter Seibel
Re: Avoiding work in closures? Barry Margolin
- Re: Avoiding work in closures? Peter Seibel
  - Re: Avoiding work in closures? Tim Bradshaw
    - Re: Avoiding work in closures? Peter Seibel
      - Re: Avoiding work in closures? Tim Bradshaw
    - Re: Avoiding work in closures? Vassil Nikolov
Re: Avoiding work in closures? William D Clinger
- Re: Avoiding work in closures? William D Clinger

From: Paul F. Dietz
Subject: Re: Avoiding work in closures?
Date: Sun, 17 Oct 2004 00:08:43 +0000
Message-ID: <7Y6dndXx3cCRJezcRVn-gQ@dls.net>

Peter Seibel wrote:

>   (defun extractor (schema)
>     (let ((names (mapcar #'name schema)))
>       #'(lambda (row)
>           (loop for c in column-names collect c collect (getf row c)))))
> 
> While the savings aren't huge, I figure the closure returned will do
> less work than if I had written it like this:
> 
>   (defun extractor (schema)
>     #'(lambda (row)
>         (loop for c in (mapcar #'name schema) collect c collect (getf row c))))
> 
> Then I started worrying about a SSC turning the later into the
> equivalent of the former (since the SCC would know that NAME is
> side-effect free, etc.)

That would be an unsafe transformation.  How is the SSC to know that the
list refered to by schema is not elsewhere destructively modified after
extractor returns?

	Paul

From: Peter Seibel
Subject: Re: Avoiding work in closures?
Date: Sun, 17 Oct 2004 00:36:35 +0000
Message-ID: <m3y8i6dw3n.fsf@javamonkey.com>

"Paul F. Dietz" <·····@dls.net> writes:

> Peter Seibel wrote:
>
>>   (defun extractor (schema)
>>     (let ((names (mapcar #'name schema)))
>>       #'(lambda (row)
>>           (loop for c in column-names collect c collect (getf row c)))))
>> While the savings aren't huge, I figure the closure returned will do
>> less work than if I had written it like this:
>>   (defun extractor (schema)
>>     #'(lambda (row)
>>         (loop for c in (mapcar #'name schema) collect c collect (getf row c))))
>> Then I started worrying about a SSC turning the later into the
>> equivalent of the former (since the SCC would know that NAME is
>> side-effect free, etc.)
>
> That would be an unsafe transformation. How is the SSC to know that
> the list refered to by schema is not elsewhere destructively
> modified after extractor returns?

Duh. Good point. Good thing I'm not writing a compiler. ;-) I was
thinking about--and ignoring--the issue of multiple threads but
totally spaced on the fact that the closure was going to be invoked
*in the future*. As I said, duh. Thanks.

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp

From: Barry Margolin
Subject: Re: Avoiding work in closures?
Date: Sun, 17 Oct 2004 00:36:56 +0000
Message-ID: <barmar-9F5678.20365516102004@comcast.dca.giganews.com>

In article <··············@javamonkey.com>,
 Peter Seibel <·····@javamonkey.com> wrote:

> Suppose I'm writing a function that returns a closure that is going to
> be called many times. My natural inclination is to do any work that I
> can outside the body of the closure in order to avoid doing it each
> time the closure is called. For instance here's a function from some
> code I've been working on for the book:
> 
>   (defun extractor (schema)
>     (let ((names (mapcar #'name schema)))
>       #'(lambda (row)
>           (loop for c in column-names collect c collect (getf row c)))))

Is COLUMN-NAMES supposed to be NAMES?

> 
> While the savings aren't huge, I figure the closure returned will do
> less work than if I had written it like this:
> 
>   (defun extractor (schema)
>     #'(lambda (row)
>         (loop for c in (mapcar #'name schema) collect c collect (getf row 
>         c))))
> 
> Then I started worrying about a SSC turning the later into the
> equivalent of the former (since the SCC would know that NAME is
> side-effect free, etc.) and having the Lisp Compiler Guru's come out
> of their lairs to flog me up if I claimed that one should prefer the
> former over the later for performance reasons. Have I overclocked my
> paranoia or do such SCCs exist?

Although a SSC *could* do some complex optimizations, it's a bit 
far-fetched to avoid simple optimizations in your source code based on 
this possibility.  Especially if you know that few real-world compilers 
are this smart.

And as the other responder pointed out, the compiler can't tell that 
SCHEMA will not be destructively modified.  Since you apparently know 
this a priori, you can implement optimizations that a compiler would 
have great difficulty performing.

-- 
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

From: Peter Seibel
Subject: Re: Avoiding work in closures?
Date: Sun, 17 Oct 2004 13:57:41 +0000
Message-ID: <m3k6tpe9ky.fsf@javamonkey.com>

Barry Margolin <······@alum.mit.edu> writes:

> In article <··············@javamonkey.com>,
>  Peter Seibel <·····@javamonkey.com> wrote:
>
>> Suppose I'm writing a function that returns a closure that is going
>> to be called many times. My natural inclination is to do any work
>> that I can outside the body of the closure in order to avoid doing
>> it each time the closure is called. For instance here's a function
>> from some code I've been working on for the book:
>> 
>>   (defun extractor (schema)
>>     (let ((names (mapcar #'name schema)))
>>       #'(lambda (row)
>>           (loop for c in column-names collect c collect (getf row c)))))
>
> Is COLUMN-NAMES supposed to be NAMES?

Yes. Sorry.

>> While the savings aren't huge, I figure the closure returned will
>> do less work than if I had written it like this:
>> 
>>   (defun extractor (schema)
>>     #'(lambda (row)
>>         (loop for c in (mapcar #'name schema) collect c collect (getf row 
>>         c))))
>> 
>> Then I started worrying about a SSC turning the later into the
>> equivalent of the former (since the SCC would know that NAME is
>> side-effect free, etc.) and having the Lisp Compiler Guru's come
>> out of their lairs to flog me up if I claimed that one should
>> prefer the former over the later for performance reasons. Have I
>> overclocked my paranoia or do such SCCs exist?
>
> Although a SSC *could* do some complex optimizations, it's a bit
> far-fetched to avoid simple optimizations in your source code based
> on this possibility. Especially if you know that few real-world
> compilers are this smart.

Right. I was just paranoid about making the statement "one *should*
write it the former way" that someone would pop up to say, "But what
you don't understand is that we've known how to optimize that case
since Steele's AIM memo <mumble> so that's wrong." I.e. I wasn't
asking so much about practical coding practice but rather theoretical
accuracy. Like I said, I'm getting paranoid.

> And as the other responder pointed out, the compiler can't tell that
> SCHEMA will not be destructively modified. Since you apparently know
> this a priori, you can implement optimizations that a compiler would
> have great difficulty performing.

Yes.

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp

From: Tim Bradshaw
Subject: Re: Avoiding work in closures?
Date: Mon, 18 Oct 2004 11:35:50 +0000
Message-ID: <1098099350.588393.322300@c13g2000cwb.googlegroups.com>

Peter Seibel wrote:
> Right. I was just paranoid about making the statement "one *should*
> write it the former way" that someone would pop up to say, "But what
> you don't understand is that we've known how to optimize that case
> since Steele's AIM memo <mumble> so that's wrong." I.e. I wasn't
> asking so much about practical coding practice but rather theoretical
> accuracy. Like I said, I'm getting paranoid.
>

Well, here's another way of being paranoid.  Basically the two
patterns you have are something like:

(defun make-closure-precomputing (data)
(let ((stuff (compute-thing data)))
#'(lambda (...) ... stuff ...)))

or

(defun make-closure-simple (data)
#'(lambda (...) ... (compute-thing data)))

OK.  Well, what if STUFF is big, and the closure is long-lived but not
often called, or not called for a long time after it is created?  Some
clever GC will then, maybe, tenure STUFF, and now there's a possible
memory leak.  So let's second guess it:

(defun make-closure-lazily-precomputing (data)
(let ((stuff data))
#'(lambda (...)
;; assumption that COMPUTE-THING is not the identity
(when (eq stuff data) (setf stuff (compute-thing data)))
... stuff ...)))

And now we can make it even more hairy and obscure:

(defun make-closure-lazily-precomputing (data)
(let ((stuff data))
#'(lambda (... &key (forget nil))
;; assumption that COMPUTE-THING is not the identity
(multiple-value-prog1
(progn
(when (eq stuff data) (setf stuff (compute-thing data)))
... stuff ...)
;; if this is the last time we will be called for a while,
;; forget STUFF.
(when forget (setf stuff data))))))

And I'm sure there are some other tricks that can be done to make the
code hard to understand but perhaps more efficient.  Maybe we could
use a weak pointer so STUFF can be auto-forgotten!

I think the secret lesson is: never do any of this unless profiling
tells you you need to.

--tim

From: Peter Seibel
Subject: Re: Avoiding work in closures?
Date: Mon, 18 Oct 2004 18:27:34 +0000
Message-ID: <m3u0srkhtw.fsf@javamonkey.com>

"Tim Bradshaw" <··········@tfeb.org> writes:

> I think the secret lesson is: never do any of this unless profiling
> tells you you need to.

So am I allowed to just write this:

  (loop with stuff = (compute-hairy-idempotent-stuff)
        for i from 1 to 1000000
        collect (frob stuff i))

instead of this:

  (loop for i from 1 to 1000000
        collect (frob (compute-hairy-idempotent-stuff) i))

Or *must* I write both versions and profile them?

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp

From: Tim Bradshaw
Subject: Re: Avoiding work in closures?
Date: Tue, 19 Oct 2004 09:37:24 +0000
Message-ID: <1098178643.992701.205310@z14g2000cwz.googlegroups.com>

Peter Seibel wrote:
>
> Or *must* I write both versions and profile them?
>

Sorry, I phrased myself badly.  What I meant was something like: write
the obviously OK one (readable, maintainable, not obviously horribly
inefficient, even if `obviously horribly inefficient' is actually OK
because of some SSC) and then, if it turns out to be a performance
issue worry about fiddling with it.  I think it's a bad idea to write
code that tries to second-guess the compiler/runtime without actual
evidence that it matters.

--tim

From: Vassil Nikolov
Subject: Re: Avoiding work in closures?
Date: Tue, 19 Oct 2004 05:20:20 +0000
Message-ID: <lzzn2j5lnv.fsf@janus.vassil.nikolov.names>

"Tim Bradshaw" <··········@tfeb.org> writes:

> [...]
> (defun make-closure-lazily-precomputing (data)
> (let ((stuff data))
> #'(lambda (...)
> ;; assumption that COMPUTE-THING is not the identity
> (when (eq stuff data) (setf stuff (compute-thing data)))
> ... stuff ...)))
>
> And now we can make it even more hairy and obscure:
>
> (defun make-closure-lazily-precomputing (data)
> (let ((stuff data))
> #'(lambda (... &key (forget nil))
> ;; assumption that COMPUTE-THING is not the identity
> (multiple-value-prog1
> (progn
> (when (eq stuff data) (setf stuff (compute-thing data)))
> ... stuff ...)
> ;; if this is the last time we will be called for a while,
> ;; forget STUFF.
> (when forget (setf stuff data))))))
>
> And I'm sure there are some other tricks that can be done to make the
> code hard to understand but perhaps more efficient.

  ...and the usual worries about mutual exclusion if concurrency is
  involved...

  ---Vassil.

-- 
Vassil Nikolov <········@poboxes.com>

Hollerith's Law of Docstrings: Everything can be summarized in 72 bytes.

From: William D Clinger
Subject: Re: Avoiding work in closures?
Date: Mon, 18 Oct 2004 22:52:04 +0000
Message-ID: <fb74251e.0410181452.6dc74079@posting.google.com>

I have corrected the naming and inserted a line break.

Peter Seibel <·····@javamonkey.com> wrote:
> Suppose I'm writing a function that returns a closure that is going to
> be called many times. My natural inclination is to do any work that I
> can outside the body of the closure in order to avoid doing it each
> time the closure is called. For instance here's a function from some
> code I've been working on for the book:
> 
>   (defun extractor (schema)
>     (let ((names (mapcar #'name schema)))
>       #'(lambda (row)
>           (loop for c in names collect c collect (getf row c)))))
> 
> While the savings aren't huge, I figure the closure returned will do
> less work than if I had written it like this:
> 
>   (defun extractor (schema)
>     #'(lambda (row)
>         (loop for c in (mapcar #'name schema)
>               collect c collect (getf row c))))

Not likely.  The first version is usually (i.e. almost always) better.

Your code extracts the names from the schema, which suggests that the
schema is a larger data structure than the names.  If so, then the
closure of the first version will retain a smaller data structure
than the closure of the second version, and the first version will
do less work than the second version in any case.

If the schema is much smaller than the list of names that is computed
from it, then there would be some reason to prefer the second version,
but that seems unlikely.

Another case in which the second version might be preferred is when
the closure is unlikely to be called more than once, but you said
it's going to be called many times.

My reasoning assumes an Insufficiently Dumb Compiler (IDC).  If your
compiler is not an IDC, then it might close the lambda expression of
the first version over both the names and the schema, as interpreters
are wont to do.  I don't like to think about compilers that are not
insufficiently dumb, so I recommend you just avoid them.

Will

From: William D Clinger
Subject: Re: Avoiding work in closures?
Date: Tue, 19 Oct 2004 12:45:32 +0000
Message-ID: <fb74251e.0410190445.72c78a2f@posting.google.com>

I wrote:
> > While the savings aren't huge, I figure the closure returned will do
> > less work than if I had written it like this:
> 
> ...
> 
> Not likely.  The first version is usually (i.e. almost always) better.

I didn't see the word "than" in the first statement, so I thought
it said the opposite of what it actually says.  Had I read it
correctly, I probably wouldn't have replied to it.  I apologize
to anyone who wasted a minute or two trying to figure out what
I meant by my post.

Will