From: Robert Monfera
Subject: Manual Hot Spot optimization in Lisp
Date: 
Message-ID: <3895F02C.269CBBCD@fisec.com>
On a recent thread somehow related to CL's relative performance vs. that
of C, someone mentioned that you can't compare the two, because in CL
you can compile fast code on the fly, while in C one can't.  I share
this view, and I'd like to solicit comments on whether the following is
a good way to exploit the compiler.

Problem:  certain things that are useful to know for optimization, e.g.,
the lengths or element types of vectors, are not known in advance when
developing the application.  As a result, the more general code runs
much more slowly than fully declared code would.

Solution:  compile function (closure) on the fly

Conditions of applicability:
- information useful for the compiler becomes available run-time
    OR some overhead (e.g., function call) can be avoided
- expected speedup outweighs compilation time
- benefits of speedup outweigh extra development efforts
- it's OK to have the compiler in the image

Example (trivial):  accumulation (summarization) of array elements

Instead of this:

(defun generic-adder (a)
  (declare (optimize speed))
  (let ((result 0d0))
    (dotimes (i (length a) result)
      (incf (the double-float result)
            (aref (the (simple-array t (*)) a) i)))))

We can use this:

(defun execut (function &rest args)
  "Executes function while cuts execution time"
  (funcall (funcall #'compile nil (apply function args))))

(defun adder (a)
  `(lambda ()
     (declare (optimize speed))
     (let ((result 0d0))
       (dotimes (i ,(length a) result)
         (incf (the double-float result)
               (aref (the (simple-array ,(array-element-type a) (*)) ,a)
                     i))))))

(defparameter *f* (make-array 8000000
                              :element-type 'single-float
                              :initial-element 100000s0))

> (time (execut #'adder *f*))

A good while ago I thought this kind of optimization might be done by
a "sufficiently smart compiler", but of course people wouldn't
appreciate the extra compilation time and consing when an array isn't
accessed many times - unless (declare (array-access a-lot a)) is
recognized, that is :-)

What do you think about this?  Is there a better practice that achieves
the same in a more elegant way?

Robert

From: Barry Margolin
Subject: Re: Manual Hot Spot optimization in Lisp
Date: 
Message-ID: <Rgol4.65$qd1.1367@burlma1-snr2>
In article <·················@fisec.com>,
Robert Monfera  <·······@fisec.com> wrote:
>(defun execut (function &rest args)
>  "Executes function while cuts execution time"
>  (funcall (funcall #'compile nil (apply function args))))

What's the second funcall for?  Why not

(funcall (compile nil (apply function args)))

Note also that this only wins in the long run if the optimizations save
enough time to make up for compiling the code every time.

>What do you think about this?  Is there a better practice that achieves
>the same in a more elegant way?

It seems like you're making something very complex for little benefit.  It
also means that a code shaker can't leave the compiler out of the runtime
environment.

-- 
Barry Margolin, ······@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
From: Robert Monfera
Subject: Re: Manual Hot Spot optimization in Lisp
Date: 
Message-ID: <38963456.CA504689@fisec.com>
Barry Margolin wrote:

> What's the second funcall for?

You're right, it was verbose - I was playing with other variations
(like defining ADDER as a macro etc.) and the extra FUNCALL got left
in.  I simplified it.

> It seems like you're making something very complex for little
> benefit.  It also means that a code shaker can't leave the compiler
> out of the runtime environment.

These optimizations make an orders-of-magnitude difference in run time
and garbage avoidance (0.3s vs. 31s, due to consing hundreds of megs on
a 64MB array).

I don't think it is very complex either - the difference between my
"naive" adder and the optimized one is very small (a few selectively
placed commas).  In any case, I listed the preconditions for even
considering such optimizations, to avoid the stigma of premature
optimization :-)

As for the time required for compilation of short closures, it is
practically immeasurable on ACL - I hear it is not the case on the Lispm
though.
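
And even if it did matter, the cost could be amortized by caching the
compiled closures instead of recompiling on every call.  A rough sketch
(*ADDER-CACHE*, MAKE-ADDER-FOR and CACHED-ADDER are made-up names, not
part of the example above):

(defvar *adder-cache* (make-hash-table :test #'equal))

(defun make-adder-for (element-type length)
  ;; compile one specialized adder per (element-type . length); the
  ;; array is passed as an argument instead of being spliced in as a
  ;; literal, so the compiled function can be reused
  (compile nil
           `(lambda (a)
              (declare (optimize speed)
                       (type (simple-array ,element-type (,length)) a))
              (let ((result 0d0))
                (declare (type double-float result))
                (dotimes (i ,length result)
                  (incf result (aref a i)))))))

(defun cached-adder (a)
  (let ((key (cons (array-element-type a) (length a))))
    (funcall (or (gethash key *adder-cache*)
                 (setf (gethash key *adder-cache*)
                       (make-adder-for (array-element-type a)
                                       (length a))))
             a)))

Only the first (cached-adder *f*) pays the compilation time; later calls
on arrays of the same element type and length just funcall the cached
function.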

This is an example of when and why it's a good idea to keep the
compiler in the image - not necessarily to let the user implement
additional functionality, but rather to have the compiler give its
final touches once the specifics become known.

Thanks for your thoughts,
Robert
From: Tim Bradshaw
Subject: Re: Manual Hot Spot optimization in Lisp
Date: 
Message-ID: <ey37lgp9oax.fsf@cley.com>
* Barry Margolin wrote:

> It seems like you're making something very complex for little benefit.  It
> also means that a code shaker can't leave the compiler out of the runtime
> environment.

But isn't that (not leaving the compiler out) now the fashionable
thing to do?  People have berated Lisp forever for not being able to
ship 10-byte binaries, but now all the Java runtimes have JIT
compilation systems probably much more hairy than the average Lisp
compiler.  People are now even trying to sell *processors* (Transmeta)
with compilers in them!

I suggest that the symbols COMPILE and COMPILE-FILE be immediately
changed to MORPH-CODE and MORPH-CODE-FILE, and EVAL become
RUN-VIRTUAL-MACHINE.  Lisp will immediately become the language of
choice for internet applications!  

Or something.

--tim
From: Jan Vroonhof
Subject: Re: Manual Hot Spot optimization in Lisp
Date: 
Message-ID: <bysnzdf23g.fsf@bolzano.math.ethz.ch>
Tim Bradshaw <···@cley.com> writes:

> I suggest that the symbols COMPILE and COMPILE-FILE be immediately
> changed to MORPH-CODE and MORPH-CODE-FILE, and EVAL become
> RUN-VIRTUAL-MACHINE.  Lisp will immediately become the language of
> choice for internet applications!  

You forgot the all-important "Change all parentheses to angle
brackets".

Jan
From: Robert Posey
Subject: Re: Manual Hot Spot optimization in Lisp
Date: 
Message-ID: <38973ED6.3F4FE08B@raytheon.com>
Jan Vroonhof wrote:
> 
> Tim Bradshaw <···@cley.com> writes:
> 
> > I suggest that the symbols COMPILE and COMPILE-FILE be immediately
> > changed to MORPH-CODE and MORPH-CODE-FILE, and EVAL become
> > RUN-VIRTUAL-MACHINE.  Lisp will immediately become the language of
> > choice for internet applications!
> 
> You forgot the all-important "Change all parentheses to angle
> brackets".

You may have hit on one of the major "Problems" with LISP.  If someone learns
C, Basic, JAVA or even Fortran first, LISP syntax is going to require a
considerable initial mental twist.  LISP syntax is concise, which helps in
an interpreted language, but in a compiled one has very little if any
performance advantage.  The result is that if you don't know LISP, the
code is much harder to get an initial grasp of than a similar language.
This is why JAVA's designers were so clever to use C's syntax when possible.
LISP syntax has advantages, but the effort for that first level of
understanding is pretty high.

Muddy

> 
> Jan
From: Tim Bradshaw
Subject: Re: Manual Hot Spot optimization in Lisp
Date: 
Message-ID: <ey3vh488xec.fsf@cley.com>
* Robert Posey wrote:

> You may have hit on one of the major "Problems" with LISP.  If someone learns
> C, Basic, JAVA or even Fortran first, LISP syntax is going to require a
> considerable initial mental twist.  LISP syntax is concise, which helps in
> an interpreted language, but in a compiled one has very little if any
> performance advantage.  The result is that if you don't know LISP, the
> code is much harder to get an initial grasp of than a similar language.
> This is why JAVA's designers were so clever to use C's syntax when possible.
> LISP syntax has advantages, but the effort for that first level of
> understanding is pretty high.

Well, it takes at least a few days to get used to it, yes.  I guess in
these days of `learn Fortran 90 in 3 hours while also talking on your
mobile phone and reading news' books that must seem like a lot.

The interpreted / compiled thing is somewhat vacuous...
From: Robert Posey
Subject: Re: Manual Hot Spot optimization in Lisp
Date: 
Message-ID: <38975D75.EB852653@raytheon.com>
Tim Bradshaw wrote:
> 
> * Robert Posey wrote:
> 
> > You may have hit on one of the major "Problems" with LISP.  If someone learns
> > C, Basic, JAVA or even Fortran first, LISP syntax is going to require a
> > considerable initial mental twist.  LISP syntax is concise, which helps in
> > an interpreted language, but in a compiled one has very little if any
> > performance advantage.  The result is that if you don't know LISP, the
> > code is much harder to get an initial grasp of than a similar language.
> > This is why JAVA's designers were so clever to use C's syntax when possible.
> > LISP syntax has advantages, but the effort for that first level of
> > understanding is pretty high.
> 
> Well, it takes at least a few days to get used to it, yes.  I guess in
> these days of `learn Fortran 90 in 3 hours while also talking on your
> mobile phone and reading news' books that must seem like a lot.
> 
> The interpreted / compiled thing is somewhat vacuous...

Oh, my theory is admittedly based on Scheme used in robots, where it is
supposedly an advantage that you only have to download a much smaller
file to the robot over a slow serial link.  While the complete image of
a C system may be smaller, the actual load module is much bigger.
Scheme's compact code size is supposed to make this process many times
faster.  I would assume the same would be true of LISP, though the
footprint of the interpreter would be larger.  If someone new to
Fortran can learn F90 in 3 hours I should try it; however, I think LISP
is more like a couple of weeks to get to the point where you have a
feel for the language.  Of course, any language is going to take years
to become an expert in.  I will admit the effective elimination of
operator precedence is a very good thing for maintaining and debugging
code.  The more I use C, the more I become convinced that a lot of C
code works by the grace of the Goddess and not by intelligent design.


Muddy
From: Christopher Browne
Subject: Re: Manual Hot Spot optimization in Lisp
Date: 
Message-ID: <UUPl4.2903$OI1.128762@news5.giganews.com>
Centuries ago, Nostradamus foresaw a time when Robert Posey would say:
>Jan Vroonhof wrote:
>> Tim Bradshaw <···@cley.com> writes:
>> > I suggest that the symbols COMPILE and COMPILE-FILE be immediately
>> > changed to MORPH-CODE and MORPH-CODE-FILE, and EVAL become
>> > RUN-VIRTUAL-MACHINE.  Lisp will immediately become the language of
>> > choice for internet applications!
>> 
>> You forgot the all-important "Change all parentheses to angle
>> brackets".
>
>You may have hit on one of the major "Problems" with LISP.  If someone
>learns C, Basic, JAVA or even Fortran first, LISP syntax is going to
>require a considerable initial mental twist.  LISP syntax is concise,
>which helps in an interpreted language, but in a compiled one has very
>little if any performance advantage.  The result is that if you don't
>know LISP, the code is much harder to get an initial grasp of than a
>similar language.  This is why JAVA's designers were so clever to use
>C's syntax when possible.  LISP syntax has advantages, but the effort
>for that first level of understanding is pretty high.

This is about as logical as claiming that COBOL is more maintainable
because "it looks just like English."

The *clever* thing about Java is that they used a syntax very similar
to C++, thus allowing the people that had jumped on the 
   "Let's adopt C++!!!" 
bandwagon to, as a result of the seeming similarity, make what appears
a small leap to the
   "Let's adopt Java!!!  It looks a lot like C++!!!"
bandwagon.

The fact that Lisp has a "bunch of parentheses" is, to these people, a
Bad Thing in much the same manner and to the same degree that:

- The use of [] and | in Smalltalk and Objective C is a Bad Thing,
  making them harder to grasp than C++,

- The lack of vast quantities of {braces} in Eiffel and Modula3 is a
  Bad Thing, making them harder to grasp than C++,

- The use, in Python, of indentation to express control structure is
  a Bad Thing, making it harder to grasp than C++.

[To the Hint-challenged...  I'd tend to think that the mass adoption
of *any* of these languages would probably have resulted in higher
quality software than the mass adoption of C++...]
-- 
"Parentheses?  What  parentheses? I  haven't  noticed any  parentheses
since my  first month of Lisp  programming.  I like to  ask people who
complain about  parentheses in  Lisp if they  are bothered by  all the
spaces between words in a newspaper..."
-- Kenny Tilton <····@liii.com>
········@ntlug.org- <http://www.hex.net/~cbbrowne/c.html>
From: Lars Marius Garshol
Subject: Re: Manual Hot Spot optimization in Lisp
Date: 
Message-ID: <m3zotj3krb.fsf@lambda.garshol.priv.no>
* Robert Posey
| 
| You may have hit on one of the major "Problems" with LISP.  If someone
| learns C, Basic, JAVA or even Fortran first, LISP syntax is going to
| require a considerable initial mental twist.  

I think it's not so much the syntax as the overall design of the
language itself. It's quite simply rather different from these
languages, which on the other hand are very similar.

Getting used to things like the lack of a for statement, the
strangeness of do and let, the fact that every form returns a value,
and many more things like that were the main obstacles I felt when I
tried to learn Lisp.

The problem was that the ways in which I usually wrote programs in
traditional programming languages did not translate very easily to
Lisp and I had to change the low-level formulation of my program in a
way I wasn't used to. The parentheses had nothing to do with it.
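
(A tiny illustration of the "everything returns a value" point - SIGN
is just a made-up example:

(defun sign (x)
  ;; IF is an expression, so its value can be used directly where C
  ;; would need ?: or an assignment in each branch
  (if (minusp x) -1 1))

Coming from a statement-oriented language, the instinct is to assign to
a variable inside each branch instead.)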

| LISP syntax is concise, which helps in an interpreted language, 

Huh? Nearly all interpreted languages are compiled to some more
efficient format for execution (such as bytecode), and so the syntax
itself no longer matters.

| The result is that if you don't know LISP, the code is much harder
| to get an initial grasp of than a similar language.  This is why
| JAVA's designers were so clever to use C's syntax when possible.

Marketing-wise it was a necessary decision, I think, but I don't think
it says much about Lisp. Of course it's hard to understand the
programs when you don't know the syntax.

--Lars M.
From: Fernando
Subject: Re: Manual Hot Spot optimization in Lisp
Date: 
Message-ID: <gb3g9s0jqvm0eq2o3rhgton4hel5n0g97l@4ax.com>
On Tue, 01 Feb 2000 14:15:18 -0600, Robert Posey <·····@raytheon.com>
wrote:


>You may have hit on one of the major "Problems" with LISP.  If someone learns
>C, Basic, JAVA or even Fortran first, LISP syntax is going to require a
>considerable initial mental twist. 

	Maybe...  During the first week the parentheses don't let you
see the Lisp. ;-)  I don't think this is a serious problem.  If
cryptic syntax were a serious handicap, nobody would use C++ or Perl....




//-----------------------------------------------
//	Fernando Rodriguez Romero
//
//	frr at mindless dot com
//------------------------------------------------
From: Jon K Hellan
Subject: Re: Manual Hot Spot optimization in Lisp
Date: 
Message-ID: <87u2jrckxl.fsf@parus.parus.no>
Robert Posey <·····@raytheon.com> writes:

> Jan Vroonhof wrote:
> You may have hit on one of the major "Problems" with LISP.  If someone learns
> C, Basic, JAVA or even Fortran first, LISP syntax is going to require a
> considerable initial mental twist. 

On the other hand, a Fortran programmer I know became productive in
Emacs Lisp in a matter of hours.

Jon
From: Bulent Murtezaoglu
Subject: Re: Manual Hot Spot optimization in Lisp
Date: 
Message-ID: <874sbtwum5.fsf@kapi.internal>
[...]
    RM> Conditions of applicability:
    RM> - information useful for the compiler becomes available run-time
    RM>   OR some overhead (e.g., function call) can be avoided
    RM> - expected speedup outweighs compilation time
    RM> - benefits of speedup outweigh extra development efforts
    RM> - it's OK to have the compiler in the image  [...]

I agree with all of this, and I think I may have posted sample code
that does this in the past.  I agree with your example also but of
course every case will be different and I could not think up an
elegant facility to generalize the process.  Nonetheless, I think it
is a powerful technique that helps enormously with gazillion-iteration
inner loops in a manner that just punting to FFI'ed C cannot do (punting 
to C takes care of inefficiencies in the Lisp compiler implementation, 
not inefficiencies caused by compile-time ignorance).

Similarly, I would love to be able to de-CLOS-ify and compile inner
loops at run-time.  Outside of Henry Baker's (Clostrophobia?) paper I
don't know of any references for this kind of optimization in general
and optimization via runtime compilation in particular.  Any relevant
pointers?

BM
From: Robert Monfera
Subject: Re: Manual Hot Spot optimization in Lisp
Date: 
Message-ID: <3896EE30.CFDABB2A@fisec.com>
Bulent Murtezaoglu wrote:
>
> [...]
>     RM> Conditions of applicability:
>     RM> - information useful for the compiler becomes available run-time
>     RM>   OR some overhead (e.g., function call) can be avoided
>     RM> - expected speedup outweighs compilation time
>     RM> - benefits of speedup outweigh extra development efforts
>     RM> - it's OK to have the compiler in the image  [...]
>
> I agree with all of this, and I think I may have posted sample code
> that does this in the past.

Thanks, I got my hands on it:

http://x24.deja.com/getdoc.xp?AN=269334305

This is a more detailed write-up with a better explanation of how it may
make Lisp faster than any conventional language (even assembly :-).
Recommended reading for people asking about unique Lisp features.

> I agree with your example also but of
> course every case will be different and I could not think up an
> elegant facility to generalize the process.

Perhaps someone has an opinion on this.  In 1997, your posting wasn't
followed up, maybe because people didn't use these techniques in a
systematic manner.

Probably the core idea is to do additional analysis and recompilation if
it is known that elements of collections (practically, arrays) of a
certain type will be hit a lot of times.  In the case of arrays and
structs, it's not only recompilation with specific types but also their
access-time optimized representation, while space efficiency can also be
drastically improved if collections of instances are known to be of a
specific class.

Another ingredient is loop optimization: hoisting invariant
computations out of the loop does not necessarily happen currently
(maybe because of possible side effects):

(defun test-loop (a)
  (declare (optimize speed (safety 0))
           (type fixnum a))
  (let ((result 0))
    (declare (type fixnum result))
    (dotimes (i 10000000 result)
      (declare (type fixnum i))
      ;; (max a) is a loop-invariant call that the compiler may or may
      ;; not hoist out of the loop
      (incf result (the fixnum (+ i (the fixnum (max a))))))))

(defun test-loop2 (a)
  (declare (optimize speed (safety 0))
           (type fixnum a))
  (let ((result 0) (m (max a)))
    (declare (type fixnum result m))
    (dotimes (i 10000000 result)
      (declare (type fixnum i))
      (incf result (the fixnum (+ i m))))))
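
Comparing the two is straightforward; the absolute timings are
implementation-dependent, but the relative difference shows whether the
invariant (max a) call gets hoisted:

(time (test-loop 42))
(time (test-loop2 42))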

Maybe slot access improves if you prepare low-level slot indices outside
the loop and use more primitive (platform-dependent) slot accessors.
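
A portable approximation - short of the platform-dependent accessors -
is to pull the slot values out through the ordinary readers once, into
a specialized vector, and let the hot loop run over that.  A sketch
(POINT, POINT-X, EXTRACT-XS and SUM-XS are made-up names for
illustration):

(defclass point ()
  ((x :initarg :x :reader point-x)))

(defun extract-xs (points)
  ;; CLOS dispatch happens here, once per instance; POINTS is assumed
  ;; to be a sequence of POINT instances whose X slots hold double-floats
  (map '(simple-array double-float (*)) #'point-x points))

(defun sum-xs (xs)
  ;; the inner loop sees only unboxed double-floats, no CLOS dispatch
  (declare (optimize speed)
           (type (simple-array double-float (*)) xs))
  (let ((result 0d0))
    (declare (type double-float result))
    (dotimes (i (length xs) result)
      (incf result (aref xs i)))))

This only pays off when the extracted vector is traversed more than
once, of course.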

> Similarly, I would love to be able to de-CLOS-ify and compile inner
> loops at run-time.  Outside of Henry Baker's (Clostrophobia?) paper I
> don't know of any references for this kind of optimization in general
> and optimization via runtime compilation in particular.  Any relevant
> pointers?

Surely you know about the class sealing facility of Dylan also.  I think
it was discussed here before, where it was pointed out that it's not
usually productive to consider class hierarchies static.  Baker has a
point that changes to database schemata are often traumatic already,
and the extra requirement of compilation does not make it much worse.
It can be softened further if the recompilation happens in the run-time
system, perhaps transparently to the users.

Robert
From: Bulent Murtezaoglu
Subject: Re: Manual Hot Spot optimization in Lisp
Date: 
Message-ID: <87snz9vdxq.fsf@kapi.internal>
>>>>> "RM" == Robert Monfera <·······@fisec.com> writes:
[...]
    RM> Perhaps someone has an opinion on this.  In 1997, your posting
    RM> wasn't followed up, maybe because people didn't use these
    RM> techniques in a systematic manner.

This kind of runtime compilation only makes sense in very specific cases
(as you have outlined).  That's probably why.  

    RM> Probably the core idea is to do additional analysis and
    RM> recompilation if it is known that elements of collections
    RM> (practically, arrays) of a certain type will be hit a lot of
    RM> times.  In the case of arrays and structs, it's not only
    RM> recompilation with specific types but also their access-time
    RM> optimized representation, while space efficiency can also be
    RM> drastically improved if collections of instances are known to
    RM> be of a specific class.

Yes, this does happen to a degree with present compilers.  I take the 
class in the last sentence to not mean CLOS class.

[the parts I agree with deleted]

    RM> Maybe slot access improves if you prepare low-level slot
    RM> indices outside the loop and use more primitive
    RM> (platform-dependent) slot accessors.

Yes, but as soon as you do that you are breaking portability and it
could be argued you are not using CL any more. If you need that then,
with a good FFI, punting to C is also a possibility.  Again, I think
this will only be needed in rare cases, and there are solutions within
CL that would cope with it and cope with it much better than static C
code could. There is not much in the language that precludes
efficiency (if you allow run-time compilation and tricks like that).
CLOS is the only clear exception to the above that I know of.

[...]
    RM> Surely you know about the class sealing facility of Dylan
    RM> also.  

Surely I knew it at some point, but I'd completely forgotten about it 
when I posted!  

    RM> I think it was discussed here before, where it was
    RM> pointed out that it's not usually productive to consider class
    RM> hierarchies static.  [...]

I agree with this, though I could not construct an example off the top of 
my head to illustrate this point.  People who use CLOS every day might 
be able to.

BM
From: Robert Monfera
Subject: Re: Manual Hot Spot optimization in Lisp
Date: 
Message-ID: <389A4E2B.D33A37EE@fisec.com>
Bulent Murtezaoglu wrote:

>     RM> [...] while space efficiency can also be
>     RM> drastically improved if collections of instances are known to
>     RM> be of a specific class.
>
> Yes, this does happen to a degree with present compilers.  I take the
> class in the last sentence to not mean CLOS class.

I did indeed mean CLOS classes, but "can" may have been an ambiguous
verb.  What I wanted to say is that the space overhead associated with
CLOS class (or struct) instances could theoretically be eliminated if
all elements of an array are instances of a specific CLOS class (or
struct).  I am inexperienced in language implementation, but this looks
perfectly analogous to how good a job ACL, for example, does at storing
and accessing double-floats directly in the array, rather than via
pointers to boxed double-floats found somewhere else in memory.

I ended up using the word "can" because CL implementations are powerful
enough to allow users to implement it themselves, or to work with their
vendor if they really need to.
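
(For what it's worth, the standard already lets you ask whether a given
element type gets an unboxed array representation - these two forms are
just an illustration:

(upgraded-array-element-type 'double-float)    ; DOUBLE-FLOAT if specialized
(upgraded-array-element-type 'standard-object) ; typically T

The first returns DOUBLE-FLOAT in implementations that store such
arrays unboxed; for instances of CLOS classes the answer is typically
T, i.e. an array of pointers, which is exactly the space overhead
described above.)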

Robert
From: Eric Marsden
Subject: Re: Manual Hot Spot optimization in Lisp
Date: 
Message-ID: <wzizotljoqz.fsf@mail.dotcom.fr>
>>>>> "rm" == Robert Monfera <·······@fisec.com> writes:

  rm> On a recent thread somehow related to CL's relative performance
  rm> vs. that of C, someone mentioned that you can't compare the two,
  rm> because in CL you can compile fast code on the fly, while in C
  rm> one can't. I share this view, and I'd like to solicit comments
  rm> on whether the following is a good way to exploit the compiler.

you'll find a large number of references to research work in this area
using the keywords runtime code generation, dynamic compilation,
value-specific optimizations (VSOs). See

   <URL:http://www.cs.berkeley.edu/~abegel/cs265/biblio.html>  

-- 
Eric Marsden