From: Peter Scott
Subject: Potential speed
Date: 
Message-ID: <659f82ff.0410231107.6dcebec5@posting.google.com>
Lisp is generally a good deal slower than C unless you do heavy
optimizations on the bottlenecks that you identify through profiling.
However, does Lisp make it easier for a compiler to do some
optimizations that could make it faster than C?

For example, suppose that you want to sum up all the numbers in a
vector. In C, that could be done with this code (courtesy of Larry
O'Brien):

double arr[100]; //<- Values set previously
double sum;
sum = 0.0;
for (int i = 0; i < 100; i++) {
   sum += arr[i];
}

C's for loop doesn't offer much information about what sort of looping
is being done. In Lisp, you could write the code like this:

(reduce #'+ arr)

Aside from being simpler and easier to understand, this is easier to
optimize. If you convert the C code to this:

double arr[100]; //<- Values set previously
double sum, s1, s2, s3, s4;
sum = s1 = s2 = s3 = s4 = 0.0;
for (int i = 0; i < 100; i += 4) {
   s1 += arr[i];
   s2 += arr[i + 1];
   s3 += arr[i + 2];
   s4 += arr[i + 3];
}
sum = s1 + s2 + s3 + s4; 

Then a C compiler is more likely to optimize it by making the
additions proceed in parallel, giving a big speedup on CPUs that
support SIMD operations. But it's ugly, and not the sort of thing you
want to think about.

In Lisp, calling reduce on an arithmetic function and an array could be
easily optimized to do this same sort of thing.
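
For instance, once the element type is known, (reduce #'+ arr) could
effectively specialize into something like the function below (a
hypothetical sketch with a made-up name, not actual output of any
particular compiler). A tight, typed loop like this is exactly the
kind of thing a compiler is free to unroll or vectorize:

  ;; Hypothetical specialization of (reduce #'+ arr) for a known
  ;; element type; the declarations give the compiler the same
  ;; information C's type system provides.
  (defun sum-doubles (arr)
    (declare (type (simple-array double-float (*)) arr)
             (optimize (speed 3) (safety 0)))
    (let ((sum 0d0))
      (declare (double-float sum))
      (dotimes (i (length arr) sum)
        (incf sum (aref arr i)))))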

For another example, suppose that you have a program which does a lot
of lookups in a hash table (or some searching data structure)---and
almost all of them miss. There are special data structures optimized
for the search miss operation. A sufficiently smart compiler could use
one of those for the searching.
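
One such structure is a Bloom filter, which can answer "definitely not
present" cheaply at the cost of occasional false positives. A minimal
sketch (the two hash functions are arbitrary choices, purely for
illustration):

  (defun make-bloom (bits)
    (make-array bits :element-type 'bit :initial-element 0))

  (defun bloom-hashes (key bits)
    ;; Two cheap, arbitrary hash functions, just for illustration.
    (let ((h (sxhash key)))
      (list (mod h bits) (mod (* 31 h) bits))))

  (defun bloom-add (filter key)
    (dolist (i (bloom-hashes key (length filter)))
      (setf (sbit filter i) 1)))

  (defun bloom-miss-p (filter key)
    ;; T means the key is definitely absent; NIL means "maybe
    ;; present", so fall through to the real hash table lookup.
    (some (lambda (i) (zerop (sbit filter i)))
          (bloom-hashes key (length filter))))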

However, a sufficiently smart compiler is very hard to write. A Lisp
runtime could instead do some profiling on the searches, gather
information, and use it to
automatically improve the compilation strategy. I've heard that Java
has a VM that does this sort of thing, with very impressive results.

Is this stuff possible? How does Lisp compare with C for optimization
potential? Am I just obviously clueless?

From: Christopher C. Stacy
Subject: Re: Potential speed
Date: 
Message-ID: <uwtxhqhaq.fsf@news.dtpq.com>
·········@chase3000.com (Peter Scott) writes:
> Lisp is generally a good deal slower than C unless you do heavy optimizations

It is?
From: Peter Scott
Subject: Re: Potential speed
Date: 
Message-ID: <7e267a92.0410240845.579424f7@posting.google.com>
······@news.dtpq.com (Christopher C. Stacy) wrote in message news:<·············@news.dtpq.com>...
> ·········@chase3000.com (Peter Scott) writes:
> > Lisp is generally a good deal slower than C unless you do heavy optimizations
> 
> It is?

That depends on how you define "a good deal". I used "a good deal
slower" to mean "if you transform a C program into a Lisp one, chances
are that the Lisp one will be slower". I heard about 30% slower on
average from Duane Rettig. Lisp is pretty fast, and if you put a
little effort into the critical sections of your code, it can be fast
enough for most things. However, the C language simply seems closer to
the machine, and benchmarks typically put C programs a bit ahead of
Lisp ones, although there have been some notable reversals.

C strikes me as a language for writing programs that are completely
optimized at a low level. Sure, I think it's *premature*
optimization---but it's optimization just the same.

I was wondering, though, if Lisp's more high-level approach could make
it easier, in some cases, to write fast programs. In cl-ppcre, for
instance, Lisp's macros and built-in compiler make it relatively
painless to compile regular expressions to native code, resulting in
impressive speed.
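
As a toy illustration of the general technique (not cl-ppcre's actual
machinery), you could bake a literal pattern into a lambda and hand
the generated code to COMPILE at runtime:

  ;; Hypothetical sketch: specialize a matcher to one literal string
  ;; and compile it to native code on the spot.
  (defun make-compiled-matcher (literal)
    (compile nil `(lambda (string)
                    (declare (simple-string string))
                    (search ,literal string))))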

-Peter
From: Edi Weitz
Subject: Re: Potential speed
Date: 
Message-ID: <ud5z8585m.fsf@agharta.de>
On 24 Oct 2004 09:45:51 -0700, ·········@gmail.com (Peter Scott) wrote:

> I was wondering, though, if Lisp's more high-level approach could
> make it easier, in some cases, to write fast programs. In cl-ppcre,
> for instance, Lisp's macros and built-in compiler make it relatively
> painless to compile regular expressions to native code, resulting in
> impressive speed.

In cl-ppcre, the Lisp compiler is not used at runtime. Regex scanners
are created by chaining closures which were compiled when the library
was built.
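
A rough sketch of that style (not cl-ppcre's real internals): each
matcher is a closure over the next link in the chain, so building a
scanner at runtime merely composes closures whose code was compiled
long ago.

  (defun literal-char (ch next)
    ;; Match one character at POS, then hand off to the next matcher.
    (lambda (string pos)
      (and (< pos (length string))
           (char= (char string pos) ch)
           (funcall next string (1+ pos)))))

  (defun end-anchor ()
    (lambda (string pos)
      (= pos (length string))))

  ;; A scanner for the fixed "regex" ab$, built by chaining:
  ;; (funcall (literal-char #\a (literal-char #\b (end-anchor))) "ab" 0)
  ;; => T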

Cheers,
Edi.

-- 

Lisp is not dead, it just smells funny.

Real email: (replace (subseq ·········@agharta.de" 5) "edi")
From: Julian Stecklina
Subject: Re: Potential speed
Date: 
Message-ID: <867jpfx002.fsf@goldenaxe.localnet>
·········@gmail.com (Peter Scott) writes:

> C strikes me as a language for writing programs that are completely
> optimized at a low level. Sure, I think it's *premature*
> optimization---but it's optimization just the same.

C is only a very thin layer over assembler. Using a good macro
assembler to write programs is quite as effective as using C, in my
opinion. Most of the time I found it easier to write stuff with TASM
than in C. That does not protect you from writing slow code, though.

(It has to be mentioned here that C has the obvious advantage that it
is (kind of) portable. Your assembler code will eventually be a pain
to maintain.)

Regards,
-- 
                    ____________________________
 Julian Stecklina  /  _________________________/
  ________________/  /
  \_________________/  LISP - truly beautiful
From: Adam Warner
Subject: Re: Potential speed
Date: 
Message-ID: <pan.2004.10.25.00.08.29.378873@consulting.net.nz>
Hi Peter Scott,

> That depends on how you define "a good deal". I used "a good deal
> slower" to mean "if you transform a C program into a Lisp one, chances
> are that the Lisp one will be slower".

If you transform a non-trivial Lisp program into a C one, chances are
that the C one will be slower. Much slower. You are very likely to run into
Greenspun's Tenth Rule of Programming[1] while trying to produce C code
with the same semantics as the Lisp version. To avoid this you should use
a pre-existing, debugged, high performance Lisp to C compiler. GCL is a
potential candidate.

It's even possible to appreciate some of this complexity with a trivial
program:

  (defun factorial (n) (if (zerop n) 1 (* n (factorial (- n 1)))))

What's your C translation of:

  (mapcar #'factorial (mapcar #'random (make-list 10 :initial-element 100)))
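
(In case the difficulty isn't obvious: RANDOM yields integers below
100 here, and factorial leaves every native C integer type behind
almost immediately. For instance,

  (factorial 25)
  => 15511210043330985984000000

which is already far past the range of a 64-bit unsigned int, so the C
version needs a bignum library before it can even start.)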

Regards,
Adam

[1] <http://philip.greenspun.com/research/>
    "Any sufficiently complicated C or Fortran program contains an ad-hoc,
     informally-specified bug-ridden slow implementation of half of Common
     Lisp."
From: Cesar Rabak
Subject: Re: Potential speed
Date: 
Message-ID: <417CD12A.5090307@acm.org>
Adam Warner escreveu:
> Hi Peter Scott,
> 
> 
>>That depends on how you define "a good deal". I used "a good deal
>>slower" to mean "if you transform a C program into a Lisp one, chances
>>are that the Lisp one will be slower".
> 
> 
> If you transform a non-trivial Lisp program into a C one, chances are
> that the C one will be slower. Much slower. You are very likely to run into
> Greenspun's Tenth Rule of Programming[1] while trying to produce C code
> with the same semantics as the Lisp version. To avoid this you should use
> a pre-existing, debugged, high performance Lisp to C compiler. GCL is a
> potential candidate.
> 
> It's even possible to appreciate some of this complexity with a trivial
> program:
> 
>   (defun factorial (n) (if (zerop n) 1 (* n (factorial (- n 1)))))
> 
> What's your C translation of:
> 
>   (mapcar #'factorial (mapcar #'random (make-list 10 :initial-element 100)))
> 
> Regards,
Er... Adam!

Since in the example above the list values are thrown away, and the
only results we get are the ones from applying factorial to the
outcomes of ten random values, do you agree that the problem would be
solved if a horde of C coders just came here with a for loop to
substitute for the mapcar construct?

I take it the example above is about semantics, and not about other
advantages of Lisp like the accuracy of the results, right? :-)

just my .19999....
From: Adam Warner
Subject: Re: Potential speed
Date: 
Message-ID: <pan.2004.10.25.11.23.57.419339@consulting.net.nz>
Hi Cesar Rabak,

> [...] do you agree that the problem will be solved if a horde of C
> coders just come here with a for loop construct to substitute the mapcar
> construct?

The horde's still stuck writing the factorial function!

Have fun,
Adam
From: Joe Marshall
Subject: Re: Potential speed
Date: 
Message-ID: <r7nmvgyp.fsf@ccs.neu.edu>
·········@gmail.com (Peter Scott) writes:

> That depends on how you define "a good deal". I used "a good deal
> slower" to mean "if you transform a C program into a Lisp one, chances
> are that the Lisp one will be slower". 

Hmmm.  So if you transform a Lisp program into a C program it'll be
faster?

> I was wondering, though, if Lisp's more high-level approach could make
> it easier, in some cases, to write fast programs. 

Yes.

At one company I worked for, we had a client that needed to parse
output from legacy FORTRAN programs.  One thing they did was locate
specific strings in the FORTRAN output and grab the text right
afterwards (`screen scraping').  We didn't know what the specific
strings were going to be, but it wasn't as if they were going to
change very often.  Since parsing these files was the major
bottleneck, it was worthwhile to optimize the string searching.

The Boyer-Moore string searcher is quite good, but the usual technique
is to write code that takes the Boyer-Moore tables and the string in
which to search as two parameters.  What we did in Lisp was to use the
Boyer-Moore tables to generate the Lisp code that searched for the
particular string.  We then ran the resultant code through the
compiler.

The compiled function only knew how to find one particular string, but
it was *FAST*.
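
A toy reconstruction of the idea (not the original code): compute the
Boyer-Moore-Horspool skip table for one pattern, bake the table and
pattern into a generated lambda, and compile it. This sketch assumes
8-bit character codes; a serious version would emit more of the
table-driven logic inline.

  (defun make-specialized-searcher (pattern)
    (let* ((m (length pattern))
           (skip (make-array 256 :initial-element m)))
      ;; Bad-character rule: how far we may safely shift on a mismatch.
      (dotimes (j (1- m))
        (setf (aref skip (char-code (char pattern j))) (- m j 1)))
      (compile nil
               `(lambda (text)
                  (declare (simple-string text))
                  (let ((skip ,skip)
                        (n (length text))
                        (i ,(1- m)))
                    (loop
                      (when (>= i n) (return nil))
                      (when (string= ,pattern text
                                     :start2 (- i ,(1- m)) :end2 (1+ i))
                        (return (- i ,(1- m))))
                      (incf i (aref skip (char-code (char text i))))))))))

  ;; E.g. (funcall (make-specialized-searcher "TOTAL ENERGY") line)
  ;; returns the position of the match in LINE, or NIL.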
From: Duane Rettig
Subject: Re: Potential speed
Date: 
Message-ID: <4654yhd8y.fsf@franz.com>
Joe Marshall <···@ccs.neu.edu> writes:

> ·········@gmail.com (Peter Scott) writes:
> 
> > That depends on how you define "a good deal". I used "a good deal
> > slower" to mean "if you transform a C program into a Lisp one, chances
> > are that the Lisp one will be slower". 
> 
> Hmmm.  So if you transform a Lisp program into a C program it'll be
> faster?

Actually, our experience with customers who have tried it is that
when they convert their application from Lisp to <non-Lisp> it
gets faster, and when they convert back from <non-lisp> to Lisp,
it gets faster.  (no, sorry, I can't give references :-)

What this really amounts to is a "rewrite" (normally a nasty word
in the eyes of management - they suddenly see a black hole of
programmer non-productivity and rail against it).  When rewrites
occur, I believe that they tend to have the same effect, to the
extent that the rewrites are complete and not partial.  Of course,
moving an application to another language more or less forces a
complete rewrite, and thus allows the benefit of the intimate
knowledge of the problem domain, without retaining much if any of
the baggage of botched experiments that are still in the old code
base.  Those customers who came back to CL after rewriting in C++,
Java, etc., having found that the program ran faster but was harder
to maintain, saw even more speedup on the return trip, either due to
discovering still more cobwebs that had been accidentally transferred
in the translation, or due to new and better ways of doing things.

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182   
From: Joe Marshall
Subject: Re: Potential speed
Date: 
Message-ID: <7jpe2juf.fsf@comcast.net>
Duane Rettig <·····@franz.com> writes:

> Joe Marshall <···@ccs.neu.edu> writes:
>
>> ·········@gmail.com (Peter Scott) writes:
>> 
>> > That depends on how you define "a good deal". I used "a good deal
>> > slower" to mean "if you transform a C program into a Lisp one, chances
>> > are that the Lisp one will be slower". 
>> 
>> Hmmm.  So if you transform a Lisp program into a C program it'll be
>> faster?
>
> Actually, our experience with customers who have tried it is that
> when they convert their application from Lisp to <non-Lisp> it
> gets faster, and when they convert back from <non-lisp> to Lisp,
> it gets faster.  (no, sorry, I can't give references :-)

How do you average in the speed of the <non-Lisp> version when it
doesn't work at all?

-- 
~jrm
From: George Neuner
Subject: Re: Potential speed
Date: 
Message-ID: <u1htn05jqqhosc039j1ts7err7t2usgkbs@4ax.com>
On Tue, 26 Oct 2004 04:18:18 GMT, Joe Marshall
<·············@comcast.net> wrote:

>Duane Rettig <·····@franz.com> writes:
>
>> Actually, our experience with customers who have tried it is that
>> when they convert their application from Lisp to <non-Lisp> it
>> gets faster, and when they convert back from <non-lisp> to Lisp,
>> it gets faster.  (no, sorry, I can't give references :-)
>
>How do you average in the speed of the <non-Lisp> version when it
>doesn't work at all?

When I worked in embedded systems, I had a manager who wouldn't allow
Lisp to be used because he had been burned badly in previous projects
by lousy programmers who wrote crap in Lisp that worked in spite of
itself.

He once said to me something to the effect of "it's better to have a C
program that doesn't work at all than a Lisp program that half works.
I can get somebody to fix the C program."

I was never able to change his mind about it.

George
-- 
for email reply remove "/" from address
From: Joe Marshall
Subject: Re: Potential speed
Date: 
Message-ID: <zn28rzej.fsf@ccs.neu.edu>
George Neuner <·········@comcast.net> writes:

> When I worked in embedded systems, I had a manager who wouldn't allow
> Lisp to be used because he had been burned badly in previous projects
> by lousy programmers who wrote crap in Lisp that worked in spite of
> itself.
>
> He once said to me something to the effect of "it's better to have a C
> program that doesn't work at all than a Lisp program that half works.
> I can get somebody to fix the C program."

Well, you can get somebody to fix the Lisp one, too.  I think it is
more likely he was sure that *he* could fix the C program.

> I was never able to change his mind about it.

Look at the data points:  He sees lousy programmers write crappy code
and blames... the language?  He prefers non-working code to working
code? 

Who was this guy?  I can write *reams* of C code that doesn't work at
all.  I'll give him a good rate, too.
From: Paul Foley
Subject: Re: Potential speed
Date: 
Message-ID: <m2r7nnbg69.fsf@mycroft.actrix.gen.nz>
On 24 Oct 2004 09:45:51 -0700, Peter Scott wrote:

> ······@news.dtpq.com (Christopher C. Stacy) wrote in message news:<·············@news.dtpq.com>...
>> ·········@chase3000.com (Peter Scott) writes:
>> > Lisp is generally a good deal slower than C unless you do heavy optimizations
>> 
>> It is?

> That depends on how you define "a good deal". I used "a good deal
> slower" to mean "if you transform a C program into a Lisp one, chances
> are that the Lisp one will be slower". I heard about 30% slower on

Maybe.  If you transform a Lisp program into C, chances are the C one
will be slower, too.

> average from Duane Rettig. Lisp is pretty fast, and if you put a
> little effort into the critical sections of your code, it can be fast
> enough for most things. However, the C language simply seems closer to
> the machine,

Depends on the machine...do you have a PDP-11?

-- 
Malum est consilium quod mutari non potest             -- Publilius Syrus

(setq reply-to
  (concatenate 'string "Paul Foley " "<mycroft" '(··@) "actrix.gen.nz>"))
From: Kaz Kylheku
Subject: Re: Potential speed
Date: 
Message-ID: <cf333042.0410271153.2a5c44a8@posting.google.com>
·········@chase3000.com (Peter Scott) wrote in message news:<····························@posting.google.com>...
> Aside from being simpler and easier to understand, this is easier to
> optimize. If you convert the C code to this:
> 
> double arr[100]; //<- Values set previously
> double sum, s1, s2, s3, s4;
> sum = s1 = s2 = s3 = s4 = 0.0;
> for (int i = 0; i < 100; i += 4) {
>    s1 += arr[i];
>    s2 += arr[i + 1];
>    s3 += arr[i + 2];
>    s4 += arr[i + 3];
> }
> sum = s1 + s2 + s3 + s4; 

If you ``unroll'' the loop like this, you will change the value that
it computes, so it's an invalid optimization.

Since you are dealing with floating-point values, the arithmetic does
not obey the associative law.

In general, (A + B) + C may not be the same as A + (B + C).

The unrolling would have to be done with this body:

  double s1 = arr[i];
  double s2 = arr[i + 1];
  double s3 = arr[i + 2];
  double s4 = arr[i + 3];
  sum = sum + s1 + s2 + s3 + s4;

Note that I didn't say ``sum += s1 + s2 ...'', because that rearranges
the calculation so that the sn's are added first, and then added to
the sum.

Of course, here we are only parallelizing the accesses to the array.

This is the type of thing that the compiler can do for us with its
loop unrolling transformations.

> Then a C compiler is more likely to optimize it by making the
> additions proceed in parallel, giving a big speedup on CPUs that
> support SIMD operations. But it's ugly, and not the sort of thing you
> want to think about.

There is no way to parallelize this by reordering the discrete
additions without affecting the mathematics.

Perhaps it could be done with special hardware that does composed
additions, so that it can compute sum + s1 + s2 + s3 + s4 faster than
four independent additions by some kind of pipelining or other trick.
E.g. maybe a dependent addition could begin with a partial operand
before the previous one has produced the complete operand.

Adding a vector of floating point numbers is tricky business! Suppose
that the very first number is large, and the subsequent ones are tiny.
If you do it beginning to end, it could be that adding the little ones
to the accumulator does not change its value! But if you do it in the
reverse order, then the small ones will gather enough magnitude that
when they are finally added to the large number at the front, the new
value will be distinguishable from that biggie.

In some applications, you may have to sort the numbers first. For
instance, add all the positives, from smallest magnitude to largest.
Then add all the negatives in the same way. Finally, combine the
negative and positive partial results.
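
The order sensitivity is easy to demonstrate in Lisp with
single-floats (exact printed values depend on the implementation, but
the gist should hold):

  ;; Big number first: each 1.0 is below half a unit in the last
  ;; place of 1.0e8, so every addition rounds back to 1.0e8.
  (reduce #'+ (cons 1.0e8 (make-list 100 :initial-element 1.0)))
  => 1.0e8

  ;; Small numbers first: they accumulate to 100.0 before meeting
  ;; the big one, and the total lands near 1.000001e8.
  (+ (reduce #'+ (make-list 100 :initial-element 1.0)) 1.0e8)
  => ~1.000001e8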