Hello Lisp meisters,
I was reading a blog post about the performance of implementations of a
very simple Pythagorean-triplet algorithm in different languages:
http://tempe.st/2007/05/erlang-ruby-and-php-battle-it-out/
And I wondered what I could do with Common Lisp (SBCL 0.9.17). I have
tried two versions:
(defconstant +max+ 5000
  "The constant value up to which the main loop will run")

(defun pythagorean-triplet ()
  (let ((a 0) (i 0))
    (loop for c from 2 to +max+
          do (loop for b from 1 below c
                   do (progn
                        (setf a (sqrt (- (* c c) (* b b))))
                        (when (= (floor a) a) (incf i)))))
    (print i)))
(defun pythagorean-triplet-optimized-for-speed ()
  (declare (optimize (speed 3) (debug 0) (safety 0))
           (type single-float a)
           (type fixnum i))
  (let ((a 0) (i 0))
    (loop for c from 2 to +max+
          do (loop for b from 1 below c
                   do (progn
                        (setf a (sqrt (- (* c c) (* b b))))
                        (when (= (ffloor a) a) (incf i)))))
    (print i)))
Then I timed them:
CL-USER> (time (pythagorean-triplet))
13413
Evaluation took:
5.015 seconds of real time
3.87941 seconds of user run time
0.035995 seconds of system run time
[Run times include 0.25 seconds GC run time.]
0 calls to %EVAL
0 page faults and
299,940,200 bytes consed.
CL-USER> (time (pythagorean-triplet-optimized-for-speed))
13413
Evaluation took:
1.571 seconds of real time
1.433782 seconds of user run time
0.002999 seconds of system run time
[Run times include 0.151 seconds GC run time.]
0 calls to %EVAL
0 page faults and
199,960,776 bytes consed.
I wonder if the pythagorean-triplet-optimized-for-speed function can be
made faster by tweaking the compilation parameters (without touching
the structure of the loop).
Maybe some of you will try and show me (or maybe SBCL > 1 versions got
faster, I don't know).
The machine I ran this on was a Debian box with the following
specifications:
debian:/home/fz# uname -a
Linux debian 2.6.11-1-686 #1 Mon Jun 20 22:00:38 MDT 2005 i686 GNU/Linux
··@debian:~$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Celeron(R) CPU 2.40GHz
stepping : 1
cpu MHz : 2392.778
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pni
monitor ds_cpl cid xtpr
bogomips : 4718.59
Cheers
--
Emre Sevinc
> I wonder if the pythagorean-triplet-optimized-for-speed function can be
> made faster by tweaking the compilation parameters (without touching
> the structure of the loop).
>
> Maybe some of you will try and show me (or maybe SBCL > 1 versions got
> faster, I don't know).
>
You need to put more (the fixnum _)s in there to guarantee that the
numbers won't overflow. By default, Common Lisp has to assume that the
sum/product of two fixnums is an integer, not another fixnum.
(defun pythagorean-triplet-re-optimized-for-speed ()
  (declare (optimize (speed 3) (debug 0) (safety 0)))
  (let ((a 0.0) (i 0))
    (declare (type single-float a) (type fixnum i)) ; Moved to avoid warnings
    (loop for c from 2 to (the fixnum +max+)
          do (loop for b from 1 below c
                   do (progn
                        (setf a (sqrt (the fixnum (- (the fixnum (* c c))
                                                     (the fixnum (* b b))))))
                        ;; Strangely this seemed to help:
                        (when (= (the single-float (ffloor a)) a)
                          (incf i)))))
    (print i)))
CL-USER> (time (pythagorean-triplet-re-optimized-for-speed))
13413
Evaluation took:
1.366 seconds of real time
1.336084 seconds of user run time
0.008001 seconds of system run time
[Run times include 0.116 seconds GC run time.]
0 calls to %EVAL
0 page faults and
199,961,352 bytes consed.
13413
CL-USER> (time (pythagorean-triplet-optimized-for-speed))
13413
Evaluation took:
1.66 seconds of real time
1.624102 seconds of user run time
0.040002 seconds of system run time
[Run times include 0.124 seconds GC run time.]
0 calls to %EVAL
0 page faults and
199,957,592 bytes consed.
13413
Chris Russell <·····················@gmail.com> wrote:
[...]
>
> You need to put more (the fixnum _)s in there to guarantee that the
> numbers won't overflow. By default common lisp should assume that the
> sum/product of two fixnums is an integer not another fixnum.
>
> (defun pythagorean-triplet-re-optimized-for-speed()
> (declare (optimize (speed 3) (debug 0) (safety 0)))
> (let ((a 0.0) (i 0))
> (declare (type single-float a) (type fixnum i)) ;Moved to avoid
> warnings
> (loop for c from 2 to (the fixnum +max+) do
> (loop for b from 1 below c
> do (progn
> (setf a (sqrt (the fixnum (- (the fixnum (* c c))
> (the fixnum (* b b))))))
> (when (= (the single-float(ffloor a)) a) (incf
> i))))) ;;Strangely this seemed to help
> (print i)))
>
> CL-USER> (time (pythagorean-triplet-re-optimized-for-speed))
>
> 13413
> Evaluation took:
> 1.366 seconds of real time
> 1.336084 seconds of user run time
> 0.008001 seconds of system run time
> [Run times include 0.116 seconds GC run time.]
> 0 calls to %EVAL
> 0 page faults and
> 199,961,352 bytes consed.
> 13413
> CL-USER> (time (pythagorean-triplet-optimized-for-speed))
>
> 13413
> Evaluation took:
> 1.66 seconds of real time
> 1.624102 seconds of user run time
> 0.040002 seconds of system run time
> [Run times include 0.124 seconds GC run time.]
> 0 calls to %EVAL
> 0 page faults and
> 199,957,592 bytes consed.
> 13413
Why is this consing so much?
> Why is this consing so much?
I don't know. I assume that these single-floats must be passed by
reference and that this is where it's coming from.
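A rough accounting supports that guess (a sketch; the byte figure is from
the TIME output above, and the 8-byte box size is an assumption for
32-bit SBCL, where single-floats are heap-allocated rather than immediate):

```lisp
;; The inner loop runs once per (b, c) pair:
(loop for c from 2 to 5000 sum (1- c))   ; => 12497500
;; The re-optimized version consed 199,960,776 bytes:
(/ 199960776.0 12497500)                 ; => ~16.0 bytes per iteration
;; i.e. about two freshly boxed single-floats per iteration (the SQRT
;; result and the FFLOOR result), assuming 8-byte boxes on 32-bit SBCL.
```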
More drastically, these algorithms are flawed:

  (- (+ (expt 4576 2) (expt 2015 2))
     (expt 5000 2))

gives 1, but (4576, 2015, 5000) is considered an acceptable solution by
the first algorithm due to the rounding error.
I only noticed this when I tried:
(defun pythagorean-triplet ()
  (let ((c2-b2 0) (a 0) (i 0))
    (loop for c from 2 to 5000
          do (loop for b from 1 below c
                   do (progn
                        (setf c2-b2 (- (* c c) (* b b))
                              a (isqrt c2-b2))
                        (when (= c2-b2 (* a a))
                          (incf i)))))
    (print i)))
Which has an output of
11362
Fortunately, passing double-floats instead of fixnums to sqrt gives
you the right answer, but triples the amount of consing again.
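The rounding error is easy to see at the REPL (a sketch; recall that in
Common Lisp, SQRT of an integer returns a single-float by default):

```lisp
;; 5000^2 - 2015^2 = 20939775 = 4576^2 - 1, so (4576, 2015, 5000) is
;; NOT a Pythagorean triple...
(- (* 5000 5000) (* 2015 2015))   ; => 20939775
(* 4576 4576)                     ; => 20939776
;; ...but single-float SQRT rounds the square root up to a whole number:
(sqrt 20939775)                   ; => 4576.0
;; A double-float keeps enough precision to avoid the false positive:
(sqrt (float 20939775 1d0))       ; => 4575.99989... (not integral)
```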
Chris Russell wrote:
> (defun pythagorean-triplet()
> (let ((c2-b2 0) (a 0) (i 0))
> (loop for c from 2 to 5000 do
> (loop for b from 1 below c do
> (progn
> (setf c2-b2 (- (* c c) (* b b))
> a (isqrt c2-b2))
> (when (= c2-b2 (* a a))
> (incf i)))))
> (print i)))
I think this is the only Lisp posted so far that gives the same answer as
the original program. As for timings, I get:
OCaml: 0.341s
C: 0.384s
Lisp: 9.233s
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
Jon Harrop <···@ffconsultancy.com> writes:
> Chris Russell wrote:
>> (defun pythagorean-triplet()
>> (let ((c2-b2 0) (a 0) (i 0))
>> (loop for c from 2 to 5000 do
>> (loop for b from 1 below c do
>> (progn
>> (setf c2-b2 (- (* c c) (* b b))
>> a (isqrt c2-b2))
>> (when (= c2-b2 (* a a))
>> (incf i)))))
>> (print i)))
>
> I think this is the only Lisp posted so far that gives the same answer as
> the original program. As far as timings, I get:
>
> OCaml: 0.341s
> C: 0.384s
> Lisp: 9.233s
For some reason ISQRT seems to be very slow. Replacing
(isqrt c2-b2)
with
(truncate (sqrt c2-b2))
brings down the runtime on my computer from 9.000s to 0.734s, while
still yielding the same result (11362, right?). Here it is in full:
--8<---------------cut here---------------start------------->8---
(defun pythagorean-triplet ()
  (let ((c2-b2 0) (a 0) (i 0))
    (loop for c from 2 to 5000
          do (loop for b from 1 below c
                   do (progn
                        (setf c2-b2 (- (* c c) (* b b))
                              ;; a (isqrt c2-b2)
                              a (truncate (sqrt c2-b2)))
                        (when (= c2-b2 (* a a))
                          (incf i)))))
    (print i)))
--8<---------------cut here---------------end--------------->8---
As has been mentioned, this solution allocates a lot of memory,
ca. 100 MB using SBCL 1.0.5. If that could be eliminated, I assume
it'd be as fast as the C or OCaml versions. Unfortunately I don't
know enough about CL optimization to do this. None of the type
declarations I tried had any effect.
--
Wolfram Fenske
A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
On 20 May, 00:33, Wolfram Fenske <····@gmx.net> wrote:
> Jon Harrop <····@ffconsultancy.com> writes:
> > Chris Russell wrote:
> >> (defun pythagorean-triplet()
> >> (let ((c2-b2 0) (a 0) (i 0))
> >> (loop for c from 2 to 5000 do
> >> (loop for b from 1 below c do
> >> (progn
> >> (setf c2-b2 (- (* c c) (* b b))
> >> a (isqrt c2-b2))
> >> (when (= c2-b2 (* a a))
> >> (incf i)))))
> >> (print i)))
>
> > I think this is the only Lisp posted so far that gives the same answer as
> > the original program. As far as timings, I get:
>
> > OCaml: 0.341s
> > C: 0.384s
> > Lisp: 9.233s
>
> For some reason ISQRT seems to be very slow. Replacing
>
> (isqrt c2-b2)
>
> with
>
> (truncate (sqrt c2-b2))
>
> brings down the runtime on my computer from 9.000s to 0.734s, while
> still yielding the same result (11362, right?). Here it is in full:
Cool. :)
Wolfram Fenske wrote:
> For some reason ISQRT seems to be very slow. Replacing
>
> (isqrt c2-b2)
>
> with
>
> (truncate (sqrt c2-b2))
>
> brings down the runtime on my computer from 9.000s to 0.734s, while
> still yielding the same result (11362, right?).
Yes.
> As has been mentioned, this solution allocates a lot of memory,
> ca. 100 MB using SBCL 1.0.5. If that could be eliminated, I assume
> it'd be as fast as the C or OCaml versions. Unfortunately I don't
> know enough about CL optimization to do this. None of the type
> declarations I tried had any effect.
This is much faster. Now I get:
C: 0.299s
OCaml: 0.341s
Lisp: 0.454s
with better compiler options for the C code (-O3 -ffast-math).
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
> I think this is the only Lisp posted so far that gives the same answer as
> the original program. As far as timings, I get:
Well it's the only program guaranteed to keep giving the right answer
as the numbers get bigger and bigger.
This program works over the same range as the C or OCaml versions:
(defun pythagorean-triple2 ()
  (declare (optimize (speed 3) (debug 0) (safety 0)))
  (let ((i 0))
    (declare (type fixnum i))
    (loop for c from 2d0 to 5000d0
          do (loop for b from 1d0 below c
                   do (let* ((c2-b2 (- (* c c) (* b b)))
                             (a (ffloor (the double-float (sqrt c2-b2)))))
                        (when (= c2-b2 (* a a))
                          (incf i)))))
    (print i)))
But time-wise it's slower by a factor of 5 or so, probably due to the
boxing of the floats.
Evaluation took:
1.792 seconds of real time
Does anyone know how to turn this off?
Chris Russell wrote:
>> I think this is the only Lisp posted so far that gives the same answer as
>> the original program. As far as timings, I get:
>
> Well it's the only program guaranteed to keep giving the right answer
> as the numbers get bigger and bigger.
When that happens, this program will be taking almost 500 years to run. :-)
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
None of the proposed solutions is guaranteed to give the correct
solution since they all resort to using imprecise floats. I am
beginning to hate the warm fuzzies. Try this instead.
(defparameter +max+ 5000)

(defparameter *perfect-square-lookup*
  (let ((lookup (make-hash-table)))
    (loop for i from 0 to (1+ +max+)
          for i2 = (* i i)
          do (setf (gethash i2 lookup) i2))
    lookup))

(defun perfect-square (n) (gethash n *perfect-square-lookup*))

(defun pythagorean-triplet (&aux (n 0))
  (loop for c from 2 to +max+
        do (loop for b from 1 below c
                 when (perfect-square (- (* c c) (* b b)))
                   do (incf n)))
  n)
CL-USER> (time (pythagorean-triplet))
Evaluation took:
1.471 seconds of real time
1.469061 seconds of user run time
9.93e-4 seconds of system run time
0 calls to %EVAL
0 page faults and
0 bytes consed.
11362
CL-USER>
Wade
On May 20, 4:35 pm, Wade Humeniuk <········@telus.net.no.spam> wrote:
> None of the proposed solutions is guaranteed to give the correct
> solution since they all resort to using imprecise floats. I am
> beginning to hate the warm fuzzies. Try this instead.
The HyperSpec is a bit ambiguous about whether ISQRT is computed
exactly or is allowed to round through floats.
It says:
isqrt returns the greatest integer less than or equal to the *exact*
positive square root of natural.
But there is also a note at the bottom that reads:
(isqrt x) == (values (floor (sqrt x)))
but it is potentially more efficient.
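That identity in the footnote only holds when SQRT is exact; with the
default single-float SQRT the two forms can disagree. A sketch, using
c = 5000, b = 2015 from the earlier counter-example (so the argument
is 5000^2 - 2015^2 = 20939775):

```lisp
;; ISQRT works in exact integer arithmetic:
(isqrt 20939775)           ; => 4575 (exact, since 4576^2 = 20939776)
;; FLOOR of a single-float SQRT does not:
(floor (sqrt 20939775))    ; => 4576 (SQRT rounded up to 4576.0)
```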
So I looked at the SBCL source code to see what was actually going on
under the hood, and this is the relevant code.
;;; From discussion on comp.lang.lisp and Akira Kurihara.
(defun isqrt (n)
  #!+sb-doc
  "Return the root of the nearest integer less than n which is a perfect
square."
  (declare (type unsigned-byte n) (values unsigned-byte))
  ;; Theoretically (> n 7), i.e., n-len-quarter > 0.
  (if (and (fixnump n) (<= n 24))
      (cond ((> n 15) 4)
            ((> n 8) 3)
            ((> n 3) 2)
            ((> n 0) 1)
            (t 0))
      (let* ((n-len-quarter (ash (integer-length n) -2))
             (n-half (ash n (- (ash n-len-quarter 1))))
             (n-half-isqrt (isqrt n-half))
             (init-value (ash (1+ n-half-isqrt) n-len-quarter)))
        (loop
          (let ((iterated-value
                  (ash (+ init-value (truncate n init-value)) -1)))
            (unless (< iterated-value init-value)
              (return init-value))
            (setq init-value iterated-value))))))
I'd say it computes the exact integer square root, so it is also
appropriate for calculating the exact solution.
Wade Humeniuk <········@telus.net.no.spam> writes:
> None of the proposed solutions is guaranteed to give the correct
> solution since they all resort to using imprecise floats. I am
> beginning to hate the warm fuzzies. Try this instead.
From the "memory is cheap" front comes this modification that replaces
the hash table with a simple vector:
--8<---------------cut here---------------start------------->8---
(defconstant +max+ 5000)

(defun make-lookup (size)
  (let* ((lookup (make-array (* size size) :initial-element nil)))
    (dotimes (i size lookup)
      (setf (svref lookup (* i i)) t))))

(defun pythagorean-triplet ()
  (let* ((n 0)
         (lookup (make-lookup (1+ +max+))))
    (loop for c from 2 to +max+
          do (loop for b from 1 below c
                   when (svref lookup (- (* c c) (* b b)))
                     do (incf n)))
    (print n)))
--8<---------------cut here---------------end--------------->8---
This sped things up by a factor of 2 using SBCL 1.0.5. The hash table
version takes 2.173s on my machine, the vector version 0.974s.
--
Wolfram Fenske
A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
Wade Humeniuk <········@telus.net.no.spam> writes:
> Wolfram Fenske <·····@gmx.net> writes:
>
>> Wade Humeniuk <········@telus.net.no.spam> writes:
>>
>>> None of the proposed solutions is guaranteed to give the correct
>>> solution since they all resort to using imprecise floats. I am
>>> beginning to hate the warm fuzzies. Try this instead.
>>
>> From the "memory is cheap" front comes this modification that replaces
>> the hash table with a simple vector:
[...]
> In the same vein,...
>
> (defconstant +max+ 5000)
>
> (defun make-lookup (size)
> (let* ((lookup (make-array (* size size) :element-type 'bit :initial-element 0)))
> (dotimes (i size lookup)
> (setf (sbit lookup (* i i)) 1))))
[...]
> CL-USER> (time (pythagorean-triplet))
>
> 11362
> Evaluation took:
> 0.07 seconds of real time
> 0.069843 seconds of user run time
> 2.73e-4 seconds of system run time
> 0 calls to %EVAL
> 0 page faults and
> 3,126,264 bytes consed.
> 11362
What kind of machine have you got? I'm using SBCL 1.0.5 on an AMD64
2800+ or something; the disassemblies look identical, but for me
(time (pythagorean-triplet))
says:
--8<---------------cut here---------------start------------->8---
Evaluation took:
0.267 seconds of real time
0.25461 seconds of user run time
7.5e-5 seconds of system run time
0 calls to %EVAL
0 page faults and
3,126,264 bytes consed.
--8<---------------cut here---------------end--------------->8---
... which, by the way, is still *awesome* -- the best result so far,
even faster than the C version [1] I have.
Here's my slightly slower version with `+max+' as a parameter:
--8<---------------cut here---------------start------------->8---
(defun pythagorean-triplet (n)
  (declare (optimize speed (safety 0))
           (type fixnum n))
  (do ((c 2 (1+ c))
       (count 0)
       (lookup (make-lookup (1+ n))))
      ((> c n) count)
    (declare (type fixnum c count))
    (do ((b 1 (1+ b)))
        ((>= b c))
      (declare (type fixnum b))
      (unless (zerop (sbit lookup (the fixnum
                                       (- (the fixnum (* c c))
                                          (the fixnum (* b b))))))
        (incf count)))))
--8<---------------cut here---------------end--------------->8---
Timings:
--8<---------------cut here---------------start------------->8---
* (time (pythagorean-triplet 5000))
Evaluation took:
0.282 seconds of real time
0.270825 seconds of user run time
7.4e-5 seconds of system run time
0 calls to %EVAL
0 page faults and
3,126,264 bytes consed.
11362
--8<---------------cut here---------------end--------------->8---
Also faster than the C version (0.356 s) on my machine.
Footnotes:
[1] My C version doesn't use bit fields, though. It's a direct
translation of the original algorithm, using `sqrt' etc.
--
Wolfram Fenske
A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
Wolfram Fenske <·····@gmx.net> writes:
>
> What kind of machine have you got? I'm using SBCL 1.0.5 on an AMD64
> 2800+ or something, the disassemblies look identical, but for me
>
> (time (pythagorean-triplet))
>
> says:
>
> --8<---------------cut here---------------start------------->8---
> Evaluation took:
> 0.267 seconds of real time
> 0.25461 seconds of user run time
> 7.5e-5 seconds of system run time
> 0 calls to %EVAL
> 0 page faults and
> 3,126,264 bytes consed.
> --8<---------------cut here---------------end--------------->8---
>
> ... which, by the way, is still *awesome* -- the best result so far,
> even faster than the C version [1] I have.
>
I have a new MacBook Pro. It's a Centrino Dual Core 2.16GHz
with a 4MB shared L2 cache. The cache is probably the thing
that makes the difference: the bit-vector can fit in it.
SBCL 1.0.4.
Wade
Wade Humeniuk <········@telus.net.no.spam> writes:
> Wolfram Fenske <·····@gmx.net> writes:
>
>>
>> What kind of machine have you got? I'm using SBCL 1.0.5 on an AMD64
>> 2800+ or something, the disassemblies look identical, but for me
>>
>> (time (pythagorean-triplet))
>>
>> says:
>>
>> --8<---------------cut here---------------start------------->8---
>> Evaluation took:
>> 0.267 seconds of real time
>> 0.25461 seconds of user run time
>> 7.5e-5 seconds of system run time
>> 0 calls to %EVAL
>> 0 page faults and
>> 3,126,264 bytes consed.
>> --8<---------------cut here---------------end--------------->8---
>>
>> ... which, by the way, is still *awesome* -- the best result so far,
>> even faster than the C version [1] I have.
>>
>
> I have a new MacBook Pro. It's a Centrino Dual Core 2.16GHz
> with a 4MB shared L2 cache. The cache is probably the thing
> that makes the difference: the bit-vector can fit in it.
That has to be it. I knew that caches are a lot faster than main
memory, but still, this is the most impressive demonstration I've seen
so far. Whew! :-)
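The cache arithmetic checks out (a quick sketch; the byte figure is the
one reported by TIME earlier in the thread):

```lisp
;; The lookup table is one bit per integer up to 5001^2:
(* 5001 5001)           ; => 25010001 bits
(/ (* 5001 5001) 8.0)   ; => ~3126250 bytes
;; which matches the "3,126,264 bytes consed" in the TIME output and
;; fits comfortably inside a 4 MB L2 cache.
```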
--
Wolfram Fenske
A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
Wolfram Fenske <·····@gmx.net> writes:
>
> That has to be it. I knew that caches are a lot faster than main
> memory, but still, this is the most impressive demonstration I've seen
> so far. Whew! :-)
Here is a link to the processor blurb.
Intel Core™2 Duo Processor T7400 2.16GHz w/ 4MB Cache
http://www.memoryexpress.com/index.php?PageTag=&page=file&memx_menu=EmbedProductDetail.php&DisplayProductID=9503&SID=
The L2 cache on the Core 2 runs at the same speed as the processor
clock; the RAM is 667MHz, the processor 2.16GHz.
This caching effect can be very big. It explains why the UltraSPARCs
in the Sun machine we have at work are so fast (8MB caches).
Wade
I tried a modified version of your C++ code on OS X.
The C++ version is slower than the SBCL version (probably because of
caching). The Lisp version is 40% faster.
wade-humeniuks-computer:~/lisp wade$ gcc -O3 -c pyth.cc
wade-humeniuks-computer:~/lisp wade$ g++ -lstdc++ -o pyth pyth.o
wade-humeniuks-computer:~/lisp wade$ time ./pyth
11362
11362
11362
11362
11362
11362
11362
11362
11362
11362
time: 1.0300
real 0m1.052s
user 0m0.998s
sys 0m0.042s
wade-humeniuks-computer:~/lisp wade$
CL-USER> (time (loop repeat 10 do (pythagorean-triplet)))
11362
11362
11362
11362
11362
11362
11362
11362
11362
11362
Evaluation took:
0.699 seconds of real time
0.691419 seconds of user run time
0.00562 seconds of system run time
[Run times include 0.015 seconds GC run time.]
0 calls to %EVAL
0 page faults and
31,295,360 bytes consed.
NIL
CL-USER>
Wade Humeniuk <········@telus.net.no.spam> writes:
> I tried a modified version of your C++ code on OS X.
>
> The C++ version is slower than the SBCL version (probably because of
> caching). The Lisp version is 40% faster.
>
Forgot the code.
#include <time.h>
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <vector>

using namespace std;

int main(int argc, char *argv[]) {
  const int n = argc == 2 ? atoi(argv[1]) : 5000;
  clock_t start = clock();
  int i = 0;
  for (int loop = 0; loop < 10; loop++) {
    i = 0;
    vector<bool> lookup(n*n);
    for (int j = 0; j < n; j++) lookup[j*j] = true;
    for (int c = 2; c <= n; c++) {
      for (int b = 1; b < c; b++) {
        if (lookup[c*c - b*b]) i++;
      }
    }
    printf("%i\n", i);
  }
  clock_t end = clock();
  float time = float(end-start) / float(CLOCKS_PER_SEC);
  printf("time: %0.04f", time);
  return 0;
}
> wade-humeniuks-computer:~/lisp wade$ gcc -O3 -c pyth.cc
> wade-humeniuks-computer:~/lisp wade$ g++ -lstdc++ -o pyth pyth.o
> wade-humeniuks-computer:~/lisp wade$ time ./pyth
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> time: 1.0300
> real 0m1.052s
> user 0m0.998s
> sys 0m0.042s
> wade-humeniuks-computer:~/lisp wade$
>
> CL-USER> (time (loop repeat 10 do (pythagorean-triplet)))
>
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> Evaluation took:
> 0.699 seconds of real time
> 0.691419 seconds of user run time
> 0.00562 seconds of system run time
> [Run times include 0.015 seconds GC run time.]
> 0 calls to %EVAL
> 0 page faults and
> 31,295,360 bytes consed.
> NIL
> CL-USER>
Wade Humeniuk wrote:
> I tried a modified version of your C++ code on OS X.
>
> The C++ version is slower than the SBCL version (probably because of
> caching). The Lisp version is 40% faster.
>
> wade-humeniuks-computer:~/lisp wade$ gcc -O3 -c pyth.cc
> wade-humeniuks-computer:~/lisp wade$ g++ -lstdc++ -o pyth pyth.o
> wade-humeniuks-computer:~/lisp wade$ time ./pyth
Yes, usually gcc is slower than Microsoft Visual C++ :-)
--
Frank Buss, ··@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
On May 20, 9:11 pm, Wade Humeniuk <········@telus.net.no.spam> wrote:
> (defconstant +max+ 5000)
>
> (defun make-lookup (size)
> (let* ((lookup (make-array (* size size) :element-type 'bit :initial-element 0)))
> (dotimes (i size lookup)
> (setf (sbit lookup (* i i)) 1))))
>
> (defun pythagorean-triplet ()
> (declare (optimize (speed 3) (safety 0) (debug 0)))
> (let* ((n 0)
> (lookup (make-lookup (1+ +max+))))
> (declare (fixnum n))
> (loop for c from 2 to +max+ do
> (loop for b from 1 below c
> unless (zerop (sbit lookup (- (* c c) (* b b))))
> do (incf n)))
> (print n)))
I like it. I made a small tweak to take a parameter (I don't like
constants for things like this):
(defun count-triples-slow (hypotenuse-max)
  (declare (type (integer 5 #.(isqrt array-dimension-limit))
                 hypotenuse-max)
           (optimize (speed 3) (safety 0) (debug 0)))
  (let ((lookup (make-lookup (1+ hypotenuse-max))))
    (loop for c from 2 to hypotenuse-max
          sum (loop for b from 1 below c
                    sum (sbit lookup (- (* c c) (* b b)))))))
CL-USER> (time (count-triples-slow 5000))
Evaluation took:
0.474 seconds of real time
0.476029 seconds of user run time
0.0 seconds of system run time
0 page faults and
3,126,264 bytes consed.
11362
Note, however, that this algorithm finds all ordered triples for each
hypotenuse. For example, it counts both (3, 4, 5) and (4, 3, 5). So
we only have to search to the point where b^2 is half of c^2 and then
double the count.
(defun count-triples-slow (hypotenuse-max)
  (declare (type (integer 5 #.(isqrt array-dimension-limit))
                 hypotenuse-max)
           (optimize (speed 3) (safety 0) (debug 0)))
  (let ((lookup (make-lookup (1+ hypotenuse-max))))
    (* 2 (loop for c from 2 to hypotenuse-max
               sum (loop for b from 1 to (isqrt (truncate (* c c) 2))
                         sum (sbit lookup (- (* c c) (* b b))))))))
CL-USER> (time (count-triples-slow 5000))
Evaluation took:
0.226 seconds of real time
0.220013 seconds of user run time
0.008001 seconds of system run time
0 page faults and
3,126,264 bytes consed.
11362
I then took a look at the macroexpansion of the loop code and noted
that there were some inefficiencies in there. I pulled the two loops
apart into separate functions and made them into tail-recursive
algorithms. I also made some pretty restrictive type definitions.
(The lookup table must fit into an array, so the maximum hypotenuse
must be the square root of array-dimension-limit. Also, the number of
candidate triples below a given hypotenuse length is a triangular
number, so the type for the count of pythagorean triples reflects the
closed formula for triangle number n, (n*(n+1)/2).)
(defun make-lookup (size)
  (let ((lookup (make-array (* size size) :element-type 'bit
                            :initial-element 0)))
    (dotimes (i size lookup)
      (setf (sbit lookup (* i i)) 1))))

(deftype triple-count ()
  `(integer 0 ,(truncate (- array-dimension-limit
                            (isqrt array-dimension-limit))
                         2)))

(defun triples-with-hypotenuse (hypotenuse lookup)
  (declare (type (integer 5 #.(isqrt array-dimension-limit))
                 hypotenuse)
           (type simple-bit-vector lookup)
           (optimize (speed 3)))
  (let ((c2 (* hypotenuse hypotenuse)))
    (labels ((doit (b sum)
               (declare (type (integer 0 #.(isqrt (truncate array-dimension-limit 2)))
                              b)
                        (type triple-count sum))
               (if (zerop b)
                   sum
                   (doit (1- b) (+ sum (sbit lookup (- c2 (* b b))))))))
      (doit (isqrt (truncate c2 2)) 0))))

(defun count-triples-slow (hypotenuse-max)
  (declare (type (integer 5 #.(isqrt array-dimension-limit))
                 hypotenuse-max)
           (optimize (speed 3)))
  (let ((lookup (make-lookup (1+ hypotenuse-max))))
    (labels ((doit (c sum)
               (declare (type (integer 0 #.(isqrt array-dimension-limit)) c)
                        (type triple-count sum))
               (if (zerop c)
                   sum
                   (doit (1- c)
                         (+ sum (the triple-count
                                     (triples-with-hypotenuse c lookup)))))))
      (* 2 (doit hypotenuse-max 0)))))
CL-USER> (time (count-triples-slow 5000))
Evaluation took:
0.184 seconds of real time
0.184011 seconds of user run time
0.0 seconds of system run time
0 page faults and
3,126,264 bytes consed.
11362
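The halve-and-double trick above relies on there being no solutions with
a = b (c = b*sqrt(2) is never an integer), so each unordered pair is
counted exactly once below the midpoint. A small self-contained check
using exact ISQRT (a sketch; c only runs to 100 here to keep it quick):

```lisp
;; Count ordered pairs (a, b) with a^2 + b^2 = c^2 two ways: the full
;; scan over b < c, and the half scan (b^2 <= c^2/2) doubled.
(flet ((squarep (n) (= n (* (isqrt n) (isqrt n)))))
  (let ((full 0) (half 0))
    (loop for c from 2 to 100
          do (loop for b from 1 below c
                   when (squarep (- (* c c) (* b b))) do (incf full)))
    (loop for c from 2 to 100
          do (loop for b from 1 to (isqrt (truncate (* c c) 2))
                   when (squarep (- (* c c) (* b b))) do (incf half)))
    (list full (* 2 half))))   ; the two counts agree
```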
Wade Humeniuk wrote:
> None of the proposed solutions is guaranteed to give the correct
> solution since they all resort to using imprecise floats.
You mean if the program is run for years it will eventually suffer from
imprecision? I think we have more important things to worry about...
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
j.oke wrote:
> If you can live with compromises (in order to give you some
> superficial(!) advantages), you might consider to switch to some other
> language...
What if the superficial disadvantage is "Lisp lost me a $1M trade because it
hung whilst garbage collecting a big int I never needed" or "nobody is
buying my computer game because Lisp has generalised the player coordinates
to arbitrary-precision numbers and crippled the frame rate"?
The fact is, Lisp is slow and it puts people off.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
Jon Harrop <···@ffconsultancy.com> writes:
> j.oke wrote:
> > If you can live with compromises (in order to give you some
> > superficial(!) advantages), you might consider to switch to some other
> > language...
>
> What if the superficial disadvantage is "Lisp lost me a $1M trade because it
> hung whilst garbage collecting a big int I never needed"
Garbage collection is something done by the Sun JVM and the Microsoft
CLR. Why don't you worry about the billions they are losing, instead
of the paltry millions we're losing? Sounds like a real business
opportunity for you.
Or is it possible that this is just a made up problem and that
ordinary quality assurance is responsible for catching it? Do things
get through QA? Sure. But that doesn't make it the responsibility of
the tool.
Incidentally, there's also an unspoken presupposition in this post that
somehow if it had been done in some other language (I can't imagine
which you were thinking of), this wouldn't happen.
What would happen if I lost a million dollars because my statically
type-checked application nevertheless yielded a bad result?
(Oops. It turns out that some design errors still yield effects at
runtime.) I hear you saying "That would just be bad design." or
"This would be caught by QA, and too bad if someone doesn't QA their
application." As if somehow those reasons aren't equally true of
dynamic languages.
The difference is that dynamic language users tend to know they have
to test their programs and make them robust against bad data. Too
often in static languages, I've seen people assume that after all the
hassle of getting something to compile, they have an entitlement for
it to just work... and then it fails mysteriously at runtime because
they forget that it IS possible for statically checked programs to
have runtime errors.
I've seen plenty of statically typed applications crash mysteriously
on null errors or other bad-data errors (e.g., a list of homogeneously
typed foo elements was expected to be of known length, so was declared
[foo], when really it had a type limitation no one encoded, so it
slipped past the type checker but produced bad data something else
never expected because it thought the upstream facility was strongly
checking everything).
> or "nobody is buying my computer game because Lisp has generalised
> the player coordinates to arbitrary-precision numbers and crippled
> the frame rate"?
Wow, it got all the way through to release without needing to be tested
and ran on the first try, just a little slowly? Cool.
Does that happen a lot in ML? Someone writes an entire game application
and never tests it, then compiles it and runs it and it plays fine
AND fast on the first try?
Or is it possible that this is just a completely contrived example
not typical of anything?
> The fact is, Lisp is slow and it puts people off.
If you're so put off, why are you still posting here?
Kent M Pitman wrote:
> Incidentally, there's also an unspoken presupposition in this post that
> somehow if it had been done in some other language...
You've really given up on doing this in Lisp? I had expected to get a
competitive solution with only a few more days work... ;-)
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
(message (Hello 'Jon)
(you :wrote :on '(Sun, 20 May 2007 21:07:50 +0100))
(
??>> Incidentally, there's also an unspoken presupposition in this post
??>> that somehow if it had been done in some other language...
JH> You've really given up on doing this in Lisp? I had expected to get a
JH> competitive solution with only a few more days work... ;-)
why do you discard competitive solution written in Scheme by Kjetil?
and why do you need these benchmarks? there are already lots of them. e.g.
http://shootout.alioth.debian.org/debian/benchmark.php?test=all&lang=sbcl&lang2=ocaml
you can find SBCL there beats ocaml in 5 benchmarks, go better optimize
ocaml code..
(OTOH ocaml beats SBCL in 10 benchmarks, but very slightly, maybe it's just
because of 13x larger startup time of SBCL)
btw, you've mentioned GC time. binary-trees is a benchmark for GC made by
Hans Boehm. SBCL is 1.4 times faster than ocaml in it, and on par with C.
)
(With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
"I am everything you want and I am everything you need")
Alex Mizrahi wrote:
> why do you discard the competitive solution written in Scheme by Kjetil?
I haven't discarded it: I just haven't gotten around to looking at it yet
(Stalin doesn't run in 64-bit AFAIK).
> and why do you need these benchmarks?
We're just playing. :-)
> there are already lots of them. e.g.
>
>
http://shootout.alioth.debian.org/debian/benchmark.php?test=all&lang=sbcl&lang2=ocaml
>
> you can find SBCL there beats ocaml in 5 benchmarks, go better optimize
> ocaml code..
My machine gives very different results: OCaml is only slower on
binary-trees.
> (OTOH ocaml beats SBCL in 10 benchmarks, but very slightly, maybe it's
> just because of 13x larger startup time of SBCL)
>
> btw, you've mentioned GC time. binary-trees is a benchmark for GC made by
> Hans Boehm. SBCL is 1.4 times faster than ocaml in it, and on par with C.
The benchmark definition is poor: about two years ago I submitted several
progressively faster OCaml implementations that took the run time down
toward zero seconds, and they all got rejected.
The same optimizations can be applied to the Lisp, of course. So you aren't
really measuring anything.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
On May 20, 8:01 pm, Kent M Pitman <······@nhplace.com> wrote:
> Jon Harrop <····@ffconsultancy.com> writes:
>
> > or "nobody is buying my computer game because Lisp has generalised
> > the player coordinates to arbitrary-precision numbers and crippled
> > the frame rate"?
>
> Wow, it got all the way through to release without needing to be tested
> and ran on the first try, just a little slowly? Cool.
>
> Does that happen a lot in ML? Someone writes an entire game application
> and never tests it, then compiles it and runs it and it plays fine
> AND fast on the first try?
>
> Or is it possible that this is just a completely contrived example
> not typical of anything?
>
> > The fact is, Lisp is slow and it puts people off.
>
> If you're so put off, why are you still posting here?
If he planned to make a game, he would use a premade engine, either a
commercial or an open-source one. Making your own engine is better
left to people who have enough time, knowledge, and a good team to
help them. People with those talents don't spend time trolling in
forums; they build their engine. Where are we going to see an engine
built in OCaml or F#?
And Jon, you'd better take my advice and write some demos with
Irrlicht .NET to show us how good F# is. It'll give you some
necessary credentials to speak on this subject.
In article <························@ptn-nntp-reader02.plus.net>,
Jon Harrop <···@ffconsultancy.com> wrote:
> j.oke wrote:
> > If you can live with compromises (in order to give you some
> > superficial(!) advantages), you might consider to switch to some other
> > language...
>
> What if the superficial disadvantage is "Lisp lost me a $1M trade because it
> hung whilst garbage collecting a big int I never needed" or "nobody is
> buying my computer game because Lisp has generalised the player coordinates
> to arbitrary-precision numbers and crippled the frame rate"?
>
> The fact is, Lisp is slow and it puts people off.
___________________________
/| /| | |
||__|| | Please don't |
/ O O\__ feed |
/ \ the trolls |
/ \ \ |
/ _ \ \ ----------------------
/ |\____\ \ ||
/ | | | |\____/ ||
/ \|_|_|/ | __||
/ / \ |____| ||
/ | | /| | --|
| | |// |____ --|
* _ | |_|_|_| | \-/
*-- _--\ _ \ // |
/ _ \\ _ // | /
* / \_ /- | - | |
* ___ c_c_c_C/ \C_c_c_c____________
--
http://lispm.dyndns.org
On May 20, 7:46 pm, Rainer Joswig <······@lisp.de> wrote:
> In article <························@ptn-nntp-reader02.plus.net>,
> Jon Harrop <····@ffconsultancy.com> wrote:
>
> > j.oke wrote:
> > > If you can live with compromises (in order to give you some
> > > superficial(!) advantages), you might consider to switch to some other
> > > language...
>
> > What if the superficial disadvantage is "Lisp lost me a $1M trade because it
> > hung whilst garbage collecting a big int I never needed" or "nobody is
> > buying my computer game because Lisp has generalised the player coordinates
> > to arbitrary-precision numbers and crippled the frame rate"?
>
> > The fact is, Lisp is slow and it puts people off.
>
> [ASCII-art "Please don't feed the trolls" sign snipped]
>
> --http://lispm.dyndns.org
Sorry Rainer, but I really hate it when people talk trash about game
development; besides, I find this troll quite interesting.
fireblade wrote:
> On May 20, 7:46 pm, Rainer Joswig <······@lisp.de> wrote:
>> Jon Harrop <····@ffconsultancy.com> wrote:
>>> The fact is, Lisp is slow and it puts people off.
>> [ASCII-art "Please don't feed the trolls" sign snipped]
>>
>> --http://lispm.dyndns.org
>
> Sorry Rainer, but I really hate it when people talk trash about game
> development; besides, I find this troll quite interesting.
I don't. I find him very boring and upsetting, and I appreciate
when people call out trolls for what they are.
The fact is that Lisp is a dynamically typed language, and dynamic
type checking takes time. That's not an argument against Lisp
any more than dozens of other popular languages. Even less so,
in fact, because Lisp has optional type declarations.
According to Peter Norvig and Greg Sullivan, the fact is that
dynamic typing is more expressive than static typing. Statically
typed languages don't have that expressiveness, and some people
are "put off" by that.
The fact is that dynamic typing is better for RAD. Statically
typed languages tend to quibble over minor type issues, and
some people are "put off" by that.
The fact is that trolls are disruptive and unhelpful, and *many*
people are put off by that.
--
Dan
www.prairienet.org/~dsb/
On May 21, 2:37 pm, Dan Bensen <··········@cyberspace.net> wrote:
> fireblade wrote:
> > On May 20, 7:46 pm, Rainer Joswig <······@lisp.de> wrote:
> >> Jon Harrop <····@ffconsultancy.com> wrote:
> >>> The fact is, Lisp is slow and it puts people off.
> >> [ASCII-art "Please don't feed the trolls" sign snipped]
>
> >> --http://lispm.dyndns.org
>
> > Sorry Rainer, but I really hate it when people talk trash about game
> > development; besides, I find this troll quite interesting.
>
> I don't. I find him very boring and upsetting, and I appreciate
> when people call out trolls for what they are.
>
> The fact is that Lisp is a dynamically typed language, and dynamic
> type checking takes time. That's not an argument against Lisp
> any more than dozens of other popular languages. Even less so,
> in fact, because Lisp has optional type declarations.
>
> According to Peter Norvig and Greg Sullivan, the fact is that
> dynamic typing is more expressive than static typing. Statically
> typed languages don't have that expressiveness, and some people
> are "put off" by that.
>
> The fact is that dynamic typing is better for RAD. Statically
> typed languages tend to quibble over minor type issues, and
> some people are "put off" by that.
>
> The fact is that trolls are disruptive and unhelpful, and *many*
> people are put off by that.
>
Most people still believe in the waterfall method, planning everything
from the start. Layered architecture, the field where embedded languages
fit perfectly, is of little use to someone who prefers a big-ball-of-mud
architecture.
Dan Bensen wrote:
> I find him very boring and upsetting
Please don't be upset by what I say. I shall try to be more diplomatic.
> The fact is that dynamic typing is better for RAD.
Dynamic typing can certainly be better for RAD under certain circumstances
(e.g. interop, GUI dev) but not all.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
Jon Harrop wrote:
> Dan Bensen wrote:
>> I find him very boring and upsetting
> I shall try to be more diplomatic.
Could you try to be rational?
That's all I really care about.
--
Dan
www.prairienet.org/~dsb/
In article <························@36g2000prm.googlegroups.com>,
fireblade <·················@gmail.com> wrote:
> On May 20, 7:46 pm, Rainer Joswig <······@lisp.de> wrote:
> > In article <························@ptn-nntp-reader02.plus.net>,
> > Jon Harrop <····@ffconsultancy.com> wrote:
> >
> > > j.oke wrote:
> > > > If you can live with compromises (in order to give you some
> > > > superficial(!) advantages), you might consider to switch to some other
> > > > language...
> >
> > > What if the superficial disadvantage is "Lisp lost me a $1M trade because it
> > > hung whilst garbage collecting a big int I never needed" or "nobody is
> > > buying my computer game because Lisp has generalised the player coordinates
> > > to arbitrary-precision numbers and crippled the frame rate"?
> >
> > > The fact is, Lisp is slow and it puts people off.
> >
> > [ASCII-art "Please don't feed the trolls" sign snipped]
> >
> > --http://lispm.dyndns.org
>
> Sorry Rainer, but I really hate it when people talk trash about game
> development; besides, I find this troll quite interesting.
Then discuss game development or other stuff via email with him.
--
http://lispm.dyndns.org
On May 21, 2:39 pm, Rainer Joswig <······@lisp.de> wrote:
> In article <························@36g2000prm.googlegroups.com>,
> fireblade <·················@gmail.com> wrote:
> > On May 20, 7:46 pm, Rainer Joswig <······@lisp.de> wrote:
> > > In article <························@ptn-nntp-reader02.plus.net>,
> > > Jon Harrop <····@ffconsultancy.com> wrote:
>
> > > > j.oke wrote:
> > > > > If you can live with compromises (in order to give you some
> > > > > superficial(!) advantages), you might consider to switch to some other
> > > > > language...
>
> > > > What if the superficial disadvantage is "Lisp lost me a $1M trade because it
> > > > hung whilst garbage collecting a big int I never needed" or "nobody is
> > > > buying my computer game because Lisp has generalised the player coordinates
> > > > to arbitrary-precision numbers and crippled the frame rate"?
>
> > > > The fact is, Lisp is slow and it puts people off.
>
> > > [ASCII-art "Please don't feed the trolls" sign snipped]
>
> > > --http://lispm.dyndns.org
>
> > Sorry Rainer, but I really hate it when people talk trash about game
> > development; besides, I find this troll quite interesting.
>
> Then discuss game development or other stuff via email with him.
>
> --http://lispm.dyndns.org
>
Ok
On May 20, 7:19 pm, Jon Harrop <····@ffconsultancy.com> wrote:
> j.oke wrote:
> > If you can live with compromises (in order to give you some
> > superficial(!) advantages), you might consider to switch to some other
> > language...
>
> What if the superficial disadvantage is "Lisp lost me a $1M trade because it
> hung whilst garbage collecting a big int I never needed" or "nobody is
> buying my computer game because Lisp has generalised the player coordinates
> to arbitrary-precision numbers and crippled the frame rate"?
>
> The fact is, Lisp is slow and it puts people off.
>
> --
> Dr Jon D Harrop, Flying Frog Consultancy
Jon, you never made any game, so you don't know whether it is slow or
not. You never even made a game with OCaml to be able to speak about
it. The largest thing I ever made, a racing demo of ~20,000 lines with
Lightfeather and Newton handling the physics, showed that unoptimised
Lisp is about 30% slower, something that people could live with
considering the ease of development with Lisp. And BTW, making a good
3D game is the hardest thing I ever tackled. People develop games with
C/C++/C# because that's what the GDKs and game engines support. Some
use higher-level languages for scripting or develop something in-house,
but that's another topic.
Dr Jon D Harrop of Flying F*** Consultancy finally hauls out this
well-worn saw:
> The fact is, Lisp is slow and it puts people off.
If I had a nickel for every time I heard this...
I think this counts as a lisp-specific instance of Godwin's law.
Anyone remember `Ahmed the Peaceman'?
Apparently you don't read what I post. In a conversation you
participated in on comp.lang.lisp just a bit over a year ago I wrote:
Allow me to point you at the Supercomputer Toolkit:
http://repository.readscheme.org/ftp/papers/berlinsurati-pepm94.pdf
``The Supercomputer Toolkit compiler performs partial evaluation
of data-independent programs expressed in the Scheme
dialect of Lisp by using the symbolic execution technique described
in previously published work by Berlin.''
Jacob Katzenelson at Technion has apparently used the Supercomputer
Toolkit to run weather prediction models:
``Conservative estimates performed for a typical mesoscale forecast
over the eastern Mediterranean (for 36 hour) suggest that were the
Toolkit constructed from ALPHA processors, 10 processors would do the
prediction in only about 13 minutes. A 36 hours simulation with full
physics for the whole earth will require 2 hours for 80 ALPHA
processors.''
If Lisp is slow, why then are MIT researchers using it for
computation-intensive programs?
What sort of crappy education are they giving at Robinson College
these days? It only takes a few seconds on Google to find enough
papers on high-performance computing in Lisp to make a decent text
book! If I were you I'd ask Robinson for my money back. It seems
they left you as ignorant as when you entered and certainly didn't
train you to change that sad situation.
Next time you feel like uttering a sentence with the word `fact' in it
(or `most people' or `vast majority', etc.) just type it into that
little rectangular box at the top of your browser and *find out* if
there is any truth to it *before* you go spouting off.
Joe Marshall <··········@gmail.com> writes:
> Allow me to point you at the Supercomputer Toolkit:
> http://repository.readscheme.org/ftp/papers/berlinsurati-pepm94.pdf
>
> ``The Supercomputer Toolkit compiler performs partial evaluation
> of data-independent programs expressed in the Scheme
> dialect of Lisp by using the symbolic execution technique described
> in previously published work by Berlin.''
Thanks for posting this, it was extremely interesting. I do a lot of
numerical work (slowly migrating to Lisp), and I would like to learn
more about these techniques. For example, is it Scheme specific or
could this be done in CL? What would be a good place to start?
Thanks,
Tamas
>
>
> Jacob Katzenelson at Technion has apparently used the Supercomputer
> Toolkit to run weather prediction models:
>
>
> ``Conservative estimates performed for a typical mesoscale forecast
> over the eastern Mediterranean (for 36 hour) suggest that were the
> Toolkit constructed from ALPHA processors, 10 processors would do the
> prediction in only about 13 minutes. A 36 hours simulation with full
> physics for the whole earth will require 2 hours for 80 ALPHA
> processors.''
>
> If Lisp is slow, why then are MIT researchers using it for computation-
> intensive programs?
>
> What sort of crappy education are they giving at Robinson College
> these days? It only takes a few seconds on Google to find enough
> papers on high-performance computing in Lisp to make a decent text
> book! If I were you I'd ask Robinson for my money back. It seems
> they left you as ignorant as when you entered and certainly didn't
> train you to change that sad situation.
>
> Next time you feel like uttering a sentence with the word `fact' in it
> (or `most people' or `vast majority', etc.) just type it into that
> little rectangular box at the top of your browser and *find out* if
> there is any truth to it *before* you go spouting off.
On May 21, 1:10 pm, Tamas Papp <······@gmail.com> wrote:
> Joe Marshall <··········@gmail.com> writes:
> > Allow me to point you at the Supercomputer Toolkit:
> >http://repository.readscheme.org/ftp/papers/berlinsurati-pepm94.pdf
>
> > ``The Supercomputer Toolkit compiler performs partial evaluation
> > of data-independent programs expressed in the Scheme
> > dialect of Lisp by using the symbolic execution technique described
> > in previously published work by Berlin.''
>
> Thanks for posting this, it was extremely interesting. I do a lot of
> numerical work (slowly migrating to Lisp), and I would like to learn
> more about these techniques. For example, is it Scheme specific or
> could this be done in CL? What would be a good place to start?
It can be done in Common Lisp. Jeffrey Mark Siskind has a partial
evaluator for Common Lisp at
http://cobweb.ecn.purdue.edu/~qobi/software.html
You should also check out this paper:
@article{ fateman95fast,
author = "Richard J. Fateman and Kevin A. Broughan and Diane K.
Willcock and Duane Rettig",
title = "Fast Floating Point Processing in {Common Lisp}",
journal = "ACM Transactions on Mathematical Software",
volume = "21",
number = "1",
pages = "26--62",
year = "1995",
url = "citeseer.ist.psu.edu/fateman95fast.html" }
Two of the authors have been known to post to this group.
A lot of this work is still very experimental, so be prepared to poke
around.
Joe Marshall wrote:
> Dr Jon D Harrop of Flying F*** Consultancy finally hauls out this well-
> worn saw:
>> The fact is, Lisp is slow and it puts people off.
>
> Allow me to point you at the Supercomputer Toolkit:
> http://repository.readscheme.org/ftp/papers/berlinsurati-pepm94.pdf
Allow me to point you at some benchmarks made this century that run on
platforms that aren't dead:
http://www.ffconsultancy.com/languages/ray_tracer/results.html
> It only takes a few seconds on Google to find enough
> papers on high-performance computing in Lisp to make a decent text
> book!
Yeah, your new book could include this paper from 1983 as well:
http://portal.acm.org/citation.cfm?id=801672&coll=portal&dl=ACM
> It seems
> they left you as ignorant as when you entered and certainly didn't
> train you to change that sad situation.
I can't believe you haven't reached for the stereotypical Lisper's
performance comparison with Java. You must be getting slow in your old age.
> Next time you feel like uttering a sentence with the word `fact' in it
> (or `most people' or `vast majority', etc.) just type it into that
> little rectangular box at the top of your browser and *find out* if
> there is any truth to it *before* you go spouting off.
<sarcasm>
I hear Lisp is the new black at high-performance supercomputing conferences.
Not to mention the fact that performance only matters when you're coding
for a supercomputer...
</sarcasm>
ROTFLMAO.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
j.oke wrote:
> Now we get:
>
> OCaml: 0.341s [not tested]
> C: 0.384s [not tested]
> Lisp: 0.0s [tested!]
Can you post your code so that we can benchmark compile times?
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
On Sat, 19 May 2007, Jon Harrop wrote:
> Chris Russell wrote:
>> (defun pythagorean-triplet()
>> (let ((c2-b2 0) (a 0) (i 0))
>> (loop for c from 2 to 5000 do
>> (loop for b from 1 below c do
>> (progn
>> (setf c2-b2 (- (* c c) (* b b))
>> a (isqrt c2-b2))
>> (when (= c2-b2 (* a a))
>> (incf i)))))
>> (print i)))
>
> I think this is the only Lisp posted so far that gives the same answer as
> the original program. As far as timings, I get:
>
> OCaml: 0.341s
> C: 0.384s
> Lisp: 9.233s
C: 0.604s
Scheme: 0.478s
scheme
------
"
(define (pythag max)
(let ((i 0))
(let loop ((c 2))
(if (<= c max)
(begin
(let loop ((b 1))
(if (< b c)
(begin
(let ((a (sqrt (- (* c c) (* b b)))))
(if (= (floor a) a)
(set! i (+ 1 i))))
(loop (+ 1 b)))))
(loop (+ 1 c)))))
(display i)
(newline)))
(pythag 5000)
"
stalin -On -copt -O3 -copt -ffast-math pythag.scm
time ./pythag
C
-
"
#include <stdio.h>
#include <math.h>
#define MAX 5000
int main() {
double c, b;
int i = 0;
double a;
for (c = 2; c <= MAX; c++) {
for (b = 1; b < c; b++) {
a = sqrt(c*c - b*b);
if (trunc(a) == a) {
i++;
}
}
}
printf("%d\n", i);
}
"
gcc -O3 -ffast-math pythag_c.c -lm
time ./a.out
On Sun, 20 May 2007, Kjetil S. Matheussen wrote:
>
>
> On Sat, 19 May 2007, Jon Harrop wrote:
>
>> Chris Russell wrote:
>> > (defun pythagorean-triplet()
>> > (let ((c2-b2 0) (a 0) (i 0))
>> > (loop for c from 2 to 5000 do
>> > (loop for b from 1 below c do
>> > (progn
>> > (setf c2-b2 (- (* c c) (* b b))
>> > a (isqrt c2-b2))
>> > (when (= c2-b2 (* a a))
>> > (incf i)))))
>> > (print i)))
>>
>> I think this is the only Lisp posted so far that gives the same answer as
>> the original program. As far as timings, I get:
>>
>> OCaml: 0.341s
>> C: 0.384s
>> Lisp: 9.233s
>
> C: 0.604s
> Scheme: 0.478s
>
Further experimentation shows:
C: 0.620s (median of 5)
Scheme/Lisp: 0.408s (median of 5)
gcc -O3 -ffast-math -fomit-frame-pointer -freg-struct-return
-malign-double pythag_c.c -lm
../stalin-0.11/stalin -Ob -Om -On -Or -Ot -dP -dC -dh -q -d -db
-no-clone-size-limit -split-even-if-no-widening -copt -O3
-copt -fomit-frame-pointer -copt -freg-struct-return -copt -ffast-math
pythag.scm
On May 19, 4:38 pm, Emre Sevinc <···········@gmail.com> wrote:
> (defconstant +max+ 5000
> "The constant value up to which the main loop will run")
>
> (defun pythagorean-triplet()
> (let ((a 0) (i 0))
> (loop for c from 2 to +max+
> do
> (loop for b from 1 below c
> do (progn
> (setf a (sqrt (- (* c c) (* b b))))
> (when (= (floor a) a) (incf i)))))
> (print i)))
I'm not sure but maybe it would also be possible to optimize the
algorithm to skip triples with bogus parities.
(defun pythagorean-triplet (lim)
  (loop for c from 2 to (the fixnum lim)
        for cc = (* c c)
        for cp = (mod c 2)
        sum (loop for b from 1 below c
                  for bp = (mod b 2)
                  for aa = (- cc (* b b))
                  for a = (sqrt aa)
                  for ap = (mod aa 2)
                  count (and (or (= 0 ap bp cp)              ; E + E = E
                                 (and (= 1 ap bp) (= 0 cp))  ; O + O = E
                                 (and (= 1 ap cp) (= 0 bp))  ; O + E = O
                                 (and (= 1 bp cp) (= 0 ap))) ; E + O = O
                             (= (floor a) a)))))
It seems like the parity tests cost much more than they help.
Maybe others know how to optimize them anyway. (The parity tests
reduce the example space to half its size.)
Regards.
Emre Sevinc <···········@gmail.com> wrote:
> (defun pythagorean-triplet()
> (let ((a 0) (i 0))
> (loop for c from 2 to +max+
> do
> (loop for b from 1 below c
> do (progn
> (setf a (sqrt (- (* c c) (* b b))))
> (when (= (floor a) a) (incf i)))))
> (print i)))
(deftype small-positive-double-float ()
`(double-float 0d0 ,(float most-positive-fixnum 0d0)))
(defun pythagorean-triplet ()
(declare (optimize speed (safety 0)))
(loop for c from 2 to 5000
sum (loop for b from 1 below c
count (let ((a (the small-positive-double-float
(sqrt (float (- (* c c) (* b b)) 0d0)))))
(= (float (truncate a) 0d0) a)))))
There are a few things to note here:
* SBCL can open-code SQRT on values known to be positive floats (for
example on x86-64 into a single sqrtsd instruction).
* SBCL will only open-code TRUNCATE (or FLOOR) on floats when the
result is known to fit into a machine word.
* SBCL won't open-code = on arguments that aren't known to be of the
same types.
CL-USER> (time (print (pythagorean-triplet)))
11362
Evaluation took:
0.16 seconds of real time
0.159975 seconds of user run time
0.0 seconds of system run time
0 calls to %EVAL
0 page faults and
8,192 bytes consed.
CL-USER> (values (lisp-implementation-type)
(lisp-implementation-version)
(machine-type))
"SBCL"
"1.0.5.54"
"X86-64"
--
Juho Snellman
Juho Snellman wrote:
> (deftype small-positive-double-float ()
> `(double-float 0d0 ,(float most-positive-fixnum 0d0)))
>
> (defun pythagorean-triplet ()
> (declare (optimize speed (safety 0)))
> (loop for c from 2 to 5000
> sum (loop for b from 1 below c
> count (let ((a (the small-positive-double-float
> (sqrt (float (- (* c c) (* b b))
> 0d0)))))
> (= (float (truncate a) 0d0) a)))))
This is the fastest Lisp so far and its run time is faster than everything
except the macro solution that j.oke hinted at.
So this is clearly doing significant work at compile time.
For a fair comparison, you either need to include compilation time or not
hardcode the 5000.
Including compile times I get:
C: 0.399s
OCaml: 0.478s
SBCL: 0.992s
Tweaking the programs to accept N as an argument and then measuring only run
times, I get:
C: 0.291s
OCaml: 0.350s
SBCL: 0.885s
So Lisp is around 3x slower than C on this benchmark. OCaml is only 20%
slower.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
Jon Harrop <···@ffconsultancy.com> writes:
> Juho Snellman wrote:
>> (deftype small-positive-double-float ()
>> `(double-float 0d0 ,(float most-positive-fixnum 0d0)))
>>
>> (defun pythagorean-triplet ()
>> (declare (optimize speed (safety 0)))
>> (loop for c from 2 to 5000
>> sum (loop for b from 1 below c
>> count (let ((a (the small-positive-double-float
>> (sqrt (float (- (* c c) (* b b))
>> 0d0)))))
>> (= (float (truncate a) 0d0) a)))))
>
> This is the fastest Lisp so far and its run time is faster than everything
> except the macro solution that j.oke hinted at.
>
> So this is clearly doing significant work at compile time.
No, it isn't. All it does is use some type declarations that help
SBCL avoid boxing and use inline arithmetic where possible.
> For a fair comparison, you either need to include compilation time
No, since nothing is pre-computed during compilation time.
> or not hardcode the 5000.
[...]
> Tweaking the programs to accept N as an argument and then measuring
> only run times, I get:
[...]
> So Lisp is around 3x slower than C from this benchmark. OCaml is
> only 20% slower.
I'm assuming the first few lines of your tweaked version look like
this?
--8<---------------cut here---------------start------------->8---
(defun pythagorean-triplet (n)
(declare (optimize speed (safety 0)))
(loop for c from 2 to n
...))
--8<---------------cut here---------------end--------------->8---
If that is so, SBCL's type inferencer cannot assume that n is a
machine integer, which means it also cannot assume that c is a machine
integer, etc.. This leads to inefficient code.
Here is a version that takes n as a parameter, annotates even more
types than Juho's version, and runs even faster. I'm sorry I had to
switch from LOOP to the DO macro, but using LOOP, SBCL kept giving me
performance warnings.
--8<---------------cut here---------------start------------->8---
(deftype small-positive-double-float ()
`(double-float 0d0 ,(float most-positive-fixnum 0d0)))
(defun pythagorean-triplet (n)
(declare (optimize speed (safety 0))
(type fixnum n))
(do ((c 2 (1+ c))
(count 0))
((> c n)
count)
(declare (type fixnum c count))
(do ((b 1 (1+ b)))
((>= b c))
(declare (type fixnum b))
(let ((a (the small-positive-double-float
(sqrt (float (- (the fixnum (* c c))
(the fixnum (* b b)))
0d0)))))
(when (= (float (truncate a) 0d0) a)
(incf count))))))
--8<---------------cut here---------------end--------------->8---
I get 0.446183s for this Lisp version and 0.355s for the C version.
Measurements do not include compilation times.
--
Wolfram Fenske
A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
Jon Harrop <···@ffconsultancy.com> wrote:
> This is the fastest Lisp so far and its run time is faster than everything
> except the macro solution that j.oke hinted at.
>
> So this is clearly doing significant work at compile time.
It's not doing any of the computation at compile time. "Clearly" your
prejudices are showing.
> For a fair comparison, you either need to include compilation time or not
> hardcode the 5000.
>
> Including compile times I get:
>
> C: 0.399s
> OCaml: 0.478s
> SBCL: 0.992s
Compiling that function takes less than 0.03 seconds on my machine. I
suggest that your measurements are complete bullshit.
> Tweaking the programs to accept N as an argument and then measuring only run
> times, I get:
>
> C: 0.291s
> OCaml: 0.350s
> SBCL: 0.885s
>
> So Lisp is around 3x slower than C from this benchmark. OCaml is only 20%
> slower.
Declaring the type of N (for example as an (unsigned-byte 14)) would
fix that.
--
Juho Snellman
Juho Snellman wrote:
> It's not doing any of the computation at compile time.
Wolfram's results prove otherwise.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
Jon Harrop <···@ffconsultancy.com> writes:
> Juho Snellman wrote:
>> It's not doing any of the computation at compile time.
>
> Wolfram's results prove otherwise.
Huh? How so?
--
Wolfram Fenske
A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
/me feeds the troll.
On 2007-05-20, Jon Harrop <···@ffconsultancy.com> wrote:
> Juho Snellman wrote:
>> (deftype small-positive-double-float ()
>> `(double-float 0d0 ,(float most-positive-fixnum 0d0)))
>>
>> (defun pythagorean-triplet ()
>> (declare (optimize speed (safety 0)))
>> (loop for c from 2 to 5000
>> sum (loop for b from 1 below c
>> count (let ((a (the small-positive-double-float
>> (sqrt (float (- (* c c) (* b b))
>> 0d0)))))
>> (= (float (truncate a) 0d0) a)))))
>
> This is the fastest Lisp so far and its run time is faster than
> everything except the macro solution that j.oke hinted at.
>
> So this is clearly doing significant work at compile time.
Others have disagreed with this statement, but ...
> For a fair comparison, you either need to include compilation time
... I'll disagree with this one.
Suddenly the Lisp program wins, through (you believe) compile-time
computation, and suddenly you change what you measure so that the Lisp
program doesn't win any more.
Why?
You might as well include the typing speed of the programmers too,
"for a fair comparison". Or, hmm, the square-footage of their houses,
"for a fair comparison".
You should define what you want up front, and whoever wins, wins.
Just prepare yourself for *us* to want different things (or in
different proportions) than *you*.
Larry Clapp wrote:
>> For a fair comparison, you either need to include compilation time
>
> ... I'll disagree with this one.
>
> Suddenly the Lisp program wins, through (you believe) compile-time
> computation, and suddenly you change what you measure so that the Lisp
> program doesn't win any more.
>
> Why?
Other solutions also take zero run time but significant compile time (to do
the calculation). How do you propose we compare them?
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
Jon Harrop wrote:
> Other solutions also take zero run time but significant compile time (to do
> the calculation). How do you propose we compare them?
[off topic, whispering]
(You might consider changing your way of learning Lisp:
on Usenet it will take you more than a dozen years... ;-)
On May 21, 2:27 pm, "j.oke" <········@gmail.com> wrote:
> Jon Harrop wrote:
> > Other solutions also take zero run time but significant compile time (to do
> > the calculation). How do you propose we compare them?
>
> [off topic, whispering]
>
> (You might consider to change your way of learning Lisp:
> In usenet it will take you more than a dozen years... ;-)
Yes, but it makes a cool-looking page. Just imagine: Jon Harrop's
Road to Lisp.
On May 21, 2:47 pm, fireblade <·················@gmail.com> wrote:
> On May 21, 2:27 pm, "j.oke" <········@gmail.com> wrote:
>
> > Jon Harrop wrote:
> > > Other solutions also take zero run time but significant compile time (to do
> > > the calculation). How do you propose we compare them?
>
> > [off topic, whispering]
>
> > (You might consider to change your way of learning Lisp:
> > In usenet it will take you more than a dozen years... ;-)
>
> Yes but it makes a cool looking page. Just imagine : Jon Harrop's
> Road to Lisp.
Something like "I learned Lisp by trolling on Usenet", or "my first
exposure to Lisp was when I decided to convert Lispers to my ugly
OCaml-like Microshit language and, by the way, sell my stupid journal".
No way.
On May 21, 3:15 pm, jt <·············@gmail.com> wrote:
> On May 21, 2:47 pm, fireblade <·················@gmail.com> wrote:
>
> > On May 21, 2:27 pm, "j.oke" <········@gmail.com> wrote:
>
> > > Jon Harrop wrote:
> > > > Other solutions also take zero run time but significant compile time (to do
> > > > the calculation). How do you propose we compare them?
>
> > > [off topic, whispering]
>
> > > (You might consider to change your way of learning Lisp:
> > > In usenet it will take you more than a dozen years... ;-)
>
> > Yes but it makes a cool looking page. Just imagine : Jon Harrop's
> > Road to Lisp.
>
> Something like I learned lisp by trolling in usenet, or my first
> exposure to lisp was when I decide to convert lispers to my ugly ocaml
> like microshit language and by the way sell my stupid journal.
>
> No way.
How do you know that his journal is stupid? Wait a minute, are you
the Lisper that ordered it? :)
On 2007-05-21, Jon Harrop <···@ffconsultancy.com> wrote:
> Larry Clapp wrote:
>>> For a fair comparison, you either need to include compilation time
>>
>> ... I'll disagree with this one.
>>
>> Suddenly the Lisp program wins, through (you believe) compile-time
>> computation, and suddenly you change what you measure so that the
>> Lisp program doesn't win any more.
>>
>> Why?
>
> Other solutions also take zero run time but significant compile time
> (to do the calculation). How do you propose we compare them?
Which part of
"You should define what you want up front, and whoever wins, wins."
did you not understand?
Jon Harrop wrote:
[lots of stuff]
Great, you learned that some things can and should be computed at
compile time for maximum efficiency.
You also learned that a good algorithm performs way better, and that
the language used has little impact on that simple fact.
That's two nice stories for your journal, Jon.
Sacha
Jon Harrop wrote:
> Tweaking the programs to accept N as an argument and then measuring only run
> times, I get:
>
> C: 0.291s
> OCaml: 0.350s
> SBCL: 0.885s
>
> So Lisp is around 3x slower than C from this benchmark. OCaml is only 20%
> slower.
3x to 5x slower than C is the same result I had with other micro-benchmarks
like this, but who cares? I don't think this can be generalized to
larger applications, because typically there are only a few hotspots, which
could be implemented in C and called from Lisp; for the rest of the
program it is more important to have the flexibility Lisp provides, to make
it easier for developers to write and maintain programs. And with
incremental GCs, like modern Lisp implementations provide, there are no
significant pauses, e.g. when using Lisp for games.
Some more reasons why a good GC is faster than manual memory management:
http://www.jwz.org/doc/gc.html
--
Frank Buss, ··@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
Frank Buss wrote:
> Some more reasons why a good GC is faster than manual memory managment:
> http://www.jwz.org/doc/gc.html
So you'd recommend the fastest language with GC?
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
Jon Harrop wrote:
> Frank Buss wrote:
>> Some more reasons why a good GC is faster than manual memory managment:
>> http://www.jwz.org/doc/gc.html
>
> So you'd recommend the fastest language with GC?
This is language independent, because you can use a GC in C++, too. But
Lisp is much easier to use.
--
Frank Buss, ··@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
Frank Buss wrote:
> This is language independant...
Language design can affect the potential for optimization in the GC, e.g.
see BIBOP.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
On Sun, 20 May 2007, Jon Harrop wrote:
> Juho Snellman wrote:
>> (deftype small-positive-double-float ()
>> `(double-float 0d0 ,(float most-positive-fixnum 0d0)))
>>
>> (defun pythagorean-triplet ()
>> (declare (optimize speed (safety 0)))
>> (loop for c from 2 to 5000
>> sum (loop for b from 1 below c
>> count (let ((a (the small-positive-double-float
>> (sqrt (float (- (* c c) (* b b))
>> 0d0)))))
>> (= (float (truncate a) 0d0) a)))))
>
> This is the fastest Lisp so far and its run time is faster than everything
> except the macro solution that j.oke hinted at.
>
No. Not correct. As I posted, the Stalin version was about 1.5 times
faster than the C version, and therefore probably also faster than the
OCaml version. Even better, the Stalin version wasn't even optimized: I
just wrote down the most straightforward version, compiled it, and bang,
it was 1.5 times faster than C.
On Mon, 21 May 2007, Kjetil S. Matheussen wrote:
>
>
> On Sun, 20 May 2007, Jon Harrop wrote:
>
>> Juho Snellman wrote:
>> > (deftype small-positive-double-float ()
>> > `(double-float 0d0 ,(float most-positive-fixnum 0d0)))
>> >
>> > (defun pythagorean-triplet ()
>> > (declare (optimize speed (safety 0)))
>> > (loop for c from 2 to 5000
>> > sum (loop for b from 1 below c
>> > count (let ((a (the small-positive-double-float
>> > (sqrt (float (- (* c c) (* b b))
>> > 0d0)))))
>> > (= (float (truncate a) 0d0) a)))))
>>
>> This is the fastest Lisp so far and its run time is faster than everything
>> except the macro solution that j.oke hinted at.
>>
>
> No. Not correct.
%!#$%
I thought the Lisp benchmark result in your post was produced by running
the version above.
Kjetil S. Matheussen wrote:
> No. Not correct. As I posted, the stalin version was about 1.5 times
> faster than the C version,
Stalin only runs on my 32-bit system so I haven't gotten around to it yet.
> Even better, the stalin version wasn't even optimized,
Unless you count the page of compiler flags. ;-)
> I
> just wrote down the most straigh-forward version, compiled it, and bang,
> 1.5 times faster than C.
This also indicates that the C version could be improved.
Using ints instead of doubles and (int)(..) instead of trunc(..) makes this
C99 more than twice as fast (0.137s):

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    const int n = (argc == 2 ? atoi(argv[1]) : 5000);
    int i = 0;
    for (int c = 2; c <= n; c++)
        for (int b = 1; b < c; b++) {
            double a = sqrt(c*c - b*b);
            if ((int)a == a) i++;
        }
    printf("%d\n", i);
    return 0;
}
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
On Mon, 21 May 2007, Jon Harrop wrote:
> Kjetil S. Matheussen wrote:
> > No. Not correct. As I posted, the stalin version was about 1.5 times
> > faster than the C version,
>
> Stalin only runs on my 32-bit system so I haven't gotten around to it yet.
>
> > Even better, the stalin version wasn't even optimized,
>
> Unless you count the page of compiler flags. ;-)
>
> > I
> > just wrote down the most straigh-forward version, compiled it, and bang,
> > 1.5 times faster than C.
>
> This also indicates that the C could be improved.
>
> Using ints instead of doubles and (int)(..) instead of trunc(..) makes this
> C99 more than twice as fast (0.137s):
>
> int main(int argc, char *argv[]) {
> const int n = (argc == 2 ? atoi(argv[1]) : 5000);
> int i = 0;
>
> for (int c=2; c<=n; c++)
> for (int b=1; b<c; b++) {
> double a = sqrt(c*c - b*b);
> if ((int)a == a) i++;
> }
> printf("%d\n", i);
> }
>
>
Well, for fun, here are the benchmark results for Stalin (the one I posted
earlier), C (your version above), and OCaml (the one found at
http://tempe.st/2007/05/erlang-ruby-and-php-battle-it-out/).
Unfortunately, I don't have SBCL (my Linux lacks NPTL).
Compile options
---------------
gcc -O3 --fast-math
ocamlopt -rectypes -inline 100 -ffast-math -ccopt -O3
stalin -Ob -Om -On -Or -Ot -dP -dC -dh -q -d -db -no-clone-size-limit -split-even-if-no-widening -copt -O3 -copt -ffast-math
Compiler versions
-----------------
gcc 3.4.6
ocamlopt 3.09.3
stalin 0.11
Results
-------
gcc: 0.307s (best of 5)
OCaml: 0.538s (best of 5)
Stalin: 0.356s (best of 5)
On 2007-05-21 15:01:11 -0400, Jon Harrop <···@ffconsultancy.com> said:
> I get:
>
> gcc: 0.137s
> SBCL: 0.214s
> Stalin: 0.330s (hardcoded "n")
> OCaml: 0.353s
>
> Probably time to increase n. With n=20000:
>
> gcc: 2.251s
> SBCL: 2.462s
> Stalin: 5.164s
> OCaml: 5.433s
>
> Well, this microbenchmark certainly isn't showing anything meaningful. ;-)
Apparently benchmark "meaning" is inversely proportional to how much
faster a lisp runs it than ocaml.
Raffael Cavallaro wrote:
> On 2007-05-21 15:01:11 -0400, Jon Harrop <···@ffconsultancy.com> said:
>> Well, this microbenchmark certainly isn't showing anything meaningful.
>> ;-)
>
> Apparently benchmark "meaning" is inversely proportional to how much
> faster a lisp runs it than ocaml.
You missed off the factorial.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
On 2007-05-21 16:39:51 -0400, Jon Harrop <···@ffconsultancy.com> said:
> Raffael Cavallaro wrote:
>> On 2007-05-21 15:01:11 -0400, Jon Harrop <···@ffconsultancy.com> said:
>>> Well, this microbenchmark certainly isn't showing anything meaningful.
>>> ;-)
>>
>> Apparently benchmark "meaning" is inversely proportional to how much
>> faster a lisp runs it than ocaml.
>
> You missed off the factorial.
Credit where credit is due:
"OCaml - the language of choice for toy benchmarks from factorials to
ray-tracers!"
Jon Harrop wrote:
> Kjetil Svalastog Matheussen wrote:
>> gcc: 0.307s (best of 5)
>> OCaml: 0.538s (best of 5)
>> Stalin: 0.356s (best of 5)
>
> I get:
>
> gcc: 0.137s
> SBCL: 0.214s
> Stalin: 0.330s (hardcoded "n")
> OCaml: 0.353s
>
> Probably time to increase n. With n=20000:
>
> gcc: 2.251s
> SBCL: 2.462s
> Stalin: 5.164s
> OCaml: 5.433s
>
> Well, this microbenchmark certainly isn't showing anything meaningful. ;-)
No, of course not. Microbenchmarks never show anything meaningful, no
matter what language they favor.
In this thread we have just seen what every experienced Lisper already
knows: With enough tweaking we can get in the same ballpark as any other
compiled language. This is only useful for the hot spots, though, and
you can never know the hot spots in advance. For everything else,
performance doesn't matter, as long as it's reasonable.
Other aspects are at least similarly important.
For example, from
http://groups.google.com/group/comp.lang.lisp/msg/c8aac11d40efa4b0
"It is interesting to notice that one year later, [the Common Lisp
directory] is still the same Lisp process that is running. The project
has evolved from Linkit to the CLD without ever quitting the Lisp
process. It has never crashed, even though the application and the CLOS
classes have changed a lot of times. In the same time, I had to restart
several times the Apache process and reboot a lot of time the 3COM
firewall router in front of the server."
Another example, from
https://lists.csail.mit.edu/pipermail/ll-discuss/2007-February/001186.html
"The second moment came when I had re-factored a class into a
superclass/subclass relationship. I then reloaded the file and tested
the software to see that it worked ok. It did.
Then I thought about that. I just re-organized the class topology on
the fly in a running system. How on earth could that just work?
Obviously existing instances had to be converted on the fly to new
objects and any dependencies had to be recomputed. But a huge chunk of
CLOS that I had thought `overkill' was what had stepped in and made sure
that I could just reload a file and keep going. Once again I was deeply
impressed.
I think if Graham spent a good chunk of time working on a CLOS project
that he, too, would come around to the point of view that CLOS is a
significant achievement."
Or from http://lib.store.yahoo.net/lib/paulgraham/bbnexcerpts.txt
"Lisp's interactive toplevel is a great help in developing software
rapidly. But the biggest advantage for us was probably in finding bugs.
As I mentioned before, with Web-based applications you have the users'
data on your servers and can usually reproduce bugs.
When one of the customer support people came to me with a report of a
bug in the editor, I would load the code into the Lisp interpreter and
log into the user's account. If I was able to reproduce the bug I'd get
an actual break loop, telling me exactly what was going wrong. Often I
could fix the code and release a fix right away. And when I say right
away, I mean while the user was still on the phone."
...and so on, and so forth.
Pascal
--
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
On Mon, 21 May 2007, Jon Harrop wrote:
> Kjetil Svalastog Matheussen wrote:
>> gcc: 0.307s (best of 5)
>> OCaml: 0.538s (best of 5)
>> Stalin: 0.356s (best of 5)
>
> I get:
>
> gcc: 0.137s
> SBCL: 0.214s
> Stalin: 0.330s (hardcoded "n")
> OCaml: 0.353s
>
> Probably time to increase n. With n=20000:
>
> gcc: 2.251s
> SBCL: 2.462s
> Stalin: 5.164s
> OCaml: 5.433s
>
> Well, this microbenchmark certainly isn't showing anything meaningful. ;-)
>
Right. It's weird that you get such slow code from Stalin, though. I
remember the same thing happened with your ray-tracer benchmark: both
Siskind and I got much faster code out of Stalin than you did. I like
Stalin as proof that untweaked Lisp (including other languages with
dynamic typing and such) can be really fast, and in theory beat everything
else when it comes to performance, so it's disappointing that you can't
get as good performance out of it, given how much buzz you make about
performance in your endless newsgroup posts and in the promotion of your
ray-tracer benchmark.
Kjetil S. Matheussen wrote:
> Right. Its weird that you get so slow code from stalin though. I remember
> the same thing happened with your ray-tracer benchmark. both Siskin and
> myself got much faster code out of stalin than you did. I like stalin as a
> proof that untweaked lisp (including other languages with dynamic typing
> and such) can be really fast, and in theory beat everything else when it
> comes to performance, so its dissapointing that you can't get as good
> performance out of it since you make so much buzz about performance in
> your endless number of newsgroup posts and the promotion for your
> ray-tracer benchmark.
I don't think this result is so surprising. Firstly, the Stalin entry is the
only one running in 32-bit. GCC, OCaml and SBCL all support 64-bit and I'm
running an AMD64 that is much faster in 64-bit.
Secondly, previous benchmarks indicate that SBCL and Stalin both do
relatively better than OCaml on Intel hardware. We only use AMD hardware so
I can't benchmark the same programs on Intels. But my guess is that the
results would be significantly different.
At the end of the day, all this really means is that the results are not
significantly different.
There are still several interesting facts to be drawn from these results:
1. Functional programming languages can be very competitive with C on
imperative code. I was very surprised to see the obvious C implementation
beaten by Lisp.
2. This is another benchmark where OCaml's lack of optimized div and mod
hurts.
3. Compiler switches and platforms have at least as big an effect as
language choice on this benchmark.
4. The underlying representation and choice of GC can have very significant
performance implications for a language. In this case, OCaml has trouble
using a bit-vector because it offers only tagged 31/63-bit ints for unboxed
arrays, which require much more work and slightly more memory.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet