Hello Lisp meisters,
I was reading a blog post about the performance of implementations of a
very simple Pythagorean-triplet algorithm in different languages:
http://tempe.st/2007/05/erlang-ruby-and-php-battle-it-out/
And I wondered what I could do with Common Lisp (SBCL 0.9.17). I have
tried two versions:
(defconstant +max+ 5000
  "The constant value up to which the main loop will run")

(defun pythagorean-triplet ()
  (let ((a 0) (i 0))
    (loop for c from 2 to +max+
          do (loop for b from 1 below c
                   do (progn
                        (setf a (sqrt (- (* c c) (* b b))))
                        (when (= (floor a) a) (incf i)))))
    (print i)))
(defun pythagorean-triplet-optimized-for-speed ()
  (declare (optimize (speed 3) (debug 0) (safety 0))
           (type single-float a)
           (type fixnum i))
  (let ((a 0) (i 0))
    (loop for c from 2 to +max+
          do (loop for b from 1 below c
                   do (progn
                        (setf a (sqrt (- (* c c) (* b b))))
                        (when (= (ffloor a) a) (incf i)))))
    (print i)))
Then I timed them:
CL-USER> (time (pythagorean-triplet))
13413
Evaluation took:
5.015 seconds of real time
3.87941 seconds of user run time
0.035995 seconds of system run time
[Run times include 0.25 seconds GC run time.]
0 calls to %EVAL
0 page faults and
299,940,200 bytes consed.
CL-USER> (time (pythagorean-triplet-optimized-for-speed))
13413
Evaluation took:
1.571 seconds of real time
1.433782 seconds of user run time
0.002999 seconds of system run time
[Run times include 0.151 seconds GC run time.]
0 calls to %EVAL
0 page faults and
199,960,776 bytes consed.
I wonder if the pythagorean-triplet-optimized-for-speed function can be
made faster by tweaking the compilation parameters (without touching
the structure of the loop).
Maybe some of you will try and show me (or maybe SBCL > 1 versions got
faster, I don't know).
The machine I ran this on was a Debian box with the following
specifications:
debian:/home/fz# uname -a
Linux debian 2.6.11-1-686 #1 Mon Jun 20 22:00:38 MDT 2005 i686 GNU/Linux
··@debian:~$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Celeron(R) CPU 2.40GHz
stepping : 1
cpu MHz : 2392.778
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pni
monitor ds_cpl cid xtpr
bogomips : 4718.59
Cheers
--
Emre Sevinc
> I wonder if the pythagorean-triplet-optimized-for-speed function can be
> made faster by tweaking the compilation parameters (without touching
> the structure of the loop).
>
> Maybe some of you will try and show me (or maybe SBCL > 1 versions got
> faster, I don't know).
>
You need to put more (the fixnum _)s in there to guarantee that the
numbers won't overflow. By default, Common Lisp has to assume that the
sum/product of two fixnums is an integer, not another fixnum.
(defun pythagorean-triplet-re-optimized-for-speed ()
  (declare (optimize (speed 3) (debug 0) (safety 0)))
  (let ((a 0.0) (i 0))
    (declare (type single-float a) (type fixnum i)) ; Moved to avoid warnings
    (loop for c from 2 to (the fixnum +max+)
          do (loop for b from 1 below c
                   do (progn
                        (setf a (sqrt (the fixnum (- (the fixnum (* c c))
                                                     (the fixnum (* b b))))))
                        ;; Strangely this seemed to help:
                        (when (= (the single-float (ffloor a)) a)
                          (incf i)))))
    (print i)))
CL-USER> (time (pythagorean-triplet-re-optimized-for-speed))
13413
Evaluation took:
1.366 seconds of real time
1.336084 seconds of user run time
0.008001 seconds of system run time
[Run times include 0.116 seconds GC run time.]
0 calls to %EVAL
0 page faults and
199,961,352 bytes consed.
13413
CL-USER> (time (pythagorean-triplet-optimized-for-speed))
13413
Evaluation took:
1.66 seconds of real time
1.624102 seconds of user run time
0.040002 seconds of system run time
[Run times include 0.124 seconds GC run time.]
0 calls to %EVAL
0 page faults and
199,957,592 bytes consed.
13413
Chris Russell <·····················@gmail.com> wrote:
[...]
>
> You need to put more (the fixnum _)s in there to guarantee that the
> numbers won't overflow. By default common lisp should assume that the
> sum/product of two fixnums is an integer not another fixnum.
>
> (defun pythagorean-triplet-re-optimized-for-speed()
> (declare (optimize (speed 3) (debug 0) (safety 0)))
> (let ((a 0.0) (i 0))
> (declare (type single-float a) (type fixnum i)) ;Moved to avoid
> warnings
> (loop for c from 2 to (the fixnum +max+) do
> (loop for b from 1 below c
> do (progn
> (setf a (sqrt (the fixnum (- (the fixnum (* c c))
> (the fixnum (* b b))))))
> (when (= (the single-float(ffloor a)) a) (incf
> i))))) ;;Strangely this seemed to help
> (print i)))
>
> CL-USER> (time (pythagorean-triplet-re-optimized-for-speed))
>
> 13413
> Evaluation took:
> 1.366 seconds of real time
> 1.336084 seconds of user run time
> 0.008001 seconds of system run time
> [Run times include 0.116 seconds GC run time.]
> 0 calls to %EVAL
> 0 page faults and
> 199,961,352 bytes consed.
> 13413
> CL-USER> (time (pythagorean-triplet-optimized-for-speed))
>
> 13413
> Evaluation took:
> 1.66 seconds of real time
> 1.624102 seconds of user run time
> 0.040002 seconds of system run time
> [Run times include 0.124 seconds GC run time.]
> 0 calls to %EVAL
> 0 page faults and
> 199,957,592 bytes consed.
> 13413
Why is this consing so much?
> Why is this consing so much?
I don't know. I assume that these single-floats must be passed by
reference and that this is where it's coming from.
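A rough accounting supports that guess (a sketch; the byte figure is from
the TIME output above, and the 8-byte box size is an assumption for
32-bit SBCL, where single-floats are heap-allocated rather than immediate):

```lisp
;; The inner loop runs once per (b, c) pair:
(loop for c from 2 to 5000 sum (1- c))   ; => 12497500
;; The re-optimized version consed 199,960,776 bytes:
(/ 199960776.0 12497500)                 ; => ~16.0 bytes per iteration
;; i.e. about two freshly boxed single-floats per iteration (the SQRT
;; result and the FFLOOR result), assuming 8-byte boxes on 32-bit SBCL.
```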
More drastically, these algorithms are flawed:

  (- (+ (expt 4576 2) (expt 2015 2))
     (expt 5000 2))

gives 1, but (4576, 2015, 5000) is considered an acceptable solution by
the first algorithm due to the rounding error.
I only noticed this when I tried:
(defun pythagorean-triplet ()
  (let ((c2-b2 0) (a 0) (i 0))
    (loop for c from 2 to 5000
          do (loop for b from 1 below c
                   do (progn
                        (setf c2-b2 (- (* c c) (* b b))
                              a (isqrt c2-b2))
                        (when (= c2-b2 (* a a))
                          (incf i)))))
    (print i)))
Which has an output of
11362
Fortunately, passing double-floats instead of fixnums to sqrt gives
you the right answer, but triples the amount of consing again.
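The rounding error is easy to see at the REPL (a sketch; recall that in
Common Lisp, SQRT of an integer returns a single-float by default):

```lisp
;; 5000^2 - 2015^2 = 20939775 = 4576^2 - 1, so (4576, 2015, 5000) is
;; NOT a Pythagorean triple...
(- (* 5000 5000) (* 2015 2015))   ; => 20939775
(* 4576 4576)                     ; => 20939776
;; ...but single-float SQRT rounds the square root up to a whole number:
(sqrt 20939775)                   ; => 4576.0
;; A double-float keeps enough precision to avoid the false positive:
(sqrt (float 20939775 1d0))       ; => 4575.99989... (not integral)
```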
Chris Russell wrote:
> (defun pythagorean-triplet()
> (let ((c2-b2 0) (a 0) (i 0))
> (loop for c from 2 to 5000 do
> (loop for b from 1 below c do
> (progn
> (setf c2-b2 (- (* c c) (* b b))
> a (isqrt c2-b2))
> (when (= c2-b2 (* a a))
> (incf i)))))
> (print i)))
I think this is the only Lisp posted so far that gives the same answer as
the original program. As for timings, I get:
OCaml: 0.341s
C: 0.384s
Lisp: 9.233s
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
Jon Harrop <···@ffconsultancy.com> writes:
> Chris Russell wrote:
>> (defun pythagorean-triplet()
>> (let ((c2-b2 0) (a 0) (i 0))
>> (loop for c from 2 to 5000 do
>> (loop for b from 1 below c do
>> (progn
>> (setf c2-b2 (- (* c c) (* b b))
>> a (isqrt c2-b2))
>> (when (= c2-b2 (* a a))
>> (incf i)))))
>> (print i)))
>
> I think this is the only Lisp posted so far that gives the same answer as
> the original program. As far as timings, I get:
>
> OCaml: 0.341s
> C: 0.384s
> Lisp: 9.233s
For some reason ISQRT seems to be very slow. Replacing
(isqrt c2-b2)
with
(truncate (sqrt c2-b2))
brings down the runtime on my computer from 9.000s to 0.734s, while
still yielding the same result (11362, right?). Here it is in full:
--8<---------------cut here---------------start------------->8---
(defun pythagorean-triplet ()
  (let ((c2-b2 0) (a 0) (i 0))
    (loop for c from 2 to 5000
          do (loop for b from 1 below c
                   do (progn
                        (setf c2-b2 (- (* c c) (* b b))
                              ;; a (isqrt c2-b2)
                              a (truncate (sqrt c2-b2)))
                        (when (= c2-b2 (* a a))
                          (incf i)))))
    (print i)))
--8<---------------cut here---------------end--------------->8---
As has been mentioned, this solution allocates a lot of memory,
ca. 100 MB using SBCL 1.0.5. If that could be eliminated, I assume
it'd be as fast as the C or OCaml versions. Unfortunately I don't
know enough about CL optimization to do this. None of the type
declarations I tried had any effect.
--
Wolfram Fenske
A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
On 20 May, 00:33, Wolfram Fenske <····@gmx.net> wrote:
> Jon Harrop <····@ffconsultancy.com> writes:
> > Chris Russell wrote:
> >> (defun pythagorean-triplet()
> >> (let ((c2-b2 0) (a 0) (i 0))
> >> (loop for c from 2 to 5000 do
> >> (loop for b from 1 below c do
> >> (progn
> >> (setf c2-b2 (- (* c c) (* b b))
> >> a (isqrt c2-b2))
> >> (when (= c2-b2 (* a a))
> >> (incf i)))))
> >> (print i)))
>
> > I think this is the only Lisp posted so far that gives the same answer as
> > the original program. As far as timings, I get:
>
> > OCaml: 0.341s
> > C: 0.384s
> > Lisp: 9.233s
>
> For some reason ISQRT seems to be very slow. Replacing
>
> (isqrt c2-b2)
>
> with
>
> (truncate (sqrt c2-b2))
>
> brings down the runtime on my computer from 9.000s to 0.734s, while
> still yielding the same result (11362, right?). Here it is in full:
Cool. :)
Wolfram Fenske wrote:
> For some reason ISQRT seems to be very slow. Replacing
>
> (isqrt c2-b2)
>
> with
>
> (truncate (sqrt c2-b2))
>
> brings down the runtime on my computer from 9.000s to 0.734s, while
> still yielding the same result (11362, right?).
Yes.
> As has been mentioned, this solution allocates a lot of memory,
> ca. 100 MB using SBCL 1.0.5. If that could be eliminated, I assume
> it'd be as fast as the C or OCaml versions. Unfortunately I don't
> know enough about CL optimization to do this. None of the type
> declarations I tried had any effect.
This is much faster. Now I get:
C: 0.299s
OCaml: 0.341s
Lisp: 0.454s
with better compiler options for the C code (-O3 -ffast-math).
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
> I think this is the only Lisp posted so far that gives the same answer as
> the original program. As far as timings, I get:
Well it's the only program guaranteed to keep giving the right answer
as the numbers get bigger and bigger.
This program works over the same range as the C or OCaml versions:
(defun pythagorean-triple2 ()
  (declare (optimize (speed 3) (debug 0) (safety 0)))
  (let ((i 0))
    (declare (type fixnum i))
    (loop for c from 2d0 to 5000d0
          do (loop for b from 1d0 below c
                   do (let* ((c2-b2 (- (* c c) (* b b)))
                             (a (ffloor (the double-float (sqrt c2-b2)))))
                        (when (= c2-b2 (* a a))
                          (incf i)))))
    (print i)))
But time-wise it's slower by a factor of 5 or so, probably due to the
boxing of the floats.
Evaluation took:
1.792 seconds of real time
Does anyone know how to turn this off?
Chris Russell wrote:
>> I think this is the only Lisp posted so far that gives the same answer as
>> the original program. As far as timings, I get:
>
> Well it's the only program guaranteed to keep giving the right answer
> as the numbers get bigger and bigger.
When that happens, this program will be taking almost 500 years to run. :-)
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
None of the proposed solutions is guaranteed to give the correct
solution since they all resort to using imprecise floats. I am
beginning to hate the warm fuzzies. Try this instead.
(defparameter +max+ 5000)

(defparameter *perfect-square-lookup*
  (let ((lookup (make-hash-table)))
    (loop for i from 0 to (1+ +max+)
          for i2 = (* i i)
          do (setf (gethash i2 lookup) i2))
    lookup))

(defun perfect-square (n) (gethash n *perfect-square-lookup*))

(defun pythagorean-triplet (&aux (n 0))
  (loop for c from 2 to +max+
        do (loop for b from 1 below c
                 when (perfect-square (- (* c c) (* b b)))
                   do (incf n)))
  n)
CL-USER> (time (pythagorean-triplet))
Evaluation took:
1.471 seconds of real time
1.469061 seconds of user run time
9.93e-4 seconds of system run time
0 calls to %EVAL
0 page faults and
0 bytes consed.
11362
CL-USER>
Wade
On May 20, 4:35 pm, Wade Humeniuk <········@telus.net.no.spam> wrote:
> None of the proposed solutions is guaranteed to give the correct
> solution since they all resort to using imprecise floats. I am
> beginning to hate the warm fuzzies. Try this instead.
The HyperSpec is a bit ambiguous about whether ISQRT is computed
exactly or is allowed to round through floats.
It says:
isqrt returns the greatest integer less than or equal to the *exact*
positive square root of natural.
But there is also a note at the bottom that reads:
(isqrt x) == (values (floor (sqrt x)))
but it is potentially more efficient.
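That identity in the footnote only holds when SQRT is exact; with the
default single-float SQRT the two forms can disagree. A sketch, using
c = 5000, b = 2015 from the earlier counter-example (so the argument
is 5000^2 - 2015^2 = 20939775):

```lisp
;; ISQRT works in exact integer arithmetic:
(isqrt 20939775)           ; => 4575 (exact, since 4576^2 = 20939776)
;; FLOOR of a single-float SQRT does not:
(floor (sqrt 20939775))    ; => 4576 (SQRT rounded up to 4576.0)
```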
So I looked at the SBCL source code to see what was actually going on
under the hood, and this is the relevant code.
;;; From discussion on comp.lang.lisp and Akira Kurihara.
(defun isqrt (n)
  #!+sb-doc
  "Return the root of the nearest integer less than n which is a perfect
square."
  (declare (type unsigned-byte n) (values unsigned-byte))
  ;; Theoretically (> n 7), i.e., n-len-quarter > 0.
  (if (and (fixnump n) (<= n 24))
      (cond ((> n 15) 4)
            ((> n 8) 3)
            ((> n 3) 2)
            ((> n 0) 1)
            (t 0))
      (let* ((n-len-quarter (ash (integer-length n) -2))
             (n-half (ash n (- (ash n-len-quarter 1))))
             (n-half-isqrt (isqrt n-half))
             (init-value (ash (1+ n-half-isqrt) n-len-quarter)))
        (loop
          (let ((iterated-value
                  (ash (+ init-value (truncate n init-value)) -1)))
            (unless (< iterated-value init-value)
              (return init-value))
            (setq init-value iterated-value))))))
I'd say it computes the exact integer square root, so it is also
appropriate for calculating the exact solution.
Wade Humeniuk <········@telus.net.no.spam> writes:
> None of the proposed solutions is guaranteed to give the correct
> solution since they all resort to using imprecise floats. I am
> beginning to hate the warm fuzzies. Try this instead.
From the "memory is cheap" front comes this modification that replaces
the hash table with a simple vector:
--8<---------------cut here---------------start------------->8---
(defconstant +max+ 5000)

(defun make-lookup (size)
  (let* ((lookup (make-array (* size size) :initial-element nil)))
    (dotimes (i size lookup)
      (setf (svref lookup (* i i)) t))))

(defun pythagorean-triplet ()
  (let* ((n 0)
         (lookup (make-lookup (1+ +max+))))
    (loop for c from 2 to +max+
          do (loop for b from 1 below c
                   when (svref lookup (- (* c c) (* b b)))
                     do (incf n)))
    (print n)))
--8<---------------cut here---------------end--------------->8---
This sped things up by a factor of 2 using SBCL 1.0.5. The hash table
version takes 2.173s on my machine, the vector version 0.974s.
--
Wolfram Fenske
A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
Wade Humeniuk <········@telus.net.no.spam> writes:
> Wolfram Fenske <·····@gmx.net> writes:
>
>> Wade Humeniuk <········@telus.net.no.spam> writes:
>>
>>> None of the proposed solutions is guaranteed to give the correct
>>> solution since they all resort to using imprecise floats. I am
>>> beginning to hate the warm fuzzies. Try this instead.
>>
>> From the "memory is cheap" front comes this modification that replaces
>> the hash table with a simple vector:
[...]
> In the same vein,...
>
> (defconstant +max+ 5000)
>
> (defun make-lookup (size)
> (let* ((lookup (make-array (* size size) :element-type 'bit :initial-element 0)))
> (dotimes (i size lookup)
> (setf (sbit lookup (* i i)) 1))))
[...]
> CL-USER> (time (pythagorean-triplet))
>
> 11362
> Evaluation took:
> 0.07 seconds of real time
> 0.069843 seconds of user run time
> 2.73e-4 seconds of system run time
> 0 calls to %EVAL
> 0 page faults and
> 3,126,264 bytes consed.
> 11362
What kind of machine have you got? I'm using SBCL 1.0.5 on an AMD64
2800+ or something; the disassemblies look identical, but for me
(time (pythagorean-triplet))
says:
--8<---------------cut here---------------start------------->8---
Evaluation took:
0.267 seconds of real time
0.25461 seconds of user run time
7.5e-5 seconds of system run time
0 calls to %EVAL
0 page faults and
3,126,264 bytes consed.
--8<---------------cut here---------------end--------------->8---
... which, by the way, is still *awesome* -- the best result so far,
even faster than the C version [1] I have.
Here's my slightly slower version with `+max+' as a parameter:
--8<---------------cut here---------------start------------->8---
(defun pythagorean-triplet (n)
  (declare (optimize speed (safety 0))
           (type fixnum n))
  (do ((c 2 (1+ c))
       (count 0)
       (lookup (make-lookup (1+ n))))
      ((> c n) count)
    (declare (type fixnum c count))
    (do ((b 1 (1+ b)))
        ((>= b c))
      (declare (type fixnum b))
      (unless (zerop (sbit lookup (the fixnum
                                       (- (the fixnum (* c c))
                                          (the fixnum (* b b))))))
        (incf count)))))
--8<---------------cut here---------------end--------------->8---
Timings:
--8<---------------cut here---------------start------------->8---
* (time (pythagorean-triplet 5000))
Evaluation took:
0.282 seconds of real time
0.270825 seconds of user run time
7.4e-5 seconds of system run time
0 calls to %EVAL
0 page faults and
3,126,264 bytes consed.
11362
--8<---------------cut here---------------end--------------->8---
Also faster than the C version (0.356 s) on my machine.
Footnotes:
[1] My C version doesn't use bit fields, though. It's a direct
translation of the original algorithm, using `sqrt' etc.
--
Wolfram Fenske
A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
Wolfram Fenske <·····@gmx.net> writes:
>
> What kind of machine have you got? I'm using SBCL 1.0.5 on an AMD64
> 2800+ or something, the disassemblies look identical, but for me
>
> (time (pythagorean-triplet))
>
> says:
>
> --8<---------------cut here---------------start------------->8---
> Evaluation took:
> 0.267 seconds of real time
> 0.25461 seconds of user run time
> 7.5e-5 seconds of system run time
> 0 calls to %EVAL
> 0 page faults and
> 3,126,264 bytes consed.
> --8<---------------cut here---------------end--------------->8---
>
> ... which, by the way, is still *awesome* -- the best result so far,
> even faster than the C version [1] I have.
>
I have a new MacBook Pro. It's a Centrino Dual Core 2.16GHz
with a 4MB shared L2 cache. The cache is probably the thing
that makes the difference: the bit-vector can fit in it.
SBCL 1.0.4.
Wade
Wade Humeniuk <········@telus.net.no.spam> writes:
> Wolfram Fenske <·····@gmx.net> writes:
>
>>
>> What kind of machine have you got? I'm using SBCL 1.0.5 on an AMD64
>> 2800+ or something, the disassemblies look identical, but for me
>>
>> (time (pythagorean-triplet))
>>
>> says:
>>
>> --8<---------------cut here---------------start------------->8---
>> Evaluation took:
>> 0.267 seconds of real time
>> 0.25461 seconds of user run time
>> 7.5e-5 seconds of system run time
>> 0 calls to %EVAL
>> 0 page faults and
>> 3,126,264 bytes consed.
>> --8<---------------cut here---------------end--------------->8---
>>
>> ... which, by the way, is still *awesome* -- the best result so far,
>> even faster than the C version [1] I have.
>>
>
> I have a new MacBook Pro. It's a Centrino Dual Core 2.16GHz
> with a 4MB shared L2 cache. The cache is probably the thing
> that makes the difference: the bit-vector can fit in it.
That has to be it. I knew that caches are a lot faster than main
memory, but still, this is the most impressive demonstration I've seen
so far. Whew! :-)
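The cache arithmetic checks out (a quick sketch; the byte figure is the
one reported by TIME earlier in the thread):

```lisp
;; The lookup table is one bit per integer up to 5001^2:
(* 5001 5001)           ; => 25010001 bits
(/ (* 5001 5001) 8.0)   ; => ~3126250 bytes
;; which matches the "3,126,264 bytes consed" in the TIME output and
;; fits comfortably inside a 4 MB L2 cache.
```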
--
Wolfram Fenske
A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
Wolfram Fenske <·····@gmx.net> writes:
>
> That has to be it. I knew that caches are a lot faster than main
> memory, but still, this is the most impressive demonstration I've seen
> so far. Whew! :-)
Here is a link to the processor blurb.
Intel Core™2 Duo Processor T7400 2.16GHz w/ 4MB Cache
http://www.memoryexpress.com/index.php?PageTag=&page=file&memx_menu=EmbedProductDetail.php&DisplayProductID=9503&SID=
The L2 cache on the Core 2 runs at the same speed as the processor
clock; the RAM is 667MHz, the processor 2.16GHz.
This caching effect can be very big. It explains why the UltraSPARCs
in the Sun machine we have at work are so fast (8MB caches).
Wade
I tried a modified version of your C++ code on OS X.
The C++ version is slower than the SBCL version (probably because of
caching). The Lisp version is 40% faster.
wade-humeniuks-computer:~/lisp wade$ gcc -O3 -c pyth.cc
wade-humeniuks-computer:~/lisp wade$ g++ -lstdc++ -o pyth pyth.o
wade-humeniuks-computer:~/lisp wade$ time ./pyth
11362
11362
11362
11362
11362
11362
11362
11362
11362
11362
time: 1.0300
real 0m1.052s
user 0m0.998s
sys 0m0.042s
wade-humeniuks-computer:~/lisp wade$
CL-USER> (time (loop repeat 10 do (pythagorean-triplet)))
11362
11362
11362
11362
11362
11362
11362
11362
11362
11362
Evaluation took:
0.699 seconds of real time
0.691419 seconds of user run time
0.00562 seconds of system run time
[Run times include 0.015 seconds GC run time.]
0 calls to %EVAL
0 page faults and
31,295,360 bytes consed.
NIL
CL-USER>
Wade Humeniuk <········@telus.net.no.spam> writes:
> I tried a modified version of your C++ code on OS X.
>
> The C++ version is slower than the SBCL version (probably because of
> caching). The Lisp version is 40% faster.
>
Forgot the code.
#include <time.h>
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <vector>

using namespace std;

int main(int argc, char *argv[]) {
  const int n = argc == 2 ? atoi(argv[1]) : 5000;
  clock_t start = clock();
  int i = 0;
  for (int loop = 0; loop < 10; loop++) {
    i = 0;
    vector<bool> lookup(n*n);
    for (int j = 0; j < n; j++) lookup[j*j] = true;
    for (int c = 2; c <= n; c++) {
      for (int b = 1; b < c; b++) {
        if (lookup[c*c - b*b]) i++;
      }
    }
    printf("%i\n", i);
  }
  clock_t end = clock();
  float time = float(end-start) / float(CLOCKS_PER_SEC);
  printf("time: %0.04f", time);
  return 0;
}
> wade-humeniuks-computer:~/lisp wade$ gcc -O3 -c pyth.cc
> wade-humeniuks-computer:~/lisp wade$ g++ -lstdc++ -o pyth pyth.o
> wade-humeniuks-computer:~/lisp wade$ time ./pyth
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> time: 1.0300
> real 0m1.052s
> user 0m0.998s
> sys 0m0.042s
> wade-humeniuks-computer:~/lisp wade$
>
> CL-USER> (time (loop repeat 10 do (pythagorean-triplet)))
>
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> 11362
> Evaluation took:
> 0.699 seconds of real time
> 0.691419 seconds of user run time
> 0.00562 seconds of system run time
> [Run times include 0.015 seconds GC run time.]
> 0 calls to %EVAL
> 0 page faults and
> 31,295,360 bytes consed.
> NIL
> CL-USER>
Wade Humeniuk wrote:
> I tried a modified version of your C++ code on OS X.
>
> The C++ version is slower than the SBCL version (probably because of
> caching). The Lisp version is 40% faster.
>
> wade-humeniuks-computer:~/lisp wade$ gcc -O3 -c pyth.cc
> wade-humeniuks-computer:~/lisp wade$ g++ -lstdc++ -o pyth pyth.o
> wade-humeniuks-computer:~/lisp wade$ time ./pyth
Yes, usually gcc is slower than Microsoft Visual C++ :-)
--
Frank Buss, ··@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
On May 20, 9:11 pm, Wade Humeniuk <········@telus.net.no.spam> wrote:
> (defconstant +max+ 5000)
>
> (defun make-lookup (size)
> (let* ((lookup (make-array (* size size) :element-type 'bit :initial-element 0)))
> (dotimes (i size lookup)
> (setf (sbit lookup (* i i)) 1))))
>
> (defun pythagorean-triplet ()
> (declare (optimize (speed 3) (safety 0) (debug 0)))
> (let* ((n 0)
> (lookup (make-lookup (1+ +max+))))
> (declare (fixnum n))
> (loop for c from 2 to +max+ do
> (loop for b from 1 below c
> unless (zerop (sbit lookup (- (* c c) (* b b))))
> do (incf n)))
> (print n)))
I like it. I made a small tweak to take a parameter (I don't like
constants for things like this):
(defun count-triples-slow (hypotenuse-max)
  (declare (type (integer 5 #.(isqrt array-dimension-limit))
                 hypotenuse-max)
           (optimize (speed 3) (safety 0) (debug 0)))
  (let ((lookup (make-lookup (1+ hypotenuse-max))))
    (loop for c from 2 to hypotenuse-max
          sum (loop for b from 1 below c
                    sum (sbit lookup (- (* c c) (* b b)))))))
CL-USER> (time (count-triples-slow 5000))
Evaluation took:
0.474 seconds of real time
0.476029 seconds of user run time
0.0 seconds of system run time
0 page faults and
3,126,264 bytes consed.
11362
Note, however, that this algorithm finds all ordered triples for each
hypotenuse. For example, it counts both (3, 4, 5) and (4, 3, 5). So
we only have to search to the point where b^2 is half of c^2 and then
double the count.
(defun count-triples-slow (hypotenuse-max)
  (declare (type (integer 5 #.(isqrt array-dimension-limit))
                 hypotenuse-max)
           (optimize (speed 3) (safety 0) (debug 0)))
  (let ((lookup (make-lookup (1+ hypotenuse-max))))
    (* 2 (loop for c from 2 to hypotenuse-max
               sum (loop for b from 1 to (isqrt (truncate (* c c) 2))
                         sum (sbit lookup (- (* c c) (* b b))))))))
CL-USER> (time (count-triples-slow 5000))
Evaluation took:
0.226 seconds of real time
0.220013 seconds of user run time
0.008001 seconds of system run time
0 page faults and
3,126,264 bytes consed.
11362
I then took a look at the macroexpansion of the loop code and noted
that there were some inefficiencies in there. I pulled the two loops
apart into separate functions and made them into tail-recursive
algorithms. I also made some pretty restrictive type definitions.
(The lookup table must fit into an array, so the maximum hypotenuse
must be the square root of array-dimension-limit. Also, the number of
candidate triples below a given hypotenuse length is a triangular
number, so the type for the count of pythagorean triples reflects the
closed formula for triangle number n, (n*(n+1)/2).)
(defun make-lookup (size)
  (let ((lookup (make-array (* size size) :element-type 'bit
                            :initial-element 0)))
    (dotimes (i size lookup)
      (setf (sbit lookup (* i i)) 1))))

(deftype triple-count ()
  `(integer 0 ,(truncate (- array-dimension-limit
                            (isqrt array-dimension-limit))
                         2)))

(defun triples-with-hypotenuse (hypotenuse lookup)
  (declare (type (integer 5 #.(isqrt array-dimension-limit))
                 hypotenuse)
           (type simple-bit-vector lookup)
           (optimize (speed 3)))
  (let ((c2 (* hypotenuse hypotenuse)))
    (labels ((doit (b sum)
               (declare (type (integer 0 #.(isqrt (truncate array-dimension-limit 2)))
                              b)
                        (type triple-count sum))
               (if (zerop b)
                   sum
                   (doit (1- b) (+ sum (sbit lookup (- c2 (* b b))))))))
      (doit (isqrt (truncate c2 2)) 0))))

(defun count-triples-slow (hypotenuse-max)
  (declare (type (integer 5 #.(isqrt array-dimension-limit))
                 hypotenuse-max)
           (optimize (speed 3)))
  (let ((lookup (make-lookup (1+ hypotenuse-max))))
    (labels ((doit (c sum)
               (declare (type (integer 0 #.(isqrt array-dimension-limit)) c)
                        (type triple-count sum))
               (if (zerop c)
                   sum
                   (doit (1- c)
                         (+ sum (the triple-count
                                     (triples-with-hypotenuse c lookup)))))))
      (* 2 (doit hypotenuse-max 0)))))
CL-USER> (time (count-triples-slow 5000))
Evaluation took:
0.184 seconds of real time
0.184011 seconds of user run time
0.0 seconds of system run time
0 page faults and
3,126,264 bytes consed.
11362
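The halve-and-double trick above relies on there being no solutions with
a = b (c = b*sqrt(2) is never an integer), so each unordered pair is
counted exactly once below the midpoint. A small self-contained check
using exact ISQRT (a sketch; c only runs to 100 here to keep it quick):

```lisp
;; Count ordered pairs (a, b) with a^2 + b^2 = c^2 two ways: the full
;; scan over b < c, and the half scan (b^2 <= c^2/2) doubled.
(flet ((squarep (n) (= n (* (isqrt n) (isqrt n)))))
  (let ((full 0) (half 0))
    (loop for c from 2 to 100
          do (loop for b from 1 below c
                   when (squarep (- (* c c) (* b b))) do (incf full)))
    (loop for c from 2 to 100
          do (loop for b from 1 to (isqrt (truncate (* c c) 2))
                   when (squarep (- (* c c) (* b b))) do (incf half)))
    (list full (* 2 half))))   ; the two counts agree
```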
Wade Humeniuk wrote:
> None of the proposed solutions is guaranteed to give the correct
> solution since they all resort to using imprecise floats.
You mean if the program is run for years it will eventually suffer from
imprecision? I think we have more important things to worry about...
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
j.oke wrote:
> If you can live with compromises (in order to give you some
> superficial(!) advantages), you might consider to switch to some other
> language...
What if the superficial disadvantage is "Lisp lost me a $1M trade because it
hung whilst garbage collecting a big int I never needed" or "nobody is
buying my computer game because Lisp has generalised the player coordinates
to arbitrary-precision numbers and crippled the frame rate"?
The fact is, Lisp is slow and it puts people off.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
Jon Harrop <···@ffconsultancy.com> writes:
> j.oke wrote:
> > If you can live with compromises (in order to give you some
> > superficial(!) advantages), you might consider to switch to some other
> > language...
>
> What if the superficial disadvantage is "Lisp lost me a $1M trade because it
> hung whilst garbage collecting a big int I never needed"
Garbage collection is something done by the Sun JVM and the Microsoft
CLR. Why don't you worry about the billions they are losing, instead
of the paltry millions we're losing? Sounds like a real business
opportunity for you.
Or is it possible that this is just a made up problem and that
ordinary quality assurance is responsible for catching it? Do things
get through QA? Sure. But that doesn't make it the responsibility of
the tool.
Incidentally, there's also an unspoken presupposition in this post that
somehow if it had been done in some other language (I can't imagine
which you were thinking of), this wouldn't happen.
What would happen if I lost a million dollars because my statically
type-checked application nevertheless yielded a bad result?
(Oops. It turns out that some design errors still yield effects at
runtime.) I hear you saying "That would just be bad design." or
"This would be caught by QA, and too bad if someone doesn't QA their
application." As if somehow those reasons aren't equally true of
dynamic languages.
The difference is that dynamic language users tend to know they have
to test their programs and make them robust against bad data. Too
often in static languages, I've seen people assume that after all the
hassle of getting something to compile, they have an entitlement for
it to just work... and then it fails mysteriously at runtime because
they forget that it IS possible for statically checked programs to
have runtime errors.
I've seen plenty of statically typed applications crash mysteriously
on null errors or other bad-data errors (e.g., a list of homogeneously
typed foo elements was expected to be of known length, so was declared
[foo], when really it had a type limitation no one encoded, so it
slipped past the type checker but produced bad data something else
never expected because it thought the upstream facility was strongly
checking everything).
> or "nobody is buying my computer game because Lisp has generalised
> the player coordinates to arbitrary-precision numbers and crippled
> the frame rate"?
Wow, it got all the way through to release without needing to be tested
and ran on the first try, just a little slowly? Cool.
Does that happen a lot in ML? Someone writes an entire game application
and never tests it, then compiles it and runs it and it plays fine
AND fast on the first try?
Or is it possible that this is just a completely contrived example
not typical of anything?
> The fact is, Lisp is slow and it puts people off.
If you're so put off, why are you still posting here?
Kent M Pitman wrote:
> Incidentally, there's also an unspoken presupposition in this post that
> somehow if it had been done in some other language...
You've really given up on doing this in Lisp? I had expected to get a
competitive solution with only a few more days work... ;-)
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
(message (Hello 'Jon)
(you :wrote :on '(Sun, 20 May 2007 21:07:50 +0100))
(
??>> Incidentally, there's also an unspoken presupposition in this post
??>> that somehow if it had been done in some other language...
JH> You've really given up on doing this in Lisp? I had expected to get a
JH> competitive solution with only a few more days work... ;-)
why do you discard competitive solution written in Scheme by Kjetil?
and why do you need these benchmarks? there are already lots of them. e.g.
http://shootout.alioth.debian.org/debian/benchmark.php?test=all&lang=sbcl&lang2=ocaml
you can find SBCL there beats ocaml in 5 benchmarks, go better optimize
ocaml code..
(OTOH ocaml beats SBCL in 10 benchmarks, but very slightly, maybe it's just
because of 13x larger startup time of SBCL)
btw, you've mentioned GC time. binary-trees is a benchmark for GC made by
Hans Boehm. SBCL is 1.4 times faster than ocaml in it, and on par with C.
)
(With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
"I am everything you want and I am everything you need")
Alex Mizrahi wrote:
> why do you discard the competitive solution written in Scheme by Kjetil?
I haven't discarded it: I just haven't gotten around to looking at it yet
(Stalin doesn't run in 64-bit AFAIK).
> and why do you need these benchmarks?
We're just playing. :-)
> there are already lots of them. e.g.
>
>
http://shootout.alioth.debian.org/debian/benchmark.php?test=all&lang=sbcl&lang2=ocaml
>
> you can find SBCL there beats ocaml in 5 benchmarks, go better optimize
> ocaml code..
My machine gives very different results: OCaml is only slower on
binary-trees.
> (OTOH ocaml beats SBCL in 10 benchmarks, but very slightly, maybe it's
> just because of 13x larger startup time of SBCL)
>
> btw, you've mentioned GC time. binary-trees is a benchmark for GC made by
> Hans Boehm. SBCL is 1.4 times faster than ocaml in it, and on par with C.
The benchmark definition is poor: about two years ago I submitted several
progressively faster OCaml implementations that took the run time down
toward zero seconds, and they all got rejected.
The same optimizations can be applied to the Lisp, of course. So you aren't
really measuring anything.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
On May 20, 8:01 pm, Kent M Pitman <······@nhplace.com> wrote:
> Jon Harrop <····@ffconsultancy.com> writes:
>
> > or "nobody is buying my computer game because Lisp has generalised
> > the player coordinates to arbitrary-precision numbers and crippled
> > the frame rate"?
>
> Wow, it got all the way through to release without needing to be tested
> and ran on the first try, just a little slowly? Cool.
>
> Does that happen a lot in ML? Someone writes an entire game application
> and never tests it, then compiles it and runs it and it plays fine
> AND fast on the first try?
>
> Or is it possible that this is just a completely contrived example
> not typical of anything?
>
> > The fact is, Lisp is slow and it puts people off.
>
> If you're so put off, why are you still posting here?
If he planned to make a game, he would use a premade engine, either a
commercial or an open-source one. Making your own engine is better
left to people who have enough time, knowledge, and a good team to
help them. People with those talents don't spend time trolling in
forums; they build their engine. Where are we going to see an engine
built in OCaml or F#?
And Jon, you'd better take my advice and write some demos with
Irrlicht .NET to show us how good F# is. It'll give you some
necessary credentials to speak on this subject.
In article <························@ptn-nntp-reader02.plus.net>,
Jon Harrop <···@ffconsultancy.com> wrote:
> j.oke wrote:
> > If you can live with compromises (in order to give you some
> > superficial(!) advantages), you might consider to switch to some other
> > language...
>
> What if the superficial disadvantage is "Lisp lost me a $1M trade because it
> hung whilst garbage collecting a big int I never needed" or "nobody is
> buying my computer game because Lisp has generalised the player coordinates
> to arbitrary-precision numbers and crippled the frame rate"?
>
> The fact is, Lisp is slow and it puts people off.
___________________________
/| /| | |
||__|| | Please don't |
/ O O\__ feed |
/ \ the trolls |
/ \ \ |
/ _ \ \ ----------------------
/ |\____\ \ ||
/ | | | |\____/ ||
/ \|_|_|/ | __||
/ / \ |____| ||
/ | | /| | --|
| | |// |____ --|
* _ | |_|_|_| | \-/
*-- _--\ _ \ // |
/ _ \\ _ // | /
* / \_ /- | - | |
* ___ c_c_c_C/ \C_c_c_c____________
--
http://lispm.dyndns.org
On May 20, 7:46 pm, Rainer Joswig <······@lisp.de> wrote:
> In article <························@ptn-nntp-reader02.plus.net>,
> Jon Harrop <····@ffconsultancy.com> wrote:
>
> > j.oke wrote:
> > > If you can live with compromises (in order to give you some
> > > superficial(!) advantages), you might consider to switch to some other
> > > language...
>
> > What if the superficial disadvantage is "Lisp lost me a $1M trade because it
> > hung whilst garbage collecting a big int I never needed" or "nobody is
> > buying my computer game because Lisp has generalised the player coordinates
> > to arbitrary-precision numbers and crippled the frame rate"?
>
> > The fact is, Lisp is slow and it puts people off.
>
> [ASCII-art "Please don't feed the trolls" sign snipped]
>
> --http://lispm.dyndns.org
Sorry Rainer, but I really hate it when people talk trash about game
development; besides, I find this troll quite interesting.
fireblade wrote:
> On May 20, 7:46 pm, Rainer Joswig <······@lisp.de> wrote:
>> Jon Harrop <····@ffconsultancy.com> wrote:
>>> The fact is, Lisp is slow and it puts people off.
>> [ASCII-art "Please don't feed the trolls" sign snipped]
>>
>> --http://lispm.dyndns.org
>
> Sorry Rainer, but I really hate it when people talk trash about game
> development; besides, I find this troll quite interesting.
I don't. I find him very boring and upsetting, and I appreciate
when people call out trolls for what they are.
The fact is that Lisp is a dynamically typed language, and dynamic
type checking takes time. That's not an argument against Lisp
any more than dozens of other popular languages. Even less so,
in fact, because Lisp has optional type declarations.
According to Peter Norvig and Greg Sullivan, the fact is that
dynamic typing is more expressive than static typing. Statically
typed languages don't have that expressiveness, and some people
are "put off" by that.
The fact is that dynamic typing is better for RAD. Statically
typed languages tend to quibble over minor type issues, and
some people are "put off" by that.
The fact is that trolls are disruptive and unhelpful, and *many*
people are put off by that.
--
Dan
www.prairienet.org/~dsb/
On May 21, 2:37 pm, Dan Bensen <··········@cyberspace.net> wrote:
> fireblade wrote:
> > On May 20, 7:46 pm, Rainer Joswig <······@lisp.de> wrote:
> >> Jon Harrop <····@ffconsultancy.com> wrote:
> >>> The fact is, Lisp is slow and it puts people off.
> >> [ASCII-art "Please don't feed the trolls" sign snipped]
>
> >> --http://lispm.dyndns.org
>
> > Sorry Rainer, but I really hate it when people talk trash about game
> > development; besides, I find this troll quite interesting.
>
> I don't. I find him very boring and upsetting, and I appreciate
> when people call out trolls for what they are.
>
> The fact is that Lisp is a dynamically typed language, and dynamic
> type checking takes time. That's not an argument against Lisp
> any more than dozens of other popular languages. Even less so,
> in fact, because Lisp has optional type declarations.
>
> According to Peter Norvig and Greg Sullivan, the fact is that
> dynamic typing is more expressive than static typing. Statically
> typed languages don't have that expressiveness, and some people
> are "put off" by that.
>
> The fact is that dynamic typing is better for RAD. Statically
> typed languages tend to quibble over minor type issues, and
> some people are "put off" by that.
>
> The fact is that trolls are disruptive and unhelpful, and *many*
> people are put off by that.
>
Most people still believe in the waterfall method, planning everything
from the start. Layered architecture, the field where embedded languages
fit perfectly, is of little use to someone who prefers a big-ball-of-mud
architecture.
Dan Bensen wrote:
> I find him very boring and upsetting
Please don't be upset by what I say. I shall try to be more diplomatic.
> The fact is that dynamic typing is better for RAD.
Dynamic typing can certainly be better for RAD under certain circumstances
(e.g. interop, GUI dev) but not all.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
Jon Harrop wrote:
> Dan Bensen wrote:
>> I find him very boring and upsetting
> I shall try to be more diplomatic.
Could you try to be rational?
That's all I really care about.
--
Dan
www.prairienet.org/~dsb/
In article <························@36g2000prm.googlegroups.com>,
fireblade <·················@gmail.com> wrote:
> On May 20, 7:46 pm, Rainer Joswig <······@lisp.de> wrote:
> > In article <························@ptn-nntp-reader02.plus.net>,
> > Jon Harrop <····@ffconsultancy.com> wrote:
> >
> > > j.oke wrote:
> > > > If you can live with compromises (in order to give you some
> > > > superficial(!) advantages), you might consider to switch to some other
> > > > language...
> >
> > > What if the superficial disadvantage is "Lisp lost me a $1M trade because it
> > > hung whilst garbage collecting a big int I never needed" or "nobody is
> > > buying my computer game because Lisp has generalised the player coordinates
> > > to arbitrary-precision numbers and crippled the frame rate"?
> >
> > > The fact is, Lisp is slow and it puts people off.
> >
> > [ASCII-art "Please don't feed the trolls" sign snipped]
> >
> > --http://lispm.dyndns.org
>
> Sorry Rainer, but I really hate it when people talk trash about game
> development; besides, I find this troll quite interesting.
Then discuss game development or other stuff via email with him.
--
http://lispm.dyndns.org
On May 21, 2:39 pm, Rainer Joswig <······@lisp.de> wrote:
> In article <························@36g2000prm.googlegroups.com>,
> fireblade <·················@gmail.com> wrote:
> > On May 20, 7:46 pm, Rainer Joswig <······@lisp.de> wrote:
> > > In article <························@ptn-nntp-reader02.plus.net>,
> > > Jon Harrop <····@ffconsultancy.com> wrote:
>
> > > > j.oke wrote:
> > > > > If you can live with compromises (in order to give you some
> > > > > superficial(!) advantages), you might consider to switch to some other
> > > > > language...
>
> > > > What if the superficial disadvantage is "Lisp lost me a $1M trade because it
> > > > hung whilst garbage collecting a big int I never needed" or "nobody is
> > > > buying my computer game because Lisp has generalised the player coordinates
> > > > to arbitrary-precision numbers and crippled the frame rate"?
>
> > > > The fact is, Lisp is slow and it puts people off.
>
> > > [ASCII-art "Please don't feed the trolls" sign snipped]
>
> > > --http://lispm.dyndns.org
>
> > Sorry Rainer, but I really hate it when people talk trash about game
> > development; besides, I find this troll quite interesting.
>
> Then discuss game development or other stuff via email with him.
>
> --http://lispm.dyndns.org
>
Ok
On May 20, 7:19 pm, Jon Harrop <····@ffconsultancy.com> wrote:
> j.oke wrote:
> > If you can live with compromises (in order to give you some
> > superficial(!) advantages), you might consider to switch to some other
> > language...
>
> What if the superficial disadvantage is "Lisp lost me a $1M trade because it
> hung whilst garbage collecting a big int I never needed" or "nobody is
> buying my computer game because Lisp has generalised the player coordinates
> to arbitrary-precision numbers and crippled the frame rate"?
>
> The fact is, Lisp is slow and it puts people off.
>
> --
> Dr Jon D Harrop, Flying Frog Consultancy
Jon, you never made any game, so you don't know whether it is slow or
not. You never even made a game with OCaml to be able to speak about
it. The largest thing I ever made, a racing demo of ~20,000 lines with
Lightfeather and Newton handling the physics, showed that unoptimised
Lisp is about 30% slower, something that people could live with
considering the ease of development with Lisp. And BTW, making a good
3D game is the hardest thing I ever tackled. People develop games with
C/C++/C# because that's what the GDKs and game engines support. Some
use higher-level languages for scripting or develop something in-house,
but that's another topic.
Dr Jon D Harrop of Flying F*** Consultancy finally hauls out this
well-worn saw:
> The fact is, Lisp is slow and it puts people off.
If I had a nickel for every time I heard this...
I think this counts as a lisp-specific instance of Godwin's law.
Anyone remember `Ahmed the Peaceman'?
Apparently you don't read what I post. In a conversation you
participated in on comp.lang.lisp just a bit over a year ago I wrote:
Allow me to point you at the Supercomputer Toolkit:
http://repository.readscheme.org/ftp/papers/berlinsurati-pepm94.pdf
``The Supercomputer Toolkit compiler performs partial evaluation
of data-independent programs expressed in the Scheme
dialect of Lisp by using the symbolic execution technique described
in previously published work by Berlin.''
Jacob Katzenelson at Technion has apparently used the Supercomputer
Toolkit to run weather prediction models:
``Conservative estimates performed for a typical mesoscale forecast
over the eastern Mediterranean (for 36 hour) suggest that were the
Toolkit constructed from ALPHA processors, 10 processors would do the
prediction in only about 13 minutes. A 36 hours simulation with full
physics for the whole earth will require 2 hours for 80 ALPHA
processors.''
If Lisp is slow, why then are MIT researchers using it for
computation-intensive programs?
What sort of crappy education are they giving at Robinson College
these days? It only takes a few seconds on Google to find enough
papers on high-performance computing in Lisp to make a decent text
book! If I were you I'd ask Robinson for my money back. It seems
they left you as ignorant as when you entered and certainly didn't
train you to change that sad situation.
Next time you feel like uttering a sentence with the word `fact' in it
(or `most people' or `vast majority', etc.) just type it into that
little rectangular box at the top of your browser and *find out* if
there is any truth to it *before* you go spouting off.
Joe Marshall <··········@gmail.com> writes:
> Allow me to point you at the Supercomputer Toolkit:
> http://repository.readscheme.org/ftp/papers/berlinsurati-pepm94.pdf
>
> ``The Supercomputer Toolkit compiler performs partial evaluation
> of data-independent programs expressed in the Scheme
> dialect of Lisp by using the symbolic execution technique described
> in previously published work by Berlin.''
Thanks for posting this, it was extremely interesting. I do a lot of
numerical work (slowly migrating to Lisp), and I would like to learn
more about these techniques. For example, is it Scheme specific or
could this be done in CL? What would be a good place to start?
Thanks,
Tamas
>
>
> Jacob Katzenelson at Technion has apparently used the Supercomputer
> Toolkit to run weather prediction models:
>
>
> ``Conservative estimates performed for a typical mesoscale forecast
> over the eastern Mediterranean (for 36 hour) suggest that were the
> Toolkit constructed from ALPHA processors, 10 processors would do the
> prediction in only about 13 minutes. A 36 hours simulation with full
> physics for the whole earth will require 2 hours for 80 ALPHA
> processors.''
>
> If Lisp is slow, why then are MIT researchers using it for computation-
> intensive programs?
>
> What sort of crappy education are they giving at Robinson College
> these days? It only takes a few seconds on Google to find enough
> papers on high-performance computing in Lisp to make a decent text
> book! If I were you I'd ask Robinson for my money back. It seems
> they left you as ignorant as when you entered and certainly didn't
> train you to change that sad situation.
>
> Next time you feel like uttering a sentence with the word `fact' in it
> (or `most people' or `vast majority', etc.) just type it into that
> little rectangular box at the top of your browser and *find out* if
> there is any truth to it *before* you go spouting off.
On May 21, 1:10 pm, Tamas Papp <······@gmail.com> wrote:
> Joe Marshall <··········@gmail.com> writes:
> > Allow me to point you at the Supercomputer Toolkit:
> >http://repository.readscheme.org/ftp/papers/berlinsurati-pepm94.pdf
>
> > ``The Supercomputer Toolkit compiler performs partial evaluation
> > of data-independent programs expressed in the Scheme
> > dialect of Lisp by using the symbolic execution technique described
> > in previously published work by Berlin.''
>
> Thanks for posting this, it was extremely interesting. I do a lot of
> numerical work (slowly migrating to Lisp), and I would like to learn
> more about these techniques. For example, is it Scheme specific or
> could this be done in CL? What would be a good place to start?
It can be done in Common Lisp. Jeffrey Mark Siskind has a partial
evaluator for Common Lisp at
http://cobweb.ecn.purdue.edu/~qobi/software.html
You should also check out this paper:
@article{ fateman95fast,
author = "Richard J. Fateman and Kevin A. Broughan and Diane K.
Willcock and Duane Rettig",
title = "Fast Floating Point Processing in {Common Lisp}",
journal = "ACM Transactions on Mathematical Software",
volume = "21",
number = "1",
pages = "26--62",
year = "1995",
url = "citeseer.ist.psu.edu/fateman95fast.html" }
Two of the authors have been known to post to this group.
A lot of this work is still very experimental, so be prepared to poke
around.
Joe Marshall wrote:
> Dr Jon D Harrop of Flying F*** Consultancy finally hauls out this well-
> worn saw:
>> The fact is, Lisp is slow and it puts people off.
>
> Allow me to point you at the Supercomputer Toolkit:
> http://repository.readscheme.org/ftp/papers/berlinsurati-pepm94.pdf
Allow me to point you at some benchmarks made this century that run on
platforms that aren't dead:
http://www.ffconsultancy.com/languages/ray_tracer/results.html
> It only takes a few seconds on Google to find enough
> papers on high-performance computing in Lisp to make a decent text
> book!
Yeah, your new book could include this paper from 1983 as well:
http://portal.acm.org/citation.cfm?id=801672&coll=portal&dl=ACM
> It seems
> they left you as ignorant as when you entered and certainly didn't
> train you to change that sad situation.
I can't believe you haven't reached for the stereotypical Lisper's
performance comparison with Java. You must be getting slow in your old age.
> Next time you feel like uttering a sentence with the word `fact' in it
> (or `most people' or `vast majority', etc.) just type it into that
> little rectangular box at the top of your browser and *find out* if
> there is any truth to it *before* you go spouting off.
<sarcasm>
I hear Lisp is the new black at high-performance supercomputing conferences.
Not to mention the fact that performance only matters when you're coding
for a supercomputer...
</sarcasm>
ROTFLMAO.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
j.oke wrote:
> Now we get:
>
> OCaml: 0.341s [not tested]
> C: 0.384s [not tested]
> Lisp: 0.0s [tested!]
Can you post your code so that we can benchmark compile times?
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
On Sat, 19 May 2007, Jon Harrop wrote:
> Chris Russell wrote:
>> (defun pythagorean-triplet()
>> (let ((c2-b2 0) (a 0) (i 0))
>> (loop for c from 2 to 5000 do
>> (loop for b from 1 below c do
>> (progn
>> (setf c2-b2 (- (* c c) (* b b))
>> a (isqrt c2-b2))
>> (when (= c2-b2 (* a a))
>> (incf i)))))
>> (print i)))
>
> I think this is the only Lisp posted so far that gives the same answer as
> the original program. As far as timings, I get:
>
> OCaml: 0.341s
> C: 0.384s
> Lisp: 9.233s
C: 0.604s
Scheme: 0.478s
scheme
------
"
(define (pythag max)
(let ((i 0))
(let loop ((c 2))
(if (<= c max)
(begin
(let loop ((b 1))
(if (< b c)
(begin
(let ((a (sqrt (- (* c c) (* b b)))))
(if (= (floor a) a)
(set! i (+ 1 i))))
(loop (+ 1 b)))))
(loop (+ 1 c)))))
(display i)
(newline)))
(pythag 5000)
"
stalin -On -copt -O3 -copt -ffast-math pythag.scm
time ./pythag
C
-
"
#include <stdio.h>
#include <math.h>
#define MAX 5000
int main() {
double c, b;
int i = 0;
double a;
for (c = 2; c <= MAX; c++) {
for (b = 1; b < c; b++) {
a = sqrt(c*c - b*b);
if (trunc(a) == a) {
i++;
}
}
}
printf("%d\n", i);
}
"
gcc -O3 -ffast-math pythag_c.c -lm
time ./a.out
On Sun, 20 May 2007, Kjetil S. Matheussen wrote:
>
>
> On Sat, 19 May 2007, Jon Harrop wrote:
>
>> Chris Russell wrote:
>> > (defun pythagorean-triplet()
>> > (let ((c2-b2 0) (a 0) (i 0))
>> > (loop for c from 2 to 5000 do
>> > (loop for b from 1 below c do
>> > (progn
>> > (setf c2-b2 (- (* c c) (* b b))
>> > a (isqrt c2-b2))
>> > (when (= c2-b2 (* a a))
>> > (incf i)))))
>> > (print i)))
>>
>> I think this is the only Lisp posted so far that gives the same answer as
>> the original program. As far as timings, I get:
>>
>> OCaml: 0.341s
>> C: 0.384s
>> Lisp: 9.233s
>
> C: 0.604s
> Scheme: 0.478s
>
Further experimentation shows:
C: 0.620s (median of 5)
Scheme/Lisp: 0.408s (median of 5)
gcc -O3 -ffast-math -fomit-frame-pointer -freg-struct-return
-malign-double pythag_c.c -lm
../stalin-0.11/stalin -Ob -Om -On -Or -Ot -dP -dC -dh -q -d -db
-no-clone-size-limit -split-even-if-no-widening -copt -O3
-copt -fomit-frame-pointer -copt -freg-struct-return -copt -ffast-math
pythag.scm
On May 19, 4:38 pm, Emre Sevinc <···········@gmail.com> wrote:
> (defconstant +max+ 5000
> "The constant value up to which the main loop will run")
>
> (defun pythagorean-triplet()
> (let ((a 0) (i 0))
> (loop for c from 2 to +max+
> do
> (loop for b from 1 below c
> do (progn
> (setf a (sqrt (- (* c c) (* b b))))
> (when (= (floor a) a) (incf i)))))
> (print i)))
I'm not sure but maybe it would also be possible to optimize the
algorithm to skip triples with bogus parities.
(defun pythagorean-triplet (lim)
  (loop for c from 2 to (the fixnum lim)
        for cc = (* c c)
        for cp = (mod c 2)
        sum (loop for b from 1 below c
                  for bp = (mod b 2)
                  for aa = (- cc (* b b))
                  for a = (sqrt aa)
                  for ap = (mod aa 2)
                  count (and (or (= 0 ap bp cp)              ; E + E = E
                                 (and (= 1 ap bp) (= 0 cp))  ; O + O = E
                                 (and (= 1 ap cp) (= 0 bp))  ; O + E = O
                                 (and (= 1 bp cp) (= 0 ap))) ; E + O = O
                             (= (floor a) a)))))
It seems like the parity tests cost much more than they help.
Maybe others know how to optimize them anyway. (The parity tests
reduce the example space to half its size.)
Regards.
Emre Sevinc <···········@gmail.com> wrote:
> (defun pythagorean-triplet()
> (let ((a 0) (i 0))
> (loop for c from 2 to +max+
> do
> (loop for b from 1 below c
> do (progn
> (setf a (sqrt (- (* c c) (* b b))))
> (when (= (floor a) a) (incf i)))))
> (print i)))
(deftype small-positive-double-float ()
`(double-float 0d0 ,(float most-positive-fixnum 0d0)))
(defun pythagorean-triplet ()
(declare (optimize speed (safety 0)))
(loop for c from 2 to 5000
sum (loop for b from 1 below c
count (let ((a (the small-positive-double-float
(sqrt (float (- (* c c) (* b b)) 0d0)))))
(= (float (truncate a) 0d0) a)))))
There are a few things to note here:
* SBCL can open-code SQRT on values known to be positive floats (for
example on x86-64 into a single sqrtsd instruction).
* SBCL will only open-code TRUNCATE (or FLOOR) on floats when the
result is known to fit into a machine word.
* SBCL won't open-code = on arguments that aren't known to be of the
same types.
CL-USER> (time (print (pythagorean-triplet)))
11362
Evaluation took:
0.16 seconds of real time
0.159975 seconds of user run time
0.0 seconds of system run time
0 calls to %EVAL
0 page faults and
8,192 bytes consed.
CL-USER> (values (lisp-implementation-type)
(lisp-implementation-version)
(machine-type))
"SBCL"
"1.0.5.54"
"X86-64"
--
Juho Snellman
Juho Snellman wrote:
> (deftype small-positive-double-float ()
> `(double-float 0d0 ,(float most-positive-fixnum 0d0)))
>
> (defun pythagorean-triplet ()
> (declare (optimize speed (safety 0)))
> (loop for c from 2 to 5000
> sum (loop for b from 1 below c
> count (let ((a (the small-positive-double-float
> (sqrt (float (- (* c c) (* b b))
> 0d0)))))
> (= (float (truncate a) 0d0) a)))))
This is the fastest Lisp so far and its run time is faster than everything
except the macro solution that j.oke hinted at.
So this is clearly doing significant work at compile time.
For a fair comparison, you either need to include compilation time or not
hardcode the 5000.
Including compile times I get:
C: 0.399s
OCaml: 0.478s
SBCL: 0.992s
Tweaking the programs to accept N as an argument and then measuring only run
times, I get:
C: 0.291s
OCaml: 0.350s
SBCL: 0.885s
So Lisp is around 3x slower than C on this benchmark. OCaml is only 20%
slower.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
Jon Harrop <···@ffconsultancy.com> writes:
> Juho Snellman wrote:
>> (deftype small-positive-double-float ()
>> `(double-float 0d0 ,(float most-positive-fixnum 0d0)))
>>
>> (defun pythagorean-triplet ()
>> (declare (optimize speed (safety 0)))
>> (loop for c from 2 to 5000
>> sum (loop for b from 1 below c
>> count (let ((a (the small-positive-double-float
>> (sqrt (float (- (* c c) (* b b))
>> 0d0)))))
>> (= (float (truncate a) 0d0) a)))))
>
> This is the fastest Lisp so far and its run time is faster than everything
> except the macro solution that j.oke hinted at.
>
> So this is clearly doing significant work at compile time.
No, it isn't. All it does is use some type declarations that help
SBCL avoid boxing and use inline arithmetic where possible.
> For a fair comparison, you either need to include compilation time
No, since nothing is pre-computed during compilation time.
> or not hardcode the 5000.
[...]
> Tweaking the programs to accept N as an argument and then measuring
> only run times, I get:
[...]
> So Lisp is around 3x slower than C from this benchmark. OCaml is
> only 20% slower.
I'm assuming the first few lines of your tweaked version look like
this?
--8<---------------cut here---------------start------------->8---
(defun pythagorean-triplet (n)
(declare (optimize speed (safety 0)))
(loop for c from 2 to n
...))
--8<---------------cut here---------------end--------------->8---
If that is so, SBCL's type inferencer cannot assume that n is a
machine integer, which means it also cannot assume that c is a machine
integer, etc.. This leads to inefficient code.
Here is a version that takes n as a parameter, annotates even more
types than Juho's version, and runs even faster. I'm sorry I had to
switch from LOOP to the DO macro, but using LOOP, SBCL kept giving me
performance warnings.
--8<---------------cut here---------------start------------->8---
(deftype small-positive-double-float ()
`(double-float 0d0 ,(float most-positive-fixnum 0d0)))
(defun pythagorean-triplet (n)
(declare (optimize speed (safety 0))
(type fixnum n))
(do ((c 2 (1+ c))
(count 0))
((> c n)
count)
(declare (type fixnum c count))
(do ((b 1 (1+ b)))
((>= b c))
(declare (type fixnum b))
(let ((a (the small-positive-double-float
(sqrt (float (- (the fixnum (* c c))
(the fixnum (* b b)))
0d0)))))
(when (= (float (truncate a) 0d0) a)
(incf count))))))
--8<---------------cut here---------------end--------------->8---
I get 0.446183s for this Lisp version and 0.355s for the C version.
Measurements do not include compilation times.
--
Wolfram Fenske
A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
Jon Harrop <···@ffconsultancy.com> wrote:
> This is the fastest Lisp so far and its run time is faster than everything
> except the macro solution that j.oke hinted at.
>
> So this is clearly doing significant work at compile time.
It's not doing any of the computation at compile time. "Clearly" your
prejudices are showing.
> For a fair comparison, you either need to include compilation time or not
> hardcode the 5000.
>
> Including compile times I get:
>
> C: 0.399s
> OCaml: 0.478s
> SBCL: 0.992s
Compiling that function takes less than 0.03 seconds on my machine. I
suggest that your measurements are complete bullshit.
> Tweaking the programs to accept N as an argument and then measuring only run
> times, I get:
>
> C: 0.291s
> OCaml: 0.350s
> SBCL: 0.885s
>
> So Lisp is around 3x slower than C from this benchmark. OCaml is only 20%
> slower.
Declaring the type of N (for example as an (unsigned-byte 14)) would
fix that.
--
Juho Snellman
Juho Snellman wrote:
> It's not doing any of the computation at compile time.
Wolfram's results prove otherwise.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
Jon Harrop <···@ffconsultancy.com> writes:
> Juho Snellman wrote:
>> It's not doing any of the computation at compile time.
>
> Wolfram's results prove otherwise.
Huh? How so?
--
Wolfram Fenske
A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?
/me feeds the troll.
On 2007-05-20, Jon Harrop <···@ffconsultancy.com> wrote:
> Juho Snellman wrote:
>> (deftype small-positive-double-float ()
>> `(double-float 0d0 ,(float most-positive-fixnum 0d0)))
>>
>> (defun pythagorean-triplet ()
>> (declare (optimize speed (safety 0)))
>> (loop for c from 2 to 5000
>> sum (loop for b from 1 below c
>> count (let ((a (the small-positive-double-float
>> (sqrt (float (- (* c c) (* b b))
>> 0d0)))))
>> (= (float (truncate a) 0d0) a)))))
>
> This is the fastest Lisp so far and its run time is faster than
> everything except the macro solution that j.oke hinted at.
>
> So this is clearly doing significant work at compile time.
Others have disagreed with this statement, but ...
> For a fair comparison, you either need to include compilation time
... I'll disagree with this one.
Suddenly the Lisp program wins, through (you believe) compile-time
computation, and suddenly you change what you measure so that the Lisp
program doesn't win any more.
Why?
You might as well include the typing speed of the programmers too,
"for a fair comparison". Or, hmm, the square-footage of their houses,
"for a fair comparison".
You should define what you want up front, and whoever wins, wins.
Just prepare yourself for *us* to want different things (or in
different proportions) than *you*.
Larry Clapp wrote:
>> For a fair comparison, you either need to include compilation time
>
> ... I'll disagree with this one.
>
> Suddenly the Lisp program wins, through (you believe) compile-time
> computation, and suddenly you change what you measure so that the Lisp
> program doesn't win any more.
>
> Why?
Other solutions also take zero run time but significant compile time (to do
the calculation). How do you propose we compare them?
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
Jon Harrop wrote:
> Other solutions also take zero run time but significant compile time (to do
> the calculation). How do you propose we compare them?
[off topic, whispering]
(You might consider changing your way of learning Lisp:
on Usenet it will take you more than a dozen years... ;-)
On May 21, 2:27 pm, "j.oke" <········@gmail.com> wrote:
> Jon Harrop wrote:
> > Other solutions also take zero run time but significant compile time (to do
> > the calculation). How do you propose we compare them?
>
> [off topic, whispering]
>
> (You might consider to change your way of learning Lisp:
> In usenet it will take you more than a dozen years... ;-)
Yes, but it makes a cool-looking page. Just imagine: Jon Harrop's
Road to Lisp.
On May 21, 2:47 pm, fireblade <·················@gmail.com> wrote:
> On May 21, 2:27 pm, "j.oke" <········@gmail.com> wrote:
>
> > Jon Harrop wrote:
> > > Other solutions also take zero run time but significant compile time (to do
> > > the calculation). How do you propose we compare them?
>
> > [off topic, whispering]
>
> > (You might consider to change your way of learning Lisp:
> > In usenet it will take you more than a dozen years... ;-)
>
> Yes but it makes a cool looking page. Just imagine : Jon Harrop's
> Road to Lisp.
Something like "I learned Lisp by trolling on Usenet", or "my first
exposure to Lisp was when I decided to convert Lispers to my ugly
OCaml-like Microshit language and, by the way, sell my stupid journal".
No way.
On May 21, 3:15 pm, jt <·············@gmail.com> wrote:
> On May 21, 2:47 pm, fireblade <·················@gmail.com> wrote:
>
> > On May 21, 2:27 pm, "j.oke" <········@gmail.com> wrote:
>
> > > Jon Harrop wrote:
> > > > Other solutions also take zero run time but significant compile time (to do
> > > > the calculation). How do you propose we compare them?
>
> > > [off topic, whispering]
>
> > > (You might consider to change your way of learning Lisp:
> > > In usenet it will take you more than a dozen years... ;-)
>
> > Yes but it makes a cool looking page. Just imagine : Jon Harrop's
> > Road to Lisp.
>
> Something like I learned lisp by trolling in usenet, or my first
> exposure to lisp was when I decide to convert lispers to my ugly ocaml
> like microshit language and by the way sell my stupid journal.
>
> No way.
How do you know that his journal is stupid? Wait a minute, are you
the Lisper that ordered it? :)
On 2007-05-21, Jon Harrop <···@ffconsultancy.com> wrote:
> Larry Clapp wrote:
>>> For a fair comparison, you either need to include compilation time
>>
>> ... I'll disagree with this one.
>>
>> Suddenly the Lisp program wins, through (you believe) compile-time
>> computation, and suddenly you change what you measure so that the
>> Lisp program doesn't win any more.
>>
>> Why?
>
> Other solutions also take zero run time but significant compile time
> (to do the calculation). How do you propose we compare them?
Which part of
"You should define what you want up front, and whoever wins, wins."
did you not understand?
Jon Harrop wrote:
[lots of stuff]
Great, you learned that some things can and should be computed at
compile time for maximum efficiency.
You also learned that a good algorithm performs way better, and that
the language used has little impact on that simple fact.
That's two nice stories for your journal, Jon.
Sacha
Jon Harrop wrote:
> Tweaking the programs to accept N as an argument and then measuring only run
> times, I get:
>
> C: 0.291s
> OCaml: 0.350s
> SBCL: 0.885s
>
> So Lisp is around 3x slower than C from this benchmark. OCaml is only 20%
> slower.
3x to 5x slower than C is the same result I had with other micro-benchmarks
like this, but who cares? I don't think this can be generalized to
larger applications, because typically there are only a few hotspots, which
could be implemented in C and called from Lisp; for the rest of the
program it is more important to have the flexibility Lisp provides, to make
it easier for developers to write and maintain programs. And with
incremental GCs, like modern Lisp implementations provide, there are no
significant pauses, e.g. when using Lisp for games.
Some more reasons why a good GC is faster than manual memory management:
http://www.jwz.org/doc/gc.html
--
Frank Buss, ··@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
Frank Buss wrote:
> Some more reasons why a good GC is faster than manual memory managment:
> http://www.jwz.org/doc/gc.html
So you'd recommend the fastest language with GC?
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
Jon Harrop wrote:
> Frank Buss wrote:
>> Some more reasons why a good GC is faster than manual memory managment:
>> http://www.jwz.org/doc/gc.html
>
> So you'd recommend the fastest language with GC?
This is language independent, because you can use a GC in C++, too. But
Lisp is much easier to use.
--
Frank Buss, ··@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
Frank Buss wrote:
> This is language independant...
Language design can affect the potential for optimization in the GC, e.g.
see BIBOP.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
On Sun, 20 May 2007, Jon Harrop wrote:
> Juho Snellman wrote:
>> (deftype small-positive-double-float ()
>> `(double-float 0d0 ,(float most-positive-fixnum 0d0)))
>>
>> (defun pythagorean-triplet ()
>> (declare (optimize speed (safety 0)))
>> (loop for c from 2 to 5000
>> sum (loop for b from 1 below c
>> count (let ((a (the small-positive-double-float
>> (sqrt (float (- (* c c) (* b b))
>> 0d0)))))
>> (= (float (truncate a) 0d0) a)))))
>
> This is the fastest Lisp so far and its run time is faster than everything
> except the macro solution that j.oke hinted at.
>
No. Not correct. As I posted, the Stalin version was about 1.5 times
faster than the C version, and therefore probably also faster than the
OCaml version. Even better, the Stalin version wasn't even optimized: I
just wrote down the most straightforward version, compiled it, and bang,
it was 1.5 times faster than C.
On Mon, 21 May 2007, Kjetil S. Matheussen wrote:
>
>
> On Sun, 20 May 2007, Jon Harrop wrote:
>
>> Juho Snellman wrote:
>> > (deftype small-positive-double-float ()
>> > `(double-float 0d0 ,(float most-positive-fixnum 0d0)))
>> >
>> > (defun pythagorean-triplet ()
>> > (declare (optimize speed (safety 0)))
>> > (loop for c from 2 to 5000
>> > sum (loop for b from 1 below c
>> > count (let ((a (the small-positive-double-float
>> > (sqrt (float (- (* c c) (* b b))
>> > 0d0)))))
>> > (= (float (truncate a) 0d0) a)))))
>>
>> This is the fastest Lisp so far and its run time is faster than everything
>> except the macro solution that j.oke hinted at.
>>
>
> No. Not correct.
%!#$%
I thought the Lisp benchmark result in your post was produced by running
the version above.
Kjetil S. Matheussen wrote:
> No. Not correct. As I posted, the stalin version was about 1.5 times
> faster than the C version,
Stalin only runs on my 32-bit system so I haven't gotten around to it yet.
> Even better, the stalin version wasn't even optimized,
Unless you count the page of compiler flags. ;-)
> I
> just wrote down the most straigh-forward version, compiled it, and bang,
> 1.5 times faster than C.
This also indicates that the C version could be improved.
Using ints instead of doubles and (int)(..) instead of trunc(..) makes this
C99 more than twice as fast (0.137s):

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    const int n = (argc == 2 ? atoi(argv[1]) : 5000);
    int i = 0;
    for (int c = 2; c <= n; c++)
        for (int b = 1; b < c; b++) {
            double a = sqrt(c*c - b*b);
            if ((int)a == a) i++;
        }
    printf("%d\n", i);
    return 0;
}
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
On Mon, 21 May 2007, Jon Harrop wrote:
> Kjetil S. Matheussen wrote:
> > No. Not correct. As I posted, the stalin version was about 1.5 times
> > faster than the C version,
>
> Stalin only runs on my 32-bit system so I haven't gotten around to it yet.
>
> > Even better, the stalin version wasn't even optimized,
>
> Unless you count the page of compiler flags. ;-)
>
> > I
> > just wrote down the most straigh-forward version, compiled it, and bang,
> > 1.5 times faster than C.
>
> This also indicates that the C could be improved.
>
> Using ints instead of doubles and (int)(..) instead of trunc(..) makes this
> C99 more than twice as fast (0.137s):
>
> int main(int argc, char *argv[]) {
> const int n = (argc == 2 ? atoi(argv[1]) : 5000);
> int i = 0;
>
> for (int c=2; c<=n; c++)
> for (int b=1; b<c; b++) {
> double a = sqrt(c*c - b*b);
> if ((int)a == a) i++;
> }
> printf("%d\n", i);
> }
>
>
Well, for fun, here are the benchmark results for Stalin (the one I posted
earlier), C (your version above), and OCaml (the one found at
http://tempe.st/2007/05/erlang-ruby-and-php-battle-it-out/).
Unfortunately, I don't have SBCL (my Linux lacks NPTL).
Compile options
---------------
gcc -O3 --fast-math
ocamlopt -rectypes -inline 100 -ffast-math -ccopt -O3
stalin -Ob -Om -On -Or -Ot -dP -dC -dh -q -d -db -no-clone-size-limit -split-even-if-no-widening -copt -O3 -copt -ffast-math
Compiler versions
-----------------
gcc 3.4.6
ocamlopt 3.09.3
stalin 0.11
Results
-------
gcc: 0.307s (best of 5)
OCaml: 0.538s (best of 5)
Stalin: 0.356s (best of 5)
On 2007-05-21 15:01:11 -0400, Jon Harrop <···@ffconsultancy.com> said:
> I get:
>
> gcc: 0.137s
> SBCL: 0.214s
> Stalin: 0.330s (hardcoded "n")
> OCaml: 0.353s
>
> Probably time to increase n. With n=20000:
>
> gcc: 2.251s
> SBCL: 2.462s
> Stalin: 5.164s
> OCaml: 5.433s
>
> Well, this microbenchmark certainly isn't showing anything meaningful. ;-)
Apparently benchmark "meaning" is inversely proportional to how much
faster a lisp runs it than ocaml.
Raffael Cavallaro wrote:
> On 2007-05-21 15:01:11 -0400, Jon Harrop <···@ffconsultancy.com> said:
>> Well, this microbenchmark certainly isn't showing anything meaningful.
>> ;-)
>
> Apparently benchmark "meaning" is inversely proportional to how much
> faster a lisp runs it than ocaml.
You missed off the factorial.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet
On 2007-05-21 16:39:51 -0400, Jon Harrop <···@ffconsultancy.com> said:
> Raffael Cavallaro wrote:
>> On 2007-05-21 15:01:11 -0400, Jon Harrop <···@ffconsultancy.com> said:
>>> Well, this microbenchmark certainly isn't showing anything meaningful.
>>> ;-)
>>
>> Apparently benchmark "meaning" is inversely proportional to how much
>> faster a lisp runs it than ocaml.
>
> You missed off the factorial.
Credit where credit is due:
"OCaml - the language of choice for toy benchmarks from factorials to
ray-tracers!"
Jon Harrop wrote:
> Kjetil Svalastog Matheussen wrote:
>> gcc: 0.307s (best of 5)
>> OCaml: 0.538s (best of 5)
>> Stalin: 0.356s (best of 5)
>
> I get:
>
> gcc: 0.137s
> SBCL: 0.214s
> Stalin: 0.330s (hardcoded "n")
> OCaml: 0.353s
>
> Probably time to increase n. With n=20000:
>
> gcc: 2.251s
> SBCL: 2.462s
> Stalin: 5.164s
> OCaml: 5.433s
>
> Well, this microbenchmark certainly isn't showing anything meaningful. ;-)
No, of course not. Microbenchmarks never show anything meaningful, no
matter what language they favor.
In this thread we have just seen what every experienced Lisper already
knows: With enough tweaking we can get in the same ballpark as any other
compiled language. This is only useful for the hot spots, though, and
you can never know the hot spots in advance. For everything else,
performance doesn't matter, as long as it's reasonable.
Other aspects are at least similarly important.
For example, from
http://groups.google.com/group/comp.lang.lisp/msg/c8aac11d40efa4b0
"It is interesting to notice that one year later, [the Common Lisp
directory] is still the same Lisp process that is running. The project
has evolved from Linkit to the CLD without ever quitting the Lisp
process. It has never crashed, even though the application and the CLOS
classes have changed a lot of times. In the same time, I had to restart
several times the Apache process and reboot a lot of time the 3COM
firewall router in front of the server."
Another example, from
https://lists.csail.mit.edu/pipermail/ll-discuss/2007-February/001186.html
"The second moment came when I had re-factored a class into a
superclass/subclass relationship. I then reloaded the file and tested
the software to see that it worked ok. It did.
Then I thought about that. I just re-organized the class topology on
the fly in a running system. How on earth could that just work?
Obviously existing instances had to be converted on the fly to new
objects and any dependencies had to be recomputed. But a huge chunk of
CLOS that I had thought `overkill' was what had stepped in and made sure
that I could just reload a file and keep going. Once again I was deeply
impressed.
I think if Graham spent a good chunk of time working on a CLOS project
that he, too, would come around to the point of view that CLOS is a
significant achievement."
Or from http://lib.store.yahoo.net/lib/paulgraham/bbnexcerpts.txt
"Lisp's interactive toplevel is a great help in developing software
rapidly. But the biggest advantage for us was probably in finding bugs.
As I mentioned before, with Web-based applications you have the users'
data on your servers and can usually reproduce bugs.
When one of the customer support people came to me with a report of a
bug in the editor, I would load the code into the Lisp interpreter and
log into the user's account. If I was able to reproduce the bug I'd get
an actual break loop, telling me exactly what was going wrong. Often I
could fix the code and release a fix right away. And when I say right
away, I mean while the user was still on the phone."
...and so on, and so forth.
Pascal
--
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
On Mon, 21 May 2007, Jon Harrop wrote:
> Kjetil Svalastog Matheussen wrote:
>> gcc: 0.307s (best of 5)
>> OCaml: 0.538s (best of 5)
>> Stalin: 0.356s (best of 5)
>
> I get:
>
> gcc: 0.137s
> SBCL: 0.214s
> Stalin: 0.330s (hardcoded "n")
> OCaml: 0.353s
>
> Probably time to increase n. With n=20000:
>
> gcc: 2.251s
> SBCL: 2.462s
> Stalin: 5.164s
> OCaml: 5.433s
>
> Well, this microbenchmark certainly isn't showing anything meaningful. ;-)
>
Right. It's weird that you get such slow code from Stalin, though. I
remember the same thing happened with your ray-tracer benchmark: both
Siskind and I got much faster code out of Stalin than you did. I like
Stalin as proof that untweaked Lisp (including other languages with
dynamic typing and such) can be really fast, and in theory beat everything
else when it comes to performance, so it's disappointing that you can't
get as good performance out of it, given how much buzz you make about
performance in your endless newsgroup posts and in the promotion of your
ray-tracer benchmark.
Kjetil S. Matheussen wrote:
> Right. Its weird that you get so slow code from stalin though. I remember
> the same thing happened with your ray-tracer benchmark. both Siskin and
> myself got much faster code out of stalin than you did. I like stalin as a
> proof that untweaked lisp (including other languages with dynamic typing
> and such) can be really fast, and in theory beat everything else when it
> comes to performance, so its dissapointing that you can't get as good
> performance out of it since you make so much buzz about performance in
> your endless number of newsgroup posts and the promotion for your
> ray-tracer benchmark.
I don't think this result is so surprising. Firstly, the Stalin entry is the
only one running in 32-bit. GCC, OCaml and SBCL all support 64-bit and I'm
running an AMD64 that is much faster in 64-bit.
Secondly, previous benchmarks indicate that SBCL and Stalin both do
relatively better than OCaml on Intel hardware. We only use AMD hardware so
I can't benchmark the same programs on Intels. But my guess is that the
results would be significantly different.
At the end of the day, all this really means is that the results are not
significantly different.
There are still several interesting facts to be drawn from these results:
1. Functional programming languages can be very competitive with C on
imperative code. I was very surprised to see the obvious C implementation
beaten by Lisp.
2. This is another benchmark where OCaml's lack of optimized div and mod
hurts.
3. Compiler switches and platforms have at least as big an effect as
language choice on this benchmark.
4. The underlying representation and choice of GC can have very significant
performance implications for a language. In this case, OCaml has trouble
using a bit-vector because it offers only tagged 31/63-bit ints for unboxed
arrays, which require much more work and slightly more memory.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/products/fsharp_journal/?usenet