From: Rolf Wester
Subject: Common Lisp and Python performance
Date: 
Message-ID: <3A437E56.4A0AA7BC@ilt.fhg.de>
Hi,

I frequently use Python and C++, but I don't like C++ very much and
Python is too slow for many of my problems (a lot of numerical
simulation). I read that Common Lisp is a much better language than C++
and that it can be almost as fast as C. I wrote a little program to test
Common Lisp's performance and was quite disappointed with the result.
The code is here:

(defun temp-diff (nt)
  (declare (optimize (speed 3) (safety 0) (debug 0)))
  (let ((mat (make-array '(300 300) :element-type 'single-float
                         :initial-element 0.0)))
    (dotimes (k nt)
      (loop for i from 1 to 298 do
            (loop for j from 1 to 298 do
                  (setf (aref mat i j)
                        (the single-float
                             (+ 1.0 (* 0.1 (+ (aref mat (+ i -1) j)
                                              (* -4.0 (aref mat i j))
                                              (aref mat (+ i 1) j)
                                              (aref mat i (+ j -1))
                                              (aref mat i (+ j 1)))))))))
      (print (aref mat 50 50)))))

(compile 'temp-diff)
(temp-diff 10)


The equivalent Python code is:

import array

def run(nt):
    list = [0.0]*(300*300)
    mat = array.array('f', list)
    for k in range(nt):
        for i in range(1, 299):
            for j in range(1, 299):
                ii = i*300
                im = ii - 300
                ip = ii + 300
                mat[ii+j] = 1.0 + 0.1*(mat[im+j] + mat[ip+j] +
                                       mat[ii+j+1] + mat[ii+j-1] -
                                       4.0*mat[ii+j])
        print mat[300*50+50]

run(10)


Python is only slightly slower than compiled Common Lisp using ACL6.0
(CLISP is 60% slower). I also read that it is difficult to write fast
Common Lisp code. Is there any way to speed up the CL code?

-- Rolf

P.S.: I also tried a C++ version, but it is so much faster that a
comparison doesn't make any sense.

From: Paul F. Dietz
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A438388.C0DA8713@interaccess.com>
Rolf Wester wrote:

> I also read that it is difficult to write fast Common Lisp code. Is
> there any way to speed up the CL code?

It's not *that* hard; you just have to know a few gotchas.

You should put in type declarations for the variables.  Declare
the integers to be fixnums (this is up to 2^28 on ACL and CMUCL,
I think) or (better) a narrower integer subrange type,
and declare the array to be an array of single-floats of
the indicated dimensions.  Also, wrap (the fixnum ...)
around the (+ i -1).  Arithmetic that is guaranteed
to be on fixnums, yielding fixnums, can be much faster
than general arithmetic.  Use DISASSEMBLE to check whether the
operations have been inlined.

In LOOP, use the OF-TYPE clause.

CMUCL should be able to figure out some of these types,
btw.  ACL does not do as much type inference.
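
Roughly, that advice comes out to something like this (an untested
sketch, off the top of my head; which declarations actually pay off
will vary by implementation):

(defun temp-diff (nt)
  (declare (optimize (speed 3) (safety 0) (debug 0))
           (type fixnum nt))
  (let ((mat (make-array '(300 300) :element-type 'single-float
                         :initial-element 0.0)))
    ;; Array of single-floats with the indicated dimensions.
    (declare (type (array single-float (300 300)) mat))
    (dotimes (k nt)
      ;; OF-TYPE with a narrow integer subrange instead of plain FIXNUM.
      (loop for i of-type (integer 0 299) from 1 to 298 do
            (loop for j of-type (integer 0 299) from 1 to 298 do
                  (setf (aref mat i j)
                        (+ 1.0 (* 0.1 (+ (aref mat (the fixnum (+ i -1)) j)
                                         (* -4.0 (aref mat i j))
                                         (aref mat (the fixnum (+ i 1)) j)
                                         (aref mat i (the fixnum (+ j -1)))
                                         (aref mat i (the fixnum (+ j 1))))))))))
      (print (aref mat 50 50)))))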

	Paul
From: John Watton
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <9202hf$h15$1@nnrp1.deja.com>
Try this version. It is about 50% faster in ACL 5.0. You need to declare
the array. Unfortunately, ACL 5.0 and probably ACL 6.0 do not inline
calls to multidimensional arrays. If you did a similar test with a
single-dimension array, performance would be much better.

(defun temp-diff (nt)
  (declare (optimize (speed 3) (safety 0) (debug 0))
           (:explain :boxing :calls))
  (let ((mat (make-array '(300 300) :element-type 'single-float
                         :initial-element 0.0)))
    (declare (type (ARRAY SINGLE-FLOAT (* *)) mat))
    (dotimes (k nt)
      (loop for i from 1 to 298 do
            (loop for j from 1 to 298 do
                  (setf (aref mat i j)
                        (+ 1.0 (* 0.1 (+ (aref mat (+ i -1) j)
                                         (* -4.0 (aref mat i j))
                                         (aref mat (+ i 1) j)
                                         (aref mat i (+ j -1))
                                         (aref mat i (+ j 1))))))))
      (print (aref mat 50 50)))))
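
For what it's worth, the single-dimension variant mentioned above would
look roughly like this (an untested sketch, with a made-up name; the
hand-rolled index arithmetic mirrors the Python version):

(defun temp-diff-1d (nt)
  (declare (optimize (speed 3) (safety 0) (debug 0)))
  (let ((mat (make-array (* 300 300) :element-type 'single-float
                         :initial-element 0.0)))
    (declare (type (simple-array single-float (90000)) mat))
    (dotimes (k nt)
      (loop for i fixnum from 1 to 298 do
            (loop for j fixnum from 1 to 298 do
                  ;; Row-major index of element (i j) in a 300x300 grid.
                  (let ((ij (+ (* i 300) j)))
                    (declare (type fixnum ij))
                    (setf (aref mat ij)
                          (+ 1.0 (* 0.1 (+ (aref mat (- ij 300))
                                           (* -4.0 (aref mat ij))
                                           (aref mat (+ ij 300))
                                           (aref mat (- ij 1))
                                           (aref mat (+ ij 1)))))))))
      (print (aref mat (+ (* 50 300) 50))))))

(Whether this actually buys anything over the two-dimensional version is
implementation-dependent, as discussed further down the thread.)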

--
John Watton
Alcoa Inc.


From: Duane Rettig
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <4u27wz46n.fsf@beta.franz.com>
John Watton <···········@alcoa.com> writes:

> Try this version. It is about 50% faster in ACL 5.0. You need to declare
> the array. Unfortunately, ACL 5.0 and probably ACL 6.0 do not inline
> calls to multidimensional arrays.

This is not true.  Allegro CL does inline multidimensional arrays.
The _real_ problem is that Allegro CL should be extracting the
precise array information out of the make-array form (the dimensions
there are constants, so the type information does exist), but it isn't
doing that extraction.  Your declaration below goes part of the way
toward being more precise about what the array is, but it still
doesn't give the compiler as much information as it could use.

> If you did a similar test with a single-dimension array, performance
> would be much better.

Perhaps, but what a way to have to program!
Instead,

> (defun temp-diff (nt)
>   (declare (optimize (speed 3) (safety 0) (debug 0))
> 	   (:explain :boxing :calls))
>   (let ((mat (make-array '(300 300) :element-type 'single-float
> 			 :initial-element 0.0)))
>     (declare (type (ARRAY SINGLE-FLOAT (* *)) mat))
=====================^^^^^^^^^^^^^^^^^^^^^^^^^^
>     (dotimes (k nt)
>       (loop for i from 1 to 298 do
>             (loop for j from 1 to 298 do
>                   (setf (aref mat i j)
>                         (+ 1.0 (* 0.1 (+ (aref mat (+ i -1) j)
>                                          (* -4.0 (aref mat i j))
>                                          (aref mat (+ i 1) j)
>                                          (aref mat i (+ j -1))
>                                          (aref mat i (+ j 1))))))))
>       (print (aref mat 50 50)))))

By changing the flagged declaration to (simple-array single-float (300 300)),
I got almost a 10x speedup over your code.  YMMV.

-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: John Watton
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <920d55$q1e$1@nnrp1.deja.com>
In article <·············@beta.franz.com>,
  Duane Rettig <·····@franz.com> wrote:
> By changing the flagged declaration to (simple-array single-float (300 300)),
> I got almost a 10x speedup over your code.  YMMV.

This is all new news to me! My usual habit with ACL is to build my
declarations using type-of, and (type-of (make-array '(300 300)
:element-type 'single-float)) returns (ARRAY SINGLE-FLOAT (300 300)).
There is no mention of SIMPLE-ARRAY as there is for type-of called on
single dimension arrays. So how is a poor boy to divine the intent of the
implementor and know when to use SIMPLE-ARRAY? I guess I just assumed,
incorrectly, that multidimensional arrays were by definition not
simple. Type-of just reinforced my silly notions.

Further good news to me is that I tried the declaration (simple-array
single-float (* *)) and it works just as well as the one Duane
suggested with the sizes (300 300). To me this is great news because
you can pass around multidimensional arrays without the size
information.

BTW what does YMMV mean?

--
John Watton
Alcoa Inc.


From: Duane Rettig
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <4puikywml.fsf@beta.franz.com>
John Watton <···········@alcoa.com> writes:

> In article <·············@beta.franz.com>,
>   Duane Rettig <·····@franz.com> wrote:
> > By changing the flagged declaration to (simple-array single-float (300 300)),
> > I got almost a 10x speedup over your code.  YMMV.
> 
> This is all new news to me! My usual habit with ACL is to build my
> declarations using type-of, and (type-of (make-array '(300 300)
> :element-type 'single-float)) returns (ARRAY SINGLE-FLOAT (300 300)).
> There is no mention of SIMPLE-ARRAY as there is for type-of called on
> single dimension arrays. So how is a poor boy to divine the intent of the
> implementor and know when to use SIMPLE-ARRAY? I guess I just assumed,
> incorrectly, that multidimensional arrays were by definition not
> simple. Type-of just reinforced my silly notions.

The type-of form should probably return a simple-array type specifier;
as it stands, it only adds to the confusion between the language's
distinction between simple and non-simple arrays and the particular
implementation detail you happened to be seeing.

A simple array is one which is not displaced to another array, has no
fill pointer, and is not adjustable.  The number of dimensions has no
bearing on whether an array is simple or not.

Implementationally, you are seeing a less "simple" implementation of an
array with multiple dimensions than one with one dimension, and that
may be confusing.
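
A quick illustration at the listener (at least this is what ACL and
CMUCL give: the first array is simple even though it is two-dimensional,
and the second is not simple even though it is one-dimensional, because
it is expressly adjustable):

* (typep (make-array '(300 300) :element-type 'single-float) 'simple-array)
T
* (typep (make-array 300 :element-type 'single-float :adjustable t)
         'simple-array)
NIL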

> Further good news to me is that I tried the declaration (simple-array
> single-float (* *)) and it works just as well as the one Duane
> suggested with the sizes (300 300). To me this is great news because
> you can pass around multidimensional arrays without the size
> information.

The last part of this statement is true, and is in fact explained in
documentation (compiling.htm, Sec 9.4) with almost this precise example.
However, while you will get a huge savings declaring the array as a
simple-array, there will be some slight additional savings by at least
declaring the first dimension.  In this example, you can also see the
difference in the size of the disassembled function.  The algorithm
that the compiler uses depends on whether it knows the major dimensions
of the array.  If so, it can multiply by a constant, instead of grabbing
that dimension from the array itself.  And, of course, if the major
dimension is a power of two, such a multiplication might turn into a
shift...
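
Roughly, the index arithmetic in question looks like this (a sketch of
what gets open-coded, with made-up helper names, not the actual
generated code):

;; With the dimensions declared, the row length is a compile-time
;; constant and the row-major index is just i*300 + j ...
(defun rm-index-constant (i j)
  (declare (type (mod 300) i j))
  (+ (* i 300) j))

;; ... whereas with (simple-array single-float (* *)) declared, the row
;; length has to be fetched from the array itself at run time.
(defun rm-index-variable (mat i j)
  (declare (type (simple-array single-float (* *)) mat)
           (type (mod 300) i j))
  (+ (* i (array-dimension mat 1)) j))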

> BTW what does YMMV mean?

Your Mileage May Vary.  I ran on one type of machine, and you may get
different results on others.

-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: Marco Antoniotti
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <y6c1yv0dsua.fsf@octagon.mrl.nyu.edu>
John Watton <···········@alcoa.com> writes:

> In article <·············@beta.franz.com>,
>   Duane Rettig <·····@franz.com> wrote:
> > By changing the flagged declaration to (simple-array single-float (300 300)),
> > I got almost a 10x speedup over your code.  YMMV.
> 
> This is all new news to me! My usual habit with ACL is to build my
> declarations using type-of, and (type-of (make-array '(300 300)
> :element-type 'single-float)) returns (ARRAY SINGLE-FLOAT (300 300)).
> There is no mention of SIMPLE-ARRAY as there is for type-of called on
> single dimension arrays. So how is a poor boy to divine the intent of the
> implementor and know when to use SIMPLE-ARRAY? I guess I just assumed,
> incorrectly, that multidimensional arrays were by definition not
> simple. Type-of just reinforced my silly notions.


In CMUCL you get

* (type-of (make-array '(300 300) :element-type 'single-float))
(SIMPLE-ARRAY SINGLE-FLOAT (300 300))

Maybe this is something the Franz folks can easily fix.

> BTW what does YMMV mean?

"Your Mileage May Vary" :)

Cheers

-- 
Marco Antoniotti =============================================================
NYU Bioinformatics Group			 tel. +1 - 212 - 998 3488
719 Broadway 12th Floor                          fax  +1 - 212 - 995 4122
New York, NY 10003, USA				 http://galt.mrl.nyu.edu/valis
             Like DNA, such a language [Lisp] does not go out of style.
			      Paul Graham, ANSI Common Lisp
From: Lieven Marchand
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m3u27wbbqw.fsf@localhost.localdomain>
Rolf Wester <······@ilt.fhg.de> writes:

> I also read that it is difficult to write fast Common Lisp code. Is
> there any way to speed up the CL code?
> 

Use the tools your implementation has to find out what's going
on. Writing fast CL isn't rocket science, but it's no use blundering
in the dark trying micro-optimisations. Also, only do this for
routines you've *measured* with profiling to be the bottlenecks.

A simple (time (temp-diff 10)) gives the following
; cpu time (non-gc) 11,990 msec user, 50 msec system
; cpu time (gc)     1,560 msec user, 20 msec system
; cpu time (total)  13,550 msec user, 70 msec system
; real time  14,925 msec
; space allocation:
;  82 cons cells, 0 symbols, 185,127,280 other bytes, 0 static bytes

Now adding a (declare (:explain :types :calls)) gives us the following

;Examining a call to AREF with arguments:
;  symeval MAT type T
;  call to +_2OP type NUMBER
;  symeval J type (REAL * *)
; which returns a value of type T

...

;Examining a call to AREF with arguments:
;  symeval MAT type T
;  symeval I type (REAL * *)
;  symeval J type (REAL * *)
; which returns a value of type T
;Generating a non-inline  call to AREF
;Generating a non-inline  call to *_2OP

As you can see, types T and NUMBER and non-inlined type-checking calls
abound. So let's add some type declarations.

This gives us:

; cpu time (non-gc) 730 msec user, 0 msec system
; cpu time (gc)     0 msec user, 0 msec system
; cpu time (total)  730 msec user, 0 msec system
; real time  778 msec
; space allocation:
;  12 cons cells, 0 symbols, 361,120 other bytes, 0 static bytes

Is this so out of line with the C++ timings you got?

The code I got the above result with is this:

(defmacro _f (form) `(the single-float ,form))
(defmacro _i (form) `(the fixnum ,form))
(defun temp-diff (nt)
  (declare (optimize (speed 3) (safety 0) (debug 0)))
  (let ((mat (make-array '(300 300) :element-type 'single-float
			 :initial-element 0.0)))
    (declare (type (simple-array single-float (300 300)) mat))
    (dotimes (k nt)
      (declare (type fixnum k nt))
;      (declare (:explain :calls :types))
;      (declare (:explain :boxing))
      (loop for i fixnum from 1 to 298 do
	    (loop for j fixnum from 1 to 298 do
		  (setf (aref mat i j) (_f (+ (_f 1.0)
					  (_f (* (_f 0.1) (_f  (+ (_f (aref mat (_i (+ i -1)) j   ))
						    (_f (* (_f -4.0) (_f (aref mat i j) )))
						    (_f (aref mat (_i (+ i 1))  j   ))
						    (_f (aref mat i (_i (+ j -1))   ))
						    (_f (aref mat i (_i (+ j  1)))   ) )))))))))
      (print (aref mat 50 50)) )))

So adding two lines gives a speedup of 20. Once again, this procedure is so
simple that I would keep a non-optimized version for further development,
and if I needed to change the function I would start from the non-optimized
version and add the optimizations back in if necessary.

Note that for production code you'd add type checks before the declares.

-- 
Lieven Marchand <···@bewoner.dma.be>
Lambda calculus - Call us a mad club
From: Duane Rettig
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <4wvc32mgw.fsf@beta.franz.com>
Lieven Marchand <···@bewoner.dma.be> writes:

> The code I got the above result with is this:
> 
> (defmacro _f (form) `(the single-float ,form))
> (defmacro _i (form) `(the fixnum ,form))
> (defun temp-diff (nt)
>   (declare (optimize (speed 3) (safety 0) (debug 0)))
>   (let ((mat (make-array '(300 300) :element-type 'single-float
> 			 :initial-element 0.0)))
>     (declare (type (simple-array single-float (300 300)) mat))
>     (dotimes (k nt)
>       (declare (type fixnum k nt))
> ;      (declare (:explain :calls :types))
> ;      (declare (:explain :boxing))
>       (loop for i fixnum from 1 to 298 do
> 	    (loop for j fixnum from 1 to 298 do
> 		  (setf (aref mat i j) (_f (+ (_f 1.0)
> 					  (_f (* (_f 0.1) (_f  (+ (_f (aref mat (_i (+ i -1)) j   ))
> 						    (_f (* (_f -4.0) (_f (aref mat i j) )))
> 						    (_f (aref mat (_i (+ i 1))  j   ))
> 						    (_f (aref mat i (_i (+ j -1))   ))
> 						    (_f (aref mat i (_i (+ j  1)))   ) )))))))))
>       (print (aref mat 50 50)) )))
> 
> So adding two lines gives a speedup of 20. Once again, this procedure is so
> simple that I would keep a non-optimized version for further development,
> and if I needed to change the function I would start from the non-optimized
> version and add the optimizations back in if necessary.

I'd like to revisit this code, since the originator of this thread, while
never again posting to c.l.l (that I saw), apparently took your code,
modified it (adding a bug in the process), and posted that to
comp.lang.dylan complaining that CL was 10x slower than C++ and that
Dylan was several times slower still.  He also said that his modified
version of the above code was almost impossible to read (which I agree
with).

Of course, I am not concerned with the Dylan aspect of his request.
The Dylan people will have to deal with that.

I have fixed his bug in the CL code and posted to comp.lang.dylan.
I also cleaned up his code quite a bit, and that's why I am revisiting it
here:

The two macros _f and _i should not at all be necessary.  Proper type
propagation should handle all of the values in this code.  I rewrote
this function:

(defun temp-diff (nt)
  (declare (optimize (speed 3) (safety 0) (debug 0)))
  (let ((mat (make-array '(300 300) :element-type 'single-float
			 :initial-element 0.0)))
    (declare (type (simple-array single-float (300 300)) mat))
    (dotimes (k nt)
      (declare (type fixnum k nt))
;      (declare (:explain :calls :types))
;      (declare (:explain :boxing))
      (loop for i fixnum from 1 to 298 do
	    (loop for j fixnum from 1 to 298 do
		  (setf (aref mat i j)
		    (+ 1.0
		       (* 0.1 (+ (aref mat (+ i -1) j)
				 (* -4.0 (aref mat i j))
				 (aref mat (+ i 1) j)
				 (aref mat i (+ j -1))
				 (aref mat i (+ j  1))))))))
      (print (aref mat 50 50)) )))

and although I did not test this particular function, I did test the
original poster's modified function (it had two of these pairs of
nested loops instead of one) and found it to make no difference in the
timings.  There may in fact be some slight differences, based on the
type propagations.  The point, however, is that it is much cleaner
code and easier to read, besides being at least close to as fast as
the version with the _f and _i macros.

-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: Lieven Marchand
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m3elybqgtz.fsf@localhost.localdomain>
Duane Rettig <·····@franz.com> writes:

> The two macros _f and _i should not at all be necessary.  Proper type
> propagation should handle all of the values in this code.  I rewrote
> this function:
> 

You're right. I removed them later also. It was one of my test versions
to really pin down all types. 

In general, with a fixnum declaration for i, can (the fixnum (1+ i))
not be optimized a bit more because you don't have to check for fixnum
overflow and conversion to bignums, which you have to do with (1+ i)
since i can be most-positive-fixnum?

-- 
Lieven Marchand <···@village.uunet.be>
Glaðr ok reifr skyli gumna hverr, unz sinn bíðr bana.
From: Duane Rettig
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <4puhvgcwu.fsf@beta.franz.com>
Lieven Marchand <···@village.uunet.be> writes:

> Duane Rettig <·····@franz.com> writes:
> 
> > The two macros _f and _i should not at all be necessary.  Proper type
> > propagation should handle all of the values in this code.  I rewrote
> > this function:
> > 
> 
> You're right. I removed them later also. It was one of my test versions
> to really pin down all types. 
> 
> In general, with a fixnum declaration for i, can (the fixnum (1+ i))
> not be optimized a bit more because you don't have to check for fixnum
> overflow and conversion to bignums, which you have to do with (1+ i)
> since i can be most-positive-fixnum?

Normally this is true; when the type propagation results in integer
specifications that don't fit into the fixnum range, an overflow check
is generated, which is obviously slower than no check.

However, this particular benchmark declares (speed 3) and (safety 0), a
combination we at Franz generally discourage (except for benchmarks :-),
unless you are precisely aware of what things might break because of it.
For example, the Allegro CL comp:declared-fixnums-remain-fixnums-switch,
which normally is off, is firing at this speed/safety setting.  Thus,
the type propagator does not track fixnum additions/subtractions beyond
fixnum boundaries, and clamps the resultant expressions to the fixnum range.
The assumption for correct operation is that we are using "small numbers",
rather than running all the way up the fixnum range.

In fact, fixnums aren't the best declaration to use in these
array accesses, because they are bigger than necessary.   In reality, a
more appropriate declaration might be something like adim, defined as

(deftype adim () '(mod #.array-dimension-limit))

(There is already one of these in the excl package, but it is not
exported.)

Or, better yet, squeeze the dimensions even further, to fit the
problem at hand:

(deftype temp-diff-adim () '(mod 300))

By declaring the loop variables as that type, you can avoid the
potential overflow and get a better guarantee that the compiler will
open-code the arefs, even if you used (speed 3) and (safety 1) settings,
which is what we recommend for fast but correct code generation.
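
In other words, something like this (an untested sketch: the same
function as earlier in the thread, with the loop variables declared via
the deftype and safety left at 1):

(deftype temp-diff-adim () '(mod 300))

(defun temp-diff (nt)
  (declare (optimize (speed 3) (safety 1) (debug 0))
           (type fixnum nt))
  (let ((mat (make-array '(300 300) :element-type 'single-float
                         :initial-element 0.0)))
    (declare (type (simple-array single-float (300 300)) mat))
    (dotimes (k nt)
      ;; The narrow type covers 0..299, so i-1, i+1, j-1, and j+1 all
      ;; stay within it and within the array bounds.
      (loop for i of-type temp-diff-adim from 1 to 298 do
            (loop for j of-type temp-diff-adim from 1 to 298 do
                  (setf (aref mat i j)
                        (+ 1.0 (* 0.1 (+ (aref mat (- i 1) j)
                                         (* -4.0 (aref mat i j))
                                         (aref mat (+ i 1) j)
                                         (aref mat i (- j 1))
                                         (aref mat i (+ j 1))))))))
      (print (aref mat 50 50)))))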

-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: Raymond Toy
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <4nae8zyokc.fsf@rtp.ericsson.se>
>>>>> "Lieven" == Lieven Marchand <···@village.uunet.be> writes:

    Lieven> In general, with a fixnum declaration for i, can (the fixnum (1+ i))
    Lieven> not be optimized a bit more because you don't have to check for fixnum
    Lieven> overflow and conversion to bignums, which you have to do with (1+ i)
    Lieven> since i can be most-positive-fixnum?

But if i really can be most-positive-fixnum isn't (the fixnum (1+ i))
an error?

I think CMUCL won't believe you unless you compile with sufficiently
low debug and sufficiently high optimize.

Ray
From: Duane Rettig
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <4lmsjgcme.fsf@beta.franz.com>
Raymond Toy <···@rtp.ericsson.se> writes:

> >>>>> "Lieven" == Lieven Marchand <···@village.uunet.be> writes:
> 
>     Lieven> In general, with a fixnum declaration for i, can (the fixnum (1+ i))
>     Lieven> not be optimized a bit more because you don't have to check for fixnum
>     Lieven> overflow and conversion to bignums, which you have to do with (1+ i)
>     Lieven> since i can be most-positive-fixnum?
> 
> But if i really can be most-positive-fixnum isn't (the fixnum (1+ i))
> an error?

Correct.  But in fact in this case, i will never get above 300
(actually, 298).

> I think CMUCL won't believe you unless you compile with sufficiently
> low debug and sufficiently high optimize.

This should hopefully also be the case for this particular example,
which is compiled at speed 3, safety 0, and debug 0.

-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: Knut Arild Erstad
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <slrn9477ud.qnp.knute+news@teist.ii.uib.no>
[Rolf Wester]
: 
: Python is only slightly slower than compiled Common Lisp using ACL6.0
: (CLISP is 60% slower).
: I also read that it is difficult to write fast Common Lisp code. Is
: there any way to speed up the CL code?

If you use CMUCL, the compiler will tell you what it can't optimize and 
why.  In your example I just added fixnum declarations to NT and K.

I tried your example in Python, CLisp and CMUCL (Sparc/SunOS):

Python: 39.3 seconds.
CLisp:  24.4 seconds.
CMUCL:   0.3 seconds.

ACL will probably perform about the same as CMUCL with the correct
declarations.

BTW, the CMUCL compiler is called Python.  Don't let that confuse you; it
has nothing to do with the language.  :)

-- 
"And these bumbling old men spent their time kicking away the pillars of
 the world, and they'd nothing to replace them with but uncertainty.  And
 they were /proud/ of this?"        -- Terry Pratchett, Small Gods
From: Hannu Koivisto
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <873dfgnvlm.fsf@lynx.ionific.com>
Rolf Wester <······@ilt.fhg.de> writes:

| I frequently use Python and C++, but I don't like C++ very much and
| Python is too slow for many of my problems (a lot of numerical
| simulation). I read that Common

I also switched away from Python and C++ to Common Lisp about two
years ago partially for those reasons.

| Python is only slightly slower than compiled Common Lisp using ACL6.0
| (CLISP is 60% slower).

While I can't say why you didn't get better results with ACL6.0, I
took your programs, added my own C version, and ran them on my Debian
GNU/Linux x86 system with Python 1.5.2, CMUCL 2.4.20 and gcc 2.95.2
(-O3).  My approximate results were that the Python version was
about 80x slower than the Common Lisp version and the Common Lisp
version was about 2x slower than the C version.

| I also read that it is difficult to write fast Common Lisp code. Is

If I had to make such generalizations, I'd rather say it's easy to
write slow code.  Writing fast Common Lisp is not really harder
than with any other high level language, but you have to be
familiar with your implementation because different implementations
may penalize different approaches to some code.  For example, John
mentioned that ACL doesn't optimize the use of multidimensional
arrays well so you could write your code using one-dimensional
arrays, but then again, some other implementation might optimize
multidimensional arrays at least almost equally well (which is
probably the case with CMUCL).  These issues are probably
documented in the manual of your implementation; for example,
CMUCL's manual has about 50 pages about writing fast code.

-- 
Hannu
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A444700.DB4762F@worldnet.att.net>
Rolf Wester wrote:
> I frequently use Python and C++ but I don't like C++ very much 
Why?

> and Python is too slow
> for many of my problems (a lot of numerical simulation). I read that Common
> Lisp is a much better
> language than C++ 
In which way?

> and that it can be almost as fast as C. 
Physically impossible. Nothing is as fast as C, except perhaps where you apply a
domain-specific language to a domain-specific problem; there a domain tool may prove
more expressive and efficient. But for general-purpose development, comparing Lisp
with C is just as preposterous as comparing Java with C++. There's nothing to compare.

> I wrote a little program to test Common Lisp's performance and was
> quite disappointed with the result.
> P.S.: I also tried a C++ version, but it is so much faster that a
> comparison doesn't make any sense.
Precisely. Why not use some numerical C++ library, something like
http://www.mpi-sb.mpg.de/LEDA/, or http://www.oonumerics.org/, or
http://www.gel.ulaval.ca/~parizeau/C++/classes/doc/Classes/Classes.html.
From: Larry Elmore
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <PYY06.65273$x6.30483264@news2.rdc2.tx.home.com>
"Frankey" <···········@worldnet.att.net> wrote in message
·····················@worldnet.att.net...
> Rolf Wester wrote:
> > I frequently use Python and C++ but I don't like C++ very much
> Why?
>
> > and Python is too slow
> > for many of my problems (a lot of numerical simulation). I read that Common
> > Lisp is a much better
> > language than C++
>
> In which way?
>
> > and that it can be almost as fast as C.
> Physically impossible. Nothing is as fast as C, except perhaps where you
> apply a domain-specific language to a domain-specific problem; there a
> domain tool may prove more expressive and efficient. But for general-purpose
> development, comparing Lisp with C is just as preposterous as comparing
> Java with C++. There's nothing to compare.

I don't know where this is coming from. Good Fortran will outrun C in
numerical work. Good Ada code is at least the equal of C if the runtime
checks are turned off, as they are in just about all production programs. C
is close to the machine, but it's _too_ close in some ways. Because the
compiler has less information about the program to work with, C code is a
good deal harder to optimize than Fortran or Ada code. Lisp can provide the
compiler even more information and can often be nearly as fast as C, and in
some instances, IIRC, CMUCL can produce faster code. Lord knows the
development time will be less, and the debugging much easier (if the
programmer is at all competent).

Larry
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A44E33A.54880504@worldnet.att.net>
Larry Elmore wrote:
> "Frankey" <···········@worldnet.att.net> wrote in message
> > Physically impossible. Nothing is as fast as C, except perhaps where you
> > apply a domain-specific language to a domain-specific problem; there a
> > domain tool may prove more expressive and efficient. But for general-purpose
> > development, comparing Lisp with C is just as preposterous as comparing
> > Java with C++. There's nothing to compare.
> 
> I don't know where this is coming from. Good Fortran will outrun C in
> numerical work. Good Ada code is at least the equal of C 
Well, we were talking about Python and Lisp and general-purpose programming vs. specialist
languages/tasks. Fortran may or may not be OK for numerical tasks, but it's definitely not
good for, say, communications, whereas C or C++ will do just fine in both cases. Moreover,
the guy is using not Fortran, which would be reasonable, but Lisp! Good lord, that's not
the language you wanna use when you care about Python's sluggishness compared to C++ <g>.
Iow, let's not take things out of context.

(Of course, the real reason why Fortran is still used is the user base, not that it'll
outrun anything written in C. Your Fortran compiler itself is most likely written in C.
The opposite seems unlikely <g>.)
From: bowman
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <ho716.469$wE3.3796@newsfeed.slurp.net>
Frankey <···········@worldnet.att.net> wrote in message
>
> (Of course, the real reason why Fortran is still used is the user base,
> not that it'll outrun anything written in C. Your Fortran compiler itself
> is most likely written in C. The opposite seems unlikely <g>.)

I am not positive, but I'd be very surprised if the GNU g77 FORTRAN compiler
is anything but a preprocessor for gcc. Existing FORTRAN code tends to be
very dense; it compiles, runs, and is proven. Very few organizations will
take the risk of updating poorly documented and obscure, but working, code
where failure could be life threatening or endanger critical missions.

Even for new code, many scientists are fluent in FORTRAN and will write their
own routines; I've worked in environments like that. I'm a programmer, not a
PhD in biochemistry -- translating something I don't comprehend to C just
isn't worth the trouble, as an object file built from FORTRAN links just as
well as any other. As long as it exposes an entry point that I can call, I'm
happy.
From: Lieven Marchand
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m33dfen6wt.fsf@localhost.localdomain>
"bowman" <······@montana.com> writes:

> I am not positive, but I'd be very surprised if the GNU g77 FORTRAN compiler
> is anything but a preprocessor for gcc. 

Then be very surprised. g77 is a front end that translates FORTRAN
code to the GNU intermediate language, just like gcc translates C code
to that IL and g++ translates C++ code. (Actually these are all
driver programs. The real front ends are called cc1, f771, cc1plus,
etc.) Ironically enough, the intermediate language (RTL) is even
fairly lispy:

(jump_insn 24 23 26 (set (pc)
        (if_then_else (gt (cc0)
                (const_int 0))
            (pc)
            (label_ref 38))) 276 {bleu+4} (nil)
    (nil))


-- 
Lieven Marchand <···@bewoner.dma.be>
Lambda calculus - Call us a mad club
From: bowman
Subject: OT: GCC internals was Common Lisp and Python performance
Date: 
Message-ID: <_Dp16.8036$wE3.7638@newsfeed.slurp.net>
Lieven Marchand <···@bewoner.dma.be> wrote in message
>
> Then be very surprised. g77 is a front end that translates FORTRAN
> code to the GNU Intermediate Language, just like gcc translates C code
> to that IL and g++ translates C++ codes.

I haven't delved into gcc internals in a very long time. Did the IL come
about when gcc became the GNU Compiler Collection, or did it precede that?
Interesting that MS should now be trumpeting an IL base for C# etc.
From: Colin Walters
Subject: Re: OT: GCC internals was Common Lisp and Python performance
Date: 
Message-ID: <87hf3tsnpx.church.of.emacs@meta.verbum.org>
"bowman" <······@montana.com> writes:

> I haven't delved into gcc internals in a very long time. Did the IL
> come about when gcc became the GNU Compiler Collection, or did it
> precede that?  Interesting that MS should now be trumpeting an IL base
> for C# etc.

I'm pretty sure that GCC had its IL before it became a multi-language
compiler; intermediate languages are common even in many single-language
compilers, because they are easier to translate to assembly
than a data structure based more closely on the source code.
The CMUCL compiler actually uses two intermediate languages.
From: Lieven Marchand
Subject: Re: OT: GCC internals was Common Lisp and Python performance
Date: 
Message-ID: <m366k94thu.fsf@localhost.localdomain>
"bowman" <······@montana.com> writes:

> Lieven Marchand <···@bewoner.dma.be> wrote in message
> >
> > Then be very surprised. g77 is a front end that translates FORTRAN
> > code to the GNU Intermediate Language, just like gcc translates C code
> > to that IL and g++ translates C++ codes.
> 
> I haven't delved into gcc internals in a very long time. Did the IL come
> about when gcc became the GNU Compiler Collection or did it precede that?

It was there from the beginning. Since the backend was also designed
to allow code generation for a lot of different CPUs, configurable by
machine descriptions, you need a common point of contact. A machine
description is basically a mapping of the RTL operations to code, with
descriptions of cost, used registers, etc.

-- 
Lieven Marchand <···@bewoner.dma.be>
Lambda calculus - Call us a mad club
From: Jochen Schmidt
Subject: Re: OT: GCC internals was Common Lisp and Python performance
Date: 
Message-ID: <92f0it$6q9sk$2@ID-22205.news.dfncis.de>
bowman wrote:

> 
> Lieven Marchand <···@bewoner.dma.be> wrote in message
>>
>> Then be very surprised. g77 is a front end that translates FORTRAN
>> code to the GNU Intermediate Language, just like gcc translates C code
>> to that IL and g++ translates C++ codes.
> 
> I haven't delved into gcc internals in a very long time. Did the IL come
> about when gcc became the GNU Compiler Collection or did it precede that?
> Interesting that MS should now be trumpeting an IL base for C# etc.

GCC was initially written by Richard Stallman. As he seems to like
Lisp a lot (see Emacs, or the member list of the ANSI Common Lisp
standardization group), he chose a lispy style for the intermediate
language. There's also a Usenet discussion between RMS and the author of
Tcl somewhere on the web - a real flame-fest similar to that of Torvalds
and Tanenbaum.

Regards
Jochen

From: Lieven Marchand
Subject: Re: OT: GCC internals was Common Lisp and Python performance
Date: 
Message-ID: <m3ito4a2vc.fsf@localhost.localdomain>
Jochen Schmidt <···@dataheaven.de> writes:

> The GCC was initially written by Richard Stallman. As he seems to like
> lisp a lot (see Emacs or the member-lists of the ANSI CommonLisp standard
> group) he has chosen a lispy style of the intermediate language.

RMS didn't invent RTL. He took the idea from a thesis and a previous
compiler.

-- 
Lieven Marchand <···@bewoner.dma.be>
Lambda calculus - Call us a mad club
From: Joe Marshall
Subject: Re: OT: GCC internals was Common Lisp and Python performance
Date: 
Message-ID: <snn8m83m.fsf@content-integrity.com>
Erik Naggum <····@naggum.net> writes:

> * Jochen Schmidt
> | The GCC was initially written by Richard Stallman. As he seems to like
> | lisp a lot (see Emacs or the member-lists of the ANSI CommonLisp standard
> | group) he has chosen a lispy style of the intermediate language.
> 
> * Lieven Marchand
> | RMS didn't invent RTL.  He took the idea from a thesis and a previous
> | compiler.
> 
>   It is very important that people get credit for what they do.  RMS didn't
>   invent Lisp for Emacs, either, James Gosling did, ... 

Actually, Bernie Greenberg should be given credit for this idea.
Greenberg's Multics Emacs was the first Emacs written in Lisp (1978),
followed by Eine and Zwei (Weinreb et al) in 1978 and 1979.

Gosling's Emacs didn't appear until 1981.


From: Lieven Marchand
Subject: Re: OT: GCC internals was Common Lisp and Python performance
Date: 
Message-ID: <m366k39p97.fsf@localhost.localdomain>
Joe Marshall <···@content-integrity.com> writes:

> Actually, Bernie Greenberg should be given credit for this idea.
> Greenberg's Multics Emacs was the first Emacs written in Lisp (1978),
> followed by Eine and Zwei (Weinreb et al) in 1978 and 1979.
> 

http://www.multicians.org/mepap.html describes that implementation.

http://www.multicians.org/lcp.html describes the MACLisp compiler for
Multics.

-- 
Lieven Marchand <···@bewoner.dma.be>
Lambda calculus - Call us a mad club
From: Tim Bradshaw
Subject: Re: OT: GCC internals was Common Lisp and Python performance
Date: 
Message-ID: <ey3d7ebv87i.fsf@cley.com>
* Erik Naggum wrote:

>   It is very important that people get credit for what they do.  RMS
>   didn't invent Lisp for Emacs, either, James Gosling did, but RMS
>   chose to use Lisp, a very good thing for both Lisp and Emacs,
>   considering that the best that Gosling's Emacs can hope for is
>   that computer history buffs remember it. 

To quibble, I think that <someone> who wrote Multics Emacs was the
first person to do it in Lisp. There's an interesting article on this
somewhere (multics.org?) -- I think they had to do all sorts of
wonderful things to Multics to make it possible to have an interactive
editor at all.

--tim
From: Martin ``rydis'' Rydstr|m
Subject: Re: OT: GCC internals was Common Lisp and Python performance
Date: 
Message-ID: <xf11yur4i4n.fsf@licia.dtek.chalmers.se>
>>>>> "TB" == Tim Bradshaw <···@cley.com> writes:

TB> To quibble, I think that <someone> who wrote Multics Emacs was the
TB> first person to do it in Lisp. There's an interesting article on this
TB> somewhere (multics.org?) -- I think they had to do all sorts of
TB> wonderful things to Multics to make it possible to have an interactive
TB> editor at all.

There's a difference, however, between implementing an Emacs in Lisp
and implementing it in some other language and making a Lisp system
part of that implementation.

TTFN,

'mr

-- 
Martin Rydstr|m,  ·····@cd.chalmers.se ; rationalize till I'm blue in the face,
				       ; you cannot lose if you throw the race.
     Looking for a LispM to play with. Any and all offers/ideas appreciated.
Also       - " -    job     - " -                   - " -
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A469D8D.7BF18C99@worldnet.att.net>
bowman wrote:
> 
> Frankey <···········@worldnet.att.net> wrote in message
> >
> > (Of course, the real reason why Fortran is still used is the user base,
> > not that it'll outrun anything written in C. Your Fortran compiler itself
> > is most likely written in C. The opposite seems unlikely <g>.)
> 
> I am not positive, but I'd be very surprised if the GNU g77 FORTRAN compiler
> is anything but a preprocessor for gcc. Existing FORTRAN code tends to be
> very dense; it compiles, runs, and is proven. Very few organizations will
> take the risk of updating poorly documented and obscure, but working, code
> where failure could be life threatening or endanger critical missions.
> 
> Even for new code, many scientists are fluent in FORTRAN and will write their
> own routines; I've worked in environments like that. I'm a programmer, not a
> PhD in biochemistry -- translating something I don't comprehend to C just
> isn't worth the trouble, as an object file built from FORTRAN links just as
> well as any other. As long as it exposes an entry point that I can call, I'm
> happy.
A very good post. I've seen that happen many times, and not only with Fortran. People get
used to something, and then that's it. And people using Fortran (and similar specialized
tools) are normally scientists or engineers rather than programmers. They program, of
course, but the programming per se is secondary to what they do.
From: Marco Antoniotti
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <y6cbsu1kajh.fsf@octagon.mrl.nyu.edu>
Frankey <···········@worldnet.att.net> writes:

> Larry Elmore wrote:
> > "Frankey" <···········@worldnet.att.net> wrote in message
> > > Physically impossible. Nothing is as fast as C, except perhaps where you
> > > apply a domain-specific language to a domain-specific problem; there a
> > > domain tool may prove more expressive and efficient. But for general-purpose
> > > development, comparing Lisp with C is just as preposterous as comparing
> > > Java with C++. There's nothing to compare.
> > 
> > I don't know where this is coming from. Good Fortran will outrun C in
> > numerical work. Good Ada code is at least the equal of C 

> Well, we were talking of Python and Lisp and general-purpose
> programming vs. specialist languages/tasks. Fortran may or may not
> be OK for numerical tasks, but it's definitely not good for, say,
> communications, whereas C or C++ will do just fine in both
> cases. Moreover, the guy's using not Fortran, which would be
> reasonable, but Lisp! Good lord, that's not the language you wanna
> use when you care about Python's sluggishness compared to C++ <g>.
> Iow, let's not take things out of context.

I believe you are taking things out of context.  AFAIK Python does not have
a machine-language-level compiler (I might be wrong).  Common Lisp has
many commercial machine-language-level compilers and one public domain
one.  These compilers produce numerical code which is "good enough"
(i.e. as fast, or almost as fast, as C).  Of course you then have to
factor in the time that you spend chasing pointers around when your
program segfaults :)

> (Of course, the real reason why Fortran is still used is the user
> base and not that it'll outrun anything C. Your Fortran compiler
> itself is written in C most likely. The opposite seems unlikely
> <g>.)

You are mistaking a "network effect" (a lot of people use C/C++ because a
lot of people use C/C++) for a "technology feature" (C is Faster than
Anything).

Let's forget about the "network effect" here and concentrate on the
"technology" feature.  Here you are wrong.  The fact that the
C-written Fortran compiler produces faster code than C-written C
compilers is inherent in the design of Fortran (77, at least; I believe
F90 may be different).  From your postings it seems you do not know this
fact.

Of course, IMHO development time is what is more important nowadays.
That is why "scripting" languages are attractive, and why a language
like Common Lisp (which gives you the best of both worlds) has the
best combined features you can get.

Of course this is not enough to overcome network effects.  We wouldn't
be working on MS Software if it were that easy.

Cheers

-- 
Marco Antoniotti =============================================================
NYU Bioinformatics Group			 tel. +1 - 212 - 998 3488
719 Broadway 12th Floor                          fax  +1 - 212 - 995 4122
New York, NY 10003, USA				 http://galt.mrl.nyu.edu/valis
             Like DNA, such a language [Lisp] does not go out of style.
			      Paul Graham, ANSI Common Lisp
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A469EBA.2E06ADCE@worldnet.att.net>
Marco Antoniotti wrote:
> Frankey <···········@worldnet.att.net> writes:
> > Iow, let's not take things out of context.
> I believe you are taking things out of context.
Which wouldn't happen if you looked at the very first message in this thread <g>.

> AFAIK Python does not have
> a machine language level compiler (I might be wrong).  Common Lisp has
> many commercial machine language level compilers and one public domain
> one.  These compilers produce numerical code which is "good enough"
> (i.e as fast, or almost as fast, as C).  Of course you then have to
> factor in the time that you spend in chasing pointers around when you
> program segfaults :)
Red herring, that's what that is called. First, if you know what you're doing, you won't
spend any time "chasing pointers around." Second, again, let's not deviate from the
original problem--which was insufficient performance. Take the insufficient performance as
a given in that case. The guy said it's slow, let's trust him on that.

> Of course, IMHO development time is what is more important nowadays.
Not for the original poster--it was performance he was dissatisfied with <g>.
From: David Thornley
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <Bqr26.470$4c.210085@ruti.visi.com>
In article <·················@worldnet.att.net>,
Frankey  <···········@worldnet.att.net> wrote:
>Marco Antoniotti wrote:
>> Frankey <···········@worldnet.att.net> writes:
>> > Iow, let's not take things out of context.
>> I believe you are taking things out of context.
>Which wouldn't happen if you looked at the very first message in this thread <g>.
>
Really?  The guy was getting slow execution on his Lisp program.  That
was the issue.

>> (i.e as fast, or almost as fast, as C).  Of course you then have to
>> factor in the time that you spend in chasing pointers around when you
>> program segfaults :)

>Red herring, that's what that is called. First, if you know what you're doing, you won't
>spend any time "chasing pointers around."

Do you realize how many programmers don't know what they're doing,
by your definition?  On any reasonably large C or C++ project,
memory management is going to be a pain.

>Second, again, let's not deviate from the
>original problem--which was insufficient performance. Take the insufficient performance as
>a given in that case. The guy said it's slow, let's trust him on that.
>
Well, yes.  He wrote a toy Lisp program which was slow on the Lisp system
he tested it on, and that was the problem.

There are various ways to give him better performance.  One is to write
a Lisp program that executes faster.  Now, the casually written Lisp
program is likely to run much slower than its C equivalent, and this
is not necessarily a bad thing.  Most programs don't really need to
run all that fast, and of those that do, most of the code in them doesn't
need to run all that fast.  Therefore, it's reasonable to have a language
in which it is easy to write slow code that works, provided it is possible
to put in some work and make it run fast enough.

I'd recommend Norvig's "Paradigms of Artificial Intelligence Programming"
for somebody trying to write Lisp code with excellent performance: there
are a couple of good chapters on just that, and some demonstrations
elsewhere in the book.

>> Of course, IMHO development time is what is more important nowadays.
>Not for the original poster--it was performance he was dissatisfied with <g>.

That depends on what the original poster decides to do, doesn't it?  Right
now, he's dissatisfied with performance.  Next year, it may be different.
IMNSHO, it's faster to write a large Lisp program with good performance
than a large C program with good performance, and the Lisp program
may have better performance.  As the project grows, microefficiencies
become less important.


--
David H. Thornley                        | If you want my opinion, ask.
·····@thornley.net                       | If you don't, flee.
http://www.thornley.net/~thornley/david/ | O-
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A4A50CC.14766D01@worldnet.att.net>
David Thornley wrote:
> There are various ways to give him better performance.
I have one thing to tell you <g>: read his final posting now. Tell me what happened.
From: thi
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <y16u27poaw9.fsf@glug.org>
Frankey <···········@worldnet.att.net> writes:

> David Thornley wrote:
> > There are various ways to give him better performance.
> I have one thing to tell you <g>: read his final posting now. Tell me what
> happened.

dude, your posts are degenerate.

thi
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A4A5777.780C4318@worldnet.att.net>
thi wrote:
> Frankey <···········@worldnet.att.net> writes:
> > David Thornley wrote:
> > > There are various ways to give him better performance.
> > I have one thing to tell you <g>: read his final posting now. Tell me what
> > happened.
> 
> dude, your posts are degenerate.
Why <g>? Or is it an expression of distaste in general on your part? If so, well, f you
too :-)
From: Kenneth P. Turvey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <slrn94pi0b.ovv.kt-alt@pug1.sprocketshop.com>
On Mon, 25 Dec 2000 01:04:37 GMT, Frankey <···········@worldnet.att.net> wrote:

> Red herring, that's what that is called. First, if you know what you're
> doing, you won't spend any time "chasing pointers around." Second, again,
> let's not deviate from the original problem--which was insufficient
> performance. Take the insufficient performance as a given in that
> case. The guy said it's slow, let's trust him on that.

Any C programmer worth his salt would not claim that good C programmers
who "know what they're doing" don't spend time chasing pointers around.
C is a wonderful language for many applications, many of which would not
be possible without pointers, but pointers are a headache and make
programming more difficult even for the best of us.

>> Of course, IMHO development time is what is more important nowadays.
>Not for the original poster--it was performance he was dissatisfied with <g>.


-- 
Kenneth P. Turvey <······@SprocketShop.com> 
--------------------------------------------------------
  The whole aim of practical politics is to keep the populace alarmed
  [and hence clamorous to be led to safety] by menacing it with an
  endless series of hobgoblins, all of them imaginary.  -- H.L. Mencken
From: Marco Antoniotti
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <y6celyrkonv.fsf@octagon.mrl.nyu.edu>
Frankey <···········@worldnet.att.net> writes:

> Marco Antoniotti wrote:
> > Frankey <···········@worldnet.att.net> writes:
> > > Iow, let's not take things out of context.
> > I believe you are taking things out of context.
> Which wouldn't happen if you looked at the very first message in this thread <g>.
> 
> > AFAIK Python does not have
> > a machine language level compiler (I might be wrong).  Common Lisp has
> > many commercial machine language level compilers and one public domain
> > one.  These compilers produce numerical code which is "good enough"
> > (i.e as fast, or almost as fast, as C).  Of course you then have to
> > factor in the time that you spend in chasing pointers around when you
> > program segfaults :)

> Red herring, that's what that is called. First, if you know what
> you're doing, you won't spend any time "chasing pointers around."

Congratulations.  You are the best C programmer on the face of the
planet :)  K&R bow in front of you.  The people writing C debuggers
can go home now :)

> Second, again, let's not deviate from the original problem--which
> was insufficient performance. Take the insufficient performance as a
> given in that case. The guy said it's slow, let's trust him on that.

He asked whether the performance he got was the best he could get.
Simply by using a good Common Lisp compiler (something you did not
know existed, as many of your postings showed) he got much
better performance, close to that of a C program.  Exactly as
the people with more knowledge than you have pretty much pointed out.

> > Of course, IMHO development time is what is more important nowadays.
> Not for the original poster--it was performance he was dissatisfied
> with.

And he got the answers he asked for, and I believe he was satisfied
with them.  Then you came by and basically said
"(Common) Lisp is not a compiled language"....  Big Mistake! :)

Cheers

-- 
Marco Antoniotti =============================================================
NYU Bioinformatics Group			 tel. +1 - 212 - 998 3488
719 Broadway 12th Floor                          fax  +1 - 212 - 995 4122
New York, NY 10003, USA				 http://galt.mrl.nyu.edu/valis
             Like DNA, such a language [Lisp] does not go out of style.
			      Paul Graham, ANSI Common Lisp
From: David Bakhash
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m37l4cjr2w.fsf@cadet.dsl.speakeasy.net>
Frankey <···········@worldnet.att.net> writes:

> > Of course, IMHO development time is what is more important
> > nowadays.

> Not for the original poster--it was performance he was dissatisfied
> with <g>.

Great.  So why don't you just give up on Common Lisp, as it is as slow
as a turd, and go play with C and C++ again?

You know what I'm dissatisfied with?  It's that you just don't have a
clue about what you're talking about.  You've probably never even
developed something that _needed_ to go faster.  Have you?  Did you
try using Common Lisp, and therefore know from first-hand experience that
it is a slug?

I personally have found CL to be slow, but it's seldom that it just
didn't cut it because of speed.  Say you're developing a proprietary
web server for a large-scale application, and you care about how much
you get out of a single box, and don't care so much how much extra
effort goes into development.  Then C might indeed offer an
advantage.  I'm finding in my life that C and C++ are just
disappearing in terms of usefulness (especially in the problems I've
been seeing lately).  I see more applications using things like Java,
Perl, and Common Lisp for the benefits they provide.

dave
From: David Bakhash
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m3elykjrm9.fsf@cadet.dsl.speakeasy.net>
Marco Antoniotti <·······@cs.nyu.edu> writes:

> Let's forget about the "network effect" here and concentrate on the
> "technology" feature.  Here you are wrong.  The fact that the
> C-written Fortran compiler produces faster code than C-written C
> compilers is inherent in the design of Fortran (77, at least; I believe
> F90 may be different).  From your postings it seems you do not know this
> fact.

I wish I understood why people think it's important what language a
source-to-machine-language compiler is written in.  It makes no
difference whatsoever.  You can write a Fortran compiler in TCL or in
C.  The bottom line is that a compiler basically translates from one
language to another, and optimizes along the way.  There's nothing
inherent about the language a compiler is written in AT ALL!  The
original poster who mentioned this is seriously confused.

> Of course, IMHO development time is what is more important nowadays.

I don't think that execution time ever became less important.  But
it's true that with web-based applications, the latency of the
Internet is much more severe than a couple of extra milliseconds here
and there.

> That is why "scripting" languages are attractive, and why a language
> like Common Lisp (which gives you the best of both worlds) has the
> best combined features you can get.

While one can argue that CL, along with some macros and string
manipulation libraries, can make a damn good scripting language, CL is 
in and of itself NOT such a language.  But admittedly, if I were running
under a single Lisp world where there was already a listener
running, and I could just borrow a thread in the null environment to
run a small thunk, I'd like to use Common Lisp as the scripting
language.  But since *I* don't know how to do the script-type things
in CL that I do in Perl, I'll stick to Perl.

Another way to go about it would be to start developing for CL the kinds
of tools that Perl has.  But the problem is that there are things
in CL that aren't standardized, and it all just becomes a mess.
Personally, I'd just stick to CL for solving hard problems involving
complex hierarchical data and algorithms, but not messy but important
stuff like what I and many others use Perl for.  If others can manage
to use Common Lisp for the kinds of things that I use Perl for, well,
then all the power to them.

> Of course this is not enough to overcome network effects.  We
> wouldn't be working on MS Software if it were that easy.

You _choose_ to use MS.  Not everyone does, believe it or not.  Many
people have successfully ousted MS from their work.  I'm getting
there; things like Dragon NaturallySpeaking, which I like, run only on
MS Windows.

dave
From: Rainer Joswig
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <joswig-EB64BD.02244805012001@news.is-europe.net>
In article <··············@cadet.dsl.speakeasy.net>, David Bakhash 
<·····@alum.mit.edu> wrote:

> Another way about it would be to start developing similar types of
> tools for CL that Perl has.  But the problem is that there are things
> in CL that arn't standardized, and it all just becomes a mess.

Some are not even standardized in PERL in the sense
of being "standardized" in CL. The CL community
has/had several levels of standards:

- formal standards like ANSI CL or OMG's CORBA&CL
- community standards ( can't think of a better word right now)
  like CLtL1, CLtL2, CLIM1, CLIM2, CLOS MOP
- reference implementation standards like PCL
- portability standards like MK:DEFSYSTEM, SERIES, CL-HTTP,
  etc. (e.g. anything that is maintained and aims to be
  portable over a wide variety of platforms)
- "mother of implementation" standards like Genera, other
  copying from
- platform specific library standards like the Quicktime
  interface everybody uses for MCL

Since CL is not PERL, and does not even want to be, the
comparison is misleading.

Common Lisp is a platform, a philosophy and a bunch of
really different variations on a theme. CLICC
is completely different from Symbolics Common Lisp.
Macintosh Common Lisp has completely different goals
than CMU CL. "L" is not aiming to be compatible
with the goals of Liquid Common Lisp. And so on.

PERL is not CL.

If you think about "a programming language under Unix that
is good for 'scripting'" and can be ported somehow to
Windose and maybe even the Mac and does have a (mostly)
single implementation - this is not Common Lisp. If
you want a Common Lisp, which is good at that, then a
modified/extended/minimized
CLisp is a good start for that. Call it SL
(Scripting Lisp, Script Language, ...), define it,
publish it, maintain it, evangelize it, ... just Do It.
Or just use PERL - since it is there.

> Personally, I'd just stick to CL for solving hard problems involving
> complex hierarchical data and algorithms, but not messy but important
> stuff like what I and many others use Perl for.  If others can manage
> to use Common Lisp for the kinds of things that I use Perl for, well,
> then all the power to them.

One thing that always makes me curious is, for what do people
need PERL for in the first place? I never really had the
need of PERL, regular expressions and the like (I have the
need for Persistent/Distributed CLOS, GUI Libs, and stuff
like that). Isn't the need for PERL a sign of the "wrong"
approach (manipulating text files instead of working with
objects)?

This is really puzzling me for a long time. I'm using computers
for almost twenty years and I don't use PERL/Python/C/C++/...
What am I missing? What am I doing wrong? Is it me?
I even never felt the need for PERL. Can you help me
to get the idea?

> > Of course this is not enough to overcome network effects.  We
> > wouldn't be working on MS Software if it were that easy.
> 
> You _choose_ to use MS.  Not everyone does, believe it or not.  Many
> people have successfully ousted MS from their work.

I gave my PC away a long time ago.

Rainer Joswig

-- 
Rainer Joswig, Hamburg, Germany
Email: ·············@corporate-world.lisp.de
Web: http://corporate-world.lisp.de/
From: Aaron Kushner
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <tdo339.0vl.ln@vermin.dsl.snfc21.pacbell.net>
Rainer Joswig <······@corporate-world.lisp.de> wrote:

> One thing that always makes me curious is, for what do people
> need PERL for in the first place? I never really had the
> need of PERL, regular expressions and the like (I have the
> need for Persistent/Distributed CLOS, GUI Libs, and stuff
> like that). Isn't the need for PERL a sign of the "wrong"
> approach (manipulating text files instead of working with
> objects)?

> This is really puzzling me for a long time. I'm using computers
> for almost twenty years and I don't use PERL/Python/C/C++/...
> What am I missing? What am I doing wrong? Is it me?
> I even never felt the need for PERL. Can you help me
> to get the idea?

Hello Rainer,

I'll assume that you ask the above questions in seriousness and
not as a troll...

I never really had the need to use CL, but then, as a full-time Oracle
DBA/Unix admin, that should not be any more surprising than you not
needing Perl.

What are Perl and regular expressions good for? Here is a snippet of
code from an automated email responder I wrote that gives out license
keys for our source management product. Essentially, it reads in the
incoming message, grabs the headers, and creates a new outgoing message.

	($inHeader, $inMessage) = split(/\n\n/,$incoming,2);
        # Save headers so we can send mail back to user
        @headers = split(/\n/,$inHeader);
        foreach $hdr (@headers) {
                (undef, $mail{'Subject'}) = split(/: /,$hdr) 
                    if $hdr =~ m/^Subject:/;
                (undef, $mail{'From'}) = split(/: /,$hdr) 
                    if $hdr =~ m/^To:/;
                (undef, $mail{'To'}) = split(/: /,$hdr) 
                    if $hdr =~ m/^From:/;
                (undef, $mail{'To'}) = split(/: /,$hdr) 
                    if $hdr =~ m/^Reply-To:/;
        }
        # Check for FQDN
        if ($mail{'From'} =~ m/\@/) {
                ($id,undef) = split(/\@/,$mail{'From'});
        } else {
                $id = $mail{'From'};
        }
        &archive_msg($incoming,$mail{'To'});
        $Message = "From: $licensefrom\n" .
                   "Date: $time\n" .
                   "To: $mail{'To'}\n" .
                   "Cc: $admins\n" .
                   "Subject: BitKeeper License\n\n" .
                   "$mail{'Message'}";
        open (MAIL , "|$sendmail -t");

(The program was maybe another 40 lines of code and the message itself)

The entire program took less than 30 minutes and probably would have
taken a lot longer in C or lisp. I'd love to be proven wrong, though,
and find that it could be done quicker in CL.
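
My guess at a rough CL equivalent of the header-grabbing part -- an
untested sketch, only standard CL, sharing the same quick-and-dirty
simplifications as the Perl above (GRAB-HEADERS is just a name I made
up) -- would be something like:

(defun grab-headers (in-header)
  ;; IN-HEADER is the raw header block as a single string.  Returns an
  ;; alist of ("Name" . "value") pairs, split naively at the first ": ".
  (with-input-from-string (s in-header)
    (let ((alist '()))
      (loop
        (let ((line (read-line s nil nil)))
          (when (null line)
            (return (nreverse alist)))
          (let ((colon (search ": " line)))
            (when colon
              (push (cons (subseq line 0 colon)
                          (subseq line (+ colon 2)))
                    alist))))))))

;; (cdr (assoc "From" (grab-headers in-header) :test #'string-equal))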

Other useful programs I write in Perl that are ideal for the language:
  logfile analyzers, database loaders for Oracle, tools for converting
  troff source into HTML, programs for bulk conversion of mp3 files at
  a music distribution website.

Definitely not enterprise apps, but useful programs in any case.
  
> like that). Isn't the need for PERL a sign of the "wrong"
> approach (manipulating text files instead of working with
> objects)?

What happens when reports from a legacy COBOL system need to be inserted
into an RDBMS data warehouse? Ideally, things like this shouldn't happen,
but they end up happening all of the time in my world.

> I gave my PC away a long time ago.

Sorry to hear about that :-)


Regards,
Aaron Kushner
Bitkeeper Sales Engineer
http://www.bitkeeper.com  (distributed source management tools)
From: Rob Warnock
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <934c7a$ljcic$1@fido.engr.sgi.com>
Aaron Kushner  <········@pacbell.net> wrote:
+---------------
| ...an automated email responder I wrote...  it reads in the incoming
| message, grabs the headers, and creates a new outgoing message.
| 	($inHeader, $inMessage) = split(/\n\n/,$incoming,2);
|         # Save headers so we can send mail back to user
|         @headers = split(/\n/,$inHeader);
|         foreach $hdr (@headers) {
|                 (undef, $mail{'Subject'}) = split(/: /,$hdr) 
|                     if $hdr =~ m/^Subject:/;
|                 (undef, $mail{'From'}) = split(/: /,$hdr) 
|                     if $hdr =~ m/^To:/;
|                 (undef, $mail{'To'}) = split(/: /,$hdr) 
|                     if $hdr =~ m/^From:/;
|                 (undef, $mail{'To'}) = split(/: /,$hdr) 
|                     if $hdr =~ m/^Reply-To:/;
|         }
+---------------

I don't claim to be a Perl expert, but this code looks seriously
broken w.r.t. the RFC 822 syntax for headers. AFAICT it doesn't
handle continuation lines at all!!


-Rob

-----
Rob Warnock, 31-2-510		····@sgi.com
SGI Network Engineering		http://reality.sgi.com/rpw3/
1600 Amphitheatre Pkwy.		Phone: 650-933-1673
Mountain View, CA  94043	PP-ASEL-IA
From: Raymond Wiker
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <86y9wq2ot3.fsf@raw.grenland.fast.no>
····@rigden.engr.sgi.com (Rob Warnock) writes:

> Aaron Kushner  <········@pacbell.net> wrote:
> +---------------
> | ...an automated email responder I wrote...  it reads in the incoming
> | message, grabs the headers, and creates a new outgoing message.
> | 	($inHeader, $inMessage) = split(/\n\n/,$incoming,2);
> |         # Save headers so we can send mail back to user
> |         @headers = split(/\n/,$inHeader);
> |         foreach $hdr (@headers) {
> |                 (undef, $mail{'Subject'}) = split(/: /,$hdr) 
> |                     if $hdr =~ m/^Subject:/;
> |                 (undef, $mail{'From'}) = split(/: /,$hdr) 
> |                     if $hdr =~ m/^To:/;
> |                 (undef, $mail{'To'}) = split(/: /,$hdr) 
> |                     if $hdr =~ m/^From:/;
> |                 (undef, $mail{'To'}) = split(/: /,$hdr) 
> |                     if $hdr =~ m/^Reply-To:/;
> |         }
> +---------------
> 
> I don't claim to be a Perl expert, but this code looks seriously
> broken w.r.t. the RFC 822 syntax for headers. AFAICT it doesn't
> handle continuation lines at all!!

        It also doesn't handle header lines with embedded ": "
strings, either... I guess this counts as an argument against Perl's
supposed ease-of-use :-)

        It seems that the final return address is going to be
determined by the last header in the set {"From", "Reply-To"}. This is
wrong; if "Reply-To" is specified, it should be used - the ordering of
the headers is (obviously) not important. It may also be appropriate
to use "Sender" in preference to "From" (not sure what the appropriate
RFCs say about that).

-- 
Raymond Wiker
·············@fast.no
From: Rob Warnock
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <934dkr$ljgr1$1@fido.engr.sgi.com>
Raymond Wiker  <·············@fast.no> wrote:
+---------------
| ····@rigden.engr.sgi.com (Rob Warnock) writes:
| > I don't claim to be a Perl expert, but this code looks seriously
| > broken w.r.t. the RFC 822 syntax for headers. AFAICT it doesn't
| > handle continuation lines at all!!
| 
|         It also doesn't handle header lines with embedded ": "
| strings, either... I guess this counts as an argument against Perl's
| supposed ease-of-use :-)
| 
|         It seems that the final return address is going to be
| determined by the last header in the set {"From", "Reply-To"}. This is
| wrong; if Reply-To" is specified, it should be used - the ordering of
| the headers is (obviously) not important.
+---------------

Exactly my point. People tend to write "quick & dirty" Perl that
*almost*-but-not-quite works for these sorts of tasks, whereas if
you actually handle all the edge conditions the logic of the program
gets quite complex. It isn't just a few dozen lines of Perl any more!

Yes, one can of course process RFC 822 messages "properly" in Perl, but
the code is large enough that personally I'd much rather be using Lisp.
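
The unfolding part, at least, is short even in portable CL -- a
minimal, untested sketch (the function names are made up): join any
line that starts with whitespace onto the previous header line, then
split each logical line at the *first* colon, which also takes care of
values containing ": ".

(defun unfold-headers (lines)
  ;; RFC 822 continuation lines start with a space or a tab; append each
  ;; one to the preceding logical header line.
  (let ((result '()))
    (dolist (line lines (nreverse result))
      (if (and result
               (plusp (length line))
               (member (char line 0) '(#\Space #\Tab)))
          (setf (first result)
                (concatenate 'string (first result) " "
                             (string-trim '(#\Space #\Tab) line)))
          (push line result)))))

(defun parse-header-line (line)
  ;; Split at the first colon only; returns ("Name" . "value") or NIL.
  (let ((colon (position #\: line)))
    (when colon
      (cons (subseq line 0 colon)
            (string-left-trim '(#\Space #\Tab) (subseq line (1+ colon)))))))

;; (mapcar #'parse-header-line (unfold-headers list-of-header-lines))
;; then look "Reply-To" up before falling back to "From".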


-Rob

-----
Rob Warnock, 31-2-510		····@sgi.com
SGI Network Engineering		http://reality.sgi.com/rpw3/
1600 Amphitheatre Pkwy.		Phone: 650-933-1673
Mountain View, CA  94043	PP-ASEL-IA
From: Craig Brozefsky
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <87d7e27yg0.fsf@piracy.red-bean.com>
····@rigden.engr.sgi.com (Rob Warnock) writes:

> Exactly my point. People tend to write "quick & dirty" Perl that
> *almost*-but-not-quite works for these sorts of tasks, whereas if
> you actually handle all the edge conditions the logic of the program
> gets quite complex. It isn't just a few dozen lines of Perl any
> more!

Well, he could very well have just used the existing Perl modules for
handling that stuff properly, rather than whipping up a quick,
incomplete solution.  They really are quite nice.

I've also not had problems with Perl in terms of handling complex
logic.  My biggest issue has always been in building and manipulating
data structures of any significance.  The syntax for references,
hashes, arrays and embedding them in one another is just atrocious
and generates tons of bugs.  The object system is too elementary for
my tastes as well.

> Yes, one can of course process RFC 822 messages "properly" in Perl, but
> the code is large enough that personally I'd much rather be using Lisp.

Using the existing message and mail modules would not have made *his*
code larger. It would have reduced it in size and made it more robust.

In the unix world of text files, manipulating strings becomes very
important, since it's the primary representation for nearly all of
your data.  So for many tasks there is an initial friction that must
be overcome, and that's the mangling of strings in order to get to,
or produce, your data.  Perl is good for this because of its heritage,
its shared idiom with other unix tools.  Beyond that, it quickly
approaches its limits.  It's a trade language.

ps: I hope the poster of the code doesn't take any of this personally.
I've whipped up enough grotesque and dysfunctional Perl code in my
time that I can't possibly hold anyone at fault for a quickie.
From: David Bakhash
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m3elyaau9o.fsf@cadet.dsl.speakeasy.net>
Craig Brozefsky <·····@red-bean.com> writes:

> ····@rigden.engr.sgi.com (Rob Warnock) writes:
> 
> > Exactly my point. People tend to write "quick & dirty" Perl that
> > *almost*-but-not-quite works for these sorts of tasks, whereas if
> > you actually handle all the edge conditions the logic of the
> > program gets quite complex. It isn't just a few dozen lines of
> > Perl any more!
> 
> Well, he could very well have just used the existing Perl modules
> for handling that stuff properly, rather than whipping up a quick,
> incomplete solution.  They really are quite nice.

Yeah, duh.  In this sense, Perl is advantageous over Common Lisp
specifically _because_ Perl has these libraries, and one doesn't have
to re-create them.

> I've also not had poblems with Perl in terms of handling complex
> logic.

For my definition of `complex', Perl is hard to use for a pretty large 
set of interesting and very complex tasks.  CL is much more suitable.

It's all relative.  If you know CL and Perl, and if both have the
basic abilities to solve the problem you're working on (i.e. both have 
the basic facilities to do the job reasonably), and the price issues
don't get in the way, then it's just a matter of picking the one which 
will work better for your development needs.  As the problem becomes
bigger and more complex, I think CL starts to look more and more
likely to be better suited, in the general case.

> In the unix world of text files, manipulating strings becomes very
> important, since it's the primary representation for nearly all of
> your data.  So for many tasks there is an initial friction that must
> be overcome, and that's that mangling of strings in order to get to,
> or produce your data.  Perl is good for this because of it's
> heritage, it's shared idiom with other unix tools.  Beyond that, it
> quickly approaches it's limits.  It's a trade language.

Yeah.  the only built-ins that I've noticed to be useful for dealing
with strings in CL are string=, string<, search, member, and their
cousins (e.g. string-lessp, etc).  Regexps are lots more versatile,
and even ACL's regexps are a bit too cumbersome to look at for my
taste.  So, as always, Perl wins in string manipulation, hands down.
It's just a question of how important such things are.  Most serious
software products don't need regexp-level strength when dealing with
strings.  Scripts that deal with forms and various report formats do,
which is one reason I've argued that CL is not the best scripting
language by far.
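
Sure, you can roll your own helpers on top of POSITION and SUBSEQ --
e.g. this untested little sketch of a SPLIT-ON (a made-up name) -- but
that's exactly the kind of thing Perl hands you for free:

(defun split-on (char string)
  ;; Split STRING at every occurrence of CHAR, using only standard CL.
  (loop for start = 0 then (1+ pos)
        for pos = (position char string :start start)
        collect (subseq string start pos)
        while pos))

;; (split-on #\: "Subject: foo")  =>  ("Subject" " foo")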

dave
From: Dorai Sitaram
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <9420bj$gv$1@news.gte.com>
In article <··············@cadet.dsl.speakeasy.net>,
David Bakhash  <·····@alum.mit.edu> wrote:
>Craig Brozefsky <·····@red-bean.com> writes:
>
>> In the unix world of text files, manipulating strings becomes very
>> important, since it's the primary representation for nearly all of
>> your data.  So for many tasks there is an initial friction that must
>> be overcome, and that's that mangling of strings in order to get to,
>> or produce your data.  Perl is good for this because of it's
>> heritage, it's shared idiom with other unix tools.  Beyond that, it
>> quickly approaches it's limits.  It's a trade language.
>
>Yeah.  the only built-ins that I've noticed to be useful for dealing
>with strings in CL are string=, string<, search, member, and their
>cousins (e.g. string-lessp, etc).  Regexps are lots more versatile,
>and even ACL's regexps are a bit too cumbersome to look at for my
>taste.  So, as always, Perl wins in string manipulation, hands down.
>It's just a question of how important such things are.  Most serious
>software products don't need regexp-level strength when dealing with
>strings.  Scripts that deal with forms and various report formats do,
>which is one reason I've argued that CL is not the best scripting
>language by far.

The Perl regexp engine plus syntax, as helpful and
wonderful and compact as it is, is just locally
expressible text-manipulation and is easy to plug
portably into Scheme/CL, e.g.,
http://www.cs.rice.edu/~dorai/pregexp .  The speed
penalty is not a concern for most scripts.  I think
Perl's only strength -- but it's a big one -- is
its general integration with the OS, the sort of thing
where powerful abstraction facilities are not enough if
you don't have the seed primitives to abstract upon.  
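
For example, after loading that library, something along these lines
(untested, and assuming the function names documented on that page):

(pregexp-match "^Subject: (.*)" "Subject: Common Lisp and Python performance")
;; => roughly ("Subject: Common Lisp and Python performance"
;;             "Common Lisp and Python performance")

(pregexp-split ": " "Subject: hello")
;; => roughly ("Subject" "hello")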

--d
From: Bulent Murtezaoglu
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <87pui2j7xr.fsf@kapi.internal>
I do NOT mean to pick on the original poster of the perl code but it
seems to me that this is "worse is better" at its best!  The code
fragment might be buggy, but "it works" where working is taken to mean 
"it does what we need for a high enough proportion of the cases at the 
present time."  So nothing new there, then.

I guess I'd mark this a "fragile hack" if I had written it independent
of the language I used (I am old enough to not say I would never have
written it this way!).

BM 
From: Aaron Kushner
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <vev439.u8n.ln@vermin.dsl.snfc21.pacbell.net>
Bulent Murtezaoglu <··@acm.org> wrote:

> I do NOT mean to pick on the original poster of the perl code but it
> seems to me that this is "worse is better" at its best!  The code
> fragment might be buggy, but "it works" where working is taken to mean 
> "it does what we need for a high enough proportion of the cases at the 
> present time."  So nothing new there, then.

> I guess I'd mark this a "fragile hack" if I had written it independent
> of the language I used (I am old enough to not say I would never have
> written it this way!).

> BM 

I don't take it as being picked on, and I enjoy the feedback. You all made
valid points that the program has bugs. However, none of the bugs really
affected this program, since it really only requires the From address to
send back the license key to the user. Having just read the "Worse
is Better" paper last week, I am sort of amused that you mentioned it.

One fact that many of you might find interesting is that we don't use
Perl in our product, for the reasons that Rob Warnock mentioned. Our CTO
actually made a mandate stating that we are not allowed to use Perl for
anything, since it leads to ugly hackery and hard-to-find subtle bugs (in
fact, I think I actually grabbed the mail header code from a program our
CTO wrote :-). The core of our SCM tools is written in ANSI C (using the
1988 version of the System V Interface Definition) and the GUI is written
in Tcl/Tk.

One of the disadvantages I have found while learning Lisp/Scheme is
that there doesn't seem to be a good body of available code for sysadmin
tasks like the one I wrote that I can learn from (or I might have been
looking in the wrong places). In contrast, there are many books and web
sites that show how to do system and web tasks using Perl. I find that
I learn best when I read code that others have written.

Regards,
Aaron

--
Aaron Kushner				···@bitmover.com
BitKeeper Sales Engineer		http://www.bitkeeper.com
550 Valley St.				415-401-8808
San Francisco, CA 94131			Distributed SCM Tools
From: Vebjorn Ljosa
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <cy3puhxdw6g.fsf@gnoll.pvv.ntnu.no>
* ····@rigden.engr.sgi.com (Rob Warnock)
| 
| I don't claim to be a Perl expert, but this code looks seriously
| broken w.r.t. the RFC 822 syntax for headers. AFAICT it doesn't
| handle continuation lines at all!!

speaking of RFC 822---does anybody have Common Lisp functions for
parsing these headers _properly_?

-- 
Vebjorn
From: Ray Blaak
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m3wvcajw1a.fsf@ns44.infomatch.bc.ca>
Rainer Joswig <······@corporate-world.lisp.de> writes:
> One thing that always makes me curious is, for what do people need PERL for
> in the first place? I never really had the need of PERL, regular expressions
> and the like [...] This is really puzzling me for a long time. I'm using
> computers for almost twenty years and I don't use PERL/Python/C/C++/...  What
> am I missing? What am I doing wrong? Is it me?  I even never felt the need
> for PERL. Can you help me to get the idea?

I use Perl for simple repetitive jobs, for example when I have to
find/replace/change stuff in hundreds of source files at once.

Given that Windows/DOS has brain-dead command line facilities, and Unix
shell/sed scripts are abominations, Perl lets me make the computer do all the
work instead of me manually repeating the same kind of actions over and over
again to a collection of source files (e.g. changing the package name for all
Java sources in a package, adding/changing copyright notices, renaming a method
and all its overridden versions, deleting a rampant but obsolete code pattern,
searching for particular code snippets, etc.).

By using Perl as my scripting language for these kinds of tasks, I don't
usually have to care what kind of box I am running, don't have to remember
findstring vs grep, etc., and can write a new little utility with a minimum of
fuss.

On the other hand, I also have never used Perl for anything "real" either, and
that's almost twenty years of computers for me too.

On yet the other hand, I also write Elisp snippets to do the same sorts of
things from Emacs. In my experience, however, doing things from Emacs often
tends to require a little more manual intervention, so the rule of thumb seems
to be: for an action applied to a few tens of files Elisp is fine; for
hundreds and more, Perl tends to do better.

-- 
Cheers,                                        The Rhythm is around me,
                                               The Rhythm has control.
Ray Blaak                                      The Rhythm is inside me,
·····@infomatch.com                            The Rhythm has my soul.
From: Craig Brozefsky
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <87lmsq8mlc.fsf@piracy.red-bean.com>
Rainer Joswig <······@corporate-world.lisp.de> writes:

<much I agree with>

> like that). Isn't the need for PERL a sign of the "wrong"
> approach (manipulating text files instead of working with
> objects)?

I've worked quite a bit with CLOS, ObjC and various other
object-oriented technologies.  I've also worked a lot with the unix
environment and its penchant for text files.  I'm very hesitant to
say that objects are superior to text files.  

Issues of object persistence, identity, and lifecycle can become
overwhelming in many situations where serializing stuff into a text
file and manipulating it in that form provides a simpler path to the
solution.  Of course there are also places where object lifecycle
concerns are trivialized by the complexity of the manipulations one
must perform on the data, and the cognitive aids objects provide come
in real handy.  Luckily CL can deal with both quite easily.

I don't think manipulating text files instead of working with objects
can be considered a "wrong approach" in the general sense.
From: ·@b.c
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <8ir84tg5fqum3q21dr5e63q96mslggnf14@4ax.com>
On Sat, 23 Dec 2000 06:25:49 GMT, Frankey <···········@worldnet.att.net>
wrote:

>Physically impossible. Nothing is as fast as C

That's true in the physical universe, but software is different.
From: Marco Antoniotti
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <y6c8zp5k9zz.fsf@octagon.mrl.nyu.edu>
·@b.c writes:

> On Sat, 23 Dec 2000 06:25:49 GMT, Frankey <···········@worldnet.att.net>
> wrote:
> 
> >Physically impossible. Nothing is as fast as C
> 
> That's true in the physical universe, but software is different.

INTERCAL transcends the physical universe.

Cheers

-- 
Marco Antoniotti =============================================================
NYU Bioinformatics Group			 tel. +1 - 212 - 998 3488
719 Broadway 12th Floor                          fax  +1 - 212 - 995 4122
New York, NY 10003, USA				 http://galt.mrl.nyu.edu/valis
             Like DNA, such a language [Lisp] does not go out of style.
			      Paul Graham, ANSI Common Lisp
From: Eugene Zaikonnikov
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <6y8zp71gij.fsf@viking.cit>
* "Frankey" == Frankey  <···········@worldnet.att.net> writes:

Frankey>  Rolf Wester wrote:

>> and that it can be almost as fast as C.

Frankey>  Physically impossible. Nothing is as fast as C

Heck! C is a sluggish memory hog. Consider using assembler instead:
there you get the *real* speed and unmatched control over the machine.

-- 
  Eugene
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A44E40A.633731E2@worldnet.att.net>
Eugene Zaikonnikov wrote:
> 
> * "Frankey" == Frankey  <···········@worldnet.att.net> writes:
> 
> Frankey>  Rolf Wester wrote:
> 
> >> and that it can be almost as fast as C.
> 
> Frankey>  Physically impossible. Nothing is as fast as C
> 
> Heck! C is a sluggish memory hog. Consider using assembler instead:
> there you get the *real* speed and unmatched control over the machine.
Well, jokes aside, yes in the context of this discussion, you're absolutely right. 'course
life is multivariate and assembly may be--practically--a bit too much. C++ strikes the
right balance when performance matters. When it doesn't, for many specialized tasks it is
too low level--unless one has a good library, of course. 

> 
> --
>   Eugene
From: Eugene Zaikonnikov
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <6yk88ryp61.fsf@viking.cit>
* "Frankey" == Frankey  <···········@worldnet.att.net> writes:

Frankey>  C++ strikes the right balance when performance matters. When
Frankey>  it doesn't, for many specialized tasks it is too low
Frankey>  level--unless one has a good library, of course.

What many people dislike about C++ is that its object system doesn't
feel *native*: kinda like a third hand implanted on the back. I heard
that Objective C is better in this regard (never tried it myself
though). OTOH Lisp is versatile enough to define an object system
within the language.
Some Lisp compilers (e.g. CMUCL) produce very fast numeric code,
generally not much worse than that yielded by popular C compilers.
Those that don't can call foreign number-crunching code if it becomes
a performance bottleneck.
But if you care about a 20% speed penalty more than about a few spare
weeks spent in the debugger, C may be a good option.
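
For a concrete (untested) flavour of what such declared numeric code
looks like:

(defun dot (a b)
  (declare (optimize (speed 3) (safety 0))
           (type (simple-array single-float (*)) a b))
  (let ((sum 0.0))
    (declare (single-float sum))
    (dotimes (i (length a) sum)
      (incf sum (* (aref a i) (aref b i))))))

;; (disassemble 'dot) shows whether the float arithmetic was open-coded.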

-- 
  Eugene
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A469926.D4C5DDB3@worldnet.att.net>
Eugene Zaikonnikov wrote:
> * "Frankey" == Frankey  <···········@worldnet.att.net> writes:
> Frankey>  C++ strikes the right balance when performance matters. When
> Frankey>  it doesn't, for many specialized tasks it is too low
> Frankey>  level--unless one has a good library, of course.
> 
> What many people dislike about C++ is that it object systems doesn't
> feels *native*: kinda like a third hand implanted on the back. 
Assuming I know what you mean <g>: that's for a reason. Like I said, it strikes the right
balance. Smalltalk's object model may feel very native, but then it's unusable for many
tasks because this "nativity" is not without cost. Having said that, I feel that perhaps I
should ask you what you actually meant by C++'s object system not feeling right. C++ has a
very complete object model.

> OTOH Lisp is versatile enough to define an object system
> within the language.
I'm just learning it now (thus my potentially flawed opinions <g>.) Don't know why I do
it... just curious, I guess.  Someone said, learn at least one new programming language
every year, so I'm making an effort!

> Some Lisp compilers (e.g. CMUCL) produce very fast numeric code,
> generally not much worse than yielded by popular C compilers. Those
> who don't can call foreign number-crunching code if it makes
> performance bottleneck.
> But if you care about 20% speed penalty more than about few spare
> weeks spent in debugger, C may be a good option.
Precisely. And btw, if you say Lisp is versatile enough to define its own object system,
you'll have to admit that so can C <g>. It's sufficiently low-level that you can code up any
"object system" at all.
From: David Bakhash
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m3bstqvfpb.fsf@cadet.dsl.speakeasy.net>
Frankey <···········@worldnet.att.net> writes:

> Eugene Zaikonnikov wrote:

> > OTOH Lisp is versatile enough to define an object system within
> > the language.
> I'm just learning it now (thus my potentially flawed opinions <g>.)
> Don't know why I do it... just curious, I guess.  Someone said, learn
> at least one new programming language every year, so I'm making an
> effort!
> 
> > Some Lisp compilers (e.g. CMUCL) produce very fast numeric code,
> > generally not much worse than that yielded by popular C compilers.
> > Those that don't can call foreign number-crunching code if it
> > becomes a performance bottleneck.  But if you care about a 20%
> > speed penalty more than about a few spare weeks spent in the
> > debugger, C may be a good option.
> Precisely. And btw, if you say Lisp is versatile enough to define its
> own object system, you'll have to admit that so can C <g>. It's
> sufficiently low-level that you can code up any "object system" at all.

Yes.  This is true.  I think I see what Eugene was trying to say, but
it didn't come out right.

Originally, I think CLOS was implemented in Lisp itself, with an extensive
macro layer.  That is to say that in CL, the object model could be (as it
was, and still is in some implementations) just bootstrapped in a
pretty way: a package (often called `CLOS') was created, written in
non-CLOS CL, which implemented CLOS in CL, and the CL package then picked
up the right symbols (including a very powerful macro layer with
`defclass', `defmethod', `defgeneric', `define-method-combination',
etc.).  The point being that CL, even without CLOS, was powerful
enough to be made OO without writing heinous parsers, modifying the
compiler, etc.  The difference between CL->CLOS and C->C++ is that C
didn't have `defmacro', and so creating syntactic abstraction in C is
pretty much impossible.  I hope that helps.
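
To make the syntactic-abstraction point concrete, here's a toy,
untested sketch of the kind of thing `defmacro' buys you -- a cut-down
`define-simple-class' (a made-up name) that expands into plain
pre-CLOS Lisp:

(defmacro define-simple-class (name &rest slots)
  ;; Expands into a constructor plus one reader per slot, all built on
  ;; a plain list -- no changes to the compiler needed.
  `(progn
     (defun ,(intern (format nil "MAKE-~A" name)) (&key ,@slots)
       (list ',name ,@(loop for s in slots append `(',s ,s))))
     ,@(loop for s in slots
             collect `(defun ,(intern (format nil "~A-~A" name s)) (object)
                        (getf (rest object) ',s)))
     ',name))

;; (define-simple-class point x y)
;; (point-x (make-point :x 1 :y 2))  =>  1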

dave
From: Marco Antoniotti
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <y6cg0j0wrpr.fsf@octagon.mrl.nyu.edu>
David Bakhash <·····@alum.mit.edu> writes:

> Originally, I think CLOS was implemented in Lisp itself, with an extensive
> macro layer.  That is to say that in CL, the object model could (as
> was, and still is in some implementations) just bootstrapped in a
> pretty way: a package (often clos `CLOS') was created, written in
> non-CLOS CL, and implemented CLOS in CL, and the CL package then adds
> the right symbols (including a very powerful macro layer including
> `defclass', `defmethod', `defgeneric', `define-method-combination',
> etc.).  The point being that CL, even without CLOS, was powerful
> enough to be made OO without writing heinus parsers, modifying the
> compiler, etc.  The difference between CL->CLOS and C->C++ is that C
> didn't have `defmacro', and so creating syntactic abstraction in C is
> pretty much impossible.  I hope that helps.

I'd say that this is true; however, I'd qualify it.  Building a CLOS
on top of a CLtL1 system was definitely possible, although it
required some digging into the implementation.  This is because generic
functions must look and feel like "regular" functions and because TYPEP
and friends must know about CLOS itself.

However, building a single-dispatch, multiple-inheritance OO language,
with a meta-object protocol, on top of CL is indeed "trivial".

Cheers

-- 
Marco Antoniotti =============================================================
NYU Bioinformatics Group			 tel. +1 - 212 - 998 3488
719 Broadway 12th Floor                          fax  +1 - 212 - 995 4122
New York, NY 10003, USA				 http://galt.mrl.nyu.edu/valis
             Like DNA, such a language [Lisp] does not go out of style.
			      Paul Graham, ANSI Common Lisp
From: David Bakhash
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m3ae95c9yl.fsf@cadet.dsl.speakeasy.net>
Marco Antoniotti <·······@cs.nyu.edu> writes:

> I'd say that this is true, however, I'd qualify it.  Building a CLOS
> on top of a CLtL1 system was definitively possible, although it
> required some digging in the implementation.  This is because
> generic functions must look and feel like "regular" function and
> because TYPEP and friends must know about CLOS itself.

Thanks for the clarification.  It's indeed true that some stuff had to
be changed, but it's amazing how much didn't.

Another thing this reminds me of is something I love about Common Lisp:
how you can redefine functions -- even in the CL package -- or, better
yet, shadow them and change their behavior, etc.  In this sense, GFs
and TYPEP could still have been handled within Lisp source code (not
that this is the best way, but still kinda neat).

dave
From: Colin Walters
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <87y9wo3xyz.church.of.emacs@meta.verbum.org>
David Bakhash <·····@alum.mit.edu> writes:

> Another thing that this reminds me of is that I love about Common
> Lisp is how you can redefine functions -- even in the CL package, or
> better yet shadow them and change their behavior, etc.

Hm.  I seem to remember reading somewhere that it wasn't permitted to
redefine built-in functions, so the compiler was then free to optimize
those calls as it wished (open coding, etc).
From: David Bakhash
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m31yugbxs9.fsf@cadet.dsl.speakeasy.net>
Colin Walters <·········@cis.ohio-state.edu> writes:

> David Bakhash <·····@alum.mit.edu> writes:
> 
> > Another thing that this reminds me of is that I love about Common
> > Lisp is how you can redefine functions -- even in the CL package, or
> > better yet shadow them and change their behavior, etc.
> 
> Hm.  I seem to remember reading somewhere that it wasn't permitted to
> redefine built-in functions, so the compiler was then free to optimize
> those calls as it wished (open coding, etc).

Yeah, but just about every implementation has a restart that allows
you to do it anyway (except CLISP, which catches such things when you
use DEFUN, but not when you use (setf fdefinition)).

dave
From: Joe Marshall
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <ofxi3ux5.fsf@content-integrity.com>
David Bakhash <·····@alum.mit.edu> writes:

> Colin Walters <·········@cis.ohio-state.edu> writes:
> 
> > David Bakhash <·····@alum.mit.edu> writes:
> > 
> > > Another thing that this reminds me of is that I love about Common
> > > Lisp is how you can redefine functions -- even in the CL package, or
> > > better yet shadow them and change their behavior, etc.
> > 
> > Hm.  I seem to remember reading somewhere that it wasn't permitted to
> > redefine built-in functions, so the compiler was then free to optimize
> > those calls as it wished (open coding, etc).
> 
> Yeah, but just about every implementation has a restart that allows
> you to do it anyway (except CLISP, which catches such things when you
> use DEFUN, but not when you use (setf fdefinition).

Or you can simply create your own package with the relevant symbols
shadowed.
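
i.e., something like this untested sketch (MY-APP and the
non-destructive SORT are just made-up examples):

(defpackage :my-app
  (:use :common-lisp)
  (:shadow #:sort))           ; our own SORT; CL:SORT stays untouched

(in-package :my-app)

(defun sort (sequence predicate &key key)
  ;; A non-destructive SORT, visible only inside MY-APP.
  (apply #'cl:sort (copy-seq sequence) predicate
         (when key (list :key key))))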


From: Joe Marshall
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3dfar8lh.fsf@content-integrity.com>
Frankey <···········@worldnet.att.net> writes:

> Physically impossible.  Nothing is as fast as C ....
> There's nothing to compare.

And as you approach C, your apparent mass will increase to such a
point that attaining the velocity of C will require infinite energy.

It is refreshing to know that in this day and age when things change
so rapidly that there are some among us that still rabidly cling to
the orthodox dogma.


From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A4A4F4F.91CD98D3@worldnet.att.net>
Joe Marshall wrote:
> 
> Frankey <···········@worldnet.att.net> writes:
> 
> > Physically impossible.  Nothing is as fast as C ....
> > There's nothing to compare.
> 
> And as you approach C, your apparent mass will increase to such a
> point that attaining the velocity of C will require infinite energy.
Speak English, dude <g>. Btw, I don't insist on C; any general-purpose compiled language
will do. C++, Pascal, whatever.
From: Joe Marshall
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <k88lk2hy.fsf@content-integrity.com>
Frankey <···········@worldnet.att.net> writes:

> Joe Marshall wrote:
> > 
> > Frankey <···········@worldnet.att.net> writes:
> > 
> > > Physically impossible.  Nothing is as fast as C ....
> > > There's nothing to compare.
> > 
> > And as you approach C, your apparent mass will increase to such a
> > point that attaining the velocity of C will require infinite energy.
> Speak English, dude <g>. Btw, I dont' insist on C, any general-purpose compiled language
> will do. C++, Pascal, whatever.

Sorry.  I'll try to use little words for you.

Go to   http://www.m-w.com/   (you do know how to use a web browser,
don't you?)

This web site will allow you to type in any word that you don't
understand, and it will show you what that word means (we call this a
`definition').


From: Marco Antoniotti
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <y6cae9hfsfe.fsf@octagon.mrl.nyu.edu>
Frankey <···········@worldnet.att.net> writes:

> Joe Marshall wrote:
> > 
> > Frankey <···········@worldnet.att.net> writes:
> > 
> > > Physically impossible.  Nothing is as fast as C ....
> > > There's nothing to compare.
> > 
> > And as you approach C, your apparent mass will increase to such a
> > point that attaining the velocity of C will require infinite energy.
> Speak English, dude <g>. Btw, I dont' insist on C, any
> general-purpose compiled language will do. C++, Pascal, whatever.

You forgot to include Common Lisp: the general purpose compiled
language. :)

Cheers

-- 
Marco Antoniotti =============================================================
NYU Bioinformatics Group			 tel. +1 - 212 - 998 3488
719 Broadway 12th Floor                          fax  +1 - 212 - 995 4122
New York, NY 10003, USA				 http://galt.mrl.nyu.edu/valis
             Like DNA, such a language [Lisp] does not go out of style.
			      Paul Graham, ANSI Common Lisp
From: David Bakhash
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m3y9wshyfl.fsf@cadet.dsl.speakeasy.net>
Frankey <···········@worldnet.att.net> writes:

> Speak English, dude <g>. Btw, I dont' insist on C, any
> general-purpose compiled language will do. C++, Pascal, whatever.

Huh?  CL _is_ a general-purpose compiled language, like C++, Pascal,
etc.  I no longer see your point.

dave
From: Paolo Amoroso
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <WH9EOu4XUfNPU0JJBIXjrrsbEV+h@4ax.com>
On Fri, 22 Dec 2000 17:16:22 +0100, Rolf Wester <······@ilt.fhg.de> wrote:

> for many of my problems (much numerical simulation). I read that Common
> Lisp is a much better
> language than C++ and that it can be almost as fast as C. I wrote a

This paper may be interesting:

  Fast Floating Point Processing in Common Lisp
  http://www.cs.berkeley.edu/~fateman/papers/lispfloat.ps


> I also read that it is difficult to write fast Common Lisp code. Is
> there any means
> to speed up the CL code?

The book "Paradigms of Artificial Intelligence Programming - Case Studies
in Common Lisp" by Peter Norvig includes a lot of useful material on
performance optimization of Lisp programs.


Paolo
-- 
EncyCMUCLopedia * Extensive collection of CMU Common Lisp documentation
http://cvs2.cons.org:8000/cmucl/doc/EncyCMUCLopedia/
From: glauber
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <9229hu$3tl$1@nnrp1.deja.com>
In article <·················@ilt.fhg.de>,
  Rolf Wester <······@ilt.fhg.de> wrote:
[...]
>
> Python is only slightly slower than compiled Common Lisp using ACL6.0
> (CLISP is 60% slower).
> I also read that it is difficult to write fast Common Lisp code. Is
> there any means
> to speed up the CL code?


Just my $.02: did you compile the Lisp code? This makes a huge
difference. Python and Perl compile by default, but Lisp compiles only
if you tell it to. Then, there are compile optimization settings that
also can make a big difference, plus the tricks that other people
already mentioned about declaring the data types.
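
i.e., something along these lines (a trivial, untested sketch):

(declaim (optimize (speed 3) (safety 1)))   ; global optimization settings

(defun square (x) (* x x))

(compile 'square)       ; make sure it really is compiled, not interpreted
(disassemble 'square)   ; look at what the compiler actually produced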

CLISP is not a speed demon, but when compiled it usually doesn't come
too much behind Python.

If you are using Windows, try also Corman Lisp.

glauber


--
Glauber Ribeiro
··········@my-deja.com    http://www.myvehiclehistoryreport.com
"Opinions stated are my own and not representative of Experian"


From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A44E53D.1B8D8C1@worldnet.att.net>
glauber wrote:
>   Rolf Wester <······@ilt.fhg.de> wrote:
> [...]
> >
> > Python is only slightly slower than compiled Common Lisp using ACL6.0
> > (CLISP is 60% slower).
> > I also read that it is difficult to write fast Common Lisp code. Is
> > there any means
> > to speed up the CL code?
> 
> Just my $.02: did you compile the Lisp code? This makes a huge
> difference. Python and Perl compile by default, but Lisp compiles only
> if you tell it to. 
Right. Though one has to keep in mind that the term "compile" can only be applied here
figuratively. None of these languages compile in the same sense as what real-compiled
languages like for example C, C++, Pascal, Ada, do. The same question frequently comes up
in Java-related discussions: you can compile your rt environment into the program, but the
structure of the language remains intact. Is such "compilation" an improvement?
Definitely. Still no match to C or C++ or Ada though.

> Then, there are compile optimization settings that
> also can make a big difference, plus the tricks that other people
> already mentioned about declaring the data types.
> 
> CLISP is not a speed demon, but when compiled it usually doesn't come
> too much behind Python.
That is a very reasonable, realistic statement.
From: Kaelin Colclasure
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <wu3dfeuakd.fsf@soyuz.arslogica.com>
Frankey <···········@worldnet.att.net> writes:

> Right. Though one has to keep in mind that the term "compile" can only be applied here
> figuratively. None of these languages compile in the same sense as what real-compiled
> languages like for example C, C++, Pascal, Ada, do. The same question frequently comes up
> in Java-related discussions: you can compile your rt environment into the program, but the
> structure of the language remains intact. Is such "compilation" an improvement?
> Definitely. Still no match to C or C++ or Ada though.

You are misinformed. The majority of Common Lisp implementations
compile to native code, just like C, C++, Pascal, and Ada -- and this
has been the case for decades. Of the CL implementations mentioned in
this thread, only CLISP does not compile to native code.

You can see the native code generated by the compiler by using the
Common Lisp function #'disassemble.

-- Kaelin
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A469B0A.FACC319D@worldnet.att.net>
Kaelin Colclasure wrote:
> 
> Frankey <···········@worldnet.att.net> writes:
> 
> > Right. Though one has to keep in mind that the term "compile" can only be applied here
> > figuratively. None of these languages compile in the same sense as what real-compiled
> > languages like for example C, C++, Pascal, Ada, do. The same question frequently comes up
> > in Java-related discussions: you can compile your rt environment into the program, but the
> > structure of the language remains intact. Is such "compilation" an improvement?
> > Definitely. Still no match to C or C++ or Ada though.
> 
> You are misinformed. The majority of Common Lisp implementations
> compile to native code, just like C, C++, Pascal, and Ada -- and this
> has been the case for decades. 
I didn't mean they don't compile to native code. That is not too much to say that they
compile into native code though--what kind of code is it, that is the question. Take Java:
everything there is dynamically allocated. Go ahead and compile it, it'll still be all
dynamically allocated. The structure of the language does not get changed. Now, dynamic
allocations are expensive. Garbage collector still remains--along with the cost of this
kind of memory management. Stuff like that. 

> Of the CL implementations mentioned in
> this thread, only CLISP does not compile to native code.
All right, but I didn't mean native vs. interpreted. It's rather about different kinds of
native code.

> You can see the native code generated by the compiler by using the
> Common Lisp function #'disassemble.
From: Paul Foley
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m2puig6o55.fsf@mycroft.actrix.gen.nz>
On Mon, 25 Dec 2000 00:48:53 GMT, Frankey  wrote:

> I didn't mean they don't compile to native code. That is not too
> much to say that they compile into native code though--what kind of
> code is it, that is the question. Take Java: everything there is
> dynamically allocated. Go ahead and compile it, it'll still be all
> dynamically allocated. The structure of the language does not get
> changed. Now, dynamic allocations are expensive. Garbage collector
> still remains--along with the cost of this kind of memory
> management. Stuff like that.

Again, you're making claims about things you apparently know nothing
about.  What makes you think dynamic allocations are expensive?
Certainly they are in C, because the memory management machinery has
to do a fair bit of work to avoid fragmentation and try to reallocate
the same memory over and over (after you free it, of course), without
being able to move things that are still in use.  Lisp doesn't have
this problem.  Modern GC implementations are often _faster_ than
manual memory management with malloc/free.  The "cost" you speak of
may well be negative!  Stuff like that?
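
If you'd rather not take anybody's word for it, a quick (and
admittedly unscientific) way to see what allocation costs in your own
implementation is something like this untested sketch:

(defun cons-a-lot (n)
  ;; Allocate N small fresh lists; return the last one so the work
  ;; cannot be optimized away entirely.
  (let (last)
    (dotimes (i n last)
      (setf last (list i i i)))))

(time (cons-a-lot 1000000))
;; Most implementations report bytes consed and time spent in GC
;; alongside the run time.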

Your beliefs about Lisp being slow are simply unfounded.  Please stop
posting out of ignorance; you're only serving to prolong the life of
these untruths.

-- 
You don't have to agree with me; you can be wrong if you want.

(setq reply-to
  (concatenate 'string "Paul Foley " "<mycroft" '(··@) "actrix.gen.nz>"))
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A492BDE.3338D32C@worldnet.att.net>
Paul Foley wrote:
> On Mon, 25 Dec 2000 00:48:53 GMT, Frankey  wrote:
> > I didn't mean they don't compile to native code. That is not too
> > much to say that they compile into native code though--what kind of
> > code is it, that is the question. Take Java: everything there is
> > dynamically allocated. Go ahead and compile it, it'll still be all
> > dynamically allocated. The structure of the language does not get
> > changed. Now, dynamic allocations are expensive. Garbage collector
> > still remains--along with the cost of this kind of memory
> > management. Stuff like that.
> 
> Again, you're making claims about things you apparently know nothing
> about.  What makes you think dynamic allocations are expensive?
> Certainly they are in C, because the memory management machinery has
> to do a fair bit of work to avoid fragmentation and try to reallocate
> the same memory over and over (after you free it, of course), without
> being able to move things that are still in use.  Lisp doesn't have
> this problem.  Modern GC implementations are often _faster_ than
> manual memory management with malloc/free.  
...and pigs can fly. Well, that's a bunch of idiotic statements. Sorry, have no time for
nonsense. Bye.
From: Joe Marshall
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <y9x2ptjg.fsf@content-integrity.com>
Frankey <···········@worldnet.att.net> writes:

> Paul Foley wrote:
> > Modern GC implementations are often _faster_ than manual memory
> > management with malloc/free.

> ...and pigs can fly. Well, that's a bunch of idiotic
> statements. Sorry, have no time for nonsense. Bye.

Well, damn!  Where have you been all these years?  We obviously need
new referees for all those computer science journals that have been
posting papers addressing this very issue!

I am very disappointed that you `have no time for this'.  Can you not
find the few seconds it takes to post such enlightening,
well-reasoned, and well-researched facts?  What selfishness.




From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A4A4E7A.4E457701@worldnet.att.net>
Joe Marshall wrote:
> 
> Frankey <···········@worldnet.att.net> writes:
> 
> > Paul Foley wrote:
> > > Modern GC implementations are often _faster_ than manual memory
> > > management with malloc/free.
> 
> > ...and pigs can fly. Well, that's a bunch of idiotic
> > statements. Sorry, have no time for nonsense. Bye.
> 
> Well, damn!  Where have you been all these years?  We obviously need
> new referees for all those computer science journals that have been
> posting papers addressing this very issue!
> 
> I am very disappointed that you `have no time for this'.  Can you not
> find the few seconds it takes to post such enlightening,
> well-reasoned, and well-researched facts?  What selfishness.
Instead I will simply refer you to the final posting by the originator of this thread. His
findings seem to confirm what I say. Sci-fi is very good, and anything *may* happen. In
the real life though, and among the tools that are available to most people, well, C++
will perform better than Java (or Lisp.) I'm only talking performance, not programming
effort. Some things that can be said in Prolog very shortly and concisely, will require a
lot of low-level programming in C++, that's why people would use things like Prolog--when
it fits the task, because it saves a lot of effort. But everything you can do in Prolog
*can* be done in C++ (while the opposite isn't true.)


> 
From: Jochen Schmidt
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92evlm$6q9sk$1@ID-22205.news.dfncis.de>
Frankey wrote:

> Joe Marshall wrote:
>> 
>> Frankey <···········@worldnet.att.net> writes:
>> 
>> > Paul Foley wrote:
>> > > Modern GC implementations are often _faster_ than manual memory
>> > > management with malloc/free.
>> 
>> > ...and pigs can fly. Well, that's a bunch of idiotic
>> > statements. Sorry, have no time for nonsense. Bye.
>> 
>> Well, damn!  Where have you been all these years?  We obviously need
>> new referees for all those computer science journals that have been
>> posting papers addressing this very issue!
>> 
>> I am very disappointed that you `have no time for this'.  Can you not
>> find the few seconds it takes to post such enlightening,
>> well-reasoned, and well-researched facts?  What selfishness.
> Instead I will simply refer you to the final posting by the originator of
> this thread. His findings seem to confirm what I say. Sci-fi is very good,
> and anything *may* happen. In the real life though, and among the tools
> that are available to most people, well, C++ will perform better than Java
> (or Lisp.) I'm only talking performance, not programming effort. Some
> things that can be said in Prolog very shortly and concisely, will require
> a lot of low-level programming in C++, that's why people would use things
> like Prolog--when it fits the task, because it saves a lot of effort. But
> everything you can do in Prolog *can* be done in C++ (while the opposite
> isn't true.)

AAAAAAAAHHRG,

Please... Please.... think of Turing ... think of Church and read them both 
PLEASE!!!

And you should distinguish between languages and implementations.

Regards,
Jochen
From: Tim Bradshaw
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <ey3vgs4vi37.fsf@cley.com>
* Frankey  wrote:
> But everything you can do in Prolog *can* be done in C++ (while the
> opposite isn't true.)

There should be a variant of Godwin's law for threads that result in a
statement like this.

--tim
From: Daniel Barlow
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <87y9x0mxia.fsf@protege.telent.net>
Tim Bradshaw <···@cley.com> writes:

> * Frankey  wrote:
> > But everything you can do in Prolog *can* be done in C++ (while the
> > opposite isn't true.)
> 
> There should be a variant of Godwin's law for threads that result in a
> statement like this.

"As a comp.lang.lisp discussion grows longer, the probability of a
comparison involving C++ approaches one." There is a tradition in
that, once this occurs, that thread is over, and whoever next posts to
it has to refer to Greenspun's Law"


-dan

-- 

  http://ww.telent.net/cliki/ - Link farm for free CL-on-Unix resources 
From: Rob Warnock
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92h7n6$i4692$1@fido.engr.sgi.com>
Daniel Barlow  <···@telent.net> wrote:
+---------------
| "As a comp.lang.lisp discussion grows longer, the probability of a
| comparison involving C++ approaches one." There is a tradition in
| that, once this occurs, that thread is over, and whoever next posts to
| it has to refer to Greenspun's Law"
+---------------

Actually, Greenspun's Tenth Rule of Programming
<URL:http://philip.greenspun.com/research/>

Though I've never ever heard anyone mention One through Nine...


-Rob

-----
Rob Warnock, 31-2-510		····@sgi.com
SGI Network Engineering		http://reality.sgi.com/rpw3/
1600 Amphitheatre Pkwy.		Phone: 650-933-1673
Mountain View, CA  94043	PP-ASEL-IA
From: David Simmons
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <cOf26.43268$A06.1297511@news1.frmt1.sfba.home.com>
"Frankey" <···········@worldnet.att.net> wrote in message
······················@worldnet.att.net...
> Paul Foley wrote:
> > On Mon, 25 Dec 2000 00:48:53 GMT, Frankey  wrote:
> > > I didn't mean they don't compile to native code. That is not too
> > > much to say that they compile into native code though--what kind of
> > > code is it, that is the question. Take Java: everything there is
> > > dynamically allocated. Go ahead and compile it, it'll still be all
> > > dynamically allocated. The structure of the language does not get
> > > changed. Now, dynamic allocations are expensive. Garbage collector
> > > still remains--along with the cost of this kind of memory
> > > management. Stuff like that.

Frankey ,

I'm sure you're aware that there are compilers that will statically compile
Java directly into native code. Those compilers agressively eliminate
dynamic allocations, track reference scopes as part of their analysis, and
make efforts at global optimization of the resulting executable.

However, even when executing native code jitted from portable opcodes,
dynamic allocations can be performed on the stack as long as the execution
machinery can be assured the allocated data structure will not be referenced
after the stack frame is unwound, which means that for all such
data-structure usages they perform identically to auto-var allocations in C.
This kind of allocation is typical within dynamic languages for intermediate
values and local variables within an execution scope/context.

> >
> > Again, you're making claims about things you apparently know nothing
> > about.  What makes you think dynamic allocations are expensive?
> > Certainly they are in C, because the memory management machinery has
> > to do a fair bit of work to avoid fragmentation and try to reallocate
> > the same memory over and over (after you free it, of course), without
> > being able to move things that are still in use.  Lisp doesn't have
> > this problem.  Modern GC implementations are often _faster_ than
> > manual memory management with malloc/free.
> ...and pigs can fly. Well, that's a bunch of idiotic statements. Sorry,
> have no time for nonsense. Bye.

After seeing this back-and-forth debate, I felt the need to jump in and add
my 2 cents. I've been designing and developing virtual machines and garbage
collection technologies for over 10 years (as my profession). I'm both a
very active user of C++ and experienced in the implementation of various
languages, especially state-of-the-art dynamic languages. I've been
developing Smalltalk systems for nearly ten years and have been a
professional in this industry for some 23 years.

I also happen, as an aside, to have knowledge of a number of virtual machine
architectures. Since it was mentioned elsewhere in this newsgroup thread,
this includes Microsoft's .NET platform and its characteristics, because I
did the work (in collaboration with Microsoft) for implementing Smalltalk
and dynamic dispatching (incl. multi-method) to support dynamically typed
languages (which includes scripting languages but does not include Java --
Java is not a dynamic language).

I think I've read the bulk of the posts in this discussion. My immediate
reactions were that people understood some of the aspects and issues
involved but details were fuzzy. Detailed knowledge/understanding is often
required to make good comparisons and accurately identify issues.

-----------------
MEMORY ALLOCATION
-----------------
First, good garbage collectors in dynamic language systems have been able to
outperform the typical malloc/alloc systems for quite some time now, where
such comparisons were made using typical C++ allocation or other fairly
random malloc/alloc system allocation patterns versus similar patterns
applied against a good garbage collector. This is based both on my knowledge
of the performance of languages and systems that utilize malloc/alloc and on
my direct experience in using (Mac, Win32, Linux) C++ and/or direct
alloc/malloc versus gc technology I have designed into our systems.

-----------
Most alloc/malloc systems usually have to support asynchronous allocation
(pre-emptive multi-threading/signal handler) scenarios. The approach of
using mutex locks on various call layers is, in almost every case, expensive
and inefficient relative to a modern GC designed for pre-emptive threading
situations.

Ignoring the multi-threading case, alloc and malloc can be designed to be
quite efficient. However, the costs for fragmentation and trying to manage
best-fit allocation policies in malloc/alloc systems are typically poor
compared to modern (sophisticated) gc systems. Performance of a malloc/alloc
system depends dramatically on the allocation patterns; whereas gc's tend to
be minimally impacted by such issues and can in fact be tuned for some well
understood/recognized cases. OTOH, it is possible to build very fast
malloc/alloc implementations and there are some companies who offer well
architected commercial solutions.

Static languages such as C++ can be significantly impacted by the cost of
allocating objects via the new operator (typically implemented on top of a
malloc/alloc system). That's why a good C++ application design will try to
allocate as many of its objects as it can on the stack as auto-vars. It's
also why a good alloc system or effective C++ new operator overloads will
utilize custom pools whose creation/destruction attempts to be synchronized
with code flow allocation patterns to enable minimizing the impact of
fragmentation -- use of "alloca", for example, can yield big benefits.

My experience shows that typical C++ allocators, which are built on
malloc/alloc systems, are really slow compared to a typical good GC
system. In a good pre-emptive multi-threaded gc architecture the full
lifecycle cost to allocate, initialize, track, finalize, release a random
sized [data structure] object type is about 80-100 cpu clock cycles. For
various (types) cases this can be optimized to be significantly faster. This
means that on, say a 500MHz Pentium III laptop, a gc system would be able to
continuously sustain rates of up to 5 million objects per second without
pauses.
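
If you want a rough feel for the analogous cost on your own Lisp, a sketch
like the following (purely illustrative; TIME's output and the actual
per-object cost vary by implementation, and the result has nothing to do
with the 80-100 cycle figure above, which is for my own collector) will do:

(defun alloc-rate (n)
  ;; allocate N small, immediately-dead heap objects
  (declare (optimize speed) (fixnum n))
  (let ((sink nil))
    (dotimes (i n sink)
      (declare (fixnum i))
      (setf sink (make-array 8)))))

;; (time (alloc-rate 5000000))
;; objects/second is then roughly 5000000 divided by the reported run time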

Those numbers are based on actual use in dynamic language systems I
design/work-on. That same dynamic language system on an Athlon 650MHz can
perform multi-precision arithmetic calculations such as 1000 factorial in
24ms, or 10000 factorial in 1.25s, or 100000 fibonacci in just over 5
seconds. BigNum/multiprecision arithmetic is, among other things, useful for
cryptography. Its performance depends on the combination of gc, execution
architecture/jit facilities. The complexity and efficiency with which it can
be written/expressed depends on the language system's capabilities.
-----------

--------------------------
METHOD/FUNCTION INVOCATION
--------------------------
Dynamic language dispatching techniques for dynamic (multi-method) binding
and dispatching (of methods/functions) that only require checking one or
two arguments equal or outperform C++ (or Java) vtable dispatching. They
cannot outperform statically bound calls, but even there the overhead
difference in performance is only a couple of cycles.

If you're wondering how this is possible, you should be aware that the
techniques for this have been around since the late 80's and were pioneered
on Smalltalk systems. There are quite a few papers on the technique amongst
OOPSLA proceedings. It is also the principle behind the Animorphic Smalltalk
hotspot technology that Sun purchased and incorporated into Java; although
it doesn't apply to Java quite as well as it does to dynamic languages.

The basic technique utilizes dynamic code generation (jitting) to regenerate
code adaptively at callsites. Most callsites tend to be mono-morphic, or
exhibit low order polymorphism. That means the types of the arguments at a
given callsite don't actually tend to vary much in comparison to the
frequency with which they are invoked. Performance in all such systems is
impacted significantly by both cpu cache architectures and size, and virtual
machine caching architecture. Newer cpu's are capable of performing better
with jitted code than static code if the jitter is aware of the specific cpu
characteristics (the same holds true for gc designs).
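
To make the technique concrete, here is a toy sketch in portable Common Lisp
of the idea behind a monomorphic inline cache (real systems do this in
regenerated machine code at the callsite, as in the OOPSLA papers; DISPATCH
and LOOKUP-HANDLER below are hypothetical stand-ins for the full lookup):

(defun make-callsite (dispatch)
  ;; DISPATCH maps a class to a handler function.  The closure caches
  ;; the class of the last receiver and its handler, so while the
  ;; callsite stays monomorphic repeated calls pay only an EQ test.
  (let ((cached-class nil)
        (cached-handler nil))
    (lambda (receiver &rest args)
      (let ((class (class-of receiver)))
        (unless (eq class cached-class)            ; cache miss: full lookup
          (setf cached-handler (funcall dispatch class)
                cached-class class))
        (apply cached-handler receiver args)))))

;; usage sketch:
;; (defparameter *site* (make-callsite (lambda (class) (lookup-handler class))))
;; (funcall *site* some-object 1 2)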

For the case of more than 1 or 2 args needing to be dynamically checked,
superscalar cpu architectures are rapidly diminishing the few cycle
per/dynamic-arg-type-check cost as the operations (and typecases) become
more and more parallelizable -- again this depends on the quality of the jit
architecture and the skill and knowledge of the designers.

C++ can significantly outperform C because C++ has the capability of
inlining functions and thereby restructuring algorithms and managing
registers more efficiently across the (synthesized/inlined) function calls.
Recently I've seen newer compilers that work with their linkers to perform
GLOBAL optimization, but that is still very new for the general C and C++
toolspace.

Bytecode/opcode based intermediate languages have been used and available
for almost as long as there have been computers. Early usage was designed primarily to
enable interpreting because the intermediate language offered very good
memory space compression compared to native code, offered cpu
independence/abstraction, and had additional virtues for demand
load/link/overlay paging.

One of the big changes I've observed in the last five years or so that has
impacted VM design is that there is significantly more memory available in a
typical computer. So, in recent years, the design of opcode sets and
intermediate languages for virtual machines has been able to shift away from
that of needing to design opcodes to support interpreting, and rather
designing (language neutral) opcodes strictly as an intermediate language
for optimization and jitting. This approach is similar in nature to the
design of IL for the GNU compiler family. I should add that, in my opinion,
the Java opcode set is not well designed for either jitting or interpreting.

A modern jitting system can actually optimize code, inline references, and
dynamically update/regenerate code in ways that a static compiler and linker
cannot. Such systems can optimize based on runtime heuristic information and
can play complex games with stack layouts and register usage in calls
between code elements it has generated. The result is that it is possible
for SOME jitted code (based on an intermediate/opcode language) to
outperform equivalent code in static C or even hand coded static assembly
[which I've written enormous amounts of; and somewhat frequently still do].

The ability to allocate objects or other data structures on the synchronous
thread stack versus asynchronous gc'ed heaps is affected in part by the
requirements of the language that will be supported. In the virtual machines
I design and work on, per/thread stacks are used for effectively
instantaneous allocation/deallocation as well as gc managed memory spaces.

By far the biggest performance factor is the ability to ascertain/know type
information at jit/rejit (or static compile) time. With that information at
hand, inlining, memory allocation, and register scheduling can be tuned to
match the characteristics of the processor against the given data and
execution patterns in the code being compiled. Without it, dynamic
dispatching must be applied and that typically costs 2X more than being able
to inline it. The two largest areas impacted by this kind of issue are
numerics and string/pointer operations.

Lisp and Scheme (and some Smalltalk) systems have the ability to declare
and/or inference type information. Many of them use this information to best
advantage in TRYING to generate code that can perform as well as or even in
some cases outperform the best C code (for the reasons mentioned before).
The performance result has a lot to do with both the design of the given
algorithm being executed and the ability of the given language to analyze
and pre-optimize it for direct or jitted native code generation.
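
As a concrete (and hedged) illustration of the point: with declarations a CL
compiler can open-code the float arithmetic below and avoid boxing; without
them it must fall back to generic arithmetic. Whether boxing is actually
eliminated depends on the implementation -- DISASSEMBLE will tell you.

(defun sum-floats (v)
  (declare (optimize (speed 3) (safety 0))
           (type (simple-array single-float (*)) v))
  (let ((s 0.0))
    (declare (single-float s))
    (dotimes (i (length v) s)
      (declare (fixnum i))
      (incf s (aref v i)))))

;; (disassemble 'sum-floats)   ; check whether + was inlined on raw floats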

------------------------------------
IMPACT OF CODE AND LANGUAGE FEATURES
ON PERFORMANCE
------------------------------------
The more capabilities the language and toolset have for identifying
hot-performance-sensitive-spots and providing type and optimization control,
combined with developer skill and understanding, the better the performance
can be refined/tuned.

I'm not an expert on the state of the art for Lisp tool and execution
systems. But it is my understanding that they are capable of utilizing type
information as effectively as a C compiler for native (machine) code
generation. If type information is available, it is certainly possible for
them to optimize competitively with "C", whether or not current systems
support it.

Keep in mind that some language features have non-obvious performance
barriers. For example, in terms of numerics specifically, optimized code
that utilizes multi-precision arithmetic currently (due to overflow checks)
needs to be more expensive than optimized code that exclusively uses
floating point or fixed-size integer representations (and doesn't do any
float signal handling or overflow/underflow checks).

If the scope of the values generated cannot be accurately determined, or it
is known that they are captured, then (gc tracked) memory allocation may be
performed at various points to hold results. There are similar issues for
range checking and protecting of pointer style operations. Some languages
and/or their compilers (incl lisp) not only offer the option to bypass these
but aggressively support such optimizations.
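
For example, in CL the bypassing looks like this (a sketch only; SAFETY 0
removes the bounds checking, and THE FIXNUM is the programmer's promise that
no overflow check is needed, so the compiler can use raw machine arithmetic):

(defun dot-u8 (a b n)
  (declare (optimize (speed 3) (safety 0))
           (type (simple-array (unsigned-byte 8) (*)) a b)
           (fixnum n))
  (let ((acc 0))
    (declare (fixnum acc))
    (dotimes (i n acc)
      (declare (fixnum i))
      (incf acc (the fixnum (* (aref a i) (aref b i)))))))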

As I mention in subsequent paragraphs, the performance issues being debated
in this thread may be driven more by the fact that Lisp developers will
naturally make use of Lisp features which have no parallel in C.

The use of such features MAY NOT optimize particularly well compared to a
different, potentially less expressive, pattern that could be modeled in C,
especially if those features result in the need for transitory objects to be
created. This will have even more impact if those transitory objects may be
captured such that they could potentially persist after their
creation/instantiation point's execution context has unwound -- because
then they cannot be allocated on a stack, but must be allocated from some
heap space.

If the FFI architecture and internal object model within the execution
environment have been designed to efficiently support cross language calling
with minimal marshalling costs, then hybridizing code becomes a very
competitive option.

I.e., the ability to cross call between languages without significant
penalty enables writing some portions in one language, and other portions in
another. This may be done because of different language characteristics, or
because there are already existing code bases to be integrated, etc.
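
A minimal cross-call sketch, using the portable CFFI layer purely for
illustration (CFFI is an assumption here, not the FFI being described; any
Lisp's native FFI looks much the same). The point is only that when
marshalling is this cheap, mixing languages is a practical option:

;; assumes the CFFI library is loaded and libm's cos is visible
(cffi:defcfun ("cos" c-cos) :double
  (x :double))

;; (c-cos 0.5d0)  =>  0.8775825618903728d0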

-------
SIDEBAR
-------
Many advanced languages (in a CS sense) have rich (high-level) features that
"appear" to a developer to be very execution efficient (they are very often
efficient and concise in terms of expressiveness). Many dynamic languages
(which Java IS NOT), and Lisp especially, have many such capabilities.
However, without an understanding of what such features demand of an
implementation, and without a basic understanding of the capabilities of
one's specific compiler/code-generation system and execution architecture --
their use can result in code that does not perform competitively with
C/Fortran.

On the other hand, the very nature of such C/Fortran languages also means
that when the algorithm or system is of any real complexity the reverse
starts to become true: C/Fortran have a relatively low level
of generality for expressiveness and so their utility in a complex
architecture is more bounded by the human author's skill/capacity for
expressing, tracking and understanding the design and system issues. I.e.,
the more complex, the more likelihood of bugs/design-flaws, the more need
for tedious/methodical design, analysis, and high levels of skill. This is
typically why languages with high level features (and corresponding
IDE/tools) that encourage/reinforce or support design generally bear out as
higher productivity languages for use in building systems where the
developers are of average skill.
-------
-------

-- Dave Simmons [www.qks.com / www.smallscript.com]
  "Effectively solving a problem begins with how you express it."
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A4A4EF4.AC67D0BF@worldnet.att.net>
Looks very interesting. Can't read it now, but I've saved it and will read it sometime
later. 
Thanks for posting.

David Simmons wrote:
>...
From: David Simmons
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <KrF26.49736$A06.1469148@news1.frmt1.sfba.home.com>
[Snip]
> Those numbers are based on actual use in dynamic language systems I
> design/work-on. That same dynamic language system on an Athlon 650MHz can
> perform multi-precision arithmetic calculations such as 1000 factorial in
> 24ms, or 10000 factorial in 1.25s, or 100000 fibonacci in just over 5
> seconds. BigNum/multiprecision arithmetic is, among other things, useful for
> cryptography. Its performance depends on the combination of gc, execution
> architecture/jit facilities. The complexity and efficiency with which it can
> be written/expressed depends on the language system's capabilities.
> -----------

Could someone do me a favor and run the same numbers on your favorite lisp
or scheme system. I'm quite interested in gathering information on the
performance characteristics of lisp systems and smalltalk systems. Both
languages utilize a fairly different execution model/architecture than one
sees in, say, Java systems. Both languages have also played key roles in
advancing dynamic language architectures -- an area I have a strong interest
in :-).
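
For anyone willing to oblige, something like the following at a CL REPL is
all it takes (TIME's report format varies by implementation; wrapping the
call in PROGN avoids printing the enormous result):

(defun fact (n)
  ;; iterative bignum factorial
  (let ((r 1))
    (dotimes (i n r)
      (setf r (* r (1+ i))))))

(defun fib (n)
  ;; iterative bignum fibonacci
  (let ((a 0) (b 1))
    (dotimes (i n a)
      (psetf a b
             b (+ a b)))))

;; (time (progn (fact 1000) t))
;; (time (progn (fact 10000) t))
;; (time (progn (fib 100000) t))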

Here is an update on those numbers...

This is taken from discussions going on in the Python, Java, Ruby, and
Smalltalk lists...

"Blair McGlashan" <·····@object-arts.com> wrote in message
···················@ID-50941.news.dfncis.de...
> Dave
>
> You wrote in message
> ···························@news1.frmt1.sfba.home.com...
> > ...
> > I don't know what machine or Smalltalk dialect that was run on.
> >
> > But when I run it on 650MHz Athlon in QKS Smalltalk on the v4 AOS
> Platform.
> >
> > "1000 factorial" takes 24ms (.024s).
>
> On my rather slower laptop machine (a 333 Celeron) Dolphin 4.0 takes
> just 13ms to execute 1000 factorial. This is one of the areas where Dolphin
> is faster than most (if not all) other Smalltalks. I've always thought it
> could be made a fair bit faster too with a bit of optimization.
>
> > To try something that more fully tests multi-precision and GC performance
> > try "10000 factorial", which on the same system takes 1257 ms (1.25s).
>
> D4 takes 1371mS to do that on the same 333; still faster (bearing in mind
> relative machine speed), but the other factors you mention (i.e. other than
> LI arithmetic) are starting to come into play, evening out the difference.
>
> I'm glad there is at least one area of VM design and performance where we at
> least appear to be ahead of AOS Dave :-).


Blair, thank you for pointing this out. I really hadn't thought about how
fast it ought to be.

I've been so busy on other areas I hadn't really gone in and re-verified the
heuristic feedback system in the garbage collection. And, there have been
quite a few changes since the v4 collector was last tuned a year or so
ago...

When I read your post, the first thing I did was run it on our v3 system
(which is slower than v4) to give me a ballpark reference point for
comparison. The only substantive difference between v3 and v4 for the
factorial test is in the garbage collector; the multi-precision code is using
effectively the same algorithms.

The v3 AOS Platform yielded a result of 8ms for "1000 factorial" on the same
hardware versus v4 at 24ms. So I knew something was up. After a few hours
of mucking about exploring test cases and monitoring the GC patterns I
discovered two things.

a) The multi-precision numerics libraries in v4 were still calling templated
c++ object constructors that I created during the bootstrapping process --
principally to allow me to create objects at any time independently of the
gc design. These were not only slow but had a number of checks in them
for sanity issues.

b) The gc heuristic feedback system was significantly perturbed. That system
decides when and how to execute various portions of the collector, and
correspondingly, how aggressively to work at keeping memory defragmented
(which results from pinning etc). In other words, I had added some
additional cases for special objects but I had not got around to retuning
the algorithms to account for the impact of those changes.

As I've mentioned in other posts and during presentations, the factorial and
fibonacci cases are really good tests of the GC system. I just had not
realized how significant they are...

I've been busy preparing the system for alpha testing and I had not planned
on performing tuning at this stage. Now that you've prodded me into it,
the system is significantly faster across the board. There is clearly more
refinement I can perform in the heuristics system, but here is what a few
hours of heuristics tuning achieved:

AOS v4 Platform, 650MHz Athlon
------------------------------
a) The GC system now requires/consumes less memory during
various baseline tests.

b) "1000 factorial" takes 4-6ms depending on whether you use the
cpu's cycle timer, the OS <QueryPerformanceCounter>. The
<GetThreadTimes> call revealed that it had a accuracy/
granularity of some ~10ms so it was not valid here.

c) "10000 factorial" takes
641ms, 633ms, 630ms (cycle timer, perf-timer, thread-times)

d) "100000 fibonacci" takes
337ms, 337ms, 340ms (cycle timer, perf-timer, thread-times)

e) "1000000 fibonacci" takes.
3438ms, 3423ms, 3430ms (cycle timer, perf-timer, thread-times)

>
> Regards
>
> Blair
>
> (Both figures are elapsed times measured using the microsecond
> clock/profiling counter)

The updated execution times for "1000 factorial" are about the same as
you're getting on a 333MHz Celeron, allowing for the cycle ratio of the
650MHz Athlon vs the 333MHz Celeron and assuming the Athlon caches are
helping more than the Celeron's.

The performance of "1000 factorial" has improved by a factor of 4-6X.
The performance of "10000 factorial" has improved by a factor of 2X.
The performance of "100000 fibonacci" has improved by a factor of 15X.

Based on my v3 AOS Platform numbers, this is the kind of performance I
would expect. I'm reasonably certain that it could be improved by up to
20% or so with more extensive gc profiling and tuning analysis.

And I think it is probably a fair guess that this is somewhat representative
of the performance of gc-sensitive code on such systems. In other words, I'm
betting the actual numerics calculations are about as fast as they can be;
the real variation comes from the GC behavior. It adds confidence that
SmallScript should execute competitively well in its scripting language role.

Again, it's hard to compare because we don't know what the original processor
was for those Ruby and Python numbers -- but these are good approximations
assuming a 200MHz cpu for the Ruby and Python tests.

Python (1000 factorial)
------
> It's not very fast though:
>
> real 0m10.433s
> user 0m10.104s
> sys 0m0.033s

Ruby (1000 factorial)
----
> time ruby fact.rb 1000
>
> real 0m0.129s
> user 0m0.086s
> sys 0m0.034s


-- Dave Simmons [www.qks.com / www.smallscript.com]
"Effectively solving a problem begins with how you express it."
From: Reini Urban
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3a49dc91.1137317876@judy>
Frankey wrote:
>Now, dynamic allocations are expensive. Garbage collector still remains--along 
>with the cost of this kind of memory management. Stuff like that. 

please take a look into malloc.c/realloc.c/free.c to get a feeling for
the costs for "non-dynamic" memory management, which is in fact as
dynamic as GC. We call GC "automatic" in contrast to "manual" MM, like
malloc/free and friends.
read the memory docs at xanalys:

myth 8 at
  http://www.xanalys.com/software_tools/products/myths_and_legends.html
the faq:
  http://www.xanalys.com/software_tools/mm/faq.html
-- 
Reini Urban
http://xarch.tu-graz.ac.at/autocad/news/faq/autolisp.html
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A4A503F.4343EFBE@worldnet.att.net>
Reini Urban wrote:
> 
> Frankey wrote:
> >Now, dynamic allocations are expensive. Garbage collector still remains--along
> >with the cost of this kind of memory management. Stuff like that.
> 
> please take a look into malloc.c/realloc.c/free.c to get a feeling for
Will do, though later. Thanks.

> the costs for "non-dynamic" memory management, which is in fact as
> dynamic as GC. 
Ah, I feel there's a terminological problem here again. Dynamic allocation is exactly what
malloc(s) do. I'm not comparing them with what GC does. In fact, GC doesn't allocate
anything, it *releases* what's allocated before. It's the bottom half of dynamic memory
management; the allocation itself is dynamic, just like malloc. However, it's much
cheaper to grab an object out of the stack (or even make it global) than to allocate it
dynamically. In many cases, locals are quite adequate--and they are cheaper to have. Java
doesn't have any local objects, all must come from the heap. 

> We call GC "automatic" in contrast to "manual" MM, like
> malloc/free and friends.
You call it right, I call it the same, but that's not what the discussion is about.

> read the memory docs at xanalys:
> 
> myth 8 at
>   http://www.xanalys.com/software_tools/products/myths_and_legends.html
> the faq:
>   http://www.xanalys.com/software_tools/mm/faq.html
Will do. Thanks.
From: Tim Bradshaw
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <ey3r92svhgg.fsf@cley.com>
* Frankey  wrote:
> However, it's much cheaper to grab an object out of the stack (or
> even make it global) than to allocate it dynamically. In many cases,
> locals are quite adequate--and they are cheaper to have.

Actually this doesn't follow.  The cost of stack allocating an object
is something like this:

        increment a pointer
        initialize the memory somehow.
        
The cost of heap allocating an object with a copying GC is (assuming
it's ephemeral):

        increment a pointer
        initialize the memory somehow
        cause the GC to run slightly more often

So the difference is that the GC runs slightly more often -- note that
the GC does *not* `free' the object or anything like that.

So the heap allocating system is more expensive because the GC runs
more often.  But if almost all data is ephemeral (and when comparing
with stack allocation this is true) the GC doesn't actually do much
when it runs.  So if you make the ephemeral space large then you can
push the GC overhead down into the noise.  There was a thread a few
weeks ago (started by me, I forget the subject) where I asked how low
the overhead can actually be on real implementations, and the answer
is it can be basically nothing.

So ephemeral heap allocation is not `much more expensive' than stack
allocation so long as you use reasonably large ephemeral spaces, where
`reasonably large' is pretty small given how much memory people have
nowadays.
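
If you want to convince yourself on your own implementation, a crude test
like this is enough (the absolute numbers mean little; the point is just
that the amortized per-object cost stays small when everything dies young):

(defun cons-a-lot (n)
  (declare (optimize speed) (fixnum n))
  (let ((x nil))
    (dotimes (i n (length x))
      (declare (fixnum i))
      ;; each iteration's garbage is ephemeral and dies immediately
      (setf x (list i i i)))))

;; (time (cons-a-lot 10000000))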

Note there are a couple of things that might make it more expensive
than this:

        more cache pressure
        
        the GC may copy the object once if it gets invoked at a bad 
        time

The second of these vanishes into the noise in the obvious way -- I
just thought I'd mention it in case someone else did.  The first I'm
not sure about but I suspect is not a big deal.

--tim
From: Duane Rettig
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <4ofxwla2d.fsf@beta.franz.com>
A couple of minor corrections:

Tim Bradshaw <···@cley.com> writes:

> * Frankey  wrote:
> > However, it's much cheaper to grab an object out of the stack (or
> > even make it global) than to allocate it dynamically. In many cases,
> > locals are quite adequate--and they are cheaper to have.
> 
> Actually this doesn't follow.  The cost of stack allocating an object
> is something like this:
> 
>         increment a pointer
==========^
>         initialize the memory somehow.

No pointer need be incremented for typical stack-allocated objects.
Try an experiment on a lisp which does efficient stack allocation
(Allegro CL will work for you):
  1. Create a function which let-binds a couple of variables, each to a
simple make-array form, declares them dynamic extent, and does so at
optimization settings that will ensure stack-allocation.  Example:

(defun foo ()
  (declare (optimize speed))
  (let ((x (make-array 10 :element-type '(unsigned-byte 8)))
	(y (make-array 10 :element-type '(unsigned-byte 8))))
    (declare (dynamic-extent x y))
    (bar x y)))

 2. Compile and disassemble this function
 3. Find the pointer increment.

You'll fail at step 3; the pointer "increment" has already been
pre-calculated by the compiler.  Any other work you see done in the
disassembler output involving loads and stores is the actual building
of the object's innards; i.e. your "initialize the memory somehow"
item (similar to what must be done in a heap-allocation).
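
Concretely, steps 2 and 3 look like this at the REPL (BAR is assumed to be
defined elsewhere, and the disassembly is of course implementation- and
platform-specific):

(compile 'foo)
(disassemble 'foo)   ; per the above, you will not find a pointer increment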

> The cost of heap allocating an object with a copying GC is (assuming
> it's ephemeral):
> 
>         increment a pointer
>         initialize the memory somehow
>         cause the GC to run slightly more often
==========^

Implied in this last item is that a test for "fullness" must be made on
every allocation, to see whether whatever sub-memory chunk has been
allocated needs to be renewed, which may result in a gc.  This is
something that is not required for allocating an object on the stack.
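
In toy form, the per-allocation test looks something like this (not how any
real collector is written; SLOW-PATH-GC-AND-RETRY is a hypothetical
placeholder for triggering a gc and retrying the allocation):

(defstruct nursery
  (free  0 :type fixnum)     ; bump pointer
  (limit 0 :type fixnum))    ; end of the current chunk

(defun bump-allocate (nursery nwords)
  (declare (fixnum nwords))
  (let ((new (+ (nursery-free nursery) nwords)))
    (if (<= new (nursery-limit nursery))          ; the "fullness" test
        (prog1 (nursery-free nursery)
          (setf (nursery-free nursery) new))
        (slow-path-gc-and-retry nursery nwords))))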


-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: Joe Marshall
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <vgs4bbnr.fsf@content-integrity.com>
Duane Rettig <·····@franz.com> writes:

> > The cost of heap allocating an object with a copying GC is (assuming
> > it's ephemeral):
> > 
> >         increment a pointer
> >         initialize the memory somehow
> >         cause the GC to run slightly more often
> ==========^
> 
> Implied in this last item is that a test for "fullness" must be made on
> every allocation, to see whether whatever sub-memory chunk has been
> allocated needs to be renewed, which may result in a gc.  

You needn't test for fullness on *every* allocation.  Other techniques
I have seen are putting in guard pages, checking for fullness at
function entry (assuming the function has an idea of how much it
conses), and checking `frequently enough' with the assumption that the
cons rate does not exceed certain limits.




From: Duane Rettig
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <4k88kkzzs.fsf@beta.franz.com>
Joe Marshall <···@content-integrity.com> writes:

> Duane Rettig <·····@franz.com> writes:
> 
> > > The cost of heap allocating an object with a copying GC is (assuming
> > > it's ephemeral):
> > > 
> > >         increment a pointer
> > >         initialize the memory somehow
> > >         cause the GC to run slightly more often
> > ==========^
> > 
> > Implied in this last item is that a test for "fullness" must be made on
> > every allocation, to see whether whatever sub-memory chunk has been
> > allocated needs to be renewed, which may result in a gc.  
> 
> You needn't test for fullness on *every* allocation.  Other techniques
> I have seen are
>   putting in guard pages,

I like the idea of using hardware where it can make lisp operations
faster (though I don't know if it would be worth all of the effort
in this particular case in order to remove one test).  However, a caveat
about using guard pages is that in order to trigger a fault, one must
perform a memory reference operation (or a set, of course), within a
scope where the trap handler would be able to tell that an allocation
was being done, and in a place in the code where the handler could recover
or retry the operation.  Also, such an access would have to occur on the
"last" allocation unit of the object, where "last" means the one closest
to the new allocation pointer.  This would ensure that if the last au were
on a different (and thus guarded) page, the trap would be triggered.

This requirement tends to blur the distinction between allocation of
an object and the initialization of that object.  In practice, it does
not really matter, because most lisp objects have at least some sort
of "header" word, which gives type and size information about the object.
Other slots in the object do not necessarily have to be filled in, (e.g.,
a specialized array with no initial-element spec), so the only way to
guarantee a checking memory access in such objects-with-headers is to
make sure that the header is always placed closest to the allocation
pointer.  Thus, for example, Allegro CL would have to be redesigned
either to place the headers for its objects at the highest word of each
object, or else allocate objects in decreasing memory addresses, instead
of the increasing address allocation that is currently done.  Otherwise,
an "extra" memory access would be needed, which would defeat the idea of
eliminating the fullness test.

Do you have any references on this technique, or any example
implementations?  I'd be interested in looking at them.

>   checking for fullness at function entry (assuming the function has
>     an idea of how much it conses),

This would only work if the function were "atomic" with respect to
the garbage-collector; Any consing within the function by functions in
other threads would invalidate the check at the function start.

>   and checking `frequently enough' with the assumption that the
> cons rate does not exceed certain limits.

See the previous answer; I'd hate to have to debug a multithreaded
system that made such assumptions :-)

-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: Hartmann Schaffer
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92ggng$tpc$1@paradise.nirvananet>
In article <·············@beta.franz.com>,
Duane Rettig  <·····@franz.com> wrote:
>Joe Marshall <···@content-integrity.com> writes:
> ...
>> You needn't test for fullness on *every* allocation.  Other techniques
>> I have seen are
>>   putting in guard pages,
>
>I like the idea of using hardware where it can make lisp operations
>faster (though I don't know if it would be worth all of the effort
>in this particular case in order to remove one test).  However, a caveat
>about using guard pages is that in order to trigger a fault, one must
>perform a memory reference operation (or a set, of course), within a
>scope where the trap handler would be able to tell that an allocation
>was being done, and in a place in the code where the handler could recover

how about the problem of "sizing" the guard pages, e.g. to catch a
stack overflow when a very large array is allocated on the stack?

hs
From: Duane Rettig
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <4bstvval6.fsf@beta.franz.com>
··@paradise.nirvananet (Hartmann Schaffer) writes:

> In article <·············@beta.franz.com>,
> Duane Rettig  <·····@franz.com> wrote:
> >Joe Marshall <···@content-integrity.com> writes:
> > ...
> >> You needn't test for fullness on *every* allocation.  Other techniques
> >> I have seen are
> >>   putting in guard pages,
> >
> >I like the idea of using hardware where it can make lisp operations
> >faster (though I don't know if it would be worth all of the effort
> >in this particular case in order to remove one test).  However, a caveat
> >about using guard pages is that in order to trigger a fault, one must
> >perform a memory reference operation (or a set, of course), within a
> >scope where the trap handler would be able to tell that an allocation
> >was being done, and in a place in the code where the handler could recover
> 
> how about the problem of "sizing" the guard pages, e.g. to catch a
> stack overflow when a very large array is allocated on the stack?

This seems to be a problem for stacks only on Windows.  Unix page fault
handlers seem to do the right thing when the page fault occurs between
the stack pointer and the last known stack frontier still within the
resource limits set for the user (to distinguish the fault from one
which is just a SEGV as a random wild memory reference).

It is true that on NT and Windows we are forced to "walk" through stack
allocations (decrement %esp by 4092, then do a push to touch memory at
that point, repeating until enough stack is allocated).  This is of course
an inefficient way to stack-allocate, but at a local level I consider it
a bug in MS's stack allocation heuristics.  Not a very bad bug, because
there are pathological cases that occur when the unix style heuristics run
their stacks into memory that already _is_ allocated,  but I would have
preferred at least some kind of option on the heuristic that is used.
Instead, when you disassemble a function which stack-allocates a 100k
array on NT and the same function on linux (on x86) you will find a
large difference in the size of the disassembled function.
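
A function of that kind, for anyone who wants to compare the two
disassemblies themselves (the element count is only chosen to make the
array about 100k bytes):

(defun stack-100k ()
  (declare (optimize speed))
  (let ((a (make-array 25000 :element-type '(unsigned-byte 32))))
    (declare (dynamic-extent a))
    (length a)))

;; (compile 'stack-100k)
;; (disassemble 'stack-100k)   ; compare the size of this on NT vs linux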

I don't know if it is possible to select guard sizes/numbers on Windows,
but I also don't think that it is the right solution even if it is
possible; For any particular number of guard-pages one sets, some
programmer is going to request an array size that requires just a little
bit more than can be handled by that set of guard pages.

-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: Hartmann Schaffer
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92j0et$tv$1@paradise.nirvananet>
In article <·············@beta.franz.com>,
Duane Rettig  <·····@franz.com> wrote:
> ...
>> how about the problem of "sizing" the guard pages, e.g. to catch a
>> stack overflow when a very large array is allocated on the stack?
>
> ...
>a bug in MS's stack allocation heuristics.  Not a very bad bug, because
>there are pathological cases that occur when the unix style heuristics run
>their stacks into memory that already _is_ allocated,  but I would have

that was the situation i was wondering about

> ...
>I don't know if it is possible to select guard sizes/numbers on Windows,
>but I also don't think that it is the right solution even if it is
>possible; For any particular number of guard-pages one sets, some
>programmer is going to request an array size that requires just a little
>bit more than can be handled by that set of guard pages.

that exactly was my concern

hs
From: Duane Rettig
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <4ofxtrdrw.fsf@beta.franz.com>
··@paradise.nirvananet (Hartmann Schaffer) writes:

> In article <·············@beta.franz.com>,
> Duane Rettig  <·····@franz.com> wrote:
> > ...
> >> how about the problem of "sizing" the guard pages, e.g. to catch a
> >> stack overflow when a very large array is allocated on the stack?
> >
> > ...
> >a bug in MS's stack allocation heuristics.  Not a very bad bug, because
> >there are pathological cases that occur when the unix style heuristics run
> >their stacks into memory that already _is_ allocated,  but I would have
> 
> that was the situation i was wondering about

You shouldn't worry about that case.  If your program is taking up so much
of the address space that your operating system can't or won't keep the
stack at least a little further away from other memory objects than your
stack-size limit parameters allow, then you need to go to 64 bits.

> >I don't know if it is possible to select guard sizes/numbers on Windows,
> >but I also don't think that it is the right solution even if it is
> >possible; For any particular number of guard-pages one sets, some
> >programmer is going to request an array size that requires just a little
> >bit more than can be handled by that set of guard pages.
> 
> that exactly was my concern

and precisely why I believe that it is not the right solution (i.e. a bug).
The Unix method of having trap handlers look at the stack pointer does
assume that every program element is following the prescribed calling
sequence or is otherwise handling the stack pointer with a bit of care,
but if this assumption is acceptable, there is very little that goes
wrong with the stack-pointer heuristic.

-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: Joe Marshall
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <ofxwm4ma.fsf@content-integrity.com>
Duane Rettig <·····@franz.com> writes:

> Joe Marshall <···@content-integrity.com> writes:
> 
> > Duane Rettig <·····@franz.com> writes:
> > 
> > > > The cost of heap allocating an object with a copying GC is (assuming
> > > > it's ephemeral):
> > > > 
> > > >         increment a pointer
> > > >         initialize the memory somehow
> > > >         cause the GC to run slightly more often
> > > ==========^
> > > 
> > > Implied in this last item is that a test for "fullness" must be made on
> > > every allocation, to see whether whatever sub-memory chunk has been
> > > allocated needs to be renewed, which may result in a gc.  
> > 
> > You needn't test for fullness on *every* allocation.  Other techniques
> > I have seen are
> >   putting in guard pages,
> 
> I like the idea of using hardware where it can make lisp operations
> faster (though I don't know if it would be worth all of the effort
> in this particular case in order to remove one test).  However, a caveat
> about using guard pages is that in order to trigger a fault, one must
> perform a memory reference operation (or a set, of course), within a
> scope where the trap handler would be able to tell that an allocation
> was being done, and in a place in the code where the handler could recover
> or retry the operation.  Also, such an access would have to occur on the
> "last" allocation unit of the object, where "last" means the one closest
> to the new allocation pointer.  This would ensure that if the last au were
> on a different (and thus guarded) page, the trap would be triggered.
> 
> This requirement tends to blur the distinction between allocation of
> an object and the initialization of that object.  In practice, it does
> not really matter, because most lisp objects have at least some sort
> of "header" word, which give type and size information about the object.

Right, and many objects are initted right after being allocated, so if
allocation and initialization are atomic, you're all set.

> Do you have any references on this technique, or any example
> implementations?  I'd be interested in looking at them.

I can't remember where I saw this one, sorry.


> >   checking for fullness at function entry (assuming the function has
> >     an idea of how much it conses),
> 
> This would only work if the function were "atomic" with respect to
> the garbage-collector; Any consing within the function by functions in
> other threads would invalidate the check at the function start.

MIT Scheme does it a bit like this.  I think they also check on
continuation invocation.  Currently, the multiple threads are handled
via polled interrupts, so if the GC check and the interrupt polling
occur at the same time, you can't switch back into a thread without
immediately doing a GC check for the code following the context
switch.

(There is another trick: have the interrupts bash the GC frontier and
you can make the GC check do double duty as an interrupt check.)

> >   and checking `frequently enough' with the assumption that the
> > cons rate does not exceed certain limits.
> 
> See the previous answer; I'd hate to have to debug a multithreaded
> system that made such assumptions :-)

If you assume a high-priority `gc thread' that measures the rate of
advancement of the consing frontier, it could trigger GC well before
any of the consing processes could reach the limit.  (of course,
locking out this process would be *bad*).


From: Tim Bradshaw
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <ey3hf3nv8h9.fsf@cley.com>
* Duane Rettig wrote:

> No pointer need be incremented for typical stack-allocated objects.
> Try an experiment on a lisp which does efficient stack allocation
> (Allegro CL will work for you):

[Snipped example]

This is obviously correct, as is your point about the fullness test.
But I think I'm assuming that the whole increment-and-test part of
heap (or `general' stack) allocation is incredibly cheap compared to
the process of actually initializing memory, since it's likely that
the pointer and bound will be either in registers or 1st-level cache
at worst, while the memory will generally not be.  In fact I'd argue
-- based on the assumption that everything except memory access is
basically free on a modern machine -- that stack allocation's win
should be that it has better cache behaviour.

I'd also quibble futilely about the `typical' case of stack allocation
-- I reckon this is stack-allocated rest lists, which (I
think?) must need to go through the general case since the size can't
be precomputed.  

--tim
From: Duane Rettig
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <4hf3nxhgd.fsf@beta.franz.com>
Tim Bradshaw <···@cley.com> writes:

> * Duane Rettig wrote:
> 
> > No pointer need be incremented for typical stack-allocated objects.
> > Try an experiment on a lisp which does efficient stack allocation
> > (Allegro CL will work for you):
> 
> [Snipped example]
> 
> This is obviously correct, as is your point about the fullness test.
> But I think I'm assuming that the whole increment-and-test part of
> heap (or `general' stack) allocation is incredibly cheap compared to
> the process of actually initializing memory,

Assuming that you actually _do_ initialize the memory.  But what about
a large specialized array?  It need not be initialized, and in Allegro
CL in both the stack-allocated case and the heap-allocated case the
_only_ word that is initialized is the header word (unless there is an
:initial-element specification).  All architectures start this write in
zero time, as far as the instruction flow is concerned.

> since it's likely that
> the pointer and bound will be either in registers or 1st-level cache
> at worst, while the memory will generally not be.  In fact I'd argue
> -- based on the assumption that everything except memory access is
> basically free on a modern machine -- that stack allocation's win
> should be that it has better cache behaviour.

When you are allocating K-bytes of memory at a time it is not caching
but paging that becomes the factor.  And in a generational gc, the
allocation frontier will generally be paged in.

> I'd also quibble futilely about the `typical' case of stack allocation
> -- I reckon the this is stack allocated rest lists, which (I
> think?) must need to go through the general case since the size can't
> be precomputed.  

My use of the term "typical" is perhaps unfortunate; it seems to have
connoted "most used", instead of what I had intended as "similar to
other languages".  Dynamic &rest allocation is definitely not typical;
no other language has it that I know of.  I don't even know of any but
lisp-like languages that have &rest arguments in the first place.  The
varargs feature comes close in C, but it is not really that similar,
since it neither builds an actual structure out of the available
arguments, nor does it put a limit on how far the data extends.

-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: Tim Bradshaw
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <ey34rzlv1uj.fsf@cley.com>
* Duane Rettig wrote:

> Assuming that you actually _do_ initialize the memory.  But what about
> a large specialized array?  It need not be initialized, and in Allegro
> CL in both the stack-allocated case and the heap-allocated case the
> _only_ word that is initialized is the header word (unless there is an
> :initial-element specification).  All architectures start this write in
> zero time, as far as the instruction flow is concerned.

I think I'd probably like to count initializing the memory as part of
allocating it, whether or not that initialization is done as part of
the make-x form.  Of course this isn't always right -- I can think of
cases where you only ever use the object very sparsely so it never
really gets initialized except in very small regions -- but I think
it's generally a bit misleading to consider the time to get some
uninitialised memory as the time to allocate memory.  If you do that
then it's just too easy to cheat (for instance don't even check for
space at all until first use, leading to asymptotically infinite
`allocation' rates).

> When you are allocating K-bytes of memory at a time it is not caching
> but paging that becomes the factor.  And in a generational gc, the
> allocation frontier will generally be paged in.

I'm not sure what you mean by this.  If you mean `writing or reading
pages from backing store', then I think I disagree, for most programs,
if you mean something like `causing lots of TLB misses' then I don't
know.

Of course this depends on a lot of factors -- I'm assuming plenty of
physical memory, which I guess is often true for 32bit systems but
perhaps not so for 64bit systems running large programs.  For the
programs I deal with at the moment (which are on currently at the
large end of 32bit in terms of data) we'd certainly always make sure
that we have enough real memory that we never actually page to disk.

--tim
From: Johan Kullstam
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m2lmsxqmye.fsf@euler.axel.nom>
Tim Bradshaw <···@cley.com> writes:

> * Duane Rettig wrote:
> 
> > Assuming that you actually _do_ initialize the memory.  But what about
> > a large specialized array?  It need not be initialized, and in Allegro
> > CL in both the stack-allocated case and the heap-allocated case the
> > _only_ word that is initialized is the header word (unless there is an
> > :initial-element specification).  All architectures start this write in
> > zero time, as far as the instruction flow is concerned.
> 
> I think I'd probably like to count initializing the memory as part of
> allocating it, whether or not that initialization is done as part of
> the make-x form.  Of course this isn't always right -- I can think of
> cases where you only ever use the object very sparsely so it never
> really gets initialized except in very small regions -- but I think
> it's generally a bit misleading to consider the time to get some
> uninitialised memory as the time to allocate memory.  If you do that
> then it's just too easy to cheat (for instance don't even check for
> space at all until first use, leading to asymptotically infinite
> `allocation' rates).

consider making a table.  in one case, you allocate the array and then
set the values from some function.  consider the other case where you
allocate the array, set everything to zero and then set values.  in
the second case you get a gratuitous initialization.  i think duane is
just pointing out that you can avoid the extra initialization by using
a specialized array without initial-element.

looking at MAKE-ARRAY i don't see where you can pass a function to
allocate things.  something like

  (MAKE-ARRAY '(3 2) :INIT-FUNC #'(LAMBDA (I J) (+ I J)))

where (LAMBDA (I J) ...) gets called to init the (I J)-th element
would maybe be useful.  so it would be something like

  (LET ((X (MAKE-ARRAY '(3 2))))
    (DOTIMES (I 3)
       (DOTIMES (J 2)
          (SETF (AREF X I J) ((LAMBDA (I J) (+ I J)) I J))))
    X)

but where the MAKE-ARRAY here would not of itself initialize anything
despite not being specialized.  perhaps a clever compiler could see
this.

did i miss some interpretation of one of MAKE-ARRAY's arguments or is
there another function that does this?
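
in the meantime, a portable sketch of what i mean (MAKE-ARRAY-2D-WITH-INIT
is made up here, not a standard function, and it of course still pays for
one pass over the elements):

(defun make-array-2d-with-init (rows cols init-fn &rest make-array-args)
  ;; like MAKE-ARRAY for a ROWS x COLS array, but fills each element
  ;; by calling INIT-FN with its subscripts
  (let ((a (apply #'make-array (list rows cols) make-array-args)))
    (dotimes (i rows a)
      (dotimes (j cols)
        (setf (aref a i j) (funcall init-fn i j))))))

;; (make-array-2d-with-init 3 2 #'+)  =>  #2A((0 1) (1 2) (2 3))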

> > When you are allocating K-bytes of memory at a time it is not caching
> > but paging that becomes the factor.  And in a generational gc, the
> > allocation frontier will generally be paged in.
> 
> I'm not sure what you mean by this.  If you mean `writing or reading
> pages from backing store', then I think I disagree, for most programs,
> if you mean something like `causing lots of TLB misses' then I don't
> know.
> 
> Of course this depends on a lot of factors -- I'm assuming plenty of
> physical memory, which I guess is often true for 32bit systems but
> perhaps not so for 64bit systems running large programs.  For the
> programs I deal with at the moment (which are on currently at the
> large end of 32bit in terms of data) we'd certainly always make sure
> that we have enough real memory that we never actually page to disk.
> 
> --tim
> 

-- 
J o h a n  K u l l s t a m
[········@ne.mediaone.net]
Don't Fear the Penguin!
From: Duane Rettig
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <4k88hq83b.fsf@beta.franz.com>
Johan Kullstam <········@ne.mediaone.net> writes:

> Tim Bradshaw <···@cley.com> writes:
> 
> > * Duane Rettig wrote:
> > 
> > > Assuming that you actually _do_ initialize the memory.  But what about
> > > a large specialized array?  It need not be initialized, and in Allegro
> > > CL in both the stack-allocated case and the heap-allocated case the
> > > _only_ word that is initialized is the header word (unless there is an
> > > :initial-element specification).  All architectures start this write in
> > > zero time, as far as the instruction flow is concerned.
> > 
> > I think I'd probably like to count initializing the memory as part of
> > allocating it, whether or not that initialization is done as part of
> > the make-x form.  Of course this isn't always right -- I can think of
> > cases where you only ever use the object very sparsely so it never
> > really gets initialized except in very small regions -- but I think
> > it's generally a bit misleading to consider the time to get some
> > uninitialised memory as the time to allocate memory.  If you do that
> > then it's just too easy to cheat (for instance don't even check for
> > space at all until first use, leading to asymptotically infinite
> > `allocation' rates).
> 
> consider making a table.  in one case, you allocate the array and then
> set the values from some function.  consider the other case where you
> allocate the array, set everything to zero and then set values.  in
> the second case you get a gratuitous initialization.  i think duane is
> just pointing out that you can avoid the extra initialization by using
> a specialized array without initial-element.

Correct.

> looking at MAKE-ARRAY i don't see where you can pass a function to
> allocate things.  something like
> 
>   (MAKE-ARRAY '(3 2) :INIT-FUNC #'(LAMBDA (I J) (+ I J)))
> 
> where (LAMBDA (I J) ...) gets called to init the (I J)-th element
> would maybe be useful.  so it would be something like
> 
>   (LET ((X (MAKE-ARRAY '(3 2))))
>     (DOTIMES (I 3)
>        (DOTIMES (J 2)
>           (SETF (AREF X I J) ((LAMBDA (I J) (+ I J)) I J))))
>     X)
> 
> but where the MAKE-ARRAY here would not of itself initialize anything
> despite not being specialized.  perhaps a clever compiler could see
> this.
> 
> did i miss some interpretation of one of MAKE-ARRAY's arguments or is
> there another function that does this?

No.  The spec says that reading a non-displaced array that has not been
given an initial-element or initial-contents is undefined.  So it leaves
it up to the implementation as to what to do.

The problem with arrays of element type T is that the elements must
at all times be valid LispVals, because if they contain uninitialized
bits it might confuse the garbage collector.  So we initialize arrays
of type T to nils even if they are not initialized by the user.  But
it is not necessary to initialize specialized arrays, because they are
not seen by the garbage collector; they are just bits.  Thus, we take
advantage of this fact and save some time from double-initializing
specialized arrays.  This makes little difference in small arrays,
but when the arrays get large...
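
For example (an untested sketch -- the numbers are implementation- and
machine-dependent, and reading the second array before storing into it
is undefined):

(time (make-array 1000000))                              ; element type T, pre-filled with NILs
(time (make-array 1000000 :element-type 'single-float))  ; specialized, left uninitialized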

-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: Tim Bradshaw
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <ey3snn4u8i0.fsf@cley.com>
* Johan Kullstam wrote:

> consider making a table.  in one case, you allocate the array and then
> set the values from some function.  consider the other case where you
> allocate the array, set everything to zero and then set values.  in
> the second case you get a gratuitous initialization.  i think duane is
> just pointing out that you can avoid the extra initialization by using
> a specialized array without initial-element.

I'm not trying to argue that you should always `zero' things and then
set them to what you want.  But I'd like to count the initialization
-- even if it's user-defined rather than system-defined -- as part of
the allocation (perhaps some other term would be better) of memory.

I think I can see reasons *not* to time this way.  For instance if
you're comparing something like a typical lisp allocator against a
typical malloc-type system, then malloc has to do a whole load more
work just to find a free bit of memory to use, and may have all sorts
of horrid pathological cases where either memory use or time to find
memory explodes.  So comparing the time to find a free chunk of memory
makes Lisp look very good -- as it should!

But my reason *to* time this way is that it times the common case,
where you actually want some meaningful contents in the thing you
allocated rather than just random bits.  It also prevents people
cheating by not actually doing any work until the thing is first
accessed.  And finally it lets people who can do clever things with
caching[1] win, which is good because doing clever things with caching is
important given the relative performances of processors and memory
nowadays.
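
Concretely -- and this is an untested sketch -- you can time the two
cases separately, raw allocation versus allocation plus a user-supplied
fill:

(time (make-array 1000000 :element-type 'single-float))
(time (let ((a (make-array 1000000 :element-type 'single-float)))
        (dotimes (i 1000000 a)
          (setf (aref a i) (float i 0.0)))))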

--tim



Footnotes: 
[1]  for instance the trick of having a nursery which is the same size
     as a level of cache.
From: Duane Rettig
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <4k88g7ho3.fsf@beta.franz.com>
Tim Bradshaw <···@cley.com> writes:

> * Johan Kullstam wrote:
> 
> > consider making a table.  in one case, you allocate the array and then
> > set the values from some function.  consider the other case where you
> > allocate the array, set everything to zero and then set values.  in
> > the second case you get a gratuitous initialization.  i think duane is
> > just pointing out that you can avoid the extra initialization by using
> > a specialized array without initial-element.
> 
> I'm not trying to argue that you should always `zero' things and then
> set them to what you want.  But I'd like to count the initialization
> -- even if it's user-defined rather than system-defined -- as part of
> the allocation (perhaps some other term would be better) of memory.
> 
> I think I can see reasons *not* to time this way.  For instance if
> you're comparing something like a typical lisp allocator against a
> typical malloc-type system, then malloc has to do a whole load more
> work just to find a free bit of memory to use, and may have all sorts
> of horrid pathological cases where either memory use or time to find
> memory explodes.  So comparing the time to find a free chunk of memory
> makes Lisp look very good -- as it should!

If you're comparing Lisp against C in this way, you should be comparing
like features.  Remember, in C calloc() initializes, and malloc() does
not.  There are reasons for using calloc().  But which do you think is
used more often?  Why do you think that is?

> But my reason *to* time this way is that it times the common case,
> where you actually want some meaningful contents to thing you
> allocated rather than just random bits.  It also prevents people
> cheating by not actually doing any work until the thing is first
> accessed.  And finally it lets people who can do clever things with
> caching[1] win, which is good because doing clever things with caching is
> important given the relative performances of processors and memory
> nowadays.

> Footnotes: 
> [1]  for instance the trick of having a nursery which is the same size
>      as a level of cache.

Why would you consider it cheating to use all of the features and tricks
of hardware and software to take as few cycles as possible to perform a
certain kind of work?  If the work actually gets done, and saves time, isn't it a
meaningful measure of performance?  What do you do for a sparse array
which will _never_ be fully used, but which because of your rule actually
gets filled in anyway, with corresponding cycles lost and swap allocated?

-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: Jochen Schmidt
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92nmjr$6umjr$1@ID-22205.news.dfncis.de>
Duane Rettig wrote:

> Tim Bradshaw <···@cley.com> writes:
> 
>> * Johan Kullstam wrote:
>> 
>> > consider making a table.  in one case, you allocate the array and then
>> > set the values from some function.  consider the other case where you
>> > allocate the array, set everything to zero and then set values.  in
>> > the second case you get a gratuitous initialization.  i think duane is
>> > just pointing out that you can avoid the extra initialization by using
>> > a specialized array without initial-element.
>> 
>> I'm not trying to argue that you should always `zero' things and then
>> set them to what you want.  But I'd like to count the initialization
>> -- even if it's user-defined rather than system-defined -- as part of
>> the allocation (perhaps some other term would be better) of memory.
>> 
>> I think I can see reasons *not* to time this way.  For instance if
>> you're comparing something like a typical lisp allocator against a
>> typical malloc-type system, then malloc has to do a whole load more
>> work just to find a free bit of memory to use, and may have all sorts
>> of horrid pathological cases where either memory use or time to find
>> memory explodes.  So comparing the time to find a free chunk of memory
>> makes Lisp look very good -- as it should!
> 
> If you're comparing Lisp against C in this way, you should be comparing
> like features.  Remember, in C calloc() initializes, and malloc() does
> not.  There are reasons for using calloc().  But which do you think is
> used more often?  Why do you think that is?
> 
>> But my reason *to* time this way is that it times the common case,
>> where you actually want some meaningful contents to thing you
>> allocated rather than just random bits.  It also prevents people
>> cheating by not actually doing any work until the thing is first
>> accessed.  And finally it lets people who can do clever things with
>> caching[1] win, which is good because doing clever things with caching is
>> important given the relative performances of processors and memory
>> nowadays.
> 
>> Footnotes:
>> [1]  for instance the trick of having a nursery which is the same size
>>      as a level of cache.
> 
> Why would you consider it cheating to use all of the features and tricks
> of hardware and software to take as few cycles to perform a certain kind
> of work?  If the work actually gets done, and saves time, isn't it a
> meaningful measure of performance?  What do you do for a sparse array
> which will _never_ be fully used, but which because of your rule actually
> gets filled in anyway, with corresponding cycles lost and swap allocated?

This might be a really dumb question but is it possible in any system to 
allocate a huge uninitialized array (bigger than e. g. physical+swap) and
use it as a sparse array? I could imagine that the MMU and the
memory-management system of the OS could manage this in a similar way to
how sparse files are handled in UNIX filesystems.

I hope my question is not as dumb as I feel at the moment...

Regards,
Jochen

http://www.dataheaven.de
From: Johan Kullstam
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m28zow7c1z.fsf@euler.axel.nom>
Jochen Schmidt <···@dataheaven.de> writes:

> This might be a really dumb question but is it possible in any system to 
> allocate a huge uninitialized array (bigger than e. g. physical+swap) and
> use it as a sparse array? I could imagine that the mmu and the 
> memory-management-system of the OS could manage this in similar terms
> like sparse-files are handled in UNIX-Filesystems.
> 
> I hope my question is not so dumb as i feel at the moment...

i don't know, but wouldn't a hash-table be a good solution?  you can
use fixnums for the key.
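
something like this, for example (untested; assumes both indices stay
below 16384 so the packed key remains a fixnum even on a 32-bit lisp,
and uses 0.0 as the default element):

  (defparameter *sparse* (make-hash-table))   ; eql hash-table, fixnum keys

  (defun sparse-ref (i j)
    (gethash (+ (* i 16384) j) *sparse* 0.0))

  (defun (setf sparse-ref) (new-value i j)
    (setf (gethash (+ (* i 16384) j) *sparse*) new-value))

  ;; (setf (sparse-ref 2 3) 1.5)  then  (sparse-ref 2 3) => 1.5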

-- 
J o h a n  K u l l s t a m
[········@ne.mediaone.net]
Don't Fear the Penguin!
From: Jochen Schmidt
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92o8tu$7ea3k$1@ID-22205.news.dfncis.de>
Johan Kullstam wrote:

> Jochen Schmidt <···@dataheaven.de> writes:
> 
>> This might be a really dumb question but is it possible in any system to
>> allocate a huge uninitialized array (bigger than e. g. physical+swap) and
>> use it as a sparse array? I could imagine that the mmu and the
>> memory-management-system of the OS could manage this in similar terms
>> like sparse-files are handled in UNIX-Filesystems.
>> 
>> I hope my question is not so dumb as i feel at the moment...
> 
> i don't know, but wouldn't a hash-table be a good solution?  you can
> use fixnums for the key.

Yes, sure - it was not a question of how to do something, but more a matter of
wondering whether such unusual memory-management systems exist. (I would often
use hash-tables in this way myself.)
Sparse files are supported in many good filesystems, but I've never heard of
an efficient "sparse array" implementation. In the end such an implementation
would be nothing other than a hashtable implemented by the MMU's hash functions.

Regards,
Jochen

> 
> --
> J o h a n  K u l l s t a m
> [········@ne.mediaone.net]
> Don't Fear the Penguin!
> 
From: Rob Warnock
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92q0j2$k2tvb$1@fido.engr.sgi.com>
Jochen Schmidt  <···@dataheaven.de> wrote:
+---------------
| This might be a really dumb question but is it possible in any system to 
| allocate a huge uninitialized array (bigger than e. g. physical+swap) and
| use it as a sparse array? I could imagine that the mmu and the 
| memory-management-system of the OS could manage this in similar terms
| like sparse-files are handled in UNIX-Filesystems.
+---------------

In SGI's "Irix" version of Unix, you can dynamically configure
the system with some amount of "virtual swap space", to overcommit
the physically-available swap area. The resulting "logical swap"
space is the sum of vswap & physical swap. This is obviously dangerous
(as it can cause deadlock), as warned in the "swap(1)" & "swapctl(2)"
man pages, but it's sometimes necessary to allow a huge program to
fork when all it's going to do is immediately exec a smaller program,
or allow a program that uses memory sparsely [and does not use the
technique shown later] to run without requiring a lot of swap space
that is not really needed.

Given enough vswap, one could "allocate" a huge amount of virtual
space in a user program by mmap'ing /dev/zero (which is implicitly
MAP_PRIVATE), and succeed, though "logical swap space" will still
be allocated. The pages in the resulting user virtual address range
will not be initialized or swapped unless they are actually touched.

A better way (if you control the program's source) is to explicitly
defer the allocation of swap space by mmap'ing /dev/zero with the
MAP_AUTORESRV flag. From "mmap(2)":

    The MAP_AUTORESRV flag causes logical swap space to be automatically
    reserved as each page is first referenced with a store operation instead
    of when the mapping is created.  When this flag is used, no logical swap
    space is reserved when the mapping is created.  Therefore, the system
    cannot guarantee that space will be available when needed.  If all the
    logical swap space has been taken by other processes when a page in a
    MAP_AUTORESRV mapping is first stored to, then the process will be sent
    SIGBUS.

Now I don't know if any Lisp implementation for Irix actually uses this 
feature to support sparseness, but it's there (and I know that certain C
programs depend on it)...


-Rob

-----
Rob Warnock, 31-2-510		····@sgi.com
SGI Network Engineering		http://reality.sgi.com/rpw3/
1600 Amphitheatre Pkwy.		Phone: 650-933-1673
Mountain View, CA  94043	PP-ASEL-IA
From: Jochen Schmidt
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92q47k$7moto$1@ID-22205.news.dfncis.de>
Rob Warnock wrote:

> Jochen Schmidt  <···@dataheaven.de> wrote:
> +---------------
> | This might be a really dumb question but is it possible in any system to
> | allocate a huge uninitialized array (bigger than e. g. physical+swap)
> | and use it as a sparse array? I could imagine that the mmu and the
> | memory-management-system of the OS could manage this in similar terms
> | like sparse-files are handled in UNIX-Filesystems.
> +---------------
> 
> In SGI's "Irix" version of Unix, you can dynamically configure
> the system with some amount of "virtual swap space", to overcommit
> the physically-available swap area. The resulting "logical swap"
> space is the sum of vswap & phtsical swap. This is obviously dangerous
> (as it can cause deadlock), as warned in the "swap(1)" & "swapctl(2)"
> man pages, but it's sometimes is necessary to allow a huge program to
> fork when all it's going to do is immediately exec a smaller program,
> or allow a program that uses memory sparsely [and does not use the
> technique shown later] to run without requiring a lot of swap space
> that is not really needed.
> 
> Given a enough vswap, one could "allocate" a huge amount of virtual
> space in a user program by mmap'ing /dev/zero (which is implicitly
> MAP_PRIVATE), and succeed, though "logical swap space" will still
> be allocated. The pages in the resulting user virtual address range
> will not be initialized or swapped unless they are actually touched.
> 
> A better way (if you control the program's source) is to explicitly
> defer the allocation of swap space by mmap'ing /dev/zero with the
> MAP_AUTORESRV flag. From "mmap(2)":
> 
>     The MAP_AUTORESRV flag causes logical swap space to be automatically
>     reserved as each page is first referenced with a store operation instead
>     of when the mapping is created.  When this flag is used, no logical swap
>     space is reserved when the mapping is created.  Therefore, the system
>     cannot guarantee that space will be available when needed.  If all the
>     logical swap space has been taken by other processes when a page in a
>     MAP_AUTORESRV mapping is first stored to, then the process will be sent
>     SIGBUS.
> 
> Now I don't know if any Lisp implementation for Irix actually uses this
> feature to support sparseness, but it's there (and I know that certain C
> programs depend on it)...

Yes, that is exactly what I meant. It sounds like a useful feature of an OS,
and I'm left wondering either why the other OSes seem to have no such facilities
or whether I'm completely misinformed.

Thanks,

Jochen

--
http://www.dataheaven.de
From: Tim Bradshaw
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <ey3ae9bt0id.fsf@cley.com>
* Jochen Schmidt wrote:

> Yes that is exactly what I meant. It sounds like a useful feature of
> an OS and I'am either wondering why the other OSes seem to no such
> facilities or I'am completely disinformed.

I just looked this up, and I *think* that using mmap with MAP_NORESERVE in
Solaris does pretty much what Rob described.  I haven't actually *tested* it
though!

--tim
From: Duane Rettig
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <4ae98wlus.fsf@beta.franz.com>
Tim Bradshaw <···@cley.com> writes:

> * Jochen Schmidt wrote:
> 
> > Yes that is exactly what I meant. It sounds like a useful feature of
> > an OS and I'am either wondering why the other OSes seem to no such
> > facilities or I'am completely disinformed.
> 
> I just checked, and I *think* that using mmap with MAP_NORESERVE in
> Solaris does pretty much what Rob described.  I haven't *checked*
> though!

Yes, this is the case.  We use it in reserving lisp heap without actually
allocating swap.  I have a comment in the interface code that although
it is called MAP_NORESERVE on Solaris, it is more appropriately named
by SGI as MAP_AUTORESRV, and I would have personally preferred an
even more appropriate name like MAP_AUTOCOMMIT.  The problem is that
it is not clear what actions are desired when someone actually accesses
an address within the reserved region; it may be, in the case of a sparse
array, that one wants zeros to automatically be mapped in at that first
access, but the way we use it, we would consider it to be a wild pointer
and would want a SEGV generated instead of trying to allocate space.

Our solution to this is to not only use MAP_NORESERVE/..., but to also
unset rwx permissions so as to allow no access.  This causes a fault
if such memory is accessed before it is explicitly allocated by committing.

Systems which allow this reservation concept are Solaris, SGI, NT
(via VirtualAlloc), and HPUX 11.  Systems which do not support it
are Alpha (Tru64), FreeBSD, Linux, HPUX 10 and earlier, and AIX.

-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: Tim Bradshaw
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <ey3elynt0rm.fsf@cley.com>
* Jochen Schmidt wrote:
> This might be a really dumb question but is it possible in any
> system to allocate a huge uninitialized array (bigger than
> e. g. physical+swap) and use it as a sparse array? I could imagine
> that the mmu and the memory-management-system of the OS could manage
> this in similar terms like sparse-files are handled in
> UNIX-Filesystems.

I think that this kind of thing can be done.  I'm not sure what
systems support it for straight memory allocation but certainly some
do, or did, as I remember being a bit disturbed on some system when
malloc succeeded happily for bogus requests which were much larger
than the available space.

Things like mmap will do this on most modern Unices I think -- you map
a file into memory, and I think that file can be much larger than
swap.  Because of magic with sparse files I think this can have the
result you're after.

--tim
From: Tim Bradshaw
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <ey3k88ft182.fsf@cley.com>
* Duane Rettig wrote:
> If you're comparing Lisp against C in this way, you should be comparing
> like features.  Remember, in C calloc() initializes, and malloc() does
> not.  There are reasons for using calloc().  But which do you think is
> used more often?  Why do you think that is?

I think the common case is malloc followed rather rapidly by some kind
of copy into the storage.  I certainly am not suggesting you compare
malloc alone with something that initialises.

> Why would you consider it cheating to use all of the features and
> tricks of hardware and software to take as few cycles to perform a
> certain kind of work?  If the work actually gets done, and saves
> time, isn't it a meaningful measure of performance?  What do you do
> for a sparse array which will _never_ be fully used, but which
> because of your rule actually gets filled in anyway, with
> corresponding cycles lost and swap allocated?

Yes it's a meaningful measure of performance but it's not something
that's particularly interesting to compare, because it can obviously
be made arbitrarily fast with a good allocator (because all you need
to do is a pointer increment and a bounds check, which (assuming
enough registers that the pointer & bound are in them) is 2 cycles or
something).  It's interesting because it shows how badly non-relocating
allocators do but not really otherwise.

What I meant by cheating was something like this (untested) code.

(defclass my-vector ()
  ((length :initform 0 :initarg :length)
   (storage :initform nil)))

(defun make-my-vector (length)
  (make-instance 'my-vector :length length))

(defmethod my-vector-ref ((v my-vector) i)
   (when (null (slot-value v 'storage))
     (setf (slot-value v 'storage) 
           (make-array (slot-value v 'length))))
   (aref (slot-value v 'storage) i))

Now MAKE-MY-VECTOR isn't allocating a vector in any reasonable sense.
Obviously this is an awful implementation, but something like it --
perhaps a system which allocated only small blocks of space -- might
be quite reasonable for sparse arrays or whatever.

I think what I'm trying to distinguish between is what I'd call
allocating address space and allocating initialized storage.  It's
pretty clearly the case that a relocating allocator can allocate
address space incredibly fast: we can argue about whether you need to
increment a pointer and check or not, but, really, who cares about
that, these are operations which take a few billionths of a second.
And I guess it's interesting in the context of discussions with C-type
people to know this, or rather to know that non-relocating allocators
can cause this operation to take quite a long time, and perhaps worse
can have dreadful worst cases.

But in the context of Lisp I'd like to take this as a given:
allocating address space is fast and easy.  But actually using that
space is a much more interesting question, because getting at an
arbitrary bit of address space can take hundreds of cycles.  So this
brings up questions of do allocators behave in cache-friendly ways and
so on.  And, to me, those questions are pretty interesting.

--tim
        
From: Johan Kullstam
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m28zowr2h1.fsf@euler.axel.nom>
Tim Bradshaw <···@cley.com> writes:

> * Johan Kullstam wrote:
> 
> > consider making a table.  in one case, you allocate the array and then
> > set the values from some function.  consider the other case where you
> > allocate the array, set everything to zero and then set values.  in
> > the second case you get a gratuitous initialization.  i think duane is
> > just pointing out that you can avoid the extra initialization by using
> > a specialized array without initial-element.
> 
> I'm not trying to argue that you should always `zero' things and then
> set them to what you want.  But I'd like to count the initialization
> -- even if it's user-defined rather than system-defined -- as part of
> the allocation (perhaps some other term would be better) of memory.

consider the case (which i often have) where you have a numerical
function which is rather expensive to compute (maybe it's an integral
of a complicated function).  a typical example is computing the taps to a
discrete time filter based on the butterworth design.

the reason i make a table is that the function itself is slow but
table look-ups are fast.  thus i first do a make-array and then run
through the array setting the values.  sometimes the array is
precomputed at once, sometimes a lazy evaluation makes sense.  i don't
think an expensive function to compute the array element value should
count against allocation.
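
e.g. something like this (untested; EXPENSIVE-FN stands for whatever
slow numerical function is being tabulated):

  (defun make-table (n expensive-fn)
    ;; pay for EXPENSIVE-FN once per entry here; later accesses are just
    ;; cheap AREFs into a specialized array
    (let ((tab (make-array n :element-type 'single-float)))
      (dotimes (i n tab)
        (setf (aref tab i) (float (funcall expensive-fn i) 0.0)))))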

maybe you are right, allocation is not the right word for it.

> I think I can see reasons *not* to time this way.  For instance if
> you're comparing something like a typical lisp allocator against a
> typical malloc-type system, then malloc has to do a whole load more
> work just to find a free bit of memory to use, and may have all sorts
> of horrid pathological cases where either memory use or time to find
> memory explodes.  So comparing the time to find a free chunk of memory
> makes Lisp look very good -- as it should!
> 
> But my reason *to* time this way is that it times the common case,
> where you actually want some meaningful contents to thing you
> allocated rather than just random bits.  It also prevents people
> cheating by not actually doing any work until the thing is first
> accessed.  And finally it lets people who can do clever things with
> caching[1] win, which is good because doing clever things with caching is
> important given the relative performances of processors and memory
> nowadays.
> 
> --tim
> 
> 
> 
> Footnotes: 
> [1]  for instance the trick of having a nursery which is the same size
>      as a level of cache.
> 

-- 
J o h a n  K u l l s t a m
[········@ne.mediaone.net]
Don't Fear the Penguin!
From: Tim Bradshaw
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <ey38zozv6un.fsf@cley.com>
* I wrote:

> I'd also quibble futilely about the `typical' case of stack allocation
> -- I reckon this is stack allocated rest lists, which (I
> think?) must need to go through the general case since the size can't
> be precomputed.  

I think this is silly -- the rest list case is, I guess, part of the
function calling sequence anyway so it's not quite the same thing.

--tim
From: Duane Rettig
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <4d7ebxg6a.fsf@beta.franz.com>
Tim Bradshaw <···@cley.com> writes:

> * I wrote:
> 
> > I'd also quibble futilely about the `typical' case of stack allocation
> > -- I reckon this is stack allocated rest lists, which (I
> > think?) must need to go through the general case since the size can't
> > be precomputed.  
> 
> I think this is silly -- the rest list case is, I guess, part of the
> function calling sequence anyway so it's not quite the same thing.

Not so silly, and not so very different, even, from other dynamic-extent
list allocations (such as the with-stack-list construct or a let-binding
of a variable to a list form and a dynamic-extent declaration).  It is
true that the &rest list size is not known at compile-time, but it is
indirectly known at runtime in the form of the argument-count, from which
the number of required and optional arguments can be subtracted to get the
actual count of cons cells to allocate.  It is also similar in that the
cons cells must be fully initialized, so the argument you made in your
previous post takes effect; memory cache and paging bandwidth overshadows
the need to make an allocation test in both cases.  The only other real
difference between dynamic-extent &rest and with-stack-list consing in
Allegro CL is that &rest consing, both dynamic-extent and not, and similar
to heap consing, is done out-of-line, whereas with-stack-list is done
in-line.  This is purely due to various implementation efficiency decisions.
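
For instance (an untested sketch), a dynamic-extent &rest list looks
like this:

(defun sum-args (&rest args)
  (declare (dynamic-extent args))
  ;; ARGS may now be stack-allocated; it must not be captured or
  ;; returned, since it disappears when SUM-ARGS returns
  (reduce #'+ args))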

-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: Rob Warnock
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92k3l4$j7mr8$1@fido.engr.sgi.com>
Tim Bradshaw  <···@cley.com> wrote:
+---------------
| But I think I'm assuming that the whole increment-and-test part of
| heap (or `general' stack) allocation is incredibly cheap compared to
| the process of actually initializing memory, since it's likely that
| the pointer and bound will be either in registers or 1st-level cache
| at worst, while the memory will generally not be.  In fact I'd argue
| -- based on the assumption that everything except memory access is
| basically free on a modern machine -- that stack allocation's win
| should be that it has better cache behaviour.
+---------------

Not necessarily. See Ungar's papers on generational collectors with
a fixed "nursery" or "ephemeral" area of size chosen to fit in the
secondary cache. You get the same reuse of memory "hot" in the cache
as with a stack.

[For those not familiar, the idea is that all fresh allocations are done
in the nursery area, which is a *single* space, not a pair of semi-spaces.
When the nursery area fills up, a minor collection is done which copies
all the live data from the nursery area into the active semi-space of the
next generation, and then the whole nursery area is *reused* for more
new allocations.]


-Rob

-----
Rob Warnock, 31-2-510		····@sgi.com
SGI Network Engineering		http://reality.sgi.com/rpw3/
1600 Amphitheatre Pkwy.		Phone: 650-933-1673
Mountain View, CA  94043	PP-ASEL-IA
From: Paul Foley
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m2g0j975dk.fsf@mycroft.actrix.gen.nz>
On Wed, 27 Dec 2000 20:18:47 GMT, Frankey  wrote:

> management, but the allocation itself is dynamic, just like
> malloc. However, it's much cheaper to grab an object out of the
> stack (or even make it global) than to allocate it dynamically. In

What makes you think it's cheaper to allocate on the stack than on the
heap?  Do you have any basis for this belief, or is it just religion?
[Hint: heap allocation in languages with garbage collectors that can
move things around is just the same as stack allocation -- you just
have to bump the pointer that tells you where the unused memory
begins, and use the intervening space.]

> many cases, locals are quite adequate--and they are cheaper to
> have. Java doesn't have any local objects, all must come from the
> heap.

Why are you so intent on discussing Java?  Who cares about Java?  This
is comp.lang.lisp, not comp.lang.java!  Lisp is quite capable of
stack-allocating objects if it wants to (as is Java, for that matter,
if the compiler can tell that you're not keeping the object around
after the function that allocates it terminates...whether any Java
implementation actually does that or not is irrelevant.  Lisp has a
"dynamic-extent" declaration that lets you inform the compiler that
it's OK to do this)
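
A minimal (untested) example of that declaration:

(defun sum3 (x y z)
  (let ((l (list x y z)))
    (declare (dynamic-extent l))
    ;; L may be stack-allocated; it must not escape SUM3
    (reduce #'+ l)))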

-- 
I have stopped reading Stephen King novels.  Now I just read C code
instead.                                            -- Richard A. O'Keefe

(setq reply-to
  (concatenate 'string "Paul Foley " "<mycroft" '(··@) "actrix.gen.nz>"))
From: Duane Rettig
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <4ofxxe0zm.fsf@beta.franz.com>
Paul Foley <·······@actrix.gen.nz> writes:

> On Wed, 27 Dec 2000 20:18:47 GMT, Frankey  wrote:
> 
> > management, but the allocation itself is dynamic, just like
> > malloc. However, it's much cheaper to grab an object out of the
> > stack (or even make it global) than to allocate it dynamically. In
> 
> What makes you think it's cheaper to allocate on the stack than on the
> heap?  Do you have any basis for this belief, or is it just religion?
> [Hint: heap allocation in languages with garbage collectors that can
> move things around is just the same as stack allocation -- you just
> have to bump the pointer that tells you where the unused memory
> begins, and use the intervening space.]

Actually, this isn't quite true.  The two nice things about stack
allocation are
 1. it doesn't have to be dynamic (though it might be, via alloca
or something similar) - under the right circumstances those pointers
you mention can be calculated at compile-time.  Thus, there is no
_extra_ pointer increment at runtime; the decrement of the stack is
still done once (just by a different amount) and pointers to the
objects are precalculated.  Pre-initializing such objects might be
necessary, but that is no different than heap allocated objects.
 2. The objects are deallocated automatically when the function returns
by any means.  Of course, this does not take into account any finalization/destructor
operations that may be necessary, but again, there is little difference in these
extra operations between stack and heap allocations, so they do not
figure in to the difference.

As you mentioned, Lisp does stack allocation very well.

The biggest mistake our friend Frankey makes (besides being fairly
uninformed about Lisp) is that he thinks that C is a low-level
language, close to the machine.  I'm surprised he didn't tout assembler
as the way to go, since using assembler can be several times faster than
C, depending on what is being done.

We've discussed this before, and I wrote a rather long treatise about
why some lisp vendors don't compile their Lisp code down to C (i.e. use C
as a portable assembler). The thread was on comp.lang.lisp, and it can
be found on deja.com under the title
 "LISP to C vs machine level (was: LISP for Windows)".
The subject is actually a little broader than the one in this thread,
but deals with the question at hand as well.

-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: Hannah Schroeter
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <93cqg8$e0j$1@c3po.schlund.de>
Hello!

In article <·············@beta.franz.com>,
Duane Rettig  <·····@franz.com> wrote:
>Paul Foley <·······@actrix.gen.nz> writes:

>[...]

>Actually, this isn't quite true.  The two nice things about stack
>allocation are
> 1. it doesn't have to be dynamic (though it might be, via alloca
>or something similar) - under the right circumstances those pointers
>you mention can be calculated at compile-time.  Thus, there is no
>_extra_ pointer increment at runtime; the decrement of the stack is
>still done once (just by a different amount) and pointers to the
>objects are precalculated.  Pre-initializing such objects might be
>necessary, but that is no different than heap allocated objects.

This can be done for heap allocation, too. Just analyse all
heap allocations in a basic block. Test space availability once
(if that isn't done with guard pages), then inc the heap allocation
pointer by the sum of all allocations, then use negative, compile-time
calculated offsets to the allocation pointer (at that time probably
in a register) to initialize / use the heap objects.

>[...]

Kind regards,

Hannah.
From: Duane Rettig
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <43detkg3t.fsf@beta.franz.com>
······@schlund.de (Hannah Schroeter) writes:

> Hello!
> 
> In article <·············@beta.franz.com>,
> Duane Rettig  <·····@franz.com> wrote:
> >Paul Foley <·······@actrix.gen.nz> writes:
> 
> >[...]
> 
> >Actually, this isn't quite true.  The two nice things about stack
> >allocation are
> > 1. it doesn't have to be dynamic (though it might be, via alloca
> >or something similar) - under the right circumstances those pointers
> >you mention can be calculated at compile-time.  Thus, there is no
> >_extra_ pointer increment at runtime; the decrement of the stack is
> >still done once (just by a different amount) and pointers to the
> >objects are precalculated.  Pre-initializing such objects might be
> >necessary, but that is no different than heap allocated objects.
> 
> This can be done for heap allocation, too. Just analyse all
> heap allocations in a basic block. Test space availability once
> (if that isn't done with guard pages), then inc the heap allocation
> pointer by the sum of all allocations, then use negative, compile-time
> calculated offsets to the allocation pointer (at that time probably
> in a register) to initialize / use the heap objects.

A cute trick.  However, it forces heap allocation to be in-lined,
which I don't consider to be cost-effective for a few reasons:

 1. Over and above the single test we were theorizing about saving,
in-line heap allocation forces more code into ram/caches/etc.,
causing slowdown of the application.  A simple function call per
allocation tends to generate smaller code.

 2. Some of the profiling and testing instrumentation that is available
(e.g. our space-profiler) is either left out of these allocations or
must be expanded as well, with even more bloat.

 3. The "shape" of heap allocation model must be known by the compiler.
This precludes a new gc/allocator being plugged in for the old one,
without recompiling lisp code.

So why do we bother doing the inlining for stack-allocated objects?
With reasons corresponding to the above:

 1': Though the code for stack allocation is larger because it is
inlined, it is worth doing because of the larger savings that occur
because the garbage collector is invoked less often.

 2': Stack allocation tends not to need profiling instrumentation.
Most allocation instrumentation has to do with memory management,
including gc efficiency and memory leaks, rather than with specific
objects.

 3': Stack allocated objects have nothing to do with the implementation
of the lisp heap, nor with the garbage collector.  So the inlined code
for stack allocation does not preclude mixing/matching of gc/allocators.
Of course, differences in object representation would force a change
to the inlining of stack-consing code, but I view this as a more
fundamental change that would force complete redesign of the compiler
anyway.

-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: Hannah Schroeter
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <93l56c$r6u$1@c3po.schlund.de>
Hello!

In article <·············@beta.franz.com>,
Duane Rettig  <·····@franz.com> wrote:
>······@schlund.de (Hannah Schroeter) writes:

>[...]

>> This can be done for heap allocation, too. Just analyse all
>> heap allocations in a basic block. Test space availability once
>> (if that isn't done with guard pages), then inc the heap allocation
>> pointer by the sum of all allocations, then use negative, compile-time
>> calculated offsets to the allocation pointer (at that time probably
>> in a register) to initialize / use the heap objects.

>A cute trick.  However, it forces heap allocation to be in-lined,
>which I don't consider to be cost-effective for a few reasons:

> 1. Over and above the single test we were theorizing about saving,
>in-line heap allocation forces more code into ram/caches/etc.,
>causing slowdown of the application.  A simple function call per
>allocation tends to generate smaller code.

How so?

(let ((x (cons 1 2))
      (y (make-array ...)))
  (declare (dynamic-extent x y))
  body)

could be translated to

  sp -= size_of_cons + size_of_array_descriptor
  sp[size_of_array_descriptor] = 1
  sp[size_of_array_descriptor + size_of_car_part_of_cons] = 2
  initialize_array(sp, ...) ; here's a function call assuming that
    ; initializing the array is too complicated to inline
  body ; aliasing y to sp (with the currently tracked offset)
       ; aliasing x to sp+size_of_array_descriptor (with the currently tracked
       ;   offset)

on the stack, and to

  if (hp + (size_of_cons + size_of_array_descriptor) > heap_limit)
    gc(size_of_cons + size_of_array_descriptor);
  ; inline the common case, closed-code the complicated case
  ; parameter is the amount of allocation needed without further heap
  ; check
  x = hp
  *hp++ = 1
  *hp++ = 2
  y = hp
  hp += size_of_array_descriptor
  initialize_array(y, ...)
  body ; x and y are bound above, usually to registers unless
       ; body forms a closure over at least one of them

There are two differences:
- The check in the beginning
  It's just some instructions:
    t1 <- size_of_cons + size_of_array_descriptor ; constant load
    t2 <- t1 + hp
    compare(t2, heap_limit)
    branch_if_less(Label)
    call(gc, t1) ; there's a stack push if gc() uses the stack
      ; for argument passing. However, in this case,
      ; there can be a special register calling convention for gc()
    Label:
  5 or 6 instructions, most probably

  heap_limit could be set a bit below the real limit. The offset
  is determined so that most basic block allocation sizes are smaller
  than the offset. In this case, the heap check changes to
    compare(hp, heap_limit)
    branch_if_less(Label)
    call(gc_simple)
  for the usual case and to
    t1 <- size_of_cons + size_of_array_descriptor - heap_limit_offset
      ; constant load
    t2 <- t1 + hp
    compare(t2, heap_limit)
    branch_if_less(Label)
    call(gc_complicated, t1) ; there's a stack push if gc() uses the stack
      ; for argument passing. However, in this case,
      ; there can be a special register calling convention for gc()
    Label:
  so 3 instructions for the usual case, 5 or 6 for the complicated
  seldom one.

- register pressure
  x and y must be loaded to registers before initialize_array, because
  that will most probably allocate space for the array contents from the
  same heap. However, in the stack case, sometimes you'll need the
  real address of x or y too, diminishing the difference.

> 2. Some of the profiling and testing instrumentation that is available
>(e.g. our space-profiler) is either left out of these allocations or
>must be expanded as well, with even more bloat.

No. In the heap profiling case, you could just
  call(profile_heap_check, allocation_size)
in the beginning of the basic block.

> 3. The "shape" of heap allocation model must be known by the compiler.
>This precludes a new gc/allocator being plugged in for the old one,
>without recompiling lisp code.

ACK.

>So why do we bother doing the inlining for stack-allocated objects?
>With reasons coreresponding to the above:

> 1': Though the code for stack allocation is larger because it is
>inlined, it is worth doing because of the larger savings that occur
>because the garbage collector is invoked less often.

The GC is invoked only if the heap check fails. The overhead is
low in current GC implementations.

> 2': Stack allocation tends not to need profiling instrumentation.
>Most allocation instrumentation has to do with memory management,
>including gc efficiency and memory leaks, rather than with specific
>objects.

That instrumentation can be outside the short path in my above
examples, anyway (in the gc function[s]).

> 3': Stack allocated objects have nothing to do with the implementation
>of the lisp heap, nor with the garbage collector.  So the inlined code
>for stack allocation does not preclude mixing/matching of gc/allocators.
>Of course, differences in object representation would force a change
>to the inlining of stack-consing code, but I view this as a more
>fundamental change that would force complete redesign of the compiler
>anyway.

Perhaps. But heap allocation will probably occur often enough to
justify *fast* short paths (i.e. when no GC is needed), anyway, even
if you honour dynamic-extent declarations or perhaps even infer them.

And you can at least switch some GC schemes without changing open-coded
allocation code; the only requirement you impose is that there's
a heap pointer and a heap limit register (or pseudo-register) and
the heap allocation grows upward (and perhaps the heap limit offset
constant). How the heap is collected when the allocation area is
exhausted isn't open-coded and can thus be more easily changed.


Kind regards,

Hannah.
From: Duane Rettig
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <4g0ipvcjl.fsf@beta.franz.com>
······@schlund.de (Hannah Schroeter) writes:

> Duane Rettig  <·····@franz.com> wrote:
> >······@schlund.de (Hannah Schroeter) writes:
> 
> >> This can be done for heap allocation, too. Just analyse all
> >> heap allocations in a basic block. Test space availability once
> >> (if that isn't done with guard pages), then inc the heap allocation
> >> pointer by the sum of all allocations, then use negative, compile-time
> >> calculated offsets to the allocation pointer (at that time probably
> >> in a register) to initialize / use the heap objects.
> 
> >A cute trick.  However, it forces heap allocation to be in-lined,
> >which I don't consider to be cost-effective for a few reasons:
> 
> > 1. Over and above the single test we were theorizing about saving,
> >in-line heap allocation forces more code into ram/caches/etc.,
> >causing slowdown of the application.  A simple function call per
> >allocation tends to generate smaller code.
> 
> How so?

You have an incorrect assumption, below.

> (let ((x (cons 1 2))
>       (y (make-array ...)))
>   (declare (dynamic-extent x y))
>   body)
> 
> could be translated to
> 
>   sp -= size_of_cons + size_of_array_descriptor
>   sp[size_of_array_descriptor] = 1
>   sp[size_of_array_descriptor + size_of_car_part_of_cons] = 2
>   initialize_array(sp, ...) ; here's a function call assuming that
>     ; initializing the array is too complicated to inline
======^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   body ; aliasing y to sp (with the currently tracked offset)
>        ; aliasing x to sp+size_of_array_descriptor (with the currently tracked
>        ;   offset)

The assumption is incorrect that initialization is too complicated
to inline.  Try it out for yourself by compiling and disassembling
the following on an Allegro CL:

(defun foo ()
  (declare (optimize speed))
  (let ((x (make-array 10 :initial-element nil)))
    (declare (dynamic-extent x))
    (bas x)))

Also, your example of stack-consing a single cons cell is not
very indicative or common.  Try instead compiling and disassembling
this:

(defun bar (a b c d e f g)
  (declare (optimize speed))
  (let ((x (list a b c d e f g)))
    (declare (dynamic-extent x))
    (bas x)))

Note that the entire list is allocated _and_ initialized with no function
calls save the call to bas.  It is also a lot larger code than a single
call to list with 7 arguments.

> There are two differences:
> - The check in the beginning
  [ ... ]

This difference is not relevant, since it is not part of the discussion -
please recall that I said:

  1. Over and above the single test we were theorizing about saving,

so I was specifically excluding this case.

> - register pressure
>   x and y must be loaded to registers before initialize_array, because
>   that will most probably allocate space for the array contents from the
>   same heap. However, in the stack case, sometimes you'll need the
>   real address of x or y too, diminishing the difference.

I agree that this is not much of a difference.

So this leaves us with the _real_ difference, which is the fact that
we stack-allocate in open-code and I don't want to bloat my code for
heap allocation as well.

> > 2. Some of the profiling and testing instrumentation that is available
> >(e.g. our space-profiler) is either left out of these allocations or
> >must be expanded as well, with even more bloat.
> 
> No. In the heap profiling case, you could just
>   call(profile_heap_check, allocation_size)
> in the beginning of the basic block.

Your statement is a restatement of mine.  So why did you say "No"?
You either leave the profiling call out of the inline code, or you
make the call, which is one more call (i.e. bloat) than you would
have otherwise made.

> > 3. The "shape" of heap allocation model must be known by the compiler.
> >This precludes a new gc/allocator being plugged in for the old one,
> >without recompiling lisp code.
> 
> ACK.

What does this mean?  Does it mean "Acknowledged", or does it mean "Eek"?

> >So why do we bother doing the inlining for stack-allocated objects?
> >With reasons coreresponding to the above:
> 
> > 1': Though the code for stack allocation is larger because it is
> >inlined, it is worth doing because of the larger savings that occur
> >because the garbage collector is invoked less often.
> 
> The GC is invoked only if the heap check fails. The overhead is
> low in current GC implementations.

"Low" is a relative term.  It is not zero, whereas for stack allocations
the overhead _is_ zero.  We have customers for which this makes a big
difference.

> > 2': Stack allocation tends not to need profiling instrumentation.
> >Most allocation instrumentation has to do with memory management,
> >including gc efficiency and memory leaks, rather than with specific
> >objects.
> 
> That instrumentation can be outside the short path in my above
> examples, anyway (in the gc function[s]).

Of course, but why would you want heap-allocation instrumentation on
code segments that have nothing to do with the heap?  Your reporting
tools would have to squirrel the hits away into their own category
anyway, just to keep from corrupting the heap-allocation measurements.

> > 3': Stack allocated objects have nothing to do with the implementation
> >of the lisp heap, nor with the garbage collector.  So the inlined code
> >for stack allocation does not preclude mixing/matching of gc/allocators.
> >Of course, differences in object representation would force a change
> >to the inlining of stack-consing code, but I view this as a more
> >fundamental change that would force complete redesign of the compiler
> >anyway.
> 
> Perhaps. But heap allocation will probably occur often enough to
> justify *fast* short paths (i.e. when no GC is needed), anyway, even
> if you honour dynamic-extent declarations or perhaps even infer them.

It's all a matter of design tradeoffs, and currently we make the
decision that out-of-line heap-allocation calls are more efficient all
around.

> And you can at least switch some GC schemes without changing open-coded
==============================^^^^
> allocation code; the only requirement you impose is that there's
> a heap pointer and a heap limit register (or pseudo-register) and
==================================^^^^^^^^   ==^^^^^^^^^^^^^^^
> the heap allocation grows upward (and perhaps the heap limit offset
> constant). How the heap is collected when the allocation area is
> exhausted isn't open-coded and can thus be more easily changed.

Your premise here is weak.  If you consider a group of _some_ GC
allocation schemes, then you would for example have to separate
your two alternate requirements (one for a heap limit register and
one for a heap limit pseudo-register) because they could not mix
and match even between themselves.  This is because if you switch
from a register to a pseudo-register (i.e. somewhere in known memory),
then your lisp code _still_ must be recompiled in order to make the
switch.

The point is that I don't want to limit myself to only looking at
GC/allocator schemes that have a particular set of characteristics;
I want to be able to plug a totally new conceptualization into the
system without needing to recompile.  With an out-of-line allocator,
it is more possible to do that than with in-line knowledge of what
a particular allocator is doing.

-- 
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)
From: David Bakhash
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m34rzivffg.fsf@cadet.dsl.speakeasy.net>
Kaelin Colclasure <······@everest.com> writes:

> Frankey <···········@worldnet.att.net> writes:
> 
> > Right. Though one has to keep in mind that the term "compile" can
> > only be applied here figuratively. None of these languages compile
> > in the same sense as what real-compiled languages like for example
> > C, C++, Pascal, Ada, do. The same question frequently comes up in
> > Java-related discussions: you can compile your rt environment into
> > the program, but the structure of the language remains intact. Is
> > such "compilation" an improvement?  Definitely. Still no match to
> > C or C++ or Ada though.
> 
> You are misinformed. The majority of Common Lisp implementations
> compile to native code, just like C, C++, Pascal, and Ada -- and
> this has been the case for decades. Of the CL implementations
> mentioned in this thread, only CLISP does not compile to native
> code.

Actually, CMUCL can compile either to native code _or_ to bytecode.
Even when it compiles to non-native code it does very well, though.

dave
From: Eugene Zaikonnikov
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <6yelyzyodk.fsf@viking.cit>
* "Frankey" == Frankey  <···········@worldnet.att.net> writes:

>> Just my $.02: did you compile the Lisp code? This makes a huge
>> difference. Python and Perl compile by default, but Lisp compiles
>> only if you tell it to.

Frankey>  Right. Though one has to keep in mind that the term
Frankey>  "compile" can only be applied here figuratively. None of
Frankey>  these languages compile in the same sense as what
Frankey>  real-compiled languages like for example C, C++, Pascal,
Frankey>  Ada, do.

Pardon?

* (defun foo (a b)
    (declare (optimize (space 3) (speed 3) (safety 0) (debug 0))
	     (type fixnum a b))
    (the fixnum (+ a b)))

* (compile 'foo)

* (disassemble 'foo)
4802F200:       .ENTRY FOO()                 ; FUNCTION
      18:       POP   DWORD PTR [EBP-8]
      1B:       LEA   ESP, [EBP-32]
      1E:       ADD   EDX, EDI               ; No-arg-parsing entry point
      20:       MOV   ECX, [EBP-8]
      23:       MOV   EAX, [EBP-4]
      26:       ADD   ECX, 2
      29:       MOV   ESP, EBP
      2B:       MOV   EBP, EAX
      2D:       JMP   ECX
      2F:       NOP

Frankey>  Java-related discussions: you can compile your rt
Frankey>  environment into the program, but the structure of the
Frankey>  language remains intact.

Wrong. Lisp compilers do perform various code transformations,
function inlining, constant folding, etc.

Ironically, the first C translator I laid my hands on (HiSoft C for
ZX-Spectrum) was slow, *interpreted* and wasn't able to produce
standalone executables.

-- 
  Eugene
From: Paul Foley
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m28zp6924z.fsf@mycroft.actrix.gen.nz>
On Sat, 23 Dec 2000 17:40:57 GMT, Frankey  wrote:

>> difference. Python and Perl compile by default, but Lisp compiles only
>> if you tell it to.

> Right. Though one has to keep in mind that the term "compile" can
> only be applied here figuratively. None of these languages compile

"Frankey", you don't know what you're talking about.  I don't think
you've posted a single correct statement wrt Lisp yet.  Please stop
posting ignorant crap.  [I guess Erik's on vacation?]


[Believe it or not, this quote was randomly chosen!]
-- 
Quid enim est stultius quam incerta pro certis habere, falsa pro veris?
                                                                 -- Cicero
(setq reply-to
  (concatenate 'string "Paul Foley " "<mycroft" '(··@) "actrix.gen.nz>"))
From: Marco Antoniotti
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <y6c66k9k9pp.fsf@octagon.mrl.nyu.edu>
Frankey <···········@worldnet.att.net> writes:

> glauber wrote:
> >   Rolf Wester <······@ilt.fhg.de> wrote:
> > [...]
> > >
> > > Python is only slightly slower than compiled Common Lisp using ACL6.0
> > > (CLISP is 60% slower).
> > > I also read that it is difficult to write fast Common Lisp code. Is
> > > there any means
> > > to speed up the CL code?
> > 
> > Just my $.02: did you compile the Lisp code? This makes a huge
> > difference. Python and Perl compile by default, but Lisp compiles only
> > if you tell it to. 

> Right. Though one has to keep in mind that the term "compile" can
> only be applied here figuratively. None of these languages compile
> in the same sense as what real-compiled languages like for example
> C, C++, Pascal, Ada, do. The same question frequently comes up in
> Java-related discussions: you can compile your rt environment into
> the program, but the structure of the language remains intact. Is
> such "compilation" an improvement?
>
> Definitely. Still no match to C or C++ or Ada though.

The Naggum will get you soon.  :) It is obvious that you have *no idea*
about what (Common) Lisp is about.

> > Then, there are compile optimization settings that
> > also can make a big difference, plus the tricks that other people
> > already mentioned about declaring the data types.
> > 
> > CLISP is not a speed demon, but when compiled it usually doesn't come
> > too much behind Python.

> That is a very reasonable, realistic statement.

OTOH CMUCL *is* a speed demon, and when compiled it usually doesn't come
too much behind C/C++. :)

Cheers

-- 
Marco Antoniotti =============================================================
NYU Bioinformatics Group			 tel. +1 - 212 - 998 3488
719 Broadway 12th Floor                          fax  +1 - 212 - 995 4122
New York, NY 10003, USA				 http://galt.mrl.nyu.edu/valis
             Like DNA, such a language [Lisp] does not go out of style.
			      Paul Graham, ANSI Common Lisp
From: Hartmann Schaffer
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <925nua$qqf$1@paradise.nirvananet>
In article <···············@octagon.mrl.nyu.edu>,
Marco Antoniotti  <·······@cs.nyu.edu> wrote:
>
>Frankey <···········@worldnet.att.net> writes:
>
>> glauber wrote:
>> >   Rolf Wester <······@ilt.fhg.de> wrote:
>> > [...]
>> > >
>> > > Python is only slightly slower than compiled Common Lisp using ACL6.0
>> > > (CLISP is 60% slower).
>> > > I also read that it is difficult to write fast Common Lisp code. Is
>> > > there any means
>> > > to speed up the CL code?
>> > 
>> > Just my $.02: did you compile the Lisp code? This makes a huge
>> > difference. Python and Perl compile by default, but Lisp compiles only
>> > if you tell it to. 
>
>> Right. Though one has to keep in mind that the term "compile" can
>> only be applied here figuratively. None of these languages compile
>> in the same sense as what real-compiled languages like for example
>> C, C++, Pascal, Ada, do. The same question frequently comes up in
>> Java-related discussions: you can compile your rt environment into
>> the program, but the structure of the language remains intact. Is
>> such "compilation" an improvement?
>>
>> Definitely. Still no match to C or C++ or Ada though.
>
>The Naggum will get you soon.  :) It is obvious that you have *no idea*
>about what (Common) Lisp is about.

the fact that he claimed that fortran can't be faster than c because
fortran compilers are written in c demonstrates quite clearly that he
is clueless about compilers and languages in general

> ...

hs
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A46A10A.57F499D7@worldnet.att.net>
Hartmann Schaffer wrote:
> 
> In article <···············@octagon.mrl.nyu.edu>,
> Marco Antoniotti  <·······@cs.nyu.edu> wrote:
> >
> >Frankey <···········@worldnet.att.net> writes:
> >
> >> glauber wrote:
> >> >   Rolf Wester <······@ilt.fhg.de> wrote:
> >> > [...]
> >> > >
> >> > > Python is only slightly slower than compiled Common Lisp using ACL6.0
> >> > > (CLISP is 60% slower).
> >> > > I also read that it is difficult to write fast Common Lisp code. Is
> >> > > there any means
> >> > > to speed up the CL code?
> >> >
> >> > Just my $.02: did you compile the Lisp code? This makes a huge
> >> > difference. Python and Perl compile by default, but Lisp compiles only
> >> > if you tell it to.
> >
> >> Right. Though one has to keep in mind that the term "compile" can
> >> only be applied here figuratively. None of these languages compile
> >> in the same sense as what real-compiled languages like for example
> >> C, C++, Pascal, Ada, do. The same question frequently comes up in
> >> Java-related discussions: you can compile your rt environment into
> >> the program, but the structure of the language remains intact. Is
> >> such "compilation" an improvement?
> >>
> >> Definitely. Still no match to C or C++ or Ada though.
> >
> >The Naggum will get you soon.  :) It is obvious that you have *no idea*
> >about what (Common) Lisp is about.
> 
> the fact that he claimed that fortran can't be faster than c because
> fortran compilers are written in c demonstrates 
Ah, but well... how did you come up with the "because" part? I didn't claim such a thing.
Don't put your own silly words in my mouth and then proceed to demonstrate how stupid
(what you said) is.

> quite clearly that he
> is clueless about compilers and languages in general
Inability to comprehend written text and argue logically demonstrates quite clearly that
you are clueless in general. Or dishonest. (You pick, coz you know better <g>.)


> 
> > ...
> 
> hs
From: Hartmann Schaffer
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92655b$rup$1@paradise.nirvananet>
In article <·················@worldnet.att.net>,
Frankey  <···········@worldnet.att.net> wrote:
>Hartmann Schaffer wrote:
> ...
>> the fact that he claimed that fortran can't be faster than c because
>> fortran compilers are written in c demonstrates 
>Ah, but well... how did you come up with the "because" part? I didn't claim such a thing.
>Don't put your own silly words in my mouth and then proceed to demonstrate how stupid
>(what you said) is.

then could you please explain why you pointed out so pointedly that
fortran compilers are (tend to be) written in c when you claimed that
fortran isn't faster than c?

>> quite clearly that he
>> is clueless about compilers and languages in general
>Inability to comprehend written text and argue logically demonstrates quite clearly that
>you are clueless in general. Or dishonest. (You pick, coz you know better <g>.)

you have demonstrated pretty clearly throughout this thread that you
don't know the first thing about what makes an implementation of a
language fast or slow, or what the difference is between the influence
of language features and that of the implementation

hs
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A47AA6D.75890D5A@worldnet.att.net>
Hartmann Schaffer wrote:
> 
> In article <·················@worldnet.att.net>,
> Frankey  <···········@worldnet.att.net> wrote:
> >Hartmann Schaffer wrote:
> > ...
> >> the fact that he claimed that fortran can't be faster than c because
> >> fortran compilers are written in c demonstrates
> >Ah, but well... how did you come up with the "because" part? I didn't claim such a thing.
> >Don't put your own silly words in my mouth and then proceed to demonstrate how stupid
> >(what you said) is.
> 
> then could you please explain why you pointed out so pointedly that
> fortran compilers are (tend to be) written in c when you claimed that
> fortran isn't faster than c?
Because that, imho, shows that C is a more general-purpose language than fortran. I also
said that the opposite seems unlikely, which was to demonstrate that fortran is less
general-purpose than C. That's all. Speed doesn't get in directly, although,
general-purpose languages tend to be lower-level, and lower-level languages make it
possible to produce better performing code, as compared with specialist, or simply
high-level, languages. Of course, specialist languages make it easier to develop
domain-specific programs. That saves a lot of work to the user--but it does not improve
performance. And so, specialist languages tend to be used where programming effort matters
while performance doesn't. Seems like the original poster ran into a rare situation where
performance did matter, thus his problem. 

That's why I mentioned that your favorite fortran tools are probably written in C, while
the opposite isn't likely.
From: Hartmann Schaffer
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92aob6$gg9$1@paradise.nirvananet>
In article <·················@worldnet.att.net>,
Frankey  <···········@worldnet.att.net> wrote:
>Hartmann Schaffer wrote:
>> 
>> In article <·················@worldnet.att.net>,
>> Frankey  <···········@worldnet.att.net> wrote:
>> >Hartmann Schaffer wrote:
>> > ...
>> >> the fact that he claimed that fortran can't be faster than c because
>> >> fortran compilers are written in c demonstrates
>> >Ah, but well... how did you come up with the "because" part? I didn't claim such a thing.
>> >Don't put your own silly words in my mouth and then proceed to demonstrate how stupid
>> >(what you said) is.
>> 
>> then could you please explain why you pointed out so pointedly that
>> fortran compilers are (tend to be) written in c when you claimed that
>> fortran isn't faster than c?
>Because that, imho, shows that C is a more general-purpose language than fortran. I also

sorry that i misunderstood you, but if you go back and read the post i
was referring to you undoubtedly will notice that you put it in a way
that made it easy to come to my interpretation

>said that the opposite seems unlikely, which was to demonstrate that fortran is less
>general-purpose than C. That's all. Speed doesn't get in directly, although,

you will be surprised what fortran has been misused for over the
years, esp. when it was practically the only language that was
generally available.  so while it isn't really supportive of
nonnumeric applications, it still can be used for it, which makes it
as much general purpose as c

>general-purpose languages tend to be lower-level, and lower-level languages make it
>possible to produce better performing code, as compared with specialist, or simply
>high-level, languages. Of course, specialist languages make it easier to develop
>domain-specific programs. That saves a lot of work to the user--but it does not improve

i would be curious how you define special purpose language.  the
language discussed in this newsgroup is as much general purpose as c,
if not more so

> ...

hs
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A4A534C.F0A865BC@worldnet.att.net>
Hartmann Schaffer wrote:
> i misunderstood you, but if you go back and read the post i
> was referring to you undoubtedly will notice that you put it in a way
> that made it easy to come to my interpretation
OK that's quite possible. 

> >said that the opposite seems unlikely, which was to demonstrate that fortran is less
> >general-purpose than C. That's all. Speed doesn't get in directly, although,
> 
> you will be surprised what fortran has been misused for over the
> years, esp. when it was practically the only language that was
> generally available.  so while it isn't really supportive of
> nonnumeric applications, it still can be used for it, which makes it
> as much general purpose as c
Sure. Any language is bound to be misused by people who like it and don't know anything
else. I've seen embedded apps written in FoxPro, and they even worked--kind of--but never
well though. There are many tasks that cannot be done in Fortran regardless of the degree
of misuse you're ready to attempt. You can't write a device driver in Fortran.

> >general-purpose languages tend to be lower-level, and lower-level languages make it
> >possible to produce better performing code, as compared with specialist, or simply
> >high-level, languages. Of course, specialist languages make it easier to develop
> >domain-specific programs. That saves a lot of work to the user--but it does not improve
> 
> i would be curious how you define special purpose language.  the
> language discussed in this newsgroup is as much general purpose as c,
> if not more so
No. It's good for symbolic manipulations and it allows other things too, but you cannot
write a device driver with Lisp. And it's quite all right, because that's not what it's
for. Prolog would be a better example than Lisp actually, as it is even more specialized.
In fact, there are prolog-like C++ libraries that let you attack inference-related tasks
at a prolog-like level (logically) while using C++ as the main implementation tool for
everything else. That's what a special purpose language is. Iow, if some hypothetical prog
language allows me to have an entity that's called "matrix" and operate on
it--syntactically--as if it were algebra, that language would be a special-purpose one. I
can do the same in C++ of course, but I'll have to write mountains of low-level code (or
buy a library.) Special-purpose languages support some domain-specific abstract concepts
directly, while a general-purpose language, while allowing for the same functionality,
would require you to code up those high-level, domain-specific entities yourself before
you can start programming in terms of them.
From: Hartmann Schaffer
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92dlec$n96$1@paradise.nirvananet>
In article <·················@worldnet.att.net>,
Frankey  <···········@worldnet.att.net> wrote:
> ...
>Sure. Any language is bound to be misused by people who like it and don't know anything
>else. I've seen embedded apps written in FoxPro, and they even worked--kind of--but never
>well though. There are many tasks that cannot be done in Fortran regardless of the degree
>of misuse you're ready to attempt. You can't write a device driver in Fortran.

you might be surprised.  many years ago somebody whom i consider quite
reliable told me that honeywell wrote the device drivers for one of
their oses in cobol.  one guy i knew told me he had written an algol60
compiler in cobol.

> ...
>> i would be curious how you define special purpose language.  the
>> language discussed in this newsgroup is as much general purpose as c,
>> if not more so
>No. It's good for symbolic manipulations and it allows other things too, but you cannot
>write a device driver with Lisp. And it's quite all right, because that's not what it's

afaik the symbolics (lisp machine) people wrote their device drivers
in lisp.

> ...

hs
From: Joe Marshall
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <g0j9k21l.fsf@content-integrity.com>
> Hartmann Schaffer wrote:
> > i would be curious how you define special purpose language.  the
> > language discussed in this newsgroup is as much general purpose as c,
> > if not more so

Frankey <···········@worldnet.att.net> writes:
> No.  It's good for symbolic manipulations and it allows other things
> too, but you cannot write a device driver with Lisp.

Maybe *you* can't, but it is well within the realm of possibility.
I've written several device drivers in Lisp.


-----= Posted via Newsfeeds.Com, Uncensored Usenet News =-----
http://www.newsfeeds.com - The #1 Newsgroup Service in the World!
-----==  Over 80,000 Newsgroups - 16 Different Servers! =-----
From: ······@corporate-world.lisp.de
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92dr2f$238$1@nnrp1.deja.com>
In article <·················@worldnet.att.net>,
  Frankey <···········@worldnet.att.net> wrote:
> No. It's good for symbolic manipulations and it allows other things too, but you cannot
> write a device driver with Lisp.

Hmm, my home computer is a Lisp machine from Symbolics (5 Mips speed,
from early 90s). Everything (the whole operating system + applications)
is written in (object-oriented) Lisp. It has low-level stuff
like SCSI-drivers, Ethernet-interface, full TCP/IP stack,
window system, file-system, etc. - all written in Lisp. If that's not
general purpose, then I don't know. And this was a decade ago,
on what is now considered slow hardware.

Generally I'd say you just should read a bit more about compilers
(and make fewer assumptions about the nature of the code that
Lisp compilers can generate). I propose you study the compiler
from CMU CL (manuals are available that describe the compiler).
A good introduction to the compilation of Lisp languages is
the book by Christian Queinnec: "Lisp in Small Pieces".
After you have studied those, please come back to this newsgroup
and explain your theory of the "nature of compiled code" in a
bit more depth - if you still feel like it...


Sent via Deja.com
http://www.deja.com/
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A4D19CC.F6BB07A9@worldnet.att.net>
I appreciate the suggestions and will definitely dig into that stuff.

······@corporate-world.lisp.de wrote:
> Generally I'd say you just should read a bit more about compilers
...
From: Jochen Schmidt
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92etcv$6t674$1@ID-22205.news.dfncis.de>
Frankey wrote:

> Hartmann Schaffer wrote:
>> i misunderstood you, but if you go back and read the post i
>> was referring to you undoubtedly will notice that you put it in a way
>> that made it easy to come to my interpretation
> OK that's quite possible.
> 
>> >said that the opposite seems unlikely, which was to demonstrate that
>> >fortran is less general-purpose than C. That's all. Speed doesn't get in
>> >directly, although,
>> 
>> you will be surprised what fortran has been misused for over the
>> years, esp. when it was practically the only language that was
>> generally available.  so while it isn't really supportive of
>> nonnumeric applications, it still can be used for it, which makes it
>> as much general purpose as c
> Sure. Any language is bound to be misused by people who like it and don't
> know anything else. I've seen embedded apps written in FoxPro, and they
> even worked--kind of--but never well though. There are many tasks that
> cannot be done in Fortran regardless of the degree of misuse you're ready
> to attempt. You can't write a device driver in Fortran.
> 
>> >general-purpose languages tend to be lower-level, and lower-level
>> >languages make it possible to produce better performing code, as
>> >compared with specialist, or simply high-level, languages. Of course,
>> >specialist languages make it easier to develop domain-specific programs.
>> >That saves a lot of work to the user--but it does not improve
>> 
>> i would be curious how you define special purpose language.  the
>> language discussed in this newsgroup is as much general purpose as c,
>> if not more so
> No. It's good for symbolic manipulations and it allows other things too,
> but you cannot write a device driver with Lisp. And it's quite all right,
> because that's not what it's for. Prolog would be a better example than
> Lisp actually, as it is even more specialized. In fact, there are
> prolog-like C++ libraries that let you attack inference-related tasks at
> a prolog-like level (logically) while using C++ as the main implementation
> tool for everything else. That's what a special purpose language is. Iow,
> if some hypothetical prog language allows me to have an entity that's
> called "matrix" and operate on it--syntactically--as if it were algebra,
> that language would be a special-purpose one. I can do the same in C++ of
> course, but I'll have to write mountains of low-level code (or buy a
> library.) Special-purpose languages support some domain-specific abstract
> concepts directly, while a general-purpose language, while allowing for
> the same functionality, would require you to code up those high-level,
> domain-specific entities yourself before you can start programming in
> terms of them.

Just as a hint ('cause I can't hear that "this is impossible with language
x" any more):

Read the works of A. Church (the lambda calculus) and A. Turing (the
universal Turing machine), so you will get a more realistic idea of
what you can compute and what you cannot.
Once you are done with that you can add Gödel (the Entscheidungsproblem
and so on) to your "todo list".

Regards,
Jochen
From: Tim Bradshaw
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <ey3n1dgvhcc.fsf@cley.com>
* Frankey  wrote:
> No. It's good for symbolic manipulations and it allows other things
> too, but you cannot write a device driver with Lisp.

I'm sitting three feet away from a machine whose device drivers are
written in Lisp.

--tim
From: Andy Freeman
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92deiq$n1v$1@nnrp1.deja.com>
In article <·················@worldnet.att.net>,
  Frankey <···········@worldnet.att.net> wrote:
> general-purpose languages tend to be lower-level, and lower-level
> languages make it possible to produce better performing code, as
> compared with specialist, or simply high-level, languages.

According to this definition, C is not a lower-level language
because it does not make it possible to produce faster code than
"specialist" languages.  C fails because it requires programmers
to express things in a way that blocks optimization by tools more
sophisticated than peep-hole optimization and register allocation.

Interestingly enough, specialist languages (not all high-level
languages are specialist languages, but then you'd have to know
something about languages to understand why) tend to be great
candidates for extreme optimization.  Whether or not such measures
are implemented depends on what's important....  (Example - I have
no doubt that the paint on the Edsel could have been improved; I
don't doubt that improving said paint would not have made a
significant difference.)

Note that higher level languages make it easier to make changes that
have significant (10-100x) effects on performance.  In C-land, such
changes are termed "rewrites" and take time and are associated with
new bugs, while they're mere edits for other people.

> Of course, specialist languages make it easier to develop
> domain-specific programs. That saves a lot of work to the user--but it
> does not improve performance. And so, specialist languages tend to be
> used where programming effort matters while performance doesn't.
> Seems like the original poster ran into a rare situation where
> performance did matter, thus his problem.

Rare? Some 5-60% of all computations are, as far as humans can tell,
instantaneous.  More important, slowing them down by 2-4x would be
unnoticeable.

For almost all computations, there's a performance threshold.  If
you meet it, there's no benefit to additional speed.  At that point,
other considerations dominate.  For example, would you rather have a
faster Netscape or one with fewer bugs?

-andy


Sent via Deja.com
http://www.deja.com/
From: Hartmann Schaffer
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92dln9$na0$1@paradise.nirvananet>
In article <············@nnrp1.deja.com>,
Andy Freeman  <······@earthlink.net> wrote:
> ...
>According to this definition, C is not a lower-level language
>because it does not make it possible to produce faster code than
>"specialist" languages.  C fails because it requires programmers
>to express things in a way that blocks optimization by tools more
>sophisticated than peep-hole optimization and register allocation.

define block.  i have written a few c compilers that did more.
obviously, you can't optimize as intensively as in fortran or some
other languages (including lisp), but if you put enough effort into
the compiler you can go quite far

> ...

hs
From: Andy Freeman
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92ga8p$urq$1@nnrp1.deja.com>
In article <············@paradise.nirvananet>,
  ··@paradise.nirvananet (Hartmann Schaffer) wrote:
> In article <············@nnrp1.deja.com>,
> Andy Freeman  <······@earthlink.net> wrote:
> > ...
> >According to this definition, C is not a lower-level language
> >because it does not make it possible to produce faster code than
> >"specialist" languages.  C fails because it requires programmers
> >to express things in a way that blocks optimization by tools more
> >sophisticated than peep-hole optimization and register allocation.
>
> define block.  i have written a few c compilers that did more.
> obviously, you can't optimize as intensively as in fortran or some
> other languages (including lisp), but if you put enough effort into
> the compiler you can go quite far

I'm referring to things like changing representations and taking
advantage of what is known about the inputs to replace general
operations (specified by the programmer) with faster specialized ones
(which are adequate in a given situation).  For example, consider
merging two lists.  Certain characteristics can be exploited at
compile time, but you've got to be able to recognize a list merge
to exploit them.  (I can imagine a value propagation system that
might be able to derive certain similar improvements, but ....)

Also, C's synchronization point rules make it impossible to perform
many "reasonable" optimizations.  (These rules make life easy
for people who want to treat C as a portable assembler.)  They're
often ignored by high-performance compilers, which is yet another
source of "fun" for the C programmer.

-andy


Sent via Deja.com
http://www.deja.com/
From: Hartmann Schaffer
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92gh4l$tq9$1@paradise.nirvananet>
In article <············@nnrp1.deja.com>,
Andy Freeman  <······@earthlink.net> wrote:
> ...
>I'm referring to things like changing representations and taking
>advantage of what is known about the inputs to replace general
>operations (specified by the programmer) with faster specialized ones
>(which are adequate in a given situation).  For example, consider

when you expect an optimization-deserving paradigm to be used often
enough you can look for specific patterns (i have seen that being done
to get better spec marks) in the il

> ...
>Also, C's synchronization point rules make it impossible to perform
>many "reasonable" optimizations.  (These rules make life easy
>for people who want to treat C as a portable assembler.)  They're
>often ignored by high-performance compilers, which is yet another
>source of "fun" for the C programmer.

i am not convinced that they actually ignore them (unless asked for by
a compiler directive).  it certainly is possible to check whether the
synchronization points have any significance in a particular context

hs
From: Andy Freeman
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92igsh$kbd$1@nnrp1.deja.com>
In article <············@paradise.nirvananet>,
  ··@paradise.nirvananet (Hartmann Schaffer) wrote:
> when you expect an optimization-deserving paradigm to be used often
> enough you can look for specific patterns (i have seen that being done
> to get better spec marks) in the il

There are lots of patterns that perform a list merge, and more if
you don't factor out representation issues.


> In article <············@nnrp1.deja.com> Andy Freeman
> <······@earthlink.net> wrote:
> >Also, C's synchronization point rules make it impossible to perform
> >many "reasonable" optimizations.  (These rules make life easy
> >for people who want to treat C as a portable assembler.)  They're
> >often ignored by high-performance compilers, which is yet another
> >source of "fun" for the C programmer.
>
> i am not convinced that they actually ignore them (unless asked for by
> a compiler directive).  it certainly is possible to check whether the
> synchronization points have any significance in a particular context

As I understand it, synchronization points imply memory accesses.
Since memory accesses are visible in other processes, it's hard to
see how a compiler can know when it's safe to omit or reorder them.

-andy


Sent via Deja.com
http://www.deja.com/
From: Hartmann Schaffer
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92j0vk$v9$1@paradise.nirvananet>
In article <············@nnrp1.deja.com>,
Andy Freeman  <······@earthlink.net> wrote:
> ...
>> >Also, C's synchronization point rules make it impossible to perform
>> >many "reasonable" optimizations.  (These rules make life easy
>> >for people who want to treat C as a portable assembler.)  They're
>> >often ignored by high-performance compilers, which is yet another
>> >source of "fun" for the C programmer.
>>
>> i am not convinced that they actually ignore them (unless asked for by
>> a compiler directive).  it certainly is possible to check whether the
>> synchronization points have any significance in a particular context
>
>As I understand it, synchronization points imply memory accesses.
>Since memory accesses are visible in other processes, it's hard to
>see how a compiler can know when it's safe to omit or reorder them.

it's been a while, but my understanding of synchronization points was
that after a synchronization point side effects in the preceding
expression had to have an effect on subsequent calculations.  if
the compiler can determine that a side effect result is redundant, it
can discard it and the problem becomes moot

hs
From: Andy Freeman
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92qhvc$9le$1@nnrp1.deja.com>
In article <···········@paradise.nirvananet>,
  ··@paradise.nirvananet (Hartmann Schaffer) wrote:
> In article <············@nnrp1.deja.com> Andy Freeman
> <······@earthlink.net> wrote:
> >As I understand it, synchronization points imply memory accesses.
> >Since memory accesses are visible in other processes, it's hard to
> >see how a compiler can know when it's safe to omit or reorder them.
>
> it's been a while, but my understanding of synchronization points was
> that after a synchronization point side effects in the preceding
> expression had to have an effect on subsequent calculations.  if
> the compiler can determine that a side effect result is redundant, it
> can discard it and the problem becomes moot

That's a huge "if".  How does a compiler know about side effects on
other processes via memory accesses?  (Note that the processes can
be physical; one might be programming the clock and voltage
regulator for power savings - reordering those memory accesses
introduces physical heisenbugs.)

-andy


Sent via Deja.com
http://www.deja.com/
From: Hartmann Schaffer
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92r0pg$l6e$1@paradise.nirvananet>
In article <············@nnrp1.deja.com>,
Andy Freeman  <······@earthlink.net> wrote:
> ...
>> it's been a while, but my understanding of synchronization points was
>> that after a synchronization point side effects in the preceding
>> expression had to have an effect on subsequent calculations.  if
>> the compiler can determine that a side effect result is redundant, it
>> can discard it and the problem becomes moot
>
>That's a huge "if".  How does a compiler know about side effects on
>other processes via memory accesses?  (Note that the processes can

if that can happen, the corresponding memory locations better be
declared volatile, or there is a bug in the program

hs

>be physical; one might be programming the clock and voltage
>regulator for power savings - reordering those memory accesses
>introduces physical heisenbugs.)
>
>-andy
>
>
>Sent via Deja.com
>http://www.deja.com/
From: Rob Warnock
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92rebv$k0vog$1@fido.engr.sgi.com>
Andy Freeman  <······@earthlink.net> wrote:
+---------------
| ··@paradise.nirvananet (Hartmann Schaffer) wrote:
| > it's been a while, but my understanding of synchronization points was
| > that after a synchronization point side effects in the preceding
| > expression had to have an effect on subsequent calculations.  if
| > the compiler can determine that a side effect result is redundant, it
| > can discard it and the problem becomes moot
| 
| That's a huge "if".  How does a compiler know about side effects on
| other processes via memory accesses?
+---------------

You have to tell it. That's what the ANSI C "volatile" attribute is
all about. It tells the compiler that it *can't* depend on a location
being affected only by what it can prove about "local" code. For example,
consider "busy-waitng" on a flag that can be changed by an interrupt
routine:

	extern volatile done_flag;
	...
	while (!done_flag)
	  ; /*wait*/

Without the "volatile" declaration, the compiler would be completely
justified in treating the above exactly the same as this:

	if (!done_flag)	/* only need to test once, since never assigned to */
	  for(;;)
	    ;

which is an infinite loop if "done_flag" was false on entry.


-Rob

-----
Rob Warnock, 31-2-510		····@sgi.com
SGI Network Engineering		http://reality.sgi.com/rpw3/
1600 Amphitheatre Pkwy.		Phone: 650-933-1673
Mountain View, CA  94043	PP-ASEL-IA
From: Andy Freeman
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92t2pr$698$1@nnrp1.deja.com>
In article <··············@fido.engr.sgi.com>,
  ····@rigden.engr.sgi.com (Rob Warnock) wrote:
> Andy Freeman  <······@earthlink.net> wrote:
> | That's a huge "if".  How does a compiler know about side effects on
> | other processes via memory accesses?
> +---------------
>
> You have to tell it. That's what the ANSI C "volatile" attribute is
> all about.

That's what I thought, which prompted my question about a compiler
that supposedly figured it out on its own.

My samples may not be representative, but the C community seems to
engage in far more "magic" programming, that is, rituals such as
"don't use -O2 on foo.c", performed just "because", than the lisp
community does.  That difference may be due to the applications,
but I thought that I'd ask about tool differences.

-andy


Sent via Deja.com
http://www.deja.com/
From: Rob Warnock
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92u9eb$kbs8h$1@fido.engr.sgi.com>
Andy Freeman  <······@earthlink.net> wrote:
+---------------
| ····@rigden.engr.sgi.com (Rob Warnock) wrote:
| > You have to tell it. That's what the ANSI C "volatile" attribute is
| > all about.
| 
| That's what I thought, which prompted my question about a compiler
| that supposedly figured it out on its own.
+---------------

Well, if you're "just" worried about threads and software interrupts
(Unix signals known to the C compiler & runtime), I suppose a
sufficiently-globally optimizing compiler could perhaps "figure out
on its own" the side-effects *within* a single process, but how could
you *possibly* expect the compiler to understand *DMA* into the process's
address space without telling it explicitly??!?

This is not idle speculation. While doing hardware bringup & debugging,
I have written plenty of user-mode C programs which mpin'd some of their
pages in physical memory, found out where they were, and passed those
addresses (via PIOs to mmap'd /dev/mem which hit the bus) to devices
which then did DMA to/from those regions. And yes, I had to be very
careful to declare those areas "volatile"...


-Rob

p.s. One also has to be *very* careful to catch interrupts (e.g., ^C)
so that the DMA can be shut *off* before the process exits [and the
pinned pages automatically get put back in the system pool!], otherwise
you leave a device doing DMA all over some *other* process's address
space!! [Not good. Not good at all.]

Anyway, to bring this back on-topic, I should imagine that Lisp-based
device drivers must have some way to ensure that the GC won't move DMA
buffers while the device is still active, yes?

-----
Rob Warnock, 31-2-510		····@sgi.com
SGI Network Engineering		http://reality.sgi.com/rpw3/
1600 Amphitheatre Pkwy.		Phone: 650-933-1673
Mountain View, CA  94043	PP-ASEL-IA
From: Andy Freeman
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92vmuv$br1$1@nnrp1.deja.com>
In article <··············@fido.engr.sgi.com> ····@rigden.engr.sgi.com
(Rob Warnock) wrote:
> Andy Freeman  <······@earthlink.net> wrote:
> | That's what I thought, which prompted my question about a compiler
> | that supposedly figured it out on its own.

> on its own" the side-effects *within* a single process, but how could
> you *possibly* expect the compiler to understand *DMA* into the
> process's address space without telling it explicitly??!?

I couldn't, so I asked.

> Anyway, to bring this back on-topic, I should imagine that Lisp-based
> device drivers must have some way to ensure that the GC won't move DMA
> buffers while the device is still active, yes?

I have much the same imagination.

The discussion started with a claim that code written in a "low level"
language, such as C, would always be faster than "equivalent" code
written in a high level language, such as lisp.  I asked whether
typical guarantees for low level languages might forbid the use of
optimizations that are available to high level languages.  If so, the
claim has serious problems.  (In high-level languages, I'd expect
special mechanisms for situations where something like those guarantees
are necessary.)

With the req'd use of volatile, C is arguably not a low-level language
under the definition proposed.

-andy


Sent via Deja.com
http://www.deja.com/
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A4A5679.604FCACE@worldnet.att.net>
Andy Freeman wrote:
>   Frankey <···········@worldnet.att.net> wrote:
> > general-purpose languages tend to be lower-level, and lower-level
> > languages make it possible to produce better performing code, as
> > compared with specialist, or simply high-level, languages.
> 
> According to this definition, C is not a lower-level language
> because it does not make it possible to produce faster code than
> "specialist" languages.  
yes it does, yes it does <g>.

> C fails because it requires programmers
> to express things in a way that blocks optimization by tools more
> sophisticated than peep-hole optimization and register allocation.
> Interestingly enough, specialist languages (not all high-level
> languages are specialist languages, 
Right, but when you discuss something you have to deliberately limit the set of issues to
consider to the most relevant ones, while setting some lateral yet potentially applicable
cases aside--while being fully aware that you could bring them in if you wanted to and had
limitless time and, perhaps, a memory upgrade from God so you can remember what you're
talking about.

> but then you'd have to know something about languages to understand why) 
Was that a jab <g>?

> ...Note that higher level languages make it easier to make changes that
> have significant (10-100x) effects on performance.  In C-land, such
> changes are termed "rewrites" and take time and are associated with
> new bugs, while they're mere edits for other people.
That's true, I won't argue with that. I do think it irrelevant to the topic at hand.
Obviously the original poster didn't have such an "extreme" optimization at his disposal,
even though perhaps it's *theoretically* possible.


> > Of course, specialist languages make it easier to develop
> > domain-specific programs. That saves a lot of work to the user--but
> > it does not improve performance. And so, specialist languages tend
> > to be used where programming effort matters while performance
> > doesn't.  Seems like the original poster ran into a rare situation
> > where performance did matter, thus his problem.
> 
> Rare? Some 5-60% of all computations are, as far as humans can tell,
> instantaneous.  More important, slowing them down by 2-4x would be
> unnoticeable.
You seem to lack logic here. By saying "Rare?" you seem to express disagreement with the
statement to which you responded, and yet you then proceed to confirm it by saying that
in 5-60% of cases performance doesn't matter. That would mean, imo, that the cases where
it does are rare. But rare or not, in the original poster's case performance mattered. Now
what? Another ad hominem <g>?

> For almost all computations, there's a performance threshold.  If
> you meet it, there's no benefit to additional speed.  At that point,
> other considerations dominate.  For example, would you rather have a
> faster Netscape or one with fewer bugs?
Blah blah blah blah blah... Actually, if I needed something fast, I'd give NS a try, and
if it didn't blow I'd get exactly what I wanted. Btw, I'm using NS and it doesn't blow all
that terribly much, so, yeah, you can bet your you know what I'd give it a try. Anyway,
neither of the above is relevant in the context of the current discussion. Performance
mattered in that particular case. You want threshold, fine, the guy clearly bashed through
this threshold and needed it faster. End of story.
From: Paul Foley
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m2hf3p75ez.fsf@mycroft.actrix.gen.nz>
On Wed, 27 Dec 2000 20:45:22 GMT, Frankey  wrote:

> it does are rare. But rare or not, in the original poster's case
> performance mattered. Now what? Another ad hominem <g>?

Here are some actual timings in the three Lisp implementations I have
on my machine, with three versions of the original poster's code:

Code Version     CMUCL        ACL(Trial)      LispWorks(PE)
============================================================
Original          1.31            39.60            43.3
Declarations      1.27             3.06            30.4
Row-Major         0.54             3.17            28.5

"Declarations" is just the original code with type declarations added
for the array, argument, and loop indices.  "Row-Major" is rewritten
to use ROW-MAJOR-AREF.

For comparison, the same function written in C takes 1.45 seconds
unoptimized; best speed is 0.44 seconds with -O2.
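
For reference, the "Row-Major" variant is a rewrite along these lines
(a sketch only; the name TEMP-DIFF-RM and the exact declarations are
illustrative, not necessarily the code I timed):

(defun temp-diff-rm (nt)
  (declare (optimize (speed 3) (safety 0) (debug 0))
           (type fixnum nt))
  (let ((mat (make-array '(300 300) :element-type 'single-float
                         :initial-element 0.0)))
    (declare (type (simple-array single-float (300 300)) mat))
    (dotimes (k nt)
      (loop for i of-type fixnum from 1 to 298 do
        (loop for j of-type fixnum from 1 to 298
              ;; row-major index of element (i j) in a 300x300 array
              for idx of-type fixnum = (+ (* i 300) j) do
          (setf (row-major-aref mat idx)
                (+ 1.0 (* 0.1 (+ (row-major-aref mat (- idx 300))
                                 (* -4.0 (row-major-aref mat idx))
                                 (row-major-aref mat (+ idx 300))
                                 (row-major-aref mat (- idx 1))
                                 (row-major-aref mat (+ idx 1)))))))))
    (print (aref mat 50 50))))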

> neither of the above is relevant in the context of the current
> discussion. Performance mattered in that particular case. You want
> threshold, fine, the guy clearly bashed through this threshold and
> needed it faster. End of story.

No: beginning of story.  That's why the original poster asked for help
making it faster.  Which he got.  You may have noticed that CMUCL was
already quite competitive with C on the original code!  [And there's
no reason to think the timings for the other implementations are at
all representative of what they're capable of]

-- 
You don't have to agree with me; you can be wrong if you want.

(setq reply-to
  (concatenate 'string "Paul Foley " "<mycroft" '(··@) "actrix.gen.nz>"))
From: David Bakhash
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m3lmspczf1.fsf@cadet.dsl.speakeasy.net>
Paul Foley <·······@actrix.gen.nz> writes:

> Here are some actual timings in the three Lisp implementations I have
> on my machine, with three versions of the original poster's code:
> 
> Code Version     CMUCL        ACL(Trial)      LispWorks(PE)
> ============================================================
> Original          1.31            39.60            43.3
> Declarations      1.27             3.06            30.4
> Row-Major         0.54             3.17            28.5

Any chance you can try that in CLISP as well?  Also, how about
including both the compiled and non-compiled versions?

I know it's a bit more to ask; just figured I would in case it's not
too much trouble.

thanks
dave
From: Fernando Rodríguez
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <sahc4tsl94dp4j61r3qqhbmavicg0v176t@4ax.com>
On 24 Dec 2000 12:30:58 -0500, Marco Antoniotti <·······@cs.nyu.edu> wrote:



>> Definitely. Still no match to C or C++ or Ada though.
>
>The Naggum will get you soon.  :)

};-DDDDDDDDDD I just can't wait. 
>It is obvious that you have *no idea*
>about what (Common) Lisp is about.

And about what Erik is either, but I'm confident he will soon be enlightened.
};-D

Merry Christmas and a Happy New Flame Fest!






//-----------------------------------------------
//	Fernando Rodriguez Romero
//
//	frr at mindless dot com
//------------------------------------------------
From: Tom Breton
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m31yuxnzqu.fsf@world.std.com>
> Frankey <···········@worldnet.att.net> writes:
> 
> 
> > Right. Though one has to keep in mind that the term "compile" can
> > only be applied here figuratively. None of these languages compile
> > in the same sense as what real-compiled languages like for example
> > C, C++, Pascal, Ada, do. 

As others have pointed out, this is untrue.  You may be mistaking
particular implementations of a language for the language itself.
Even C need not be compiled; C interpreters actually exist.  

If you don't mind me using this to make a point, this is an example of
how an optimizationist mindset leads people astray (not just you, and
no offense intended).  You immediately want speed, and therefore you
reach immediately for a "fast" language.  But that's an unnecessarily
painful and restrictive way to do things.  A smarter language (eg
Lisp) makes it easier to make your program correct first.  Only
afterwards do you really know what you do and don't need to speed up.
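
Concretely (just a sketch, with a made-up example function), CL makes
the "measure first" step cheap:

(defun harmonic-sum (n)
  (declare (type fixnum n))
  (loop for i of-type fixnum from 1 to n
        sum (/ 1.0 i)))

(time (harmonic-sum 1000000))   ; reports run time and consing

Only the parts that TIME shows to be slow are worth declaring and
tuning at all.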

-- 
Tom Breton, http://world.std.com/~tob
Not using "gh" 1997-2000. http://world.std.com/~tob/ugh-free.html
Some vocal people in cll make frequent, hasty personal attacks, but if
you killfile them cll becomes usable.
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A469CC2.A3FEEC45@worldnet.att.net>
Tom Breton wrote:
> > Frankey <···········@worldnet.att.net> writes:
> > > Right. Though one has to keep in mind that the term "compile" can
> > > only be applied here figuratively. None of these languages compile
> > > in the same sense as what real-compiled languages like for example
> > > C, C++, Pascal, Ada, do.
> 
> As others have pointed out, this is untrue.  You may be mistaking
> particular implementations of a language for the language itself.
> Even C need not be compiled; C interpreters actually exist.
> 
> If you don't mind me using this to make a point, this is an example of
> how an optimizationist mindset leads people astray (not just you, and
> no offense intended).  
No offense taken, but I'd like to use this to make another point, about how many people
like to talk about what they think they see rather than what is actually there. Of course
there are C interpreters, and yet I feel rather comfortable saying that when anyone
mentions C they mean a compiled language. More importantly, as I've said somewhere
above, the "compilation" question is not about native vs interpreted code, but rather
about what kind of native code results from such compilation, which is heavily influenced
by the structure of the language.

> You immediately want speed, and therefore you
> reach immediately for a "fast" language.  But that's an unnecessarily
> painful and restrictive way to do things.  A smarter language (eg
> Lisp) makes it easier to make your program correct first.  Only
> afterwards do you really know what you do and don't need to speed up.
Let's not drift away from the original poster's problem. Obviously it had to do with
performance, and therefore why don't we stick to that context, rather than starting a
vague and general discussion about who knows what. The original poster didn't say he can't
use C or C++ because then he won't be able to write a correct program <g>.
From: Hartmann Schaffer
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <9264og$rtk$1@paradise.nirvananet>
In article <·················@worldnet.att.net>,
Frankey  <···········@worldnet.att.net> wrote:
> ...
>above, the "compilation" question is not about native vs interpreted code, but rather
>about what kind of native code results from such compilation, which is heavily influenced
>by the structure of the language.

you have posted this a couple of times, but that doesn't make it any
more true.  apparently you imply that if a language needs elaborate
run time support that means any code that gets generated by its
compilers must be slower than the code generated by compilers for
languages where the elaborate run time support isn't needed.  this is
true only to a minor extent, and offset by the fact that when you need
the features that the runtime support is needed for, they tend to be
implemented more efficiently than what most programmers can come up
with when they try to implement it themselves.  you also err when you
think that it doesn't matter whether the compilation produces native
code or code for some virtual machine.  the performance difference is
quite noticeable

> ...

hs
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A47A5CB.AE259A0E@worldnet.att.net>
Hartmann Schaffer wrote:
> Frankey  <···········@worldnet.att.net> wrote:
> > ...
> >above, the "compilation" question is not about native vs interpreted code, but rather
> >about what kind of native code results from such compilation, which is heavily influenced
> >by the structure of the language.
> 
> you have posted this a couple of times, but that doesn't make it any
> more true.  apparently you imply that if a language needs elaborate
> run time support that means any code that gets generated by its
> compilers must be slower than the code generated by compilers for
> languages where the elaborate run time support isn't needed.  this is
> true only to a minor extent, 
Well, so let's now argue about this extent (not <g>.) The structure of the language
influences the result, no matter what you do. Take Java (only because it's a simple
example.) All objects there are dynamically allocated. Does this change if you compile the
code? No. That's kinda what I'm talking about.
From: Hallvard B Furuseth
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <HBF.20001226t2qk@bombur.uio.no>
Frankey <···········@worldnet.att.net> writes:
>Hartmann Schaffer wrote:
>>Frankey  <···········@worldnet.att.net> wrote:
>
> The structure of the language influences the result, no matter what
> you do. Take Java (only because it's a simple example.) All object
> there are dynamically allocated. Does this change if you compile the
> code? No.

Yes.  If the compiler can tell that an object won't be needed after the
function which allocated it returns, it may be able to put that object
on the stack instead.  (Assuming the implementation uses a stack, of
course.)  That's why the answer to the original poster's problem was to
give the compiler enough information that the structure of the language
didn't have to influence the result.

This is assuming Java isn't so silly that the above is impossible, but
then the rest of this discussion is about Lisp, so who cares.
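
In Lisp terms, by the way, you can make that promise explicit with a
DYNAMIC-EXTENT declaration (a sketch; the function is made up, and
whether the object really ends up on the stack is up to the
implementation):

(defun sum-of-squares (n)
  (declare (optimize speed) (type fixnum n))
  (let ((tmp (make-array n :element-type 'single-float
                           :initial-element 0.0)))
    ;; promise the compiler that TMP never escapes this function, so
    ;; it may be allocated on the stack instead of the heap
    (declare (dynamic-extent tmp))
    (dotimes (i n)
      (setf (aref tmp i) (float (* i i))))
    (reduce #'+ tmp)))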

> That's kinda what I'm talking about.

And that's kinda what you ought to stop talking about until you know
anything about it.

-- 
Hallvard
From: Hartmann Schaffer
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92anp7$gf2$1@paradise.nirvananet>
In article <·················@worldnet.att.net>,
Frankey  <···········@worldnet.att.net> wrote:
> ...
>> more true.  apparently you imply that if a language needs elaborate
>> run time support that means any code that gets generated by its
>> compilers must be slower than the code generated by compilers for
>> languages where the elaborate run time support isn't needed.  this is
>> true only to a minor extent, 
>Well, so let's now argue about this extent (not <g>.) The structure of the language
>influences the result, no matter what you do. Take Java (only because it's a simple
>example.) All objects there are dynamically allocated. Does this change if you compile the
>code? No. That's kinda what I'm talking about.

but java gets compiled to a virtual machine that gets interpreted,
which slows down the execution considerably.  the jit compilers don't
really change that, because the compilation is done during the
execution of the program.  i haven't tried gcj (the java front end
for gcc?) yet, but expect substantial performance improvements.  most
lisps, otoh, compile directly to native code

hs
From: David Simmons
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <Jag26.43522$A06.1301002@news1.frmt1.sfba.home.com>
"Hartmann Schaffer" <··@paradise.nirvananet> wrote in message
·················@paradise.nirvananet...
> In article <·················@worldnet.att.net>,
> Frankey  <···········@worldnet.att.net> wrote:
> > ...
> >> more true.  apparently you imply that if a language needs elaborate
> >> run time support that means any code that gets generated by its
> >> compilers must be slower than the code generated by compilers for
> >> languages where the elaborate run time support isn't needed.  this is
> >> true only to a minor extent,
> >Well, so let's now argue about this extent (not <g>.) The structure of
> >the language influences the result, no matter what you do. Take Java
> >(only because it's a simple example.) All objects there are dynamically
> >allocated. Does this change if you compile the code? No. That's kinda
> >what I'm talking about.
>
> but java gets compiled to a virtual machine that gets interpreted,

You're being sarcastic, aren't you?

Most java code is run on virtual machines that jit-(just in time)-compile
the java opcode set into native code when they first need to run it. They
may also be advanced enough to recompile it later based on runtime
heuristics to further optimize the code based on actual data flow analysis
(this is what hotspot technology is all about).

> which slows down the execution considerably.  the jit compilers don't
> really change that, because the compilation is done during the
> execution of the program.

In other words, most Java code is executing as native code. It just has a
multi-stage compilation process, like many lisp systems. I.e., compile from
Java to an intermediate form (java opcodes); then when loading the class
and/or invoking a method for the first time, compile the intermediate (java
opcodes) form into native code for actual execution. I'm glossing over the
fact that this dynamic compilation technique allows code to be unloaded by a
gc system and therefore it may get rejitted on demand to support memory
utilization policies.

The cost of jitting is typically small compared to the cost of executing
the code. Furthermore, there are systems that cache jitted code on disk so
that it can be reloaded without needing to jit it at all. They also try and
share all such jitted code between any simultaneously running virtual
machines or sandboxed object spaces.

I am no big fan of Java, and it is my opinion that Java has a number of what
one might call "flaws". It has, as far as it goes, a suboptimal (opcode)
instruction set for either fast optimized jitting or efficient
interpreting. The security/safety requirements and its suboptimal
instruction set have an impact on both its speed and its ability to be
optimized into native code.

That characteristic of Java should not be misconstrued as being a problem
with ALL languages that support intermediate opcode instruction sets and/or
utilize virtual machine jitting technology.

-- Dave Simmons [www.qks.com / www.smallscript.com]
  "Effectively solving a problem begins with how you express it."

> i haven't tried gcj (the java front end
> for gcc?) yet, but expect substantial performance improvements.  most
> lisps, otoh, compile directly to native code
>
> hs
From: ··········@my-deja.com
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <92c8di$qjq$1@nnrp1.deja.com>
borate
> > >> run time support that means any code that gets generated by its
> > >> compilers must be slower than the code generated by compilers for
> > >> languages where the elaborate run time support isn't needed.


The secret language of the Egyptian masters is steadily growing more
powerful, exhibiting performance and functionality that will soon equal
that of the proprietary giants.


In the tradition of year-end predictions, I offer my own: 2001 will be
the year that the stochastic is finally fractalised by the secret
masters of Dynamuc PRogramung !


Sent via Deja.com
http://www.deja.com/
From: Marco Antoniotti
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <y6cg0j9g8wg.fsf@octagon.mrl.nyu.edu>
··········@my-deja.com writes:

> borate
> > > >> run time support that means any code that gets generated by its
> > > >> compilers must be slower than the code generated by compilers for
> > > >> languages where the elaborate run time support isn't needed.
> 
> 
> The secret language of the Egyptian masters is steadily growing more
> powerful, exhibiting performance and functionality that will soon equal
> that of the proprietary giants.
> 
> 
> In the tradition of year-end predictions, I offer my own: 2001 will be
> the year that the stochastic is finally fractalised by the secret
> masters of Dynamuc PRogramung !

What program do you use to generate these postings? :)

Cheers

-- 
Marco Antoniotti =============================================================
NYU Bioinformatics Group			 tel. +1 - 212 - 998 3488
719 Broadway 12th Floor                          fax  +1 - 212 - 995 4122
New York, NY 10003, USA				 http://galt.mrl.nyu.edu/valis
             Like DNA, such a language [Lisp] does not go out of style.
			      Paul Graham, ANSI Common Lisp
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A4A4D35.E3D8483E@worldnet.att.net>
Hartmann Schaffer wrote:
> 
> In article <·················@worldnet.att.net>,
> Frankey  <···········@worldnet.att.net> wrote:
> > ...
> >> more true.  apparently you imply that if a language needs elaborate
> >> run time support that means any code that gets generated by its
> >> compilers must be slower than the code generated by compilers for
> >> languages where the elaborate run time support isn't needed.  this is
> >> true only to a minor extent,
> >Well, so let's now argue about this extent (not <g>.) The structure of the language
> >influences the result, no matter what you do. Take Java (only because it's a simple
> >example.) All objects there are dynamically allocated. Does this change if you compile the
> >code? No. That's kinda what I'm talking about.
> 
> but java gets compiled to a virtual machine that gets interpreted,
Well, that's a terminological problem: what to call "compilation" <g>? Yes, you're right,
Java gets "compiled" to bytecodes, but that's not what I meant. You know that you can
also compile it (no quotes) to native code. No VM anymore. But the structure-imposed
characteristics of the language remain, for example the dynamic allocation of all objects.

> which slows down the execution considerably.  the jit compilers don't
> really change that, because the compilation is done during the
> execution of the program.  
That's a different kind of compilation. Java is always "compiled" in the sense that you
never execute the source itself, but a JIT would make native machine code out of its VM
bytecodes, and, you bet, it will run faster than bytecodes on the VM. My point was that
even that formally native and definitely better-performing code will be outperformed by
functionally equivalent C++ or C code, because of the different structure of the
languages under consideration.
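
An aside, not part of Frankey's post: one concrete way to see the contrast is
that Common Lisp at least lets the programmer ask for stack allocation of a
temporary via a DYNAMIC-EXTENT declaration; the standard only permits an
implementation to honor the request, so this is a sketch of the request, not
a guarantee. SUM-SQUARES is a made-up example.

;; The temporary vector TMP is declared DYNAMIC-EXTENT, so a compiler
;; that supports it may allocate it on the stack instead of consing it
;; on the heap; an implementation is also free to ignore the declaration.
(defun sum-squares (n)
  (declare (type fixnum n) (optimize (speed 3) (safety 0)))
  (let ((tmp (make-array n :element-type 'single-float :initial-element 0.0)))
    (declare (dynamic-extent tmp))
    (loop for i of-type fixnum below n
          do (setf (aref tmp i) (* (float i 1.0) (float i 1.0))))
    (loop for x of-type single-float across tmp
          sum x of-type single-float)))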
From: Frankey
Subject: Ppl, who's that fruitcake? (Re: Common Lisp and Python performance)
Date: 
Message-ID: <3A46A00B.878B2C83@worldnet.att.net>
Erik Naggum wrote:
> * Frankey <···········@worldnet.att.net>
> | Though one has to keep in mind that the term "compile" can only be
> | applied here figuratively.
> 
>   First C++ as a _language_ has struck the right balance in performance,
>   which is a quality restricted to people and machines, and now _this_
>   rampant and arrogant ignorance, as well?
> 
>   Why do some people think that entertaining such trolls really helps
>   the quality of a forum?
> 
>   Ignorance may be cured, but not in people who hold it as a virtue.
Ran out of Valium, Erik? I see.
Bummer.

Merry Christmas.
From: John Wiseman
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <m3u27uvrjf.fsf@bjork.local.neodesic>
glauber <··········@my-deja.com> writes:

> Python and Perl compile by default, but Lisp compiles only if you
> tell it to.

That's not a property of lisp in general.  The lisp I usually use
compiles everything by default.  I think even the stuff I type at the
listener is compiled before being evaluated.
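
An aside, not part of John's post: a quick way to check which behavior your
Lisp gives you is to define a trivial function at the listener and ask
whether the result is a compiled function.

;; If the implementation compiles definitions as they are entered, this
;; returns T without any explicit call to COMPILE; in a Lisp that defaults
;; to the interpreter it returns NIL until you compile ADD1 yourself.
(defun add1 (x) (1+ x))
(compiled-function-p #'add1)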


John
From: Rolf Wester
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A44F263.D40C2D6@t-online.de>
Hi,

I want to thank you all for your replies. I still have to test the code
you suggested and become a little more proficient at writing efficient
Common Lisp. It also took some time to get familiar with C++ and Python
(BTW, I know that the Python language has nothing to do with CMUCL's
compiler). I began programming in Fortran, which is still the fastest
language for numerical computations. In spite of this I switched to C++
because it is object-oriented, and this makes it possible to write
programs that couldn't be realized in Fortran. So if Common Lisp makes
it easier and faster to write complex programs, I would accept a
moderate loss of speed (a loss of speed of, say, 100% is compensated by
just waiting 18 months until the next faster processor is available,
whereas writing and maintaining a complex program can take longer).

Thanks again.

Merry Christmas and a happy new year.

--Rolf
From: Frankey
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A469F5F.6AAC64D1@worldnet.att.net>
Rolf Wester wrote:
> 
> Hi,
> 
> I want to thank you all for your replies. I still have to test the code
> you suggested and become a little more proficient at writing efficient
> Common Lisp. It also took some time to get familiar with C++ and Python
> (BTW, I know that the Python language has nothing to do with CMUCL's
> compiler). I began programming in Fortran, which is still the fastest
> language for numerical computations. In spite of this I switched to C++
> because it is object-oriented, and this makes it possible to write
> programs that couldn't be realized in Fortran. So if Common Lisp makes
> it easier and faster to write complex programs, I would accept a
> moderate loss of speed (a loss of speed of, say, 100% is compensated by
> just waiting 18 months until the next faster processor is available,
> whereas writing and maintaining a complex program can take longer).

That sounds about right.

Merry Christmas.
From: ··········@my-deja.com
Subject: Avoid Python Worship Lisp
Date: 
Message-ID: <926jqj$l6$1@nnrp1.deja.com>
In article <·················@worldnet.att.net>,
  Frankey <···········@worldnet.att.net> wrote:
> Rolf Wester wrote:
> >
> > Hi,
> > I want to thank you all for your replies. I have to test the code

There are encoded messages.
I saw a tear in my copy of Knuth and realised that it ran through the
word "threat". The DOORS are now open.

You will understand when the true spiritual light comes to the EARTH.
Visitors from the planes will come.

Python is a synonym of the EVIL ONES.
Have you not seen how they use whitespace to encode meaning in THE
PYTHON?

LISP uses only the HOLY CURED PARENTHESIS.

I have seen the truth.
The LIGHT IS IN LISP.

There is still time before they come.


Sent via Deja.com
http://www.deja.com/
From: Rolf Wester
Subject: Re: Common Lisp and Python performance
Date: 
Message-ID: <3A4F63DA.F52D2E5E@t-online.de>
Rolf Wester wrote:

> Hi,
>
> I want to thank you all for your replies. I still have to test the code
> you suggested and become a little more proficient at writing efficient
> Common Lisp. It also took some time to get familiar with C++ and Python
> (BTW, I know that the Python language has nothing to do with CMUCL's
> compiler). I began programming in Fortran, which is still the fastest
> language for numerical computations. In spite of this I switched to C++
> because it is object-oriented, and this makes it possible to write
> programs that couldn't be realized in Fortran. So if Common Lisp makes
> it easier and faster to write complex programs, I would accept a
> moderate loss of speed (a loss of speed of, say, 100% is compensated by
> just waiting 18 months until the next faster processor is available,
> whereas writing and maintaining a complex program can take longer).
>

Hi,

I ran my program with all possible declarations using CMUCL. The g++ version
is about 2 times faster when compiled with -O3. Python (the language, not
CMUCL's compiler) is much slower. For many tasks I consider CMUCL to be fast
enough. Thanks again to those of you who replied to my original post.
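
An illustrative aside, not Rolf's actual code: "all possible declarations"
means roughly the style below, where the array gets an exact element type,
the loop index gets a fixnum type via OF-TYPE, and safety is turned down.
SMOOTH! is a made-up one-dimensional kernel, shown only for the declaration
pattern.

;; N in-place passes of a three-point average over the interior of V.
(defun smooth! (v n)
  (declare (optimize (speed 3) (safety 0) (debug 0))
           (type (simple-array single-float (*)) v)
           (type fixnum n))
  (dotimes (k n v)
    (declare (ignorable k))
    (loop for i of-type fixnum from 1 below (1- (length v))
          do (setf (aref v i)
                   (/ (+ (aref v (1- i)) (aref v i) (aref v (1+ i)))
                      3.0)))))

;; e.g. (let ((v (make-array 300 :element-type 'single-float
;;                               :initial-element 1.0)))
;;        (smooth! v 10))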

Happy new year

--Rolf
From: ··········@my-deja.com
Subject: Common Lisp why are you blaming bush
Date: 
Message-ID: <923m0t$2pf$1@nnrp1.deja.com>
In article <················@naggum.net>,
  Erik Naggum <····@naggum.net> wrote:
>   The United States of America, soon a Bush league world power.

That is suspicious.
Those inside the system use everyday terms and symbols linked with
situations in order to communicate with the person they are torturing
(crucifying). Insiders use phonetics in communication. Examples:
the word "spa" is interpreted as "spay", as in castration. "Lot" is
interpreted as "loot", meaning money. Use this reference, then review
Usenet.

Secret Word Translation

Common: An outsider who has been purchased by an insider as a slave, for
use in their "game". If the player refuses to play, he is tortured.

Bottom: Lower class, used in the phrase "bottom line (lying)". Used when a
person they are crucifying allegedly has lied or violated one of their
rules.

Soon: An outsider who is put under surveillance and tortured/played with
because of an alleged rules violation.

State: Protection by the system; an outsider is caught putting the person
under high tech surveillance and control.

Covered: Same as respect, not under surveillance.

Crucify: When an outsider is captured they are put under high tech
surveillance and tortured.

Cut: Using high tech weaponry to damage blood vessels, causes extreme

Check: Money, assets obtained as. Term used for a high tech surveillance
method; the person is under the magnifying glass or in the mirror. One who
has limited audio/visual capabilities.

No Respect: Under high tech surveillance by the insiders, i.e., being
watched, candid camera, exploitation.

Paranoid: An outsider who is attacked/caught by the system.

Rat: Outsider who has figured out that the insiders have special
characteristics.

Respect: Not under high tech surveillance.

Rules: Bogus set of rules insiders use to catch and crucify/enslave
outsiders (those inside the system have no rules).

Sacrifice: Insiders sacrifice outsiders they are captured or considered
rats (security risks). Touching the person causing pulmonary problems and
death. Suspicious suicides: Class, Protection, High Tech Protection.

See You: An outsider who won't play the insider's game.

Despite an Israeli offer to give the Palestinians control of the Temple
Mount, parts of Jerusalem's Old City, and practically all Arab villages in
the eastern part of Jerusalem, talks in Washington ended inconclusively
yesterday and negotiators returned to the Middle East to consult with
their leaders.

System: Connected people along with their bogus rules for playing with and
destroying outsiders.

Threat: Insiders who raise a challenge against the system to free the
enslaved outsider. This causes a new betting match in order

Lesson 12 is now the final examination lesson. If you are working online,
you now have access to this updated course material.

Real-estate, empires.


Sent via Deja.com
http://www.deja.com/