From: Peter Ludemann
Subject: Re: Parenthesized syntax challenge
Date: 
Message-ID: <LUDEMANN.95Oct6140930@expernet26.expernet.com>
In article <··········@Yost.com> ····@Yost.com (Dave Yost) writes:

   I dare someone to write an article that whets the
   appetite of C programmers for parenthesized syntax
   by teaching its benefits and showing the goodies
   that it makes possible (particularly Common Lisp's
   macros). I bet it can be done.
   Or perhaps there are 7 such articles already...

   From my own experience having come late to lisp, I can tell
   you that there are many, many benefits of it that you can't
   have any idea of until you're exposed to them.

No need for Lots of Irritating Silly Parentheses.  Prolog provides many
of the benefits of Lisp without the need for all those parentheses.
Ditto ML, Haskell, etc.

Because Java has garbage collection, no pointers, and somewhat more
sensible handling of strings and arrays than C++, it might start
whetting C programmers' appetites for the things that programmers
who use high-level languages (Prolog, Lisp, Smalltalk, etc.) have long
taken for granted, but which the C programmers have either not
understood or have thought to be too inefficient.



-- 
Peter Ludemann			········@expernet.com
ExperNet			phone: +1.415.949.8691
Systems Architect		fax:   +1.415.949.1395
KNS Project Leader		

From: Frank Brickle
Subject: Re: Parenthesized syntax challenge
Date: 
Message-ID: <4568tc$ra@newton.texel.com>
Peter Ludemann (········@expernet26.expernet.com) wrote:

: Because Java has garbage collection, no pointers, and somewhat more
: sensible handling of strings and arrays than C++, it might start
: whetting C programmers' appetites for the things that programmers
: who use high-level languages (Prolog, Lisp, Smalltalk, etc.) have long
: taken for granted, but which the C programmers have either not
: understood or have thought to be too inefficient.

This is not entirely far-fetched. I toyed with Lisp gingerly for
years.  But it was Perl -- the iteration constructs, really -- that
made me think, hell, why not do this right, and start using Lisp in
earnest. Good decision it was, too.
From: Paul Wilson
Subject: Re: Parenthesized syntax challenge
Date: 
Message-ID: <45haq6$636@jive.cs.utexas.edu>
In article <··········@undergrad.math.uwaterloo.ca>,
Carl Laurence Gonsalves <········@undergrad.math.uwaterloo.ca> wrote:
  [...]
>
>The incredible slowness of Java is only evidence of the inefficiency of
>garbage collection. I've programmed in a few languages that have garbage
>collection (among them are Scheme, Modula-3, and now Java), and all have
>been *much* slower than C or C++. I'm not saying Java is bad. Just don't
>call it "efficient" when it most certainly is not. It's fine for a lot of
>things, but it is unusable for anything computationally intensive. I have
>yet to hear of a person who has seen the "bouncing heads" demo run
>smoothly...

Get real.  You have no idea what garbage collection does or doesn't cost.
Have you profiled these systems to find out where the time goes?  I haven't,
but I'd bet BIG MONEY that GC is not the culprit, or if it is, it's
because an obsolete GC is being used, or because of a crummy compiler/GC
interface.

>I'm also not saying that garbage collection is the only reason Java is
>slow. The fact that it's interpreted and is continually doing range
>checking must slow it down too.

I'd say that the costs of interpretation are vastly greater than the
costs of garbage collection.

> But Java is not going to convince anyone
>(except maybe people used to really slow languages like Rexx or BASIC) that
>garbage collection is efficient.

And your posting is certainly not going to convince anybody who knows
anything about the subject that garbage collection is slow.  Try reading
my GC survey (bigsurv.ps on our ftp site, below, in revision for Computing
Surveys) and then come back and tell us why we should think GC is slow.
We'd really like to know.

-- 
| Paul R. Wilson, Comp. Sci. Dept., U of Texas @ Austin (······@cs.utexas.edu)
| Papers on memory allocators, garbage collection, memory hierarchies,
| persistence and  Scheme interpreters and compilers available via ftp from 
| ftp.cs.utexas.edu, in pub/garbage (or http://www.cs.utexas.edu/users/wilson/)      
From: Mike McNally
Subject: Re: Parenthesized syntax challenge
Date: 
Message-ID: <m5.813503618@tivoli.com>
········@undergrad.math.uwaterloo.ca (Carl Laurence Gonsalves) writes:
>The incredible slowness of Java is only evidence of the inefficiency of
>garbage collection. 

I'd like to see some evidence that it's the garbage collection that
makes Java "incredibly slow".

>I've programmed in a few languages that have garbage
>collection (among them are Scheme, Modula-3, and now Java), and all have
>been *much* slower than C or C++. 

Still waiting for evidence.  Seems to me that an efficient garbage collector,
even a rather simple-minded one, is overall a big win.  I know I've written
software in C with a dynamic garbage collector underneath, and the resulting
code wasn't "incredibly slow".

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| Nobody's going to listen to you if you just | Mike McNally (··@tivoli.com) |
| stand there and flap your arms like a fish. | Tivoli Systems, Austin TX    |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From: muzo
Subject: Re: Parenthesized syntax challenge
Date: 
Message-ID: <45q3gi$jqg@voyager.internex.net>
Cyber Surfer <············@wildcard.demon.co.uk> wrote:
>If a C++ compiler for Windows that used a GC were available, I'd be
>_very happy_. Even if the GC performed as "badly" as mine does. My GC
>is a simple mark/compact job, so there's a hell of a lot of room for
>improvement. I just pulled it from an 18 year old CS book! Today's
>GCs can surely do better.

Boehm's (sp?) GC system can be used with windows programs and is available at
parcftp.xerox.com

muzo

standard disclaimer
From: Cyber Surfer
Subject: Re: Parenthesized syntax challenge
Date: 
Message-ID: <813741999snz@wildcard.demon.co.uk>
In article <··········@voyager.internex.net> ·····@msn.com "muzo" writes:

> Boehm's (sp?) GC system can be used with windows programs and is available at
> parcftp.xerox.com

I've got it already, but I just didn't mention it. The code I was
referring to created code for DOSX (32bit DOS extended C, in fact),
and I had my own reasons for using a compacting GC. I may someday
add the Boehm GC to my runtime system, esp if I ever port my code
to Win32.

Thanks.
-- 
<URL:http://www.demon.co.uk/community/index.html>
<URL:http://www.enrapture.com/cybes/namaste.html>
Current favourite word: hypersonic. As used by Fluffy.
"You can never browse enough."
From: Carl Laurence Gonsalves
Subject: Re: Parenthesized syntax challenge
Date: 
Message-ID: <DGDGEB.6no@undergrad.math.uwaterloo.ca>
In article <··········@undergrad.math.uwaterloo.ca>,
Carl Laurence Gonsalves <········@undergrad.math.uwaterloo.ca> wrote:
>In article <·····················@expernet26.expernet.com>,
>Peter Ludemann <········@expernet.com> wrote:
>>
>>Because Java has garbage collection, no pointers, and somewhat more
>>sensible handling of strings and arrays than C++, it might start
>>whetting C programmers' appetites for the things that programmers
>>who use high-level languages (Prolog, Lisp, Smalltalk, etc.) have long
>>taken for granted, but which the C programmers have either not
>>understood or have thought to be too inefficient.
>
>The incredible slowness of Java is only evidence of the inefficiency of
>garbage collection. I've programmed in a few languages that have garbage
>collection (among them are Scheme, Modula-3, and now Java), and all have
>been *much* slower than C or C++. I'm not saying Java is bad. Just don't
>call it "efficient" when it most certainly is not. It's fine for a lot of
>things, but it is unusable for anything computationally intensive. I have
>yet to hear of a person who has seen the "bouncing heads" demo run
>smoothly...
>
>I'm also not saying that garbage collection is the only reason Java is
>slow. The fact that it's interpreted and is continually doing range
>checking must slow it down too. But Java is not going to convince anyone
>(except maybe people used to really slow languages like Rexx or BASIC) that
>garbage collection is efficient.

Several people seem to have misunderstood my point, so I thought I'd better
clarify. I am *not* saying that garbage collection is slow. What I am
saying is that Java is slow. I think most people agree on this. Java also
has garbage collection. (any disagreements?)

Now, when I said "evidence" perhaps I was using the wrong word. By evidence
I mean that (statistically speaking) the fact that Java is both slow and
has GC puts a point in the "GC is slow" bucket. I'm not saying that it
proves that GC is slow, nor am I saying Java is slow *because* it has GC.
In fact, I recall reading that Java is about 8-times faster if you compile
it to native-code (rather than byte-code), and the majority of the
remaining slowness when compared to C has to do with security checks and
range-checking.

The post I was originally replying to was saying that one could convince C
and C++ programmers that GC could be efficient by pointing at Java. That's
a silly statement since Java programs are very slow, and if anything the
slowness of Java will make many C and C++ programmers feel that GC must be
the cause of Java's slowness.

As for Scheme and Modula-3, the Scheme I used was interpreted (I probably
should've pointed that out. By default I assume Scheme is interpreted, just
as by default I assume Rexx and BASIC are interpreted, while I assume C and
Pascal are compiled). The Modula-3 compiler I use is SRC Modula-3, which
from what I've seen must be generating some pretty bad code. (I haven't
used it in a couple of months, so things may have changed by now)

Anyways, before you go and send me another flame telling me GC is the best
thing since sliced bread, I suggest you go back and read my original post.
I never actually said there was anything wrong with GC, though I suppose
some might think that is what I was trying to imply.

-- 
        Carl Laurence Gonsalves - ········@undergrad.math.uwaterloo.ca
                   Computer Science, University of Waterloo
               http://www.undergrad.math.uwaterloo.ca/~clgonsal/
                   http://www.csclub.uwaterloo.ca/~clgonsal/
From: Robert Futrelle
Subject: Re: GC, Java, etc.
Date: 
Message-ID: <45ltru$osk@camelot.ccs.neu.edu>
By default, you should assume that Lisp is compiled.
-- 
Prof. Robert P. Futrelle | Biological Knowledge Lab, College of CS
Office: (617)-373-2076   | Northeastern University, 161CN
Fax:    (617)-373-5121   | 360 Huntington Ave.
········@ccs.neu.edu     | Boston, MA 02115
From: Noah L. Gibbs
Subject: Re: GC, Java, etc.
Date: 
Message-ID: <0kV1Z1600iVCI902Vu@andrew.cmu.edu>
Excerpts from netnews.comp.lang.dylan: 17-Oct-95 Re: GC, Java, etc. by
Cyber ······@wildcard.de 
> Which Allegro CL are you refering to? The Unix or the Windows version?

    Unix version.

> >     So, although it could quite easily be internally a compiler of some
> > bizarre stripe, I will always call it an interpreter because I can't see
> > the internals, and the externals resemble Applesoft BASIC in speed
> > (relative to processor) and interactivity (relative to language).
> >     I'm not trying to indicate that all LISPs are like this.  I've only
> > really used Allegro CommonLISP to any real degree, and the fact that
> > people still use CommonLISP is proof enough that Allegro isn't typical.
>  
> Have you read PJ Brown's book, Interactive Compilers and Interpreters?
> In it, he makes the very good point that all language systems tend to
> be a mixture of compiler and interpreter. I guess it's just a matter
> of compromises. Some implementations of a Lisp may be closer to an
> "interpreter" than a "compiler", but really, they'll always be both,
> if you use Brown's definitions.

    I have not, though I can certainly see the point, at least in its
most basic form.  For example, "compiled" BASIC is sometimes just
packaging the interpreter with the bytecodes in an executable (I can't
remember which PC BASIC does that).

> You should also state the machine you're using. For all I know, the
> system could be paging heavily, which would affect performance...

    It might well have been.  In the course I took, we were using Sparc-5's.
We had to write simple text-based projects early on, and later had to use
a windowing system written by the professors to add a UI to a text-based
program (which we had written as the previous project).  While in text-based
form, the program ran only 25X or so slower than a similar program turned out
be GCC, when a user-interface was added, it hit the far end of unusable,
probably because the windowing system was written on top of GARNET, which I
have also heard bad things about.


				Noah Gibbs

"The exact value is a magic cookie." -- Turbo C++ 3.0 Programmer's Guide
From: Jeff Dalton
Subject: Re: GC, Java, etc.
Date: 
Message-ID: <DGnAw5.Co1@cogsci.ed.ac.uk>
········@undergrad.math.uwaterloo.ca (Carl Laurence Gonsalves) writes:

>In article <··········@camelot.ccs.neu.edu>,
>Robert Futrelle <········@ccs.neu.edu> wrote:
>>By default, you should assume that Lisp is compiled.

>Unless you mean Scheme, I never mentioned Lisp. Every Scheme (and Lisp)
>that I've ever used was interpreted, though. I know Lisp and Scheme
>compilers are available, but I've never actually seen one.

Humm.  I think I've used only one Common Lisp that didn't have
a compiler to native code (sometimes via C), and the one exception
was compiled to byte codes.  (I've used something like 14-15
Common Lisps.)

There are lots of Scheme interpreters around, because they're fairly
easy to write.  But there are also a fair number of Schemes that have
compilers.

-- jeff
From: Jeff Dalton
Subject: Re: GC, Java, etc.
Date: 
Message-ID: <DGnB3F.Csp@cogsci.ed.ac.uk>
"Noah L. Gibbs" <·········@andrew.cmu.edu> writes:

>Excerpts from netnews.comp.lang.dylan: 16-Oct-95 Re: GC, Java, etc. by
>··········@netcom.com 
>> I find this almost impossible to credit, as it's been a good decade or more
>> since I've seen a Scheme or Common Lisp that _didn't_ include a compiler!

>    This also depends on what you mean by compiler.  I consider, e.g.
>Allegro CommonLISP to be interpreted, although the professor of our course
>assured us it was "incrementally compiled".  However, the speed, interact-
>ivity, and every other empirical test indicated it was an interpreter.

What do you mean by "Allegro CommonLISP"?  The one from Franz Inc that
runs on Suns and a number of other machines definitely includes a
compiler to native code.  If you call everything that's incrementally
compiled or interactive (or maybe only both) an interpreter, you're
just wrong.  As for speed, compiled code isn't much slower than
compiled C.

Did you actually compile anything, by the way?

-- jd
From: Cyber Surfer
Subject: Re: GC, Java, etc.
Date: 
Message-ID: <814090803snz@wildcard.demon.co.uk>
In article <··········@cogsci.ed.ac.uk>
           ····@cogsci.ed.ac.uk "Jeff Dalton" writes:

> What do you mean by "Allegro CommonLISP"?  The one from Franz Inc that
> runs on Suns and a number of other machines definitely includes a
> compiler to native code.  If you call everything that's incrementally
> compiled or interactive (or maybe only both) an interpreter, you're
> just wrong.  As for speed, compiled code isn't much slower than
> compiled C.

In fact (sorry to mention him again), in Writing Interactive Compilers
and Interpreters, PJ Brown calls both compilers and interpreters by the
same word: compilers. That's so much simpler, esp when the differences
may be rather complicated.
 
> Did you actually compile anything, by the way?

I'd argue that even tokenised Basic interpreters compile. Microsoft's
QuickBasic used a very interesting compiler, and yet it was also
interactive and incremental. Perhaps it should still be called an
interpreter because it used threaded code (an "address interpreter")?
Forth traditionally used threaded code, but some implementations use
native code. I've used both kinds, but the distinction I'd make is
between the interactive & incremental compilers, and the "batch" variety.

You could make any argument you like, if you conveniently define
enough of the terms you use to support it. That's why I like PJ Brown's
terminology so much. Just call them _all_ compilers...
-- 
<URL:http://www.demon.co.uk/community/index.html>
<URL:http://www.enrapture.com/cybes/namaste.html>
Current favourite word: hypersonic. As used by Fluffy.
"You can never browse enough."
From: nq8140700-Davison
Subject: Re: GC, Java, etc.
Date: 
Message-ID: <JWD.95Oct20083905@ihnns792.ATT.COM>
Cyber Surfer writes:

   In fact, (sorry mention him again), in Writing Interactive Compilers
   and Interpreters, PJ Brown calls both compilers and interpreters by the
   same word: compilers. That's so much simpler, esp when the differences
   may be rather complicated.

   > Did you actually compile anything, by the way?

   I'd argue that even tokenised Basic interpreters compile. Microsoft's
   QuickBasic used a very interesting compiler, and yet it was also
   interactive and incremental. Perhaps it should still be called an
   interpreter because it used threaded code (an "address interpreter")?
   Forth traditionally used threaded code, but some implementations use
   native code. I've used both kinds, but the distinction I'd make is
   between the interactive & incremental compilers, and the "batch" variety.

   You could make any argument you like, if you conveniently define
   enough of the terms you use to support it. That's why I like PJ Brown's
   terminology so much. Just call them _all_ compilers...


  Rather than calling everything compilers, I prefer to make the distinction
that an interpreter executes a set of instructions; a compiler translates
them into another language.  The hardware is an interpreter.  Some programs
are interpreters (that are themselves interpreted, usually by the
hardware).  Many other programs are compilers -- sometimes they translate
the program into the language that is directly interpreted by the hardware
(native code compilers), sometimes they translate the program into a
language that is interpreted by another program (or another part of the
same program).

  Using those definitions, something like ParcPlace's Smalltalk system is an
interesting beast: when a method is compiled it is translated into
byte-codes; when the method is invoked, it is translated into machine code
that is interpreted by the hardware (this translation is, presumably, only
done if it's not already in memory).  So, it is interactive, looking like
what people tend to call interpreters, but it's really a double-edged
compiler (I think they prefer "incremental compiler")!



--
Joe Davison 	················@att.com
From: Cyber Surfer
Subject: Re: GC, Java, etc.
Date: 
Message-ID: <814377648snz@wildcard.demon.co.uk>
In article <·················@ihnns792.ATT.COM>
           ················@.ATT.COM "nq8140700-Davison" writes:

>   Rather than call every thing compilers, I prefer to make the distinction
> that an interpreter executes a set of instructions; a compiler translates
> them into another language.  The hardware is an interpreter.  Some programs
> are interpreters (that are themselves interpreted, usually by the
> hardware).  Many other programs are compilers -- sometimes they translate
> the program into the language that is directly interpreted by the hardware
> (native code compilers), sometimes they translate the program into a
> language that is interpreted by another program (or another part of the
> same program).

Sometimes a language system compiles _and_ interprets. What do you call
such a system? In some systems, the amount of compilation may depend on
what you do with the code, like how often you run it.
 
>   Using those definitions, things like ParcPlace's Smalltalk system is an
> interesting beast: when a method is compiled it is translated into
> byte-codes; when the method is invoked, it is translated into machine code
> that is interpreted by the hardware (this translation is, presumably, only
> done if it's not already in memory).  So, it is interactive, looking like
> what people tend to call interpreters, but it's really a double edged
> compiler  (I think they prefer "incremental compiler")!

Exactly. Apparently, Microsoft's QuickBasic will "recompile" the code
as you move from one state (e.g. edit) to another (e.g. run), and then
back again.

The Taos operating system uses the "virtual code" idea in its kernel,
which makes the VP code portable (as long as you use Taos). You write
your code using the VP assembler. Does that make the "compiler" part
of the OS? If you want to think of it like that, then it could be what
some people call a "throw away compiler".

ParcPlace Smalltalk uses a similar idea, as you said, but it works at
the method level. Perhaps we could call it an "incremental throw away
compiler"?

All these ideas are described by PJ Brown in Writing Interactive Compilers
and Interpreters. The difference is that he was using Basic as his example
language, and since he was writing in the mid-70s, it's not surprising
that he used lines as the unit of compilation.
-- 
<URL:http://www.demon.co.uk/community/index.html>
<URL:http://www.enrapture.com/cybes/namaste.html>
Current favourite word: hypersonic. As used by Fluffy.
"You can never browse enough."
From: Al Slater
Subject: Re: GC, Java, etc.
Date: 
Message-ID: <DGozD8.Cqx@bri.hp.com>
[followups trimmed]

Carl Laurence Gonsalves (········@undergrad.math.uwaterloo.ca) wrote:
: In article <·················@netcom.com>,
: ActiVision <·······@netcom.com> wrote:
: >Carl Laurence Gonsalves (········@undergrad.math.uwaterloo.ca) wrote:

<snippings>

: Okay, tell me where I can get a Scheme compiler for the Amiga. :-)

I suspect that Gambit will go on it.

: But actually, I never said that "there aren't any Scheme or Lisp
: compilers". In fact, I acknowledged their existence. I even read a paper on
: Scheme->C. So I know they exist. I just haven't seen one.

Haven't looked terribly hard ;-)

: The Scheme interpreter I used the most was called "scm". I can't find any
: man-pages for it, but it looks like it's interpreted to me. It might be
: doing "incremental compiles" but the outward appearance is that of an
: interpreter. It's interactive and slow.

Compared to say what?
I've used it for all manner of things, including lashing in a PVM
interface to enable me to write apps in Scheme that use PVM routines
to communicate with other Scheme/PVM processes written in C. No
bother, and it went like a dream.

Kindly provide some stats if you are claiming it's slow; either that, or
try some of the other free Lisp substrates and see how your mileage varies ;-)

: The Lisp that I use isn't CommonLisp. It's actually AutoLISP (the Lisp
: interpreter built into AutoCAD). I was developing some small CAD
: applications in AutoLisp. As far as I know there is no compiled AutoLISP
: compiler, but I haven't really bothered to check. Interpreted Lisp was fast
: enough for my purposes.

Which is essentially Xlisp based I think, which then might mean you have
the save workspace function to generate .wks files...

: The fact that I haven't actually used or seen a Scheme or Lisp compiler is
: probably directly related to the fact that I don't do a lot in Scheme or
: Lisp, and I don't really worry myself by looking for Scheme and/or Lisp
: compilers. I personally don't like the parenthesized syntax of Lisp that
: much, and I find it takes me a lot longer to develop in Lisp or Scheme
: because I have to do a mental conversion. I'm certainly not saying that
: parenthesized syntax is inferior. I just don't particularly enjoy
: programming in a language like that. It's about as easy as hand coding
: PostScript I find.

Oooooh, that's trolling for flamage <grin>
I use whatever happens to float my boat for a particular project. I've
found SCM to be easily extensible in C for what I want... and I found
it very rapid to prototype stuff up in, once you added the primitives
for the app-specific bits (this is tantamount to heresy, I realise
:-) that had to be glued in via C.

: Perhaps there is a similar language (like maybe Dylan, I haven't looked at
: it yet) that I would find preferable, but Scheme and Lisp aren't my
: favorite languages. So I hope that helps you understand why I haven't
: seen Scheme or Lisp compilers: I haven't really gone looking for them.

Hmm, having a more in-depth look might persuade you I guess, but it's your
loss not to investigate what they have to offer...

: I don't think there is an AutoLISP compiler.

See above comment about save.

cheers,
al
From: Cyber Surfer
Subject: Re: GC, Java, etc.
Date: 
Message-ID: <814091238snz@wildcard.demon.co.uk>
In article <··········@undergrad.math.uwaterloo.ca>
           ········@undergrad.math.uwaterloo.ca
           "Carl Laurence Gonsalves" writes:

> Okay, tell me where I can get a Scheme compiler for the Amiga. :-)

XScheme? If it isn't available for the Amiga, then it should be easy
to fix, assuming that you have a suitable C compiler.
 
> The Scheme interpreter I used the most was called "scm". I can't find any
> man-pages for it, but it looks like it's interpreted to me. It might be
> doing "incremental compiles" but the outward appearance is that of an
> interpreter. It's interactive and slow.

I think you mean, "It's interpreted and slow." Interactive languages
don't have to be slow. They can just as easily compile to native code.
This is just a matter of compiler technology, and any good book on
the subject should cover that. I have several...
 
> The Lisp that I use isn't CommonLisp. It's actually AutoLISP (the Lisp
> interpreter built into AutoCAD). I was developing some small CAD
> applications in AutoLisp. As far as I know there is no compiled AutoLISP
> compiler, but I haven't really bothered to check. Interpreted Lisp was fast
> enough for my purposes.

If AutoLISP is not fast enough for you, then you should contact AutoDesk
and let them know. Perhaps they'll do something about it, but if nobody
demands better performance, then don't be surprised if you don't get it.

Please note that AutoLISP is not typical of Lisp systems in general.
It's an embedded language, and it may be assumed that performance will
depend more on AutoCAD than the language that extends it.
 
> The fact that I haven't actually used or seen a Scheme or Lisp compiler is
> probably directly related to the fact that I don't do a lot in Scheme or
> Lisp, and I don't really worry myself by looking for Scheme and/or Lisp
> compilers. I personally don't like the parenthesized syntax of Lisp that
> much, and I find it takes me a lot longer to develop in Lisp or Scheme
> because I have to do a mental conversion. I'm certainly not saying that
> parenthesized syntax is inferior. I just don't particularly enjoy
> programming in a language like that. It's about as easy as hand coding
> PostScript I find.

Why then are you so concerned about the performance of these
languages? Is the problem the fact that you can't avoid them?
Are there any other languages that you'd rather avoid, or with
features that you have difficulty with? If you'd like some help
in this area, please say so, so we can offer you some.

> Perhaps there is a similar language (like maybe Dylan, I haven't looked at
> it yet) that I would find preferable, but Scheme and Lisp aren't my
> favorite languages. So I hope that helps you understand why I haven't
> seen Scheme or Lisp compilers: I haven't really gone looking for them.

You should certainly take a look at Dylan, but I think reading some
good books on compiler theory may also help. It's a fascinating
subject, even if you never intend to write a compiler. For example,
they may help you with string handling in general (source code is
a string, and so is object code). They may also help you understand
the languages you use a little better.

> I don't think there is an AutoLISP compiler.
 
Ask AutoDesk about this. That's better than just assuming something;
as people say, that only makes an ASS out of U and ME.

Good luck.
-- 
<URL:http://www.demon.co.uk/community/index.html>
<URL:http://www.enrapture.com/cybes/namaste.html>
Current favourite word: hypersonic. As used by Fluffy.
"You can never browse enough."
From: Jeff Dalton
Subject: Re: Parenthesized syntax challenge
Date: 
Message-ID: <DGp8zz.6Jq@cogsci.ed.ac.uk>
········@undergrad.math.uwaterloo.ca (Carl Laurence Gonsalves) writes:

>In article <··········@undergrad.math.uwaterloo.ca>,
>Carl Laurence Gonsalves <········@undergrad.math.uwaterloo.ca> wrote:

>>The incredible slowness of Java is only evidence of the inefficiency of
>>garbage collection. [...]

>>I'm also not saying that garbage collection is the only reason Java is
>>slow. [...]

Which implies that it is _a_ reason, just not the only one.

>Several people seem to have misunderstood my point, so I thought I'd better
>clarify. I am *not* saying that garbage collection is slow. What I am
>saying is that Java is slow. I think most people agree on this. Java also
>has garbage collection. (any disagreements?)

>Now, when I said "evidence" perhaps I was using the wrong word. By evidence
>I mean that (statistically speaking) the fact that Java is both slow and
>has GC puts a point in the "GC is slow" bucket. 

But it doesn't.  If anything, it puts a point in the "system that uses
GC is slow" bucket.  It also puts one in the "system whose name starts
with `J'" bucket.

Now, why do you think the first bucket is worth mentioning while the
second is obviously silly?  Presumably you think GC might well be a
reason Java is slow.  Moreover, that you describe the bucket as the
"GC is slow" bucket suggests that your view is actually a stronger
one.

I'm not trying to show that you think that "garbage collection is
slow", just to say something about why people might have thought
that was your view.

>The post I was originally replying to was saying that one could convince C
>and C++ programmers that GC could be efficient by pointing at Java. That's
>a silly statement since Java programs are very slow, and if anything the
>slowness of Java will make many C and C++ programmers feel that GC must be
>the cause of Java's slowness.

That's a good point.

-- jd
From: MAEDA Atusi
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <MAD.95Oct18040436@tanzanite.math.keio.ac.jp>
In article <··········@Cadence.COM> Simon Kinahan <simonk> writes:

    simonk> ······@cs.utexas.edu (Paul Wilson) wrote:
    >> For more info, you might see followups that made it to some of the three
    >> newsgroups the original post went to (but not all), and a recent thread
    >> in comp.lang.scheme.  Hans Boehm has clearly argued (again) that the
    >> asymptotic arguments about GC cost (and about relative costs of copying
    >> vs. non-copying GC) just don't wash.  He (and I) have repeatedly tried
    >> to kill all of these myths, but they're very hardy memes.  There are
    >> some similar misunderstandings of generational GC issues.

    simonk> I agree that 'in practice' GC is not faster. However, mathematically
    simonk> a copying GC program is faster asymptotically than its malloc/free
    simonk> equivalent. It is not a myth, just people mistaking maths for reality :-)

He pointed out that my reasoning about why GC is faster is a myth and
also mentioned the "asymptotic complexity myth".

My statement (GC is faster than free because the cost of free is
proportional to the number of objects that die during the computation,
while GC cost isn't) came from Kenneth Anderson's article "Courage in Profiles"
in Lisp Pointers Vol.8, No.1.  (He is also the author of the article I
quoted in my previous post.)

When I read that excellent article I accepted the reasoning without
thinking much, since it sounded logical at first glance.  (A question
that came to mind was "What about reference counting?".  But I, er,
deferred thinking about that.)

All the references shown by Paul Wilson say that things are not that
simple.  Hans Boehm's web page ftp://parcftp.xerox.com/pub/gc/complexity.html
was especially clear and convincing.  Now I understand why "copying
collector is faster than mark and sweep" is a myth.

Also, in his survey on memory allocators, Wilson warned that it is
dangerous to rely too much on simplified mathematical models.
(Although I haven't thoroughly read it yet; it's a *long* survey.)

;;;  Keio University
;;;    Faculty of Science and Technology
;;;      Department of Math
;;;		MAEDA Atusi (In Japan we write our family names first.)
;;;		···@math.keio.ac.jp
From: Henry Baker
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <hbaker-1710952127200001@10.0.2.15>
In article <·················@tanzanite.math.keio.ac.jp>,
···@math.keio.ac.jp wrote:

> All the references shown by Paul Wilson say that things are not that
> simple.  Hans Boehm's web page ftp://parcftp.xerox.com/pub/gc/complexity.html
> was especially clear and convincing.  Now I understand why "copying
> collector is faster than mark and sweep" is a myth.

A number of us asked Hans Boehm to write down his argument, and his web
page admits that the argument still needs some work.  I basically agree
with most of his conclusions, but would like to point out some additional
things:

* although a copying gc uses ~2X the _virtual_ memory, it isn't clear that
it needs to use 2X the _real_ memory.  However, in order to save paging,
it is essential that the VM allows the user to give it information about
what is live and what is not.  For example, when starting to allocate
in the 'tospace', there is no need to actually _read_ the page, only to
write it.  So the VM needs the ability to _write-allocate_, like some
caches.  Similarly, when the 'fromspace' is abandoned, the VM should
literally discard the pages and make no attempt to write them out, since
nothing on them will ever be read again; a minimal sketch of such a hint
appears below, after these points.  (This may be impossible on
high-security VM's which would require that the pages on disk be literally
zeroed out; this is something that should be performed by today's
intelligent disk controllers (anyone out there listening??)).
Finally, if you copy stuff from one place to another more than one
generation -- and if you copy in the same order (breadth OR depth first) --
then you will be touching sequentially exactly the same relative pages in both
the fromspace and the tospace at the same time, plus some additional pages
whose forwarding pointers need to be updated.  A well-designed VM should
be able to take advantage of this kind of correlation to improve paging
(and possibly cache) behavior.  You also want to make sure that your fromspace
and your tospace are offset in a direct-mapped cache, or you will pay
dearly!!

* I disagree with Hans that people will want to keep their mark(copy)/alloc
ratios high enough to keep the total memory to some integer times the
live data.  Memory is becoming cheaper faster than processors are becoming
faster, so it will continue to be more appealing to allocate enough memory
to keep the fraction of the time spent in GC at a nearly negligible level
-- kind of like the current cost of 'refreshing DRAMs' (you didn't know
about this?)  As a result, the 'sweep cost' _could_ (without clever
programming) become a significant fraction of GC time, although it would
still be a negligible cost overall.

* I disagree with both Paul and Hans to a certain extent regarding the
necessity to eliminate fragmentation.  Although I don't like copying
large objects around, it is precisely the largest objects which can cause
the most fragmentation, so moving them becomes inevitable in a long-lived
system such as a persistent object store.  Perhaps a GC which copies only
rarely, rather than one which copies all the time, is a better compromise.
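
Regarding the point above about discarding the abandoned fromspace: here is a
minimal, hypothetical C sketch of a collector giving the VM exactly that hint.
It is not code from any of the systems discussed; it assumes a VM that exposes
madvise() and anonymous mmap() (MAP_ANONYMOUS), which not every 1995 system
did, and the names release_fromspace/semispace_bytes are invented.

#include <stdio.h>
#include <stddef.h>
#include <sys/mman.h>

static void release_fromspace(void *fromspace, size_t semispace_bytes)
{
    /* Hint that the abandoned semispace's contents are garbage, so the VM
     * may simply drop the pages instead of writing them out. */
    if (madvise(fromspace, semispace_bytes, MADV_DONTNEED) != 0)
        perror("madvise");
}

int main(void)
{
    size_t semispace_bytes = 1 << 20;      /* 1 MB semispace, for illustration */
    void *fromspace = mmap(NULL, semispace_bytes, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (fromspace == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* ... allocate in fromspace, then copy the live data to tospace ... */

    release_fromspace(fromspace, semispace_bytes);   /* abandon fromspace */
    return 0;
}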

-- 
www/ftp directory:
ftp://ftp.netcom.com/pub/hb/hbaker/home.html
From: Robert O'Callahan
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <4635ap$8no@cantaloupe.srv.cs.cmu.edu>
······@netcom.com (Henry Baker) wrote:

...
>* I disagree with Hans that people will want to keep their mark(copy)/alloc
>ratios high enough to keep the total memory to some integer times the
>live data.  Memory is becoming cheaper faster than processors are becoming
>faster,
...

I don't believe this.  The doubling time for processor speeds is
definitely less than two years.  Memory costs hardly seem to be coming
down at all.

Rob
======================================================================
Robert O'Callahan (····@cs.cmu.edu) 2nd year CMU SCS PhD
Home page: http://www.cs.cmu.edu/~roc
Three million people, sixty million sheep, one America's Cup
From: Henry Baker
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <hbaker-1810952107490001@10.0.2.15>
In article <··········@cantaloupe.srv.cs.cmu.edu>, ····@cs.cmu.edu (Robert
O'Callahan) wrote:

> ······@netcom.com (Henry Baker) wrote:
> 
> >* I disagree with Hans that people will want to keep their mark(copy)/alloc
> >ratios high enough to keep the total memory to some integer times the
> >live data.  Memory is becoming cheaper faster than processors are becoming
> >faster,
> 
> I don't believe this.  The doubling time for processor speeds is
> definitely less than two years.  Memory costs hardly seem to be coming
> down at all.

Thanks to everyone who pointed this out to me.  Perhaps I'm looking at this
with a much longer time scale -- 35 years or so.  I'm also using the intuition
that cycle time is linearly related to feature size, while memory space is
quadratically related to feature size.  Unfortunately, this intuition doesn't
consider things like the strength of the dollar, the budgeting cycles of
major DRAM vendors, etc.  Also, the ability of Microsoft to waste memory
so prodigiously has really raised the demand curve.  (I understand that
'Bob' was going to require a minimum of 32 Mbytes.  Whatever happened to
'Bob', anyway?  Perhaps he succumbed to hype blood pressure...)

-- 
www/ftp directory:
ftp://ftp.netcom.com/pub/hb/hbaker/home.html
From: Hans Boehm
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <46jb8v$sm8@news.parc.xerox.com>
······@netcom.com (Henry Baker) writes:

>* I disagree with Hans that people will want to keep their mark(copy)/alloc
>ratios high enough to keep the total memory to some integer times the
>live data.  Memory is becoming cheaper faster than processors are becoming
>faster, so it will continue to be more appealing to allocate enough memory
>to keep the fraction of the time spent in GC at a nearly negligible level
>-- kind of like the current cost of 'refreshing DRAMs' (you didn't know
>about this?)  As a result, the 'sweep cost' _could_ (without clever
>programming) become a significant fraction of GC time, although it would
>still be a negligible cost overall.

Paul and others have already responded to much of Henry's message.
But I think there's an important issue here.  My experience has been that
memory requirements of current garbage collected systems often prevent
their acceptance.  A significant contributing factor is GC algorithms
that are much more generous than necessary in their memory use.  The
reasons I don't think this is acceptable:

1.  I vaguely recall that memory was around $20/MB around 1991
(corrections appreciated).  The current price is around $30/MB.
Processors have gotten much faster since then.

2. As a result, I claim the average PC is grossly underconfigured with
memory.  Most PCs would gain much more from a memory upgrade than a
processor upgrade, if they're running anything beyond DOS.  RISC
workstations tend to be configured a bit more reasonably, but users
expect something in return for the extra money they paid for
memory.

3. Not all programs are designed to take over a machine.  Many programs
are designed to run in the background while the machine is used for
something else.  Few users will be happy with a large memory footprint
background program (or 20 of them) while they are running SML of NJ,
or some CAD program, or Microsoft Word, or whatever, in the foreground.

4. Users expect that if your machine has N megabytes, you should be
able to run an application that has nearly N megabytes live data without
paging.  If you tell them the real limit is .7N they'll grumble, but
they might accept it.  If you tell them it's N/5, you've lost.

I will agree that you should be able to tune the heap size.  But my
experience is that few users ever do such tuning.

Hans-J. Boehm
(·····@parc.xerox.com)
Standard disclaimer ...
From: William Paul Vrotney
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <vrotneyDH17B3.29q@netcom.com>
In article <·······················@10.0.2.15> ······@netcom.com (Henry Baker) writes:
>
>   In article <··········@news.parc.xerox.com>, ·····@parc.xerox.com (Hans
>   Boehm) wrote:
>
>   > 2. As a result, I claim the average PC is grossly underconfigured with
>   > memory.  Most PCs would gain much more from a memory upgrade than a
>   > processor upgrade, if they're running anything beyond DOS.  RISC
>                                                                 ^^^^
>   > workstations tend to be configured a bit more reasonably, but users
>   > expect something in return for the extra money they paid for
>   > memory.
>
>   The RISC 'revolution' has increased memory requirements for code by about
>   a factor of two.

Good point.  I did some tests (years ago) by compiling the same program on a
68040 and then a Sparc and you are correct about the factor of two.  This
was for small programs.  What I noticed was that as the programs got larger
the factor got larger (than two).  This was just something that I noticed
and doesn't prove anything.  Also, the CISC compiler may have been better
than the RISC compiler for larger programs.  But it was enough to make me
start wondering about all the hype over RISC architectures at the time.

-- 

William P. Vrotney - ·······@netcom.com
From: Henry Baker
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <hbaker-2710950838100001@10.0.2.15>
In article <··················@qobi.ai>, ····@CS.Toronto.EDU wrote:

> I'm told that on DEC ALPHAs, which use 64-bit pointers, memory requirements
> are doubled yet again.
> 
> Data structure size affects not only paging performance but also caching
> performance. Has anybody done a study to see what the tradeoffs are between
> the increased instruction cost of CDR coding vs the potential decreased
> cache-miss ratio? How would that change if minimal CDR-coding hardware were
> added to RISC processors?

I would believe that some sort of 'pointer-swizzling' idea might work better
than checking the pointers on every access.

-- 
www/ftp directory:
ftp://ftp.netcom.com/pub/hb/hbaker/home.html
From: Erik Corry
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <DH25zB.7D@kroete2.freinet.de>
Henry Baker (······@netcom.com) wrote:

: I would advise using .-relative addressing for all pointers (perhaps some
: C compiler vendor/GNU is listening), and start using VM compression.  Once you
: have encoded pointers as _relative_ addresses instead of _absolute_
: addresses, then VM compression will work better.

Could you explain how using .-relative addressing improves compression?
The way I understood it VM compression (a la Mac RamDoubler) only works
really well if the pages to be compressed consist mostly of zeros.

--
[SCSI] ist ja bekanntlich so kompliziert, dass es nur die fuer ihre Bastelwut
beruechtigten Apple-User benutzen. -- Detlef Grell, c't 8/95
--
Erik Corry ·······@inet.uni-c.dk
From: Henry Baker
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <hbaker-2710952309000001@10.0.2.15>
In article <·········@kroete2.freinet.de>, ·······@inet.uni-c.dk (Erik
Corry) wrote:

> Henry Baker (······@netcom.com) wrote:
> 
> : I would advise using .-relative addressing for all pointers (perhaps some
> : C compiler vendor/GNU is listening), and start using VM compression.  Once you
> : have encoded pointers as _relative_ addresses instead of _absolute_
> : addresses, then VM compression will work better.
> 
> Could you explain how using .-relative addressing improves compression?
> The way I understood it VM compression (a la Mac RamDoubler) only works
> really well if the pages to be compressed consist mostly of zeros.

If you allocate the cells of a list in sequence, and they are allocated
contiguously, then in each cell one of the fields will point to a location
.+n bytes subsequent.
If these are relative pointers, then they will all be 'n', so a reasonable
compression scheme -- e.g., gzip -- should be able to compress such a
sequence.

I don't know how sophisticated RamDoubler's compression scheme is.
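
To make that concrete, here is a small, hypothetical C sketch (the node layout
and names are invented for illustration; no real Lisp stores lists this way
verbatim).  With contiguous allocation, every self-relative 'next' field holds
the same small constant, while the absolute addresses all differ, so a page of
such cells is far more compressible.

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

struct node {
    intptr_t next;    /* self-relative offset to the next node, 0 = end */
    int      value;
};

int main(void)
{
    enum { N = 8 };
    struct node *nodes = malloc(N * sizeof *nodes);  /* one contiguous run */
    if (!nodes) return 1;

    for (int i = 0; i < N; i++) {
        nodes[i].value = i;
        nodes[i].next = (i + 1 < N)
            ? (char *)&nodes[i + 1] - (char *)&nodes[i]  /* always the same 'n' */
            : 0;
    }

    for (int i = 0; i < N; i++)
        printf("node %d: absolute next = %p, relative next = %ld\n", i,
               nodes[i].next ? (void *)((char *)&nodes[i] + nodes[i].next) : NULL,
               (long)nodes[i].next);

    free(nodes);
    return 0;
}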

-- 
www/ftp directory:
ftp://ftp.netcom.com/pub/hb/hbaker/home.html
From: Hans Boehm
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <46rkuv$jh8@news.parc.xerox.com>
······@netcom.com (Henry Baker) writes:

>The RISC 'revolution' has increased memory requirements for code by about
>a factor of two.

My impression is that code density varies a lot between CISC processors,
and that code density for "32 bit" code on an X86 is nothing to write
home about either.  But RISC code is likely to be larger.

>Furthermore, due to RISC requirements for data alignment, RISC _data_ has
>also gotten much less dense -- e.g., I'm not aware of any RISC Lisps which
>do 'cdr-coding' anymore.  I would imagine that the RISC revolution has
>expanded data requirements by perhaps 1.5X.

I don't know about Lisp implementations.  For C code, I don't believe
this makes an appreciable difference.  Borland and Microsoft C/C++
compilers have different defaults for alignment.  The Microsoft compiler
uses RISC-like alignments by default (which makes sense
for time performance reasons) and the Borland compiler uses one byte
alignments by default (probably for historical reasons).  I suspect
most important structures are manually aligned so they come out the
same way in either case.  But then the fields are also likely to be manually
ordered to minimize loss, so even the loss over, say, a VAX representation
is likely to be minimal.

>Therefore, unless you start using some form of VM
>compression on code pages, you are probably in worse shape than before
>RISC, because you are going to page fault more often than a CISC architecture
>with the same amount of memory, and this is a very non-linear phenomenon.
>(RISC processors should be very good at doing compression, so the overall
>cost of this should be quite small.)

>I would advise using .-relative addressing for all pointers (perhaps some
>C compiler vendor/GNU is listening), and start using VM compression.
Please don't do that for C; you'll break conservative GC completely.
(You'll also break at least all C code that calls third party libraries,
and I doubt you'll have an ANSI conforming compiler.  How do you handle
union assignments?  But I have my priorities straight :-))

>Once you
>have encoded pointers as _relative_ addresses instead of _absolute_
>addresses, then VM compression will work better.
Presumably the right place to convert to relative pointers is in the
compressor?

Hans-J. Boehm
(·····@parc.xerox.com)
Standard disclaimer ...
From: Paul Wilson
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <46rvtt$h3e@jive.cs.utexas.edu>
In article <··········@news.parc.xerox.com>,
Hans Boehm <·····@parc.xerox.com> wrote:
>······@netcom.com (Henry Baker) writes:
>
>>The RISC 'revolution' has increased memory requirements for code by about
>>a factor of two.
>
>My impression is that code density varies a lot between CISC processors,
>and that code density for "32 bit" code on an X86 is nothing to write
>home about either.  But RISC code is likely to be larger.

The number I've heard bandied about is about 1.3 times CISC size, on
average.  I think early RISC compilers generated more bloated code,
partly because they weren't mature (eventually optimizers got better
at reducing code size without much speed penalty), and partly because
some early RISC architectures (e.g., MIPS R2000?) required a lot of NOPs
to ensure that pipeline constraints were preserved.  

My impression is that everybody's using interlocked architectures that
just stall as necessary, rather than requiring NOPS, because that makes
it easier to design multiple generations of binary-compatible RISCs.

>>Furthermore, due to RISC requirements for data alignment, RISC _data_ has
>>also gotten much less dense -- e.g., I'm not aware of any RISC Lisps which
>>do 'cdr-coding' anymore.  I would imagine that the RISC revolution has
>>expanded data requirements by perhaps 1.5X.

I don't think CDR-coding would be worthwhile on modern CISCs either.
(Though Henry may have been talking about counterfactual "Lisp-oriented
modern CISCs", which haven't actually been built, partly due to marketing
issues.)

At any rate, I don't think microcode would help any longer---CISC machines
are going to have a RISC core, and you're going to want to do loads and
stores in a single cycle.  I'm not sure what impact CDR-coding support
would have on a machine with single-cycle loads and stores.  I'm not
sure what should go into which parts of the hardware, and whether you
can do it without an impact on the cycle time, at least in the uncommon
case.  Then again, I'm not sure it's a problem---I haven't worked it out.

I'm also not sure how big a win CDR-coding is in a modern dialect of
Lisp, programmed in a modern, object-oriented style, or whether programming
styles should change.  I wouldn't be surprised if conventional Lisp
list structures get less common over time, as people come up with
alternative data structures using object-oriented features or whatever.

There's some data (from Bob Shaw's and Ben Zorn's theses) that bear
on this, but there are lots of issues.  (It appears that most object-oriented
Lisp programs still use in the ballpark of half their memory for list cells.
But that may be an artifact of poor compilers for object systems, and there
might be a trend toward more and more object-orientedness as it gets
relatively cheaper.)

At any rate, on stock hardware CDR coding is a big lose speedwise.  Compressed
VM is more general, because it can compress pointers in other kinds of
data structures, as well as compressing nonpointer data.

> [ ... For C ] I suspect
>most important structures are manually aligned so they come out the
>same way in either case.  But then the fields are also likely to be manually
>ordered to minimize loss, so even the loss over, say, a VAX representation
>is likely to be minimal.
>
>>Therefore, unless you start using some form of VM
>>compression on code pages, you are probably in worse shape than before
>>RISC, because you are going to page fault more often than a CISC architecture
>>with the same amount of memory, and this is a very non-linear phenomenon.
>>(RISC processors should be very good at doing compression, so the overall
>>cost of this should be quite small.)

Yes, if you have very fast compression algorithms.

>>I would advise using .-relative addressing for all pointers (perhaps some
>>C compiler vendor/GNU is listening), and start using VM compression.

VM compression is a good idea, but relative addressing isn't necessary for it
to work quite well;  pointers are highly compressible if you have any
kind of decent locality in your data structures.

>Please don't do that for C; you'll break conservative GC completely.
>(You'll also break at least all C code that calls third party libraries,
>and I doubt you'll have an ANSI conforming compiler.  How do you handle
>union assignments?  But I have my priorities straight :-))

>>Once you
>>have encoded pointers as _relative_ addresses instead of _absolute_
>>addresses, then VM compression will work better.
>Presumably the right place to convert to relative pointers is in the
>compressor?

Yes.  That's what we do in a compressed VM system we're developing.
Actually, we don't bother to convert to relative explicitly; we just notice
that the upper and middle bits of a lot of pointers are similar to upper and
middle bits of lots of other nearby pointers, and exploit that.
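
A toy illustration (purely hypothetical; this is not our actual compressor,
and the dictionary size and bit split are made up) of why that pays off:
check each word's high bits against a tiny dictionary of recently seen
values, and most pointers into the same region will match.

#include <stdint.h>
#include <stdio.h>

#define DICT_SIZE 4

/* Count how many 32-bit words share their upper 22 bits with a recently
 * seen word; such words could be encoded as a 2-bit dictionary index plus
 * their low 10 bits instead of a full 32-bit value. */
static int count_high_bit_matches(const uint32_t *words, int n)
{
    uint32_t dict[DICT_SIZE] = {0};
    int next = 0, matches = 0;

    for (int i = 0; i < n; i++) {
        uint32_t high = words[i] >> 10;
        int hit = 0;
        for (int j = 0; j < DICT_SIZE; j++)
            if (dict[j] == high) { hit = 1; break; }
        if (hit)
            matches++;                        /* cheap to encode */
        else
            dict[next++ % DICT_SIZE] = high;  /* remember the new pattern */
    }
    return matches;
}

int main(void)
{
    /* "pointers" into one region: their upper bits are all alike */
    uint32_t page[8] = {0x40001000, 0x40001010, 0x40001020, 0x40001040,
                        0x40001080, 0x400010c0, 0x40001100, 0x40001140};
    printf("%d of 8 words share high bits with a recent word\n",
           count_high_bit_matches(page, 8));
    return 0;
}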

>Hans-J. Boehm
>(·····@parc.xerox.com)
>Standard disclaimer ...

-- 
| Paul R. Wilson, Comp. Sci. Dept., U of Texas @ Austin (······@cs.utexas.edu)
| Papers on memory allocators, garbage collection, memory hierarchies,
| persistence and  Scheme interpreters and compilers available via ftp from 
| ftp.cs.utexas.edu, in pub/garbage (or http://www.cs.utexas.edu/users/wilson/)      
From: Erik Naggum
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <19951029T012233Z@naggum.no>
[Henry Baker]

|   I would advise using .-relative addressing for all pointers (perhaps
|   some C compiler vendor/GNU is listening), and start using VM
|   compression.

[Hans Boehm]

|   Please don't do that for C; you'll break conservative GC completely.
|   (You'll also break at least all C code that calls third party
|   libraries, and I doubt you'll have an ANSI conforming compiler.  How do
|   you handle union assignments?  But I have my priorities straight :-))

this is odd.  to build shared objects (libraries) under, e.g., SunOS 4,
so-called "position-independent code" must be emitted by the assembler, and
special considerations must be followed by the compiler.  (not all C code
can be compiled into shared objects, but I don't know what the conditions
are.)  given this, are you saying that shared objects would cause the
compiler (or library) not to be conforming to ANSI C?  if so, this is,
IMNSHO, a problem in the standard (ISO C, actually) that should be fixed.

#<Erik 3023918552>
-- 
a good picture may well be worth a thousand words, but on the WWW,
even bad imagemaps cost tens of thousands of words.
From: Hans Boehm
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <474fku$mej@news.parc.xerox.com>
······@netcom.com (Henry Baker) writes:
>On some RISC machines, asking for byte alignment blows code size size up
>substantially.

To tell you the truth, I've never been tempted to ask for byte alignment.
But I usually arrange common C structures so that word alignment
doesn't make much of a difference.  On some RISC machines unaligned memory
references are emulated after a trap.

>I don't understand this.  There's nothing in C that guarantees that a pointer
>points to an _absolute_ address; only that relative addresses work correctly.
>In fact, many compilers optimize accesses in such a way that an absolute
>pointer to an object may never be stored at all, since the object can be
>easily reference relative to some nearby object.

Yes, but the offset is independent of where the pointer is stored.

>Relative pointers should be less visible than the segmentation nonsense
>(so NEAR and yet so FAR) of some brain-damaged processors.

Maybe I'm missing something.  Consider

union U {
    int x;
    union U * y;
};

Assume a and b have type (union U *).  How do you compile 

*a = *b;


A straight copy won't work if *b was last assigned with the pointer variant.
Adjusting for the difference between a and b will break if they both have
the integer variant.

Representing unions with a tag is nearly impossible in C because you
can take the address of a union field.  Hence nearly any assignment through
a pointer might be assigning to a union.
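
A hypothetical illustration of the dilemma (the offset encoding below is
invented; no real C compiler stores pointers this way): if the pointer
variant held a self-relative offset, a plain copy of *b into *a decodes to
the wrong address, while rebasing the offset during the copy would corrupt
the int variant.

#include <stdio.h>
#include <stdint.h>

union U {
    int      x;
    intptr_t y;    /* stand-in for "union U *" stored as a self-relative offset */
};

int main(void)
{
    union U b, a;
    union U target;

    b.y = (char *)&target - (char *)&b;   /* b "points at" target */

    a = b;                                /* straight copy of the raw offset */

    void *b_sees = (char *)&b + b.y;      /* decodes to &target */
    void *a_sees = (char *)&a + a.y;      /* decodes to somewhere else */

    printf("b decodes to %p, a decodes to %p, target is at %p\n",
           b_sees, a_sees, (void *)&target);
    return 0;
}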

Hans-J. Boehm
(·····@parc.xerox.com)
Standard disclaimer ...
From: Hans Boehm
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <47dlmk$mu0@news.parc.xerox.com>
······@netcom.com (Henry Baker) writes:

[My point was that if relative pointers are used, union assignments in
C are very difficult to compile correctly.]

>I don't think that C guarantees _anything_ about unions of pointers and
>ints.  I don't think that C guarantees much about unions at all.  Most C
>compilers seem to handle union assignments as if they were the first
>member of the union, or that they are bit strings as
>long as the longest member.  But there are lots of things that break with
>unions.

I disagree.  The C standard does guarantee that if you assign to a union
using one variant, copy the union, and retrieve the same variant, you get
the same answer back.  Without that guarantee you couldn't implement
something like tagged unions.

>C should never allow assignments of union types; they should only
>allow assignments of the members.  Not only does this keep the compiler
>from getting confused, it also keeps the _programmer_ from getting confused.

I disagree again.  You might have a structure containing a union, where
some of the fixed structure fields identify the union variant.  That
seems perfectly consistent with good C programming style.  Furthermore,
prohibiting union assignments then prohibits assignments of structures
containing unions.  To avoid this problem, you also have to prohibit
passing unions as arguments, or returning them from functions.
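
[A minimal sketch of the style described above -- a struct whose fixed
field identifies the live union member, copied as a whole.  The names
are invented for illustration:]

#include <stdio.h>

enum kind { KIND_INT, KIND_PTR };

struct value {
    enum kind tag;              /* fixed field identifies the variant */
    union {
        int           i;
        struct value *p;
    } u;
};

static struct value make_int(int i)
{
    struct value v;
    v.tag = KIND_INT;
    v.u.i = i;
    return v;                   /* returning the struct copies the union */
}

static void show(struct value v)    /* passing by value copies it again */
{
    if (v.tag == KIND_INT)
        printf("int: %d\n", v.u.i);
    else
        printf("ptr: %p\n", (void *) v.u.p);
}

int main(void)
{
    struct value a = make_int(42);
    struct value b = a;             /* tag and union travel together */
    show(b);                        /* prints "int: 42" */
    return 0;
}

[Forbidding union assignment would rule out all three copies in this
sketch.]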

Clearly there are ways to design a language with these restrictions.
You might even just replace untagged unions with tagged unions.  (And
you might also replace pointer arithmetic with real arrays, so that the
language has a reasonable safe subset.)  The result might even, for
most purposes, be a better programming language than C.  But it
wouldn't be C anymore.

Hans-J. Boehm
(·····@parc.xerox.com)
Standard disclaimer ...
From: Ken Anderson
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <KANDERSO.95Nov2145204@lager.bbn.com>
In article <··········@deneb.cygnus.com> ·······@cygnus.com (Per Bothner) writes:
  
  Nonsense.  You just re-allocate the vector by a factor of two
  each time it gets full.  The amortized allocation+copying time
  is linear in the final size (i.e. the same as for a linked list).
  However, the constant factor is probably smaller for the vector
  (since you are allocating bigger chunks at a time).
  Plus you get all the other advantages of vectors.
  
  For examples, see most GNU code (such as gdb).
  
  Lists are more efficient mainly if you move things around a lot,
  or delete/insert items at other places than the end.
  
You have to be careful about the initial and final size, and the growth rate,
because you can spend a lot of time copying.  We had an application which
spent quite some time growing a million-element hash table.  Getting the
table size roughly right helped a lot.  Here's the result of a simple
benchmark:

GROW-LIST 
; cpu time (non-gc) 640 msec user, 10 msec system
; cpu time (gc)     200 msec user, 0 msec system
; cpu time (total)  840 msec user, 10 msec system
; real time  877 msec
; space allocation:
;  1,000,000 cons cells, 0 symbols, 0 other bytes
GROW-ARRAY 
; cpu time (non-gc) 2,570 msec user, 30 msec system
; cpu time (gc)     80 msec user, 0 msec system
; cpu time (total)  2,650 msec user, 30 msec system
; real time  2,875 msec
; space allocation:
;  28 cons cells, 0 symbols, 8,388,944 other bytes
 
(in-package "USER")

(defun grow-list (N x i L)
  (declare (list x)   ; the initial call passes (), so LIST rather than CONS
	   (fixnum N i L))
  (if (= n 0) x
      (grow-list (- N 1) (cons nil x) (+ i 1) L)))

(defun grow-array (N x i L)
  (declare (simple-vector x)
	   (fixnum N i L))
  (if (= n 0) x
    (if (< i L) 
	(progn 
	  (setf (aref x i) nil)
	  (grow-array (- N 1) x (+ i 1) L))
      (let* ((new-L (+ L L))
	     (new-x (make-array new-L)))
	(dotimes (i L)
	  (setf (aref new-x i) (aref x i)))
	(grow-array (- N 1) new-x i new-L)))))

(defun grow-benchmark (N)
  (print 'grow-list)
  (time (grow-list N () 0 0))
  (print 'grow-array)
  (time (grow-array N #(1) 0 1)))

(grow-benchmark 1000000)
From: Mike Christiansen
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <470uk5$sig@feenix.metronet.com>
·····@parc.xerox.com (Hans Boehm) wrote:

>1.  I vaguely recall that memory was around $20/MB around 1991
>(corrections appreciated).  The current price is around $30/MB.
>Processors have gotten much faster since then.

Yeah... this one is a real problem.  Why haven't DRAM prices tracked
processors?  I believe it was a bit earlier than 1991; however, I
believe it was the far-east producers dumping on our market to
drive out all of the competition and now have a monopoly.  They are
keeping prices high, but not so high that the uproar causes the US
Government to do something about it.  And why haven't the US companies
gotten back into the market?  Well, they are starting to (IBM & TI), but
always with Japanese partners.  I read some time ago that the US has
given up on the DRAM market.  But if there is money to be made, why
isn't the US Government protecting US-based companies from the unfair
trading practices of other countries?

Just my two cents worth.
Mike
-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Mike Christiansen
·····@metronet.com
From: Dirk Wright
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <47l442$ap0@pioneer.uspto.gov>
Mike Christiansen (·····@metronet.com) wrote:
: ·····@parc.xerox.com (Hans Boehm) wrote:

: >1.  I vaguely recall that memory was around $20/MB around 1991
: >(corrections appreciated).  The current price is around $30/MB.
: >Processors have gotten much faster since then.

: Yeah... this one is a real problem.  Why haven't DRAM prices tracked
: processors?

I suspect that an international cartel has been formed to keep DRAM prices
high, like in the diamond and oil industries.  I have no evidence of this,
however.

Dirk
From: Jeffrey Mark Siskind
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <QOBI.95Oct30171156@qobi.ai>
In article <··········@jive.cs.utexas.edu> ······@cs.utexas.edu (Paul Wilson) writes:

   I'm also not sure how big a win CDR-coding is in a modern dialect of
   Lisp, programmed in a modern, object-oriented style, or whether programming
   styles should change.  I wouldn't be surprised if conventional Lisp
   list structures get less common over time, as people come up with
   alternative data structures using object-oriented features or whatever.

In any object-oriented language, Lisp or otherwise, a significant fraction of
objects will be members of collections. Some collections need to be
hash-tables, trees, or other complex data structures due to the kinds of
operations performed on those collections. Others don't need that complexity
and you don't want the overhead for them. Vectors are only suitable if you
know how big the collection will be in advance. That doesn't fit all uses.
I'll bet that even in fully object-oriented situations you will always have a
significant fraction of collections represented as linked lists. It doesn't
matter whether they are CONS cells, doubly linked lists, or link slots merged
into other classes by inheritance, they will be there and some variant of
CDR-coding can be applied to them. Whether it is worthwhile or not is another
story.

--

    Jeff (home page http://www.cs.toronto.edu/~qobi)
From: Per Bothner
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <478s22$eim@deneb.cygnus.com>
In article <··················@qobi.ai>,
Jeffrey Mark Siskind <····@CS.Toronto.EDU> wrote:
>Vectors are only suitable if you know how big the collection
>will be in advance.

Nonsense.  You just re-allocate the vector by a factor of two
each time it gets full.  The amortized allocation+copying time
is linear in the final size (i.e. the same as for a linked list).
However, the constant factor is probably smaller for the vector
(since you are allocating bigger chunks at a time).
Plus you get all the other advantages of vectors.

For examples, see most GNU code (such as gdb).

Lists are more efficient mainly if you move things around a lot,
or delete/insert items at other places than the end.
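
[A minimal sketch, in C, of the doubling strategy described above; the
names are invented for illustration:]

#include <stdlib.h>

struct vec {
    int    *data;
    size_t  len;    /* elements in use */
    size_t  cap;    /* elements allocated */
};

/* Append one item, doubling the capacity whenever the vector is full.
   Each element is copied at most a couple of times over the vector's
   lifetime, so n appends cost O(n) total. */
static int vec_push(struct vec *v, int item)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? 2 * v->cap : 8;
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return -1;          /* out of memory; vector unchanged */
        v->data = p;
        v->cap  = new_cap;
    }
    v->data[v->len++] = item;
    return 0;
}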

-- 
	--Per Bothner
Cygnus Support     ·······@cygnus.com
From: Tim Bradshaw
Subject: Re: GC, Java, etc.
Date: 
Message-ID: <TFB.95Oct18100248@scarp.ed.ac.uk>
* Noah L Gibbs wrote:

>     It might well have been.  In the course I took, we were using Sparc-5's.
> We had to write simple text-based projects early on, and later had to use
> a windowing system written by the professors to add a UI to a text-based
> program (which we had written as the previous project).  While in text-based
> form, the program ran only 25X or so slower than a similar program turned out
> by GCC; when a user-interface was added, it hit the far end of unusable,
> probably because the windowing system was written on top of GARNET, which I
> have also heard bad things about.

If you were getting a factor of 25 between Allegro and gcc you were
either not compiling your code, or writing very poor code indeed (or
perhaps memory starved).  When I tried, several years ago, to check
equivalent lisp & C programs (both written by me) using allegro I
found that gcc was rather faster if I was sloppy about memory
management (never called free()) and slower if I wasn't (but I didn't
bother to write or obtain a clever allocator for what I was doing).

--tim
From: Mr. Eduardo W. Pellegrino
Subject: Re: GC, Java, etc.
Date: 
Message-ID: <PLLGRNEW.95Oct23193730@sunv22.ps.ic.ac.uk>
In article <······················@lager.bbn.com> ········@lager.bbn.com (Ken Anderson) writes:

> I've waited a few days hoping that there would be some clarification on
> this "25x or so" figure before responding to this.  Generally when the
> runtimes of two programs that "do the same thing" differ by a factor that
> large, they are doing something wildly different.  I suspect that most of
> the difference has little to do with one program being in Lisp and the
> other in GCC.
> 

I have tried changing C programs into perl before now; since perl's
syntax is very close to C (or can be), this enabled me to test the
relative speeds of compiled and interpreted code.  Doing really basic
stuff (floating point math) I found C 100 times faster (almost exactly,
actually).  This isn't a criticism of perl; I am sure that perl would
perform much better if I were doing string manipulation, but I just
wanted to point out that in math-intensive programs C or FORTRAN will
often do miles better than interpreted languages.
From: Tim Rentsch
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <TXR.95Oct24010445@alumni.cco.caltech.edu>
The article mentioned above (in-reply-to:) says:

> Memory is becoming cheaper faster than processors are 
> becoming faster,

Seems factually incorrect, at least locally.  Processors have
gotten faster in the last five years, whereas RAM has pretty
much stayed constant at $40/megabyte (yes, not exact, but
pretty close) for the last five years.  Am I missing something?
Five years is a pretty long trend.
From: Tim Rentsch
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <TXR.95Oct24011649@alumni.cco.caltech.edu>
>  The original poster was asking whether GC could ever be faster than a
>  malloc/free implementation, not whether GC is expensive or cheap for
>  real programs.  There are clearly cases where GC is intrinsically
>  faster than malloc/free, even if in most real cases, there isn't much
>  difference between the two.

Anecdotally the answer seems to be yes.  An unpublished report on a
large (>500,000-line) program that was converted from explicit
deallocation to an ambiguous-roots collector found a significant
speedup.  For non-disclosure reasons I cannot give more details,
except to say that this program is a successful commercial application
(both before and after conversion to GC), and the same group of
programmers worked on both versions (i.e., programmer variation would
seem to be at best a second-order factor).
From: John D. Mitchell
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <46ofok$72b@emf.emf.net>
In article <··················@qobi.ai>, Jeffrey Mark Siskind wrote:
>I'm told that on DEC ALPHAs, which use 64-bit pointers, memory
>requirements are doubled yet again.
>
>Data structure size affects not only paging performance but also caching 
>performance. Has anybody done a study to see what the tradeoffs are
>between the increased instruction cost of CDR coding vs the potential
>decreased cache-miss ratio? How would that change if minimal CDR-coding
>hardware were added to RISC processors?

Yep.  The Titanium project at UC Berkeley has done some work so far on
'packed linked lists', packed binary trees (B-trees), etc., looking at how
to deal with the multi-level memory hierarchies that exist today and will
exist in the future.  Some very preliminary numbers on a couple of existing
machines, with just packed linked lists, showed pretty clearly that
performance increased by roughly 2X (a little below or a little above) when
the data object size was around the cache line size.

Their web page is http://HTTP.CS.Berkeley.EDU/projects/titanium/ though I
haven't looked at it yet.  The professors involved are A. Aiken, S. Graham,
and K. Yelick.

Hope this helps,
		John
From: MORE Systems
Subject: Re: Garbage collection cost (was Re: Parenthesized syntax challenge)
Date: 
Message-ID: <MORESYS.95Oct31213722@world.std.com>
In article <·······················@10.0.2.15> ······@netcom.com (Henry Baker) writes:
   C should never allow assignments of union types; they should only
   allow assignments of the members.  Not only does this keep the compiler
   from getting confused, it also keeps the _programmer_ from getting confused.

This is not quite true. A variable of a union type can be assigned a
value of the same union type, which copies the memory verbatim between
the two unions. If you meant to say that you cannot safely assign a
value to a member of a union by assigning the value directly to the
union, that is correct.

I find that it helps to not focus on the fact that the members of a
union overlap in memory. They do, but there is no safe, portable way
to take advantage of the fact.  Just think of them as structs with separate
slots for each member, but only room for one member at a time, and
all the safety rules suddenly make sense.
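
[A minimal sketch of the point above; the union and member names are
invented for illustration:]

#include <stdio.h>

union num {
    int    i;
    double d;
};

int main(void)
{
    union num a, b;

    a.i = 7;                 /* the single "slot" currently holds i */
    b = a;                   /* union-to-union assignment copies it verbatim */
    printf("%d\n", b.i);     /* fine: reading the member last stored */

    a.d = 3.14;              /* now the slot holds d ... */
    /* printf("%d\n", a.i);     ... so reading a.i here is not a safe,
                                 portable way to inspect those bytes */
    return 0;
}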