From: Ray Dillinger
Subject: Lex, Yacc, and Lisp...
Date: 
Message-ID: <GL1Ze.37$Aw.604@typhoon.sonic.net>
I had a random thought the other day.  Lisp could easily
be the "back end" or "executable intermediate representation"
for most language compilers.

Consider this:  some student in a CS course designs a language.
He uses flex or lex to create a lexer, yacc or bison to parse
the tokens into a syntax tree according to a grammar, and
then does a traversal of the syntax tree where each node is
preceded by an open-paren and the name of the node type (from
the grammar), and followed by a close-paren.

The output of this traversal can be considered as a lisp
expression.  The final step in implementing the student's
language is to implement lisp definitions (probably macros)
for each grammar node type that perform the semantic action
of that node type according to the language design.
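To make the idea concrete, here is a minimal sketch of what such node-type macros might look like, assuming the traversal prints nodes named while-stmt and assign-stmt (both node names are invented for illustration):

```lisp
;; Hypothetical node-type macros for a toy language.  The parser's
;; traversal is assumed to print forms like (while-stmt TEST STMT...)
;; and (assign-stmt VAR EXPR); these macros give them their semantics.
(defmacro while-stmt (test &body stmts)
  `(loop while ,test do (progn ,@stmts)))

(defmacro assign-stmt (var expr)
  `(setf ,var ,expr))

;; The printed parse tree is then an ordinary Lisp program:
(let ((i 0) (sum 0))
  (while-stmt (< i 5)
    (assign-stmt sum (+ sum i))
    (assign-stmt i (+ i 1)))
  sum)                                  ; => 10
```

With this scheme the "code generator" is just PRINT, and all the semantic work lives in the macro definitions.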

This seems like an easier method of implementing most
languages, and Lisp, with its syntactic flexibility, is
uniquely even more suited to being a "universal" target
metalanguage than, say, the JVM or .NET to name a couple
of the usual suspects.

Have I missed important things?  What would be hard to
do with this approach?

				Bear

From: Mark Carter
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <43351b21$0$49010$14726298@news.sunsite.dk>
Ray Dillinger wrote:
> 
> Lisp, with its syntactic flexibility, is
> uniquely even more suited to being a "universal" target
> metalanguage than, say, the JVM or .NET to name a couple
> of the usual suspects.

How about having Forth as a kind of universal target metalanguage? I'm 
not much into Forth, but it seems that many Forthers have a go at 
rolling their own implementation, so it's a very simple language to 
implement.
From: Sven-Olof Nystr|m
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <vpzmq26ee1.fsf@harpo.it.uu.se>
Ray Dillinger <····@sonic.net> writes:

> I had a random thought the other day.  Lisp could easily
> be the "back end" or "executable intermediate representation"
> for most language compilers.
>
> Consider this:  some student in a CS course designs a language.
> He uses flex or lex to create a lexer, yacc or bison to parse
> the tokens into a syntax tree according to a grammar, and
> then does a traversal of the syntax tree where each node is
> preceded by an open-paren and the name of the node type (from
> the grammar), and followed by a close-paren.

Why not write the compiler in Lisp? There are parser generators
available. Besides, writing a recursive-descent parser in Lisp is
straightforward.


> The output of this traversal can be considered as a lisp
> expression.  The final step in implementing the student's
> language is to implement lisp definitions (probably macros)
> for each grammar node type that perform the semantic action
> of that node type according to the language design.

Sounds like an elegant solution. However, doing a simple translation
from a tree-structured intermediate language to a lower level tends to
be pretty straightforward. Using macros would allow you to avoid the
explicit case analysis, but the case analysis is the easy part, so you
wouldn't gain much.


> This seems like an easier method of implementing most
> languages, and Lisp, with its syntactic flexibility, is
> uniquely even more suited to being a "universal" target
> metalanguage than, say, the JVM or .NET to name a couple
> of the usual suspects.

From a practical point of view, using Lisp as a backend has many
advantages. However, if we are talking about a compilers course and
one of the goals of the course is to teach how machine code is
generated, you would miss out on that.


> Have I missed important things?  

You haven't explained why you want to do this :-)


> What would be hard to do with this approach?

If the goal is to implement an experimental language as easily as
possible, but with reasonable performance, then I think that a
translation to Lisp would be the easiest route. This is of course
under the assumption that the libraries you need may be accessed from
Lisp. (And if interfacing with the libraries is hard, then that might
be one of the main obstacles.)


Sven-Olof


-- 
Sven-Olof Nystrom 
Comp Sci Dept, Uppsala University, Box 337, S-751 05 Uppsala SWEDEN
········@csd.uu.se, http://www.csd.uu.se/~svenolof
phone: +46 18 471 76 91, fax: +46 18 51 19 25 

  
From: Ray Dillinger
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <WyfZe.115$Aw.1504@typhoon.sonic.net>
Sven-Olof Nystr|m wrote:
> Ray Dillinger <····@sonic.net> writes:
> 

>>Consider this:  some student in a CS course designs a language.
>>He uses flex or lex to create a lexer, yacc or bison to parse
>>the tokens into a syntax tree according to a grammar, and
>>then does a traversal of the syntax tree where each node is
>>preceded by an open-paren and the name of the node type (from
>>the grammar), and followed by a close-paren.

>>The output of this traversal can be considered as a lisp
>>expression.  The final step in implementing the student's
>>language is to implement lisp definitions (probably macros)
>>for each grammar node type that perform the semantic action
>>of that node type according to the language design.

> You haven't explained why you want to do this :-)

Honestly?  I was thinking of autotranslating code libraries
written in other languages to Lisp.  We've had so many people
saying "where is your CPAN equivalent?" that I'm just about
ready to tell them, "look, here's a perl-to-lisp translator;
now you can suck in CPAN itself if you want it and call the
lisp versions of all these functions from your lisp code. "

If we claim that lisp makes it easy to implement language
extensions, it seems to me that with a translator approach,
we ought to be able to benefit from code written in just
about any language.

It falls down, of course, where those libraries rely on
binary interfaces or OS calls not available to our translator;
but pure code libraries, implementing algorithms and data
structures, etc, should be fine.

				Bear
From: David Steuber
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <87psqykjzw.fsf@david-steuber.com>
Ray Dillinger <····@sonic.net> writes:

> Sven-Olof Nystr|m wrote:
> 
> > You haven't explained why you want to do this :-)
> 
> Honestly?  I was thinking of autotranslating code libraries
> written in other languages to Lisp.  We've had so many people
> saying "where is your CPAN equivalent?" that I'm just about
> ready to tell them, "look, here's a perl-to-lisp translator;
> now you can suck in CPAN itself if you want it and call the
> lisp versions of all these functions from your lisp code. "

I have had idle thoughts that a C => Lisp translator would be cool to
get some of those libraries into native Lisp.  The thoughts have
remained idle because I'm not sure how to deal with inevitable
situations like C's use of pointer arithmetic and low level system
calls when they don't go through libc.  I'm sure I've barely
scratched the surface.

But something like that would be very cool, esp if it created READABLE
lisp code that could then be maintained easily.

One thing I have noticed about all programming languages is that they
lose information.  The information they lose is the intent of the
programmer, hence the invention of inline code comments (and perhaps
also literate programming, although I don't know much about that).
I'm not sure how a C to Lisp translator could reverse engineer the
intent of the programmer to produce canonical Lisp code as might be
typed by a proficient human programmer.

-- 
You should maybe check the chemical content of your breakfast
cereal. --- Bill Watterson
From: Lars Brinkhoff
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <85slvtbnxe.fsf@junk.nocrew.org>
David Steuber <·····@david-steuber.com> writes:
> I have had idle thoughts that a C => Lisp translator would be cool

Have you looked at Zeta-C?
From: David Steuber
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <87irwomz25.fsf@david-steuber.com>
Lars Brinkhoff <·········@nocrew.org> writes:

> David Steuber <·····@david-steuber.com> writes:
> > I have had idle thoughts that a C => Lisp translator would be cool
> 
> Have you looked at Zeta-C?

I have not.  But I'll look it up when I'm less busy.

And not wasting time on usenet ;-)

-- 
http://www.david-steuber.com/
The UnBlog | Lisp on OS X topics for the most part
Click all the links you want.  I'll make more!
From: Sven-Olof Nystr|m
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <vphdc9imuk.fsf@harpo.it.uu.se>
David Steuber <·····@david-steuber.com> writes:

> Ray Dillinger <····@sonic.net> writes:
>
>> Sven-Olof Nystr|m wrote:
>> 
>> > You haven't explained why you want to do this :-)
>> 
>> Honestly?  I was thinking of autotranslating code libraries
>> written in other languages to Lisp.  We've had so many people
>> saying "where is your CPAN equivalent?" that I'm just about
>> ready to tell them, "look, here's a perl-to-lisp translator;
>> now you can suck in CPAN itself if you want it and call the
>> lisp versions of all these functions from your lisp code. "
>
> I have had idle thoughts that a C => Lisp translator would be cool to
> get some of those libraries into native Lisp.  The thoughts have
> remained idle because I'm not sure how to deal with inevitable
> situations like C's use of pointer arithmetic and low level system
> calls when they don't go through libc.  I'm sure I haven't even
> touched the surface.

One solution might be to use some form of fat-pointer
representation. For example, a pointer into an array could be
represented by a structure carrying the type of the pointer, the array
it points into, and the index at which it points. No, it wouldn't be
very efficient. :-)
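A rough sketch of what that representation might look like in Lisp (the structure and operator names are invented for illustration):

```lisp
;; Fat-pointer sketch: a C pointer into an array becomes a structure
;; remembering the array and the current index.
(defstruct c-ptr
  array
  (index 0))

(defun c-deref (p)                      ; *p
  (aref (c-ptr-array p) (c-ptr-index p)))

(defun (setf c-deref) (value p)         ; *p = value
  (setf (aref (c-ptr-array p) (c-ptr-index p)) value))

(defun c-ptr-add (p n)                  ; p + n
  (make-c-ptr :array (c-ptr-array p)
              :index (+ (c-ptr-index p) n)))
```

As a bonus, AREF bounds-checks every access, which a raw C pointer never does; the cost is an allocation on every pointer operation.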


> But something like that would be very cool, esp if it created
>READABLE lisp code that could then be maintained easily.

The only way that you could produce maintainable code would be if the
compiler translated the C code to something that had a structure
similar to the source. You could define a Lisp macro for each C
construct; the translation would then be a simple one-to-one
mapping, and the object code would be about as easy to maintain as the
source.


> One thing I have noticed about all programming languages is that they
> lose information.  The information they lose is the intent of the
> programmer, hence the invention of inline code comments (and perhaps

Very true.


> also literate programming, although I don't know much about that).
> I'm not sure how a C to Lisp translator could reverse engineer the
> intent of the programmer to produce canonical Lisp code as might be
> typed by a proficient human programmer.

Normally, translation tends to cause information to be lost. Could one
reverse the process? As you say, what is really interesting is
understanding the programmer's intentions. Reading the programmer's
mind is in general an unsolvable problem :-), but perhaps a translator
could recognize common idioms and replace them with higher-level code.
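As a toy illustration of such idiom recognition (all names here are invented), a rewriter might match the literal translation of a C summing loop and replace it with a REDUCE:

```lisp
;; Toy idiom recognizer: a form shaped like
;;   (c-for INIT TEST STEP (c-incf ACC (c-aref ARR IDX)))
;; -- the literal translation of "for(...) acc += arr[i];" --
;; is rewritten to idiomatic Lisp; anything else passes through.
(defun rewrite-sum-loop (form)
  (if (and (consp form)
           (eq (first form) 'c-for)
           (consp (fifth form))
           (eq (first (fifth form)) 'c-incf))
      (let ((acc (second (fifth form)))          ; the accumulator
            (arr (second (third (fifth form))))) ; the array summed over
        `(incf ,acc (reduce #'+ ,arr)))
      form))
```

A real translator would need many such patterns, applied bottom-up over the whole tree, but the principle is just this kind of pattern-matching.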

Sven-Olof


-- 
Sven-Olof Nystrom 
Comp Sci Dept, Uppsala University, Box 337, S-751 05 Uppsala SWEDEN
········@csd.uu.se phone: +46 18 471 76 91, fax: +46 18 51 19 25 

  
From: Ulrich Hobelmann
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <3pnaohFb5kv0U1@individual.net>
Sven-Olof Nystr|m wrote:
> Normally, translation tends to cause information to be lost. Could one

That's just because translators as we know them suck.  When I know the
words, I can translate, say, English to German and preserve the meaning.
Often I don't know the words.  And this translation ability is not
intelligence, it's just experience and pattern-matching, IMHO.

> reverse the process? As you say, what is really interesting is
> understanding the programmer's intentions. Reading the programmer's
> mind is in general an unsolvable problem :-), but perhaps a translator
> could recognize common idioms and replace them with higher-level code.

Yes, I wouldn't bother to read minds.

Translating code as it is should be enough if the target language allows 
low-level code to be written (I think CL does to a reasonable degree). 
Maybe the biggest problem would be performance of array accesses due to 
bounds-checks, but that's no worse than Java.

-- 
My mouth says the words, my brain is thinking monstertrucks.
Joey (Friends)
From: Pascal Bourguignon
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <87ll1lmo24.fsf@thalassa.informatimago.com>
Ulrich Hobelmann <···········@web.de> writes:
> Translating code as it is should be enough if the target language
> allows low-level code to be written (I think CL does to a reasonable
> degree). Maybe the biggest problem would be performance of array
> accesses due to bounds-checks, but that's no worse than Java.

This would be the easiest.  The biggest problem would be the
greenspunning when "translating" CL to C, and the usual bugs when
"translating" C to CL.  Some kind of inverse greenspunning.  For
example, when you see unsigned int x; {x++;}, you cannot translate to
(incf x), you have to do it bug for bug, and translate to: 
(setf x (mod (1+ x) (expt 2 32)))

How do you "translate":
(defun fact (x) (if (< 1 x) (* x (fact (1- x))) 1))
to C?
int fact(int x){if(1<x){return(x*fact(x-1));}else{return(1);}}
is plain wrong.

[70]> (fact 12.34)
1.2730496E9
[71]> (fact 100)
93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000



On the other hand, to implement:

[···@thalassa pjb]$ cat fact.c
#include <stdio.h>
    int fact(int x){if(1<x){return(x*fact(x-1));}else{return(1);}}
    int main(void){int i;for(i=0;i<30;i++){printf("%d!=%d\n",i,fact(i));}return(0);}
[···@thalassa pjb]$ gcc -o fact fact.c
[···@thalassa pjb]$ ./fact
0!=1
1!=1
2!=2
3!=6
4!=24
5!=120
6!=720
7!=5040
8!=40320
9!=362880
10!=3628800
11!=39916800
12!=479001600
13!=1932053504
14!=1278945280
15!=2004310016
16!=2004189184
17!=-288522240
18!=-898433024
19!=109641728
20!=-2102132736
21!=-1195114496
22!=-522715136
23!=862453760
24!=-775946240
25!=2076180480
26!=-1853882368
27!=1484783616
28!=-1375731712
29!=-1241513984
[···@thalassa pjb]$ 


You have to write:

(defun c2*32 (x y) ;; inverse greenspunning
   (let ((p (logand (1- (expt 2 32)) (* x y))))
      (if (<= (expt 2 31) p)
          (- p (expt 2 32))
          p)))

(defun c-fact (x)
  (declare (type (integer -2147483648 2147483647) x)) ; -2^31 .. 2^31-1
  (if (< 1 x)
      (c2*32 x (c-fact (1- x))) ; we've inferred that x > -2^31
      1))


(defun main ()
   (loop for i from 0 below 30 do (format t "~D!=~D~%" i (c-fact i)))
   0)


[91]> (main)
0!=1
1!=1
2!=2
3!=6
4!=24
5!=120
6!=720
7!=5040
8!=40320
9!=362880
10!=3628800
11!=39916800
12!=479001600
13!=1932053504
14!=1278945280
15!=2004310016
16!=2004189184
17!=-288522240
18!=-898433024
19!=109641728
20!=-2102132736
21!=-1195114496
22!=-522715136
23!=862453760
24!=-775946240
25!=2076180480
26!=-1853882368
27!=1484783616
28!=-1375731712
29!=-1241513984
0


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
The rule for today:
Touch my tail, I shred your hand.
New rule tomorrow.
From: Sven-Olof Nystr|m
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <vpaci1iesn.fsf@harpo.it.uu.se>
Pascal Bourguignon <····@mouse-potato.com> writes:

> Ulrich Hobelmann <···········@web.de> writes:
>> Translating code as it is should be enough if the target language
>> allows low-level code to be written (I think CL does to a reasonable
>> degree). Maybe the biggest problem would be performance of array
>> accesses due to bounds-checks, but that's no worse than Java.
>
> This would be the easiest.  The biggest problem would be the
> greenspunning when "translating" CL to C, and the usual bugs when
> "translating" C to CL.  Some kind of inverse greenspunning.  For
> example, when you see unsigned int x; {x++;}, you cannot translate to
> (incf x), you have to do it bug for bug, and translate to: 
> (setf x (mod (1+ x) (expt 2 32)))

If maintainability of the generated code is important, define a
package with Lisp implementations of all C primitives. x++ could be
translated to, say, (c:unsigned-int-incf x).
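The definition behind such a primitive could be a one-liner (the package and macro name follow the suggestion above; the definition itself is hypothetical):

```lisp
;; A 32-bit unsigned increment with C's wraparound semantics, as
;; (c:unsigned-int-incf x) might be defined inside the C-primitives
;; package.
(defmacro unsigned-int-incf (place)
  `(setf ,place (logand #xFFFFFFFF (1+ ,place))))

;; (let ((x #xFFFFFFFF)) (unsigned-int-incf x))  ; => 0
```

All the wraparound noise is then hidden in one place, and the generated code reads almost like the original C.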


>
> How do you "translate":
> (defun fact (x) (if (< 1 x) (* x (fact (1- x))) 1))
> to C?
> int fact(int x){if(1<x){return(x*fact(x-1));}else{return(1);}}
> is plain wrong.

The same problem has to be solved when translating Lisp to
assembly. By the way, isn't there a Lisp implementation which
translates to C?  Again, the obvious solution is to define a library
for arbitrary-precision arithmetic (several such libraries exist).


Sven-Olof




-- 
Sven-Olof Nystrom 
Comp Sci Dept, Uppsala University, Box 337, S-751 05 Uppsala SWEDEN
········@csd.uu.se phone: +46 18 471 76 91, fax: +46 18 51 19 25 

  
From: Pascal Bourguignon
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <87hdc9mk2o.fsf@thalassa.informatimago.com>
Sven-Olof Nystr|m <········@harpo.it.uu.se> writes:
>> How do you "translate":
>> (defun fact (x) (if (< 1 x) (* x (fact (1- x))) 1))
>> to C?
>> int fact(int x){if(1<x){return(x*fact(x-1));}else{return(1);}}
>> is plain wrong.
>
> The same problem has to be solved when translating Lisp to
> assembly. By the way, isn't there a Lisp implementation which
> translates to C?  Again, the obvious solution is to define a library
> for arbitrary-precision arithmetic (several such libraries exist).

Of course, what I call greenspunning.  What's the point of translating
if you have to forward the whole run-time environment along?

Starting from specifications and automatically generating programs in
one language or another would be more interesting.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
The rule for today:
Touch my tail, I shred your hand.
New rule tomorrow.
From: Christopher C. Stacy
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <uhdc8n8fi.fsf@news.dtpq.com>
Sven-Olof Nystr|m <········@harpo.it.uu.se> writes:
> By the way, isn't there a Lisp implementation which
> translates to C?

In the past there were a bunch of them.
(I haven't bothered to track that for more than 10 years.)
From: Paolo Amoroso
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <8764sofex4.fsf@plato.moon.paoloamoroso.it>
Sven-Olof Nystr|m <········@harpo.it.uu.se> writes:

> assembly. By the way, isn't there a Lisp implementation which
> translates to C?  Again, the obvious solution is to define a library

Until a few years ago there was such a commercial product, Eclipse.
It was developed by, I think, Howard Stearns, and was distributed by
Elwood Corporation.


Paolo
-- 
Why Lisp? http://wiki.alu.org/RtL%20Highlight%20Film
Recommended Common Lisp libraries/tools:
- ASDF/ASDF-INSTALL: system building/installation
- CL-PPCRE: regular expressions
- CFFI: Foreign Function Interface
From: Lars Brinkhoff
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <85br2gb5vd.fsf@junk.nocrew.org>
Paolo Amoroso <·······@mclink.it> writes:
> Sven-Olof Nystr|m <········@harpo.it.uu.se> writes:
> > By the way, isn't there a Lisp implementation which translates to
> > C?
> Until a few years ago there was such a commercial product, Eclipse.
> It was developed by, I think, Howard Stearns, and was distributed by
> Elwood Corporation.

Are GCL and ECL too obvious to mention?
From: Gisle Sælensminde
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <0nu0g7agbu.fsf@kaktus.ii.uib.no>
Pascal Bourguignon <····@mouse-potato.com> writes:

> Ulrich Hobelmann <···········@web.de> writes:
> > Translating code as it is should be enough if the target language
> > allows low-level code to be written (I think CL does to a reasonable
> > degree). Maybe the biggest problem would be performance of array
> > accesses due to bounds-checks, but that's no worse than Java.
> 
> This would be the easiest.  The biggest problem would be the
> greenspunning when "translating" CL to C, and the usual bugs when
> "translating" C to CL.  Some kind of inverse greenspunning.  For
> example, when you see unsigned int x; {x++;}, you cannot translate to
> (incf x), you have to do it bug for bug, and translate to: 
> (setf x (mod (1+ x) (expt 2 32)))

It is even arguable whether it can be called a bug at all. All right, the
C standard says that the behavior of overflow is undefined, but practically
all implementations of cryptographic, signal processing and audio processing
algorithms in C rely on wraparound semantics, and would fail horribly if
overflows were not handled that way. No sane C compiler writer for a
general-purpose (not embedded) target would make a compiler without this
behavior these days, and the C-to-Lisp translator would have to do the same.


 
> How do you "translate":
> (defun fact (x) (if (< 1 x) (* x (fact (1- x))) 1))
> to C?
> int fact(int x){if(1<x){return(x*fact(x-1));}else{return(1);}}
> is plain wrong.
> 
> [70]> (fact 12.34)
> 1.2730496E9
> [71]> (fact 100)
> 93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000

In the math books I have available here, factorials are defined for integers
only, but let us for now assume that your generalized definition is required.
Then you need a separate version for float/double due to the static typing of
C. The same goes for the upper limit. A correct integer version might look
like this:

int fact(int x){
  if(x < 0){
    return -UNDERFLOW;
  } else if(x > 12){
    return -OVERFLOW;
  } else if(x > 1){
    return x*fact(x-1);
  } else {
    return 1;   /* base case: 0! = 1! = 1 */
  }
}
 
This version assumes 32-bit integers, but is closer to what well-behaved C
code would do: working around C's limitations as a normal mode of operation.
It illustrates exactly why automatically translated code can never be good
Lisp code, and also why Lisp gives higher programmer productivity than C.

Any useful implementation would of course use a bignum library, making the
code even more complicated.

-- 
Gisle Sælensminde, Phd student, Scientific programmer
Computational biology unit, University of Bergen, Norway
Email: ·····@cbu.uib.no
From: Gareth McCaughan
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <87zmpzwqve.fsf@g.mccaughan.ntlworld.com>
Gisle Sælensminde wrote:

[Pascal Bourguignon:]
>> [70]> (fact 12.34)
>> 1.2730496E9
>> [71]> (fact 100)
>> 93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000
> 
> In the math books I have available here, factorials are defined for integers
> only, but let us for now assume that your generalized definition is required.

There is a useful definition that works for non-integer arguments,
but it is not equivalent to the one Pascal gave. (But his underlying
point is independent of whether the particular example he gave is
useful, I guess.)

-- 
Gareth McCaughan
.sig under construc
From: Marcin 'Qrczak' Kowalczyk
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <87wtkyhofp.fsf@qrnik.zagroda>
·····@kaktus.ii.uib.no (Gisle Sælensminde) writes:

> It is even arguable whether it can be called a bug at all. All right,
> the C standard says that the behavior of overflow is undefined,
> but practically all implementations of cryptographic, signal
> processing and audio processing algorithms in C rely on wraparound
> semantics, and would fail horribly if overflows were not handled
> that way.

In ISO/ANSI C it's undefined only for signed types. Unsigned types
have well-defined wraparound semantics on overflow.

In fact modern gcc takes advantage of undefinedness of signed integer
overflow:

#include <stdio.h>
#include <limits.h>
int test1(int x, int y) {return x + y  > x;}
int test2(int x)        {return x + 10 > x;}
int main(void) {
   printf("%d %d\n", test1(INT_MAX, 10), test2(INT_MAX));
   return 0;
}

Result with optimization turned on: 0 1

gcc has option -fwrapv for making it defined.

It once broke my C implementation of a runtime of a dynamically typed
language. I knew that I relied on undefined behavior, but I assumed
that every sane implementation makes it wrap around, which was not true.
I still haven't changed the code but I'm adding -fwrapv when needed.

-- 
   __("<         Marcin Kowalczyk
   \__/       ······@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/
From: Pascal Bourguignon
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <87k6h3ja5y.fsf@thalassa.informatimago.com>
···@zedat.fu-berlin.de (Stefan Ram) writes:

> Pascal Bourguignon <····@mouse-potato.com> writes:
>>(let ((p (logand (1- (expt 2 32)) (* x y))))
>
>   This might assume that the arithmetics of C were defined by
>   the implementation used. The usual specification, however, is
>   ISO/IEC 9899:1999 (E). This does not prescribe (expt 2 32) for
>   int arithmetics, but instead only that INT_MAX be at least
>   +32767 and INT_MIN be at most -32767. An implementation even
>   is free to support "integer_overflow" values for signed types,
>   I quote from ISO/IEC 9899:1999 (E):
>
>       H.2.2  Integer types
>
>       [#1] The signed C integer types int, long int, long long
>       int, and the corresponding unsigned types are compatible
>       with LIA-1. If an implementation adds support for the
>       LIA-1 exceptional values ``integer_overflow'' and
>       ``undefined'', then those types are LIA-1 conformant
>       types. C's unsigned integer types are ``modulo'' in the
>       LIA-1 sense in that overflows or out-of-bounds results
>       silently wrap. An implementation that defines signed
>       integer types as also being modulo need not detect integer
>       overflow, in which case, only integer divide-by-zero need
>       be detected.
>
>   So the output of the C program
>
> #include <stdio.h>
> int fact(int x){if(1<x){return(x*fact(x-1));}else{return(1);}}
> int main(void){int i;for(i=0;i<30;i++){printf("%d!=%d\n",i,fact(i));}return(0);}
>
>   is not specified by ISO/IEC 9899:1999 (E). Therefore, it is
>   impossible to translate into any language with more specific
>   semantics, unless one first adds additional specifications
>   regarding those details that are not specified by ISO/IEC
>   9899:1999 (E) or one constrains oneself to those programs
>   whose output is specified (determined) by ISO/IEC 9899:1999 (E).

Totally correct (and I was conscious of this while writing my
examples).  But most C programmers write their programs against a
given implementation, which adds complexity to determining the
meaning of a C source.

Actually, I think it would be easier to try to recover the meaning of
a program from the binary than from the C sources.  At least the
processors are mostly formally specified. (Some have bugs and some
have undocumented opcodes, but the latter are rarely used by
compilers.)

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCS d? s++:++ a+ C+++ UL++++ P--- L+++ E+++ W++ N+++ o-- K- w--- 
O- M++ V PS PE++ Y++ PGP t+ 5+ X++ R !tv b+++ DI++++ D++ 
G e+++ h+ r-- z? 
------END GEEK CODE BLOCK------
From: Frank Buss
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <192774d7qamn.m46xirwvig2f.dlg@40tude.net>
Stefan Ram wrote:

>   is not specified by ISO/IEC 9899:1999 (E). Therefore, it is
>   impossible to translate into any language with more specific
>   semantics, unless one first adds additional specifications
>   regarding those details that are not specified by ISO/IEC
>   9899:1999 (E) or one constrains oneself to those programs
>   whose output is specified (determined) by ISO/IEC 9899:1999 (E).

but at least you could write an automatic self-test in the Makefile which
tests the limits on which your code relies.

-- 
Frank Buß, ··@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
From: Peter Seibel
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <m21x3ep51q.fsf@gigamonkeys.com>
Sven-Olof Nystr|m <········@harpo.it.uu.se> writes:

> Ray Dillinger <····@sonic.net> writes:

>> This seems like an easier method of implementing most languages,
>> and Lisp, with its syntactic flexibility, is uniquely even more
>> suited to being a "universal" target metalanguage than, say, the
>> JVM or .NET to name a couple of the usual suspects.
>
> From a practical point of view, using Lisp as a backend has many
> advantages. However, if we are talking about a compilers course and
> one of the goals of the course is to teach how machine code is
> generated, you would miss out on that.

Well, there's no reason that you have to stop with the version that
uses Lisp as a backend. After they've written a parser for some
Algolish syntax into s-expressions, the various intermediate form
manipulations, and a Lisp backend, they could write a simple
s-expression to machine code assembler and then write a new backend
that targets that assembly language. Then, if you have time left in
the semester, let them retarget the assembler to a different
underlying machine.

-Peter

-- 
Peter Seibel           * ·····@gigamonkeys.com
Gigamonkeys Consulting * http://www.gigamonkeys.com/
Practical Common Lisp  * http://www.gigamonkeys.com/book/
From: Sven-Olof Nystr|m
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <vpoe6hio43.fsf@harpo.it.uu.se>
Peter Seibel <·····@gigamonkeys.com> writes:

> Sven-Olof Nystr|m <········@harpo.it.uu.se> writes:
>
>> Ray Dillinger <····@sonic.net> writes:
>
>>> This seems like an easier method of implementing most languages,
>>> and Lisp, with its syntactic flexibility, is uniquely even more
>>> suited to being a "universal" target metalanguage than, say, the
>>> JVM or .NET to name a couple of the usual suspects.
>>
>> From a practical point of view, using Lisp as a backend has many
>> advantages. However, if we are talking about a compilers course and
>> one of the goals of the course is to teach how machine code is
>> generated, you would miss out on that.
>
> Well, there's no reason that you have to stop with the version that
> uses Lisp as a backend. After they've written a parser for some
> Algolish syntax into s-expressions, the various intermediate form
> manipulations, and a Lisp backend, they could write a simple
> s-expression to machine code assembler and then write a new backend
> that targets that assembly language. Then, if you have time left in
> the semester, let them retarget the assembler to a different
> underlying machine.

OK, now we are discussing the best way of using Lisp in a compiler
course. That's what I thought Ray Dillinger was asking about, but from
his reply to my posting it is clear that he had other things in
mind. (Actually, the three people who responded managed to
misunderstand him in three different ways. :-))

Sure, you could include a Lisp backend, but would that be the best use
of the time? There are many other things one would like to bring into
a project (data flow analysis, optimizations, graph-coloring register
allocation).

Still, I can see some advantages to using a Lisp-oriented compiler.
Today, students leave compiler courses thinking that the best way to
implement a new language is to use lex and yacc, build
intermediate code and then translate it to machine code. The fact
is, this is a pretty awful way to implement an experimental
language. It might make sense to spend some time teaching faster ways
to arrive at an implementation of a programming language, for example
translation to a high-level language or (simply) writing an
interpreter.


Sven-Olof


-- 
Sven-Olof Nystrom 
Comp Sci Dept, Uppsala University, Box 337, S-751 05 Uppsala SWEDEN
········@csd.uu.se, phone: +46 18 471 76 91, fax: +46 18 51 19 25 

  
From: Matthias Buelow
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <3plmq9Fb4ipoU3@news.dfncis.de>
Ray Dillinger <····@sonic.net> wrote:

>This seems like an easier method of implementing most
>languages, and Lisp, with its syntactic flexibility, is
>uniquely even more suited to being a "universal" target
>metalanguage than, say, the JVM or .NET to name a couple
>of the usual suspects.
>
>Have I missed important things?  What would be hard to
>do with this approach?

The JVM (or .NET) does a lot more than just execute statements.  It
provides a complete operating environment, which Common Lisp doesn't
(how would you do network programming, for example?).
IMHO (Common) Lisp might be one of the worst backend target VMs
conceivable.  It's far too dynamic, it doesn't provide any
communication with the outside world except per-character or
per-byte file I/O, and it's monstrous; most of that monstrosity you'd
never need in a backend target, so it would be dead weight.

mkb.
From: Pascal Costanza
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <3ploisFasagkU1@individual.net>
Matthias Buelow wrote:

> The JVM (or .NET) does a lot more than just execute statements.  It
> provides a complete operating environment, which Common Lisp doesn't
> (how would you do network programming, for example)?
> IMHO (Common) Lisp might be one of the worst backend target VMs
> conceivable.  It's far too dynamic, it doesn't provide any
> communications with the outside world except per-character or
> per-byte files, and it's monstrous, most of the monstrosity you'd
> never need in a backend target and hence would be dead weight.

That's nonsense.

The typical mistake you make is to just assume the existence of the 
(ANSI) Common Lisp specification. Common Lisp _implementations_ 
typically come with foreign function interfaces, and then it becomes 
obvious how to communicate with the "outside world".

If you really want to compare Common Lisp with the JVM or .NET, you 
should compare it with the JVM _specification_, or the .NET 
_specification_. For example, do you see anything related to network 
programming or foreign function interfaces mentioned in 
http://java.sun.com/docs/books/vmspec/2nd-edition/html/VMSpecTOC.doc.html ?

"Of course, we mean the whole Java platform," I hear them say, but then 
again the comparison should be against the "whole" Common Lisp 
"platform": if you were to target Common Lisp as a virtual machine, you 
would of course choose a suitable implementation that provides the 
necessary infrastructure, ideally by using libraries that are available 
for other Common Lisp implementations as well.

Next, I don't understand how a virtual machine environment can be "too 
dynamic". For example, it's relatively straightforward to express the 
Java object model in terms of CLOS due to its flexibility; however, it's 
really tricky, probably impossible, to express the CLOS object model in 
terms of the Java or .NET object models.
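As a small illustration of the first half of that claim (class and
method names invented here): a Java instance method is just a generic
function dispatching on its first argument, so that part of Java's
object model maps onto CLOS almost directly.

```lisp
;; Hypothetical example: a Java-style class hierarchy with single
;; dispatch, expressed directly in CLOS.
(defclass shape () ())

(defclass circle (shape)
  ((radius :initarg :radius :reader radius)))

;; A Java instance method corresponds to a generic function
;; specialized on its first ("this") argument only.
(defgeneric area (this))

(defmethod area ((this circle))
  (* pi (radius this) (radius this)))
```

Going the other way, there is nothing in Java to express multiple
dispatch, method combination, or classes that change at runtime.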

Finally, do you really think that Common Lisp is more "monstrous" than 
the Java Language + the JVM + the standard Java API? Most certainly not 
in terms of specification page count!


Pascal

-- 
OOPSLA'05 tutorial on generic functions & the CLOS Metaobject Protocol
++++ see http://p-cos.net/oopsla05-tutorial.html for more details ++++
From: Matthias Buelow
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <3plvivFb1dfcU1@news.dfncis.de>
Pascal Costanza <··@p-cos.net> wrote:

>The typical mistake you make is to just assume the existence of the 
>(ANSI) Common Lisp specification. Common Lisp _implementations_ 

Well, what else is there to assume? The different implementations
aren't even compatible in much of the non-standardized stuff, which
includes everything beyond trivial by-character or by-byte I/O
to files.
Last I looked, to stay with our example, network programming
was exactly the same on every Java implementation that implemented
a certain specification (J2RE, J2EE, etc.), any implementation bugs
notwithstanding, of course.

>"Of course, we mean the whole Java platform," I hear them say, but then 
>again the comparison should be against the "whole" Common Lisp 
>"platform": If you would target Common Lisp as a virtual machine, you 

Ah. What's the "whole Common Lisp" platform then? Which particular
implementation do you have in mind? Or all of them?

>would of course choose a suitable implementation that provides the 

Ah. "Suitable". Of course.

>Next, I don't understand how a virtual machine environment can be "too 
>dynamic". For example, it's relatively straightforward to express the 
>Java object model in terms of CLOS due to its flexibility, however, it's 
>really tricky, probably impossible to express the CLOS object model in 
>terms of the Java or .NET object models.

What's your point?  At least I haven't proposed to use Java (or .NET) as
a backend for Lisp (although people are doing this, with varying degrees
of difficulty).  Likewise, I don't recommend doing it the other way
round.  Plus, let's face it, a well-written VM in C or C++ is a lot more
portable than one written for a particular Lisp implementation (the
"unportedness" of the Sun JDK isn't so much a technical issue as a
policy and licensing one.)

>Finally, do you really think that Common Lisp is more "monstrous" than 
>the Java Language + the JVM + the standard Java API? Most certainly not 
>in terms of specification page count!

I surely won't go into comparisons between Lisp and Java, because
first they are too different, and second I don't quite think it would
be worth the effort, although I do think that Common Lisp is more
monstrous indeed.  It's not just sheer size (the Java language itself is
small and simple, and the richness of the Java libraries is an argument
in Java's favor, while the "richness" of Common Lisp is mostly ill-matched
hacks upon hacks: see the many redundant functions that do mostly the
same thing, are weirdly named, and each expect arguments in a different
order than their cousins, or take LOOP, or read macros, etc.) but how the
functionality is organized.

mkb.
From: Pascal Costanza
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <3pm34uFb2poaU1@individual.net>
Matthias Buelow wrote:
> Pascal Costanza <··@p-cos.net> wrote:
> 
>>The typical mistake you make is to just assume the existence of the 
>>(ANSI) Common Lisp specification. Common Lisp _implementations_ 
> 
> Well, what else is there to assume? The different implementations
> aren't even compatible in much of the non-standardized stuff, which
> includes everything more than trivial by-character or by-byte i/o
> to files.
> Last I looked, when we stay with our example, network programming
> was exactly the same on every Java implementation that implemented
> a certain specification (J2RE, J2EE, etc.), any implementation bugs
> notwithstanding, of course.
> 
>>"Of course, we mean the whole Java platform," I hear them say, but then 
>>again the comparison should be against the "whole" Common Lisp 
>>"platform": If you would target Common Lisp as a virtual machine, you 
> 
> Ah. What's the "whole Common Lisp" platform then? Which particular
> implementation do you have in mind? Or all of them?

The context of this discussion (as proposed by the OP) is using Lisp for 
implementing one's own languages.

I have been involved both in implementing languages for Java and for 
Common Lisp. Our recent effort, ContextL, runs in seven different Common 
Lisp implementations on a multitude of operating systems. By "runs" I 
mean "runs". I have done the bulk of the implementation myself. Some 
parts of the implementation are not trivial and required good knowledge 
of the internals of Common Lisp, especially of the CLOS MOP. But it 
was _possible_ to get a decent working implementation in a comparatively 
short amount of time.

Of the various language extensions for Java I have been involved in, 
none work anymore. They only worked for a limited time frame as 
prototypes that I wouldn't recommend for "real world" use. If you look 
around the Java world at what language extensions people are 
working on, and risk a look behind the scenes, you can tell that 
it is mostly the same story everywhere.

Even in the case of widely known and successful projects like 
AspectJ, you can see that they continually have to struggle against the 
ever-changing specifications around the Java platform. Even language 
extensions blessed by Sun itself don't work as reliably as expected. 
(Google for some blog entries about usage problems wrt the recently 
introduced generic types.) And these projects throw loads of money and 
manpower at these things, without really getting them right!

>>would of course choose a suitable implementation that provides the 
> 
> Ah. "Suitable". Of course.

Of course!

>>Next, I don't understand how a virtual machine environment can be "too 
>>dynamic". For example, it's relatively straightforward to express the 
>>Java object model in terms of CLOS due to its flexibility, however, it's 
>>really tricky, probably impossible to express the CLOS object model in 
>>terms of the Java or .NET object models.
> 
> What's your point?  At least I haven't proposed to use Java (or .NET) as
> a backend for Lisp (although people are doing this, with certain degrees
> of difficulties).

Recall the topic of this thread: Choosing a platform as a target for 
one's own language. It's likely (or at least should be the case!) that 
your language extension is somewhat original. Java doesn't allow you to 
be very original. Implementing CLOS in terms of the Java object model is 
just one example. In the case of .NET, which purportedly is language 
independent, you can see that almost all languages that have attempted to 
target .NET needed to compromise in some respect. Sorry, that's not a 
good target environment if you have to change the semantics of your 
language just to make it work at all.

> Likewise, I don't recommend doing it the other way
> round.  Plus, let's face it, a well-written VM in C or C++ is a lot more
> portable than any written for a particular Lisp implementation (the
> "unportedness" of the Sun JDK isn't so much a technical but more of a
> policy and licensing issue.)

Nonsense. In our projects we have gained considerable experience 
changing the original Sun JVM implementation to make our language 
extensions work. Since Sun's JVM implementation is a moving target, we 
have switched to the Kaffe VM, which is implemented in C, for a 
different language extension. I don't know what experience you base your 
claims on, but what you say is just nonsense.

In yet another project, we opted for transforming Java bytecode to 
implement our language extensions, instead of changing the JVM directly. 
(That was what later on led to JMangler.) Again, we ran into walls 
in several respects.

Targeting the JVM to implement a language that deviates from the Java 
model in non-trivial ways is just not a very good idea, to put it mildly.

>>Finally, do you really think that Common Lisp is more "monstrous" than 
>>the Java Language + the JVM + the standard Java API? Most certainly not 
>>in terms of specification page count!
> 
> I surely won't go into comparisons between Lisp and Java, because
> firstly they are too different,

But you _have_ gone into comparisons between Lisp and Java!

> and second I don't quite think it would
> be worth the effort, although I do think that Common Lisp is more
> monstrous indeed.  It's not just sheer size (the Java language itself is
> small and simple, the richness of the Java libraries is a pro-argument
> for Java, while the "richness" of Common Lisp is mostly ill-matched
> hacks upon hacks, see the many redundant functions that do mostly the
> same, are weirdly named and each expect arguments in a different order
> than their cousins, or take LOOP, or read macros, etc.) but how the
> functionality is organized.

Sorry, I cannot take seriously people who claim that Java is a simple 
language. Java as of JDK 1.0 was simple, but that was long ago. You're 
really in need of a reality check.


Pascal

-- 
OOPSLA'05 tutorial on generic functions & the CLOS Metaobject Protocol
++++ see http://p-cos.net/oopsla05-tutorial.html for more details ++++
From: David Steuber
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <87k6h5ly1j.fsf@david-steuber.com>
Pascal Costanza <··@p-cos.net> writes:

> Finally, do you really think that Common Lisp is more "monstrous" than
> the Java Language + the JVM + the standard Java API? Most certainly
> not in terms of specification page count!

Not to mention memory footprint or LOCs.

I've complained about the size of an SBCL executable deliverable in
the past.  I recently tried to deliver one to Apress.  The tgz file is
a little over 7MB.  That is a heck of a lot smaller than simple J2SE
applications I see delivered when the runtime is included.

I've already eaten crow over SBCL.  I know delivering apps with it is
still a FAQ, but it turns out to be trivial.

-- 
You should maybe check the chemical content of your breakfast
cereal. --- Bill Watterson
From: Alex Mizrahi
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <43367fee$0$49009$14726298@news.sunsite.dk>
(message (Hello 'Ray)
(you :wrote  :on '(Sat, 24 Sep 2005 01:14:14 GMT))
(

 RD> Have I missed important things?  What would be hard to
 RD> do with this approach?

nobody except morons used to Visual Basic needs custom syntax, so the Lisp
reader can be used, and new languages can be implemented as Common Lisp macros.

)
(With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
"People who lust for the Feel of keys on their fingertips (c) Inity")
From: Oleg A. Paraschenko
Subject: Re: Lex, Yacc, and Lisp...
Date: 
Message-ID: <20050926030155.1d790e55.usenet@datahansa.com>
  Hello Ray,

On Sat, 24 Sep 2005 01:14:14 GMT
Ray Dillinger <····@sonic.net> wrote:

> 
> I had a random thought the other day.  Lisp could easily
> be the "back end" or "executable intermediate representation"
> for most language compilers.
> 
> ...
> 
> Have I missed important things?

  I think not. For the last year I have been advocating that Scheme can
be considered a virtual machine. I also work on reusable and embeddable
XPath and XSLT implementations. No problems so far.

>  What would be hard to
> do with this approach?
> 
> 				Bear
> 
> 

-- 
Oleg Paraschenko  ····@ http://uucode.com/
http://uucode.com/blog/  Generative Programming, XML, TeX, Scheme