macro: compiled & interpreted

From: Vladimir Zolotykh
Subject: macro: compiled & interpreted
Date: Tue, 12 Feb 2002 17:40:00 +0000
Message-ID: <3C695370.AEC2CAAD@eurocom.od.ua>

y Apologize for disturbing you on the same (almost) subject.

Would you mind to consider

(defmacro nthb (n list)
  `(if (= ,n 0)
     (car ,list)
     (nthb (- ,n 1) (cdr ,list))))

[taken from "On Lisp" by P.Graham pp. 140]

He says 'The dangerous thing about recursive macros like nthb is that
is that they usually work fine under the interpreter. Then when you
finally have your program working and you try to compile it, it won't
even compile. Not only that, but there will usually be no indication
that the problem is due to a recursive macro; the compile will simply
to into an infinite loop and leave you to figure out what went wrong'.

This subtle difference between compiling macro and interpreting one
interest�� me so I've tried to learn it better. I've read CLHS

  3.1.2.1.2.2 Macro Forms
  3.2.2.2 Minimal Compilation

Then made some experiments. That I saw I can't explain nor according
to Graham's sayings neither according to CLHS (though it is very
likely I just didn't understand it properly). 

ACL:

  o It seems ACL has no troubles with any case. It successfully
    compiles file, executes compiled code, also interpreted one
    without any difficulties.

CMUCL

  o Different figure. Also compiler doesn't go into infinite loop
    but in both cases (compiled and interpreted) attempt to call nthb
    leads to infinite loop.

Though I think nthb doesn't suit well to be a macro the difference
between compiling macros and interpreting one seems to me important
enough to spend some time to learn it better.

-- 
Vladimir Zolotykh

Re: macro: compiled & interpreted Tim Bradshaw
Re: macro: compiled & interpreted Kent M Pitman
- Re: macro: compiled & interpreted Vladimir Zolotykh
  - Re: macro: compiled & interpreted Pierre R. Mai
- Re: macro: compiled & interpreted Vladimir Zolotykh
  - Re: macro: compiled & interpreted Kent M Pitman
    - Constant folding (was Re: macro: compiled & interpreted) Daniel Barlow
    - Re: macro: compiled & interpreted Vladimir Zolotykh
    - Re: macro: compiled & interpreted Aleksandr Skobelev
      - Re: macro: compiled & interpreted Christophe Rhodes
        Re: macro: compiled & interpreted Marco Antoniotti
        Re: macro: compiled & interpreted Christophe Rhodes
        Re: macro: compiled & interpreted Raymond Toy
        Re: macro: compiled & interpreted Aleksandr Skobelev
        Re: macro: compiled & interpreted Kaz Kylheku
      - Re: macro: compiled & interpreted Julian Stecklina

From: Tim Bradshaw
Subject: Re: macro: compiled & interpreted
Date: Tue, 12 Feb 2002 18:37:26 +0000
Message-ID: <ey3g046eec9.fsf@tfeb.org>

* Vladimir Zolotykh wrote:

> (defmacro nthb (n list)
>   `(if (= ,n 0)
>      (car ,list)
>      (nthb (- ,n 1) (cdr ,list))))

> Then made some experiments. That I saw I can't explain nor according
> to Graham's sayings neither according to CLHS (though it is very
> likely I just didn't understand it properly). 

> ACL:

>   o It seems ACL has no troubles with any case. It successfully
>     compiles file, executes compiled code, also interpreted one
>     without any difficulties.

> CMUCL

>   o Different figure. Also compiler doesn't go into infinite loop
>     but in both cases (compiled and interpreted) attempt to call nthb
>     leads to infinite loop.

I think you are not doing the right thing, if you get code to compile.
You need not only to compile the macro definition (which will probably
be fine), but compile a function which uses the macro (which will
not).

For instance try to compile (using the above definition of NTHB)

        (defun foo (n l) (nthb n l))

and the compiler will expire at some point.

The issue is that a macro can't make use of the *runtime* values of
the arguments it's given, it only has access to whatever the system
knows at macro expansion time.  So if it attempts to recurse on
information which can not be known until runtime then bad things will
happen.  In particular macros just get the literal `source text' (this
is a very bad term in the context of Lisp...) of their arguments.

So in this case, in the function FOO, the first stage of expansion is
something like this:

        (nthb n l) ==> (if (= n 0) (car l) (nthb (- n 1) (cdr l)))

and then this expands to:

        (if (= n 0) (car l) 
            (if (= (- n 1) 0) 
                (car (cdr l))
                (nthb (- (- n 1) 1) (cdr (cdr l)))))

and you can see what is going to happen here.

--tim

From: Kent M Pitman
Subject: Re: macro: compiled & interpreted
Date: Tue, 12 Feb 2002 19:50:43 +0000
Message-ID: <sfwr8nqh430.fsf@shell01.TheWorld.com>

Vladimir Zolotykh <······@eurocom.od.ua> writes:

> y Apologize for disturbing you on the same (almost) subject.
> 
> Would you mind to consider
> 
> (defmacro nthb (n list)
>   `(if (= ,n 0)
>      (car ,list)
>      (nthb (- ,n 1) (cdr ,list))))
> 
> [taken from "On Lisp" by P.Graham pp. 140]
> 
> He says 'The dangerous thing about recursive macros like nthb is that
> is that they usually work fine under the interpreter. Then when you
> finally have your program working and you try to compile it, it won't
> even compile. Not only that, but there will usually be no indication
> that the problem is due to a recursive macro; the compile will simply
> to into an infinite loop and leave you to figure out what went wrong'.
> 
> This subtle difference between compiling macro and interpreting one
> interest�� me so I've tried to learn it better. I've read CLHS
> 
>   3.1.2.1.2.2 Macro Forms
>   3.2.2.2 Minimal Compilation
> 
> Then made some experiments.

Well, you can't just do experiments without knowing some things about your
Lisp.  No Lisp is required to have an interpreter. EVAL (and, by implication,
the interactive read-eval-print loop) might do
 (defun eval (x) (funcall (compile nil `(lambda () ,x))))

ALSO, even in an interpreted Lisp, the interpreter _might_ preprocess the
entire expression before begioning to interpret it.  Personally, I call this
compilation also, since it satisfies "minimal compilation", but this is not
a big deal.

The situation Paul is talking about is that it is permissible to create an
interpreter which delays expansion until execution time in the interpreter.
In that case, EVAL will do something like this:

(defun interpret (exp env)
  (typecase exp
    (cons (destructuring-bind (op . args) exp
            (case op
              ((quote) (first args))
              ...handle other special operators...
              (otherwise
                (cond ((macro-function op env)
                       (interpret (macroexpand-1 exp env) env))
                      (t
                       (error "Invalid form: ~S" exp)))))))
    (symbol ...handle variables and symbol macros...)
    (otherwise exp)))

Because of this, and because

 (macroexpand-1 '(nthb 5 x))

 => (if (= 5 0)
        (car x)
        (nthb (- 5 1) (cdr x)))

INTERPRET will be re-called on this expression, failing on (= 5 0) and
trying (nthb (- 5 1) (cdr x)).

 (macroexpand-1 '(nthb (- 5 1) (cdr x)))

 => (if (= (- 5 1) 0)
        (car (cdr x))
        (nthb (- (- 5 1) 1) (cdr (cdr x))))

[What an EXTREMELY BADLY WRITTEN example.]  This will again fail because
4 is not 0.  But eventually you'll get to

 (macroexpand-1 '(nthb (- (- (- (- (- 5 1) 1) 1) 1) 1)
                       (cdr (cdr (cdr (cdr (cdr x)))))))

 => (if (= (- (- (- (- (- 5 1) 1) 1) 1) 1) 0)
        (car (cdr (cdr (cdr (cdr (cdr x))))))
        (nthb (- (- (- (- (- (- 5 1) 1) 1) 1) 1) 1) 
              (car (cdr (cdr (cdr (cdr (cdr x))))))))

It would be really awful if the else clause here got expanded since there
is nothing to ever terminate it, but in a true interpreter, the branch would
go to the 'then' not the 'else' and only that part wouuld be evaluated would
be the 'then', and that requires no expansion.  So you'd win.

> That I saw I can't explain nor according
> to Graham's sayings neither according to CLHS (though it is very
> likely I just didn't understand it properly). 
> 
> ACL:
> 
>   o It seems ACL has no troubles with any case. It successfully
>     compiles file, executes compiled code, also interpreted one
>     without any difficulties.
>
> CMUCL
> 
>   o Different figure. Also compiler doesn't go into infinite loop
>     but in both cases (compiled and interpreted) attempt to call nthb
>     leads to infinite loop.
> 
> Though I think nthb doesn't suit well to be a macro the difference
> between compiling macros and interpreting one seems to me important
> enough to spend some time to learn it better.

If these are "succeeding" at all, it's probably because they are doing
constant folding in some cases, and only compiling the branch that they
can tell from the constant folding will get taken.  You don't say what
your test case is.

If you were to have used a variable first argument to nthb instead of
a constant, I can't believe it wouldn't blow up in the compiler.  But
then, I didn't try it.

From: Vladimir Zolotykh
Subject: Re: macro: compiled & interpreted
Date: Wed, 13 Feb 2002 09:30:02 +0000
Message-ID: <3C6A321A.DAD0023D@eurocom.od.ua>

Thank you Kent, Tim.

I feel myself more comfortable on this matter. Of course Tim were
right, I were doing not right thing. Even I haven't tried to compile

  (defun foo () (nthb 1 '(a b c)))

When I done, the both compilers (ACL and CMUCL) went into infinite
loop. No problem with interpreted code in ACL and it is possible (if I
understand Kent's explanation properly) because interpreter
intersperse evaluation and expansion and that way it get THEN clause
at some time. Compiler on the other side do only expansion and do that
forever. As Kent said 'No Lisp is required to have an interpreter' so
CMUCL might compile even interpreted code so it loops indefinitely on
it.


-- 
Vladimir Zolotykh

From: Pierre R. Mai
Subject: Re: macro: compiled & interpreted
Date: Thu, 14 Feb 2002 00:21:36 +0000
Message-ID: <87zo2coqun.fsf@orion.bln.pmsf.de>

Vladimir Zolotykh <······@eurocom.od.ua> writes:

> forever. As Kent said 'No Lisp is required to have an interpreter' so
> CMUCL might compile even interpreted code so it loops indefinitely on
> it.

Actually CMU CL does one of two things in its interpreter:

a) There is a very simple _partial_ interpreter, which picks of easy
   cases (e.g. function application), and executes those directly, as
   one would expect of a "traditional" interpreter.

   This interpreter is always present, even in lisps which don't
   include the compiler.

   For all non-easy cases, it calls out to the "real" interpreter:

b) The "real" interpreter is only present if the compiler is present,
   since it uses the first stages of the compiler, to convert the form
   to the IR1 internal representation (and run certain cheap
   optimizations on that), which is then interpreted by the "real"
   interpreter.

   Because of the conversion process, this interpreter will behave
   like a compiler, when it comes to things like macro-expansion
   (i.e. all macros are expanded before interpretation starts).

Regs, Pierre.

-- 
Pierre R. Mai <····@acm.org>                    http://www.pmsf.de/pmai/
 The most likely way for the world to be destroyed, most experts agree,
 is by accident. That's where we come in; we're computer professionals.
 We cause accidents.                           -- Nathaniel Borenstein

From: Vladimir Zolotykh
Subject: Re: macro: compiled & interpreted
Date: Thu, 14 Feb 2002 09:39:15 +0000
Message-ID: <3C6B85C2.F1CE30C6@eurocom.od.ua>

Kent,

In replies to my recent questions you twice mentioned "constant folding".
Is this used now in ACL, CMUCL for example ? Would you mind to give me the
idea what it is or suggest some reference to read about ?

-- 
Vladimir Zolotykh

From: Kent M Pitman
Subject: Re: macro: compiled & interpreted
Date: Thu, 14 Feb 2002 10:17:10 +0000
Message-ID: <sfw4rkk8j15.fsf@shell01.TheWorld.com>

Vladimir Zolotykh <······@eurocom.od.ua> writes:

> In replies to my recent questions you twice mentioned "constant folding".
> Is this used now in ACL, CMUCL for example ? Would you mind to give me the
> idea what it is or suggest some reference to read about ?

It's just an optimization technique.  You can't force it unless there's a
system-specific way for you to do it.  Of course, adjusting the OPTIMIZE
qualities can help make it more likely.  It's basically what causes
 (+ x 1 1)
to get compiled as 
 (+ x 2)
or sometimes what causes
 (let ((x 1))
   (+ x 1))
to get compiled as a constant 2.

I'm sure CMUCL does it aggressively under a great many circumstances.
I know LispWorks does a lot of it.  Probably ACL does, too.  Most quality
commercial implementations do, though it can vary where it does and doesn't
happen.  You can often check that it's happened with DISASSEMBLE,

From: Daniel Barlow
Subject: Constant folding (was Re: macro: compiled & interpreted)
Date: Thu, 14 Feb 2002 10:39:57 +0000
Message-ID: <87sn842vpe.fsf_-_@noetbook.telent.net>

Kent M Pitman <······@world.std.com> writes:

> I'm sure CMUCL does it aggressively under a great many circumstances.
> I know LispWorks does a lot of it.  Probably ACL does, too.  Most quality
> commercial implementations do, though it can vary where it does and doesn't
> happen.  You can often check that it's happened with DISASSEMBLE, 

There is a circumstance I came across the other week where you can
actually see it happen in CMUCL: 

* (compile nil (lambda () (/ 1 0)))
Compiling LAMBDA NIL: 

In: LAMBDA NIL
  (/ 1 0)
Warning: Lisp error during constant folding:
Arithmetic error DIVISION-BY-ZERO signalled.
Operation was KERNEL::DIVISION, operands (1 0).

Compiling Top-Level Form: 

Compilation unit finished.
  1 warning


#<Function "LAMBDA NIL" {480AF731}>
T
T


-dan

-- 

  http://ww.telent.net/cliki/ - Link farm for free CL-on-Unix resources

From: Vladimir Zolotykh
Subject: Re: macro: compiled & interpreted
Date: Thu, 14 Feb 2002 12:07:49 +0000
Message-ID: <3C6BA895.1529A06C@eurocom.od.ua>

Kent M Pitman wrote:
> 
> or sometimes what causes
>  (let ((x 1))
>    (+ x 1))
> to get compiled as a constant 2.

Of course to be able to do such things Lisp has to look ahead and
expand macros just to see what it can do with it.  This adds something
to my understanding.


-- 
Vladimir Zolotykh

From: Aleksandr Skobelev
Subject: Re: macro: compiled & interpreted
Date: Fri, 15 Feb 2002 08:23:39 +0000
Message-ID: <bigi4a.bt1.ln@hermit.athome>

Kent M Pitman <······@world.std.com> wrote:
> Vladimir Zolotykh <······@eurocom.od.ua> writes:
> 
>> In replies to my recent questions you twice mentioned "constant folding".
>> Is this used now in ACL, CMUCL for example ? Would you mind to give me the
>> idea what it is or suggest some reference to read about ?
> 
> It's just an optimization technique.  You can't force it unless there's a
> system-specific way for you to do it.  Of course, adjusting the OPTIMIZE
> qualities can help make it more likely.  It's basically what causes
> (+ x 1 1)
> to get compiled as 
> (+ x 2)
> or sometimes what causes
> (let ((x 1))
>   (+ x 1))
> to get compiled as a constant 2.
> 
> I'm sure CMUCL does it aggressively under a great many circumstances.

Hmm. It doesn't look like it is so for CMUCL. At least in follow simple case.

--------------

CMU Common Lisp 18d-pre, level-1 built 2002-02-11 on melbourne, running on localhost
Send questions to ··········@cons.org. and bug reports to ·········@cons.org.
Loaded subsystems:
    Python native code compiler, target Intel x86
    CLOS based on PCL version:  September 16 92 PCL (f)
    Gray Streams Protocol Support
    CLX X Library MIT R5.02
    Motif toolkit and graphical debugger 1.0
    Hemlock 3.5

* (defun f1 (x)
  (declare (type (signed-byte 16) x)
           (optimize (speed 3) (safety 0) (debug 0) (space 0)))
  (+ x 1 1))
F1
* (disassemble 'f1)
Compiling LAMBDA (X): 
Compiling Top-Level Form: 

* (disassemble 'f1)
48147970:       .ENTRY F1()                  ; FUNCTION
      88:       POP   DWORD PTR [EBP-8]
      8B:       LEA   ESP, [EBP-32]
      8E:       ADD   EDX, 4                 ; No-arg-parsing entry point
      91:       ADD   EDX, 4
      94:       MOV   ECX, [EBP-8]
      97:       MOV   EAX, [EBP-4]
      9A:       ADD   ECX, 2
      9D:       MOV   ESP, EBP
      9F:       MOV   EBP, EAX
      A1:       JMP   ECX
      A3:       NOP
      A4:       NOP
      A5:       NOP
      A6:       NOP
      A7:       NOP
 
--------------

So CMUCL calls ADD twice in 8E in 91 lines. 
While it is true for Lispwork 4.2:

--------------
CL-USER 1 > (defun f1 (x)
              (declare (type (signed-byte 16) x)
                       (optimize (speed 3) (safety 0) (debug 0) (space
0)))
              (+ x 1 1))
F1

CL-USER 2 > (disassemble 'f1)
2065C932:
       0:      55               push  ebp
       1:      89E5             move  ebp, esp
       3:      FF7500           push  [ebp]
       6:      83ED04           sub   ebp, 4
       9:      8B7508           move  esi, [ebp+8]
      12:      897504           move  [ebp+4], esi
      15:      894508           move  [ebp+8], eax
      18:      B800020000       move  eax, 200
      23:      C9               leave 
      24:      E9A3805000       jmp   20B649F2         ; #<function
SYSTEM:+$FIXNUM 20B649F2>
      29:      90               nop   
NIL

--------------

And if to add a type declaration for the result of the function then it will
produce ever more short code:


CL-USER 11 > (defun f2 (x)
              (declare (type (signed-byte 16) x)
                       (optimize (speed 3) (safety 0) (debug 0) (space
0)))
              (the (signed-byte 16) (+ x 1 1)))
F2

CL-USER 12 > (disassemble 'f2)
20661FA2:
       0:      55               push  ebp
       1:      89E5             move  ebp, esp
       3:      0500020000       add   eax, 200
       8:      FD               std   
       9:      C9               leave 
      10:      C3               ret   
      11:      90               nop   
      12:      90               nop   
      13:      90               nop   
NIL

From: Christophe Rhodes
Subject: Re: macro: compiled & interpreted
Date: Fri, 15 Feb 2002 09:22:56 +0000
Message-ID: <sq7kpft7yn.fsf@cam.ac.uk>

Aleksandr Skobelev <·········@mail.ru> writes:

> Kent M Pitman <······@world.std.com> wrote:
> > Vladimir Zolotykh <······@eurocom.od.ua> writes:
> > 
> >> In replies to my recent questions you twice mentioned "constant folding".
> >> Is this used now in ACL, CMUCL for example ? Would you mind to give me the
> >> idea what it is or suggest some reference to read about ?
> > 
> > It's just an optimization technique.  You can't force it unless there's a
> > system-specific way for you to do it.  Of course, adjusting the OPTIMIZE
> > qualities can help make it more likely.  It's basically what causes
> > (+ x 1 1)
> > to get compiled as 
> > (+ x 2)
> > or sometimes what causes
> > (let ((x 1))
> >   (+ x 1))
> > to get compiled as a constant 2.
> > 
> > I'm sure CMUCL does it aggressively under a great many circumstances.
> 
> Hmm. It doesn't look like it is so for CMUCL. At least in follow simple case.
> [ ... ]
> * (defun f1 (x)
>   (declare (type (signed-byte 16) x)
>            (optimize (speed 3) (safety 0) (debug 0) (space 0)))
>   (+ x 1 1))
> F1

You're quite right; CMUCL fails to constant-fold this. If you replace
the body with (+ 1 1 x), though, you get the following disassembly:

480F8478:       .ENTRY F1()                  ; FUNCTION
      90:       POP   DWORD PTR [EBP-8]
      93:       LEA   ESP, [EBP-32]
      96:       ADD   EDX, 8                 ; No-arg-parsing entry point
      99:       MOV   ECX, [EBP-8]
      9C:       MOV   EAX, [EBP-4]
      9F:       ADD   ECX, 2
      A2:       MOV   ESP, EBP
      A4:       MOV   EBP, EAX
      A6:       JMP   ECX

Where clearly[*] constant-folding has occurred. Why the difference?
Well, the problem with optimizing calls to + is that a decision needs
to be made at some point about where to parenthesize "x+1+1", in this
case; cmucl isn't terribly bright, and always chooses to do
(+ (+ x 1) 1)
where no constant-folding is possible.

It's actually not completely trivial to fix this failure of
optimization in CMUCL, I think, because of the different levels of
transformation, but I'm sure that if this is the sticking point in
your application we could come to terms :-)

Christophe

[*] To the newbies: CMUCL (and most other CL implementations) "tag"
small numbers with two low '0' bits, with the rest of the object being
the number left-shifted twice. So the 8 in the line labelled 96 there
is the lisp's version of 2, or (+ 1 1).
-- 
Jesus College, Cambridge, CB5 8BL                           +44 1223 510 299
http://www-jcsu.jesus.cam.ac.uk/~csr21/                  (defun pling-dollar 
(str schar arg) (first (last +))) (make-dispatch-macro-character #\! t)
(set-dispatch-macro-character #\! #\$ #'pling-dollar)

From: Marco Antoniotti
Subject: Re: macro: compiled & interpreted
Date: Fri, 15 Feb 2002 16:23:28 +0000
Message-ID: <y6cpu36pvcv.fsf@octagon.mrl.nyu.edu>

Christophe Rhodes <·····@cam.ac.uk> writes:
	...
> Where clearly[*] constant-folding has occurred. Why the difference?
> Well, the problem with optimizing calls to + is that a decision needs
> to be made at some point about where to parenthesize "x+1+1", in this
> case; cmucl isn't terribly bright, and always chooses to do
> (+ (+ x 1) 1)
> where no constant-folding is possible.
> 
> It's actually not completely trivial to fix this failure of
> optimization in CMUCL, I think, because of the different levels of
> transformation, but I'm sure that if this is the sticking point in
> your application we could come to terms :-)

Good analysis, however.  How would a C compiler constant fold

	x + 1 + 1

?

Cheers

-- 
Marco Antoniotti ========================================================
NYU Courant Bioinformatics Group        tel. +1 - 212 - 998 3488
719 Broadway 12th Floor                 fax  +1 - 212 - 995 4122
New York, NY 10003, USA                 http://bioinformatics.cat.nyu.edu
                    "Hello New York! We'll do what we can!"
                           Bill Murray in `Ghostbusters'.

From: Christophe Rhodes
Subject: Re: macro: compiled & interpreted
Date: Fri, 15 Feb 2002 16:48:35 +0000
Message-ID: <sqvgcyzo64.fsf@cam.ac.uk>

Marco Antoniotti <·······@cs.nyu.edu> writes:

> Christophe Rhodes <·····@cam.ac.uk> writes:
> 	...
> > Where clearly[*] constant-folding has occurred. Why the difference?
> > Well, the problem with optimizing calls to + is that a decision needs
> > to be made at some point about where to parenthesize "x+1+1", in this
> > case; cmucl isn't terribly bright, and always chooses to do
> > (+ (+ x 1) 1)
> > where no constant-folding is possible.
> > 
> > It's actually not completely trivial to fix this failure of
> > optimization in CMUCL, I think, because of the different levels of
> > transformation, but I'm sure that if this is the sticking point in
> > your application we could come to terms :-)
> 
> Good analysis, however.  How would a C compiler constant fold
> 
> 	x + 1 + 1
> 
> ?

Well, 

·····@lambda:~$ cat foo.c
int main (void) {
  int x;
  return x + 1 + 1;
}
(gdb) disassemble main
[ ... ]
0x8048409 <main+9>:	add    $0x2,%edx
[ ... ]

So gcc seems to manage quite well. 

Let me just amplify slightly on why CMUCL folds (+ 1 1 x) and not (+ x
1 1) at the time of writing. There are several translation layers, as
Pierre Mai described in <··············@orion.bln.pmsf.de>. In the IR1
phase of translation, there are several passes; one is
source-to-source transforms, for instance[*]:

(define-source-transform 1+ (x) `(+ ,x 1))

and another is so-called ir1-transform:

(deftransform + ((x y) (t (constant-arg t)) * :when :both)
  "fold zero arg"
  (let ((val (continuation-value y)))
    (unless (and (zerop val)
                 (not (and (floatp val) (plusp (float-sign val))))
                 (not-more-contagious y x))
      (give-up-ir1-transform)))
  'x)

where this second transform converts (+ foo <constant-arg>) to foo
when the <constant-arg> is a suitable zero.

Note that this transform applies to + with precisely two arguments.
How do we get there? Well, there's a source-transform for
more-than-2-arguments +, which transforms (via a source transform)
into a series of calls to two argument +. Only then does the
constant-folding kick in, by which time it is too late, as (+ x 1 1)
has been transformed via associate-args to (+ (+ x 1) 1). The
moral of this tale is twofold; firstly, if you're dealing with
micro-optimization it pays to know your tools; secondly, CMUCL and
SBCL are by no means perfect, and any help in their improvement is
appreciated :-)

Christophe

[*] Caveat: these are actually from SBCL, as that's what I have most
readily to hand. SBCL forked from CMUCL a couple of years ago, but the
architecture of the translating phases is very similar.
-- 
Jesus College, Cambridge, CB5 8BL                           +44 1223 510 299
http://www-jcsu.jesus.cam.ac.uk/~csr21/                  (defun pling-dollar 
(str schar arg) (first (last +))) (make-dispatch-macro-character #\! t)
(set-dispatch-macro-character #\! #\$ #'pling-dollar)

From: Raymond Toy
Subject: Re: macro: compiled & interpreted
Date: Fri, 15 Feb 2002 17:28:08 +0000
Message-ID: <4n8z9ulknr.fsf@rtp.ericsson.se>

>>>>> "Christophe" == Christophe Rhodes <·····@cam.ac.uk> writes:

    Christophe> Note that this transform applies to + with precisely two arguments.
    Christophe> How do we get there? Well, there's a source-transform for
    Christophe> more-than-2-arguments +, which transforms (via a source transform)
    Christophe> into a series of calls to two argument +. Only then does the
    Christophe> constant-folding kick in, by which time it is too late, as (+ x 1 1)
    Christophe> has been transformed via associate-args to (+ (+ x 1) 1). The
    Christophe> moral of this tale is twofold; firstly, if you're dealing with
    Christophe> micro-optimization it pays to know your tools; secondly, CMUCL and
    Christophe> SBCL are by no means perfect, and any help in their improvement is
    Christophe> appreciated :-)

I think it would be possible to modify associate-args to gather up
integer constants and place them at the front so that things get
associated in the right order.  We should only do integer constants
because floating-point arithmetic is not associative.

And of course, this only works for constants that appear in the
sources as such.  It won't help if the compiler knows that they're
constants such as defconstants.

I think this optimization is only useful for the "my compiler
micro-optimizes useless things better than yours" discussions. :-) 

Ray

From: Aleksandr Skobelev
Subject: Re: macro: compiled & interpreted
Date: Sat, 16 Feb 2002 08:44:50 +0000
Message-ID: <266l4a.do.ln@hermit.athome>

Raymond Toy <···@rtp.ericsson.se> wrote:
> ...
> 
> And of course, this only works for constants that appear in the
> sources as such.  It won't help if the compiler knows that they're
> constants such as defconstants.


Well, if to follow the recommendations of Christofhe Rhodes then CMUCL
will be able to fold `defconstanted' constants also:

* (defconstant +one+ 1)
+ONE+
* (defconstant +two+ 2)
+TWO+
* (defun f3 (x)
  (declare (type (signed-byte 16) x)
           (optimize (speed 3) (space 0) (debug 0) (safety 0)))
  (+ +one+ +two+ x))
F3
* (disassemble 'f3)
Compiling LAMBDA (X): 
Compiling Top-Level Form: 

481CC110:       .ENTRY "LAMBDA (X)"()        ; FUNCTION
      28:       POP   DWORD PTR [EBP-8]
      2B:       LEA   ESP, [EBP-32]
      2E:       ADD   EDX, 12                ; No-arg-parsing entry point
      31:       MOV   ECX, [EBP-8]
      34:       MOV   EAX, [EBP-4]
      37:       ADD   ECX, 2
      3A:       MOV   ESP, EBP
      3C:       MOV   EBP, EAX
      3E:       JMP   ECX
* 

Lispwork does the same but only in a case of (+ x +one+ +two+). It looks
like LW  has the same `difficulties' with (+ +one+ +two+ x) as CMUCL has 
with (+ x +one+ +two+).

> ...

From: Kaz Kylheku
Subject: Re: macro: compiled & interpreted
Date: Fri, 15 Feb 2002 18:53:00 +0000
Message-ID: <gOcb8.44371$Cg5.2399832@news1.calgary.shaw.ca>

In article <···············@octagon.mrl.nyu.edu>, Marco Antoniotti wrote:
>
>Christophe Rhodes <·····@cam.ac.uk> writes:
>	...
>> Where clearly[*] constant-folding has occurred. Why the difference?
>> Well, the problem with optimizing calls to + is that a decision needs
>> to be made at some point about where to parenthesize "x+1+1", in this
>> case; cmucl isn't terribly bright, and always chooses to do
>> (+ (+ x 1) 1)
>> where no constant-folding is possible.
>> 
>> It's actually not completely trivial to fix this failure of
>> optimization in CMUCL, I think, because of the different levels of
>> transformation, but I'm sure that if this is the sticking point in
>> your application we could come to terms :-)
>
>Good analysis, however.  How would a C compiler constant fold
>
>	x + 1 + 1

It could fold the 1 + 1, if it could prove that the resulting expression
will still produce the same result as the abstract semantics for all
values of x which do not lead to overflow. The abstract semantics
call for left to right addition.  If x is an integral type, this
equivalence is obvious by inspection.  If x is a floating point type,
then the folding is forbidden, because floating point arithmetic doesn't
obey the associative property. Adding 1.0 twice is not necessarily the
same thing as adding 2.0.

If subtraction is involved, compilers can still ``cheat'' by taking
advantage of the reversible overflow of integer arithmetic on two's
complement architectures. Consider x + a - b  where a and b are constants.
Under two's complement, you can precompute the a - b subtraction at
translation time, then add x at run time without caring about overflow.
You will get the same result as ((x + a) - b) in all cases that are
free of overflow. The overflow cases have undefined behavior according to
the C standard, so you are absolved from caring, unless you intend
to go beyond the language specification and provide overflow detection.

-- 
Meta-CVS: version control with directory structure versioning over top of CVS.
http://users.footprints.net/~kaz/mcvs.html

From: Julian Stecklina
Subject: Re: macro: compiled & interpreted
Date: Fri, 15 Feb 2002 13:44:56 +0000
Message-ID: <87aduaeu5j.fsf@blitz.comp.com>

Aleksandr Skobelev <·········@mail.ru> writes:

> Kent M Pitman <······@world.std.com> wrote:
> > Vladimir Zolotykh <······@eurocom.od.ua> writes:
> > 
> >> In replies to my recent questions you twice mentioned "constant folding".
> >> Is this used now in ACL, CMUCL for example ? Would you mind to give me the
> >> idea what it is or suggest some reference to read about ?
> > 
> > It's just an optimization technique.  You can't force it unless there's a
> > system-specific way for you to do it.  Of course, adjusting the OPTIMIZE
> > qualities can help make it more likely.  It's basically what causes
> > (+ x 1 1)
> > to get compiled as 
> > (+ x 2)
> > or sometimes what causes
> > (let ((x 1))
> >   (+ x 1))
> > to get compiled as a constant 2.
> > 
> > I'm sure CMUCL does it aggressively under a great many circumstances.
> 
> Hmm. It doesn't look like it is so for CMUCL. At least in follow simple case.
> 
> --------------
> 
> 
> CMU Common Lisp 18d-pre, level-1 built 2002-02-11 on melbourne,
> running on localhost Send questions to ··········@cons.org. and bug
> reports to ·········@cons.org.  Loaded subsystems: Python native
> code compiler, target Intel x86 CLOS based on PCL version: September
> 16 92 PCL (f) Gray Streams Protocol Support CLX X Library MIT R5.02
> Motif toolkit and graphical debugger 1.0 Hemlock 3.5
> 
> * (defun f1 (x)
>   (declare (type (signed-byte 16) x)
>            (optimize (speed 3) (safety 0) (debug 0) (space 0)))
>   (+ x 1 1))
> F1

[...]

I was just curious wether CLISP would optimize its bytecode:

Disassembly of function F2
(CONST 0) = 2
1 required arguments
0 optional arguments
No rest parameter
No keyword parameters
0     (CONST&PUSH 0)                      ; 2
1     (LOAD&PUSH 2)
2     (CALLSR 2 54)                       ; +
5     (SKIP&RET 2)
#<COMPILED-CLOSURE F2>

It does.

Regards,
Julian
-- 
Um meinen oeffentlichen Schluessel zu erhalten:
To get my public key:
http://math-www.uni-paderborn.de/pgp/