From: Joerg Hoehle
Subject: Macroexpansion, bindings and information flow (Iterate package)
Date: 
Message-ID: <uzn1j5v7s.fsf@users.sourceforge.net>
Hi,

I came across J.Amsterdam's Iterate package and would like to address
its main weakness: correct, lexical aware code expansion.

I believe in its over 10 years of existence, it's never been able to
correctly handle this form:
    (macrolet ((over(x) `(collect ,x)))
      (iterate
	(for n in '(1 2 3))
	(flet ((over(x)(declare (ignore x)) (collect 2)))
	  (over n)))) ; would yield (2 2 2) if correct

One reason is that it does not contain a code walker aware of lexical
bindings.

I actually wonder whether Iterate actually needs a full code walker,
thereby duplicating functionality found in every Lisp implementation,
or whether Iterate could be written without one, possibly by clever
use of MACROLET and/or *MACROEXPAND-HOOK*?

Here's one of the problems: Iterate allows a (COLLECT x INTO var)
clause to appear anywhere deep inside the body of (Iterate ...),
contrary to LOOP. As a result, it's only after COLLECT has been
encountered that Iterate knows which bindings it must create in its
top level expansion.  Information flows from deep inside to the
outermost level. How to do that?

E.g.
(iterate ... (collect x into a) ... (collect ... into b) ...)
-> `(let (a b) ... ...)
(iterate ... no-collect-at-all ...)
-> `(let () ...) or (progn ...)

Do you have an idea how such a COLLECT INTO macro could be written
without duplicating a code walker?

This seems hairy to me if one wants it to work both in interpreted
and compiled modes, since with an interpreter, the COLLECT form may
only be encountered upon evaluation of (COLLECT #) --really late.

If it's not possible without a code walker, are there portable means
to use the implementation's code walking engine? It seems a pity to me
that something like AUGMENT-ENVIRONMENT (in CLtL2) was not
standardized in ANSI-CL.
Section 8.5, Environments, of CLtL2 see
http://www.supelec.fr/docs/cltl/clm/node102.html

Thanks for your help,
	Jorg Hohle
Telekom/T-Systems Technology Center

From: Arthur Lemmens
Subject: Re: Macroexpansion, bindings and information flow (Iterate package)
Date: 
Message-ID: <opshh2vck6k6vmsw@news.xs4all.nl>
Joerg Hoehle <······@users.sourceforge.net> wrote:

> I came across J.Amsterdam's Iterate package and would like to address
> its main weakness: correct, lexical aware code expansion.

I'm not sure if Andreas Fuchs (the maintainer of Iterate) reads comp.lang.lisp,
so you could consider sending your question to the iterate mailing list on
common-lisp.net.

Arthur
From: Andreas Fuchs
Subject: Re: Macroexpansion, bindings and information flow (Iterate package)
Date: 
Message-ID: <pan.2004.11.15.14.39.31.990828@boinkor.net>
(This is the first time I've opened c.l.l in a long time, and I didn't
intend to post anything, but then... (-:)

I'm cross-posting this to the iterate-devel mailing list.

On Mon, 15 Nov 2004 10:15:35 +0100, Joerg Hoehle wrote:
> I came across J.Amsterdam's Iterate package and would like to address
> its main weakness: correct, lexical aware code expansion.
> 
> I believe in its over 10 years of existence, it's never been able to
> correctly handle this form:
>     (macrolet ((over(x) `(collect ,x)))
>       (iterate
> 	(for n in '(1 2 3))
> 	(flet ((over(x)(declare (ignore x)) (collect 2)))
> 	  (over n)))) ; would yield (2 2 2) if correct

Actually, the current version of iterate can handle that clause pretty
well (since iterate--main--1.0--patch-6, since when we pass the *env*
argument to macroexpand calls). 

What it can't handle (and what I think you meant to post, anyway) are
nested macrolet clauses, like:

     (iterate
       (macrolet ((over(x) `(collect ,x)))
       (for n in '(1 2 3))
       (flet ((over(x)(declare (ignore x)) (collect 2)))
         (over n)))) ; would yield (2 2 2) if correct

because the code walker does not know how to extend the environment with
this nested macrolet. 

> One reason is that it does not contain a code walker aware of lexical
> bindings.

Right. I think that's partly due to the fact that it's impossible to have
a macrolet (and symbol-macrolet)-aware code walker in portable CL without
reimplementing half of common lisp. (-:

> I actually wonder whether Iterate actually needs a full code walker,
> thereby duplicating functionality found in every Lisp implementation,
> or whether Iterate could be written without one, possibly by clever
> use of MACROLET and/or *MACROEXPAND-HOOK*?

An iterate that doesn't code walk could end up being more friendly to its
maintainer. Iterate in its current form relies very heavily on the code
walker, so you'd end up re-implementing most of it from scratch (probably
not such a bad thing, either (-:)

*macroexpand-hook*, now there's an idea. From the code walker, it might be
possible to put a closure in there which can expand the macrolet by
itself. That would be interesting to try out (you'd still have to create a
macro function from the macrolet's definition. hmm)

> Here's one of the problems: Iterate allows a (COLLECT x INTO var)
> clause to appear anywhere deep inside the body of (Iterate ...),
> contrary to LOOP. As a result, it's only after COLLECT has been
> encountered that Iterate knows which bindings it must create in its
> top level expansion.  Information flows from deep inside to the
> outermost level. How to do that?
> 
> E.g.
> (iterate ... (collect x into a) ... (collect ... into b) ...)
> -> `(let (a b) ... ...)
> (iterate ... no-collect-at-all ...)
> -> `(let () ...) or (progn ...)
> 
> Do you have an idea how such a COLLECT INTO macro could be written
> without duplicating a code walker?

Unfortunately, ANSI CL doesn't specify the order in which macros are
expanded, but if it did, you /might/ be able to signal conditions which
the outer *macroexpand-hook* functions could know how to treat. All this
is highly speculative, but it would be a pretty neat hack. (-:

> This seems hairy to me if one wants it to work both in interpreted
> and compiled modes, since with an interpreter, the COLLECT form may
> only be encountered upon evaluation of (COLLECT #) --really late.

Right. That's probably part of the reason why ANSI doesn't specify
macroexpand order.

> If it's not possible without a code walker, are there portable means
> to use the implementation's code walking engine? It seems a pity to me
> that something like AUGMENT-ENVIRONMENT (in CLtL2) was not
> standardized in ANSI-CL.
> Section 8.5, Environments, of CLtL2 see
> http://www.supelec.fr/docs/cltl/clm/node102.html

It seems like most implementations offer code walker support similar
to ClTl2's. While not ANSI CL, these might be your best bet.

Cheers,
-- 
Andreas Fuchs, <···@boinkor.net>, ···@jabber.at, antifuchs
From: Kalle Olavi Niemitalo
Subject: Re: Macroexpansion, bindings and information flow (Iterate package)
Date: 
Message-ID: <87sm70mfm2.fsf@Astalo.kon.iki.fi>
Andreas Fuchs <···@boinkor.net> writes:

> It seems like most implementations offer code walker support similar
> to ClTl2's. While not ANSI CL, these might be your best bet.

I think not even a complete code walker can solve Iterate.
Consider:

  (symbol-macrolet ((paradox t))
    (macrolet ((when-symbol-macro (form &environment env)
                 (when (nth-value 1 (macroexpand-1 'paradox env))
                   form)))
      (iterate (for x in '(1 2 3))
               (when-symbol-macro
                 (collect x into paradox)))))

If Iterate binds PARADOX as a variable, then that binding hides
the symbol macro, and WHEN-SYMBOL-MACRO expands to NIL, so
Iterate shouldn't see the COLLECT clause and has no reason to
bind PARADOX.  But in that case, the symbol macro remains visible
and WHEN-SYMBOL-MACRO passes the COLLECT clause through, so
Iterate should bind PARADOX after all.

I don't think any standard Common Lisp constructs have this problem.
From: Toomas Altosaar
Subject: Re: Macroexpansion, bindings and information flow (Iterate package)
Date: 
Message-ID: <3ef90b62.0411232104.35aaa1ce@posting.google.com>
I think not even a complete code walker can solve Iterate.
>... 
> I don't think any standard Common Lisp constructs have this problem.

If this is so, does that imply that Iterate's specification has too
many "degrees of freedom"?
From: Kalle Olavi Niemitalo
Subject: Re: Macroexpansion, bindings and information flow (Iterate package)
Date: 
Message-ID: <87pt20mq3x.fsf@Astalo.kon.iki.fi>
···············@hut.fi (Toomas Altosaar) writes:

> If this is so, does that imply that Iterate's specification has
> too many "degrees of freedom"?

I don't know if that term applies to discrete systems.  Even if
it does, it seems the very number of degrees of freedom in
Iterate is ill-specified.

At first, I thought the binding problem could be solved by first
walking the body once with an environment that doesn't include
any accumulation variables, selecting the variables, and then
walking the body again with the resulting environment and
verifying that the same variables get selected again.  However,
this scheme does not allow a macro in the body to expand to a
form that accumulates to a gensym variable.  One could just
forbid that but I think it would feel unobvious to the
programmer.

I see the following ways to resolve the situation:

(a) Specify that any macros expanding to accumulation forms must
    not depend on whether the accumulation variables are already
    in the environment.  This does not require any change in the
    Iterate implementation.  However, I am not sure it is not
    possible to violate this requirement inadvertently.

(b) Require the programmer to list all the accumulation variables
    with an (:into a b c) form at the top of the body.  Signal an
    error if the code attempts to accumulate into any other
    variables.  This too would be incompatible with gensyms
    chosen in the body, but at least that would be obvious to the
    programmer.  On the plus side, this would allow a nested
    Iterate form to accumulate to a variable bound by the outer
    Iterate form.

(c) Deduce the accumulation variables from the body, but not by
    expanding macros.  Instead, define Iterate-specific walkers
    for all standard CL macros, and give the programmer a way to
    define walkers for her own macros.  The walkers must then
    satisfy the requirement of (a) but that is not a big problem
    because they are already specific to Iterate.  This would
    also ensure that the bindings made by

      (iterate (case t
                 (() (:collect t :into y))))

    will not depend on how smart the CASE macro is trying to be.
From: Thomas F. Burdick
Subject: Re: Macroexpansion, bindings and information flow (Iterate package)
Date: 
Message-ID: <xcvk6s5lqrz.fsf@conquest.OCF.Berkeley.EDU>
Kalle Olavi Niemitalo <···@iki.fi> writes:

> At first, I thought the binding problem could be solved by first
> walking the body once with an environment that doesn't include
> any accumulation variables, selecting the variables, and then
> walking the body again with the resulting environment and
> verifying that the same variables get selected again.  However,
> this scheme does not allow a macro in the body to expand to a
> form that accumulates to a gensym variable.  One could just
> forbid that but I think it would feel unobvious to the
> programmer.
> 
> I see the following ways to resolve the situation:

I haven't been following this discussion closely enough, so sorry if
it's been covered, but couldn't you have the ITERATE macro bind an
accumulation "environment"?  Make it basically an alist of
(var-name . accumulator), and an accumulation macro could add its
variable to the alist if it isn't there already.  Or are these
accumulation variables supposed to also be available as variables to
the user?  In that case, I'm pretty sure it can't be done without a
code walker.
From: Joerg Hoehle
Subject: Re: Macroexpansion, bindings and information flow (Iterate package)
Date: 
Message-ID: <uoeh74d3v.fsf@users.sourceforge.net>
Hi,

> Kalle Olavi Niemitalo <···@iki.fi> writes:
> I don't think any standard Common Lisp constructs have this problem.
I mean you want to say that no CL macro needs macroexpand to produce its
expansion (except maybe for compiler-let/macrolet tricks)?
That's also why LOOP clauses appear at top-level.

> > I see the following ways to resolve the situation:
> (b) Require the programmer to list all the accumulation variables
>     with an (:into a b c) form at the top of the body.

Me too sees that requiring collection variables be declared at
top-level should not cause much harm to users, and also not prevent
gensym'ed variables (they already have to be named twice since
(collect foo into (gensym)) is quite nonsense, you want the value).

It would be an Iterate rather different from the one we know. But it
might be worthwhile if it can then work without macroexpanding its body.

I haven't investigated whether that would be the only change from the
current Iterate though. Currently, quite a few of Iterate forms lead
to the creation of internal variables (but internal means they may not
affect macroexpansion).

···@conquest.OCF.Berkeley.EDU (Thomas F. Burdick) writes:
> I haven't been following this discussion closely enough, so sorry if
> it's been covered, but couldn't you have the ITERATE macro bind an
> accumulation "environment"?  Make it basically an alist of
> (var-name . accumulator), and an accumulation macro could add its
> variable to the alist if it isn't there already.  Or are these
> accumulation variables supposed to also be available as variables to
> the user?  In that case, I'm pretty sure it can't be done without a
> code walker.

They are available to the user as lexical variables (or at least
symbol-macrolet) like loop variables so you can write
 (finally (return x)). That's why an internal plist like you propose
doesn't seem to be enough.
See Message-ID: <·············@users.sourceforge.net>
Subject: macro flow from inside to outside
where I raised this particular question :-)
of how to possibly get lexical bindings for something named only deep
within the body.

I believe Iterate uses lexical variables because that's what's felt
efficient. a p-list of name/values would have lookup cost at
run-time. Ten years ago, that would have been reason enough not to
advertise Iterate as a good replacement for LOOP/DO etc.

However, I've never seen a documented restriction on loop variables.
E.g. in both Iterate-1.0.9 and CLISP's LOOP, this does not
work as expected:
(loop initially (setq x (list 'a 'b)) ; try and prefill x
      for i in '(1 2 3)
      collect i into x
      finally (return x))
I don't know whether this is correct (portable) code.

BTW, drop me a note if you want to test my version of Iterate. I did
nearly two dozen small changes and bugfixes to Iterate-1.0.9.

Regards,
	Jorg Hohle
Telekom/T-Systems Technology Center
From: Joerg Hoehle
Subject: Re: Macroexpansion, bindings and information flow (Iterate package)
Date: 
Message-ID: <uu0r347j1.fsf@users.sourceforge.net>
Kalle Olavi Niemitalo <···@iki.fi> writes:
> Consider:
>   (symbol-macrolet ((paradox t))
>     (macrolet ((when-symbol-macro (form &environment env)
>                  (when (nth-value 1 (macroexpand-1 'paradox env))
>                    form)))
>       (iterate (for x in '(1 2 3))
>                (when-symbol-macro
>                  (collect x into paradox)))))

Are you sure this is correct code? IIRC the MACROLET expander is
defined in the lexical environment, but CLHS clearly says "the
consequences are undefined if the local macro definitions reference
any local variable or function bindings that are visible in that
lexical environment."

Doesn't this mean that (nth-value ...) is undefined as well?

Oops, this arguments sounds like use of macroexpand need be forbidden
inside macrolet!?!


BTW, my interpretation is opposite of yours: Iterate does not detect
'(collect into paradox) from an eagle's perspective. Instead, it
proceeds top-down to walk the code. Therefore, when when-symbol-macro
is encountered, Iterate has not yet seen the collect clause, therefore
it has not yet bound paradox as a variable. As a result, NTH-VALUE
returns T because the symbol-macro got expanded. Then only, Iterate
gets to see the collect clause, which causes it to generate a variable
binding for paradox.

As a result, the symbol-macro is shadowed. Iterate collects into a
distinct variable and returns nil (you may wish to add a
(finally (return paradox)) clause to see that).

>I don't think any standard Common Lisp constructs have this problem.
What's the problem, and why should e.g. cl:loop behave differently
(except that loop does not accept collect inside some form)?

There's some point, though: if Iterate could yield an expansion
without itself resorting to macroexpand, then when-symbol-macro would
be expanded in a environment context where the Iterate expansion would
have bound variables (via LET), causing different behaviour.

But that's little surprise, no? It all depends on what time and
environments macroexpansion occurs. I can't remember a point where CL
makes some guarantees or recommendatiosn on this.

Did I miss something?

Regards,
	Jorg Hohle
Telekom/T-Systems Technology Center
From: Kalle Olavi Niemitalo
Subject: Re: Macroexpansion, bindings and information flow (Iterate package)
Date: 
Message-ID: <87d5xrkkzi.fsf@Astalo.kon.iki.fi>
Joerg Hoehle <······@users.sourceforge.net> writes:

> Kalle Olavi Niemitalo <···@iki.fi> writes:
>> Consider:
>>   (symbol-macrolet ((paradox t))
>>     (macrolet ((when-symbol-macro (form &environment env)
>>                  (when (nth-value 1 (macroexpand-1 'paradox env))
>>                    form)))
>>       (iterate (for x in '(1 2 3))
>>                (when-symbol-macro
>>                  (collect x into paradox)))))
>
> Are you sure this is correct code?

Yes, apart from the use of Iterate.

> IIRC the MACROLET expander is
> defined in the lexical environment, but CLHS clearly says "the
> consequences are undefined if the local macro definitions reference
> any local variable or function bindings that are visible in that
> lexical environment."

In my example, there are no local variable or function bindings
in the lexical environment of the MACROLET form.  Thus, the macro
definition obviously does not reference any such bindings.

> Doesn't this mean that (nth-value ...) is undefined as well?

NTH-VALUE is defined in the global environment.

> Therefore, when when-symbol-macro is encountered, Iterate has
> not yet seen the collect clause, therefore it has not yet bound
> paradox as a variable.

The current implementation may do that, but in my opinion, it is
the wrong thing to do.  Iterate is effectively lying to the macro
function about the environment.  There is no general guarantee
that the expansion will then operate correctly, and I don't think
one should be added, as that would disable some macroexpand-time
optimizations that are too difficult to be deferred to the
compiler.

>>I don't think any standard Common Lisp constructs have this problem.
> What's the problem, and why should e.g. cl:loop behave differently
> (except that loop does not accept collect inside some form)?

I hope I have shown the problem.  LOOP analyzes its clauses
statically without macroexpansion and thus need not guess what to
place in environments.

> Did I miss something?

I think you missed the (case t (() (:collect t :into x))) issue.
Macros are not in general guaranteed to preserve the existence of
forms, but Iterate seems to be relying on such a guarantee, in
order to decide which variables to bind and how to initialize them.