the "upward funarg" problem

From: Silver
Subject: the "upward funarg" problem
Date: Mon, 02 Nov 1992 13:30:55 +0000
Message-ID: <Nov.2.08.30.54.1992.10743@brushfire.rutgers.edu>

I'd appreciate it if someone would describe the "upward funarg" problem,
seemingly in reference to the resolution of free variable references.
What is the problem, and what are its solutions?

Regards, [Ag]

Re: the "upward funarg" problem Barry Margolin
- Re: the "upward funarg" problem Silver
Re: the "upward funarg" problem John P. Flanagan
- Re: the "upward funarg" problem Jeff Dalton

From: Barry Margolin
Subject: Re: the "upward funarg" problem
Date: Mon, 02 Nov 1992 16:35:49 +0000
Message-ID: <1d3ld5INN92l@early-bird.think.com>

In article <·························@brushfire.rutgers.edu> ······@paul.rutgers.edu writes:
>I'd appreciate it if someone would describe the "upward funarg" problem,
>seemingly in reference to the resolution of free variable references.
>What is the problem, and what are its solutions?

An upward funarg, AKA a lexical closure, is a closure that is made
accessible outside the dynamic environment in which it was created, and
which contains references to local variables from that environment.  The
"problem" is that the dynamic environment is normally implemented as a
stack, and local variables are usually stored in the stack frame.  If a
procedure were implemented in the traditional way, as a combination of a
code pointer and a frame pointer, the frame pointer would be invalid after
the stack has been popped.

Some languages (PL/I, Pascal, Maclisp) solve this problem simply by telling
users not to return local functions as function values.  C and C++ solve it
by not having local functions (I think GCC provides them as an extension,
and uses the previous solution to the upward funarg problem).

Common Lisp, Scheme, and Dylan require lexical closures to be supported,
though.  A simplistic way to solve the problem is to allocate all
activation records on the heap rather than the stack.  This is pretty
expensive, though, since most activation records are not part of lexical
closure environments; it's generally only done this way in interpreters
(they're slow enough that this additional overhead may be unnoticeable).
More typically, compilers analyse the code and determine which variables
may be referenced outside their dynamic extent, and only allocate those
variables on the heap.

-- 
Barry Margolin
System Manager, Thinking Machines Corp.

······@think.com          {uunet,harvard}!think!barmar

From: Silver
Subject: Re: the "upward funarg" problem
Date: Mon, 02 Nov 1992 22:52:34 +0000
Message-ID: <Nov.2.17.52.33.1992.12408@brushfire.rutgers.edu>

······@think.com writes:
> An upward funarg, AKA a lexical closure

Oh, THAT upward funarg problem.  Awrighty, nuff sed, thanks!

Regards, [Ag]

From: John P. Flanagan
Subject: Re: the "upward funarg" problem
Date: Mon, 02 Nov 1992 20:57:47 +0000
Message-ID: <1992Nov2.205747.1162@wstar.mixcom.com>

······@brushfire.rutgers.edu (Silver) writes:

>I'd appreciate it if someone would describe the "upward funarg" problem,
>seemingly in reference to the resolution of free variable references.
>What is the problem, and what are its solutions?

The upward funarg problem is the situation in which a function closure
loses the bindings for it's free variables.  The solution was to use
lexical scoping (as opposed to dynamic scoping).

I think upward funarg's are mentioned briefly in "Structure and Interpretation
of Computer Programs" by Abelson, Sussman, and Sussman.  If you really want to
get a handle on funarg's, you want to read John Allen's, "Anatomy of Lisp".
-- 
   __________________________________________________________________________
   WaveStar Technology           WaveStar Common Lisp        John P. Flanagan
   W344 N6855 Bayberry Court     ANSI-12.24, X3J13/92          (414) 367-5014
   Oconomowoc, WI  53066         HPUX-800/700/400/300    ···@wstar.mixcom.com

From: Jeff Dalton
Subject: Re: the "upward funarg" problem
Date: Fri, 06 Nov 1992 18:05:32 +0000
Message-ID: <7867@skye.ed.ac.uk>

In article <····················@wstar.mixcom.com> ···@wstar.mixcom.com (John P. Flanagan) writes:
>······@brushfire.rutgers.edu (Silver) writes:
>
>>I'd appreciate it if someone would describe the "upward funarg" problem,
>>seemingly in reference to the resolution of free variable references.
>>What is the problem, and what are its solutions?
>
>The upward funarg problem is the situation in which a function closure
>loses the bindings for it's free variables.  The solution was to use
>lexical scoping (as opposed to dynamic scoping).

That more or less right, but it's not the whole story.

It's possible to use lexical (aka static) scoping and still fail
to solve the upward funarg problem (or even some aspects of the
downward one).  C is an example.  Compiled Franz Lisp is another.
VAX NIL used lexical scoping even in the interpreter (which was
unusual at the time), but still (initially) had only "downward
funargs".

It's also possible to solve the funarg problems while still using
dynamic scoping.  Lisp 1.5 is an example.

We can go way back in Lisp history on this, and, ok, I will.  
I think the following account is essentially correct, though
some might dispute the details.  (And I'll omit InterLisp.)

Think of a Lisp with dynamic scoping.

Consider a function: (lambda (x) (cons x lis))

Consider a definition of MAPCAR:

   (defun mapcar (fun lis)
     (if (null lis)
         '()
       (cons (funcall fun (car lis))
             (mapcar fun (cdr lis)))))

Now:

   (let ((lis '(green dogs)))
     (mapcar '(lambda (x) (cons x lis)) '(two three)))

We expect: ((two green dogs) (three green dogs)).

But we get: ((two two three) (three three)).

Why?  The functional argument refers to a "free variable" (namely LIS)
that is bound by MAPCAR as well as by the LET.  We want this variable
to refer the the LET binding, but because we're in a dynamically
scoped Lisp it ends up referring to the MAPCAR binding instead.

So: we have a functional argument (hence "funarg") and a problem
(hence "funarg problem"), and we're passing the functional argument
"down" to a function we call rather than returning it "up" to a
something that called us (hence "downward funarg problem").

In the upward case, we really talking about a functional value rather
than a functional argument.  But the term "upward funarg problem"
still makes sense (historically), as I will now explain.

The name FUNARG goes back to Lisp 1.5.  In Lisp 1.5, the value of
(QUOTE (LAMBDA ...) was just (LAMBDA ...).  You'll note that I quoted
the lambda-expression when calling MAPCAR above.  However (still in
Lisp 1.5), the value of

  (function (lambda ...))

was

  (funarg (lambda ...) <env>)

If you're familiar with closures, you'll recognize this FUNARG list
as a closure.  In Lisp 1.5, however, the environment was the a-list
used for _dynamic_ scoping.

This solves the downward problem.  When I write

   (let ((lis '(green dogs)))
     (mapcar (function (lambda (x) (cons x lis)))
             '(two three)))

MAPCAR is called with two arguments:

 arg-1: (funarg (lambda (x) (cons x lis))  ; the lambda-expr
                ((lis . (green dogs))))    ; the a-list

 arg-2: (two three)

Moreover, the rule for applying FUNARGs was to use the a-list in
the funarg as the environment.  So when the FUNARG is called by
MAPCAR, the variable LIS gets its value from the a-list.  (If you
write a simple Lisp interpreter, or copy the one in the Lisp 1.5
book, use dynamic scoping, and use this rule for evaluating calls
to FUNARGS, you'll see that it works.)

Now, since the a-list is an ordinary Lisp list, and since,
consequently, it can be returned as a value, the whole FUNARG
list can be returned as a value w/o any danger of the bindings
captured in the a-list being lost.  So this FUNARG technique
also solves the problem of functional results.  That is, it
also works in the upward direction.  For instance,

  (defun make-counter (n)
    (function (lambda () (setq n (1+ n)))))

  (make-counter 10)
    ==> (funarg (lambda () (setq n (1+ n))) ((n . 10)))

This funarg carries its interpretation of N with it wherever it
goes.

So this is great!  All the problems are solved!

Unfortunately, Lisp implementors wanted Lisp to be more efficient.
They wrote compilers.  They put variable values on a stack and
compiled references to the variables using stack offsets computed
as compile-time, just like compilers for languages like C.  In
effect, this implements lexical scoping for local, bound, variables.

Free variables were still a problem, though.  A "free variable"
is something that we'd call "a reference to a global variable"
in a language that didn't allow one function definition to appear
inside another.  That sort of language was one that people knew
how to handle.  (The ones that allowed nested function definitions
were trickier.)  So Lisp compilers more or less behaved as if they
were dealing with that kind of language.

A lambda-expr inside a function definition is a lot like a nested
function definition.  That was hard to handle.  So the compilers
compiled the lambda-expr as if it had appeared on its own at the
top level and not inside another function at all!  And if the
lambda-expr function wanted to refer to any of the local variables
of the function in which the lambda-expr appeared (vars like N in
MAKE-COUNTER above, or LIS in the LET even further back), it couldn't.
In compiled code, those variables were just stack entries.  The
old a-list and FUNARG trick wouldn't work.

Some clever noticed that, despite all this, they could still solve the
downward case.  Instead of having (FUNCTION (LAMBDA ...)) return

   (funarg (lambda ...) <a-list>)

it could return

   (*funarg (lambda ...) <stack-pointer>)

So Lisps like this had downward funargs that worked (with the aid
of some clever tricks), but they didn't have upward funargs.  That is,
if they returned one of these *FUNARG lists, the stack would still be
popped (as on all function returns), the local vars would be gone,
and the stack-pointer would be useless.

PDP-10 MacLisp was a Lisp of roughly this sort, as was Franz Lisp
(though Franz lacked the downward funarg trick).  The Lisp 1.5
compiler put Lisp 1.5 in roughly the same situation too, but Lisp 1.5
still had upward funargs for certain cases.

These Lisps typically used SPECIAL declarations to tell the compiler
to have the emitted code treat certain variables in the way the
interpreter would treat them rather than have the variables turned
into stack offsets.  SPECIAL variables were, consequently, dynamically
scoped (because that's what interpreters did in those days).

So here's one solution to the upward funarg problem for special
variables:

  * Use an a-list for special variables and put the current
    a-list in any FUNARGS that are created.

However, Lisps that wanted to me more efficient (eg, MacLisp) 
stopped using a-lists for special variables.  A-lists are a 
form of "deep binding".  You have to look back along the a-list 
for the variable you want.  "Shallow binding" works by attaching
the values directly to the symbols that represent the variables
and storing the old value on a stack whenever a nested binding
is made.  For instance, 

  (let ((a 1))
    (declare (special a))
    (g))

is more or less equivalent to:

  (push (symbol-value 'a) <stack>)
  (unwind-protect
      (progn (setf (symbol-value 'a) 1)
             (g))
    (setf (symbol-value 'a) (pop <stack>)))

This makes for fast, constant-time lookup, though it's somewhat slow
at setting up bindings and taking them down again.  But it prevents
the above solution to the upward funarg problem.

In any case, a feature of the a-list approach is that the a-list
contains _all_ the non-top-level bindings.  So there's no need to
figure out what variables need to be included in the FUNARG.  In some
shallow-binding Lisps (which means no a-list), it was possible to make
a form of dynamic-binding closure by explicitly listing the variables
to "close over".  Various tricks could make this more, or less, close
to the semantics of the a-list solution.  I won't go into the details,
but examples are Franz Lisp and ZetaLisp (Lisp Machine Lisp).  A
version that works more or less correctly _except when there are
assignments to the variables that are closed over_ is as follows:

(defun make-closure (vars fun)
  (let ((vals (mapcar #'symbol-value vars)))
    `(lambda (&rest args)
       (let ,(mapcar #'(lambda (var val) `(,var ',val)) vars vals)
         (declare (special ,@vars))
         (apply ',fun args)))))

> (let ((a 1) (b 2))
    (declare (special a b))
    (make-closure '(a b) '(lambda (c) (list a b c))))

(lambda (&rest args)
  (let ((a '1) (b '2))
    (declare (special a b))
    (apply '(lambda (c) (list a b c)) args)))

> (funcall * 3)

(1 2 3)

This is pretty much how things stood in the early 80s, until Common
Lisp came along and took the idea of full lexical scoping from Scheme.
Compilers had to be more sophisticated if they wanted to make this
work efficiently.  They had to identify which variables were going
to be closed over and arrange for them to have their values in some
heap-allocated structure rather than on the stack.  The emitted code
had to know how to refer to entries in this structure.  And so on.

However, in the the interpreter, a lexical solution can be very
like the Lisp 1.5 a-list solution described above.

             +--------------------------------+
             | Lexical scoping as if by magic |
             +--------------------------------+

It works like this:

  1. Write an ordinary, dynamically scoped interpreter with an a-list.
  2. Implement the FUNCTION / FUNARG technique from Lisp 1.5.
  3. Have (DEFUN <name> <args> <form>+) be equivalent to

       (setf (symbol-function '<name>)
             (function (lambda <args> <form>+)))

That is, functions include their definition environment.

Jeff Dalton,
AI Applications Institute,                               ········@ed.ac.uk
Edinburgh University.