what is the funarg problem

From: Frode Vatvedt Fjeld
Subject: what is the funarg problem
Date: Wed, 01 Dec 1999 00:00:00 +0000
Message-ID: <2hyabfasdc.fsf@dslab7.cs.uit.no>

What is "the classic funarg problem", exactly?

Thanks,
-- 
Frode Vatvedt Fjeld

Re: what is the funarg problem Jeff Dalton
- Re: what is the funarg problem Frode Vatvedt Fjeld
- Re: what is the funarg problem Aaron Gross
  - Re: what is the funarg problem Jeff Dalton
    - Re: what is the funarg problem Christopher R. Barry
      - Re: what is the funarg problem Jeff Dalton

From: Jeff Dalton
Subject: Re: what is the funarg problem
Date: Wed, 01 Dec 1999 00:00:00 +0000
Message-ID: <x2emd64kp5.fsf@todday.aiai.ed.ac.uk>

Frode Vatvedt Fjeld <······@acm.org> writes:

> What is "the classic funarg problem", exactly?

First, the problem; then FUNARGS, upward and downward, etc.

Suppose that in a dynamically scoped Lisp someone defines a 2-arg
mapcar like this:

  (defun mapcar2 (fun lis)
    (cond ((null lis) '())
          (t (cons (funcall fun (car lis))
                   (mapcar2 fun (cdr lis))))))

And now suppose I do this:

  (defun prefix-first (lis)
    (mapcar2 #'(lambda (item) (list (car lis) item))
             lis))

I want to take the first element of the list and "prefix" it to all
elements of the list, and I'm expecting that, for example

  (prefix-first '(a b c))  ==>  ((A A) (A B) (A C))

I think the "lis" that's in my
  
  #'(lambda (item) (list (car lis) item))

will get the value of the "lis" that's the parameter of prefix-first.

[That #'(lambda (item) (list (car lis) item) is a function being
passed as an argument, hence a functional argument, hence a funarg.]

But when I try it I actually get

  (prefix-first '(a b c))  ==>  ((A A) (B B) (C C))

because the "lis" in #'(lambda (item) (list (car lis) item)) ends
up getting the value of the "lis" that's the parameter to mapcar2.

[To see this in Common Lisp, do "(defvar lis)" first, to proclaim
"lis" special, hence making it dynamically scoped wherever it
occurs.]

That's how dynamic binding works: all uses of a name such as "lis"
as a variable are references to, in effect, the same variable, and
you get the most recently established value.

But even with dynamic binding I would have gotten the result I
expected if only the definition of mapcar2 hadn't happened to use
the same variable name that I did.

Presumably the author of mapcar2 did not *intend* to produce such
effects.  He or she just happened to pick a variable name that was
too common.

This problem of unintentionally getting the "wrong" value -- or of
unintentionally causing those who use your code to get the "wrong"
value -- turns out to be a fairly general problem: it's not only with
functional arguments that things go "wrong".  All "free" variables
will refer to the "wrong" value.

[A free variable in an expression is one that's not bound.  So in
(lambda (item) (list (car lis) item), item is bound by the lambda-
expression, and lis is free.]

But it's arguable that in other cases (other than functional
arguments), the problem is not very worrying, perhaps not even
a problem at all.  If, for instance, I write:

  (defun f (x) (* x scale-factor))

it sure looks like I must have intended to refer to a global variable
or to something similar.  Since one way to think of dynamic binding is
as temporary assignments to global variables, we can think of it as
in effect a useful generalisation of that idea.

[The modern approach is to make the intention clearer by using a
star-convention for the names of variables such as the scale-factor
above.  We'd write it as *scale-factor*.  And we might exploit
the fact that it's dynamically scoped by writing something like
this:  (let ((*scale-factor* 10)) (f (get-raw-value))).]


So that's the problem.  What about the solution?

One "solution" would be just to say "use obscure variable names" when
defining functions that take functional arguments.  That's generally
regarded as a rather bad approach, not really a solution at all.

Another non-solution is to say "live with it, but make things clearer
by always using the star-convention for any variable names that might
be a problem".  [Note that that's different from the modern approach
mentioned above.  In the modern approach, the problem *has* been
solved, but dynamic binding remains an option for when it's actually
wanted.]

In modern Lisps such as Scheme, Common Lisp, and EuLisp, the problem
is solved by lexical scoping.  That is, we might say, the correct,
clean solution.  It makes the prefix-first function work as I
intended.  The "lis" in "(lambda (item) (list (car lis) item))"
will refer to the "lis" that's the formal parameter of prefix-first.

But historically the problem was first solved in a different way,
though the two ways turn out to be closely related.  Here's a brief
description of the original solution, roughly as it appeared in
Lisp 1.5.

Imagine a Lisp interpreter written in Lisp.  It will have some way
to keep track of the current values of variables.  The traditional
way was to use an a-list.  This mapping from variable names to values
is an example of what we'd now call an "environment".

The Lisp 1.5 solution was to have the interpreter treat

  (FUNCTION (LAMBDA ...))

as an expression whose value was (FUNARG (LAMBDA ...) current-a-list)
and then to use that "captured" a-list when evaluating a call to 
(FUNARG ...).  That (FUNARG ...) was therefore an object that could
be used as a function.  Note that the a-list is the one that's
"current" at the time the (FUNCTION (LAMBDA ...)) expression is
evaluated.  This is sometimes described as "closing the function
in the definition environment", or something like that.

[Here I'm using upper case names for literals and lower case names
as syntactic variables.]

[Lisp 1.5 also had a compiler, but I won't describe how things worked
in that case.]

This is the so-called "funarg device", from which we get the term
"funarg".  There are traces of this solution even in some modern
Lisps, because #'whatever is an abbreviation for (FUNCTION whatever).

Back to the example.  when (PREFIX-FIRST '(A B C)) is evaluated,
mapcar2 will be called with two values:

  (FUNARG (LAMBDA (ITEM) (LIST (CAR LIS) ITEM)) ((LIS . (A B C))))

and

  (A B C)

Then when (FUNCALL FUN (CAR LIS))  [see the definition of mapcar2
above] is evaluated, the value of "FUN" will be

  (FUNARG (LAMBDA (ITEM) (LIST (CAR LIS) ITEM)) ((LIS . (A B C))))

and that "captured" a-list will be used when evaluating

  (LIST (CAR LIS) ITEM)

And the same will happen when MAPCAR2 called itself recursively,
because the same value of FUN is passed along each time.  So each time
(LIST (CAR LIS) ITEM) is evaluated, LIS will have the value (A B C);
but the rest of the time while MAPCAR2 is recursively calling itself
-- ie apart from when the FUNARG is being applied -- LIS will have the
value controlled what MAPCAR2 does.  That is, it will be successive
CDRs of the list originally passed to MAPCAR2.

This is, of course, very like lexical scoping.  Indeed, lexical
scoping is what you get when all functions are "closed in the
definition envoronment".  In Lisp 1.5, however, that happened
only when you wrote (FUNCTION (LAMBDA ...)), which you did,
if at all, only when passing functions as arguments or
(it turns out) returning them as results.

That gives us the two famous cases "downward funargs" passed
as arguments, and "upward funargs" returned as results.


Since Lisp 1.5 was so close to the "right" solution -- ie to lexical
scoping -- we ought to ask why they failed to go all the way.  I think
the answer is that these issues weren't well enough understood at the
time, and that it wasn't clear how to implement FUNARGs efficiently.

The downward case can be handled in more efficient, specialised ways.
For example, the captured environment can just be a pointer into the
stack.  [It turns out that that can be done with both shallow and deep
binding ways of using the stack.]  So some Lisps have handled only the
downward case and people sometimes say, of Lisps that do more, that
they "can handle upward funargs" or that they "solve the upward funarg
problem".

However, the generality of the Lisp 1.5 solution remained a challenge
for later Lisps.  Note in particular that (1) you don't have to list
the variables that are to be "closed over", and (2) assignment works
(in the sense that two closures over the same variables will "see"
the effects of each other's assignments).

I think that is one reason why the Lisp world was never quite
satisfied by awkward or partial solutions to these problems.
Examples of languages that have settled for less are Pop-2,
Pop-11, Python, and Java.

[Some possible other reasons: a desire to do things in a correct,
elegant way; the influence of lambda calculus, denotational semantics,
and (later) functional programming; the example of Scheme.]

-- jeff

From: Frode Vatvedt Fjeld
Subject: Re: what is the funarg problem
Date: Wed, 01 Dec 1999 00:00:00 +0000
Message-ID: <2hg0xmbi38.fsf@dslab7.cs.uit.no>

Jeff Dalton <····@todday.aiai.ed.ac.uk> writes:

> Frode Vatvedt Fjeld <······@acm.org> writes:
> 
> > What is "the classic funarg problem", exactly?
> 
> First, the problem; then FUNARGS, upward and downward, etc. [..]

Thank you for a very clear and extensive answer! I would say this
article is very FAQ-worthy (whether the question is frequent or not),
so perhaps if the powers that be are with us..?

-- 
Frode Vatvedt Fjeld

From: Aaron Gross
Subject: Re: what is the funarg problem
Date: Thu, 02 Dec 1999 00:00:00 +0000
Message-ID: <ly3dtloknn.fsf@cruella.bfr.co.il>

Jeff Dalton <····@todday.aiai.ed.ac.uk> writes:

> Since Lisp 1.5 was so close to the "right" solution -- ie to lexical
> scoping -- we ought to ask why they failed to go all the way.  I
> think the answer is that these issues weren't well enough understood
> at the time, and that it wasn't clear how to implement FUNARGs
> efficiently.

From McCarthy's "History of Lisp", at <URL:http://www-formal.stanford.
edu/jmc/history/lisp.html>:

    In all innocence, James R. Slagle programmed the following LISP
    function definition and complained when it didn't work right:

    The object of the function is to find a subexpression of x
    satisfying p[x] and return f[x]. If the search is unsuccessful,
    then the continuation function u[] of no arguments is to be
    computed and its value returned. The difficulty was that when an
    inner recursion occurred, the value of car[x] wanted was the outer
    value, but the inner value was actually used. In modern
    terminology, lexical scoping was wanted, and dynamic scoping was
    obtained.

    I must confess that I regarded this difficulty as just a bug and
    expressed confidence that Steve Russell would soon fix it. He did
    fix it but by inventing the so-called FUNARG device that took the
    lexical environment along with the functional argument. Similar
    difficulties later showed up in Algol 60, and Russell's turned out
    to be one of the more comprehensive solutions to the
    problem. While it worked well in the interpreter,
    comprehensiveness and speed seem to be opposed in compiled code,
    and this led to a succession of compromises. Unfortunately, time
    did not permit writing an appendix giving the history of the
    problem, and the interested reader is referred to (Moses 1970) as
    a place to start. (David Park tells me that Patrick Fischer also
    had a hand in developing the FUNARG device).

From: Jeff Dalton
Subject: Re: what is the funarg problem
Date: Fri, 03 Dec 1999 00:00:00 +0000
Message-ID: <x2k8mw3ujf.fsf@todday.aiai.ed.ac.uk>

Aaron Gross <············@bfr.co.il> writes:

> Jeff Dalton <····@todday.aiai.ed.ac.uk> writes:
> 
> > Since Lisp 1.5 was so close to the "right" solution -- ie to lexical
> > scoping -- we ought to ask why they failed to go all the way.  I
> > think the answer is that these issues weren't well enough understood
> > at the time, and that it wasn't clear how to implement FUNARGs
> > efficiently.
> 
> From McCarthy's "History of Lisp", at <URL:http://www-formal.stanford.
> edu/jmc/history/lisp.html>:

Thanks.  I think that account is slightly misleading in one way,
however.  Don't get me wrong.  I think it's a perfectly good account
from what we might call a modern perspective; but it glosses over
the way in which Lisp 1.5 departed from lexical scoping.  (I'm
assuming that the Lisp-in-Lisps of the Lisp 1.5 book are essentially
correct, though.)

Anyway, this raises a potentially interesting issue which I'll discuss
at the end.  But first, the example:

>     In all innocence, James R. Slagle programmed the following LISP
>     function definition and complained when it didn't work right:

Humm.  McCarthy doesn't actually give the Lisp definition of the
function in the HTML version of the paper, but it's there in the
PostScript version.  In case anyone's wondering, the problem *is* 
due to a free variable in a functional argument.  Since we don't
see M-exprs very much these days, I'll type in the definition,
below:

  testr[x,p,f,u] <- if p[x] then f[x]
                    else if atom[x] then u[]
                    else testr[cdr[x],p,f,\:testr[car[x],p,f,u]]

"\" is a lambda.  In the PostScript, there's an actual lambda character.

\:testr[car[x],p,f,u]  corresponds to the S-expr

   (lambda () (testr (car x) p f u))

The above is a modern version of M-expr notation as used in
McCarthy's paper.  In the Lisp 1.5 version, the if-expression
would be written something like this:

   [p[x] -> f[x];
    atom[x] -> u[];
    t -> testr[cdr[x],p,f,\:testr[car[x],p,f,u]]]

This notation for conditionals maps directly onto COND.

>     The object of the function is to find a subexpression of x
>     satisfying p[x] and return f[x]. If the search is unsuccessful,
>     then the continuation function u[] of no arguments is to be
>     computed and its value returned. The difficulty was that when an
>     inner recursion occurred, the value of car[x] wanted was the outer
>     value, but the inner value was actually used. In modern
>     terminology, lexical scoping was wanted, and dynamic scoping was
>     obtained.

Just so.

>     I must confess that I regarded this difficulty as just a bug and
>     expressed confidence that Steve Russell would soon fix it. He did
>     fix it but by inventing the so-called FUNARG device that took the
>     lexical environment along with the functional argument.

It's not actually the lexical environment, though.  I think it's
better to say Lisp 1.5 FUNARGs were closures over the *dynamic*
environment.  This gets tricky, because if you close all functions in
this "dynamic" env, lexical scoping is what you get.  But, from the
Lisp 1.5 book (and indeed from other things I've read), it looks like
this happened only when FUNCTION was used to explicitly request a
FUNARG, *and in other cases the current env was simply passed along*.
Simply passing it along is what makes it the dyamic env.  If you
passed along the empty env instead, you'd be closer to lexical
scoping.  (Indeed, the empty env would the the right lexical env for
functions defined at the top level.)

>     Similar
>     difficulties later showed up in Algol 60, and Russell's turned out
>     to be one of the more comprehensive solutions to the
>     problem. While it worked well in the interpreter,
>     comprehensiveness and speed seem to be opposed in compiled code,
>     and this led to a succession of compromises. Unfortunately, time
>     did not permit writing an appendix giving the history of the
>     problem, and the interested reader is referred to (Moses 1970) as
>     a place to start. (David Park tells me that Patrick Fischer also
>     had a hand in developing the FUNARG device).

In my earlier long article, I said something about why the Lisp world
was not satisfied by those compromises.  But another interesting
question is why people were willing to put up with them for so long.

Indeed, in may Lisps, the scoping rules were bizarre: a mixture of
lexical and dynamic, with various special cases when it came to
functional arguments, and with different rules in the interpreter and
compiler.

I think part of the answer is that in the Lisp world there's a
tendency to regard the program -- the source code -- as an interesting
entity in its own right, especially when it's represented as Lisp
data in the form of lists, symbols, etc.  Various tools can then
be applied to this object, and the interpreter and compiler are
among those tools.  When the interpreter and compiler don't quite
agree on some things, that can be a problem, but with a little
care it ceased to be a very serious one in practice.  [Some may
disagree about that, but, based primarily on my experience with
Franz Lisp, I think it's nonetheless true.]  Besides compilers always
change *something*, for otherwise there'd be no point.

[N.B. Franz Lisp is the MacLisp-like Lisp that was part of Berkeley
Unix, and then for a while a product of Franz Inc; it is a completely
separate system from Franz Inc's Common Lisp.]

-- jeff

From: Christopher R. Barry
Subject: Re: what is the funarg problem
Date: Fri, 03 Dec 1999 00:00:00 +0000
Message-ID: <87903bew78.fsf@2xtreme.net>

Jeff Dalton <····@todday.aiai.ed.ac.uk> writes:

> It's not actually the lexical environment, though.  I think it's
> better to say Lisp 1.5 FUNARGs were closures over the *dynamic*
> environment.

Lisp Machines have dynamic closures. There's about 15 or so functions
specific to them. They are created with a ZL:CLOSURE form. Here's an
example of a list-iterating closure (which would actually be better
done with lexical closures...):

  > (progv '(list) '((foo bar baz mum quux))
      (zl:closure '(list)
		  #'(lambda ()
		      (prog1 (car list)
			     (setf list (cdr list))))))
  #<DTP-DYNAMIC-CLOSURE 400006523>

  > (funcall *)
  FOO

  > (funcall **)
  BAR

  > (funcall ***)
  BAZ

Christopher

From: Jeff Dalton
Subject: Re: what is the funarg problem
Date: Mon, 06 Dec 1999 00:00:00 +0000
Message-ID: <x2so1gqeua.fsf@todday.aiai.ed.ac.uk>

······@2xtreme.net (Christopher R. Barry) writes:

> Jeff Dalton <····@todday.aiai.ed.ac.uk> writes:
> 
> > It's not actually the lexical environment, though.  I think it's
> > better to say Lisp 1.5 FUNARGs were closures over the *dynamic*
> > environment.
> 
> Lisp Machines have dynamic closures. There's about 15 or so functions
> specific to them. They are created with a ZL:CLOSURE form. Here's an
> example of a list-iterating closure (which would actually be better
> done with lexical closures...):
> 
>   > (progv '(list) '((foo bar baz mum quux))
>       (zl:closure '(list)
> 		  #'(lambda ()
> 		      (prog1 (car list)
> 			     (setf list (cdr list))))))
>   #<DTP-DYNAMIC-CLOSURE 400006523>

Thanks!

Something similar was added to Franz Lisp in Opus 38 or thereabouts.
Like the LM version, you had to list the variables to "close over",
but it didn't quite have the full assignment semantics.  (Actually,
it might have had the full semantics by the end.  I don't remember
off hand, but I can check later today if I remember to).

(If I recall correctly, the LM closures took advantage of invisible
pointers which weren't available to Franz Lisp because it ran on
VAXen and the like.)

In Lisp 1.5, you didn't have to list the variables, because it
just grabbed the whole env.

-- jeff