Dynamic and lexical environment access

From: Howard R. Stearns
Subject: Dynamic and lexical environment access
Date: Thu, 11 Mar 1999 00:00:00 +0000
Message-ID: <36E82309.26704DBA@elwood.com>

In response to another issue (Idiom For Context Sensitive Dynamic
Context), Duane Rettig wrote about plans for making the
variable-information/augment-environment stuff work.  This had all been
proposed for ANSI and described in CLtL2, but removed because it wasn't
quite all there yet.

I think what you describe is a great idea, and please don't take what
I'm about to say as a suggestion to dissuade you from it.  I have been
working on a related idea and you might consider whether some of these
ideas make sense, and whether they affect your design.  (Also, I haven't
been working on this LATELY, so this is from memory...)

I have a sort of MOP for code processing.  It handles (user extensible)
code walking, interpretation (eval), and compilation (compile and
compile-file).  The whole thing started out as a sort of CLOS (multiple
dispatch) version of Queinnec's "chapter 3" interpreter from "Lisp in
Small Pieces".

The basic idea is that there are some generic functions for doing common
things like applying a combination (i.e. calling a function), reading a
variable, assigning a variable, extending an environment, executing
various kinds of code bodies (sequencing), etc.  Each of these takes
various different arguments, but the required arguments all end in
PROCESSOR, ENVIRONMENT, CONTINUATION.

Each of these dispatchable objects has a system-defined class and can
be subclassed by the user, with multiple inheritance:

   PROCESSOR represents the dynamic, global environment that is doing
the processing.  NIL is the top level interpreter and other processors
are defined for different kinds of compilers, etc.  The compiler
PROCESSOR objects keep things like global definitions for the current
compilation unit.

   ENVIRONMENT represents the lexical environment, as described in the
CLtL2 environment stuff.

   CONTINUATION defines what is to be done with the value.

In fact, there is one other generic function which can also be
specialized by the user: RESUME, which is defined to take a PROCESSOR,
a CONTINUATION and a "value", and "does the right thing" with that value,
given the type of processing and continuation involved.  The "value"
might be a set of multiple values (interpreter), a set of generated
pseudo-machine instructions (compiler), or transformed source code (code
walker).
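
A minimal sketch of how such a protocol might look in CLOS.  Apart from
RESUME and the PROCESSOR, ENVIRONMENT, CONTINUATION argument order, every
name here (COMPILER-PROCESSOR, RECEIVER, READ-VARIABLE, the alist
environment) is an assumption made for illustration, not the code
described in this post:

```lisp
;; Hypothetical sketch; every name except RESUME is invented for illustration.
(defclass compiler-processor ()
  ;; Global definitions seen so far in the current compilation unit.
  ((definitions :initform '() :accessor definitions)))

(defclass continuation ()
  ;; RECEIVER is a function that consumes whatever the processor produces.
  ((receiver :initarg :receiver :reader receiver)))

(defgeneric resume (processor continuation value)
  (:documentation "Hand VALUE to CONTINUATION in the way PROCESSOR requires."))

(defmethod resume ((processor null) (k continuation) value)
  ;; NIL is the top-level interpreter: VALUE is a list of multiple values.
  (apply (receiver k) value))

(defmethod resume ((processor compiler-processor) (k continuation) value)
  ;; Compiler: VALUE is a list of pseudo-machine instructions.
  (funcall (receiver k) value))

(defgeneric read-variable (name processor environment continuation)
  (:documentation "Process a reference to the variable NAME."))

(defmethod read-variable (name (processor null) environment k)
  ;; Assumption: ENVIRONMENT is a plain alist of (name . value).
  (resume processor k (list (cdr (assoc name environment)))))

(defmethod read-variable (name (processor compiler-processor) environment k)
  (declare (ignore environment))
  ;; Emit a made-up pseudo-instruction instead of computing a value.
  (resume processor k (list (list :load name))))
```

The same READ-VARIABLE call site serves both the interpreter and the
compiler; only the class of the PROCESSOR argument selects which method
runs and what kind of "value" reaches RESUME.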

As you might imagine, there is a lot of code reuse between interpreters,
compilers and code walkers that is handled through ordinary CLOS
dispatching and inheritance in this scheme.  To implement an interpreter
or compiler for a new language, or to implement a new compiler for an
existing language (i.e. CL), requires only a few classes and methods to
be defined.

WITH-COMPILATION-UNIT essentially creates a PROCESSOR instance which is
used by all interior top-level entries to the code walker.  At the end
of the compilation unit, a generic function is called on the processor
to dump out whatever it feels the need to.
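
As a rough illustration of that arrangement, here is a sketch of a
WITH-COMPILATION-UNIT-like macro.  The names *PROCESSOR*, UNIT-PROCESSOR,
PENDING, FINISH-UNIT, WITH-UNIT and *REPORT* are hypothetical, and the
rule that only the outermost (or overriding) use creates a processor is
an assumption borrowed from the standard :OVERRIDE behaviour:

```lisp
(defvar *processor* nil
  "Current processor; NIL denotes the top-level interpreter.")

(defvar *report* '()
  "Demo only: collects what each finished unit dumped.")

(defclass unit-processor ()
  ;; Things noticed during the unit, most recent first.
  ((pending :initform '() :accessor pending)))

(defgeneric finish-unit (processor)
  (:documentation "Dump whatever PROCESSOR accumulated during the unit."))

(defmethod finish-unit ((processor unit-processor))
  (push (reverse (pending processor)) *report*))

(defmacro with-unit ((&key override) &body body)
  "Run BODY inside a unit; create and finish a processor only when
this is the outermost unit or OVERRIDE is true."
  (let ((fresh (gensym "FRESH")))
    `(let* ((,fresh (or ,override (null *processor*)))
            (*processor* (if ,fresh
                             (make-instance 'unit-processor)
                             *processor*)))
       (unwind-protect (progn ,@body)
         (when ,fresh (finish-unit *processor*))))))
```

Nested uses share the outer processor, so anything they record is dumped
once, when the outermost unit ends.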

Now, here's where all this relates to your environment stuff.  I decided
that it was better to separate the (dynamic) processor object from the
(lexical) environment object.

 1. The function RESUME does not need to know anything about the lexical
environment, so why provide it?  (Actually, there ARE situations, such
as debuggers, where I have been tempted to include it in resume, so
maybe I'm wrong.)

 2. There are lots of reasons that the system or a user might want to
specialize processor in all sorts of different ways, but as long as one
sticks to CL, there is very little reason to ever create new
specializations on lexical environment objects. 

It would certainly be fair to use just a single environment object.  The
various classes of lexical environments could each be specialized for
each of the processors.  If it was too much to expect programmers to be
able to do this right, a utility could be created to cons up new classes
on the fly as necessary, in the same way that CLIM creates new
combinations of existing classes when it needs to.

I felt that separating the two made things easier to describe, easier
for methods to dispatch, and easier for smaller sets of specialized
classes to be created.  In fact, my current code doesn't even use the
same metaclass for processors and environments (environments are
defstructs).

----
Now, in this model, the CLtL2 environment functions exist, but their
arguments don't have any indication of WHY the environment is being
examined (i.e., what kind of processing is being done.)  This is
probably a good thing for lexical accessors/constructors.  On the other
hand, by not including the dynamic environment they can really only
access the global top-level interpreter dynamic environment when no
lexical binding is defined.

Therefore, I also have expanded generic function versions of the CLtL2
environment utilities which take PROCESSOR, NAME,
(lexical-)ENVIRONMENT.  These always return something for the second
value as long as the binding is known in the combined (lexical and
global) environment. (The CLtL2 functions return nil for the second
value if there is no lexical binding.)
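
A sketch of what such an expanded accessor could look like.  The names
LEXENV, LEXICAL-VARIABLE-INFORMATION, GLOBAL-VARIABLE-BINDING,
UNIT-COMPILER and VARIABLE-INFORMATION* are hypothetical, and using
BOUNDP to stand in for "known to the top-level interpreter" is a
simplifying assumption:

```lisp
(defstruct lexenv
  (variables '())    ; alist of (name . binding-object)
  (enclosing nil))   ; next outer LEXENV, or NIL

(defun lexical-variable-information (name env)
  "Lexical-only lookup: the second value is NIL unless NAME is lexically bound."
  (loop for e = env then (lexenv-enclosing e)
        while e
        do (let ((entry (assoc name (lexenv-variables e))))
             (when entry
               (return (values :lexical (cdr entry) '()))))
        finally (return (values nil nil '()))))

(defgeneric global-variable-binding (processor name)
  (:documentation "The global binding PROCESSOR knows for NAME, or NIL."))

(defmethod global-variable-binding ((processor null) name)
  ;; Top-level interpreter: consult the running Lisp (approximation).
  (and (boundp name) name))

(defclass unit-compiler ()
  ;; Definitions recorded for the current compilation unit.
  ((globals :initform (make-hash-table) :reader globals)))

(defmethod global-variable-binding ((processor unit-compiler) name)
  (or (gethash name (globals processor))
      (global-variable-binding nil name)))

(defgeneric variable-information* (processor name environment)
  (:documentation "Look NAME up lexically, then in PROCESSOR's globals."))

(defmethod variable-information* (processor name environment)
  (multiple-value-bind (kind binding decls)
      (lexical-variable-information name environment)
    (if binding
        (values kind binding decls)
        (let ((global (global-variable-binding processor name)))
          (if global
              (values :special global '())
              (values nil nil '()))))))
```

With this shape, a name defined only in the current compilation unit is
found when the compiler's processor is passed, but not when NIL (the
top-level interpreter) is passed.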

Also, even for the pure lexical (CLtL2 compatible) environment
accessors/constructors, I found it necessary to make some extensions.
These are described in some comments that I've ripped out from the
code:

-A "control-information" accessor was needed, to get a hold of bindings
in the block and tagbody namespaces.

-xxx-information returns the binding value (e.g. a closure for
function-bindings) as the second value (not just T).

- FUNCTION-INFORMATION...
  ;; NOTE COMPILER DEPENDENCY: Steele says that this "returns
  ;; information about the interpretation of the function-name [name]
  ;; when it appears in the functional position" of a form.  We define
  ;; it for names "when it appears as the second argument to the
  ;; FUNCTION special operator." 

  ;; We take the position that FTYPE, INLINE/NOTINLINE and
  ;; IGNORE/IGNORABLE declarations for NAME are not automatically
  ;; applicable to (SETF NAME) - i.e. they must be defined separately
  ;; by the programmer.

-AUGMENT-ENVIRONMENT takes....
;;; An additional keyword argument, :bound-declarations-p indicates
;;; that bound declarations will apply to initialization forms
;;; appearing at the head of a body being represented by the new
;;; environment.  This is used for LABELS, LET*, and LAMBDA.  The
;;; default is nil.

;;; In all cases, two environments are returned.  The first is as
;;; described in CLtL2, and is an appropriate environment for the BODY
;;; of a Common Lisp special form.  The second value is an environment
;;; that is appropriate for the INITIALIZATION forms of a Common
;;; Lisp special form.  When :bound-declarations-p is nil, the second
;;; environment is equivalent to the environment passed as an
;;; argument to augment-environment.

;;; When :bound-declarations-p is non-nil, the second value
;;; includes the :function and :variable bindings, and only those
;;; declarations which refer to these bindings (i.e. bound
;;; declarations).  In addition, the :variable bindings and their
;;; bound declarations are added in sequence and successively earlier
;;; bindings can be accessed using ENCLOSING-ENVIRONMENT.  For
;;; example, one might represent the environment formed by:
;;;	(let* ((a 1) (b a) (c b))
;;;       (declare (special b z) (fixnum c))...)
;;; as (augment-environment x   :variable '(a b c)
;;;				:declare '((special b z) (fixnum c))
;;;				:bound-declarations-p t)
;;; The first environment returned is suitable for the body of the
;;; LET* and includes variable bindings for A, B, and C, special
;;; declarations for B and Z, and a fixnum declaration for C.

;;; The second environment returned does not include free declarations.
;;; The ENCLOSING-ENVIRONMENT of this is suitable for the
;;; initialization form for the last variable, C.  It includes the
;;; original environment argument to augment-environment, plus
;;; bindings for A and B, and a special declaration for B.

;;; ENCLOSING-ENVIRONMENT of this environment is suitable for the
;;; initialization form of the preceding variable, B, and includes the
;;; original environment and a binding for A. 

;;; ENCLOSING-ENVIRONMENT of this environment is suitable for the
;;; initialization form of the first variable, A, and is equivalent to
;;; the environment which was originally passed to
;;; augment-environment.

;;; Note that no Common Lisp form introduces both :function and
;;; :variable bindings, and that :function bindings are never
;;; introduced in series.  The scope of :function bindings in LABELS
;;; includes ALL the function definitions forms, so the second
;;; environment value returned by augment-environment may be used for
;;; all the LABELS function definitions and one need not descend
;;; through the environments using ENCLOSING-ENVIRONMENT.  The
;;; second returned value is undefined if one combines non-nil values
;;; for all of :variable, :function and :bound-declarations-p.

;;; In order to support indexing (i.e. renaming of similarly named
;;; variables with overlapping scope) the initialization environment
;;; may include a dummy "declaration-only" binding for the new
;;; variables defined in the body environment.  This is because the C
;;; scope of the initialization forms includes the new bindings, and
;;; these forms may introduce newer bindings with the same name.
;;; Indexing looks for bindings (declaration-only or otherwise) with
;;; the same name.  
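
The LET* example in the comments above can be modelled with a toy
environment structure.  This is only a sketch under assumptions: FRAME,
BOUND-DECLARATIONS and AUGMENT-SEQUENTIALLY are hypothetical names, only
:variable bindings are handled, and ENCLOSING-ENVIRONMENT is reduced to a
slot read:

```lisp
(defstruct frame
  (variables '())      ; names bound by this frame
  (declarations '())   ; declaration specifiers recorded in this frame
  (enclosing nil))     ; next outer frame, or the original environment

(defun enclosing-environment (env)
  (frame-enclosing env))

(defun bound-declarations (declarations names)
  "Keep only the parts of DECLARATIONS that name a variable in NAMES."
  (loop for (id . vars) in declarations
        for kept = (remove-if-not (lambda (v) (member v names)) vars)
        when kept collect (cons id kept)))

(defun augment-sequentially (env names declarations)
  "Return a body environment and an initialization environment for
LET*-style bindings of NAMES over ENV."
  (let ((init env))
    ;; One frame per variable, so each ENCLOSING-ENVIRONMENT step
    ;; exposes one binding fewer.
    (dolist (name names)
      (setf init (make-frame
                  :variables (list name)
                  :declarations (bound-declarations declarations (list name))
                  :enclosing init)))
    (values (make-frame :variables names
                        :declarations declarations
                        :enclosing env)
            init)))
```

Called with the variables (a b c) and the declarations
((special b z) (fixnum c)), the second value is a chain of three frames;
stepping out of it three times with ENCLOSING-ENVIRONMENT reaches the
environment that was passed in, and the free declaration for Z appears
only in the body environment.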

-DEFINE-DECLARATION
;;; This is different than "standard" only for :DECLARE entries:
;;; Steele says that when the handler returns (key . value),
;;; (declaration-information key) returns value.  Instead, when the
;;; handler returns (key value), declaration-information returns all
;;; the values appended together.  This is how optimize is handled. 
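
One possible reading of that appending behaviour, sketched with a
hypothetical DECLARATION-INFORMATION* that takes a list of frames
(innermost first), each an alist from key to a list of values.  The frame
representation is an assumption for illustration only:

```lisp
(defun declaration-information* (key frames)
  "Append the values recorded for KEY in every frame, innermost first."
  (loop for frame in frames
        append (cdr (assoc key frame))))

(defun innermost-quality (quality frames)
  "Find the innermost OPTIMIZE setting for QUALITY, e.g. SPEED."
  (assoc quality (declaration-information* 'optimize frames)))
```

Because inner values come first in the appended list, a plain ASSOC on
the result finds the innermost setting while outer settings for other
qualities remain visible.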

#| ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
[Howard's note: I haven't looked at this in a while, I'm not even sure
if my code still works this way...]

GOLDEN RULE OF DECLARATIONS:
Whenever a new binding is created, any existing outer declarations
should be ignored within the scope of the new binding.  This is
true even for: 
  - type definitions (nesting only applies for declarations
    referring to the same binding).
  - a special binding shadowing a local binding
  - a NEW special binding shadowing an existing special binding.

The only exception is that a local binding of a variable proclaimed
special must still be special.  

One consequence of this is that the compiler does not intersect
information about different bindings of the same special variable.

We implement this by adding declarations to each frame in which the
declaration appears -- to the binding itself if there is one in
the same frame, otherwise to a "dummy" binding.  When collecting
variable information, we append all the applicable declarations
from "dummy" bindings we encounter as well as from the "real"
bindings.  The innermost declarations are listed first, so these
appear earlier in the plist.

LOCALLY SPECIAL RULE:
We further assume that a special declaration which causes
references within the declaration to refer to a variable which
would OTHERWISE BE LEXICAL introduces a new scope in which
declarations for the special variable are accumulated as though a
new binding were being used.  
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; |#
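
A small example of the golden rule and its one exception, written in
portable Common Lisp.  The names *LEVEL*, READ-LEVEL, GOLDEN-RULE-DEMO
and EXCEPTION-DEMO are made up for illustration:

```lisp
(defvar *level* 0)   ; proclaimed special by DEFVAR

(defun read-level ()
  *level*)

(defun golden-rule-demo ()
  (let ((x 1))
    (declare (fixnum x))
    ;; A new binding of X: the outer FIXNUM declaration applies to the
    ;; outer binding only, so binding X to a string here is legitimate.
    (let ((x "a new binding"))
      (length x))))

(defun exception-demo ()
  ;; *LEVEL* is proclaimed special, so this binding is still dynamic
  ;; and is visible to READ-LEVEL.
  (let ((*level* 1))
    (read-level)))
```

Here the inner X is free of the outer type declaration, while the local
binding of *LEVEL* stays special because of the global proclamation.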

> 
> "Howard R. Stearns" <······@elwood.com> writes:
> 
> > What is the Common Lisp idiom for executing a body of code in a
> > dynamic context in which some parameter will be set, but only if it
> > hasn't been already set?
> 
> Is this a homework problem?
> 
> :-)
> 
> >  This comes up often enough in systems
> > programming, and is hard enough to get right, that I'd like to use
> > and/or develop a standard idiom for it.  How this is done also effects
> > other things which we'd like to standardize, such as DEFSYSTEM and
> > perhaps other interfaces to system code (streams, CORBA, ...)
> 
> My first reaction to your first paragraph was "use DEFVAR".
> However, it becomes clear later that what you really want is
> "environments".  One version of this was proposed in CLtL2 (<174>
> SYNTACTIC-ENVIRONMENT-ACCESS:SMALL, as described on pp 206-214 of
> CLtL2), but was then withdrawn due to the facts that nobody had
> implemented it yet, and there were semantic difficulties with the
> system.
> 
> I am doing some research in the environments area, and it is my
> intention (if I am successful) to release an environments implementation
> (with source, of course) to the public, and further to propose such
> additions to the ANSI spec.  Anyone with ideas or problems is welcome
> to respond on this ng or by mail; the more I know about what does or
> doesn't work, the more likely I would be to succeed.
> 
> > I'm going to give some example situations and some successively more
> > complicated implementations.  I'm also going to show how this affects
> > WITH-COMPILATION-UNIT/DEFSYSTEM, and show an analogy to the condition
> > system.
> 
> Rather than quote and answer your whole article, I will put forth
> what I am doing and maybe you can then analyze how what you want to
> do relates to what I am proposing to propose.  I must first say that
> what I am  doing is driven by two needs we have in our product:
> source-level debugging and user-declared inline function support.
> 
> First, I am using the CLtL2 documentation as a guide for a first
> implementation for lexical environments.  I am not sure this will
> stick, because of the lack of support for dynamic environments and
> lack of extensibility.  But it is a start, for now.
> 
> For definitions, I use the ANSI spec for some, and add a term of my own:
> 
> The spec distinguishes between "lexical" and "dynamic" environments,
> but only talks about environment "objects" for the lexical environments.
> The only dynamic environment that is discussed is the "global"
> environment.  However, anyone who deals with file compilation (or any
> other manifestation of with-compilation-unit) knows that during such
> time, things may be added to the "global" environment that will then
> go away after compilation.  The extent to which implementations clean
> up this after-compilation global environment varies between
> implementations.  I have given this intermediate state of the global
> environment a name: a "shadow" environment.  If we could represent a
> shadow environment as an object which is created for each outside or
> overridden with-compilation-unit form, then cleanup would become more
> specifiable, and there would be the potential for more precise
> specification of the scope of various operations by identifying the
> environments to which they pertain.
> 
> Another difference between lexical and dynamic environments is the
> complexity of lexical-environment access and creation: Every time
> an augment-environment returns a new environment, this info must be
> kept/recorded somewhere in hierarchical fashion so that various
> stages of compilation can analyze the entire lexical system within
> the compilation unit.  Such information could even be stored away to
> become the information needed for such things as block-compilation,
> code-rewriting, and source-level-debugging.  Every CL implementation
> has some method of storing just enough info to complete the compilation
> process.  But I am convinced that lexical environments, properly
> designed, will regularize this info and make amazing tools available
> portably that are not available now.
> 
> [Steve Haflich has worked with me to get a straw implementation that
> allows for lexical environments to be implemented with the above
> behaviors; I am not close to being ready to release it, but am
> always implementing the code in such a way that the source can be
> released when the concept is solid.]
> 
> Continuing with the difference discussion, any additions to a dynamic
> environment tend not to want to be logged; instead, the environment as
> a whole (whether it is a "shadow" object or simply the current state
> of the lisp as the global environment).
> 
> I have not yet considered the best implementation for shadow
> environments.  How they are implemented will probably depend
> on what CL functionality should access them.  This seems to me to be
> an easier problem than that of lexical environments, but any help
> I can get would be welcome.
> 
> I have only worked for a couple of months on the nitty-gritty of
> this environments problem, so those of you who know much more than
> I about the details, please be gentle with me.  I intend to learn
> as much as possible, and at the very least report on what I have
> found, if an implementation does not itself come of it.
> 
> Howard, let me know if this concept strikes a chord (or not) with what
> you are trying to do.  To me, environments seem like the natural way
> to do what you are trying to do.
> 
> --
> Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
> 1995 University Ave Suite 275  Berkeley, CA 94704
> Phone: (510) 548-3600; FAX: (510) 548-8253   ·····@Franz.COM (internet)