Stack-allocated objects vs. GC

From: Lawrence G. Mayka
Subject: Stack-allocated objects vs. GC
Date: Tue, 02 Aug 1994 16:39:22 +0000
Message-ID: <LGM.94Aug2113922@polaris.ih.att.com>

The DYNAMIC-EXTENT declaration in ANSI Common Lisp is basically a
promise from the programmer that any object(s) bound to the declared
variable or function name will not be accessible (i.e., they will be
"thrown away" rather than saved anywhere, and are hence safely
reclaimable) outside the declaration's dynamic scope.

But now let's say that I have written a DYNAMIC-EXTENT declaration of
FOO.  Within the dynamic scope of that declaration I store the value
of FOO into a cons cell (or other heap-allocated object).  I remain
within this dynamic scope so long that the cons cell gets promoted to
an older generation.  (Assume a generational garbage collector.)
Still within this dynamic scope, I discard the cons cell (but without
nulling out its reference to FOO's object).  I now leave the
DYNAMIC-EXTENT's dynamic scope.  At a later time, the generational
garbage collector, while collecting a young generation and therefore
examining older generations for old-to-new references, encounters the
cons cell I discarded earlier, and examines the cell's reference to
the location that earlier held FOO's object.

If FOO's object was on the stack, that location could now contain
arbitrary gibberish.  If stacks happen to be arbitrarily relocatable
in this particular implementation, that location could now be the
address of the beginning or middle of a heap-allocated object.  The
garbage collector could theoretically check all such references to
ensure that they don't point into a region of memory that could ever
have housed a process's stack, but is it required to make this
(perhaps expensive) check?  And couldn't that loophole potentially
leave around a lot of garbage objects, simply because those objects
refer to locations that at some earlier time were on a stack?  If so,
DYNAMIC-EXTENT may often--depending on the GC implementation--be more
expensive than it's worth.  On the other hand, if the garbage
collector can assume that the programmer will never ever store a
reference to a DYNAMIC-EXTENT object into a heap-allocated object,
then DYNAMIC-EXTENT is far more dangerous, and consequently less
generally useful, than some of us have previously thought.

Any comments from implementors or other experts?  I remember that
Symbolics Genera makes significant use of stack-allocated objects.
Does it safely handle this situation, and if so, how?
--
        Lawrence G. Mayka
        AT&T Bell Laboratories
        ···@ieain.att.com

Standard disclaimer.

Re: Stack-allocated objects vs. GC Sean Foderaro
Re: Stack-allocated objects vs. GC Wade Hennessey
Re: Stack-allocated objects vs. GC Kris Karas
- Re: Stack-allocated objects vs. GC Martin Rodgers
Re: Stack-allocated objects vs. GC Henry G. Baker
Re: Stack-allocated objects vs. GC Paul Fuqua

From: Sean Foderaro
Subject: Re: Stack-allocated objects vs. GC
Date: Tue, 02 Aug 1994 23:51:25 +0000
Message-ID: <JKF.94Aug2155125@frisky.Franz.com>

  
  A generational garbage collector works under the assumption that 
if it is collecting generation N and newer, then all objects in 
generations older than N are still alive.  This assumption is wrong most 
of the time and the garbage collector must be careful not to let this get it 
into trouble.   In the example you gave, the garbage collector believes that a 
tenured cons is alive and thus what it points to must be alive. This is 
the crux of the problem.  Most objects are alive as long as something
(such as the tentured cons) points to them, but for stack consed 
objects this is not the case.  Thus if you are going to allocate 
stack consed objects you must allocate them in a region of memory 
that is easy for the garbage collector to identify.
  In the Allegro CL garbage collector it only takes two compare instructions
for the garbage collector to identify a pointer to a possibly stack consed 
object (and since it makes this check it doesn't have the bug you've
described).
  so... I don't think that there is a problem with the 
dynamic-extent declaration nor should there be restrictions on storing 
a dynamic-extent object in a heap object.  Generational gc's achieve 
their good performance by making invalid assumptions but being careful to 
not cause any damage when those assumptions are wrong.  In the system you 
describe there is a potential for damage when the
"all tenured objects are alive" assumption is wrong.


-john (sean) foderaro
 franz inc.

From: Wade Hennessey
Subject: Re: Stack-allocated objects vs. GC
Date: Tue, 02 Aug 1994 23:53:40 +0000
Message-ID: <WADE.94Aug2165340@galatea.kaleida>

In article <················@polaris.ih.att.com> ···@polaris.ih.att.com (Lawrence G. Mayka) writes:

Newsgroups: comp.lang.lisp
From: ···@polaris.ih.att.com (Lawrence G. Mayka)
Organization: AT&T Bell Laboratories, Naperville, Illinois, USA
Date: Tue, 2 Aug 1994 16:39:22 GMT

[lots of stuff about DYANMIC-EXTENT and GC...]
refer to locations that at some earlier time were on a stack? If so,
DYNAMIC-EXTENT may often--depending on the GC implementation--be more
expensive than it's worth. On the other hand, if the garbage
collector can assume that the programmer will never ever store a
reference to a DYNAMIC-EXTENT object into a heap-allocated object,
then DYNAMIC-EXTENT is far more dangerous, and consequently less
generally useful, than some of us have previously thought.

Any comments from implementors or other experts? I remember that
Symbolics Genera makes significant use of stack-allocated objects.
Does it safely handle this situation, and if so, how?

One answer is that the garbage collector does not examine pointers
into stacks. The ScriptX real-time garbage collector allows stack
allocated objects, but the collector ignores all references to these
objects, since they're not subject to garbage collection anyway. Since
the stacks are scanned during each gc cycle, all live objects on the stacks
will also be scanned, so references back into the heap will be properly
traced. DYNAMIC-EXTENT is still a big win.

I'd guess the Lispm does the same thing. ·····@kaleida.com

From: Kris Karas
Subject: Re: Stack-allocated objects vs. GC
Date: Thu, 04 Aug 1994 15:50:26 +0000
Message-ID: <31r2o2$nf6@hsdndev.harvard.edu>

Lawrence G. Mayka writes:
>Any comments from implementors or other experts?  I remember that
>Symbolics Genera makes significant use of stack-allocated objects.
>Does it safely handle this situation, and if so, how?

Approximately the same way that the GC doesn't draw pretty pictures on
my video screen if I leave a pointer to a video memory location in a
cons cell.  How does the GC know that it doesn't even need to read the
memory location to see that it belongs to the video display?

OK, I'll provide a more detailed hint.  In a typical computer system,
a running program says it needs XXX words of memory, which the OS
provides conveniently starting at address 0 (since each process has
its own set of virtual memory page tables).  The program places heaps
at the bottom growing upwards, stacks at the top growing downwards,
and does its best to make sure they don't collide.  If a program reads
some memory at a truly absurd memory address (how about 0xC0000000) it
gets a page fault and a core dump; the virtual memory system just
doesn't have that much memory to offer, even if the program requested
it gracefully.  Now compare this to the lisp machine virtual memory
system.  If I store something at address 0x00000000 and again at
address 0xC0000000, I'm using only two blocks of virtual memory.
Addresses 0x00000001 through 0xBFFFFFFF are never allocated, and
therefore do not consume any resources within the virtual memory system.
Voila, we can now play games by allowing particular bits in the upper
portion of the address to denote "regions" in virtual memory; each
region can then be treated individually by the garbage collector.
It should be obvious by now that stacks live in a region whose gc
disposition is "ignore".  Whenever the GC encounters a pointer in a
cons cell that points to a stack region, it does not follow the
pointer, nor does it manipulate it in any way.  The only trouble that
a program can get into is referring to the pointer outside of its
dynamic extent, in which case "interesting things" can happen.  :-)
-- 
Kris Karas <···@enterprise.bih.harvard.edu> for fun, or @aviion-b... for work.
(setq *disclaimer* "I barely speak for myself, much less anybody else."
      *conformist-numbers* '((AMA-CCS 274) (DoD 1236))
      *bikes* '((RF900RR-94) (NT650-89 :RaceP T) (TSM-U3 :Freebie-P T)))

From: Martin Rodgers
Subject: Re: Stack-allocated objects vs. GC
Date: Sun, 07 Aug 1994 10:11:05 +0000
Message-ID: <776254265snz@wildcard.demon.co.uk>

In article <··········@hsdndev.harvard.edu>
           ···@enterprise.bih.harvard.edu "Kris Karas" writes:

> it gracefully.  Now compare this to the lisp machine virtual memory
> system.  If I store something at address 0x00000000 and again at
> address 0xC0000000, I'm using only two blocks of virtual memory.
> Addresses 0x00000001 through 0xBFFFFFFF are never allocated, and
> therefore do not consume any resources within the virtual memory system.
> Voila, we can now play games by allowing particular bits in the upper
> portion of the address to denote "regions" in virtual memory; each
> region can then be treated individually by the garbage collector.

This can also be done with NT. In fact, I don't think you can make
any assumption about the address that NT gives for a block of memory
that you've just allocated. Unused pages are "uncommitted". They
don't get created until they're used by the CPU. This could be very
useful for a GC.

For all I know, some Unix systems may also support this, esp if
they support shared memory or mapped files. I don't know, coz I
don't use or programmer for any Unix system. However, I'm sure that
someone else here will know.

-- 
Future generations are relying on us
It's a world we've made - Incubus	
We're living on a knife edge, looking for the ground -- Hawkwind

From: Henry G. Baker
Subject: Re: Stack-allocated objects vs. GC
Date: Sat, 06 Aug 1994 15:30:50 +0000
Message-ID: <hbakerCu4DrE.3Eo@netcom.com>

In article <················@polaris.ih.att.com> ···@polaris.ih.att.com (Lawrence G. Mayka) writes:
>The DYNAMIC-EXTENT declaration in ANSI Common Lisp is basically a
>promise from the programmer that any object(s) bound to the declared
>variable or function name will not be accessible (i.e., they will be
>"thrown away" rather than saved anywhere, and are hence safely
>reclaimable) outside the declaration's dynamic scope.
>
>But now let's say that I have written a DYNAMIC-EXTENT declaration of
>FOO.  Within the dynamic scope of that declaration I store the value
>of FOO into a cons cell (or other heap-allocated object).  I remain
>within this dynamic scope so long that the cons cell gets promoted to
>an older generation.  (Assume a generational garbage collector.)
>Still within this dynamic scope, I discard the cons cell (but without
>nulling out its reference to FOO's object).  I now leave the
>DYNAMIC-EXTENT's dynamic scope.  At a later time, the generational
>garbage collector, while collecting a young generation and therefore
>examining older generations for old-to-new references, encounters the
>cons cell I discarded earlier, and examines the cell's reference to
>the location that earlier held FOO's object.
[...]

I can think of two solutions to your problem offhand.

The first is mentioned in my paper "CONS Should not CONS its
Arguments" (Sigplan Notices 27,3 (Mar 1992), 24-34) -- any
stack-allocated object is immediately promoted or tenured if it is
made accessible to an older generation.

The second is to type such references to stack-allocated objects as
non-traceable by the GC.  Since the GC will trace everything on the stack
anyway, and since the user is responsible for reclaiming stack-allocated
objects, the GC can only cause trouble by trying to trace references into
the stack from non-stack objects.  Of course, this isn't 'safe', but then
stack-allocated objects are inherently unsafe, unless treated more like
solution #1, above.

From: Paul Fuqua
Subject: Re: Stack-allocated objects vs. GC
Date: Sun, 14 Aug 1994 16:41:41 +0000
Message-ID: <PF.94Aug14114141@elissa.hc.ti.com>

   Date: Tue, 2 Aug 1994 16:39:22 GMT
   From: ···@polaris.ih.att.com (Lawrence G. Mayka)

   But now let's say that I have written a DYNAMIC-EXTENT declaration of
   FOO.  Within the dynamic scope of that declaration I store the value
   of FOO into a cons cell (or other heap-allocated object).

On the TI Explorer we only had one kind of stack-allocated object, the
&rest list, and at the time we had two competing ideas for generational
GC, so we took a conservative route:  if you stored a stack-list pointer
into a cons cell, the stack-list was copied into a real list.

It avoided this kind of GC problem (which we probably never anticipated
anyway), though it caused some hair in the EQ microcode that I'm glad to
have forgotten in the intervening five or so years.

Paul Fuqua
Texas Instruments, Dallas, Texas                     ··@hc.ti.com