garbage collector

From: Bob Earl
Subject: garbage collector
Date: Wed, 23 Jan 2002 14:23:08 +0000
Message-ID: <6d5c6b30.0201230623.8fdbb79@posting.google.com>

When moving our application from MCL to LispWorks we experienced some
problems with the garbage collector: While the MCL application (based
on CL-HTTP) keeps more or less the same storage size even over several
weeks and in heavy use, the LW version behaves quite differently. The
same code (apart from the slight changes in CL-HTTP) produces a growing
storage size. After about a week the server needs 1.5 times more
storage, i.e. about 115MB (this figure refers to the "room" function).
In Windows the system even crashed after an overflow caused by the
extensive storage use.

It seems to me that it is not possible to start the GC explicitly in
LispWorks, as can be done in MCL, but only to set GC parameters. These
are the default parameters:
((:ENLARGE-BY-SEGMENTS . 10) (:MINIMUM-FOR-PROMOTE . 1000) 
(:MAXIMUM-OVERFLOW . 1000000) (:MINIMUM-OVERFLOW . 500000) 
(:MINIMUM-BUFFER-SIZE . 200) (:NEW-GENERATION-SIZE . 4096) 
(:PROMOTE-MAX-BUFFER . 100000) (:PROMOTE-MIN-BUFFER . 200) 
(:MAXIMUM-BUFFER-SIZE . 131072) (:MINIMUM-FOR-SWEEP . 8000) 
(:BIG-OBJECT . 131072) (:SMALL-OBJECT . 100))

I wonder if somebody else has experienced similar behavior. Do I need to
change these settings? The usual golden rule is: never change the
GC settings unless you know exactly what you are doing.

Maybe it's not a gc problem, but a delivery problem, but anyway, I
don't know how to handle this problem. Any ideas?

Thanks in advance,
Earl.

Re: garbage collector Carl Shapiro
Re: garbage collector Marc Battyani
- Re: garbage collector Bob Earl
  - Re: garbage collector Tim Bradshaw
    - Re: garbage collector Thomas F. Burdick
    - Re: garbage collector Marc Battyani

From: Carl Shapiro
Subject: Re: garbage collector
Date: Wed, 23 Jan 2002 14:32:46 +0000
Message-ID: <ouypu41kui9.fsf@panix3.panix.com>

········@gmx.net (Bob Earl) writes:

> It seems to me that it is not possible to start the GC explicitly in
> LispWorks, as can be done in MCL, but only to set GC parameters.

Look at the documentation for HCL:MARK-AND-SWEEP, it should do what
you want.

From: Marc Battyani
Subject: Re: garbage collector
Date: Wed, 23 Jan 2002 14:58:23 +0000
Message-ID: <0211FC8E6DFDADCA.4CF1029133849ADD.32EF0B827C43FD64@lp.airnews.net>

"Bob Earl" <········@gmx.net> wrote
> When moving our application from MCL to LispWorks we experienced some
> problems with the garbage collector: While the MCL application (based
> on CL-HTTP) keeps more or less the same storage size even over several
> weeks and in heavy use, the LW version behaves quite differently. The
> same code (apart from the slight changes in CL-HTTP) produces a growing
> storage size. After about a week the server needs 1.5 times more
> storage, i.e. about 115MB (this figure refers to the "room" function).
> In Windows the system even crashed after an overflow caused by the
> extensive storage use.
>
> It seems to me that it is not possible to start the GC explicitly in
> LispWorks, as can be done in MCL, but only to set GC parameters. These
> are the default parameters:
> ((:ENLARGE-BY-SEGMENTS . 10) (:MINIMUM-FOR-PROMOTE . 1000)
> (:MAXIMUM-OVERFLOW . 1000000) (:MINIMUM-OVERFLOW . 500000)
> (:MINIMUM-BUFFER-SIZE . 200) (:NEW-GENERATION-SIZE . 4096)
> (:PROMOTE-MAX-BUFFER . 100000) (:PROMOTE-MIN-BUFFER . 200)
> (:MAXIMUM-BUFFER-SIZE . 131072) (:MINIMUM-FOR-SWEEP . 8000)
> (:BIG-OBJECT . 131072) (:SMALL-OBJECT . 100))
>
> I wonder if somebody else has experienced similar behavior. Do I need to
> change these settings? The usual golden rule is: never change the
> GC settings unless you know exactly what you are doing.
>
> Maybe it's not a gc problem, but a delivery problem, but anyway, I
> don't know how to handle this problem. Any ideas?

You should call (hcl:mark-and-sweep 2), and upgrade to LWW 4.2 if you don't
have it: it now has 4 generations instead of 3, which speeds up GC in
generation 2 because that generation is much smaller now.

Marc
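
A minimal sketch of the explicit call suggested above. The wrapper name
collect-old-generation is hypothetical; only (hcl:mark-and-sweep 2) comes
from this thread, and the SBCL branch is an assumption added so that the
form also loads outside LispWorks.

```lisp
;; Hypothetical wrapper around an explicit old-generation collection.
;; Reader conditionals keep the file loadable in other implementations.
(defun collect-old-generation (&optional (generation 2))
  "Explicitly collect GENERATION (default 2) and return the generation number."
  ;; LispWorks: the call recommended in this thread.
  #+lispworks (hcl:mark-and-sweep generation)
  ;; Assumption: on SBCL a full collection is the closest equivalent.
  #+sbcl (sb-ext:gc :full t)
  ;; Anywhere else we only warn; no explicit GC call is assumed.
  #-(or lispworks sbcl)
  (warn "No explicit GC call known for this implementation.")
  generation)
```

Calling (room) before and after (collect-old-generation) should show
whether the storage held by generation 2 actually shrank.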

From: Bob Earl
Subject: Re: garbage collector
Date: Thu, 24 Jan 2002 09:15:05 +0000
Message-ID: <6d5c6b30.0201240115.6a13853f@posting.google.com>

> You should call (hcl:mark-and-sweep 2), and upgrade to LWW 4.2 if you don't
> have it: it now has 4 generations instead of 3, which speeds up GC in
> generation 2 because that generation is much smaller now.

Well, thanks so much. This will solve our problem. In my opinion
mark-and-sweep should be called automatically, but in LW it is left to
the developer. Especially for servers that run over a long
period of time, it is essential.

Earl.

From: Tim Bradshaw
Subject: Re: garbage collector
Date: Thu, 24 Jan 2002 11:11:07 +0000
Message-ID: <ey3zo34xaus.fsf@cley.com>

* Bob Earl wrote:

> Well, thanks so much. This will solve our problem. In my opinion
> mark-and-sweep should be called automatically, but in LW it is left to
> the developer. Especially for servers that run over a long
> period of time, it is essential.

I think it is called automatically.  I'm not really familiar with the
LW GC, but it is almost certainly some kind of generational system.
Thus it will try to avoid spending too much time doing GC by gradually
promoting long-lived objects into areas which are scanned less
frequently, ultimately `tenuring' them into an area which is never
looked at at all.  All generational GCs (? I think) have the problem
that things can gradually `leak' into the tenured area thus causing
eventual growth in image size.  The alternative is to scan all areas
which means that, if there is a lot of long-lived data, then the
system will get significant GC pauses.

Typically there will be a fair number of parameters you can adjust
which control things like how long objects need to survive to get
promoted, which generations are scanned and so on.  Generational GCs
often do need these parameters set to get the best performance and to
avoid leakage.

Finally, in many cases, where the system does leak gradually, it's
preferable to run a full GC manually, or under program control, at
known good times, rather than suffer frequent long(ish) pauses.  In a
system I worked on we used to trigger a full GC (and various other
long-lived activities) early in the morning, as that was a known quiet
time.

--tim
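
A rough sketch of the "full GC at a known quiet time" approach described
above, in portable Common Lisp. The names seconds-until-hour and
nightly-gc-loop are hypothetical, and the choice of hour and of GC function
is left to the caller. It assumes a 24-hour day, so it will be off by an
hour across a daylight-saving change.

```lisp
(defun seconds-until-hour (hour &optional (now (get-universal-time)))
  "Seconds from NOW until the next local-time occurrence of HOUR:00:00.
Returns a full day (86400) when NOW is exactly HOUR:00:00."
  (multiple-value-bind (sec min hr) (decode-universal-time now)
    (let* ((elapsed (+ sec (* 60 min) (* 3600 hr))) ; seconds since midnight
           (delta (- (* 3600 hour) elapsed)))
      ;; If the target has already passed today, wait until tomorrow.
      (if (plusp delta)
          delta
          (+ delta 86400)))))

(defun nightly-gc-loop (hour gc-function)
  "Sleep until HOUR each day, then call GC-FUNCTION. Never returns."
  (loop
    (sleep (seconds-until-hour hour))
    (funcall gc-function)))
```

The loop blocks, so it is meant to run in its own process or thread, using
whatever the implementation provides for that, with something like
(lambda () (hcl:mark-and-sweep 2)) passed as the GC function on LispWorks.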

From: Thomas F. Burdick
Subject: Re: garbage collector
Date: Thu, 24 Jan 2002 20:19:17 +0000
Message-ID: <xcvbsfja4e2.fsf@famine.OCF.Berkeley.EDU>

Tim Bradshaw <···@cley.com> writes:

> Finally, in many cases, where the system does leak gradually, it's
> preferable to run a full GC manually, or under program control, at
> known good times, rather than suffer frequent long(ish) pauses.  In a
> system I worked on we used to trigger a full GC (and various other
> long-lived activities) early in the morning, as that was a known quiet
> time.

In a situation where you have a server that will leak into the tenured
generation, no matter how you tweak the GC, I think incremental
collection is the best solution.  I have a situation like that, but
where we cannot just do a stop-and-collect-world ever, because of
real-time-ish constraints.  If you can find a time when you *can* stop
the world, that's maybe a better solution.  But if you can't,
incremental collection seems the best option.

-- 
           /|_     .-----------------------.                        
         ,'  .\  / | No to Imperialist war |                        
     ,--'    _,'   | Wage class war!       |                        
    /       /      `-----------------------'                        
   (   -.  |                               
   |     ) |                               
  (`-.  '--.)                              
   `. )----'

From: Marc Battyani
Subject: Re: garbage collector
Date: Thu, 24 Jan 2002 12:42:45 +0000
Message-ID: <7F490D129425C552.0B164FC51DB4CD63.FD6D91D5D68E721A@lp.airnews.net>

"Tim Bradshaw" <···@cley.com> wrote
> * Bob Earl wrote:
>
> > Well, thanks so much. This will solve our problem. In my opinion
> > mark-and-sweep should be called automatically, but in LW it is left to
> > the developer. Especially for servers that run over a long
> > period of time, it is essential.
>
> I think it is called automatically.  I'm not really familiar with the
> LW GC, but it is almost certainly some kind of generational system.
> Thus it will try to avoid spending too much time doing GC by gradually
> promoting long-lived objects into areas which are scanned less
> frequently, ultimately `tenuring' them into an area which is never
> looked at at all.  All generational GCs (? I think) have the problem
> that things can gradually `leak' into the tenured area thus causing
> eventual growth in image size.  The alternative is to scan all areas
> which means that, if there is a lot of long-lived data, then the
> system will get significant GC pauses.

The garbage collection of generation 2 is not automatic.

From the LW doc:
"The first generation normally consists of two segments: the first segment
is relatively small, and is where most of the allocation takes place. The
second segment is called the big-chunk area, and is used for allocating
large objects and when overflow occurs (see below for a discussion of
overflow).

The second generation (generation 1) is an intermediate generation, for
objects that have been promoted from generation 0 (typically for objects
that live for some minutes).

Long-lived objects are eventually promoted to generation 2. Note that
generation 2 is not scanned automatically. Therefore these objects will not
be reclaimed (even if they are not referenced) until an explicit call to a
garbage collector function (for example mark-and-sweep on generation 2, or
clean-down ) or when the image is saved. Normally, objects are not promoted
from generation 2 to generation 3, except when the image is saved.

Generation 3 normally contains only objects that existed at startup time,
that is, those that were saved in the image. Normally it is not scanned at all,
except when an image is saved."

> Typically there will be a fair number of parameters you can adjust
> which control things like how long objects need to survive to get
> promoted, which generations are scanned and so on.  Generational GCs
> often do need these parameters set to get the best performance and to
> avoid leakage.

Avoiding promotion to an older generation is difficult when you have many
threads, as is the case with a (busy) web server. Short-lived data
of a waiting thread can be promoted when other threads trigger GCs, so you
end up with data that is short-lived (for a thread) being seen as long-lived
by the GC. I would be interested to know if there are usable ways to avoid
this (other than using pools of objects).
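
For readers unfamiliar with the pool idea, a minimal sketch in portable
Common Lisp. All names here (buffer-pool, acquire-buffer, release-buffer,
with-pooled-buffer) are hypothetical. It is not thread-safe as written: a
real multi-threaded server would have to guard the free list with whatever
lock its implementation provides.

```lisp
;; Buffers are allocated once and reused, so a request does not create
;; fresh short-lived data that the GC could promote by mistake.
(defstruct (buffer-pool (:constructor %make-buffer-pool))
  (free '() :type list)      ; buffers currently available
  (size 4096 :type fixnum))  ; length of each buffer

(defun make-buffer-pool (&key (count 8) (size 4096))
  "Create a pool holding COUNT preallocated byte buffers of length SIZE."
  (let ((pool (%make-buffer-pool :size size)))
    (loop repeat count
          do (push (make-array size :element-type '(unsigned-byte 8))
                   (buffer-pool-free pool)))
    pool))

(defun acquire-buffer (pool)
  "Take a buffer from POOL, allocating a new one only if the pool is empty."
  (or (pop (buffer-pool-free pool))
      (make-array (buffer-pool-size pool) :element-type '(unsigned-byte 8))))

(defun release-buffer (pool buffer)
  "Return BUFFER to POOL for later reuse."
  (push buffer (buffer-pool-free pool))
  buffer)

(defmacro with-pooled-buffer ((var pool) &body body)
  "Bind VAR to a pooled buffer for BODY; always give it back afterwards."
  (let ((p (gensym "POOL")))
    `(let* ((,p ,pool)
            (,var (acquire-buffer ,p)))
       (unwind-protect (progn ,@body)
         (release-buffer ,p ,var)))))
```

The pooled buffers themselves will end up in an old generation, which is
the intent: they are allocated once and stay live, instead of each request
allocating new data.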

> Finally, in many cases, where the system does leak gradually, it's
> preferable to run a full GC manually, or under program control, at
> known good times, rather than suffer frequent long(ish) pauses.  In a
> system I worked on we used to trigger a full GC (and various other
> long-lived activities) early in the morning, as that was a known quiet
> time.

I've done the same but it's not very pleasant.

Marc