From: Evan Monroig
Subject: memory problems
Date: 
Message-ID: <87ac42arty.fsf@gmail.com>
Hi,

While doing some data processing using SBCL, I ran into memory problems.

Basically what I do is for a set of cases, generate some input data,
write to a file, call an external program (Abaqus) to process the data,
read the output, process it a bit and finally write it out.

After 6 runs, SBCL hangs with the following error.

----begin offending output----

* Argh! gc_find_free_space failed (first_page), nbytes=16.
   Gen Boxed Unboxed LB   LUB  !move  Alloc  Waste   Trig    WP  GCs Mem-age
   0:     0     0     0     0     0        0     0  2000000    0   0  0.0000
   1:     0     0     0     0     0        0     0  2000000    0   0  0.0000
   2:     0     0     0     0     0        0     0  2000000    0   0  0.0000
   3:     0     0     0     0     0        0     0  2000000    0   0  0.0000
   4: 28255 43692     0     0     0 294673256 21656 105300216    0   1  1.0274
   5: 21191 30452   250  1161   325 217017200 291984  2000000 10138   0  0.4900
   6:  6071     0     0     0     0 24866816     0  2000000 5718   0  0.0000
   Total bytes allocated=536557272
fatal error encountered in SBCL pid 9660(tid 3053411232):


The system is too badly corrupted or confused to continue at the Lisp
level. If the system had been compiled with the SB-LDB feature, we'd drop
into the LDB low-level debugger now. But there's no LDB in this build, so
we can't really do anything but just exit, sorry.

Process inferior-lisp exited abnormally with code 1

----end offending output----

I cannot really make sense of it, but my guess is that the GC fails to
release memory, most probably because I made mistakes and still have
references to part of my data.  And a limit was reached. About 512MB,
which is one quarter of my RAM.

Can you confirm this?  Are there ways to look at what data is still
referenced? 

Thanks in advance,

Evan

From: ················@Web.de
Subject: Re: memory problems
Date: 
Message-ID: <1160668153.979592.170800@m7g2000cwm.googlegroups.com>
Hi,

> I cannot really make sense of it, but my guess is that the GC fails to
> release memory, most probably because I made mistakes and still have
> references to part of my data.  And a limit was reached. About 512MB,
> which is one quarter of my RAM.

SBCL limits your memory to a pre-allocated range, depending on
architecture. 512 MB is the figure I know from x86/Linux. If this is
the problem (and not dangling references) I can provide a patch to
increase to above 1 GB on that combination.

> Can you confirm this?  Are there ways to look at what data is still
> referenced?

There are ways, but it's easier to check the program and/or experiment
before diving into this. Here are two threads in which I'm battling
with such a problem. Maybe there's useful information in there for you.

http://groups.google.de/group/sbcl-help-archive/browse_frm/thread/279647d32894cc7b/343d5fbdc4a318be?lnk=gst&q=laux&rnum=1#343d5fbdc4a318be
http://groups.google.de/group/comp.lang.lisp/browse_frm/thread/2fc23ab47420dfbc/875849c2cf468195?lnk=gst&q=laux&rnum=1#875849c2cf468195

Cheers,

--
Chris Laux
http://artofcomputing.net/
From: Evan Monroig
Subject: Re: memory problems
Date: 
Message-ID: <87zmc0nz7i.fsf@gmail.com>
················@Web.de writes:
>
> SBCL limits your memory to a pre-allocated range, depending on
> architecture. 512 MB is the figure I know from x86/Linux. If this is
> the problem (and not dangling references) I can provide a patch to
> increase to above 1 GB on that combination.

I see.  I might ask you if I run into the problem again :).

>> Can you confirm this?  Are there ways to look at what data is still
>> referenced?
>
> There are ways, but it's easier to check the program and/or experiment
> before diving into this. Here are two threads in which I'm battling
> with such a problem. Maybe there's useful information in there for you.
>
> http://groups.google.de/group/sbcl-help-archive/browse_frm/thread/279647d32894cc7b/343d5fbdc4a318be?lnk=gst&q=laux&rnum=1#343d5fbdc4a318be
> http://groups.google.de/group/comp.lang.lisp/browse_frm/thread/2fc23ab47420dfbc/875849c2cf468195?lnk=gst&q=laux&rnum=1#875849c2cf468195

Thanks, I found those threads very useful.  They cover a lot of special
cases that could be the cause, but none of them looks like my case.

Actually I didn't know how to do a manual GC, so I tried

(sb-ext:gc :full t)

and memory was back to the original level (^_^).

My next thought was to perform a full garbage collection after each run,
by executing (sb-ext:gc :full t) in my function but outside of the
bindings that contain the data, but for some reason it didn't release
any memory...

Are there any best practices to perform a GC with SBCL, or can I do it
just about anywhere in my code?  I don't mean to release 1 or 2MB every
2 seconds, but 200 or 300MB every 10 minutes or so (when I go to the
next simulation).
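
A minimal sketch of the "full GC between runs" idea, with the heap usage
printed after each collection so it is visible whether memory really is
released.  RUN-ONE-CASE and RUN-ALL-CASES are hypothetical names standing
in for the real simulation code; SB-KERNEL:DYNAMIC-USAGE is an internal
SBCL function, so treat its use here as an assumption about your version.

```lisp
;; Hypothetical stand-in for one simulation run: allocates some data,
;; uses it, and returns NIL so nothing is kept alive by the caller.
(defun run-one-case (n)
  (let ((data (make-array (* n 1000) :initial-element 0)))
    (length data))
  nil)

;; Bytes currently in use in the dynamic space (0 outside SBCL).
(defun heap-bytes ()
  #+sbcl (sb-kernel:dynamic-usage)
  #-sbcl 0)

;; Run every case, forcing a full collection after each one.  The GC call
;; sits outside the LET in RUN-ONE-CASE, so the run's data is no longer
;; bound when the collector runs.
(defun run-all-cases (cases)
  (dolist (c cases)
    (run-one-case c)
    #+sbcl (sb-ext:gc :full t)
    (format t "~&after case ~A: ~:D bytes in use~%" c (heap-bytes))))
```

If the number printed keeps growing from case to case even with the
explicit GC, something is still reachable (or is being conserved by the
collector), and calling the GC more often will not help.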

Thanks again,

Evan
From: David E. Young
Subject: Re: memory problems
Date: 
Message-ID: <1160747440.197505.316860@h48g2000cwc.googlegroups.com>
Evan Monroig wrote:
>
> My next thought was to perform a full garbage collection after each run,
> by executing (sb-ext:gc :full t) in my function but outside of the
> bindings that contain the data, but for some reason it didn't release
> any memory...
>
> Are there any best practices to perform a GC with SBCL, or can I do it
> just about anywhere in my code?  I don't mean to release 1 or 2MB every
> 2 seconds, but 200 or 300MB every 10 minutes or so (when I go to the
> next simulation).

This might not be relevant at all, but we discovered several months ago
that the Lispworks GC (which is a generational-type collector) does not
automatically collect long-lived objects promoted to generation 2. Our
app was accumulating these things over time and running out of memory.
Once we figured this out (it was all documented in the Lispworks docs,
of course!), we added a strategically placed call to the gc, asking it
to collect generation 2. Magic; problem solved.

I don't know how the sbcl gc works, but perhaps this will help.
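
For comparison, a rough SBCL counterpart of "collect generation 2" might
look like the sketch below.  It assumes that SB-EXT:GC accepts a :GEN
keyword naming the oldest generation to collect, which is how the
function is documented in SBCL; COLLECT-OLD-GENERATIONS is a made-up
helper name.

```lisp
;; Ask the collector to include generations up to OLDEST.
;; (sb-ext:gc :full t) would collect every generation instead.
(defun collect-old-generations (&optional (oldest 2))
  (declare (ignorable oldest))
  #+sbcl (sb-ext:gc :gen oldest)
  oldest)
```

Whether this is needed at all on SBCL is a separate question; the sketch
only shows what the explicit call would look like.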

peace, david
From: Damien Kick
Subject: Re: memory problems
Date: 
Message-ID: <ticYg.9419$Y24.7786@newsread4.news.pas.earthlink.net>
David E. Young wrote:
> 
> [...] we discovered several months ago that the Lispworks GC (which is
> a generational-type collector) does not automatically collect
> long-lived objects promoted to generation 2. Our app was accumulating
> these things over time and running out of memory.  Once we figured
> this out (it was all documented in the Lispworks docs, of course!),
> we added a strategically placed call to the gc, asking it to
> collect generation 2. Magic; problem solved.
> 
> I don't know how the sbcl gc works, but perhaps this will help.

I'm not an expert on garbage collection so perhaps that is the reason 
that I'm rather surprised by this behavior.  I would hope that my lisp 
implementation, if it used such a generational GC, would at least 
automatically consider collecting such long-lived objects promoted to 
generation 2 for me if an attempt to cons had just failed because it had 
just run out of memory, as opposed to having to code this explicitly in 
the application somehow.
From: John Thingstad
Subject: Re: memory problems
Date: 
Message-ID: <op.thfnovghpqzri1@pandora.upc.no>
On Sat, 14 Oct 2006 23:05:29 +0200, Damien Kick <·····@earthlink.net>  
wrote:

> David E. Young wrote:
>> [...] we discovered several months ago that the Lispworks GC (which is
>> a generational-type collector) does not automatically collect
>> long-lived objects promoted to generation 2. Our app was accumulating
>> these things over time and running out of memory.  Once we figured
>> this out (it was all documented in the Lispworks docs, of course!),
>> we added a strategically placed call to the gc, asking it to
>> collect generation 2. Magic; problem solved.
>>
>> I don't know how the sbcl gc works, but perhaps this will help.
>
> I'm not an expert on garbage collection so perhaps that is the reason  
> that I'm rather surprised by this behavior.  I would hope that my lisp  
> implementation, if it used such a generational GC, would at least  
> automatically consider collecting such long-lived objects promoted to  
> generation 2 for me if an attempt to cons had just failed because it had  
> just run out of memory, as opposed to having to code this explicitly in  
> the application somehow.

Conses are stored and handled separately.
It is in particular long-lived arrays that get promoted to the area
that doesn't get garbage collected.

-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
From: Espen Vestre
Subject: Re: memory problems
Date: 
Message-ID: <m1vemkb3if.fsf@doduo.vestre.net>
Damien Kick <·····@earthlink.net> writes:

> I'm not an expert on garbage collection so perhaps that is the reason
> that I'm rather surprised by this behavior.  I would hope that my lisp
> implementation, if it used such a generational GC, would at least
> automatically consider collecting such long-lived objects promoted to
> generation 2 for me if an attempt to cons had just failed because it
> had just run out of memory, as opposed to having to code this
> explicitly in the application somehow.

You can tell LispWorks to do it automatically, if you prefer that
(with LW 5.0, I would consider that, since gen. 2 GCs tend to be
much faster than with previous versions, at least with my app.)
-- 
  (espen)
From: ·········@random-state.net
Subject: Re: memory problems
Date: 
Message-ID: <1160901616.581962.218130@b28g2000cwb.googlegroups.com>
Evan Monroig wrote:

> Actually I didn't know how to do a manual GC, so I tried
>
> (sb-ext:gc :full t)
>
> and memory was back to the original level (^_^).

*That* is a manual GC.

> Are there any best practices to perform a GC with SBCL, or can I do it
> just about anywhere in my code?  I don't mean to release 1 or 2MB every
> 2 seconds, but 200 or 300MB every 10 minutes or so (when I go to the
> next simulation).

You should not need to: garbage collection happens automatically unless
you inhibit it by WITHOUT-GCING or GC-OFF. Don't do that unless you
know what you are doing, though.

If you need to toggle the granularity at which GCs are performed you
can experiment with SB-EXT:*BYTES-CONSED-BETWEEN-GCS* -- but once
again, you should not need to: using it is a way to optimize the GC
for your allocation patterns.
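
A hedged sketch of that tuning knob.  As far as I know, the SBCL manual
documents it as a SETF-able function, SB-EXT:BYTES-CONSED-BETWEEN-GCS,
rather than as the earmuffed variable written above, so check which form
your version provides.  CALL-WITH-GC-INTERVAL is a hypothetical helper,
and the 100 MB figure in the usage line is an arbitrary example.

```lisp
;; Run THUNK with the "bytes allocated between collections" threshold
;; temporarily set to BYTES, restoring the old value afterwards.
(defun call-with-gc-interval (bytes thunk)
  (declare (ignorable bytes))
  #+sbcl
  (let ((old (sb-ext:bytes-consed-between-gcs)))
    (setf (sb-ext:bytes-consed-between-gcs) bytes)
    (unwind-protect (funcall thunk)
      (setf (sb-ext:bytes-consed-between-gcs) old)))
  #-sbcl
  (funcall thunk))

;; Usage: collect less often while a large, allocation-heavy step runs.
;; (call-with-gc-interval (* 100 1024 1024) #'my-simulation-step)
```

A larger value means fewer, bigger collections; it changes when the GC
runs, not what it is able to free.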

If you are running out of heap, then you are either
  1) running out of heap. Solution: build an SBCL with a bigger heap,
help is available.

  2) holding on to data you should not be holding on to. This has
already been commented on.

  3) being bitten by the conservativism of the GC. (On x86 SBCL's GC is
stack and register conservative.) Being bitten by conservativism hard
enough to run out of heap is rare but not impossible: if eg. a large
array full of other objects is conserved, neither the array nor any of
the objects will be collected. You can try to debug release patterns by
using finalizers (see the manual for details).
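
A small sketch of the finalizer approach, assuming SB-EXT:FINALIZE as
described in the SBCL manual.  MAKE-TRACKED-BUFFER and *RELEASED* are
illustrative names.  The finalizer must not close over the object itself,
otherwise the closure keeps the object alive and it is never collected.

```lisp
;; Count of tracked buffers that the GC has actually reclaimed.
(defvar *released* 0)

;; Allocate a buffer and register a finalizer that bumps the counter when
;; the buffer is collected.  The lambda deliberately does not refer to
;; BUFFER.  The INCF is not synchronized, which is good enough for a
;; debugging counter but not for anything precise.
(defun make-tracked-buffer (size)
  (let ((buffer (make-array size :element-type 'double-float
                                 :initial-element 0d0)))
    #+sbcl (sb-ext:finalize buffer (lambda () (incf *released*)))
    buffer))
```

After a run, drop all references, call (sb-ext:gc :full t), and compare
*RELEASED* with the number of buffers created.  A count that stays low
suggests the buffers are still reachable or are being conserved.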

For future advice on SBCL, I'd really recommend the sbcl-help mailing
list, which you can subscribe to here:
  http://lists.sourceforge.net/mailman/listinfo/sbcl-help
or read directly on GMane.
  http://news.gmane.org/gmane.lisp.steel-bank.general
While several SBCL developers do read comp.lang.lisp, I can
guarantee you that it is monitored with distinctly less zeal than the
mailing lists (and posts like this without SBCL in the Subject are
quite liable to slip through the net).

Also, from the output you posted I can tell that the SBCL you are using
is not a very recent one. More up-to-date releases of SBCL (starting
with 0.9.14) try to manage to signal a sensible STORAGE-CONDITION when
running out of heap instead of just bailing out. Getting SBCL 0.9.17
might be a good idea.
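
With a release that signals a condition, the failure can at least be
caught and reported.  The sketch below relies only on the standard
STORAGE-CONDITION type; CALL-WITH-HEAP-GUARD is a hypothetical helper,
and how much can safely be done inside the handler when the heap is
really full is not something this sketch guarantees.

```lisp
;; Call THUNK; if the implementation signals STORAGE-CONDITION (for
;; example on heap exhaustion), report it and return :HEAP-EXHAUSTED
;; instead of letting the condition propagate.
(defun call-with-heap-guard (thunk)
  (handler-case (funcall thunk)
    (storage-condition (c)
      (format *error-output* "~&Out of memory: ~A~%" c)
      :heap-exhausted)))

;; Usage, with MY-SIMULATION-STEP standing in for the real work:
;; (call-with-heap-guard #'my-simulation-step)
```

HANDLER-CASE unwinds the stack before the handler body runs, so data
referenced only from the abandoned frames becomes collectable again.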

Cheers,

  -- Nikodemus Siivola
From: Evan Monroig
Subject: Re: memory problems
Date: 
Message-ID: <87irilhxid.fsf@gmail.com>
Thanks for your detailed response.

I will start a new discussion on the SBCL mailing list as advised, but I
will also answer below.

·········@random-state.net writes:
> Evan Monroig wrote:
>
>> Actually I didn't know how to do a manual GC, so I tried
>>
>> (sb-ext:gc :full t)
>>
>> and memory was back to the original level (^_^).
>
> *That* is a manual GC.

Yes, and I learned it from the previous messages.

>> Are there any best practices to perform a GC with SBCL, or can I do
>> it just about anywhere in my code?  I don't mean to release 1 or 2MB
>> every 2 seconds, but 200 or 300MB every 10 minutes or so (when I go
>> to the next simulation).
>
> You should not need to: garbage collection happens automatically
> unless you inhibit it by WITHOUT-GCING or GC-OFF. Don't do that unless
> you know what you are doing, though.

Ok.  As you say, I think I don't need that, because it looks like GC is
called several times while memory increases, but it looks like it is
unable to release memory.

> If you need to toggle the granularity at which GCs are performed you
> can experiment with SB-EXT:*BYTES-CONSED-BETWEEN-GCS* -- but once
> again, you should not need to: using it is a way to optimize the GC
> for your allocation patterns.
>
> If you are running out of heap, then you are either
>   1) running out of heap. Solution: build an SBCL with a bigger heap,
> help is available.

When I perform a run, the memory increases to about 200MB.  Then if I do
a manual full GC, it comes back to about 40MB.  So if memory is correctly
released, the size of the heap shouldn't be a problem.

>   2) holding on to data you should not be holding on to. This has
> already been commented on.

Yes.  I don't think that is the case.

>   3) being bitten by the conservativism of the GC. (On x86 SBCL's GC
> is stack and register conservative.) Being bitten by conservativism
> hard enough to run out of heap is rare but not impossible: if eg. a
> large array full of other objects is conserved, neither the array nor
> any of the objects will be collected. You can try to debug release
> patterns by using finalizers (see the manual for details).

So this is the only possibility.  I will investigate a little more to
see what exactly is not being released.

> For future advice on SBCL, I'd really recommend the sbcl-help mailing
> list, which you can subscribe to here:
>   http://lists.sourceforge.net/mailman/listinfo/sbcl-help
> or read directly on GMane.
>   http://news.gmane.org/gmane.lisp.steel-bank.general
[...]

Thanks.  I will start a new discussion on the SBCL mailing list.

> Also, from the output you posted I can tell that the SBCL you are
> using is not a very recent one. More up-to-date releases of SBCL
> (starting with 0.9.14) try to manage to signal a sensible
> STORAGE-CONDITION when running out of heap instead of just bailing
> out. Getting SBCL 0.9.17 might be a good idea.

Ok.  I will switch to a more recent version.

Thanks for everything,

Evan
From: Wade Humeniuk
Subject: Re: memory problems
Date: 
Message-ID: <wasXg.58703$E67.4359@clgrps13>
Evan Monroig wrote:
> Hi,
> 
> While doing some data processing using SBCL, I ran into memory problems.
> 
> Basically what I do is for a set of cases, generate some input data,
> write to a file, call an external program (Abaqus) to process the data,
> read the output, process it a bit and finally write it out.
> 
> After 6 runs, SBCL hangs with the following error.

Are you running the program from the listener prompt? If you
are, you have to be careful, as the history of what you are doing
is kept in the vars *, **, ***.  If large data structures are
being returned by your evals, they may hang around in your
history taking up heap.
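
A sketch of clearing that history by hand.  The variables are the
standard Common Lisp REPL variables; CLEAR-REPL-HISTORY is a made-up
name, and an IDE such as SLIME may keep its own copies of past results
that this does not touch.

```lisp
;; Drop the references the REPL keeps to recent results (*, **, ***)
;; and to recent value lists (/, //, ///) so the GC can reclaim them.
(defun clear-repl-history ()
  (setf * nil ** nil *** nil
        / nil // nil /// nil)
  nil)
```

Calling it at the prompt still leaves its own NIL result in *, which is
harmless.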

Wade
From: Evan Monroig
Subject: Re: memory problems
Date: 
Message-ID: <8764eope7a.fsf@gmail.com>
Wade Humeniuk <··················@telus.net> writes:
> Are you running the program from the listener prompt? If you
> are, you have to be careful, as the history of what you are doing
> is kept in the vars *, **, ***.  If large data structures are
> being returned by your evals, they may hang around in your
> history taking up heap.

I am running the program from slime, but anyway it is encapsulated in a
function which returns nil, so I think that this is not the source.

Thanks for the advice anyway.  I had forgotten about this, so I will know
where to look if I have this kind of problem next time :).

Evan