From: Artem Baguinski
Subject: Garbage Collection and Foreign Data
Date: 
Message-ID: <8765ds7iwv.fsf@caracolito.lan>
hello again

my lisp functions use an evil external library to read media files. it's
evil because it doesn't allow me to allocate memory in some cases -> i
have to remember to call close-codec before i can forget about the
codec altogether.

is there a way to automate this? can i politely ask the garbage
collector to close the codec before it collects it?

related question needs longer introduction:

playing back a media file consists of read /
process / display loop. the process part may differ depending on
effects i want to create, and may wanna have access to several chunks
of media [audio samples or video frames]. on the other hand i really
don't wanna allocate the video-frame object for every video frame i
read, instead i'd like to pre-allocate a frames pool and be able to
pool-get-frame when i need a frame and pool-return-frame when i'm
done with a frame. 

is there a mechanism / data structure in Common Lisp that would
facilitate such pool creation? i'm afraid the whole idea is a bit
counter intuitive and anti-GC, but i'm sure media playback is not the
only task where such alternative memory management would prove
useful...

thanks,
artm

-- 
gr{oe|ee}t{en|ings}
artm 

From: Duane Rettig
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <4znb4v625.fsf@franz.com>
Artem Baguinski <····@v2.nl> writes:

> hello again
> 
> my lisp functions use an evil external library to read media files. it's
> evil because it doesn't allow me to allocate memory in some cases -> i
> have to remember to call close-codec before i can forget about the
> codec altogether.
> 
> is there a way to automate this? can i politely ask the garbage
> collector to close the codec before it collects it?

Some CL implementations are equipped with "finalizations", which allow
just this kind of thing (Allegro CL is one of them, of course :-)
Look for finalizations in your documentation or ask your CL vendor.
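
As a rough illustration of what a finalization looks like, here is a
sketch that assumes SBCL's sb-ext:finalize; other implementations spell
this differently, and on anything but SBCL the helper below does nothing.
The names managed-codec, close-codec-handle, register-finalizer and
open-managed-codec are made up for the example, and the "close" call only
records the handle instead of calling a real library.

```lisp
;; Hypothetical stand-in for the foreign library's close call: it only
;; records which handles were closed.
(defvar *closed-handles* '())

(defun close-codec-handle (handle)
  (push handle *closed-handles*))

;; Lisp-side wrapper around a foreign codec handle.
(defstruct managed-codec handle)

(defun register-finalizer (object function)
  "Ask the GC to call FUNCTION (no arguments) once OBJECT is garbage.
Only SBCL is handled here; elsewhere this does nothing."
  #+sbcl (sb-ext:finalize object function)
  #-sbcl (declare (ignore function))
  object)

(defun open-managed-codec (handle)
  (let ((codec (make-managed-codec :handle handle)))
    ;; The closure captures HANDLE only. Capturing CODEC itself would
    ;; keep it reachable, and the finalizer would never run.
    (register-finalizer codec (lambda () (close-codec-handle handle)))))
```

Even with this in place there is no promise about when, or whether, the
finalizer runs, so an explicit close call is still the primary mechanism.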

> related question needs longer introduction:
> 
> playing back a media file consists of read /
> process / display loop. the process part may differ depending on
> effects i want to create, and may wanna have access to several chunks
> of media [audio samples or video frames]. on the other hand i really
> don't wanna allocate the video-frame object for every video frame i
> read, instead i'd like to pre-allocate a frames pool and be able to
> pool-get-frame when i need a frame and pool-return-frame when i'm
> done with a frame. 
> 
> is there a mechanism / data structure in Common Lisp that would
> facilitate such pool creation? i'm afraid the whole idea is a bit
> counter intuitive and anti-GC, but i'm sure media playback is not the
> only task where such alternative memory management would prove
> useful...

Many CL implementations have some kind of resourcing mechanism, but
one has to be careful, because such mechanisms are philosophically
at odds with garbage-collection, and if not implemented/used with an
understanding of the issues they can lead to surprising performance
degradations.  Allegro CL has such a mechanism, but we've never
exported or formally documented it for this and other reasons.  Ours
has the feature of not trying to keep freed resources for too long
(because then they must be copied in order to be preserved, which
lengthens the gc process).  We do provide informal documentation
for those who ask for it.  As in all cases with implementation-dependent
features, the proper answer is "ask your CL vendor".

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182   
From: Kaz Kylheku
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <cf333042.0402271232.2780b806@posting.google.com>
Artem Baguinski <····@v2.nl> wrote in message news:<··············@caracolito.lan>...
> hello again
> 
> my lisp functions use an evil external library to read media files. it's
> evil because it doesn't allow me to allocate memory in some cases -> i
> have to remember to call close-codec before i can forget about the
> codec altogether.
> 
> is there a way to automate this? can i politely ask the garbage
> collector to close the codec before it collects it?

Most Lisp implementations support various extensions for dealing with
issues like this. Search your documentation for ``finalization'',
``weak pointer'', ``weak hash'' and such.

Also, if you are using these foreign objects in a scoped discipline, you
could use UNWIND-PROTECT to delete them.

> is there a mechanism / data structure in Common Lisp that would
> facilitate such pool creation? i'm afraid the whole idea is a bit
> counter intuitive and anti-GC, but i'm sure media playback is not the
> only task where such alternative memory management would prove
> useful...

You can keep such a pool in a global container. Use finalization to
remove dead objects from the container. This way you are GC-friendly:
objects in the pool are available for fast recycling, but they can
still be garbage-collected if necessary.

Some Lisp implementations support such an auto-cleaning feature in
their hash tables. This is called ``weak hashes'' and is typically
turned on by some nonstandard arguments to MAKE-HASH-TABLE. Weak means
that weak pointers are used to hold references to objects. Weak
pointers are references that are not considered to be references for
the purpose of garbage collection. If an object only has weak pointers
referencing it, it can be collected: and the weak pointers are
magically updated to hold some safe value like NIL, or otherwise
implement safe behavior when attempts are made to retrieve the
nonexistent object.

By directly using finalization hooks and weak pointers, you can
implement GC-friendly behavior over any type of container, such as a
plain list. Instead of inserting the object into the list directly,
you create a weak pointer to the object and insert that. Add a
finalization function to the object which will remove it from the
list.
From: Espen Vestre
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <kwllmo64sc.fsf@merced.netfonds.no>
Artem Baguinski <····@v2.nl> writes:

> my lisp functions use an evil external library to read media files. it's
> evil because it doesn't allow me to allocate memory in some cases -> i
> have to remember to call close-codec before i can forget about the
> codec altogether.
>
> is there a way to automate this? can i politely ask the garbage
> collector to close the codec before it collects it?

The lispy way of doing this is to use unwind-protect, i.e. to
write a macro utilizing unwind-protect that looks like

(with-open-codec (codec)
 <your code here>)
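
A minimal sketch of such a macro, assuming the library exposes an
open-codec / close-codec pair. Both are stubbed out here with
hypothetical stand-ins that only count open codecs, and the (var path)
argument list is an assumption, not the poster's design.

```lisp
;; Hypothetical stand-ins for the foreign library's entry points.
(defvar *open-codecs* 0)

(defun open-codec (path)
  (incf *open-codecs*)
  (list :codec path))

(defun close-codec (codec)
  (declare (ignore codec))
  (decf *open-codecs*))

(defmacro with-open-codec ((var path) &body body)
  "Bind VAR to a codec opened on PATH, run BODY, and close the codec
on the way out, whether BODY returns normally or unwinds."
  `(let ((,var (open-codec ,path)))
     (unwind-protect (progn ,@body)
       (close-codec ,var))))

;; Usage: the codec is closed even if the body signals an error.
(with-open-codec (codec "/media/video/b.mov")
  (format t "decoding with ~S~%" codec))
```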

> is there a mechanism / data structure in Common Lisp that would
> facilitate such pool creation? 

Hmm... it sounds like the objects are released in the same order
as they are getting used, so why not just put them in an array?
-- 
  (espen)
From: Artem Baguinski
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <871xog7gn0.fsf@caracolito.lan>
Espen Vestre <·····@*do-not-spam-me*.vestre.net> writes:

> Artem Baguinski <····@v2.nl> writes:
>
>> my lisp functions use an evil external library to read media files. it's
>> evil because it doesn't allow me to allocate memory in some cases -> i
>> have to remember to call close-codec before i can forget about the
>> codec altogether.
>>
>> is there a way to automate this? can i politely ask the garbage
>> collector to close the codec before it collects it?
>
> The lispy way of doing this is to use unwind-protect, i.e. to
> write a macro utilizing unwind-protect that looks like
>
> (with-open-codec (codec)
>  <your code here>)

aha, i've written macros like that for open files but without
"unwind-protect". i'll check what unwind-protect is in a moment.

but one thing i don't really understand: say i have several media
inputs, each requiring an instance of a codec, and while the program
is running i say: instead of a file "/media/video/b.mov" use a dv
camera on "/dev/fw0". the processing of them should stay the same so
i'd like to simply say:

(setq *input-1* (dv-camera "/dev/fw0"))

which will turn old value *input-1* into garbage. but using "with-"
macro it becomes:

(with-video-inputs ((*input-1* (video-file "/media/b.mov"))
                    (*input-2* (dv-camera  "/media/fw1")))
  (progn 
    (handle-events)
    (display
      (chromakey 
          *green*
          (get-frame *input-1*)
          (get-frame *input-2*)))))

and handle-events is where switching inputs or changing the
value of *green* may occur. so now instead of simply (setq ...) i
have to first close the current input manually and i find that it
complicates the API which is evil.

Disclaimer: i've used lisp for less than a month so i might have missed
the important mechanism i ought to use here, and i'd appreciate it if
somebody would point it out to me. right now i'm gonna go read the
documentation of unwind-protect because i don't have any clue what it
may mean ;-)

>> is there a mechanism / data structure in Common Lisp that would
>> facilitate such pool creation? 
>
> Hmm... it sounds like the objects are released in the same order
> as they are getting used, 

not necessarily.

the simplest example: video effect called "nervous" - you buffer the
frames for last, say, second and display them in randomly shuffled
order. and variations on the theme.

or for example playing back a file with video and audio: you read
video frames and audio samples intermixed in one "meta-stream" but
you can't be sure that the output will happen in the order you've
read them. 

etc.

> so why not just put them in an array?

if i did that in C i'd allocate them in an array and then maintain
the list of free frames.

the reason i decided to switch from doing everything in C to dividing
the labor between C [codecs and some effects] and lisp [flexible
gluing them together] was that i found the overhead of implementing
lists and similar "hi-level" data structures in C too much of a
burden.

-- 
gr{oe|ee}t{en|ings}
artm 
From: Espen Vestre
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <kwd68061es.fsf@merced.netfonds.no>
Artem Baguinski <····@v2.nl> writes:

> and handle-events is where switching inputs or changing the
> value of *green* may occur. so now instead of simply (setq ...) i
> have to first close the current input manually and i find that it
> complicates the API which is evil.

My understanding of your program is too weak to be able to give
you good advice here, but the general point is that leaving
resource cleanup to the GC would be _really_ evil. It may be
that using a macro with unwind-protect isn't the best solution
for you, but there are zillions of other solutions that would
work, for instance you could make a simple macro:

BO-GUI 26 > (defmacro switch-input (var source)
              `(progn
                 (close-codec ,var)
                 (setf ,var ,source)))
SWITCH-INPUT

BO-GUI 27 > (macroexpand '(switch-input *input-1* (dv-camera "/dev/fw0")))
(PROGN (CLOSE-CODEC *INPUT-1*) (SETF *INPUT-1* (DV-CAMERA "/dev/fw0")))
T

> if i did that in C i'd allocate them in an array and then maintain
> the list of free frames.

Ok, so keep the free frames in a lisp list...
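
A sketch of what that could look like, using the pool-get-frame and
pool-return-frame names from the original question. The frame-pool
struct, make-pool, and the idea of passing in a frame constructor are
assumptions made for the example; the pool is not thread-safe.

```lisp
;; The pool is just a list of frames that are currently free.
(defstruct frame-pool
  (free '() :type list))

(defun make-pool (count make-frame)
  "Pre-allocate COUNT frames by calling MAKE-FRAME that many times."
  (make-frame-pool :free (loop repeat count collect (funcall make-frame))))

(defun pool-get-frame (pool)
  "Take a free frame out of POOL, or return NIL if none is left."
  (pop (frame-pool-free pool)))

(defun pool-return-frame (pool frame)
  "Put FRAME back into POOL so it can be reused."
  (push frame (frame-pool-free pool))
  frame)
```

Frames can be returned in any order, so effects that shuffle or hold on
to frames work the same way as a straight read / display loop.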

-- 
  (espen)
From: Artem Baguinski
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <87oerk5xs5.fsf@caracolito.lan>
Espen Vestre <·····@*do-not-spam-me*.vestre.net> writes:
> Artem Baguinski <····@v2.nl> writes:
>> and handle-events is where switching inputs or changing the
>> value of *green* may occur. so now instead of simply (setq ...) i
>> have to first close the current input manually and i find that it
>> complicates the API which is evil.
>
> My understanding of your program is too weak to be able to give
> you good advice here, but the general point is that leaving
> resource cleanup to the GC would be _really_ evil. 

i see. i'm not very comfortable with GC yet ;-) 

> It may be that using a macro with unwind-protect isn't the best
> solution for you, but there are zillions of other solutions that
> would work, for instance you could make a simple macro:
>
> BO-GUI 26 > (defmacro switch-input (var source)
>               `(progn
>                  (close-codec ,var)
>                  (setf ,var ,source)))
> SWITCH-INPUT
>
> BO-GUI 27 > (macroexpand '(switch-input *input-1* (dv-camera "/dev/fw0")))
> (PROGN (CLOSE-CODEC *INPUT-1*) (SETF *INPUT-1* (DV-CAMERA "/dev/fw0")))
> T

i think that's more or less the way to go: both the implementation and
the interface are simple -> artm's happy ;)

thank you for the enlightenment 


-- 
gr{oe|ee}t{en|ings}
artm 
From: Tim Bradshaw
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <ey3wu68wl1y.fsf@cley.com>
* Artem Baguinski wrote:

> i see. i'm not very comfortable with GC yet ;-) 

The main point about GCs is that they may never run, and if they do
run they run at completely unpredictable times.  So even if the GC
you're using allows you to associate cleanup actions with things, you
never really know when, or if, these actions run.  There's a sub-issue
that lots of cleanup actions might cause the whole GC to be slow, but
I think this isn't likely to be a real problem.

Really, you want to let the GC do what it's good at -  managing memory
- and not try to overload it with other object-management tasks.  A GC
isn't a universal object-manager, it's an optimised memory manager. It
can be useful, I guess, if the GC supports cleanup hooks, to add them
to some types of object in such a way that, if the hook ever runs,
you're informed, and use this as a way of detecting leaks in other
parts of the system (however don't assume it's a *reliable* way!).

--tim
From: Ray Dillinger
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <4043C4F0.8B76CB0B@sonic.net>
Tim Bradshaw wrote:
> 
> Really, you want to let the GC do what it's good at -  managing memory
> - and not try to overload it with other object-management tasks.  A GC
> isn't a universal object-manager, it's an optimised memory manager. It
> can be useful, I guess, if the GC supports cleanup hooks, to add them
> to some types of object in such a way that, if the hook ever runs,
> you're informed, and use this as a way of detecting leaks in other
> parts of the system (however don't assume it's a *reliable* way!).

Hmm.  I think that there is a "proper implementation" for garbage 
collectors that makes them suitable for managing most resources.
The most fundamental constraints are that GC is triggered whenever 
one of the managed resources is unavailable and that finalizers are
guaranteed to run before program exit. 

				Bear
From: Rahul Jain
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <87r7wbag6o.fsf@nyct.net>
Ray Dillinger <····@sonic.net> writes:

> finalizers are guaranteed to run before program exit.

How silly. It's _garbage_. You don't need to clean your room when your
house is going to be demolished tomorrow. I think even Sun fixed that
bogosity in the latest Java language spec.

-- 
Rahul Jain
·····@nyct.net
Professional Software Developer, Amateur Quantum Mechanicist
From: Joe Marshall
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <7jy39jiz.fsf@ccs.neu.edu>
Rahul Jain <·····@nyct.net> writes:

> Ray Dillinger <····@sonic.net> writes:
>
>> finalizers are guaranteed to run before program exit.
>
> How silly. It's _garbage_. You don't need to clean your room when your
> house is going to be demolished tomorrow. I think even Sun fixed that
> bogosity in the latest Java language spec.

It depends on how good the OS is.  A couple of very popular OS's have
no support for `unwind-protect' in user processes and get very cranky
if they have to clean up at all.
From: Pascal Bourguignon
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <87ptbvw3gs.fsf@thalassa.informatimago.com>
Rahul Jain <·····@nyct.net> writes:

> Ray Dillinger <····@sonic.net> writes:
> 
> > finalizers are guaranteed to run before program exit.
> 
> How silly. It's _garbage_. You don't need to clean your room when your
> house is going to be demolished tomorrow. I think even Sun fixed that
> bogosity in the latest Java language spec.

Personally, I would not count on France Telecom to stop sending me its
invoices just because my house was demolished, including the phone
and ADSL terminals.

I'd rather finalize this contract before the house is demolished.


Do you run an operating system that flushes your buffers before
killing your process and closing your files?  What OS is that?  Is it
as well able to scan your data structures to know what should be saved
in the user's documents before killing the application process?

-- 
__Pascal_Bourguignon__                     http://www.informatimago.com/
There is no worse tyranny than to force a man to pay for what he doesn't
want merely because you think it would be good for him.--Robert Heinlein
http://www.theadvocates.org/
From: Rahul Jain
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <87fzcqpww9.fsf@nyct.net>
Pascal Bourguignon <····@thalassa.informatimago.com> writes:

> Rahul Jain <·····@nyct.net> writes:
>
>> How silly. It's _garbage_. You don't need to clean your room when your
>> house is going to be demolished tomorrow. I think even Sun fixed that
>> bogosity in the latest Java language spec.
>
> Personally, I would not count on France Telecom to stop sending me its
> invoices just because my house was demolished, including the phone
> and ADSL terminals.
>
> I'd rather finalize this contract before the house is demolished.

They're still holding a reference to your "house". It may be in a
demolished state, but they still know where it was (is?).

> Do you run an operating system that flushes your buffers before
> killing your process and closing your files?  What OS is that?  Is it
> as well able to scan your data structures to know what should be saved
> in the user's documents before killing the application process?

What does any of this have to do with GC?

-- 
Rahul Jain
·····@nyct.net
Professional Software Developer, Amateur Quantum Mechanicist
From: Ray Dillinger
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <40454502.77F2115A@sonic.net>
Rahul Jain wrote:


> > Do you run an operating system that flushes your buffers before
> > killing your process and closing your files?  What OS is that?  Is it
> > as well able to scan your data structures to know what should be saved
> > in the user's documents before killing the application process?
> 
> What does any of this have to do with GC?

The operating system reclaims the "garbage" of exiting programs, 
if the garbage is memory (or in most cases file handles).  This is 
true of most operating systems in use today.

This is why releasing garbage that is memory is not necessary when 
a program exits; the OS will reclaim it anyhow. 

But other resources, which the OS is not necessarily smart enough 
to reclaim?  There you will have to help it.  This basically amounts
to putting your garbage in order so it can be recycled by the OS, and
is needed when the OS isn't smart enough to recycle the stuff on its 
own.

				Bear
From: Rahul Jain
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <871xoapr7q.fsf@nyct.net>
Ray Dillinger <····@sonic.net> writes:

> But other resources, which the OS is not necessarily smart enough 
> to reclaim?  There you will have to help it.  This basically amounts
> to putting your garbage in order so it can be recycled by the OS, and
> is needed when the OS isn't smart enough to recycle the stuff on its 
> own.

The issue is that you can _never_ _ever_ depend on a finalizer being run
because your app could be terminated for reasons beyond your control. If
you're done with something, and something absolutely has to be done to
clean it up, do it then. Do it as part of a close function, whatever. 
Finalizers are not an implementation method, they're a way of catching
bugs and dealing with the mess that dynamically writing your code
produces.

-- 
Rahul Jain
·····@nyct.net
Professional Software Developer, Amateur Quantum Mechanicist
From: Joe Marshall
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <r7wangd0.fsf@ccs.neu.edu>
Rahul Jain <·····@nyct.net> writes:

> Ray Dillinger <····@sonic.net> writes:
>
>> But other resources, which the OS is not necessarily smart enough 
>> to reclaim?  There you will have to help it.  This basically amounts
>> to putting your garbage in order so it can be recycled by the OS, and
>> is needed when the OS isn't smart enough to recycle the stuff on its 
>> own.
>
> The issue is that you can _never_ _ever_ depend on a finalizer being run
> because your app could be terminated for reasons beyond your control. If
> you're done with something, and something absolutely has to be done to
> clean it up, do it then. Do it as part of a close function, whatever. 

You can _never_ _ever_ depend on the cleanup code being run because
your app could be terminated for reasons beyond your control.
From: Rahul Jain
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <87n06uc95d.fsf@nyct.net>
Joe Marshall <···@ccs.neu.edu> writes:

> Rahul Jain <·····@nyct.net> writes:
>
>> The issue is that you can _never_ _ever_ depend on a finalizer being run
>> because your app could be terminated for reasons beyond your control. If
>> you're done with something, and something absolutely has to be done to
>> clean it up, do it then. Do it as part of a close function, whatever. 
>
> You can _never_ _ever_ depend on the cleanup code being run because
> your app could be terminated for reasons beyond your control.

I view using a finalizer as going one step farther than trying to use a
cleanup form, just in case someone decided to not use a cleanup form. 
Therefore, your statement is covered by mine by transitivity. :)

The real issue is that the OS is buggy and can't clean up its own
resources when the application using them is gone. Either that or
there's some remote application that has the same problem. If a
connection to a client dies, the server should dump all resources
allocated for the purpose of serving that client.

-- 
Rahul Jain
·····@nyct.net
Professional Software Developer, Amateur Quantum Mechanicist
From: Ray Dillinger
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <404515F7.593B7596@sonic.net>
Rahul Jain wrote:
> 
> Ray Dillinger <····@sonic.net> writes:
> 
> > finalizers are guaranteed to run before program exit.
> 
> How silly. It's _garbage_. You don't need to clean your room when your
> house is going to be demolished tomorrow. I think even Sun fixed that
> bogosity in the latest Java language spec.
> 

On the contrary, you need to clean your room to make sure you've 
got everything out of it that you want to keep.

And you need to file a change-of-address card with the post office, 
and you need to make sure you're not going to get billed for taxes 
for a house that doesn't exist anymore, and you need to update all 
your ID cards to get the former address off of them, and you should 
probably get the gas and power shut off so that the demolition doesn't
start any nasty fires or unexpected explosions....

					Bear
From: Paul Wallich
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <c2378s$kha$1@reader1.panix.com>
Ray Dillinger wrote:

> Rahul Jain wrote:
> 
>>Ray Dillinger <····@sonic.net> writes:
>>
>>
>>>finalizers are guaranteed to run before program exit.
>>
>>How silly. It's _garbage_. You don't need to clean your room when your
>>house is going to be demolished tomorrow. I think even Sun fixed that
>>bogosity in the latest Java language spec.
>>
> 
> 
> On the contrary, you need to clean your room to make sure you've 
> got everything out of it that you want to keep.
> 
> And you need to file a change-of-address card with the post office, 
> and you need to make sure you're not going to get billed for taxes 
> for a house that doesn't exist anymore, and you need to update all 
> your ID cards to get the former address off of them, and you should 
> probably get the gas and power shut off so that the demolition doesn't
> start any nasty fires or unexpected explosions....

And particularly if you borrowed a bunch of books and CDs and maybe a 
few small power tools from your friends, they're not going to take 
kindly to hearing, "I can't return your stuff because I had my house 
torn down."

paul
From: Pascal Bourguignon
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <87oerdbez4.fsf@thalassa.informatimago.com>
Paul Wallich <··@panix.com> writes:
> And particularly if you borrowed a bunch of books and CDs and maybe a
> few small power tools from your friends, they're not going to take
> kindly to hearing, "I can't return your stuff because I had my house
> torn down."

Unless... the planet is torn down 42 minutes later :-)


-- 
__Pascal_Bourguignon__                     http://www.informatimago.com/
There is no worse tyranny than to force a man to pay for what he doesn't
want merely because you think it would be good for him.--Robert Heinlein
http://www.theadvocates.org/
From: Rahul Jain
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <87brnepwm2.fsf@nyct.net>
Ray Dillinger <····@sonic.net> writes:

> Rahul Jain wrote:
>> 
>> How silly. It's _garbage_. You don't need to clean your room when your
>> house is going to be demolished tomorrow. I think even Sun fixed that
>> bogosity in the latest Java language spec.
>> 
>
> On the contrary, you need to clean your room to make sure you've 
> got everything out of it that you want to keep.
>
> And you need to file a change-of-address card with the post office, 
> and you need to make sure you're not going to get billed for taxes 
> for a house that doesn't exist anymore, and you need to update all 
> your ID cards to get the former address off of them, and you should 
> probably get the gas and power shut off so that the demolition doesn't
> start any nasty fires or unexpected explosions....

But that has nothing to do with the fact that you left some garbage
behind in your house. Finalizers are for _external_ and _limited_
resources that need to be cleaned up in case you happened to leave them
hanging around. You don't need to tell the OS to close your open file
handles when your app exits. If you need to do something when your app
exits, have a graceful exit procedure. Note that graceful exits don't
happen all the time.

-- 
Rahul Jain
·····@nyct.net
Professional Software Developer, Amateur Quantum Mechanicist
From: a
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <AU_0c.67778$vn.196629@sea-read.news.verio.net>
Really suitable for managing most resources? You're probably familiar with
the Resource Acquisition Is Initialization pattern, in which resources like
file handles are opened in constructors and closed in destructors.
Destructors are triggered when the resource should be closed. It is the
programmer's responsibility to trigger the destructor at the proper time.
One approach is to use lexical scope with Lisp's (with-open-handle....)
idiom or local objects in C++.

The difference between C++ and Lisp, in the lexically-scoped example, is
that in the destructor C++ frees not only the resource but also the memory.
The resource is freed in Lisp at the end of the lexical scope and the memory
is freed whenever the GC gets around to freeing it.

In that scenario, what do you mean by the "GC is triggered whenever one of
the managed resources is unavailable"? Do finalizers play any role in the
example I gave?

How is a GC an appropriate tool for managing resources like open file
handles? How does a "proper implementation" of a GC differ from the kind of
GC I have in my existing Lisps?


"Ray Dillinger" <····@sonic.net> wrote in message
······················@sonic.net...
> Tim Bradshaw wrote:
> >
> > Really, you want to let the GC do what it's good at -  managing memory
> > - and not try to overload it with other object-management tasks.  A GC
> > isn't a universal object-manager, it's an optimised memory manager. It
> > can be useful, I guess, if the GC supports cleanup hooks, to add them
> > to some types of object in such a way that, if the hook ever runs,
> > you're informed, and use this as a way of detecting leaks in other
> > parts of the system (however don't assume it's a *reliable* way!).
>
> Hmm.  I think that there is a "proper implementation" for garbage
> collectors that makes them suitable for managing most resources.
> The most fundamental constraints are that GC is triggered whenever
> one of the managed resources is unavailable and that finalizers are
> guaranteed to run before program exit.
>
> Bear
From: Ray Dillinger
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <404514F6.C89933BF@sonic.net>
a wrote:
> 
> Really suitable for managing most resources? You're probably familiar with
> the Resource Acquisition Is Initialization pattern, in which resources like
> file handles are opened in constructors and closed in destructors.
> Destructors are triggered when the resource should be closed. It is the
> programmer's responsibility to trigger the destructor at the proper time.
> One approach is to use lexical scope with Lisp's (with-open-handle....)
> idiom or local objects in C++.
> 
> The difference between C++ and Lisp, in the lexically-scoped example, is
> that in the destructor C++ frees not only the resource but also the memory.
> The resource is freed in Lisp at the end of the lexical scope and the memory
> is freed whenever the GC gets around to freeing it.

This isn't the only way to do it.  I've written a GC that reaps both memory 
and file handles.  In that system you never even close a file; you either 
let the variable holding the port value go out of scope, or set it to some
other value, and the file handle gets reaped (and the file closed) when the 
next GC catches up to it. 
 
> In that scenario, what do you mean by the "GC is triggered whenever one of
> the managed resources is unavailable"? Do finalizers play any role in the
> example I gave?

Say that memory and file handles are your managed resources.  If you find 
that you can't allocate memory because there isn't enough memory, you trigger
a garbage collection.  If you find that you can't open a file because there 
aren't enough file handles or because it's locked (already open for writing, 
but tied to a "garbage" handle) you trigger a garbage collection. 

Note that you can avoid unnecessary complete collections here if you keep 
a table in memory that you can check for the filename, so you know it's 
not you that has the file open.  That's a refinement for better performance. 
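
The retry-on-shortage idea can be sketched portably if the collector is
passed in as a function. Everything here is hypothetical:
acquire-with-gc-retry is a made-up name, ACQUIRE is assumed to return NIL
when the resource is unavailable, and COLLECT stands in for whatever call
triggers a collection (and runs finalizers) in a given implementation.

```lisp
(defun acquire-with-gc-retry (acquire collect &key (max-attempts 3))
  "Call ACQUIRE; if it returns NIL, call COLLECT and try again.
Gives up and returns NIL after MAX-ATTEMPTS failed attempts."
  (loop repeat max-attempts
        do (let ((resource (funcall acquire)))
             (when resource
               (return resource))
             ;; Shortage: give the collector a chance to reap handles
             ;; that are only held by garbage.
             (funcall collect))))
```

The attempt limit is a design choice for the sketch; the collector
described above instead keeps going until the operation succeeds.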
 
> How is a GC an appropriate tool for managing resources like open file
> handles? How does a "proper implementation" of a GC differ from the kind of
> GC I have in my existing Lisps?

Two main ways:

First of all, a proper implementation for managing other resources can 
be triggered when those resources run short; the first attempt to manage
file handles with a GC fails when the system runs out of file handles
and sits there blocked on a file handle, not doing GC because it's not 
out of memory.

Second, if something has a finalizer (like a file handle's finalizer, which 
closes files), that finalizer must be guaranteed to run before program 
termination, even if the object never becomes garbage.  This is necessary
mainly because some kinds of finalizers are interfaces to other systems 
that must do something before shutting down (a port finalizer, for example,
may need to do more than just reap the file handle, which the OS might do 
on program exit; it may need to finish by sending a session checksum over 
the port first.)
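
A minimal sketch of the run-before-termination guarantee, assuming a
plain registry that the runtime drains at shutdown. The names
(REGISTER-FINALIZER, FINALIZE-NOW, RUN-ALL-FINALIZERS) are
hypothetical, and the strong hash table used here would itself keep
the objects alive; a real collector would need a weak table or its own
bookkeeping.

```lisp
;;; Registry of finalizers that have not run yet.  Assumption: a strong
;;; EQ table is good enough for illustration only.
(defvar *pending-finalizers* (make-hash-table :test #'eq))

(defun register-finalizer (object thunk)
  "Remember THUNK as the finalizer of OBJECT."
  (setf (gethash object *pending-finalizers*) thunk))

(defun finalize-now (object)
  "Run OBJECT's finalizer once, if it still has one."
  (let ((thunk (gethash object *pending-finalizers*)))
    (when thunk
      (remhash object *pending-finalizers*)
      (funcall thunk))))

(defun run-all-finalizers ()
  "Run every finalizer that has not run yet.  Meant to be called once
at shutdown, for example from an implementation-specific exit hook."
  (let ((objects (loop for object being the hash-keys
                         of *pending-finalizers*
                       collect object)))
    (mapc #'finalize-now objects)))
```

The collector would call FINALIZE-NOW when it reaps an object, so a
finalizer runs at most once whether the object became garbage or the
program simply exited.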

A possible third way is that in some cases the work may go faster if the 
GC is aware of object identity; there may be a *particular* file handle it
must free before it becomes possible for an operation to succeed if that 
handle is open to the same file that the operation is trying to open and
the OS locks the file until it's freed.  The one I wrote doesn't bother 
with such niceties; it's an incrementalized three-color collector and it
just keeps going until the operation succeeds.  It could get caught doing
full GC's repeatedly if the program engages in a pathological behavior of
continually reopening the same file to write and then dropping the port, 
but this hasn't been a problem so far. If it becomes one, I can fix it 
by adding a memory table. 

Other resources suitable for this treatment include fonts, widgets, 
palettes, brushes, bitmaps, icons, etc...  basically anything the OS 
has its fingers in and wants to be notified about when you quit using 
it.

					Bear
From: Peter Seibel
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <m3r7wakevt.fsf@javamonkey.com>
Ray Dillinger <····@sonic.net> writes:

> a wrote:
>> 
>> Really suitable for managing most resources? You're probably familiar with
>> the Resource Acquisition Is Initialization pattern, in which resources like
>> file handles are opened in constructors and closed in destructors.
>> Destructors are triggered when the resource should be closed. It is the
>> programmer's responsibility to trigger the destructor at the proper time.
>> One approach is to use lexical scope with Lisp's (with-open-handle....)
>> idiom or local objects in C++.
>> 
>> The difference between C++ and Lisp, in the lexically-scoped example, is
>> that in the destructor C++ frees not only the resource but also the memory.
>> The resource is freed in Lisp at the end of the lexical scope and the memory
>> is freed whenever the GC gets around to freeing it.
>
> This isn't the only way to do it. I've written a GC that reaps both
> memory and file handles. In that system you never even close a file;
> you either let the variable holding the port value go out of scope,
> or set it to some other value, and the file handle gets reaped (and
> the file closed) when the next GC catches up to it.

Do the file and memory GC's run together or separately? If together,
what happens if I run out of file handles a microsecond before I'd
run out of memory (i.e. my next memory allocation would fail and
trigger a (memory) GC). Now suppose my filehandle finalizer has to
allocate memory. What happens? I can imagine ways of dealing with
this; I'm just curious about the approach you took.

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp
From: Ray Dillinger
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <40453DE6.F8CD0BBC@sonic.net>
Peter Seibel wrote:
> 
> Ray Dillinger <····@sonic.net> writes:
> 
> Do the file and memory GC's run together or separately? If together,
> what happens if I run out of file handles a microsecond before I'd
> run out of memory (i.e. my next memory allocation would fail and
> trigger a (memory) GC). Now suppose my filehandle finalizer has to
> allocate memory. What happens? I can imagine ways of dealing with
> this; I'm just curious about the approach you took.
> 
> -Peter

They are the same collector; it collects Lisp objects.  Some of the 
objects are ports, which have a file handle and some miscellaneous 
translation/encoding tables.  And if the object collected is a port, 
it closes the file before it drops the references to the tables etc.  
So they "run together" to the extent that the question is meaningful. 

Before I go any further, I'll just point out that there is no system 
whatsoever you can't screw somehow by using allocating finalizers.
Best practices are, as the doctor told the nudist, "don't do that."
Nevertheless, it's very handy, and some find it excruciating to try
to write non-allocating finalizers. 

That said, I'm not supporting user-written finalizer code in my GC 
yet.  It knows how to reclaim memory and it knows how to reclaim 
file handles, and in neither case does it need to do any allocation 
of either.

I've been trying to imagine how to add user-written finalizer code
to the system.  If the user is careful and writes non-allocating 
finalizers, that's easy.  If you don't require non-allocating finalizers,
then the GC must either keep resources around that it only uses for 
running finalizers, or make sure to free enough resources otherwise 
before attempting to run them. Either way, you need an upper bound of 
what the "most allocation" a finalizer might need to do is.  That 
could come from profiling, or from user declarations, or (if I'm 
really good) possibly from static analysis.  However you get the 
estimate of the upper bound, if you are wrong then the program winds 
up blocking waiting for some other program to release the needed 
resource. 

No matter how you slice it though, no matter what you do, there is 
some pathological arrangement of allocating finalizers that can bring 
*any* system down, unless you've got a solution to the halting problem 
in your pocket. 

				Bear
From: Rahul Jain
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <87u116ocik.fsf@nyct.net>
Ray Dillinger <····@sonic.net> writes:

> If you don't require non-allocating finalizers, then the GC must
> either keep resources around that it only uses for running finalizers,
> or make sure to free enough resources otherwise before attempting to
> run them. Either way, you need an upper bound of what the "most
> allocation" a finalizer might need to do is.

Have them allocate as any other code would: into newspace. If there's
not enough room there, then you'd have been out of memory or extremely
close to it anyway.

-- 
Rahul Jain
·····@nyct.net
Professional Software Developer, Amateur Quantum Mechanicist
From: Ray Dillinger
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <40455C6E.8FEA733A@sonic.net>
Rahul Jain wrote:
> 
> Ray Dillinger <····@sonic.net> writes:
> 
> > If you don't require non-allocating finalizers, then the GC must
> > either keep resources around that it only uses for running finalizers,
> > or make sure to free enough resources otherwise before attempting to
> > run them. Either way, you need an upper bound of what the "most
> > allocation" a finalizer might need to do is.
> 
> Have them allocate as any other code would: into newspace. If there's
> not enough room there, then you'd have been out of memory or extremely
> close to it anyway.
> 

Hmmm.  The incremental-GC aspect of it gives it three args; minimum 
number of "steps" to do, minimum number of bytes of memory it must 
reclaim before returning, and minimum number of file handles it must 
reclaim before returning.  

When the GC is called because there is no memory, the bytes argument 
is nonzero.  When it's called because the program is blocking on a filehandle, 
the handles argument is nonzero.  In both cases, running an allocating 
finalizer would be a bad idea.  

But when it's just called with a "steps" argument, it doesn't indicate 
a crisis of any kind.  Allocating finalizers could be run then - but I 
want to make sure there's *enough* resources for them, because they 
cannot call the garbage collector when they try to allocate. I'm not 
comfortable just allocating into newspace with no assurances in that 
case. I still want some estimate of what the most allocation something
might do is. 
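
A minimal sketch of that three-argument entry point, assuming a toy
work list in place of a real heap. The names (GC-INCREMENT,
ALLOCATING-FINALIZERS-ALLOWED-P, *WORK*) and the representation of a
step as a (bytes . handles) pair are hypothetical; only the three
minimums and the "crisis" distinction come from the description here.

```lisp
;;; Each pending unit of collector work is a cons (BYTES . HANDLES)
;;; saying what finishing that step would reclaim.  Purely a stand-in
;;; for real incremental marking and sweeping.
(defvar *work* '())

(defun gc-increment (&key (min-steps 0) (min-bytes 0) (min-handles 0))
  "Do collector steps until all three minimums are met or no work is left.
Returns the steps done, bytes reclaimed and handles reclaimed."
  (let ((steps 0) (bytes 0) (handles 0))
    (loop while (and *work*
                     (or (< steps min-steps)
                         (< bytes min-bytes)
                         (< handles min-handles)))
          do (let ((unit (pop *work*)))
               (incf steps)
               (incf bytes (car unit))
               (incf handles (cdr unit))))
    (values steps bytes handles)))

(defun allocating-finalizers-allowed-p (&key (min-bytes 0) (min-handles 0))
  "True only for a routine call, one that no resource shortage triggered."
  (and (zerop min-bytes) (zerop min-handles)))
```

For example, a call made because the program is blocked on a file
handle would pass :MIN-HANDLES 1 and keep stepping until a handle has
actually been reclaimed.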

				Bear
From: Thomas F. Burdick
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <xcv8yih7q1y.fsf@famine.OCF.Berkeley.EDU>
I'm not convinced this GC approach is all that reasonable, but I could
see it being useful for systems where non-memory resources that are
difficult to manage correctly are extremely common, and memory usage
is relatively low.  No examples of systems with these properties
spring to mind, but I assume you must have run into them.  Obviously,
this approach will fall apart if you need very large amounts of
memory, such that much of the Lisp image is swapped to disk.

Ray Dillinger <····@sonic.net> writes:

> I've been trying to imagine how to add user-written finalizer code
> to the system.  If the user is careful and writes non-allocating 
> finalizers, that's easy.  If you don't require non-allocating finalizers,
> then the GC must either keep resources around that it only uses for 
> running finalizers, or make sure to free enough resources otherwise 
> before attempting to run them. Either way, you need an upper bound of 
> what the "most allocation" a finalizer might need to do is.  That 
> could come from profiling, or from user declarations, or (if I'm 
> really good) possibly from static analysis.

I think compiler support / SA would be the way to go.  It would be
fairly easy to write the analysis to ensure that code doesn't allocate
memory.  For an SA that determines the upper limit, it wouldn't have
to be heroic to be useful: if the SA gets confused, the user can
always pool resources themselves, then complain that the SA needs to
learn to handle a new case.

No matter how you cut it, I'd think you'd want to have finalizers
compiled specially, so their code is static, and won't be subject to
any surprises if a set of functions it uses are being redefined, and
the GC gets triggered when they're in an inconsistent state.

> No matter how you slice it though, no matter what you do, there is 
> some pathological arrangement of allocating finalizers that can bring 
> *any* system down, unless you've got a solution to the halting problem 
> in your pocket. 

Oh, that's easy:

  (error "I can't prove this code will halt.")

For special cases that intimately affect lowlevel system stability,
it's justifiable to break with the normal Lisp principle of accepting
the user's code, no matter how provably bad it is.  I'd make this
exception here exactly because it allows you to better support
inconsistent user code in the vast majority of the system.

-- 
           /|_     .-----------------------.                        
         ,'  .\  / | No to Imperialist war |                        
     ,--'    _,'   | Wage class war!       |                        
    /       /      `-----------------------'                        
   (   -.  |                               
   |     ) |                               
  (`-.  '--.)                              
   `. )----'                               
From: Michael Hudson
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <m3fzcp22aq.fsf@pc150.maths.bris.ac.uk>
Ray Dillinger <····@sonic.net> writes:

> I've been trying to imagine how to add user-written finalizer code
> to the system.

I believe there are large and scary demons here.

> If the user is careful and writes non-allocating finalizers, that's
> easy.  If you don't require non-allocating finalizers, then the GC
> must either keep resources around that it only uses for running
> finalizers, or make sure to free enough resources otherwise before
> attempting to run them. 

What would you do about reference cycles between garbage objects with
user written finalizers?

Cheers,
mwh

-- 
  QNX... the OS that walks like a duck, quacks like a duck, but is,
  in fact, a platypus. ... the adventures of porting duck software 
  to the platypus were avoidable this time.
                                 -- Chris Klein, alt.sysadmin.recovery
From: Ray Dillinger
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <404A1BF6.3AA0659E@sonic.net>
Michael Hudson wrote:
> 
> Ray Dillinger <····@sonic.net> writes:
> 
> > I've been trying to imagine how to add user-written finalizer code
> > to the system.
> 
> I believe there are large and scary demons here.
> 
> > If the user is careful and writes non-allocating finalizers, that's
> > easy.  If you don't require non-allocating finalizers, then the GC
> > must either keep resources around that it only uses for running
> > finalizers, or make sure to free enough resources otherwise before
> > attempting to run them.
> 
> What would you do about reference cycles between garbage objects with
> user written finalizers?

I think I would call that an error.  Since the finalizer code is still 
accessible, objects to which it refers cannot be garbage collected.  A 
single exception is made to this rule, and that is the object it's the 
finalizer *for*.  But if it refers to other objects as well, it seems 
that a reference cycle with user-written finalizers would prevent 
collection. 

				Bear
From: Marcin 'Qrczak' Kowalczyk
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <pan.2004.03.06.21.14.07.600116@knm.org.pl>
On Sat, 06 Mar 2004 18:32:43 +0000, Ray Dillinger wrote:

>> What would you do about reference cycles between garbage objects with
>> user written finalizers?
> 
> I think I would call that an error.  Since the finalizer code is still 
> accessible, objects to which it refers cannot be garbage collected.  A 
> single exception is made to this rule, and that is the object it's the 
> finalizer *for*.  But if it refers to other objects as well, it seems 
> that a reference cycle with user-written finalizers would prevent 
> collection. 

It shouldn't prevent collection. Since the objects can't be referred to
from the main program, they should all be considered "to be finalized
soon" even if they refer to each other, just as cycles don't prevent
collecting objects without finalizers.

There is the question of what to do when a finalizer resurrects a dead object
(either the object being finalized or some other object which was
considered for finalization at the same time) by storing it in some
globally accessible data structure. Other languages with finalizers have
the same problem (e.g. Java).

I think the consensus is that, well, the object is now alive again, but
its finalizer has been already run and it won't be run the second time
when the object becomes dead again. Unless the language provides a means
to explicitly attach a finalizer to an object and another finalizer is
attached, of course.

In other words, when a subgraph of objects without finalizers becomes
unreachable, the memory is just reclaimed. And when a subgraph of objects
with finalizers becomes unreachable, the finalizers are run, and the
objects lose them, hopefully becoming objects belonging to the first
category on the next GC.

How can it be implemented? After a GC, the collector scans the list of
finalizers and moves finalizers whose objects were not found to be alive
to a separate queue. It then marks these objects as alive, together with
objects reachable from them, and then runs the queue of finalizers in a
separate thread. The thread might block, waiting for other threads to
release mutexes which guard shared data. Preemptive concurrency is not
needed but some concurrency is needed in order to write finalizers which
mutate state shared with the main program.
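
A minimal sketch of this scheme as a toy mark phase over an explicit
object graph. All names (NODE, MARK, COLLECT, RUN-FINALIZERS,
*FINALIZERS*) are hypothetical, and the queue is run sequentially here
rather than in a separate thread, which is a simplification of what
the text proposes.

```lisp
;;; Toy heap: nodes with outgoing references and a mark bit.
(defstruct node name refs (marked nil))

(defvar *finalizers* '())   ; alist of (node . function of one argument)

(defun mark (node)
  "Mark NODE and everything reachable from it."
  (unless (node-marked node)
    (setf (node-marked node) t)
    (mapc #'mark (node-refs node))))

(defun collect (roots all-nodes)
  "Return two values: the surviving nodes and the queue of finalizers to run."
  (dolist (n all-nodes) (setf (node-marked n) nil))
  (mapc #'mark roots)
  ;; Finalizers whose objects were not reached move to a separate queue,
  ;; so each object loses its finalizer before the finalizer runs.
  (let ((queue (remove-if #'node-marked *finalizers* :key #'car)))
    (setf *finalizers*
          (remove-if-not #'node-marked *finalizers* :key #'car))
    ;; Keep those objects, and whatever they reference, alive for now.
    (dolist (entry queue) (mark (car entry)))
    (values (remove-if-not #'node-marked all-nodes) queue)))

(defun run-finalizers (queue)
  "Run the queued finalizers (sequentially here, not in another thread)."
  (dolist (entry queue)
    (funcall (cdr entry) (car entry))))
```

With two unreachable nodes that reference each other and both carry
finalizers, the first COLLECT keeps both and queues both finalizers;
once those have run, the next COLLECT reclaims the pair like any other
cycle.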

I have not implemented that yet, so I'm not sure that everything is
correct, but I hope so...

-- 
   __("<         Marcin Kowalczyk
   \__/       ······@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/
From: a
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <FKx1c.68165$vn.198180@sea-read.news.verio.net>
"Ray Dillinger" <····@sonic.net> wrote in message
······················@sonic.net...
...
>
> Say that memory and file handles are your managed resources.  If you find
> that you can't allocate memory because there isn't enough memory, you trigger
> a garbage collection.  If you find that you can't open a file because there
> aren't enough file handles or because it's locked (already open for writing,
> but tied to a "garbage" handle) you trigger a garbage collection.

It sounds like the GC assumes it's at the OS, rather than a process, level.
If the GC runs at a process level, do any of these situations, which seem
like problems of varying degree to me, seem like problems to you?

If other, possibly non-Lisp, processes need the file but the file is locked,
open for writing but tied to a garbage handle, the GC is not aware of the
other process's needs and the other process has to wait on the GC.

A deadlock occurs if the Lisp/File-GC'd process AP has file AF open in a
garbage handle while non-Lisp process BP has file BF open. AP tries to open
BF while BP tries to open AF. (This situation does not occur when the file
handle for AF is closed as soon as AP completes processing the file.) Both
processes are idle and, because of the lack of activity, the Lisp process is
not triggered to run GC.

System file processes do not get the earliest possible chance to flush and
free file buffers. System buffer memory is tied up longer than necessary and
there are lost opportunities to spread out disk IO.

Socket handles would not get flushed, forcing the programmer to explicitly
flush when without the GC they would call close.

...

The GC you describe may be good for process-local handles such as bitmaps
and fonts, which you mentioned. I wonder if there is a politeness issue when
other processes outside the scope of the GC are involved.
From: Ray Dillinger
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <404A1E62.DDEBEA95@sonic.net>
a wrote:

> If other, possibly non-Lisp, processes need the file but the file is locked,
> open for writing but tied to a garbage handle, the GC is not aware of the
> other process's needs and the other process has to wait on the GC.

That's true.  The same is true of dynamically allocated, garbage-collected 
memory or any other resource. 
 
> A deadlock occurs if the Lisp/File-GC'd process AP has file AF open in a
> garbage handle while non-Lisp process BP has file BF open. AP tries to open
> BF while BP tries to open AF. (This situation does not occur when the file
> handle for AF is closed as soon as AP completes processing the file.) Both
> processes are idle and, because of the lack of activity, the Lisp process is
> not triggered to run GC.

No.  Since we are managing file handles with GC, blocking on a file 
handle *MUST* trigger a GC.  That's fundamental. 

 
> System file processes do not get the earliest possible chance to flush and
> free file buffers. System buffer memory is tied up longer than necessary and
> there are lost opportunities to spread out disk IO.

Also true, but only to the same extent and for the same reasons that it's 
true of garbage-collected heap space.  
 
> Socket handles would not get flushed, forcing the programmer to explicitly
> flush when without the GC they would call close.

Wait....  With a socket handle, you flush at protocol synchronization points 
to make sure all bytes written so far are actually transmitted.  That's not 
affected in any way by how you clean up the file handle when you're done. 
For what it's worth, an auto-flush happens when the GC closes the file handle, 
but that's not what you're talking about, is it?

> The GC you describe may be good for process-local handles such as bitmaps
> and fonts, which you mentioned. I wonder if there is a politeness issue when
> other processes outside the scope of the GC are involved.

There is ... to the same extent as, and for the same reasons as, there are 
politeness issues involved in managing dynamically allocated memory. 

				Bear
From: Mario S. Mommer
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <fz7jy88nx4.fsf@germany.igpm.rwth-aachen.de>
Espen Vestre <·····@*do-not-spam-me*.vestre.net> writes:
> My understanding of your program is too weak to be able to give
> you good advice here, but the general point is that leaving
> resource cleanup to the GC would be _really_ evil.

It is a technique that has its uses. If you use a reasonable lisp, and
you remember to trigger full gc often enough (in this case, perhaps
every 10th opened codec, though the delay that causes might be
unacceptable), it can simplify things greatly.
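
A minimal sketch of the "every 10th opened codec" idea, assuming the
caller supplies both the function that opens a codec and the function
that forces a full collection, since neither is standard Common Lisp.
The names CALL-WITH-PERIODIC-GC, *GC-EVERY* and *OPENS-SINCE-GC* are
hypothetical.

```lisp
(defparameter *gc-every* 10)   ; interval taken from the "every 10th" example
(defvar *opens-since-gc* 0)    ; opens counted since the last forced GC

(defun call-with-periodic-gc (open-fn full-gc-fn)
  "Call OPEN-FN and return its value; after every *GC-EVERY*th call,
also call FULL-GC-FN so that finalizers of lost codecs get a chance to run."
  (prog1 (funcall open-fn)
    (when (>= (incf *opens-since-gc*) *gc-every*)
      (setf *opens-since-gc* 0)
      (funcall full-gc-fn))))
```

On SBCL, for instance, FULL-GC-FN could be (lambda () (sb-ext:gc :full t));
other implementations spell this differently.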

It is also useful as a safeguard. It is probably going to happen
often during development that a codec (or any other resource, for that
matter) gets lost in the image, simply because there are no more
references to it. Instead of exiting and re-entering your lisp, you
just trigger gc. Problem solved.
From: Mario S. Mommer
Subject: Re: Garbage Collection and Foreign Data
Date: 
Message-ID: <fzllmoft15.fsf@germany.igpm.rwth-aachen.de>
Artem Baguinski <····@v2.nl> writes:
[...]
> and handle-events is where the switching inputs or changing the
> value of *green* may occur. so now instead of simply (setq ...) i
> have to first close current input manually and i find that it
> complicates the API which is evil. 

I wrote a small framework for tracking foreign objects and using
garbage collection. It is used in the lgtk lib, which you can find
here:

www.common-lisp.net/project/lgtk

Grep for capsules and nexus. It should be easy to reuse, since the
idea was precisely to have a reusable framework.

There is no documentation on these matters yet. I'm working on it, but
time is a scarce resource.