it's tempting to use hash-tables for caching various stuff, but if number of
objects we are dealing with is great, we have a chance to run out of memory.
on of solutions to this problem is weak hash tables -- they do not prevent
entries of hash tables from being garbage collect, if they are not
referenced in other places. we can't run out of memory with such table, but
it gives no guarantee how long entries stay in cache. if our application
processes data on requests (web server, for example), and we do not preserve
state in memory between requests, they are high chances that generational GC
will delete entries cached for current request as soon as request completes,
thus in some cases there will be no effect from cache at all!
so, for some caching purposes weak hash tables are not good at all.
Java programming language has so-called "soft references" that can be used
for implementation of memory-sensitive -- as i understand, garbage collector
will clear them in low-memory situation, but not just because it can.
unfortunately, i've seen nothing like this for Common Lisp.
so i'm looking for a user-mode library for managing cache. the only
requirement i have is low overhead -- something comparable with hash-table
access time. i don't need some optimized strategies and automatic
size-tuning at time, but that would be nice of course.
(if there's something implementation-dependant for SBCL that will work too).
i wasn't able to find one in internets, so it seems i'll have to write my
own. i have two ideas:
1. ht + LRU list.
HT just points to entry in LRU singly linked list, which contains data
itself. on hit entry is moved to the top of the list.
on miss, if destination size is reached, bottom in list is evicted from
cache (pointer in HT deleted) and new one is added on top.
seems to be simple. probably LRU can be replaced by something more advanced
like ARC.
2. 2 hts
we have two hash tables -- one is primary and another secondary. new entries
are added only to primary one, but in case of miss they are searched in
secondary. in case of hit in secondary, entry is added to primary ht and
erase in secondary one.
when primary fills to some level, it becomes new secondary, old secondary is
abandoned, and fresh primary is created.
(more hts can be added for advanced strategies)
it seems design with LRU list is more mainstream, so i'd better stick to it,
it seems.
additionally, if i was able to get random access to hash table keys --
ability to pick one random key -- i could implement soft cache in easy way:
when target size is reached, random entries are removed when new ones are
added to table. while it seems weird, it has some advantages -- it has
nearly zero overhead (as long as getting random elements works fast), and
it's less prone to pathalogic cases than LRU ones. but unfortunately CL
standard doesn't allow random access to keys -- either i'll have to use
implementation-dependant things, or "user-level" HT implementation (that
could be less optimized than implementation one).
also i see a problem that cache HAS to be configured manually, it's not
possible to get really soft behaviour without intimately interacting with GC
:(
In article <·························@news.sunsite.dk>,
"Alex Mizrahi" <········@users.sourceforge.net> wrote:
> so i'm looking for a user-mode library for managing cache. the only
> requirement i have is low overhead -- something comparable with hash-table
> access time. i don't need some optimized strategies and automatic
> size-tuning at time, but that would be nice of course.
> (if there's something implementation-dependant for SBCL that will work too).
Search for DEFRESOURCE. This implements a pool of objects, and probably
allows you to specify how big the pool should be, so that extra objects
are removed from the pool and allowed to be GCed.
--
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***