From: Stefan Monnier
Subject: Re: ADVICE PLEASE: how to build a method cache?
Date: 
Message-ID: <2gv5ip$ae0@disun41.epfl.ch>
In article <·····················@usage.csd.unsw.oz.au>,
Tim Menzies <····@cse.unsw.edu.au> wrote:
> i want to build a method cache for an object system i am designing and
> i need some advice.
> 
> the Smalltalk-80 books tell me that that 32% of an OO system's runtime
> behaviour is method look-up. with a method cache, we can reduce that to
> 5%. they recommend a cache of 512-1024 items.

I think the most important part of the cache is that it has to be
FAST. So I would just have a simple hash-table:

say you have object X of has value h:

if cache[h] = X then 
	HIT
else
	MISS: cache[h] := X ...
endif

This is a direct-mapped cache and has hence the following
caracteristics:
- fast
- not optimal due to competition (two objects competing for the
same table entry)

If you want better, you'll have to make that more complex. And given
the simplicity of the present scheme, more complex is likely to be
twice as slow. Maybe doubling the cache size is easier.


	Stefan
-- 

-----------------------------------------------------
-- On the average, people seem to be acting normal --
-----------------------------------------------------

From: Lou Steinberg
Subject: Re: ADVICE PLEASE: how to build a method cache?
Date: 
Message-ID: <LOU.94Jan12101144@atanasoff.rutgers.edu>
In article <··········@disun41.epfl.ch> ·······@di.epfl.ch (Stefan Monnier) writes:

   In article <·····················@usage.csd.unsw.oz.au>,
   Tim Menzies <····@cse.unsw.edu.au> wrote:
   > i want to build a method cache for an object system i am designing and
   > i need some advice.

   [...]

   This is a direct-mapped cache 

	   Stefan

I was about to suggest the same, so let me throw out another alternative:

It seems you want a hash atblee with at most N elements - if there
are N already and you want to add another, the least-recently-accessed
one should be removed.  The trick is to find the least recently
accessed.  So in addition to the hash table, keep a linked list of
cached items, and when you retrieve an item add it to the head of the
list.  IF it was in the hash table, and hence in the list already,
MOVE it to the head of the list (i.e. remove it from where it was in
the linked list).

In order to make this move operation cheap, in the hash table entry
for item A keep a pointer to the cons cell in the list BEFORE the cons
cell whose car is item A.

e.g. for the following list, the entry for item A has a pointer to
here _____
          |
          V
 ... -->  [ + | +-]--> [ + | +-]--> [ + | +-]--> NIL
            |            |            |
            V            V            V
            item R       item A       item G


--
					Lou Steinberg

uucp:   {pretty much any major site}!rutgers!aramis.rutgers.edu!lou 
internet:   ···@cs.rutgers.edu
From: Stefan Monnier
Subject: Re: ADVICE PLEASE: how to build a method cache?
Date: 
Message-ID: <2h1qam$gjv@disun41.epfl.ch>
In article <·················@atanasoff.rutgers.edu>,
Lou Steinberg <···@cs.rutgers.edu> wrote:
> It seems you want a hash atblee with at most N elements - if there
> are N already and you want to add another, the least-recently-accessed
> one should be removed.  The trick is to find the least recently
> accessed.  So in addition to the hash table, keep a linked list of

Well, the problem seems to be that the hash-value can't be freely
chosen, so the place in the hash-table is already chosen for you.
If you want a more fancy scheme, you can:

- go to set associative cache. The problem here is that, even though
the multiple checks can be done in parallel when the cache is in
hardware, it has to be done sequentially in software: the speed of
your cache goes down a lot.

- add a fancier second level cache: when you have to drop a cache
line, keep it in a second level cache (that can be an association list
(with move-to-front semantics) or (better) a splay-tree or another
hash-table (set associateive) or anything else you like)

The main point is that you should keep your first level cache as fast
as possible. Some Smalltalk even add a "zero-level" cache inline.
A method call is then:

IF standard-situation THEN
	call known-method
ELSE
	check in the method-cache
ENDIF

That is, they hardcode the most frequent path in the code (or they
even modify code on the fly so that the "standard-situation" is just
the "last-situation" (that's a one-entry inline cache))


	Stefan
-- 

-----------------------------------------------------
-- On the average, people seem to be acting normal --
-----------------------------------------------------
From: Harley Davis
Subject: Re: ADVICE PLEASE: how to build a method cache?
Date: 
Message-ID: <DAVIS.94Jan13105955@passy.ilog.fr>
In article <·················@atanasoff.rutgers.edu> ···@cs.rutgers.edu (Lou Steinberg) writes:

   It seems you want a hash atblee with at most N elements - if there
   are N already and you want to add another, the least-recently-accessed
   one should be removed.  The trick is to find the least recently
   accessed.  So in addition to the hash table, keep a linked list of
   cached items, and when you retrieve an item add it to the head of the
   list.  IF it was in the hash table, and hence in the list already,
   MOVE it to the head of the list (i.e. remove it from where it was in
   the linked list).

   In order to make this move operation cheap, in the hash table entry
   for item A keep a pointer to the cons cell in the list BEFORE the cons
   cell whose car is item A.

The maintenance of the LRU list can overwhelm the cache lookup; one
extra memory write can easily add 50% to a good hash table scheme, and
if you allocate a cons cell the game is lost.  In my experience, the
best thing to do is to support several hard-coded hash table sizes
(important!) and either estimate the appropriate hash table size based
on the number of potential entries (or dynamically on overflow) or
provide a means for the user to declare the cache size for a given
generic function (method name for SmallTalk).  If the biggest cache
size overflows, in general it is better to completely clear the cache
than to try to eliminate individual entries.

This last tidbit is empirical advice which I found difficult to
believe when I tried it, but it seems true nevertheless.  To
understand the phenomenon would need sophisticated locality studies
which I don't have time for, but I think that method calling comes in
waves for particular sets of classes and so it's not very useful to
just eliminate one entry.

These problems are worse in systems which use a few methods to
centralize communication between several objects, such as MVC or
distributed systems.  Other mechanisms should be looked at when
designing such systems.

-- Harley Davis
--

------------------------------------------------------------------------------
nom: Harley Davis			ILOG S.A.
net: ·····@ilog.fr			2 Avenue Gallie'ni, BP 85
tel: (33 1) 46 63 66 66			94253 Gentilly Cedex, France
From: Matt Kennel
Subject: Re: ADVICE PLEASE: how to build a method cache?
Date: 
Message-ID: <2h1ocmINN6du@network.ucsd.edu>
Stefan Monnier (·······@di.epfl.ch) wrote:
: In article <·····················@usage.csd.unsw.oz.au>,
: Tim Menzies <····@cse.unsw.edu.au> wrote:
: > i want to build a method cache for an object system i am designing and
: > i need some advice.
: > 
: > the Smalltalk-80 books tell me that that 32% of an OO system's runtime
: > behaviour is method look-up. with a method cache, we can reduce that to
: > 5%. they recommend a cache of 512-1024 items.

: I think the most important part of the cache is that it has to be
: FAST. So I would just have a simple hash-table:

Ask some of the Sather designers about this.  I think they have some very
simple very fast optimization for the common cases.

: 	Stefan

--
-Matt Kennel  		···@inls1.ucsd.edu
-Institute for Nonlinear Science, University of California, San Diego
-*** AD: Archive for nonlinear dynamics papers & programs: FTP to
-***     lyapunov.ucsd.edu, username "anonymous".
From: Ken Anderson
Subject: Re: ADVICE PLEASE: how to build a method cache?
Date: 
Message-ID: <KANDERSO.94Jan14112530@wheaton.bbn.com>
We could go on and discuss a lot of hashing schemes.  However, it would
help to know more about the design of your language.  For example, say you
just want single dispatch generic functions.  It is possible to convert
from an operation name to a table index at link time.  This index could be used
to lookup the proper method in a class's method dispatch table.  This takes
a few SVREF's.  There are no hash collisions because indicies are
maintained globally.  The dispatch table size is about 100 items for a 1000
class (3000 method) example i looked at.

Also, there are many optimizations possible for single method gf's and
reader and writer methods.  I recommend reading the pcl code ftpable from
parcftp.xerox.com:pub/pcl.


--
Ken Anderson 
Internet: ·········@bbn.com
BBN STC              Work Phone: 617-873-3160
10 Moulton St.       Home Phone: 617-643-0157
Mail Stop 6/4c              FAX: 617-873-3776
Cambridge MA 02138
USA