From: Jon A. Maxwell
Subject: Functional Languages and Caches
Date: 
Message-ID: <4iq667$nkv@csugrad.cs.vt.edu>
In a functional language supposedly there are no side-effects.

I'm thinking that having a cache of function parameters and
return values could speed up programs many times.  The example in
my algorithms book for Dynamic Programming is to parenthesize
matrix multiplications in order to minimize the number of element
multiplications.  This is done by evaluating subproblems and
using those results to solve larger problems. (It turns out to be
O(n^3), whereas naive recursion takes O(2^n).)

Dynamic programming is so much faster because the subproblems are
solved only once instead of many times.  Wouldn't that happen
automatically if the (recursive) function had a cache associated
with it?

If you were doing factorials, say you evaluated 5! + 4!.  After
calculating 5!, the 4! would already be known and the answer
could be returned without having to calculate it again.

So would this cache be practical, and could it be used in a
language such as lisp (which has side effects)?  I'm just wondering
how they got emacs so fast.

--
     thur  Mail Address: ··········@vt.edu or ········@vt.edu
  n  r     
  a JAMax  "Last week I saw a woman flayed, and you will hardly
  h o   w  believe, how much it altered her person for the
  tan lle  worse."  --Jonathan Swift

From: Marty Hall
Subject: Re: Functional Languages and Caches
Date: 
Message-ID: <DoMGAx.4tI@aplcenmp.apl.jhu.edu>
In article <··········@csugrad.cs.vt.edu> ········@csugrad.cs.vt.edu
(Jon A. Maxwell) writes: 
>In a functional language supposedly there are no side-effects.
>
>I'm thinking that having a cache of function parameters and
>return values could speed up programs many times.  The example in
>my algorithms book for Dynamic Programming is to parenthesize
>matrix multiplications in order to minimize the number of element
>multiplications.  This is done by evaluating subproblems and
>using those results to solve larger problems. (O(n^3) it turns
>out whereas recursion takes O(2^n)?)
>
>Dynamic programming is so much faster because the subproblems are
>solved only once instead of many times.  Wouldn't that happen
>automatically if the (recursive) function had a cache associated
>with it?
[...]
>So would this cache be practical, and could it be used in a
>language such as lisp (which has side effects)?  

The technique of caching previous computations is known as
"memoization" and is widely used. Automatically converting an existing
function into one that caches computations is known as "automatic
memoization". The idea of memoization was first popularized by Donald
Michie at the University of Edinburgh in the late 1960's. Michie and
later David Marsh implemented some libraries in the Pop-2 language and
looked at some usage issues. Michie's idea was more focused on machine
learning than on dynamic programming, with the thought being to have
fuzzy matches against the cache. However, this was done by potentially
examining the whole cache, resulting in O(N) performance, where N is the
number of values already cached. Or one can use hashing and give up
on the idea of fuzzy matches. We are currently looking at ways to
have our cake and eat it too in this regard. See references below.

It was then mostly discussed in the functional programming literature.
Reade and Field & Harrison both discuss it in their texts on Functional
Programming, and there are multiple related papers as well.

In Lisp, Abelson and Sussman suggest the idea of automatic memoization
in Scheme in _Structure and Interpretation of Computer Programs_, and
Peter Norvig gives a much more complete implementation in _Paradigms
of AI Programming_. People also discuss applications to logic
programming (Warren, Dietrich), context-free parsing (Norvig), Dynamic
Programming (Cormen), PDA's (Amtoft), etc.
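The core of the idea is small; here is a minimal sketch in the spirit of
the SICP and PAIP treatments (the name MEMOIZE and its exact behavior
here are illustrative, not the published API of either book):

```lisp
(defun memoize (fn)
  "Return a function equivalent to FN that caches its results,
keyed on the full argument list."
  (let ((cache (make-hash-table :test #'equal)))
    #'(lambda (&rest args)
        ;; GETHASH returns a second value saying whether the key was
        ;; present, so cached NIL results are handled correctly.
        (multiple-value-bind (value foundp) (gethash args cache)
          (if foundp
              value
              (setf (gethash args cache) (apply fn args)))))))
```

Norvig's PAIP version additionally lets you choose which argument to
key on and which hash-table test to use.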

Applying memoization in pure functional languages is nice, but not all
that helpful for the 99.9% of people who use impure languages. Applying
it in Lisp is much easier and more flexible than in most other
languages, but can still be quite tricky. At IJCAI-85, Mostow and
Cohen presented a very interesting paper on some of the problems they
had trying to figure out when it was safe to use. I have a couple of
papers discussing several of these problems with potential solutions,
suggesting additional applications, and summarizing our experiences
using such a package we developed in Common Lisp. One of these papers
is available at
http://www.apl.jhu.edu/~hall/test/Papers/Monterrey-Memoization.ps. I 
will try to get the others online soon and linked to my Lisp WWW page 
(http://www.apl.jhu.edu/~hall/lisp.html). You can obtain an early
version of our library at http://www.apl.jhu.edu/~hall/lisp/Memoization/
as well as from the Lisp archives at CMU. I will try to get an updated
version and links from my Lisp WWW page relatively soon.

We are currently working on a version in C/C++, and are considering a
Java version sometime later.

> I just wondering how they got emacs so fast.

To my knowledge, emacs does not use memoization.

>If you were doing factorials, say you evaluated 5! + 4!.  After
>calculating 5!, the 4! would already be known and the answer
>could be returned without having to calculate it again.

The problem is that the cache-retrieval time is likely to be
significantly larger than the time to calculate a small factorial from
scratch. By using hashing to store the cache, you can make the access
time O(1) with respect to the number of stored values (O(N) with
respect to the size of the input argument unless you use
object-identity instead of object-equality as the test, which has its
own problems). But this time is unlikely to be faster than that
required to do a simple factorial. For instance, in Common Lisp on my
Sparc10, a computation has to take at least 1/1000th of a second for
memoization to be potentially useful. Note also that if you want the
super-dramatic speedups you referred to above, you need to apply
memoization to a divide and conquer problem with overlapping
subproblems, which is not the case with factorials.
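The contrast is easy to see with Fibonacci, which does have overlapping
subproblems. A minimal memoized version (illustrative code, not our
library) simply closes the definition over its own hash table:

```lisp
;; Naive FIB recomputes fib(k) for the same k exponentially many
;; times -- O(2^n) calls.  With a cache, each subproblem costs one
;; lookup after its first computation, for O(n) overall.
(let ((cache (make-hash-table)))
  (defun fib (n)
    (or (gethash n cache)
        (setf (gethash n cache)
              (if (< n 2)
                  n
                  (+ (fib (- n 1)) (fib (- n 2))))))))
```

With this definition (fib 100) returns essentially instantly, while
the naive recursion would never finish. Memoizing factorial buys
nothing comparable, because each subproblem occurs only once.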

Memoization can still be extremely useful on other problems, just
not on such tiny ones. It is usually considered to be a
space-for-time tradeoff, but if the function you are memoizing
generates a lot of temporary memory (garbage) to do its calculations,
memoizing it can save this. In our empirical tests on a large decision
aid in Lisp, we were surprised to find huge reductions in garbage
after our programmers used the memoization library (in addition to
very large speedups).

Cheers-
						- Marty
(proclaim '(inline skates))
From: Richard A. O'Keefe
Subject: Re: Functional Languages and Caches
Date: 
Message-ID: <4itae7$38b@goanna.cs.rmit.EDU.AU>
········@csugrad.cs.vt.edu (Jon A. Maxwell) writes:

>In a functional language supposedly there are no side-effects.

>I'm thinking that having a cache of function parameters and
>return values could speed up programs many times.

Congratulations.  You have just reinvented "memo functions",
first described as far as I know in the early '60s (I think
Bob Doran was one of the inventors) and first implemented
(AFAIK) for Pop-2.

The idea has also been applied to logic programming and is
built into XSB-Prolog.
-- 
The election is over, and Australia lost; the idjits elected _politicians_!
Richard A. O'Keefe; http://www.cs.rmit.edu.au/~ok; RMIT Comp.Sci.
From: Jon A. Maxwell
Subject: Re: Functional Languages and Caches
Date: 
Message-ID: <4itn8u$9k9@csugrad.cs.vt.edu>
Richard A. O'Keefe wrote:
] >I'm thinking that having a cache of function parameters and
] >return values could speed up programs many times.
]
] Congratulations.  You have just reinvented "memo functions",
] first described as far as I know in the early '60s (I think Bob
] Doran was one of the inventors) and first implemented (AFAIK)
] for Pop-2.

If only I had been born 40 years ago...

Thanks for the reply, and also thanks to Marty Hall (for writing
a really long account of memoization).  Kinda disappointing when
I reinvent the wheel, but good that we have it!

(the other day I reinvented the Sutherland-Hodgman polygon
 clipping algorithm ;(

--
     thur  Mail Address: ··········@vt.edu or ········@vt.edu
  n  r     
  a JAMax  "A good style should show no sign of effort.  What
  h o   w  is written should seem a happy accident."
  tan lle             --Somerset Maugham
From: Martin Cracauer
Subject: Re: Functional Languages and Caches
Date: 
Message-ID: <1996Mar22.205645.12160@wavehh.hanse.de>
········@csugrad.cs.vt.edu (Jon A. Maxwell) writes:

>In a functional language supposedly there are no side-effects.

>I'm thinking that having a cache of function parameters and
>return values could speed up programs many times.  

I do not think that caching of function results violates functional
style. Although the function has state, the responses are not affected
by it.

Either Graham's _On Lisp_ or Norvig's book has an example of a function
(or macro) that automatically generates caching functions for Common
Lisp. I don't remember which book, sorry.

[...]

>If you were doing factorials, say you evaluated 5! + 4!.  After
>calculating 5!, the 4! would already be known and the answer
>could be returned without having to calculate it again.

This case wouldn't be covered by a simple cache that stores only the
latest result. 

[...]

>So would this cache be practical, and could it be used in a
>language such as lisp (which has side effects)?  I just wondering

See above; I remember the book's solution being very elegant, a real
reason to use Lisp. I'm sure SML could have such a
function-modification function, too. I'm not sure about C++ :-)

>how they got emacs so fast.

?? I think a good Common Lisp compiler is much faster when modifying
byte arrays, although I didn't benchmark.

Happy Hacking
	Martin
-- 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Martin Cracauer <········@wavehh.hanse.de>  -  BSD User Group Hamburg
BSD, Lisp and other programming info http://www.bik-gmbh.de/~cracauer
From: Marty Hall
Subject: Re: Functional Languages and Caches
Date: 
Message-ID: <DottEx.287@aplcenmp.apl.jhu.edu>
In article <······················@wavehh.hanse.de>
········@wavehh.hanse.de (Martin Cracauer) writes: 
[... Memoization in Common Lisp ...]
> I'm sure SML could have such a
>function-modification function, too. I'm not sure about C++ :-)

Surprisingly, it is not so easy to do this in SML at runtime. Creating
a function that returns a caching version of an input function can be
tricky since you have to know a lot about types. However, the real
problem is that SML uses static binding to associate function objects
with function (variable) names. So even if you could make a memoizing
version of your function and assigned it back to the original function
(variable) name, previous references would get the old function, not
the new one.
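For contrast, the Lisp version of the trick works precisely because a
call through a function name is resolved via the symbol at run time.
A sketch, with illustrative names:

```lisp
;; Illustrative stand-in for some expensive true function.
(defun slow-f (x)
  (expt x x))

;; Generic memoizer: returns a caching version of FN.
(defun memoize (fn)
  (let ((cache (make-hash-table :test #'equal)))
    #'(lambda (&rest args)
        (multiple-value-bind (value foundp) (gethash args cache)
          (if foundp
              value
              (setf (gethash args cache) (apply fn args)))))))

;; Swap the caching version in under the same name.  Existing callers
;; of SLOW-F pick it up without recompilation -- exactly what SML's
;; static binding of names rules out.
(setf (symbol-function 'slow-f) (memoize #'slow-f))
```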

Paul McNamee (············@jhuapl.edu) and I are working on a version
in C++ that uses a preprocessor to do the conversion at (before)
compile time instead of at runtime. This limits some of the useful
things you can do in Common Lisp versions of memoization, but still
seems quite useful.

An early paper on our Common Lisp memoization package and a link to
the source code is available from my Lisp Tutorial page at
http://www.apl.jhu.edu/~hall/lisp.html.

						- Marty
(proclaim '(inline skates))
From: Vassili Bykov
Subject: Re: Functional Languages and Caches
Date: 
Message-ID: <4ja20f$ki3@tandem.CAM.ORG>
········@wavehh.hanse.de (Martin Cracauer) wrote:
>········@csugrad.cs.vt.edu (Jon A. Maxwell) writes:
>
>>In a functional language supposedly there are no side-effects.
>
>>I'm thinking that having a cache of function parameters and
>>return values could speed up programs many times.  
>
>I do not think that caching of function results violates functional
>style. Although the function has state, the responces are not affected
>by it.

I think Jon's point was different.  It was not that caching can
violate functional style.  The problem is that if a function does have
state, we cannot cache the value it returns using only the function's
argument tuple as a cache lookup key.

Technically, such caching is trivial, and a Scheme version is given in
SICP, I believe.  Of course, it is the same old time/space tradeoff.

--Vassili
From: Marty Hall
Subject: Re: Functional Languages and Caches
Date: 
Message-ID: <DoxILM.zp@aplcenmp.apl.jhu.edu>
In article <··········@tandem.CAM.ORG> Vassili Bykov <······@cam.org> writes:

>I think Jon's point was different.  It was not that caching can
>violate functional style.  The problem is that if a function does have
>state, we cannot cache the value it returns using only the function's
>argument tuple as a cache lookup key.

The problem is even worse than this. Suppose that the procedure really
is a true function (output completely and deterministically determined
by input) with no side effects. However, suppose that someone calling
the routine does destructive operations on the returned
value. Memoizing the function may now break something that worked
before. For instance, imagine that Calculate-Values is a true function
that returns a list of numbers given some inputs. Imagine that
Normalize-Values uses DELETE to throw away the highest and lowest
values.

Before memoization (Normalize-Values (Calculate-Values <Args>))
might have been perfectly safe if Calculate-Values consed a fresh list
each time. But after memoizing Calculate-Values each call would get a
shorter list. 
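The scenario is easy to sketch; Calculate-Values and Normalize-Values
here are the hypothetical functions from the example, not real code:

```lisp
;; A true function: same output for same inputs, no side effects,
;; and it conses a fresh list on every call.
(defun calculate-values (&rest args)
  (declare (ignore args))
  (list 1 5 9 5 1))

;; Throws away the extreme values -- destructively, via DELETE.
(defun normalize-values (vals)
  (delete 9 (delete 1 vals)))

;; Unmemoized, (normalize-values (calculate-values ...)) is safe:
;; each call destroys only its own fresh list.  Once CALCULATE-VALUES
;; is memoized, every call with the same arguments returns the SAME
;; list object, so one caller's DELETE shortens the list that every
;; later caller receives from the cache.
```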

>Technically, such caching is trivial, and a Scheme version is given in
>SICP, I believe.

SICP gives a basic sketch, but with no particular implementation of
the cache. The obvious way to do the cache is via hashing, and
Norvig's PAIP gives such an implementation in Common Lisp.

A much more complete library to do this in Common Lisp can be found at 
<http://www.apl.jhu.edu/~hall/lisp/Memoization>. This includes a bunch
of routines for evaluating the benefits of memoization, saving tables
to disk to be reloaded in later sessions, etc. The top portion of
<http://www.apl.jhu.edu/~hall/lisp/Memoization.lisp> describes the
routines briefly. Slightly more detail is given in
<http://www.apl.jhu.edu/~hall/lisp/Memoization/Users-Guide-Draft.ps>
Links to this source code and to a paper on the subject can all be
found on my Lisp WWW page at <http://www.apl.jhu.edu/~hall/lisp.html>

Although the basic implementation is indeed straightforward, I think
that saying the implementation is "trivial" is a bit of an
overstatement. For instance, doing inexact matches against the cache
(as Michie originally proposed) without losing the O(1) behavior of
hashing is a bit of a challenge. Furthermore, limiting the size of the
caches in a useful way (e.g. based on frequency of use) without
significant performance penalties is also non-trivial. Also, how does
one notice when something is being memoized that should not be (side
effects, destructive operations)? Or how do you tell if a persistent
cache is now obsolete due to changes in subfunctions called by the
memoized routines?

None of this is to say that automatic memoization is not useful. On
the contrary, when we applied it widely on a large DARPA program we
were surprised at how widely applicable it was, how many new
applications we discovered for it, and what a large payoff we got.
But although the payoffs were quite large, you need to be aware of the
pitfalls to use it effectively.

>Of course, it is the same old time/space tradeoff.

This is certainly frequently true. But memoizing a routine that
generates a bunch of temporary storage to perform its calculations has
the beneficial effect of reducing this consing in later calls if you
get cache hits. In the DARPA program mentioned above, we provided the
automatic memoization library to a bunch of developers from 3 or 4
different companies who were working on the system. Later, we went
back and evaluated the benefits memoization had provided. The time
improvement was large (over 100 fold in the total system), but what
surprised us was that memoization reduced consing by more than 1000
fold. Of course this is only in our particular application which
needed to allocate a lot of temporary storage. And of course the same
speedups could have been obtained by other approaches. But the results
were interesting nonetheless. If a technique is convenient, lots of
people will try it. If not, it will be used a lot less frequently.

Cheers-
						- Marty
(proclaim '(inline skates))