From: Ralph Freese
Subject: tail recursion
Date: 
Message-ID: <C3sqC4.DFE@news.Hawaii.Edu>
I wonder if we could revive the discussion of the function q:

> In article <·········@sluggo.ai.mit.edu> ···@ai.mit.edu (Patrick A.
> O'Donnell) writes:
> : In article <················@wisdom.wisdom.attmail.com>, Mike Haynie
> : writes:
> : >In article <··········@cs.uiuc.edu> ·······@cs.uiuc.edu (Conrad
> : >Taylor) writes:
> : >>        I was wondering, does anyone know how to convert the
> : >>   following algorithm to a tail recursive algorithm?  Thanks in advance.
> : >>
> : >>   (define (q m n)
> : >>     (cond ((= m 1) 1)
> : >>       ((= n 1) 1)
> : >>       ((< m n) (q m m))
> : >>       ((= m n) (+ 1 (q m (- m 1))))
> : >>       (#t (+ (q m (- n 1)) (q (- m n) n)))))
> : >
> : >Perhaps I am blind, but I do not think that tail recursion is
> : >possible here. If I am not mistaken, q is the combinations of n
> : >things taken m at a time, sometimes written C(n,m). The #t case
> : >above is the culprit, as it bifurcates.
> :
> : Nonsense.  This function can be made completely tail-recursive.  True,
> : the bifurcation Mike mentions forces one to keep additional state,
> : which in a completely tail-recursive function would have to be kept in
> : the heap.  It's easier to resort to full recursion for that portion of
> : the computation and let the control stack maintain that state.
> :
> It is a standard fact that any function can be made completely
> tail-recursive by converting it to CPS form.  See "Essentials of
> Programming Languages" by Daniel P. Friedman et al., Section 8.4.
>  
> Each function call (f x) is replaced by one with an extra continuation
> parameter, (f x k).  A bifurcating expression such as (+ (f x) (f y))
> is converted to
> (f x (lambda (v) (f y (lambda (w) (k (+ v w))))))
>  
> --
> --Bill Kinnersley
>   ·····@hawk.cs.ukans.edu
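
For concreteness, the transformation Kinnersley describes gives
something like the following for q itself (a sketch in Common Lisp,
with explicit funcalls; not code from the book):

(defun q-cps (m n k)
  ;; Continuation-passing version of q: k receives the result.
  ;; Every recursive call below is a tail call; the pending additions
  ;; live in the continuation closures instead of on the control stack.
  (cond ((= m 1) (funcall k 1))
        ((= n 1) (funcall k 1))
        ((< m n) (q-cps m m k))
        ((= m n) (q-cps m (- m 1)
                        (lambda (v) (funcall k (+ 1 v)))))
        (t (q-cps m (- n 1)
                  (lambda (v)
                    (q-cps (- m n) n
                           (lambda (w) (funcall k (+ v w)))))))))

(q-cps 6 6 #'identity)  ; => 11, the same value as (q 6 6)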

I got the book from our library. It gave a nice explanation of
continuation passing style (CPS), which I knew nothing about.  This
completely answers the original question about converting  q  to tail
recursive form, but the original definition of  q  runs very slowly and
it appears to me that the conversion to the tail recursive form will
still have this problem. (This is not meant as a criticism of either
Kinnersley or Friedman et al.; they never claimed the tail recursive
form would be faster.) For example, q(56,56) makes over 1000000 calls
to itself under the original definition but q(56,56) can be determined
by a little over 1000 values q(r,s).

Of course this problem can be fixed by saving the values of the 
q(r,s)'s in an array. I did this (in Common Lisp), see below. 
One objection to this technique is that the minimum running time of
q(m,n) is now  mn. For this  q  that's ok, but for other functions 
it may not be.

So my question: is there a general technique for converting functions
like  q  into something that runs in polynomial time? I had in mind
something like the conversion of the straightforward definition of
fibonacci, fib(n), to one with two extra parameters,
fib-fast(n,a_0,a_1), given in many texts.  
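
For reference, that standard fibonacci conversion looks like this (the
two extra parameters accumulate the last two values; the initial call
is (fib-fast n 1 1)):

(defun fib-fast (n a0 a1)
  ;; a0 and a1 carry consecutive fibonacci values; each call does
  ;; constant work, so the computation is linear, and every call is a
  ;; tail call.
  (if (= n 0)
      a0
      (fib-fast (- n 1) a1 (+ a0 a1))))

(fib-fast 10 1 1)  ; => 89, the same value as the naive fib(10)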


(defun fast-q (m n)
  (let* ((matrix (make-array (list (+ m 1) (+ n 1)))))
	(labels 
	 ((f (r s)
	     (if (aref matrix r s) 
		 (aref matrix r s)
		 (setf (aref matrix r s) 
		       (cond
			    ((= r 1) 1)
			    ((= s 1) 1)
			    ((< r s) (f r r))
			    ((= r s) (+ 1 (f r (- r 1))))
			    (t (+ (f r (- s 1)) (f (- r s) s))))))))
	 (f m n) )))


----------------------------+-------------------------------------------------
 Ralph Freese               | INTERNET: ·····@math.hawaii.edu
 Department of Mathematics  | BITNET:   ·····@uhunix.bitnet
 University of Hawaii       | 
 2565 The Mall              | Phone:    (808) 956-8595
 Honolulu, Hawaii 96822     | Fax:      (808) 956-4659
----------------------------+-------------------------------------------------





From: Marty Hall
Subject: Re: tail recursion
Date: 
Message-ID: <C3xpnD.7HC@aplcenmp.apl.jhu.edu>
In article <··········@news.Hawaii.Edu> ·····@math.Hawaii.Edu (Ralph Freese) 
writes:

>> : >>   (define (q m n)
>> : >>     (cond ((= m 1) 1)
>> : >>       ((= n 1) 1)
>> : >>       ((< m n) (q m m))
>> : >>       ((= m n) (+ 1 (q m (- m 1))))
>> : >>       (#t (+ (q m (- n 1)) (q (- m n) n)))))

>> [Q: Can I convert this to tail recursion?]
>  [A: Yes, using continuation passing style. See "Essentials of Programming 
>      Languages" by Daniel P. Friedman et al, Section 8.4.]

>I got the book from our library. It gave a nice explanation of
>continuation passing style (CPS), which I knew nothing about.  This
>completely answers the original question about converting  q  to tail
>recursive form, but the original definition of  q  runs very slowly and
>it appears to me that the conversion to the tail recursive form will
>still have this problem. (This is not meant as a criticism of either
>Kinnersley or Friedman et al.; they never claimed the tail recursive
>form would be faster.) For example, q(56,56) makes over 1000000 calls
>to itself under the original definition but q(56,56) can be determined
>by a little over 1000 values q(r,s).
>
>Of course this problem can be fixed by saving the values of the 
>q(r,s)'s in an array. I did this (in Common Lisp), see below. 
>One objection to this technique is that the minimum running time of
>q(m,n) is now  mn. For this  q  that's ok, but for other functions 
>it may not be.
>
>So my question: is there a general technique for converting functions
>like  q  into something that runs in polynomial time? I had in mind
>something like the conversion of the straightforward definition of
>fibonacci, fib(n), to one with two extra parameters,
>fib-fast(n,a_0,a_1), given in many texts.  

Well, one technique is to use automatic memoization. This automatically
converts functions into versions akin to yours that save the
values of the q(r,s)'s in a table. A basic approach to this is described in 
Peter Norvig's _Paradigms of AI Programming_ (Morgan Kaufmann, 1992). I 
have a more extended version; an old posting describing it is attached 
below. It is available from the LISP archives on think.com (see the LISP
FAQ), or from archive.cs.umbc.edu in /pub/Memoization.

				- Marty
(proclaim '(inline skates))

===================== Lengthy Description Follows ====================

This message is to announce the availability of a facility for
automatic memoization in Common LISP. Automatic memoization is a
technique by which an existing function can be transformed into one
that "remembers" previous arguments and their associated results,
yielding large performance gains for certain types of applications.
This code, in beta version, was inspired by the section on automatic
memoization in Peter Norvig's _Paradigms of AI Programming_ (which in
turn was inspired by Abelson and Sussman's _Structure and
Interpretation of Computer Programs_). However, the idea of this
facility is to extend those ideas into what is needed for a practical
facility for use in a large program. As such, it adds many facilities
for bookkeeping, timing, evaluating the timing advantages memoization
provides, saving hash tables to disk for automatic reuse in a later
session, etc. This utility has been used over the last year on the
DARPA Signature Management System, resulting in performance
improvements of 10-20 fold in several key modules.

These utilities, along with an overview of memoization and its
applications, are described below. The code itself is available via
anonymous FTP from archive.cs.umbc.edu, in /pub/Memoization. If you
are interested but do not have FTP access, I can send the code via email.

I am quite interested in suggestions, comments, corrections, and
experiences of anyone who tries out the package.

					Marty Hall
					Artificial Intelligence Lab
					AAI Corporation
					PO Box 126
					Hunt Valley, MD 21030
					····@aplcen.apl.jhu.edu
					(410) 683-6455

======================================================================

WHAT IS AUTOMATIC MEMOIZATION?
==============================

The idea of memoization is that it allows a function to "remember"
previous invocations, returning the previously calculated values
(rather than recalculating) if it is called with exactly the same
arguments as in a previous execution. *Automatic* memoization means
that an existing function can be transformed into a memoized one
without requiring any changes in the code for the function itself.
This can result in tremendous speedups if calculations are repeated at
various points in a program's execution, while remaining transparent to
the users of the code.

APPLICATIONS
============

(A) There are four basic applications of memoization. The first, which is
illustrated below, is when a single routine calls some subroutine (or
itself recursively) more than is needed, resulting in extra calculations.
By memoizing, these results are returned immediately for subsequent calls,
with the effect of dynamic programming. In fact, this first case can be
thought of as a tool for automatic dynamic programming, but without the
need to build the subpieces in the correct order. This can frequently
reduce the time of exponential algorithms to polynomial or even linear
time. Given enough thought, this can be solved without an automatic
memoization facility by either building up the subpieces in the proper
order or maintaining a special-purpose local data structure to retain the
results (i.e., "manual" memoization). The advantage of doing it
automatically is that less debugging and testing is required if the simple
algorithm has already been tested, the versions can be changed back and
forth interactively at run time, it is more transparent, and most
importantly it is simple and easy to use.

To illustrate this first case, consider a naive implementation of a
function to calculate the Nth Fibonacci number:

(defun Fib (N)
  (if (<= N 1)
      1
      (+ (Fib (- N 1)) (Fib (- N 2)))) )

Once (Fib (- N 1)) is calculated, there should be no need to repeat the
calculation of (Fib (- N 2)), since it has already been performed as part of
the calculation of (Fib (- N 1)). These wasted calculations result in 
exponential time to calculate (Fib N), growing as the golden ratio to
the Nth power. On almost any machine, N of 40 or 50 is the maximum tractable
calculation. Calling (Memoize 'Fib) then calling Fib results in linear time,
so that N in the hundreds still only requires fractions of a second on most
machines. Later calculations require only constant time if they calculate
(Fib M) for an M less than or equal to a previously calculated value. 
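
To make this concrete, the core of such a Memoize can be written in a
few lines (a Norvig-style sketch; the package announced here adds many
features beyond this):

(defun memoize (fn-name)
  ;; Replace the global definition of FN-NAME with a wrapper that
  ;; caches results in an EQUAL hash table keyed on the argument list.
  ;; Recursive calls go through the symbol's function cell, so they
  ;; hit the cache too.
  (let ((table (make-hash-table :test #'equal))
        (fn (symbol-function fn-name)))
    (setf (symbol-function fn-name)
          #'(lambda (&rest args)
              (multiple-value-bind (value found) (gethash args table)
                (if found
                    value
                    (setf (gethash args table) (apply fn args))))))))

After (memoize 'fib), the first (Fib N) runs in linear time and
repeated calls for previously seen arguments are constant-time lookups.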

Of course, one doesn't need memoization to get an efficient calculation of
Fibonacci numbers. Simple tail-recursive or iterative implementations will
give the same performance, and there is even a closed form. But there are
many real-life problems where the more efficient algorithm is not so obvious,
and it is far simpler to memoize the obvious algorithm than to determine
a better algorithm. For instance, in the Memoization-Examples file included
in the distribution, a slightly less obvious illustration is given of an
algorithm to calculate divided differences. As further illustrations, Peter
Norvig showed that a memoized version of a simple recursive descent parser
yields the same performance as chart parsing or Earley's algorithm for parsing
context free languages. Paul McNamee has shown that memoizing the simple
brute-force approaches to the 0/1 knapsack problem or the matrix chain
problem gives the same performance as the complex dynamic programming
implementations. Other similar examples abound.

(B) The second case is for invocations of a function that are repeated over
time, but from scattered places in the program, or even when invoked
repeatedly by a user in an interactive program. This generally yields a
speedup by a constant factor, but that factor may be large. Without an
automatic memoization facility, the only alternative is maintaining a
special purpose global data structure, requiring testing and debugging,
and much extra effort for something that at best is equally efficient as
memoization.

(C) The third case is when a function is so expensive that you want to
perform the calculations off-line and save the results for a later
session. The automatic memoization facility provides a simple and
transparent method to save the results and have them associated with the
function automatically in a later session. See Save-Memo-Table in the
following section for an explanation of how to do that.

(D) The final case is when using memoization as a tool in conventional
performance profiling and optimization. Many implementations provide
some sort of a metering system, and this should be used for major test
cases.  However, there is often tremendous overhead involved, with
20-30x slower performance when a routine is fully metered. For quick
test cases, it is often useful to know if speeding up a particular
routine will have much effect on the top-level timing. By using
Memoized-Time or With-Memoization, a user can memoize the routines in
question then run the same test case multiple times. If the identical
test case runs only, say, 5% faster even during the second memoized
run, then this suggests that no amount of memoization in the routines
in question will make more than a 5% difference in the performance of
the test case, and that this is likely not the place to begin the
optimization efforts.

MAIN USER ROUTINES
==================

   Define-Memo-Function: a macro that can be used just like "defun", but
     which has the result of defining a memoized function. Also updates
     the doc string and the results of calling "Arglist" (if available in
     current LISP implementation) on that function name. Any of the keywords
     acceptable to Memoize can optionally be passed on, resulting in 
     specialized versions of memoization that seed their initial hash 
     tables from the disk, use particular hash table tests, etc.

   With-Memoization: a macro that takes a list of function names and any
     number of LISP forms and executes them in a context where the
     functions are temporarily memoized.
       (With-Memoization (Foo Bar Baz)
         (Form-1)
         (Form-2))    results in executing the two forms while functions
     named Foo, Bar, and Baz are memoized. Useful for getting a quick feel
     for the potential speed benefits of memoization.

   Without-Memoization: a macro that executes LISP forms in a context
     where all memoization is temporarily turned off.
     (Without-Memoization
       (Form-1)
       (Form-2))  executes the two forms with no functions memoized.

   Memoize: Takes a function name and changes its function definition to
     be a memoized function. 
       (defun Foo (Args) <Body of Foo>)  followed by 
       (Memoize 'Foo) has the same effect as doing 
       (Define-Memo-Function Foo (Args) <Body of Foo>), but calling
     "Memoize" directly is sometimes more convenient when testing things
     out, as it requires no changes in the preexisting code.

   Save-Memo-Table: Saves to disk a definition of the hash table
     associated with a given memoized function name. By defining a
     memoized function via 
        (Define-Memo-Function Foo (<Args>)
          <Body>)
      running the time-consuming cases off-line, calling
        (Save-Memo-Table '<Function-Name>)
      then using
        (Define-Memo-Function Foo (<Args>)
          (:Hash-Table-Source :Disk)
          <Body>)
     or by calling Memoize with the :Hash-Table-Source set to :Disk,
     you can have a function "remember" the values it calculated, not only
     in the current run but also in the previous off-line run.

   Clear-Memo-Table: Takes a function name and clears out the memo table
     associated with the function. Useful when some internal change makes
     the previously stored values incorrect.

   Unmemoize: Takes a function name and returns it to the unmemoized form.
     Useful for timing and for debugging, especially tracing recursive
     routines. In combination with "Memoize", this lets you switch back
     and forth between memoized and normal versions without changing or
     reloading the code. Similarly, Unmemoize-Functions takes a list
     instead of a single one, and Unmemoize-All-Functions unmemoizes
     everything.

   Rememoize: Takes the name of a function that is currently unmemoized,
     but which had previously been memoized. Memoizes it again, but uses
     the previous hash table instead of creating a new one. Similarly,
     Rememoize-Functions applies to a list.

   Memoized-Function-Call-Count: Given the name of a memoized function,
     tells how many times that function has been called, and which of
     those were simple table lookups that had been stored from a previous
     invocation, vs how many were newly calculated using the original
     function. For a normal memoized function, lets the user see if
     memoization is paying off after a long period of time. For a function
     whose memo table was stored on disk, lets the user see if the stored
     values covered all or most of the cases.

   Memoized-Time: Takes a list of functions and a single form and evaluates
     and times the form 3 times, once without memoization, once with
     memoization and an empty cache, and once with memoization but the
     full cache from the previous run. 

   *Memoized-Function-Names*: a list of the currently memoized functions.

"Memoize", "Memo", and "Clear-Memo-Table" were based on code in Peter
Norvig's outstanding book _Paradigms of AI Programming: Case Studies in
Common LISP_, Morgan Kaufmann, 1992, which in turn was inspired by code in
ex. 3.27 of Abelson, Sussman, and Sussman's _Structure and Interpretation
of Computer Programs_, MIT Press, 1985. Comments and suggestions on the
code were given by Jim Mayfield (University of Maryland Baltimore County),
J. Paul McNamee (AAI Corporation), Peter Norvig (Sun Microsystems), and
From: Mike Haynie
Subject: Re: tail recursion
Date: 
Message-ID: <MBH.93Mar16142957@wisdom.wisdom.attmail.com>
In article <··········@news.Hawaii.Edu> ·····@math.Hawaii.Edu (Ralph Freese) writes:

   I wonder if we could revive the discussion of the function q:
	[...]

   > : >>   (define (q m n)
   > : >>     (cond ((= m 1) 1)
   > : >>       ((= n 1) 1)
   > : >>       ((< m n) (q m m))
   > : >>       ((= m n) (+ 1 (q m (- m 1))))
   > : >>       (#t (+ (q m (- n 1)) (q (- m n) n)))))
   > : >

	[...]
   > by converting it to CPS form.  See "Essentials of Programming Languages"
   > by Daniel P. Friedman et al, Section 8.4.

I learn new things all the time on the net ... ;-)

   I got the book from our library. It gave a nice explanation of
   continuation passing style (CPS), which I knew nothing about.  This
   completely answers the original question about converting  q  to tail
   recursive form, but the original definition of  q  runs very slowly and
   it appears to me that the conversion to the tail recursive form will
   still have this problem. (This is not meant as a criticism of either
   Kinnersley or Friedman et al.; they never claimed the tail recursive
   form would be faster.) For example, q(56,56) makes over 1000000 calls
   to itself under the original definition but q(56,56) can be determined
   by a little over 1000 values q(r,s).

   Of course this problem can be fixed by saving the values of the 
   q(r,s)'s in an array. I did this (in Common Lisp), see below. 
   One objection to this technique is that the minimum running time of
   q(m,n) is now  mn. For this  q  that's ok, but for other functions 
   it may not be.

Actually, I think that the mn cost of computing q(m,n) is necessary
only for the first application; later applications can be O(1). Since
q() has no
other state, each q(i,j) == q(n,m) for i==n, j==m. (This is a *key*
property of q()!!!) Therefore, the table of results that was valid for
the first application will be valid for the second application. If you
were to retain the table of results (in a global variable, for
instance [yea, I know... ;-)]), then subsequent applications pay only
constant cost for the computation for any previously computed (i,j),
==> O(1).
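
A sketch of that suggestion (my code): keep a single EQUAL hash table
in a global variable. Only the q(i,j) actually reached are ever stored,
and the entries survive across top-level calls:

(defvar *q-table* (make-hash-table :test #'equal))

(defun q-cached (m n)
  ;; Look up (m n) first; compute and store the value only on a miss.
  (or (gethash (list m n) *q-table*)
      (setf (gethash (list m n) *q-table*)
            (cond ((= m 1) 1)
                  ((= n 1) 1)
                  ((< m n) (q-cached m m))
                  ((= m n) (+ 1 (q-cached m (- m 1))))
                  (t (+ (q-cached m (- n 1))
                        (q-cached (- m n) n)))))))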

   So my question: is there a general technique for converting functions
   like  q  into something that runs in polynomial time? I had in mind
   something like the conversion of the straightforward definition of
   fibonacci, fib(n), to one with two extra parameters,
   fib-fast(n,a_0,a_1), given in many texts.  

I believe that the conversion for fib relies on the fact that *only*
the 2 previous results are required to compute the next result. With
q(), that relationship is not at all obvious (to me, at least).
Therefore, I come up dry looking for a general technique for converting
q(), and the associated general case.

   (defun fast-q (m n)
     (let* ((matrix (make-array (list (+ m 1) (+ n 1)))))
	   (labels 
	    ((f (r s)
		(if (aref matrix r s) 
		    (aref matrix r s)
		    (setf (aref matrix r s) 
			  (cond
			       ((= r 1) 1)
			       ((= s 1) 1)
			       ((< r s) (f r r))
			       ((= r s) (+ 1 (f r (- r 1))))
			       (t (+ (f r (- s 1)) (f (- r s) s))))))))
	    (f m n) )))

I wonder, how dense is the matrix? If it is sparse enough, another
storage tool might be in order...


Happy hacking.
--

                                ____/|
Michael Haynie                  \ o.O|   ACK!
···@wisdom.attmail.com           =(_)=  THPHTH!
                                   U