From: Bruce L. Lambert, Ph.D.
Subject: function name as argument?
Date: 
Message-ID: <4s5um6$2b4g@piglet.cc.uic.edu>
Hi folks,

I have written a document clustering program that can implement three
standard types of clustering: single linkage, group average linkage, and
complete linkage. The clustering methods differ only in the way they
compute nearest neighbors between non-singleton clusters. So, depending
on the cluster method specified in the argument list, I want the cluster
function to use one of three method-specific find-nearest-neighbor
functions. 

The top-level program is called cluster. I want to pass the cluster-method
to cluster as an argument. Right now, I use let to allocate a symbol
called method-specific-find-nn, and depending on the cluster-method, I
setf the symbol-function of method-specific-find-nn to the
symbol-function of the appropriate find-nn function name. Is there a more
elegant way of doing this, perhaps with a macro?

Here's an excerpt:

(defun cluster (sim-matrix
                collection-size
                threshold
                cluster-method)

  "Compute clusters for a set of document vectors."

  (let* ((sizes                  (make-array collection-size))
         (num-active              collection-size)
         (cluster-members        (make-array collection-size))
         (active-clusters        (make-array collection-size))
         (nearest-neighbors      (make-array collection-size))
         (sim-values             (make-array collection-size))
         (id1                     nil)
         (id2                     nil)
         (max-sim                 0)
         (method-specific-find-nn nil))

    (if (equal cluster-method "group-average")
        ;then
        (setf (symbol-function 'method-specific-find-nn)
              (symbol-function 'find-grp-avg-nn))
        ;else
        (setf (symbol-function 'method-specific-find-nn)
              (symbol-function 'find-clink-nn)))

      .......stuff deleted......

 (multiple-value-bind (nn sim)
   (method-specific-find-nn  current-cluster
                             cluster-members
                             num-active
                             active-clusters
                             sim-matrix)

     etc....

Thanks.

-bruce

From: David Duff
Subject: Re: function name as argument?
Date: 
Message-ID: <duff-1507960123270001@news>
In article <···········@piglet.cc.uic.edu>, Bruce L. Lambert, Ph.D.
<········@uic.edu> wrote:

> Hi folks,
> 
> I have written a document clustering program that can implement three
> standard types of clustering: single linkage, group average linkage, and
> complete linkage. The clustering methods differ only in the way they
> compute nearest neighbors between non-singleton clusters. So, depending
> on the cluster method specified in the argument list, I want the cluster
> function to use one of three method-specific find-nearest-neighbor
> functions. 
> 
> The top-level program is called cluster. I want to pass the cluster-method
> to cluster as an argument. Right now, I use let to allocate a symbol
> called method-specific-find-nn, and depending on the cluster-method, I
> setf the symbol-function of method-specific-find-nn to the
> symbol-function of the appropriate find-nn function name. Is there a more
> elegant way of doing this, perhaps with a macro?
> 
> Here's an excerpt:
> 
> (defun cluster (sim-matrix
>                 collection-size
>                 threshold
>                 cluster-method)
> 
>   "Compute clusters for a set of document vectors."
> 
>   (let* ((sizes                  (make-array collection-size))
>          (num-active              collection-size)
>          (cluster-members        (make-array collection-size))
>          (active-clusters        (make-array collection-size))
>          (nearest-neighbors      (make-array collection-size))
>          (sim-values             (make-array collection-size))
>          (id1                     nil)
>          (id2                     nil)
>          (max-sim                 0)
>          (method-specific-find-nn nil))
> 
>     (if (equal cluster-method "group-average")
>         ;then
>         (setf (symbol-function 'method-specific-find-nn)
>               (symbol-function 'find-grp-avg-nn))
>         ;else
>         (setf (symbol-function 'method-specific-find-nn)
>               (symbol-function 'find-clink-nn)))
> 
>       .......stuff deleted......
> 
>  (multiple-value-bind (nn sim)
>    (method-specific-find-nn  current-cluster
>                              cluster-members
>                              num-active
>                              active-clusters
>                              sim-matrix)
> 
>      etc....
> 
bruce,

it appears that you want your program to call one of several different
functions in certain places.  here are a couple of things that may help
you:

first, as an aside, if you want your function to accept a "token", that is
an argument whose sole function is to convey a symbolic value, then it is
conventional in lisp to use keywords (you used strings).  there are
several reasons for this convention - briefly, they use less memory and
are more efficient to compare than strings.  

if you used keywords, then at the places in your code where you want to
invoke the functionality, you could use a form like:

(case cluster-method
   (:group-average  (find-grp-avg-nn current-cluster cluster-members ...))
   (t (find-clink-nn current-cluster cluster-members ...)))

note that because we are using keywords, the values that we're comparing
against are compile-time constants that can be tested with EQL, and thus
we can use the more concise CASE macro instead of writing the explicit
comparison code - where you used (if (equal cluster-method
"group-average") ...).  (CASE compares its keys with EQL, so it would not
match strings.)

note also, that we simply call two different functions in two branches of
our code.  nothing complicated here.  no need to bind the symbol-function
slots of any symbols.  there's nothing that i can see in your statement of
your problem that calls for this.
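
As a minimal sketch of the CASE approach described above: the two
nearest-neighbor functions below are hypothetical stand-ins (the real
find-grp-avg-nn and find-clink-nn are not shown in the thread), and
DISPATCH-FIND-NN is an illustrative name, not part of the original program.

```lisp
;; Hypothetical stubs standing in for the real nearest-neighbor functions.
;; Each returns two values: a "neighbor" and a tag naming the method used.
(defun demo-grp-avg-nn (cluster members)
  (declare (ignore members))
  (values cluster :group-average))

(defun demo-clink-nn (cluster members)
  (declare (ignore members))
  (values cluster :complete-link))

;; Dispatch on a keyword token. CASE compares with EQL, which is
;; reliable for keywords; any other value falls through to the T clause.
(defun dispatch-find-nn (cluster-method cluster members)
  (case cluster-method
    (:group-average (demo-grp-avg-nn cluster members))
    (t (demo-clink-nn cluster members))))
```

Both values returned by the selected function pass through CASE
unchanged, so a caller can still wrap the call in MULTIPLE-VALUE-BIND.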

now, supposing that there were lots of calls to method-specific-find-nn
throughout the body of your cluster function such that it would be
inconvenient and perhaps inefficient to duplicate the conditional code
(either your IF form or the case form above) multiple times...  (i assume
that might be the reason you felt compelled to modify the function-binding
of the symbol as you did.)  in this case, you might want to consider the
fact that functions are normal objects in common lisp and can be passed as
arguments to functions.  for example, instead of passing a string or a
symbol into your cluster function, you could have it accept a function as
an argument and then simply use the common lisp FUNCALL function to invoke
the function.  specifically, 

(defun cluster (... cluster-method)
   (let (...)
      (multiple-value-bind (...)
         (funcall cluster-method ...)
      ...)))

in other words, cluster-method, rather than being a string, as in your
code, will now be an actual function that implements the nearest neighbors
behavior.  the rest of the body of the cluster function is unchanged,
except that wherever you called (method-specific-find-nn ...) before, you
can now write (funcall cluster-method ...) with the same arguments.  once
again, there appears to be no need here to bind the symbol-function cells
of any symbols.  
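
A small runnable sketch of the FUNCALL approach, under stated
assumptions: the stub functions and TOY-CLUSTER are hypothetical and take
a single argument, whereas the real functions take five.

```lisp
;; Hypothetical stubs; each returns a neighbor id and a similarity value.
(defun stub-grp-avg-nn (cluster) (values (1+ cluster) 0.5))
(defun stub-clink-nn   (cluster) (values (1- cluster) 0.25))

;; CLUSTER-METHOD is an ordinary parameter whose value is a function.
;; FUNCALL invokes it; MULTIPLE-VALUE-BIND captures both return values.
(defun toy-cluster (current-cluster cluster-method)
  (multiple-value-bind (nn sim)
      (funcall cluster-method current-cluster)
    (list nn sim)))

;; (toy-cluster 4 #'stub-grp-avg-nn)  ; => (5 0.5)
;; (toy-cluster 4 #'stub-clink-nn)    ; => (3 0.25)
```

FUNCALL also accepts a symbol naming a global function, so
(toy-cluster 4 'stub-clink-nn) behaves the same way.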

dynamically binding the function slots of symbols is not really a
standard "idiom" in common lisp, as it is in, say, the language scheme,
where let can be used just as easily to bind a symbol to a function as to
a value. actually, to be precise, let can be used to bind symbols to
functions in common lisp, as well, but there is a different calling
convention when calling the function which is the current value of a
symbol vs. calling the function which is the current symbol-function of
the symbol -- (funcall foo a b c) vs. (foo a b c). static or "global"
bindings for functions are typically made with defining forms (like defun)
and lexically scoped bindings for functions are made with the flet or
labels special forms, but common lisp does not provide any dynamic binding
form for functions.   
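
The two calling conventions and the lexical FLET binding can be seen side
by side in a short sketch. SHOUT and NAMESPACE-DEMO are hypothetical names
chosen only for this illustration.

```lisp
;; A global function binding, made with DEFUN.
(defun shout (x) (list :global x))

(defun namespace-demo ()
  ;; LET binds the *variable* SHOUT to a function object.
  (let ((shout #'(lambda (x) (list :from-variable x))))
    (list
     ;; (shout 1) looks in the function namespace: the global DEFUN.
     (shout 1)
     ;; (funcall shout 1) uses the variable's value: the lambda.
     (funcall shout 1)
     ;; FLET makes a lexically scoped *function* binding.
     (flet ((shout (x) (list :from-flet x)))
       (shout 1)))))

;; (namespace-demo)
;; => ((:GLOBAL 1) (:FROM-VARIABLE 1) (:FROM-FLET 1))
```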

some consider this to be a "wart" in the language design.  i believe it
came to be this way for historical reasons, because it allows compilers to
generate reasonably efficient code, and because experience has proven that
it works pretty well for most things.

-- 
David Duff
MITRE Corporation AI Technical Center
McLean, Virginia  USA
From: Robert Munyer
Subject: Dynamic scoping and warts (was: function name as argument?)
Date: 
Message-ID: <4tm0hc$aeu@Mars.mcs.com>
In article <·····················@news>, David Duff <····@mitre.org> wrote:

> [...] there is a different calling convention when calling the
> function which is the current value of a symbol vs. calling the
> function which is the current symbol-function of the symbol --
> (funcall foo a b c) vs. (foo a b c).  [...]
>
> some consider this to be a "wart" in the language design.  i believe
> it came to be this way for historical reasons, because it allows
> compilers to generate reasonably efficient code, and because
> experience has proven that it works pretty well for most things.

I wasn't there when the decision was made (was it in the sixties?)
but I think I have a good idea of what the reason might have been.
I suspect that making a symbol's function binding separate from
its value binding was a side effect of dynamic scoping, an attempt
to mitigate the damage that dynamic scoping can do to modularity.

Consider a simple version of MAPCAR and a function that calls it:

(defun mapper (f l)
  (if l (cons (funcall f (first l)) (mapper f (rest l)))))

(defun decrypt (sentence)
  (let ((l '((goldilocks . enemy) (mamabear . destroyer)
             (papabear . carrier) (babybear . submarine))))
    (mapper #'(lambda (w) (or (cdr (assoc w l)) w)) sentence)))

In a lexically scoped Lisp (e.g. Common Lisp or Scheme) this will
work as intended.

Now imagine running this code in a dynamically scoped Lisp, like
Lisp 1.5: it will fail because MAPPER's parameter L will shadow
the variable L used by DECRYPT.  Packages did not exist at the
time, so the only solution I can think of would be to use uninterned
symbols as your parameter names (clumsy and inconvenient).  Or you
could specify when you document the function MAPPER that function
F must not refer to free variables named F or L, because such
references would be intercepted and misdirected.
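
The capture problem described above can be imitated in Common Lisp by
declaring the variable L special, which gives it dynamic scope. This is
only a sketch of the effect, not a reconstruction of Lisp 1.5; the names
LEX-MAPPER, DYN-MAPPER, TAG-LEXICAL and TAG-DYNAMIC are hypothetical, and
a simpler closure than the one in DECRYPT is used so that the wrong
result is easy to see.

```lisp
;; Lexical version, same shape as MAPPER in the post.
(defun lex-mapper (f l)
  (if l (cons (funcall f (first l)) (lex-mapper f (rest l)))))

;; Same function, but the parameter L is bound dynamically.
(defun dyn-mapper (f l)
  (declare (special l))
  (if l (cons (funcall f (first l)) (dyn-mapper f (rest l)))))

;; The closure refers to the caller's variable L.
(defun tag-lexical (items)
  (let ((l 'tag))
    (lex-mapper #'(lambda (x) (list x l)) items)))

;; With L special, the closure sees DYN-MAPPER's binding of L instead.
(defun tag-dynamic (items)
  (let ((l 'tag))
    (declare (special l))
    (dyn-mapper #'(lambda (x) (list x l)) items)))

;; (tag-lexical '(a b))  ; => ((A TAG) (B TAG))
;; (tag-dynamic '(a b))  ; => ((A (A B)) (B (B)))
```

In the dynamic version each call of the closure picks up whatever list
DYN-MAPPER is currently traversing, which is the misdirected reference
the paragraph above warns about.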

Now imagine running the same code in a dynamically scoped Lisp
without separate function bindings (I think maybe XLISP works like
this).  In this case, the parameters F and L will interfere with
the function F's ability to call global functions named F or L.
Interfering with function references is much worse than interfering
with variable references, because in a typical Lisp program most
variable references are local and most function references are
global.  So with dynamic scoping you're better off if the function
and value bindings are separate.

I think this shows that there was a good reason for separate function
bindings, back in the days of dynamic scoping.  These days I can't
think of any good reason except compatibility with old code.  So
I guess I'd agree it's a bit of a wart.

-- Robert
From: Ken Tilton
Subject: Re: function name as argument?
Date: 
Message-ID: <31EBD706.144D@bway.net>
Bruce,

>> Is there a more elegant way of doing this, perhaps with a macro?

Well, I wouldn't say "more elegant", but I think this will be an 
eye-opener for you in terms of how cool Lisp can be.

Here is my version, with some comments/enhancements

Instead of passing a string:

    (cluster aMatrix aSize aThreshold "group-average")

...just pass the function itself (#'xxx is shorthand for (function xxx),
which for a globally defined function gives the same result as
(symbol-function 'xxx)):

    (cluster aMatrix aSize aThreshold #'find-grp-avg-nn)

And for fun I have made the default be find-clink-nn (seems to be your 
intent), so for convenience you can just code:

    (cluster aMatrix aSize aThreshold)

as well as (to the same end, but perhaps more self-documenting):

    (cluster aMatrix aSize aThreshold  #'find-clink-nn)
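
A compact sketch of how such a default might behave, assuming hypothetical
one-argument stubs in place of the real find-nn functions (SKETCH-CLUSTER
and both stubs are illustrative names only):

```lisp
;; Hypothetical stubs that report which method was used.
(defun sketch-clink-nn   (cluster) (values cluster :complete-link))
(defun sketch-grp-avg-nn (cluster) (values cluster :group-average))

;; The &OPTIONAL parameter defaults to the complete-link function.
;; The default form is evaluated at call time, only when the caller
;; omits the argument.
(defun sketch-cluster (current-cluster
                       &optional (cluster-method #'sketch-clink-nn))
  (multiple-value-bind (nn method-name)
      (funcall cluster-method current-cluster)
    (list nn method-name)))

;; (sketch-cluster 1)                        ; => (1 :COMPLETE-LINK)
;; (sketch-cluster 1 #'sketch-clink-nn)      ; => (1 :COMPLETE-LINK)
;; (sketch-cluster 1 #'sketch-grp-avg-nn)    ; => (1 :GROUP-AVERAGE)
```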

Those are minor improvements. Here is the big one, made possible by the  
miracle of funcall/apply (in turn inspired (in John McCarthy) by Alan 
Turing's definition of, well, a Turing Machine):

(defun cluster (sim-matrix
                collection-size
                threshold
                &optional (cluster-method #'find-clink-nn))

  "Compute clusters for a set of document vectors."

  (let* ((sizes                  (make-array collection-size))
         (num-active              collection-size)
         (cluster-members        (make-array collection-size))
         (active-clusters        (make-array collection-size))
         (nearest-neighbors      (make-array collection-size))
         (sim-values             (make-array collection-size))
         (id1                     nil)
         (id2                     nil)
         (max-sim                 0))

     ;; .... symbol-function stuff not needed now .....

     ;; .......stuff deleted......

 (multiple-value-bind (nn sim)
   ;;
   ;; #'funcall is neat. See also: #'apply
   ;;
   (funcall cluster-method  current-cluster
                             cluster-members
                             num-active
                             active-clusters
                             sim-matrix)