From: Slobodan Blazeski
Subject: Cluster-by challenge
Date: 
Message-ID: <1193870858.809408.277680@o38g2000hse.googlegroups.com>
Tom Mortel has a solution to NPR puzzle:
Take the names of two U.S. States, mix them all together, then
rearrange the letters to form the names of two other U.S. States. What
states are these?

Especially interesthing is his ClusterBy utulity:
http://blog.moertel.com/articles/2007/09/01/clusterby-a-handy-little-function-for-the-toolbox

import Control.Arrow ((&&&))
import qualified Data.Map as M

clusterBy :: Ord b => (a -> b) -> [a] -> [[a]]
clusterBy f = M.elems . M.map reverse . M.fromListWith (++)
            . map (f &&& return)
Examples:
*Main> clusterBy length (words "the tan ant gets some fat")
[["the","tan","ant","fat"],["gets","some"]]

*Main> clusterBy head (words "the tan ant gets some fat")
[["ant"],["fat"],["gets"],["some"],["the","tan"]]

*Main> clusterBy last (words "the tan ant gets some fat")
[["the","some"],["tan"],["gets"],["ant","fat"]]

My straightforward translation to lisp ugly:
(defun cluster-by (fn lst)
              (let ((clusters (make-hash-table)))
                (dolist (e lst)
                  (setf (gethash (funcall fn e) clusters)
                    (cons e (gethash (funcall fn e) clusters))))
                (loop for value being the hash-values of clusters
collect value)))

CG-USER(22): (cluster-by #'length
                         '("the" "tan" "ant" "gets" "some" "fat"))
(("fat" "ant" "tan" "the") ("some" "gets"))

Any tips for improvement will be appreciated.

cheers
Slobodan

From: Slobodan Blazeski
Subject: Any way to collect all the values of a hash table more concisely ?
Date: 
Message-ID: <1193878294.817088.54550@57g2000hsv.googlegroups.com>
Two improvements through using push with hash table and moving let
into the loop clause,
and using shorter names :) , Haskell names are even shorter than lisps
so don't say I'm cheating.

(defun cluster-by (f l)
  (loop with r = (make-hash-table)
      for e in l do  (push e (gethash (funcall f e) r))
      finally (return (loop for v being the hash-values of r collect
v))))

Is there any way to collect all the values of a hash table more
concisely ?
Or any other improvement, one row less and will be in pair with
Haskell solution

Slobodan
From: Leandro Rios
Subject: Re: Any way to collect all the values of a hash table more concisely ?
Date: 
Message-ID: <472932ef$0$1340$834e42db@reader.greatnowhere.com>
Slobodan Blazeski escribi�:
> Two improvements through using push with hash table and moving let
> into the loop clause,
> and using shorter names :) , Haskell names are even shorter than lisps
> so don't say I'm cheating.
> 
> (defun cluster-by (f l)
>   (loop with r = (make-hash-table)
>       for e in l do  (push e (gethash (funcall f e) r))
>       finally (return (loop for v being the hash-values of r collect
> v))))
> 
> Is there any way to collect all the values of a hash table more
> concisely ?
> Or any other improvement, one row less and will be in pair with
> Haskell solution
> 
> Slobodan
> 

This is my three liner (I love obfuscation contests :)

(defun cluster-by (fn list &aux (clusters (make-hash-table)))
   (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters))) list)
   (loop for value being the hash-values of clusters collecting value))

And one without loop:

(defun cluster-by (fn list &aux (clusters (make-hash-table)) (result nil))
     (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters))) list)
     (maphash  #'(lambda (key value) (push value result)) clusters)
     (nreverse result))

Leandro
From: Barry Margolin
Subject: Re: Any way to collect all the values of a hash table more concisely ?
Date: 
Message-ID: <barmar-0A13F3.11342501112007@comcast.dca.giganews.com>
In article <························@reader.greatnowhere.com>,
 Leandro Rios <··················@gmail.com> wrote:

> Slobodan Blazeski escribi�:
> > Two improvements through using push with hash table and moving let
> > into the loop clause,
> > and using shorter names :) , Haskell names are even shorter than lisps
> > so don't say I'm cheating.
> > 
> > (defun cluster-by (f l)
> >   (loop with r = (make-hash-table)
> >       for e in l do  (push e (gethash (funcall f e) r))
> >       finally (return (loop for v being the hash-values of r collect
> > v))))
> > 
> > Is there any way to collect all the values of a hash table more
> > concisely ?
> > Or any other improvement, one row less and will be in pair with
> > Haskell solution
> > 
> > Slobodan
> > 
> 
> This is my three liner (I love obfuscation contests :)
> 
> (defun cluster-by (fn list &aux (clusters (make-hash-table)))
>    (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters))) list)
>    (loop for value being the hash-values of clusters collecting value))
> 
> And one without loop:
> 
> (defun cluster-by (fn list &aux (clusters (make-hash-table)) (result nil))
>      (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters))) list)
>      (maphash  #'(lambda (key value) (push value result)) clusters)
>      (nreverse result))

Why reverse the result?  There's no particular ordering to MAPHASH, so 
why is the reverse order preferable?

-- 
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
From: Leandro Rios
Subject: Re: Any way to collect all the values of a hash table more concisely ?
Date: 
Message-ID: <472a7924$0$1346$834e42db@reader.greatnowhere.com>
Barry Margolin escribi�:
> In article <························@reader.greatnowhere.com>,
>  Leandro Rios <··················@gmail.com> wrote:
>>
>> (defun cluster-by (fn list &aux (clusters (make-hash-table)) (result nil))
>>      (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters))) list)
>>      (maphash  #'(lambda (key value) (push value result)) clusters)
>>      (nreverse result))
> 
> Why reverse the result?  There's no particular ordering to MAPHASH, so 
> why is the reverse order preferable?
> 

Mmm, too much hurry and not too much thinking. Collecting with push I 
did a nreverse to restore the original order. My bad.

But now I discover that all of the proposed solutions are wrong. 
According to the definition of clusterBy:

"What clusterBy does is group a list of values by their signatures, as 
computed by a given signature function f, and returns the groups in 
/order of ascending signature/."

So this new version works as intended:

(defun cluster-by (fn list &aux (clusters (make-hash-table)) (result nil))
     (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters))) list)
     (maphash  #'(lambda (key value) (push value result)) clusters)
     (sort result #'< :key #'(lambda (x) (length (car x)))))

CL-USER> (cluster-by #'length '("a" "b" "abc" "bc" "a" "abcd" "e" "fg"))
(("e" "a" "b" "a") ("fg" "bc") ("abc") ("abcd"))
CL-USER>

Now, how can this be made more efficient?

Thanks for pointing this out.

Leandro
From: Leandro Rios
Subject: Re: Any way to collect all the values of a hash table more concisely ?
Date: 
Message-ID: <472a8d49$0$1341$834e42db@reader.greatnowhere.com>
Leandro Rios escribi�:
> So this new version works as intended:
> 
> (defun cluster-by (fn list &aux (clusters (make-hash-table)) (result nil))
>     (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters))) list)
>     (maphash  #'(lambda (key value) (push value result)) clusters)
>     (sort result #'< :key #'(lambda (x) (length (car x)))))
> 

Um, sorry:

(defun cluster-by (fn list &aux (clusters (make-hash-table)) (result nil))
     (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters))) list)
     (maphash  #'(lambda (key value) (push value result)) clusters)
     (sort result #'< :key #'(lambda (x) (funcall fn (car x)))))
                                          ^^^^^^^^^^

I better get some sleep.

Leandro
From: Ken Tilton
Subject: Re: Any way to collect all the values of a hash table more concisely ?
Date: 
Message-ID: <_XxWi.272$6I6.63@newsfe08.lga>
Leandro Rios wrote:
> Leandro Rios escribi�:
> 
>> So this new version works as intended:
>>
>> (defun cluster-by (fn list &aux (clusters (make-hash-table)) (result 
>> nil))
>>     (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters))) 
>> list)
>>     (maphash  #'(lambda (key value) (push value result)) clusters)
>>     (sort result #'< :key #'(lambda (x) (length (car x)))))
>>
> 
> Um, sorry:
> 
> (defun cluster-by (fn list &aux (clusters (make-hash-table)) (result nil))
>     (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters))) list)
>     (maphash  #'(lambda (key value) (push value result)) clusters)
>     (sort result #'< :key #'(lambda (x) (funcall fn (car x)))))
>                                          ^^^^^^^^^^
> 
> I better get some sleep.

When you wake up, howse about don't throw away the key when collecting 
the results, throw it away after sorting on it so you do not have to 
recompute it? You might also consider getting fancy and collecting 
directly into a btree.

kt

-- 
http://www.theoryyalgebra.com/

"In the morning, hear the Way;
  in the evening, die content!"
                     -- Confucius
From: Leandro Rios
Subject: Re: Any way to collect all the values of a hash table more concisely ?
Date: 
Message-ID: <472b068a$0$1342$834e42db@reader.greatnowhere.com>
Ken Tilton escribi�:
> 
> 
> Leandro Rios wrote:
>> (defun cluster-by (fn list &aux (clusters (make-hash-table)) (result 
>> nil))
>>     (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters))) 
>> list)
>>     (maphash  #'(lambda (key value) (push value result)) clusters)
>>     (sort result #'< :key #'(lambda (x) (funcall fn (car x)))))
>>                                          ^^^^^^^^^^
>>
>> I better get some sleep.
> 
> When you wake up, howse about don't throw away the key when collecting 
> the results, throw it away after sorting on it so you do not have to 
> recompute it? You might also consider getting fancy and collecting 
> directly into a btree.
> 

Good morning :)

(defun cluster-by2 (fn list &aux (clusters (make-hash-table)) (result nil))
     (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters))) list)
     (maphash  #'(lambda (key value) (push (list key value) result)) 
clusters)
     (mapcar #'second (sort result #'< :key #'car)))

CL-USER> (time (dotimes (i 10000) (cluster-by #'length '("a" "b" "abc" 
"bc" "a" "abcd" "e" "fg"))))
Evaluation took:
   0.094 seconds of real time
   0.076005 seconds of user run time
   0.0 seconds of system run time
   [Run times include 0.008 seconds GC run time.]
   0 calls to %EVAL
   0 page faults and
   13,437,408 bytes consed.
NIL
CL-USER> (time (dotimes (i 10000) (cluster-by2 #'length '("a" "b" "abc" 
"bc" "a" "abcd" "e" "fg"))))
Evaluation took:
   0.096 seconds of real time
   0.084005 seconds of user run time
   0.0 seconds of system run time
   [Run times include 0.012 seconds GC run time.]
   0 calls to %EVAL
   0 page faults and
   15,199,584 bytes consed.
NIL

There is not too much difference in execution time with the original, 
but I'm using the function #'length in the test case, which is not too 
expensive.

Thanks,

Leandro
From: Ken Tilton
Subject: Re: Any way to collect all the values of a hash table more concisely ?
Date: 
Message-ID: <4GFWi.19$BT4.9@newsfe10.lga>
Leandro Rios wrote:
> Ken Tilton escribi�:
> 
>>
>>
>> Leandro Rios wrote:
>>
>>> (defun cluster-by (fn list &aux (clusters (make-hash-table)) (result 
>>> nil))
>>>     (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters))) 
>>> list)
>>>     (maphash  #'(lambda (key value) (push value result)) clusters)
>>>     (sort result #'< :key #'(lambda (x) (funcall fn (car x)))))
>>>                                          ^^^^^^^^^^
>>>
>>> I better get some sleep.
>>
>>
>> When you wake up, howse about don't throw away the key when collecting 
>> the results, throw it away after sorting on it so you do not have to 
>> recompute it? You might also consider getting fancy and collecting 
>> directly into a btree.
>>
> 
> Good morning :)
> 
> (defun cluster-by2 (fn list &aux (clusters (make-hash-table)) (result nil))
>     (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters))) list)
>     (maphash  #'(lambda (key value) (push (list key value) result)) 
> clusters)
>     (mapcar #'second (sort result #'< :key #'car)))
> 
> CL-USER> (time (dotimes (i 10000) (cluster-by #'length '("a" "b" "abc" 
> "bc" "a" "abcd" "e" "fg"))))
> Evaluation took:
>   0.094 seconds of real time
>   0.076005 seconds of user run time
>   0.0 seconds of system run time
>   [Run times include 0.008 seconds GC run time.]
>   0 calls to %EVAL
>   0 page faults and
>   13,437,408 bytes consed.
> NIL
> CL-USER> (time (dotimes (i 10000) (cluster-by2 #'length '("a" "b" "abc" 
> "bc" "a" "abcd" "e" "fg"))))
> Evaluation took:
>   0.096 seconds of real time
>   0.084005 seconds of user run time
>   0.0 seconds of system run time
>   [Run times include 0.012 seconds GC run time.]
>   0 calls to %EVAL
>   0 page faults and
>   15,199,584 bytes consed.
> NIL
> 
> There is not too much difference in execution time with the original, 
> but I'm using the function #'length in the test case, which is not too 
> expensive.

Right. You do more consing and then need a second pass, so the sig func 
has to be more expensive to matter. Why not modify it to do something 
more challenging (sum the char-codes?), find the threshhold. And this is 
why I was thinking collecting directly into a btree would be fun, then 
you just need a final pass to extract the results. I 'spose the 
hashtable would be faster than the btree....

kt

-- 
http://www.theoryyalgebra.com/

"In the morning, hear the Way;
  in the evening, die content!"
                     -- Confucius
From: Leandro Rios
Subject: Re: Any way to collect all the values of a hash table more concisely ?
Date: 
Message-ID: <472b6094$0$1341$834e42db@reader.greatnowhere.com>
Ken Tilton escribi�:
> 

> Right. You do more consing and then need a second pass, so the sig func 
> has to be more expensive to matter. Why not modify it to do something 
> more challenging (sum the char-codes?), find the threshhold. And this is 
> why I was thinking collecting directly into a btree would be fun, then 
> you just need a final pass to extract the results. I 'spose the 
> hashtable would be faster than the btree....
> 

Ok, I did this: I used Graham's bst version from ACL (page 71) modified 
to accomodate a key.

(defstruct node key elt (l nil) (r nil))

(defun bst-insert (key obj bst <)
   (if (null bst)
       (make-node :key key :elt (list obj))
       (let ((lookup (node-key bst))
	    (elt (node-elt bst)))
         (if (eql key lookup)
	    (progn
	      (push obj (node-elt bst))
	      bst)
             (if (funcall < key lookup)
                 (make-node
		 :key lookup
		 :elt elt
		 :l   (bst-insert key obj (node-l bst) <)
		 :r   (node-r bst))
                 (make-node
		 :key lookup
		 :elt elt
		 :r   (bst-insert key obj (node-r bst) <)
		 :l   (node-l bst)))))))


(defun bst-traverse (fn bst)
   (when bst
     (bst-traverse fn (node-l bst))
     (funcall fn bst)
     (bst-traverse fn (node-r bst))))

And rewrote the function to use it:

(defun cluster-by3 (fn list &aux (clusters nil) (result nil))
     (mapcar #'(lambda (x) (setf clusters (bst-insert (funcall fn x) x 
clusters #'>))) list)
     (bst-traverse #'(lambda (n) (push (node-elt n) result)) clusters)
     result)

CL-USER> (cluster-by3 #'length '("a" "b" "abc" "bc" "a" "abcd" "e" "fg"))
(("e" "a" "b" "a") ("fg" "bc") ("abc") ("abcd"))

I'm leaving for the weekend right now. When I get back, I'll do some 
testing to see how it compares with the hash table version.

Leandro
From: namekuseijin
Subject: Re: Any way to collect all the values of a hash table more concisely ?
Date: 
Message-ID: <1194030946.393980.135150@o3g2000hsb.googlegroups.com>
On 2 nov, 15:38, Leandro Rios <··················@gmail.com> wrote:
> Ok, I did this: I used Graham's bst version from ACL (page 71) modified
> to accomodate a key.
>
> (defstruct node key elt (l nil) (r nil))
>
> (defun bst-insert (key obj bst <)
>    (if (null bst)
>        (make-node :key key :elt (list obj))
>        (let ((lookup (node-key bst))
>             (elt (node-elt bst)))
>          (if (eql key lookup)
>             (progn
>               (push obj (node-elt bst))
>               bst)
>              (if (funcall < key lookup)
>                  (make-node
>                  :key lookup
>                  :elt elt
>                  :l   (bst-insert key obj (node-l bst) <)
>                  :r   (node-r bst))
>                  (make-node
>                  :key lookup
>                  :elt elt
>                  :r   (bst-insert key obj (node-r bst) <)
>                  :l   (node-l bst)))))))
>
> (defun bst-traverse (fn bst)
>    (when bst
>      (bst-traverse fn (node-l bst))
>      (funcall fn bst)
>      (bst-traverse fn (node-r bst))))
>
> And rewrote the function to use it:
>
> (defun cluster-by3 (fn list &aux (clusters nil) (result nil))
>      (mapcar #'(lambda (x) (setf clusters (bst-insert (funcall fn x) x
> clusters #'>))) list)
>      (bst-traverse #'(lambda (n) (push (node-elt n) result)) clusters)
>      result)

Holy crap!  the man wanted conciseness and you come up with such a
huge implementation?!  Just because it's abstracted away as a function
call doesn't mean it compares well to the super zipper compact Haskell
solution.  Ok, so I'm not being fair either:  the Haskell one also
uses several functions not-part of the standard...

well, now I don't feel that bad for posting my tail-recursive
implementation which is nowhere as concise as the original request,
but much more than yours anyway.  And it only uses a few predefined
functions from the standard...

(define (group-by f ls gs)
  (if (null? ls) gs
      (let* ((i (car ls)) (cmp (f i)))
        (group-by f
                 (cdr ls)
                 (let group ((gs gs) (acc '()) (g (cons (cons i '())
gs)))
                   (if (null? gs) g
                       (if (eq? cmp (f (caar gs)))
                           (append acc (cons (cons i (car gs)) (cdr
gs)))
                           (group (cdr gs) (cons (car gs) acc)
g))))))))

use it like:
; obvious
(group-by string-length ls '())
; by first character
(group-by (lambda (x) (string-ref x 0)) ls '())

of course, if order is important:
(reverse (map reverse (group-by string-length ls '())))
From: Slobodan Blazeski
Subject: Re: Any way to collect all the values of a hash table more concisely ?
Date: 
Message-ID: <1194033002.371327.279080@z9g2000hsf.googlegroups.com>
On Nov 2, 6:31 am, Ken Tilton <···········@optonline.net> wrote:
> Leandro Rios wrote:
> > Ken Tilton escribi�:
>
> >> Leandro Rios wrote:
>
> >>> (defun cluster-by (fn list &aux (clusters (make-hash-table)) (result
> >>> nil))
> >>>     (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters)))
> >>> list)
> >>>     (maphash  #'(lambda (key value) (push value result)) clusters)
> >>>     (sort result #'< :key #'(lambda (x) (funcall fn (car x)))))
> >>>                                          ^^^^^^^^^^
>
> >>> I better get some sleep.
>
> >> When you wake up, howse about don't throw away the key when collecting
> >> the results, throw it away after sorting on it so you do not have to
> >> recompute it? You might also consider getting fancy and collecting
> >> directly into a btree.
>
> > Good morning :)
>
> > (defun cluster-by2 (fn list &aux (clusters (make-hash-table)) (result nil))
> >     (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters))) list)
> >     (maphash  #'(lambda (key value) (push (list key value) result))
> > clusters)
> >     (mapcar #'second (sort result #'< :key #'car)))
>
> > CL-USER> (time (dotimes (i 10000) (cluster-by #'length '("a" "b" "abc"
> > "bc" "a" "abcd" "e" "fg"))))
> > Evaluation took:
> >   0.094 seconds of real time
> >   0.076005 seconds of user run time
> >   0.0 seconds of system run time
> >   [Run times include 0.008 seconds GC run time.]
> >   0 calls to %EVAL
> >   0 page faults and
> >   13,437,408 bytes consed.
> > NIL
> > CL-USER> (time (dotimes (i 10000) (cluster-by2 #'length '("a" "b" "abc"
> > "bc" "a" "abcd" "e" "fg"))))
> > Evaluation took:
> >   0.096 seconds of real time
> >   0.084005 seconds of user run time
> >   0.0 seconds of system run time
> >   [Run times include 0.012 seconds GC run time.]
> >   0 calls to %EVAL
> >   0 page faults and
> >   15,199,584 bytes consed.
> > NIL
>
> > There is not too much difference in execution time with the original,
> > but I'm using the function #'length in the test case, which is not too
> > expensive.
>
> Right. You do more consing and then need a second pass, so the sig func
> has to be more expensive to matter. Why not modify it to do something
> more challenging (sum the char-codes?), find the threshhold. And this is
> why I was thinking collecting directly into a btree would be fun, then
> you just need a final pass to extract the results. I 'spose the
> hashtable would be faster than the btree....
>
> kt
>
> --http://www.theoryyalgebra.com/
>
> "In the morning, hear the Way;
>   in the evening, die content!"
>                      -- Confucius

I overlooked the ordered constraint, nice catch Leandro. I don't know
Haskell but I suspect that Mortel is storing data in balanced self-
adjusting tree, in a c++ terminology that would be a  STL map which
keeps it's data sorted by key. It's slower than hash-table but it's
ordered.
I would rather call this a library thing, if standard has specified
one or by using a library of such container we would be again with 3
liner.

cheers
Slobodan
From: Thomas A. Russ
Subject: Re: Any way to collect all the values of a hash table more concisely ?
Date: 
Message-ID: <ymimyts38eq.fsf@blackcat.isi.edu>
Leandro Rios <··················@gmail.com> writes:

> (defun cluster-by2 (fn list &aux (clusters (make-hash-table)) (result nil))
>      (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters))) list)

You should replace MAPCAR with MAPC.  You don't care about the return value
from the mapping operation, so there's no point in collecting those
values only to throw them away immediately.

>      (maphash  #'(lambda (key value) (push (list key value) result))
> clusters)
>      (mapcar #'second (sort result #'< :key #'car)))

-- 
Thomas A. Russ,  USC/Information Sciences Institute
From: Leandro Rios
Subject: Re: Any way to collect all the values of a hash table more concisely ?
Date: 
Message-ID: <472fc425$0$1344$834e42db@reader.greatnowhere.com>
Thomas A. Russ escribi�:
> Leandro Rios <··················@gmail.com> writes:
> 
>> (defun cluster-by2 (fn list &aux (clusters (make-hash-table)) (result nil))
>>      (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters))) list)
> 
> You should replace MAPCAR with MAPC.  You don't care about the return value
> from the mapping operation, so there's no point in collecting those
> values only to throw them away immediately.

Thanks,

Leandro
From: Thomas A. Russ
Subject: Re: Any way to collect all the values of a hash table more concisely ?
Date: 
Message-ID: <ymizlxr1apj.fsf@blackcat.isi.edu>
Leandro Rios <··················@gmail.com> writes:

> Leandro Rios escribio:
> > So this new version works as intended:
> 
> (defun cluster-by (fn list &aux (clusters (make-hash-table)) (result nil))
>      (mapcar #'(lambda (x) (push x (gethash (funcall fn x) clusters))) list)
>      (maphash  #'(lambda (key value) (push value result)) clusters)
>      (sort result #'< :key #'(lambda (x) (funcall fn (car x)))))

Not really.  It only works if you assume that the signature function FN
always produces a numeric result.  (That may well be the case, I forget)

If, for example, FN were defined as

   (defun first-letter (string) (char 0 string))

your example would fail.  The only way to get this to be properly robust
would be to also include an option to specify the comparison function as
well.  Perhaps a signature like

  (defun cluster-by (fn list &optional (order-fn #'<) &aux ...)
     ...
     (sort result order-fn :key ...))

would work.

Also, as others have pointed out, there may be some run-time efficiency
benefit to storing the signature value as part of what you accumulate,
although you do pay in readability by doing that.

> I better get some sleep.


-- 
Thomas A. Russ,  USC/Information Sciences Institute
From: Chris Russell
Subject: Re: Any way to collect all the values of a hash table more concisely ?
Date: 
Message-ID: <1193885333.258849.93120@57g2000hsv.googlegroups.com>
On 1 Nov, 00:51, Slobodan Blazeski <·················@gmail.com>
wrote:
> Two improvements through using push with hash table and moving let
> into the loop clause,
> and using shorter names :) , Haskell names are even shorter than lisps
> so don't say I'm cheating.
>
> (defun cluster-by (f l)
>   (loop with r = (make-hash-table)
>       for e in l do  (push e (gethash (funcall f e) r))
>       finally (return (loop for v being the hash-values of r collect
> v))))
>
> Is there any way to collect all the values of a hash table more
> concisely ?
> Or any other improvement, one row less and will be in pair with
> Haskell solution
>
> Slobodan

You can move the initialisation of the variables into optional
arguments, if we're golfing.

CL-USER> (defun cluster-by (f l &optional (r (make-hash-table)))
	   (dolist (e l) (push e (gethash (funcall f e) r)))
	   (loop for v being the hash-values of r collect v))
CLUSTER-BY
CL-USER> (cluster-by #'length
                         '("the" "tan" "ant" "gets" "some" "fat"))
(("fat" "ant" "tan" "the") ("some" "gets"))

or

CL-USER> (defun cluster-by (f l &optional (r (make-hash-table))d)
	   (dolist (e l) (push e (gethash (funcall f e) r)))
	   (maphash (lambda(x y) (push y d))r )d)
; in: LAMBDA NIL
;     (LAMBDA (X Y) (PUSH Y D))
; ==>
;   #'(LAMBDA (X Y) (PUSH Y D))
;
; caught STYLE-WARNING:
;   The variable X is defined but never used.
;
; compilation unit finished
;   caught 1 STYLE-WARNING condition
STYLE-WARNING: redefining CLUSTER-BY in DEFUN
CLUSTER-BY
CL-USER> (cluster-by #'length
                         '("the" "tan" "ant" "gets" "some" "fat"))
(("some" "gets") ("fat" "ant" "tan" "the"))

Arguably, someone might actually want to pass partially computed
values into the function but it still feels like an unlispy hack.
What you want is some kind of higher order function that maps into a
new hash so you can compose it on the fly.
From: Slobodan Blazeski
Subject: Re: Any way to collect all the values of a hash table more concisely ?
Date: 
Message-ID: <1193907949.640694.221430@o38g2000hse.googlegroups.com>
Abuse of optional and aux arguments , how could I forgeth that :) I
thought it was my weird hack but it seems it's in widespread use
within lispers. Anyway we're in pair with Haskell, lisp solutions has
more characters but lisp operators are longer.

thanks Leandro and Chris

Slobodan
From: Thomas A. Russ
Subject: Re: Cluster-by challenge
Date: 
Message-ID: <ymi7il1538f.fsf@blackcat.isi.edu>
Slobodan Blazeski <·················@gmail.com> writes:

> My straightforward translation to lisp ugly:
> (defun cluster-by (fn lst)
>               (let ((clusters (make-hash-table)))
>                 (dolist (e lst)
>                   (setf (gethash (funcall fn e) clusters)
>                     (cons e (gethash (funcall fn e) clusters))))
>                 (loop for value being the hash-values of clusters
>                       collect value)))

One simple improvement would involve using PUSH and the optional
argument to GET-HASH to deal with non-existent values. The latter is not
strictly necessary, since the missing value return is already NIL, but
it makes me feel better to be explicit about it:

(defun cluster-by (fn lst)
  (let ((clusters (make-hash-table)))
    (dolist (e lst)
      (push e (gethash (funcall fn e) clusters nil)))
    (loop for value being the hash-values of clusters
          collect value)))


-- 
Thomas A. Russ,  USC/Information Sciences Institute
From: Madhu
Subject: Re: Cluster-by challenge
Date: 
Message-ID: <m3y7d0hy8z.fsf@robolove.meer.net>
* Slobodan Blazeski in <·····························@o38g2000hse.googlegroups.com> :

| Tom Mortel has a solution to NPR puzzle:
| Take the names of two U.S. States, mix them all together, then
| rearrange the letters to form the names of two other U.S. States. What
| states are these?
[...]
| My straightforward translation to lisp ugly:
| (defun cluster-by (fn lst)
|               (let ((clusters (make-hash-table)))
|                 (dolist (e lst)
|                   (setf (gethash (funcall fn e) clusters)
|                     (cons e (gethash (funcall fn e) clusters))))
|                 (loop for value being the hash-values of clusters
| collect value)))
|
| CG-USER(22): (cluster-by #'length
|                          '("the" "tan" "ant" "gets" "some" "fat"))
| (("fat" "ant" "tan" "the") ("some" "gets"))
|
| Any tips for improvement will be appreciated.

This is functionally identical to the GROUP2 function posted on
comp.lang.lisp by "Chaitanya Gupta" in May this year in the thread
titled:

  Subject: How do I make this utility more flexible without losing speed?

See <·················@registered.motzarella.org> for the thread

--
Madhu
From: Madhu
Subject: googlegroups mangling Message-ID / News URLS  (was: Cluster-by challenge)
Date: 
Message-ID: <m3bq9vig44.fsf_-_@robolove.meer.net>
* I wrote
| * Slobodan Blazeski in <·····························@o38g2000hse.googlegroups.com> :
[...]
|| My straightforward translation to lisp ugly:
|| (defun cluster-by (fn lst)
[...]
|| Any tips for improvement will be appreciated.
|
| This is functionally identical to the GROUP2 function posted on
| comp.lang.lisp by "Chaitanya Gupta" in May this year in the thread
| titled:
|
|   Subject: How do I make this utility more flexible without losing speed?
|
| See <·················@registered.motzarella.org> for the thread

I noticed googlegroups rewrites the Message-ID in the body of articles
and renders it useless, for cutting and pasting into the Message-ID box
in their advanced search to get at the denoted article.

Do googlegroups users have some means to recover the mangled email
addresses/message-IDs ?

Maybe I should use this syntax
( f1pq5c$m8o$1 @ registered.motzarella.org)
to post message-IDs on usenet so they don't get mangled automatically?

FWIW In this case The deja-vu thread I wanted to point out is at
<URL:http://groups.google.com/group/comp.lang.lisp/msg/275438f7cfe34cdf?as_umsgid=f1pq5c$m8o$1%40registered.motzarella.org>
--
Madhu