top-n/bottom-n?

From: Foxpointe
Subject: top-n/bottom-n?
Date: Sun, 27 Aug 2006 00:42:53 +0000
Message-ID: <_pCdnRyk44YRd23ZnZ2dnUVZ_qudnZ2d@comcast.com>

I'm sure this has been done many times over so thought I'd ask for 
pointers on existing implementations as a starting point:  Given a list 
(or array), I'd like to get a list (or array) of the top or bottom N 
elements which would be  returned either as the elements themselves or 
as indexes to the elements in the list (or array) passed.

Re: top-n/bottom-n? ·······@gmail.com
- Re: top-n/bottom-n? Pascal Bourguignon
Re: top-n/bottom-n? Sam Steingold
Re: top-n/bottom-n? Fabien LE LEZ
Re: top-n/bottom-n? Wade Humeniuk

From: ·······@gmail.com
Subject: Re: top-n/bottom-n?
Date: Sun, 27 Aug 2006 03:42:32 +0000
Message-ID: <1156650152.379760.303780@h48g2000cwc.googlegroups.com>

The simplest way I can think of:

(defun top-n (seq n)
  (subseq seq 0 n))

(defun bot-n (seq n)
  (subseq seq (- (length x) n)))

Foxpointe wrote:
> I'm sure this has been done many times over so thought I'd ask for
> pointers on existing implementations as a starting point:  Given a list
> (or array), I'd like to get a list (or array) of the top or bottom N
> elements which would be  returned either as the elements themselves or
> as indexes to the elements in the list (or array) passed.

From: Pascal Bourguignon
Subject: Re: top-n/bottom-n?
Date: Sun, 27 Aug 2006 05:36:59 +0000
Message-ID: <87r6z2vix0.fsf@thalassa.informatimago.com>

·······@gmail.com writes:
> Foxpointe wrote:
>> I'm sure this has been done many times over so thought I'd ask for
>> pointers on existing implementations as a starting point:  Given a list
>> (or array), I'd like to get a list (or array) of the top or bottom N
>> elements which would be  returned either as the elements themselves or
>> as indexes to the elements in the list (or array) passed.
>
> The simplest way I can think of:
>
> (defun top-n (seq n)
>   (subseq seq 0 n))
>
> (defun bot-n (seq n)
>   (subseq seq (- (length x) n)))

Rather:

(defun most-n (seq n lessp &key (key (function identity)))
    (subseq (sort seq lessp :key key) 0 n))

(defun top-n-reals (seq-of-reals n)
   (most-n seq-of-reals (function >=)))

(defun bot-n-reals (seq-of-reals n)
   (most-n seq-of-reals (function <=)))

Now, assuming big sequences of length s, and sizeable n,  sort is
O(s*log(s)) and subseq O(n), giving a total of O(s*log(s)+n) but it's
possible to do it in better than O(s*n), which is interesting when
n<log(s).  If you expect large n (assuming gigantic s), you can do it
in O(s*log(n)) using a tree to keep the found "mosts" instead of a
vector.

(defun most-n (seq n lessp &key (key (function identity)))
  (let ((most '())
        (most-length 0))
    (flet ((insert (item index previous)      ; O(n)
             "Inserts in order the (item . index) in the list previous"
             (let ((head previous))
               (loop
                  :while (and (cdr previous)
                              (funcall lessp
                                       (funcall key item)
                                       (funcall key (caadr previous))))
                  :do (pop previous)
                  :finally (setf (cdr previous)
                                 (acons item index (cdr previous))))
               (cdr head))))
      (map nil                                ; O(s)
           (let ((index -1))
             (lambda (item)
               (incf index)
               ;; (print most)
               (cond
                 ((< most-length n)
                  (setf most (insert item index (cons nil most)))
                  (incf most-length))
                 ((funcall lessp (funcall key item) (funcall key (caar most)))
                  (setf most (insert item index most))))))
           seq))
    most))


(mapcar (lambda (lessp)
          (most-n '(L algorithme en question a ete publie en \1960
                    dans l IBM Journal article intitule Toward
                    Mechanical Mathematics avec des variantes et une
                    extension au calcul des predicats Il s agit ici du
                    premier programme de Wang systeme P)
                  6 lessp))
        '(string<= string>=))
-->
(((AU . 24) (ARTICLE . 13) (ALGORITHME . 1) (AGIT . 30) (A . 4) (|1960| . 8))
 ((S . 29) (SYSTEME . 37) (TOWARD . 15) (UNE . 22) (VARIANTES . 20) (WANG . 36)))


-- 
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
----------> http://www.netmeister.org/news/learn2quote.html <-----------
---> http://homepage.ntlworld.com/g.mccaughan/g/remarks/uquote.html <---

__Pascal Bourguignon__                     http://www.informatimago.com/

From: Sam Steingold
Subject: Re: top-n/bottom-n?
Date: Sun, 27 Aug 2006 02:04:19 +0000
Message-ID: <m3u03zvsrg.fsf@loiso.podval.org>

> * Foxpointe <·········@pbzpnfg.arg> [2006-08-26 20:42:53 -0400]:
>
> I'm sure this has been done many times over so thought I'd ask for
> pointers on existing implementations as a starting point:  Given a list
> (or array), I'd like to get a list (or array) of the top or bottom N
> elements which would be  returned either as the elements themselves or
> as indexes to the elements in the list (or array) passed.

CLOCC/CLLIB/sorted.lisp does this and much more.
see TOP-BOTTOM and TOP-BOTTOM-UI

-- 
Sam Steingold (http://www.podval.org/~sds) on Fedora Core release 5 (Bordeaux)
http://pmw.org.il http://iris.org.il http://israelnorthblog.livejournal.com
http://memri.org http://mideasttruth.com http://israelunderattack.slide.com
I don't have an attitude problem. You have a perception problem.

From: Fabien LE LEZ
Subject: Re: top-n/bottom-n?
Date: Sun, 27 Aug 2006 01:40:57 +0000
Message-ID: <lpt1f291lgfeibrdhtiascqbrnor6mk858@4ax.com>

On Sat, 26 Aug 2006 20:42:53 -0400, Foxpointe <·········@comcast.net>:

>I'd like to get a list (or array) of the top or bottom N 
>elements 

Naive, straightforward solution:

(defun top-n (l n)
  (reduce #'cons l :end n :initial-value nil :from-end t))

(defun bottom-n (l n)
  (reduce #'cons l :start (- (length l) n) 
                   :initial-value nil :from-end t))

(For arrays, I'm sorry, but I'm not far enough in the study of Lisp to
answer.)

From: Wade Humeniuk
Subject: Re: top-n/bottom-n?
Date: Sun, 27 Aug 2006 17:19:09 +0000
Message-ID: <hukIg.13515$Nz6.8580@edtnps82>

Foxpointe wrote:
> I'm sure this has been done many times over so thought I'd ask for 
> pointers on existing implementations as a starting point:  Given a list 
> (or array), I'd like to get a list (or array) of the top or bottom N 
> elements which would be  returned either as the elements themselves or 
> as indexes to the elements in the list (or array) passed.

With Lisp you can write a function that can return the
elements themselves and the result shares storage with
the original sequence.  For vector (properly 1-D arrays)
the example below creates a displaced array
that maps directly to inputed sequence.  The only problem
case is writing the top-n for lists.  The result cannot
be composed of the original cons cells of the input list.

(defun bottom-n (sequence n)
   "Return a subsequence of the same type as
    sequence and directly allows access to the
    elements of sequence"
   (etypecase sequence
     (list (let ((length (length sequence)))
             (assert (<= n length))
             (nthcdr (- length n) sequence)))
     (vector (make-array n :displaced-to sequence
                         :displaced-index-offset (- (length sequence) n)))))

(defun top-n (sequence n)
   "Return a subsequence of the same type as
    sequence and directly allows access to the
    elements of sequence"
   (etypecase sequence
     (list (subseq sequence 0 n))
     (vector (make-array n :displaced-to sequence
                         :displaced-index-offset 0))))

CL-USER 1 > (defvar v #(1 2 3 4 5 6 7))
V

CL-USER 2 > (bottom-n v 3)
#(5 6 7)

CL-USER 3 > (setf vsub *)
#(5 6 7)

CL-USER 4 > vsub
#(5 6 7)

CL-USER 5 > v
#(1 2 3 4 5 6 7)

CL-USER 6 > (setf (aref vsub 0) 20)
20

CL-USER 7 > vsub
#(20 6 7)

CL-USER 8 > v
#(1 2 3 4 20 6 7)

CL-USER 9 >