From: Christopher Dollin
Subject: Filtering Vectors in CL Question
Date: 
Message-ID: <1350001@otter.HP.COM>
I have a query that has arisen out of local comparisons of Common Lisp and
Pop11.

What is the idiomatic and efficent way in CL to filter a vector; that is, to
make a new vector from an existing vector, containing only those elements
that satisfy some patameter predicate, and preserving the order?

The best I can think of is

    (defun filterv (v p)
        (let ( (vv (make-array (length v) :fill-pointer 0)) )
            (dotimes (i (length v))
                (if (funcall (p (aref v i)))
                    (vector-push vv (aref v i))
                )
            )
            vv
        )
    )

which seems to be rather ham-fisted, especially since it wastes space 
allocating an over-sized vector (if p actually removes any elements) and 
delivers an array with a fill-pointer, which may not be what you want (copying
to a simple vector may also not be what you want, as you've now wasted space
TWICE!).

What's the nice way? 

[Note: one Pop11 solution is

    define filter( v, p ); lvars v, procedure p;
        {% 
            appdata
                ( 
                v, 
                procedure ( x ); lvars x; if p( x ) then x endif endprocedure
                )
        %}
    enddefine;

which makes a vector out of all the elements in v that satisfy p, generates no
garbage, and runs acceptably fast].


Regards,
Kers                                    | "Why Lisp if you can talk Poperly?"

From: Barry Margolin
Subject: Re: Filtering Vectors in CL Question
Date: 
Message-ID: <13979@think.UUCP>
In article <·······@otter.HP.COM> ····@otter.HP.COM (Christopher Dollin) writes:
>What is the idiomatic and efficent way in CL to filter a vector; that is, to
>make a new vector from an existing vector, containing only those elements
>that satisfy some patameter predicate, and preserving the order?

(defun filterv (v p)
  (delete-if-not p v))

The second argument of DELETE-IF-NOT is actually a sequence, which
encompasses lists and vectors, and the result is a sequence of the
same type as the input.

---
Barry Margolin
Thinking Machines Corp.

······@think.com
seismo!think!barmar
From: Christopher Dollin
Subject: Re: Filtering Vectors in CL Question
Date: 
Message-ID: <1350003@otter.HP.COM>
One note and one direct mail reply (to which I don't have access, being
on a different machine ...) suggested using REMOVE-IF-NOT.

Alas! I have asked the wrong question. The existance of a primitive to
perform the operation gives few clues as to idiomatic style ...

So suppose someone says "please implement 'remove-if-not'. I would like
your solution to generate no garbage if possible and to be reasonably
efficient".

Or suppose they say "write a function to copy a list, eliminating items
that do not satisfy a predicate and duplicating elements which do".

Regards,
Kers.                                | "Why Lisp if you can talk Poperly?"
From: Barry Margolin
Subject: Re: Filtering Vectors in CL Question
Date: 
Message-ID: <15622@think.UUCP>
In article <·······@otter.HP.COM> ····@otter.HP.COM (Christopher Dollin) writes:
>So suppose someone says "please implement 'remove-if-not'. I would like
>your solution to generate no garbage if possible and to be reasonably
>efficient".

Here's a version I threw together quickly (note that it doesn't take
any options as the CL version does, and it only accepts a list, not
any sequence):

(defun simple-remove-if-not (test list)
  (let ((result nil))
    (dolist (item list)
      (unless (funcall test item)
	(push item result)))
    (nreverse result)))

The only inefficiency here is the NREVERSE at the end.  In Symbolics
Lisp I would use (LOOP ... COLLECT ...) because I know it uses a more
efficient collection mechanism involving maintaining a pointer to the
last cons in the result list; a portable version generates one garbage
cons cell (the Symbolics version uses a locative pointer to a local
variable):

(defun not-quite-so-simple-remove-if-not (test list)
  (let* ((result (ncons nil))
	 (current result))
    (dolist (item list)
      (unless (funcall test item)
	(rplacd current
		(setq current (ncons item)))))
    (cdr result)))

And here's one that doesn't generate any garbage but has a slightly
slower inner loop:

(defun even-less-simple-remove-if-not (test list)
  (let* ((result nil)
	 (current result))
    (dolist (item list)
      (unless (funcall test item)
	(if current
	    (rplacd current
		    (setq current (ncons item)))
	  (setq current (setq result (ncons item))))))
    result))
Barry Margolin
Thinking Machines Corp.

······@think.com
uunet!think!barmar
From: Jeff Dalton
Subject: Re: Filtering Vectors in CL Question
Date: 
Message-ID: <242@aiva.ed.ac.uk>
In article <·······@otter.HP.COM> ····@otter.HP.COM (Christopher Dollin)
writes:
>One note and one direct mail reply (to which I don't have access, being
>on a different machine ...) suggested using REMOVE-IF-NOT.

I tried to send mail, but I think it failed (I got a failure message).
Now I'll try news...

>Alas! I have asked the wrong question. The existance of a primitive to
>perform the operation gives few clues as to idiomatic style ...

Well, it *is* the idiomatic style.  What do you do in Pop?  When you
asked the question, you gave this solution:

    define filter( v, p ); lvars v, procedure p;
        {% 
            appdata
                ( 
                v, 
                procedure ( x ); lvars x; if p( x ) then x endif endprocedure
                )
        %}
    enddefine;

> which makes a vector out of all the elements in v that satisfy p, 
> generates no garbage, and runs acceptably fast.

What would you do without appdata, which looks to be at about the same
level as REMOVE-IF-NOT?

>So suppose someone says "please implement 'remove-if-not'. I would like
>your solution to generate no garbage if possible and to be reasonably
>efficient".

>Or suppose they say "write a function to copy a list, eliminating items
>that do not satisfy a predicate and duplicating elements which do".

Well, if it's a *list*, you can do:

     (defun filter (p l)
       (mapcan #'(lambda (x) (if (funcall p x) (list x))) l))

or various other things...  But it seemed that the point of your
original question was that doing it for a *vector* was messy in Lisp
but (happened to be) easy in Pop.  That's just because Pop has
different primitives.  So I think REMOVE-IF-NOT was a fair answer.

As for your other questions...

In <an earlier article> Christopher Dollin writes:
> If p and q are functions with arbitrary in-arity and out-arity, what is the
> idiomatic efficient way of writing a function
>
>    (compose p q)

Your Lisp solution is pretty much how it must be done, but in some
implementations it would not require any consing because the &rest
list would be on the stack.  In Pop, you can use chain or something,
can't you?  In CL, I suspect people just write the appropriate
lambda-expression in-place rather than use a general composition
operation.

> Similarly if p is a function, what is an idiomatic or efficient way to
> define
>
>     (partapply p a1 ... an)
> 
> which produces a function which, when applied to arguments b1 ... bn,
> applies p to a1 ... an b1 ... bn.

If you want to do exactly this, Pop wins because it's built-in, but in
most cases lexical closures are more appropriate.  For example, you
can easily make dynamic lists in CL or implement a general DELAY w/o
having to know the variables that must be "frozen".

Pop does have some primitives that make some things easier.  Lisp has
different primitives that make other things easier.  In some cases,
Pop is nicer; in others, it isn't.

Jeff Dalton,                      JANET: ········@uk.ac.ed             
AI Applications Institute,        ARPA:  ·················@nss.cs.ucl.ac.uk
Edinburgh University.             UUCP:  ...!ukc!ed.ac.uk!J.Dalton
From: Stephen Knight
Subject: Re: Filtering Vectors in CL Question
Date: 
Message-ID: <1350010@otter.hple.hp.com>
The neat collection of solutions proposed by Barry only work for lists and not
vectors.  The idiom Chris is trying to investigate (I believe) is 
that of constructing vectors without going through an 
intermediate structure when the size of the vector is not known beforehand.
Of course, this problem can be generalised to :-

    Construct a structure with an unknown number of components
    in which some computation needs to be done to figure out the
    components.  This construction should generate as little garbage
    as possible (none) and be reasonably efficient (order of the 
    number of components with constant factor < 3, say.)

The trick which makes this possible in Pop11 is the global, open stack PLUS
the knowledge that this stack is efficient to access.  The problem in Lisp
style languages is that there is no dynamically growable structure that 
is comparably efficient.  As such, I think that all solutions in Lisp have to
be justified in terms of the guaranteed performance of the primitives.

The two styles of solution I see are either to arrange to have a stack-like
entity that is fairly cheap or to reclaim the intermediate store.  The first
solution leads into extensible arrays -- and then insisting that implementors
should be obliged to guarantee space & time efficiency.  The second approach
is to construct all lists with a version of "cons" that checks a free list
first before allocating fresh store and then provide an operation for returning
list cells to the free list.  This is a fairly widely known hack that 
should have found its way into CLtL.

The general problem, stated roughly above, finds its way into my code with
surprising regularity.  Similarly, the use of partially applied functions
and multiple results are elegantly facilitated by the "open stack" approach
of Pop11.  Although, coming from a Lisp background, I found the idioms 
associated with the open stack a bit strange at first, I found myself
missing it when I went back to Lisp.  These days I hack in Pop11 almost
exclusively & feel sure that a lot of people who find Lisp frustrating
would get a lot out of looking at this language, too.

Steve Knight