Sequence functions on arrays

From: Larry Hunter
Subject: Sequence functions on arrays
Date: Fri, 22 Nov 2002 18:41:41 +0000
Message-ID: <m3lm3lmopm.fsf@huge.uchsc.edu>

Folks,

I was recently asked by a student why some of the sequence functions
(e.g. position) aren't defined on arrays. The quick answer is that the
order of processing isn't well defined. But since row-major-order is
part of the spec, I wondered why not just use it? Is there any good
reason not to extend sequence functions to arrays like this: 

(defun array-as-sequence (array)
  (make-array (array-total-size array)
	      :displaced-to array
	      :element-type (array-element-type array)))

(defun array-subscripts-from-row-major-index (array index)
  (let ((index-left index))
    (maplist (lambda (dimensions)
	       (multiple-value-bind (quotient remainder)
		   (floor index-left (reduce #'* (rest dimensions)))
		 (setq index-left remainder)
		 quotient))
	     (array-dimensions array))))

And then 

(defun array-reduce (function array &rest args)
  (apply #'reduce `(,function ,(array-as-sequence array) ,@args)))

and 

(defun array-position (item array &rest args)
   (array-subscripts-from-row-major-index array
     (apply #'position `(,item ,(array-as-sequence array) ,@args))))

etc.

So we get 

(setq test-array (make-array '(4 2 3) :initial-contents
		     '(((4 5 6) (1 2 3))
		       ((7 8 9) (3 1 2))
		       ((10 11 12) (2 3 1))
		       ((13 14 15) (0 0 0)))))

(array-reduce #'max test-array)
=>15

(array-position 11 test-array)
=>(2 0 1)

Is there some good reason this wasn't made part of the spec? What's
wrong with it?

Larry
-- 

Lawrence Hunter, Ph.D.
Director, Center for Computational Pharmacology
Associate Professor of Pharmacology, PMB & Computer Science

phone  +1 303 315 1094           UCHSC, Campus Box C236    
fax    +1 303 315 1098           School of Medicine rm 2817b   
cell   +1 303 324 0355           4200 E. 9th Ave.                 
email: ············@uchsc.edu    Denver, CO 80262       
PGP key on public keyservers     http://compbio.uchsc.edu/hunter

Re: Sequence functions on arrays Marco Antoniotti
Re: Sequence functions on arrays Kent M Pitman
- Re: Sequence functions on arrays Barry Margolin

From: Marco Antoniotti
Subject: Re: Sequence functions on arrays
Date: Fri, 22 Nov 2002 20:41:58 +0000
Message-ID: <y6cd6oxibft.fsf@octagon.valis.nyu.edu>

Larry Hunter <············@uchsc.edu> writes:

> Folks,
> 
> I was recently asked by a student why some of the sequence functions
> (e.g. position) aren't defined on arrays. The quick answer is that the
> order of processing isn't well defined. But since row-major-order is
> part of the spec, I wondered why not just use it? Is there any good
> reason not to extend sequence functions to arrays like this: 

How come we are working on the same thing?

I just went trhough a two-days implementation of some array
manipulation routines along the lines you are describing.

I was interested in COUNT and COUNT-IF to start with, but I quickly
ended up (as usual) to over engineer the library. :)

One thing that I believe is not usefully defined in CL is the notion
of "array slices" (for lack of a better word).  Array displacement
does not quite cut it, because of the requirements on the layout of
arrays in row major order.

Think of the following

        (defvar *aa* (make-array '(3 3)
                                 :initial-contents '((1 2 3)
                                                     (4 5 6)
                                                     (7 8 9))))
        ==> *aa*
        *aa*
        ==> #2A((1 2 3) (4 5 6) (7 8 9))

There is no easy way in CL to get the sub array

        #2A((2 3) (5 6))

from *aa*.

(At least I could not figured out).  So I went out and build a small
library of functionalities that allow me to do exactly that.

        (defvar *aa-slice* (as:make-array-slice '(2 2) *aa* '(0 1)))
        ==> *array-slice*
        *aa-slice*
        ==> #2A((2 3) (5 6))
        (type-of *)
        ==> AS:ARRAY-SLICE
        (as:ref *as-slice* 0 1)
        ==> 3

I will package it up and put it up somewhere.

> (defun array-as-sequence (array)
>   (make-array (array-total-size array)
> 	      :displaced-to array
> 	      :element-type (array-element-type array)))
> 
> (defun array-subscripts-from-row-major-index (array index)
>   (let ((index-left index))
>     (maplist (lambda (dimensions)
> 	       (multiple-value-bind (quotient remainder)
> 		   (floor index-left (reduce #'* (rest dimensions)))
> 		 (setq index-left remainder)
> 		 quotient))
> 	     (array-dimensions array))))
> 
> And then 
> 
> (defun array-reduce (function array &rest args)
>   (apply #'reduce `(,function ,(array-as-sequence array) ,@args)))
> 
> and 
> 
> (defun array-position (item array &rest args)
>    (array-subscripts-from-row-major-index array
>      (apply #'position `(,item ,(array-as-sequence array) ,@args))))

Nice going.

I think the way to go would be to define a new package

        (defpackage "TRAVERSING" (:use "COMMON-LISP")
           (:shadow cl:count
                    cl:position
                    cl:find
                    ;; etc etc
                    ))

and define specialized version of these interfaces for array
structures (and, why not, for other dictionary structures as well).

I am definitively willing to pay the price of consing up an object of
type INDEX, provided that I can turn it into a fixnum, a list of
fixnums or `something else' given the appropriate context (of course,
more thought shoudl go in the definition of such library).

Cheers

-- 
Marco Antoniotti ========================================================
NYU Courant Bioinformatics Group        tel. +1 - 212 - 998 3488
715 Broadway 10th Floor                 fax  +1 - 212 - 995 4122
New York, NY 10003, USA                 http://bioinformatics.cat.nyu.edu
                    "Hello New York! We'll do what we can!"
                           Bill Murray in `Ghostbusters'.

From: Kent M Pitman
Subject: Re: Sequence functions on arrays
Date: Fri, 22 Nov 2002 19:43:23 +0000
Message-ID: <sfwzns1tmp0.fsf@shell01.TheWorld.com>

Larry Hunter <············@uchsc.edu> writes:

> I was recently asked by a student why some of the sequence functions
> (e.g. position) aren't defined on arrays. The quick answer is that the
> order of processing isn't well defined. But since row-major-order is
> part of the spec, I wondered why not just use it? Is there any good
> reason not to extend sequence functions to arrays like this: 
> 
> Is there some good reason this wasn't made part of the spec? What's
> wrong with it?

Historically, of course, ROW-MAJOR-AREF was not there until ANSI CL, 
(though row major order was defined, so maybe that doesn't matter).

I think the key reason that POSITION wasn't defined is that it's more
an example of "overloading" than of "genericity".  You can't tell if
(POSITION X Y) will return an index or a set of indexes, so it's a bit
of a vague problem.

There's also an issue that if you return the position as a consed list,
you have to do needless consing, and if you return the position as multiple
values, then you have a problem where ARRAY-RANK-LIMIT may exceed
MULTIPLE-VALUES-LIMIT.  So there are practical considerations making this
not a particularly great function...

By contrast, there was a lot of fuss over whether to allow operations
like mapping, which don't involve exposing the choice of indexes.
I'm not sure I remember why this ultimately didn't happen for the mappers,
but as I say, it was discussed.

From: Barry Margolin
Subject: Re: Sequence functions on arrays
Date: Fri, 22 Nov 2002 19:59:14 +0000
Message-ID: <m0wD9.22$Zf2.900@paloalto-snr1.gtei.net>

In article <···············@shell01.TheWorld.com>,
Kent M Pitman  <······@world.std.com> wrote:
>By contrast, there was a lot of fuss over whether to allow operations
>like mapping, which don't involve exposing the choice of indexes.
>I'm not sure I remember why this ultimately didn't happen for the mappers,
>but as I say, it was discussed.

I think we decided that it would be easy enough for people to displace a
vector to the array when they need to do this, and this technique addresses
all the sequence operations.

-- 
Barry Margolin, ······@genuity.net
Genuity, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.