array/vector indices

From: Albert Reiner
Subject: array/vector indices
Date: Thu, 19 Feb 2004 12:42:50 +0000
Message-ID: <m1hdxnqly6.fsf@reiner.chello.at>

Hi,

from Fortran I am used to being able to declare not only the number of array
elements but rather the index bounds; e.g., I can have a one-dimensional array
indexed from -5 to +5 rather than from 0 to 10.  Quite often this is a very
handy feature, both for having more readable code and for saving unnecessary
additions.

If my reading of the HyperSpec is correct, in CL array indices always start at
0, and even array displacement does not seem to help: Although this allows me
to make index i in vector B to correspond to index i+k in vector A, for some
reason neither i nor k seem to be allowed to be negative.

Is this correct?  And how do people deal with this kind of problem: Adding or
subtracting a constant at every array access seems wasteful; using hash tables
with integer keys does not seem fast, either; and re-structuring the code to
shifted exponents goes against readability and maintainability.
I guess my real problem is that I have not yet developed an intuition for
performance of Lisp - maybe adding a fixnum is negligible in comparison with
the cost of an array access anyway.

Thanks in advance,

Albert.

Re: array/vector indices Paul F. Dietz
Re: array/vector indices Pascal Costanza
- Re: array/vector indices Pascal Bourguignon
Re: array/vector indices Tim Bradshaw
- Re: array/vector indices Barry Margolin
  - Re: array/vector indices Christopher C. Stacy
    - Re: array/vector indices Albert Reiner
      - Re: array/vector indices Rahul Jain
        Re: array/vector indices Kalle Olavi Niemitalo
        Re: array/vector indices Rahul Jain
  - Re: array/vector indices Tim Bradshaw

From: Paul F. Dietz
Subject: Re: array/vector indices
Date: Thu, 19 Feb 2004 13:05:12 +0000
Message-ID: <6ZmdnXAHxIO9KKndRVn-jg@dls.net>

Albert Reiner wrote:

> Is this correct?  And how do people deal with this kind of problem: Adding or
> subtracting a constant at every array access seems wasteful;

What do you think the fortran compiler is doing behind the scenes?  If it
optimizes the extra arithmetic away, then so could the lisp compiler.

If you want to hide the extra arithmetic in the source code, use macros.
This is a common solution when you notice the same idiom repeating itself
in your code.

	Paul

From: Pascal Costanza
Subject: Re: array/vector indices
Date: Thu, 19 Feb 2004 13:59:04 +0000
Message-ID: <c12ff8$36l$1@newsreader2.netcologne.de>

Albert Reiner wrote:

> Hi,
> 
> from Fortran I am used to being able to declare not only the number of array
> elements but rather the index bounds; e.g., I can have a one-dimensional array
> indexed from -5 to +5 rather than from 0 to 10.  Quite often this is a very
> handy feature, both for having more readable code and for saving unnecessary
> additions.

Not extensively tested:

(defstruct array-with-constant-addend
   addend
   array)

(defun create-array-with-constant-addend (from to)
   (let ((length (- to from)))
     (if (<= length 0)
       (error "Array of no useful size.")
       (make-array-with-constant-addend
        :addend (- from)
        :array (make-array length)))))

(declaim (inline aref*))

(defun aref* (array index)
   (if (array-with-constant-addend-p array)
     (aref (array-with-constant-addend-array array)
           (+ index (array-with-constant-addend-addend array)))
     (aref array index)))

(declaim (inline (setf aref*)))

(defun (setf aref*) (newvalue array index)
   (if (array-with-constant-addend-p array)
     (setf (aref (array-with-constant-addend-array array)
                 (+ index (array-with-constant-addend-addend array)))
           newvalue)
     (setf (aref array index) newvalue)))


? (setf v (create-array-with-constant-addend -5 5))
#S(ARRAY-WITH-CONSTANT-ADDEND :ADDEND 5 :ARRAY #(0 0 0 0 0 0 0 0 0 0))
? (setf (aref* v -5) 0)
0
? (setf (aref* v -6) 0)
 > Error: Array index -1 out of bounds for #<SIMPLE-VECTOR 10> .
 > While executing: CCL::TOPLEVEL-EVAL
 > Type Command-. to abort.
See the Restarts� menu item for further choices.
1 >
Aborted
? (setf (aref* v 4) 0)
 > Error: Array index 10 out of bounds for #<SIMPLE-VECTOR 10> .
 > While executing: CCL::TOPLEVEL-EVAL
 > Type Command-. to abort.
See the Restarts� menu item for further choices.
1 >
Aborted
? (setf (aref* v 4) 0)
0


One could/should polish the code by adding more suitable error messages, 
support for multi-dimensional arrays, etc., but that's basically it.


Pascal

-- 
Tyler: "How's that working out for you?"
Jack: "Great."
Tyler: "Keep it up, then."

From: Pascal Bourguignon
Subject: Re: array/vector indices
Date: Thu, 19 Feb 2004 16:57:53 +0000
Message-ID: <87k72j9fce.fsf@thalassa.informatimago.com>

Pascal Costanza <········@web.de> writes:

> Albert Reiner wrote:
> 
> > Hi,
> > from Fortran I am used to being able to declare not only the number
> > of array
> > elements but rather the index bounds; e.g., I can have a one-dimensional array
> > indexed from -5 to +5 rather than from 0 to 10.  Quite often this is a very
> > handy feature, both for having more readable code and for saving unnecessary
> > additions.
> 
> Not extensively tested:
> 
> (defstruct array-with-constant-addend
> (defun create-array-with-constant-addend (from to)

That does not sound abstract enough!

(defstruct array-with-bounds  #| ... |# )
(defmacro make-array-with-bounds (bounds &key  #| ... |# )  #| ... |# )
(defmacro abref (array &rest indices) #| ... |# )


(defvar ab (make-array-with-bounds '( (-5 5) (0 9) (-10 -2) )))
(setf (abref ab -1 1 -3) 1)


-- 
__Pascal_Bourguignon__                     http://www.informatimago.com/
There is no worse tyranny than to force a man to pay for what he doesn't
want merely because you think it would be good for him.--Robert Heinlein
http://www.theadvocates.org/

From: Tim Bradshaw
Subject: Re: array/vector indices
Date: Fri, 20 Feb 2004 12:04:32 +0000
Message-ID: <fbc0f5d1.0402200404.3a14db44@posting.google.com>

Albert Reiner <······@chello.at> wrote in message news:<··············@reiner.chello.at>...
>
> 
> Is this correct?  And how do people deal with this kind of problem: Adding or
> subtracting a constant at every array access seems wasteful; using hash tables
> with integer keys does not seem fast, either; and re-structuring the code to
> shifted exponents goes against readability and maintainability.
> I guess my real problem is that I have not yet developed an intuition for
> performance of Lisp - maybe adding a fixnum is negligible in comparison with
> the cost of an array access anyway.
>

Adding a constant is what the Fortran system must do anyway!  In
general, operations which don't access memory are very much faster
than ones that do on modern systems.  However there are many caveats
which make performance hard to predict - for instance memory access
patterns can either result in much faster performance than you'd
expect (if you get lots of cache hits and/or can get the memory
pileline working for you) or much slower (if you can't).  However if
the compiler can (a) convince itself that all the arithmetic involved
is fixnums, and (b) arrange for it all to happen in registers (so:
load the offset into a register and keep it there), then the overhead
for the Lisp case should be no more than that for Fortran.

Of course, Lisp doesn't allow you to `natively' express this kind of
offset, but you can do so either by defining some kind of structural
wrapper, or - possibly better - by defining a WITH-ARRAY-DISPLACEMENT
macro which defines a local macro in its body to do the offset access.
 You probably want to think about the design of this, as the offset is
per array, obviously.  It might be sensible to have some signature
like:

WITH-ARRAY-DISPLACEMENT ((var/accessor &rest offsets) &body forms)

where var/accessor is either the name of an array variable or a list
of (array accessor.  Then:

(with-array-displacement (x ...)
  ;; here the macro X accesses the array bound to X with offsets
  )

(with-array-displacement ((x y) ...)
  ;; here X accesses Y
  )

This latter form is important because you might want to do:

(with-array-displacement ((x (ship-coordinates my-ship)) ...)
  ...)

And this needs to expand to something like:

(let ((.tmp. (ship-coordinate my-ship))) ;.TMP. is a gensym in real
life
  (macrolet ((x (...) ... .tmp. ))
    ...))

There's a whole issue about whether the offsets are constant during
the execution of the macro and so on which also needs to be addressed:
I haven't thought about this very hard!

--tim

From: Barry Margolin
Subject: Re: array/vector indices
Date: Fri, 20 Feb 2004 16:23:50 +0000
Message-ID: <barmar-CCC984.11235020022004@comcast.ash.giganews.com>

In article <····························@posting.google.com>,
 ··········@tfeb.org (Tim Bradshaw) wrote:

> Adding a constant is what the Fortran system must do anyway!

Actually, it doesn't have to do this on every access.  In the case of 
one-dimensional vectors, it can optimize this away by adjusting the 
array's base address by that constant.

-- 
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

From: Christopher C. Stacy
Subject: Re: array/vector indices
Date: Fri, 20 Feb 2004 17:00:02 +0000
Message-ID: <uy8qx660d.fsf@news.dtpq.com>

I thought the OP's actual issue was that he liked the semantics 
of vectors that had negative indices (rather than concern over
the machine efficiency of the pointer arithmetic).

This is another example of the "arrays are not generic extensible"
argument in Lisp that comes up from time to time.

From: Albert Reiner
Subject: Re: array/vector indices
Date: Sat, 21 Feb 2004 17:34:53 +0000
Message-ID: <m1fzd4e3ok.fsf@reiner.chello.at>

First of all, thanks for all the answers.

······@news.dtpq.com (Christopher C. Stacy) writes:

> I thought the OP's actual issue was that he liked the semantics 
> of vectors that had negative indices (rather than concern over
> the machine efficiency of the pointer arithmetic).

Actually, my main concern is to get an idea of what CL provides me with and
what not, and also to make sure I'm not just looking at the wrong places in
the HyperSpec: not long ago I could not find how to check for presence of a
substring, simply because I read about strings, vectors and arrays but forgot
sequences, and I looked for symbols named, e.g., ...-find-...  rather than
...-search-....  Most of the time it is pretty clear to me to understand how I
can get what I want if my reading of the standard is correct and some handy
feature is not provided.

So far, I have the impression that the CL constructs typically are very
general, with their many optional parameters that allow one to get exactly
what one wants.  In particular, array displacement seemd to provide *nearly*
the thing I wanted, so I was a bit surprised that my reading of the HS seemed
to indicate that array handling is far less flexible than Fortran-90's (which
itself probably has something to do with the historical evolution of Fortran,
the default lower bound of 1 in particular), and I wanted to confirm that.

Another factor is that I have done a sufficient amount of work on both Fortran
and Haskell to rarely be surprised by the performance, and I do not yet have a
feeling for the costs associated with various ways of doing things in CL.  So,
yes, in general I am also interested in knowing the performance implications,
though I am quite happy if the code just runs fast enough.

Thanks again,

Albert.

From: Rahul Jain
Subject: Re: array/vector indices
Date: Sat, 21 Feb 2004 18:28:34 +0000
Message-ID: <87oerscmnh.fsf@nyct.net>

Albert Reiner <······@chello.at> writes:

> So far, I have the impression that the CL constructs typically are very
> general, with their many optional parameters that allow one to get exactly
> what one wants.  In particular, array displacement seemd to provide *nearly*
> the thing I wanted, so I was a bit surprised that my reading of the HS seemed
> to indicate that array handling is far less flexible than Fortran-90's (which
> itself probably has something to do with the historical evolution of Fortran,
> the default lower bound of 1 in particular), and I wanted to confirm that.

IMHO, an array is simply a bunch of data arranged in a cartesian
fashion. The coordinates have nothing to do with the arrangement of the
data structure. Vaguely like topology vs. geometry.

Really, the feature you're looking for is at the level of application,
not data structure. You want some coordinate translation to be applied
before accessing the data. This can easily be accomplished by deciding
on a general interface for such an object and then implementing that
general interface.

More importantly, since these are application objects, you can give the
accessors more appropriate names, like objects-at-position or
newest-object-at-position or whatever. The point is that you should
define your application objects separately from the implementation that
happens to be most efficient for their usage patterns. Then there will
be no stress caused by a need for a change in the underlying data
structures used, maybe due to a new feature which changes the
assumptions about the common operations done on the data structure.

HTH,
-- 
Rahul Jain
·····@nyct.net
Professional Software Developer, Amateur Quantum Mechanicist

From: Kalle Olavi Niemitalo
Subject: Re: array/vector indices
Date: Sat, 21 Feb 2004 22:06:57 +0000
Message-ID: <87n07cw0hq.fsf@Astalo.kon.iki.fi>

Rahul Jain <·····@nyct.net> writes:

> Really, the feature you're looking for is at the level of application,
> not data structure. You want some coordinate translation to be applied
> before accessing the data.

Such translation can also be used to implement multidimensional
arrays on top of vectors, yet the language supports both.

From: Rahul Jain
Subject: Re: array/vector indices
Date: Tue, 24 Feb 2004 01:48:06 +0000
Message-ID: <87oerpdz8p.fsf@nyct.net>

Kalle Olavi Niemitalo <···@iki.fi> writes:

> Rahul Jain <·····@nyct.net> writes:
>
>> Really, the feature you're looking for is at the level of application,
>> not data structure. You want some coordinate translation to be applied
>> before accessing the data.
>
> Such translation can also be used to implement multidimensional
> arrays on top of vectors, yet the language supports both.

A multidimensional array has a significantly different conceptual "look"
than a 1D vector. Displaced arrays are probably only needed to be
implemented by the language implementor because the internal storage
mechanism for the array is implementation-specific and optimizations
based on that information should be available.

IMHO, transformed array accessors are in the domain of the application,
just like creating hash-tables for a custom application-specific
binding-space (although my specified-types library implements a general
way to create something like this with very little code). Maybe there is
enough need for either of these features to be in the standard, but I
wouldn't view it as a deficiency or a problem.

Actually, my tables library has a very generic way of specifying the
access and underlying storage of all kinds of associative data strucures
as mixins: sparse, dense, and with any variety of key types, ranks, and
value types. The real trick is figuring out some protocol for
change-class to move the data from one structure to another (as much as
possible) and for this changing to be done dynamically as the dataset
grows, shrinks, and changes. This isn't exactly what the OP was asking
for, but maybe it'll give some idea of the power of lisp to easily
create such abstract, yet optimized, systems.

-- 
Rahul Jain
·····@nyct.net
Professional Software Developer, Amateur Quantum Mechanicist

From: Tim Bradshaw
Subject: Re: array/vector indices
Date: Sun, 22 Feb 2004 18:04:26 +0000
Message-ID: <ey3d6876led.fsf@cley.com>

* Barry Margolin wrote:

> Actually, it doesn't have to do this on every access.  In the case of 
> one-dimensional vectors, it can optimize this away by adjusting the 
> array's base address by that constant.

Yes.  I think (without even trying an implementation) that a
macro-based thing could do that to for CL.  It might be fairly hard to
get right though.

--tim