arrays or lists ?

From: Martin Raspaud
Subject: arrays or lists ?
Date: Thu, 02 Sep 2004 08:55:39 +0000
Message-ID: <ch6puc$l64$1@news.u-bordeaux.fr>

Hi all,

I am currently developping something of a sound analysis application 
using common lisp.

I'd like to get the best performance, so I got a little question :

I have for example a sound that has 1000 sample. I analyse it using a 
fourier transform by transforming 50 sample (thus getting an array). I 
find the maxima of this array and perform some operation on them and 
then keep that somewhere. Then I do that with the next 50 samples of my 
1000 sample sound and so on.

thus I obtain a 2 dimension matrix, with variable length for the second 
dimension (and there is no way to know in advance what it will be)

I heard simple-arrays were better for performance, but here, as I don't 
know the size, I'm using lists.
Would vectors with fill-pointers be more appropriate in my case ?

(the thing is cmucl gets out of memory if I want a 1 sample precision 
(slinding the 50 samples by one sample only each time))

Thanks for any answer you could give.

Martin

Re: arrays or lists ? rif
- Re: arrays or lists ? Martin Raspaud
  - Re: arrays or lists ? pkhuong
  - Re: arrays or lists ? Joel Ray Holveck
  - Re: arrays or lists ? Kushagra Merchant
Re: arrays or lists ? David R. Sky

From: rif
Subject: Re: arrays or lists ?
Date: Thu, 02 Sep 2004 13:17:13 +0000
Message-ID: <wj0656w4xpi.fsf@five-percent-nation.mit.edu>

You haven't really given enough information to answer your question,
but having written several numerically intensive CMUCL apps, including
some sound processing, I'll jump in.

1.  MOST IMPORTANT.  Get your program working first.  You might be
    surprised to discover that it was already "fast enough".  If it's
    not fast enough, profile it, see what's slow, and work on
    optimizing that.  Trying to get "the best program" in advance, is,
    in my opinion, not the best way to write Lisp program.  I find
    that Lisp wins when you try to write the shortest, simplest, most
    elegant program first, and then optimizing only the parts that are
    actually slowing it down.

2.  In my own work, I find that rather than using 2-d arrays, I use an
    array of 1-d arrays.  This allows the 1-d arrays to be variable
    length.  I've found that fill-vectors lead to substantially worse
    performance.  If you're only constructing these things once and
    then reading them once later, I would just build lists (assuming
    you can construct them from the end first).  If you need to
    process them a bunch but their length stays fixed after
    construction, I would build them up as lists, then allocate a
    fixed-length array (IMPORTANT: make sure you use an array with
    typed-elements if you care about performance).

Others might weigh in with better strategies, but I've gotten a long
way with points 1 and 1 above.

Cheers,

rif

From: Martin Raspaud
Subject: Re: arrays or lists ?
Date: Fri, 03 Sep 2004 07:49:24 +0000
Message-ID: <ch9aec$hjo$1@news.u-bordeaux.fr>

rif wrote:
> You haven't really given enough information to answer your question,
> but having written several numerically intensive CMUCL apps, including
> some sound processing, I'll jump in.
> 
> 1.  MOST IMPORTANT.  Get your program working first.  You might be
>     surprised to discover that it was already "fast enough".  If it's
>     not fast enough, profile it, see what's slow, and work on
>     optimizing that.  Trying to get "the best program" in advance, is,
>     in my opinion, not the best way to write Lisp program.  I find
>     that Lisp wins when you try to write the shortest, simplest, most
>     elegant program first, and then optimizing only the parts that are
>     actually slowing it down.

Well I wrote the list version already. It's working quite well, however 
(as I said in the first post) it's getting out of memory with high 
precision calculation.

I also wrote the non-fixed-length array version as a small test 
yesterday, but it doesn't seem to go faster.

So I actually followed your advice of writting the program "elegantly" 
first (as elegant as a newbie can write) and now I'm profiling this part 
  now.

I think the bottleneck is the preforming of the fourier transform. I 
think the one I have is pretty good, though maybe not the best. It's 5 
times slower than fftw, but I couldn't get anything better (yet).
Maybe I didn't google enough, but the web didn't give me any faster 100% 
CL implementation of the transform.

> 2.  In my own work, I find that rather than using 2-d arrays, I use an
>     array of 1-d arrays.  This allows the 1-d arrays to be variable
>     length.  I've found that fill-vectors lead to substantially worse
>     performance.  If you're only constructing these things once and
>     then reading them once later, I would just build lists (assuming
>     you can construct them from the end first).  If you need to
>     process them a bunch but their length stays fixed after
>     construction, I would build them up as lists, then allocate a
>     fixed-length array (IMPORTANT: make sure you use an array with
>     typed-elements if you care about performance).

Ok, I'll try that. However, it there a way to free a list ? I mean is 
(setf my-list nil) sufficient to free the memory allocated by the list ?
And do you just do some :
(setf my-array (make-array (length my-list) :initial-contents my-list 
:element-type foo-class) ?
Or is there a more elegant way to do that ?

> Others might weigh in with better strategies, but I've gotten a long
> way with points 1 and 1 above.

Thanks a lot anyway

Martin

From: pkhuong
Subject: Re: arrays or lists ?
Date: Sat, 04 Sep 2004 19:45:54 +0000
Message-ID: <51184814.0409041145.3e2f16a7@posting.google.com>

Martin Raspaud <········@labri.fr> wrote in message news:<············@news.u-bordeaux.fr>...
[...]
> Ok, I'll try that. However, it there a way to free a list ? I mean is 
> (setf my-list nil) sufficient to free the memory allocated by the list ?
> And do you just do some :
> (setf my-array (make-array (length my-list) :initial-contents my-list 
> :element-type foo-class) ?
> Or is there a more elegant way to do that ?
Yes. Don't try to manually manage memory when you have GC. As long as
there's no reference to it, an element will be freed when it needs to
be. But apart from that, there's no explicit way to deallocate
something, so your memory usage might still be very large, but the GC
will make sure it fits in memory as well as it can.

From: Joel Ray Holveck
Subject: Re: arrays or lists ?
Date: Sat, 04 Sep 2004 20:20:26 +0000
Message-ID: <y7cisatdbw5.fsf@sindri.juniper.net>

> Ok, I'll try that. However, it there a way to free a list ? I mean is
> (setf my-list nil) sufficient to free the memory allocated by the list
> ?

Yes and no.  Technically, Lisp is never obligated to free memory.  Of
course, as a point of implementation, it will periodically free memory
when it is no longer accessible, that is, not referenced by anything.
This process is called "garbage collection", and for a while, was one
of the distinctive characteristics of Lisp.  (Nowdays, they're getting
more popular; Java's GC is well-known, and Perl likes to believe it
has a GC.)  A GC may not happen immediately, though.

So if you haven't squirreled another reference anywhere, (setf my-list
nil) will cause the list to be eventually freed, in general.

TECHNICAL NOTE: In theory, if MY-LIST is around for long enough,
there's some aspects of generational garbage collectors that might
cause it to hang around (since they often only automatically GC the
most recent generations), but that's a case I wouldn't be concerned
about.  You'd have to have a pretty long lifetime on MY-LIST for that
to happen.

But most Lisp programmers prefer to use short functions that do one
task then return.  In that case, your my-list variable will only be
referenced by that one function, and the list it references will
become inaccessible as soon as that function returns.  Example:
  (defun end-of-stream-p (stream)
    (not (peek-char nil stream nil nil)))
  (defun read-samples (stream)
    (let ((my-list (loop
                    until (end-of-stream-p stream)
                    collect (read-sample stream))))
      (coerce my-list 'vector)))

WARNING: The remainder of this post is me talking about what's
possible, not necessarily what you should do.  Generally, I'd have no
problem using READ-SAMPLES with heap allocation, and trusting the GC
to do the right thing.

There might be tricks you could play with DYNAMIC-EXTENT that would
let you signal the compiler that you won't need the list anymore once
READ-SAMPLES returns, but you have to be careful with that: the cons
cells of MY-LIST are going to become inaccessible, but the elements
are not.  From what I can tell, making a DYNAMIC-EXTENT declaration on
MY-LIST would give the compiler license to stack-allocate the elements
too, so it might free them when READ-SAMPLES returns, which would be a
problem.

You might also be able to get away with some implicit stack
allocation.  In the following example, a sufficiently clued compiler
might be able to detect that the list can be stack-allocated, but I'm
not sure that such a compiler actually exists.
  (defun read-samples (stream)
    (coerce (loop
             until (end-of-stream-p stream)
             collect (read-sample stream))
            'vector))

For all practical purposes, though, you don't need to play these silly
allocation games.  The GC will do the right thing, as long as you
don't squirrel away references where you're never going to use them,
but they're still technically accessible.

Hope this helps.

Cheers,
joelh

PS: In my examples, I used the simple type 'VECTOR for brevity.  In
reality, you should use a typed vector, as rif pointed out.  Something
like '(simple-array (simple-array sample-type (*)) (*)) should be
about right, but YMMV.  Note that my Lisps (LispWorks and CMUCL) don't
differentiate between that and just 'SIMPLE-VECTOR.  This doesn't
change the fact that the inner vectors should still be typed, even if
the outer vectors can't be.  I suspect that's what rif was talking
about, since I'd expect most Lisps wouldn't have a specialized type
for vectors of vectors, but don't let me put words in his mouth.

From: Kushagra Merchant
Subject: Re: arrays or lists ?
Date: Tue, 07 Sep 2004 07:44:23 +0000
Message-ID: <a3f71260.0409062344.50fe5d17@posting.google.com>

Martin Raspaud <········@labri.fr> wrote in message news:<············@news.u-bordeaux.fr>...
> rif wrote:
> > You haven't really given enough information to answer your question,
> > but having written several numerically intensive CMUCL apps, including
> > some sound processing, I'll jump in.
> > 
> > 1.  MOST IMPORTANT.  Get your program working first.  You might be
> >     surprised to discover that it was already "fast enough".  If it's
> >     not fast enough, profile it, see what's slow, and work on
> >     optimizing that.  Trying to get "the best program" in advance, is,
> >     in my opinion, not the best way to write Lisp program.  I find
> >     that Lisp wins when you try to write the shortest, simplest, most
> >     elegant program first, and then optimizing only the parts that are
> >     actually slowing it down.
> 
> Well I wrote the list version already. It's working quite well, however 
> (as I said in the first post) it's getting out of memory with high 
> precision calculation.
> 
> I also wrote the non-fixed-length array version as a small test 
> yesterday, but it doesn't seem to go faster.
> 
> So I actually followed your advice of writting the program "elegantly" 
> first (as elegant as a newbie can write) and now I'm profiling this part 
>   now.
> 
> I think the bottleneck is the preforming of the fourier transform. I 
> think the one I have is pretty good, though maybe not the best. It's 5 
> times slower than fftw, but I couldn't get anything better (yet).

You may find this helpful http://www.netlib.org/fftpack/fft.c

Regards,
Kushagra

From: David R. Sky
Subject: Re: arrays or lists ?
Date: Fri, 03 Sep 2004 03:48:08 +0000
Message-ID: <Pine.LNX.4.31.0409022045230.21547-100000@viper.wapvi.bc.ca>

Martin,

I don't know how to address your question, but know someone who does. Roger
Dannenberg has developed Nyquist, based on xlisp, a sound synthesis  and
analysis code. He's at ··········@cs.cmu.edu .

David



On Thu, 2 Sep 2004, Martin Raspaud wrote:

> Hi all,
>
> I am currently developping something of a sound analysis application
> using common lisp.
>
> I'd like to get the best performance, so I got a little question :
>
> I have for example a sound that has 1000 sample. I analyse it using a
> fourier transform by transforming 50 sample (thus getting an array). I
> find the maxima of this array and perform some operation on them and
> then keep that somewhere. Then I do that with the next 50 samples of my
> 1000 sample sound and so on.
>
> thus I obtain a 2 dimension matrix, with variable length for the second
> dimension (and there is no way to know in advance what it will be)
>
> I heard simple-arrays were better for performance, but here, as I don't
> know the size, I'm using lists.
> Would vectors with fill-pointers be more appropriate in my case ?
>
> (the thing is cmucl gets out of memory if I want a 1 sample precision
> (slinding the 50 samples by one sample only each time))
>
> Thanks for any answer you could give.
>
> Martin
>

--