Hi all,
I'm trying to speed up part of a program I'm working on. I profiled
it and concluded that a part similar to the following function
was slowing the whole process down.
(defun stupid-function ()
  (let ((array (make-array 2048 :element-type '(complex double-float)))
        (window (make-array 2048 :element-type '(complex double-float))))
    (loop for i from 0 below 2048 do ; initialize the arrays
      (setf (aref array i) (coerce (1+ i) '(complex double-float)))
      (setf (aref window i) (coerce (- 2048 i) '(complex double-float))))
    (loop for i from 1 to 2000 do
      (setf array (map '(simple-array (complex double-float) (*)) #'/
                       window array)))))
This takes around 10 seconds on my computer. The problematic part is
_map_... So I thought there might be a way to replace it somehow,
because, as you can see, the "window" is always the same...
Any ideas?
Martin
Martin Raspaud wrote:
> Hi all,
>
> I'm trying to enhance a part of the program I'm working on. I profiled
> it and came to the conclusion that a part similar to the next function
> was slowing all the process.
>
>
>
> (defun stupid-function ()
> (let ((array (make-array 2048 :element-type '(complex double-float)))
> (window (make-array 2048 :element-type '(complex double-float))))
> (loop for i from 0 below 2048 do ;initialize the arrays
> (setf (aref array i) (coerce (1+ i) '(complex double-float)))
> (setf (aref window i) (coerce (- 2048 i) '(complex double-float))))
> (loop for i from 1 to 2000 do
> (setf array (map '(simple-array (complex double-float) (*)) #'/
> window array)))))
>
>
> This takes around 10 seconds on my computer. The problematic part is
> _map_... So I thought maybe there is a way to replace it somehow,
> because as you can see, the "window" is always the same...
MAP is allocating a new array each time. It's also probably
not being inlined (use DISASSEMBLE to inspect your code).
You need to specify the optimization settings as well. The default
settings may not lead to good performance.
There's MAP-ONTO, and you could always write your own loop.
If MAP-ONTO is too slow in your implementation, complain to the
implementors that you'd like it to be inlined in such cases.
Which implementation are you using?
Paul
who plans to write performance tests sometime after ansi-tests
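For reference, the standard function that writes into an existing sequence is MAP-INTO (Common Lisp has no MAP-ONTO). Below is a minimal sketch of the non-allocating variant, assuming both vectors have the same length. The name DIVIDE-INTO is made up for illustration, and whether MAP-INTO gets inlined here depends on the implementation.

```lisp
(defun divide-into (window array)
  "Overwrite each element of ARRAY with WINDOW[i] / ARRAY[i]; return ARRAY."
  (declare (type (simple-array (complex double-float) (*)) window array)
           (optimize (speed 3) (safety 1)))
  ;; MAP-INTO stores its results in the first argument, so no fresh
  ;; vector is allocated per call (unlike MAP with a result type).
  (map-into array #'/ window array))
```

Calling (disassemble 'divide-into) afterwards shows whether the compiler open-coded the division or left a full call to a generic routine.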
Paul F. Dietz wrote:
>
> MAP is allocating a new array each time. It's also probably
> not being inlined (use DISASSEMBLE to inspect your code).
>
> You need to specify the optimization settings as well. The default
> settings may not lead to good performance.
>
> There's MAP-ONTO, and you could always write your own loop.
>
> If MAP-ONTO is too slow in your implementation, complain to the
> implementors that you'd like it to be inlined in such cases.
>
> Which implementation are you using?
>
> Paul
> who plans to write performance tests sometime after ansi-tests
I'm using CMUCL. I found map-into (not map-onto) in the HyperSpec,
but it's even slower. Actually, the setf I wrote at the end was only
there to modify "array"; that is not how I do it in my program.
What optimization settings would you recommend? I used (declare
(optimize (speed 3) (safety 0))) but it didn't do much (if anything).
Martin
Martin Raspaud wrote:
>> There's [MAP-INTO], and you could always write your own loop.
>>
>
> I tried to write my own loop, but it's not faster...
What may be happening is that values are being boxed and unboxed,
which means one heap allocation per floating-point number. This
is slow. Run DISASSEMBLE on the compiled function to see if
that's happening. Make sure the AREFs are being inlined as
well (type declarations might help if that's not happening).
What did your loop look like? Again, disassemble it and look
for places where things haven't been inlined.
Also, check that (upgraded-array-element-type '(complex double-float))
is (complex double-float). It is on recent CMUCL, but I don't
know about older versions.
Paul
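The element-type check mentioned above can be sketched as a small predicate. The name UNBOXED-COMPLEX-STORAGE-P is hypothetical, and the answer is implementation-dependent: an implementation is free to upgrade the type to T, in which case the vector holds pointers to boxed numbers.

```lisp
(defun unboxed-complex-storage-p ()
  "True if (complex double-float) vectors get a specialized representation here."
  ;; UPGRADED-ARRAY-ELEMENT-TYPE reports the element type MAKE-ARRAY will
  ;; actually use; a result of T means the elements are stored boxed.
  (equal (upgraded-array-element-type '(complex double-float))
         '(complex double-float)))
```

If this returns NIL, declarations alone are unlikely to remove the per-element allocation, since the array itself stores boxed values.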
Paul F. Dietz wrote:
> Martin Raspaud wrote:
>
>> I tried to write my own loop, but it's not faster...
>
>
> Funny, a loop does look faster to me. You may need to declare the loop
> index variables to avoid generic arithmetic. Also note that COMPLEX is
> not being inlined, which may be quite expensive (since it is probably
> heap allocating the result.)
>
> In comparing this against C, be sure you include the cost of allocating
> the vectors in the C program.
>
> (This is at safety 0, debug 0, space 0, speed 3)
>
> * (defun stupid-function ()
>     (let ((array (make-array 2048 :element-type '(complex double-float)))
>           (window (make-array 2048 :element-type '(complex double-float))))
>       (loop for i of-type fixnum from 0 below 2048 do ; initialize the arrays
>         (setf (aref array i) (complex (1+ i) 0.0d0))
>         (setf (aref window i) (complex (- 2048 i) 0.0d0)))
>       (loop for i of-type fixnum from 1 to 2000 do
>         (setf (aref array i) (/ (aref window i) (aref array i))))
>       array))
> STUPID-FUNCTION
Well, you left out a loop in there, maybe on purpose (for simplicity).
However, this did the trick.
Thanks a million.
But let me understand this better: is it because the types have been
declared, thus avoiding generic arithmetic?
Martin
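To illustrate the role of the declarations, here is a sketch of the initialization step alone, with the index declared as a fixnum and the integer converted to a double-float before COMPLEX is called. The function name MAKE-COMPLEX-RAMP and its DESCENDING keyword are made up for this example. How much of this the compiler open-codes is implementation-dependent.

```lisp
(defun make-complex-ramp (n &key descending)
  "Return a fresh (complex double-float) vector holding 1..N, or N..1 if DESCENDING."
  (declare (type (and fixnum unsigned-byte) n)
           (optimize (speed 3) (safety 1)))
  (let ((v (make-array n :element-type '(complex double-float))))
    ;; OF-TYPE FIXNUM lets the compiler use machine arithmetic for the index.
    (loop for i of-type fixnum from 0 below n
          ;; Convert to double-float first, so COMPLEX is handed two
          ;; double-floats rather than an integer and a float.
          do (setf (aref v i)
                   (complex (float (if descending (- n i) (1+ i)) 1d0) 0d0)))
    v))
```

With this helper, (make-complex-ramp 2048) and (make-complex-ramp 2048 :descending t) would stand in for the "array" and "window" vectors of the original function.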
In article <············@news.u-bordeaux.fr>,
Martin Raspaud <········@labri.fr> wrote:
> Hi all,
>
> I'm trying to enhance a part of the program I'm working on. I profiled
> it and came to the conclusion that a part similar to the next function
> was slowing all the process.
>
>
>
> (defun stupid-function ()
> (let ((array (make-array 2048 :element-type '(complex double-float)))
> (window (make-array 2048 :element-type '(complex double-float))))
> (loop for i from 0 below 2048 do ;initialize the arrays
> (setf (aref array i) (coerce (1+ i) '(complex double-float)))
> (setf (aref window i) (coerce (- 2048 i) '(complex double-float))))
> (loop for i from 1 to 2000 do
> (setf array (map '(simple-array (complex double-float) (*)) #'/
> window array)))))
>
>
> This takes around 10 seconds on my computer. The problematic part is
> _map_... So I thought maybe there is a way to replace it somehow,
> because as you can see, the "window" is always the same...
You could simply replace it with a loop:
(dotimes (j (length array))
  (setf (aref array j) (/ (aref window j) (aref array j))))
My guess is that by using MAP, you're not getting the division inlined,
but this version should allow it.
It might also help to declare the types of ARRAY and WINDOW, so that it
can generate the most efficient division code.
--
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
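The loop suggested above, wrapped in a function with the array types declared, might look like the sketch below. The name DIVIDE-IN-PLACE and the REPETITIONS parameter are assumptions added for illustration (REPETITIONS plays the role of the outer 2000-iteration loop in the original post), and WINDOW is assumed to be at least as long as ARRAY.

```lisp
(defun divide-in-place (window array &optional (repetitions 1))
  "Replace each ARRAY[j] with WINDOW[j] / ARRAY[j], REPETITIONS times; return ARRAY."
  (declare (type (simple-array (complex double-float) (*)) window array)
           (type fixnum repetitions)
           (optimize (speed 3) (safety 1)))
  (dotimes (rep repetitions array)
    (declare (ignorable rep))
    ;; With the element type known, the compiler can pick a specialized
    ;; complex division instead of dispatching on the argument types.
    (dotimes (j (length array))
      (setf (aref array j) (/ (aref window j) (aref array j))))))
```

Because the result is stored back into ARRAY, no new vector is created per pass. Whether intermediate complex values are still heap-allocated can be checked with DISASSEMBLE.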