I'm debating between two methods of storing and manipulating 2D and 3D
vectors: a struct or a 1D array. To test this, I whipped up a test
routine and structure:
(defstruct v3 x y z)

(defun v3-cross (v1 v2)
  "Returns the cross product of two 3D vectors."
  (make-v3 :x (- (* (v3-y v1) (v3-z v2)) (* (v3-z v1) (v3-y v2)))
           :y (- (* (v3-z v1) (v3-x v2)) (* (v3-x v1) (v3-z v2)))
           :z (- (* (v3-x v1) (v3-y v2)) (* (v3-y v1) (v3-x v2)))))
(defun v3-across (v1 v2)
  "Returns the cross product of two 3D vectors."
  (vector (- (* (aref v1 1) (aref v2 2)) (* (aref v1 2) (aref v2 1)))
          (- (* (aref v1 2) (aref v2 0)) (* (aref v1 0) (aref v2 2)))
          (- (* (aref v1 0) (aref v2 1)) (* (aref v1 1) (aref v2 0)))))
Now, both of these could be improved by storing x, y, and z in local
variables, but the point of the test is to see whether, overall, one
method of access really beats out the other (this is running on
LispWorks).
What gets me is that using a benchmark macro, the TIME or PROFILE
functions over 1,000,000+ iterations return very varied results each
time. I'm assuming this is caused by the garbage collector kicking in
at inopportune times?
As example numbers that I'm getting, using TIME and 1M iterations of a
single cross product on each gives me:
V3-CROSS: 3.640s, 72003376 bytes standard / 11004213 bytes conses
V3-ACROSS: 3.343s, 96003216 bytes standard / 11004257 bytes conses
Yet running the same test again results in almost swapped values (the
CROSS version runs faster than the ACROSS version). Just wondering if
anyone out there has any suggestions on how to better profile these
functions.
Note: this is not a quest for the fastest cross product... rather the
quest for the better storage method. However, learning how to profile
functions better is a good side-quest ;)
Jeff M.
--
http://www.retrobyte.org
··············@gmail.com
"Jeff" <·······@gmail.com> writes:
> I'm debating between two methods of storing and manipulating 2D and 3D
> vectors: a struct or a 1D array.
There should not be a lot of difference.
For example, in clisp, note how the same thing is done both in the
case of vectors and in the case of structures:
LISPFUNNR(svref,2)
{ /* (SVREF simple-vector index), CLTL p. 291 */
/* check simple-vector: */
if (!simple_vector_p(STACK_1))
fehler_kein_svector(TheSubr(subr_self)->name,STACK_1);
/* check index: */
var uintL index = test_index();
/* fetch element: */
VALUES1(TheSvector(STACK_1)->data[index]);
skipSTACK(2);
}
/* Subroutine for record access functions
> STACK_1: record argument
> STACK_0: index argument
< STACK: cleared up
< returns: the address of the referred record item */
local gcv_object_t* record_up (void) {
/* the record must be a Closure/Structure/Stream/OtherRecord: */
if_recordp(STACK_1, ; , { skipSTACK(1); fehler_record(); } );
var object record = STACK_1;
var uintL length = Record_length(record);
var uintL index;
if (!(posfixnump(STACK_0) && ((index = posfixnum_to_L(STACK_0)) < length)))
/* extract and check index */
fehler_index(length);
skipSTACK(2); /* clear up stack */
return &TheRecord(record)->recdata[index]; /* record element address */
}
LISPFUNNR(record_ref,2)
{ /* (SYS::%RECORD-REF record index) return the index'th entry in the record */
VALUES1(*(record_up())); /* record element as value */
}
An implementation could easily use exactly the same low-level data
structure for both structures and (vector t). Note how the CLHS even
gives the user the option to reveal such an implementation choice
with the :type option of defstruct.
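To illustrate that option, here is a minimal sketch (the V3T name is
mine, not from the thread):

    ;; With :type, DEFSTRUCT lays the "structure" out as a plain
    ;; vector, so the slot accessors and AREF address the very same
    ;; representation.
    (defstruct (v3t (:type vector)) x y z)

    (let ((v (make-v3t :x 1.0 :y 2.0 :z 3.0)))
      (list (v3t-y v)     ; slot accessor
            (aref v 1)))  ; same slot, by index

Both expressions read the same cell, so the list contains 2.0 twice.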
So you will choose one or the other depending on how you use the
fields. If the operations on your objects can be expressed simply
with absolute names, a structure will do, and it _could_ be slightly
faster with a highly optimizing compiler. On the other hand, if you
have complex operations that are better expressed with indices, you'd
be better off using vectors. With few dimensions, it's probably a
better bet to choose structures, maybe with some macrology to expand
indexed notation into absolute "fields", but I would not bother.
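The macrology mentioned could be as small as this sketch (V3-REF is a
hypothetical name; it assumes the V3 struct from the original post):

    ;; The V3 struct from the original post:
    (defstruct v3 x y z)

    ;; Expand a literal index into the corresponding slot accessor,
    ;; so indexed notation compiles down to a direct struct access.
    (defmacro v3-ref (v i)
      (ecase i
        (0 `(v3-x ,v))
        (1 `(v3-y ,v))
        (2 `(v3-z ,v))))

Note that (v3-ref v 1) expands to (v3-y v); the index must be a
literal known at macroexpansion time for this trivial version to work.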
--
__Pascal Bourguignon__ http://www.informatimago.com/
The world will now reboot; don't bother saving your artefacts.
Pascal Bourguignon wrote:
>
> "Jeff" <·······@gmail.com> writes:
> > I'm debating between two methods of storing and manipulating 2D and
> > 3D vectors: a struct or a 1D array.
>
> There should not be a lot of difference.
Actually, for what they are being used for, there could potentially be
a lot of difference (which is why I'll be using vectors), considering
other functions that exist in the standard.
That's why I really didn't want this to turn into a debate between
structs and vectors (in this particular case). I was more interested in
why the profiling seemed to bounce back and forth. Thanks for the
reply, though. :)
Jeff M.
--
http://www.retrobyte.org
··············@gmail.com
For speed, the most important thing is to declare the number types:
single-float, double-float, fixnum, or whatever. Use :type in
defstruct, and
(declare (double-float ...)) or
(declare (type (simple-array double-float (*)) ...)) after let and defun.
Then compile with
(declaim (optimize speed)). Or even
(declaim (optimize speed (safety 0) (debug 0))),
but read your compiler's manual for why (safety 0) is way risky.
In v3-across, consider simple-vector and svref instead of array/vector
and aref.
To be absolutely pedantic, wrap (the double-float ...) around every
function call that returns a double-float.
All this is ugly. But you can experiment, remove the changes that
don't matter, and wrap some of the others in a macro.
After compilation with the speed setting, compare
(disassemble 'v3-cross) and
(disassemble 'v3-across)
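Putting that advice together, a declared version of the array variant
might look like this sketch (V3-ACROSS-FAST is my name; it assumes
double-float components rather than the single-floats in the original
test data):

    (declaim (optimize speed))

    (defun v3-across-fast (v1 v2)
      ;; A specialized simple-array lets the compiler use unboxed
      ;; double-float arithmetic instead of generic dispatch.
      (declare (type (simple-array double-float (3)) v1 v2))
      (let ((r (make-array 3 :element-type 'double-float)))
        (setf (aref r 0) (- (* (aref v1 1) (aref v2 2))
                            (* (aref v1 2) (aref v2 1)))
              (aref r 1) (- (* (aref v1 2) (aref v2 0))
                            (* (aref v1 0) (aref v2 2)))
              (aref r 2) (- (* (aref v1 0) (aref v2 1))
                            (* (aref v1 1) (aref v2 0))))
        r))

Disassembling this alongside the undeclared version should make the
boxing and generic-arithmetic overhead visible.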
···············@yahoo.com wrote:
> For speed, the most important thing is to declare the number types:
> single-float, double-float, fixnum, or whatever. Use :type in
> defstruct, and
> (declare (double-float ...)) or
> (declare (type (simple-array double-float (*)) ...)) after let and defun.
>
> Then compile with
> (declaim (optimize speed)). Or even
> (declaim (optimize speed (safety 0) (debug 0))),
> but read your compiler's manual for why (safety 0) is way risky.
>
> In v3-across, consider simple-vector and svref instead of array/vector
> and aref.
>
> To be absolutely pedantic, wrap (the double-float ...) around every
> function call that returns a double-float.
>
> All this is ugly. But you can experiment, remove the changes that
> don't matter, and wrap some of the others in a macro.
>
> After compilation with the speed setting, compare
> (disassemble 'v3-cross) and
> (disassemble 'v3-across)
Yes, that'll work.
It'll *work*, but...
Instead of declaring parameters to be of type fixnum, declare them to be
something more limited. For example, if one parameter can only be between 0
and 9, declare it as (integer 0 9).
Type inference will work better, *and* you get a bonus assertion at high
safety levels. (1+ a) can return a bignum if a is declared as a fixnum, but
not if a is an (integer 0 1023), or similar.
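As a sketch of that point (the function name is mine):

    (defun next-digit (a)
      ;; With the narrow declaration the compiler can prove the result
      ;; fits in (integer 1 10), so no bignum code path is generated;
      ;; with a plain FIXNUM declaration, (1+ a) could overflow.
      (declare (type (integer 0 9) a)
               (optimize speed))
      (1+ a))

At high safety levels, the declaration doubles as a runtime check that
the argument really is a digit.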
On Tue, 07 Dec 2004 07:07:14 GMT, <·······@gmail.com> wrote:
> structs and vectors (in this particular case). I was more interested in
> why the profiling seemed to bounce back and forth. Thanks for the
The OS, especially if it's windoze.
You need to run multiple trials in such circumstances.
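For instance, one simple multiple-trial harness (BEST-OF is a
hypothetical helper, not from the thread):

    ;; Run FN several times and report the minimum wall-clock time in
    ;; internal time units; the minimum is less sensitive to OS
    ;; scheduling noise than the mean.
    (defun best-of (fn &optional (trials 5))
      (loop repeat trials
            minimize (let ((start (get-internal-real-time)))
                       (funcall fn)
                       (- (get-internal-real-time) start))))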
--
Everyman has three hearts;
one to show the world, one to show friends, and one only he knows.
Jeff wrote:
>
> What gets me, is that using a benchmark macro, the TIME or PROFILE
> functions over 1,000,000+ iterations returns very varied results each
> time. I'm assuming that this is being caused by the garbage collector
> possibly kicking in at inopportune times?
>
> As example numbers that I'm getting, using TIME and 1M iterations of a
> single cross product on each gives me:
>
> V3-CROSS: 3.640s, 72003376 bytes standard / 11004213 bytes conses
> V3-ACROSS: 3.343s, 96003216 bytes standard / 11004257 bytes conses
>
> Yet, running the same test again results in almost swapped values (the
> CROSS versions runs faster than the ACROSS version). Just wondering if
> anyone out there had any suggestions as to how and better profile these
> functions.
>
There are three possible areas that can introduce uncertainty into
the timing: multiprocessing, interrupts, and GC. I seem to be able to
reduce the uncertainty by using:
mp:without-preemption
mp:without-interrupts
hcl:with-heavy-allocation
See the reference manual under the proper sections:
http://www.lispworks.com/reference/lw43/LWRM/html/lwref.htm
You could try this version. It seems to be more stable for timing the
operation; at least one has more control over what my-time-function
does than over what is available in TIME.
(defun v3-cross-test ()
  (let ((v1 (make-v3 :x 3.0 :y 1.23 :z 4.5))
        (v2 (make-v3 :x 4.5 :y 19.43 :z 3.1)))
    (dotimes (i 1000000)
      (v3-cross v1 v2))))

(defun v3-across-test ()
  (let ((v1 (vector 3.0 1.23 4.5))
        (v2 (vector 4.5 19.43 3.1)))
    (dotimes (i 1000000)
      (v3-across v1 v2))))
(defun my-time-function (function)
  (mp:without-preemption
    (mp:without-interrupts
      (hcl:with-heavy-allocation
        (let ((start-time (get-internal-real-time)))
          (funcall function)
          (prog1
              (float (/ (- (get-internal-real-time) start-time)
                        internal-time-units-per-second))
            (hcl:gc-if-needed)))))))
Seems I have to run it a few times for it to stabilize:
CL-USER 60 > (my-time-function #'v3-cross-test)
2.985
CL-USER 61 > (my-time-function #'v3-cross-test)
3.025
CL-USER 62 > (my-time-function #'v3-cross-test)
3.114
CL-USER 63 > (my-time-function #'v3-cross-test)
3.135
CL-USER 64 > (my-time-function #'v3-cross-test)
3.155
CL-USER 65 > (my-time-function #'v3-cross-test)
2.653
CL-USER 66 > (my-time-function #'v3-cross-test)
2.653
CL-USER 67 > (my-time-function #'v3-cross-test)
2.654
CL-USER 68 > (my-time-function #'v3-cross-test)
2.654
CL-USER 69 > (my-time-function #'v3-cross-test)
2.674
CL-USER 70 > (my-time-function #'v3-cross-test)
2.654
CL-USER 71 > (my-time-function #'v3-cross-test)
2.474
CL-USER 72 > (loop for i from 0 below 30
                   sum (my-time-function #'v3-cross-test) into total
                   finally (return (/ total 30)))
2.6111

CL-USER 73 > (loop for i from 0 below 30
                   sum (my-time-function #'v3-across-test) into total
                   finally (return (/ total 30)))
3.560133333333333
CL-USER 74 >
Wade
On second thought it may be better to do:
(defun my-time-function (function)
  (mp:without-preemption
    (mp:without-interrupts
      (unwind-protect
          (progn
            (hcl:avoid-gc)
            (let ((start-time (get-internal-real-time)))
              (funcall function)
              (float (/ (- (get-internal-real-time) start-time)
                        internal-time-units-per-second))))
        (progn (hcl:normal-gc)
               (hcl:gc-if-needed))))))
Wade