So I wanted to write a function that would average the numbers in a
sequence: either a list or an array. My first attempt was quite lame, but
functional:
(defun average (sequence &key (key #'identity))
  ;; WARNING: Sloppy code. Could compute len and sum at same time.
  (let ((len (length sequence))
        (sum (reduce #'+ sequence :key key)))
    (/ sum len)))
This is obviously slower than it needs to be for a number of reasons. The most
obvious is that for lists, it traverses things twice, once to find the
length and then again to create the sum.
I then decided to specialize things a bit and arrived at:
(defun fast-average (sequence &key (key #'identity))
  (etypecase sequence
    (list (loop for x in sequence
                count x into length
                sum (funcall key x) into sum
                finally (return (/ sum length))))
    (vector (loop for x across sequence
                  sum (funcall key x) into sum
                  finally (return (/ sum (length sequence)))))))
Now, this works well: the runtime on a long list is about half that of the
first version, which is what you would expect if you only traverse the list
once, not twice.
The main problem is that it seems so bifurcated. By that I mean that the
first version seems to really operate on sequences while the second is
obviously two routines, each specialized on a data type.
Is there any way to harmonize these two, giving something that has better
runtime but doesn't get totally data-type specific?
--
Dave Roberts, ·············@re-move.droberts.com
Slowly but surely, the programming world is finding Lisp...
http://www.findinglisp.com/blog/
Dave Roberts wrote:
> So I wanted to write a function that would average the numbers in a
> sequence: either a list or an array. My first attempt was quite lame, but
> functional:
>
> (defun average (sequence &key (key #'identity))
>   ;; WARNING: Sloppy code. Could compute len and sum at same time.
>   (let ((len (length sequence))
>         (sum (reduce #'+ sequence :key key)))
>     (/ sum len)))
>
> This is obviously slower than it needs to be for a number of reasons. The most
> obvious is that for lists, it traverses things twice, once to find the
> length and then again to create the sum.
>
> I then decided to specialize things a bit and arrived at:
>
> (defun fast-average (sequence &key (key #'identity))
>   (etypecase sequence
>     (list (loop for x in sequence
>                 count x into length
>                 sum (funcall key x) into sum
>                 finally (return (/ sum length))))
>     (vector (loop for x across sequence
>                   sum (funcall key x) into sum
>                   finally (return (/ sum (length sequence)))))))
>
> Now, this works well: the runtime on a long list is about half that of the
> first version, which is what you would expect if you only traverse the list
> once, not twice.
>
> The main problem is that it seems so bifurcated. By that I mean that the
> first version seems to really operate on sequences while the second is
> obviously two routines, each specialized on a data type.
>
> Is there any way to harmonize these two, giving something that has better
> runtime but doesn't get totally data-type specific?
>
Is this OK?:
(defun average (sequence &key (key #'identity) &aux (len 1))
  (/ (reduce (lambda (x y)
               (incf len)
               (+ x y))
             sequence :key key)
     len))
kenny
--
Cells? Cello? Celtik?: http://www.common-lisp.net/project/cells/
Why Lisp? http://alu.cliki.net/RtL%20Highlight%20Film
Kenny Tilton wrote:
> Is this OK?:
>
> (defun average (sequence &key (key #'identity) &aux (len 1))
>   (/ (reduce (lambda (x y)
>                (incf len)
>                (+ x y))
>              sequence :key key)
>      len))
Doh! Yes, of course. Thank you. (Dave shuffles back to his keyboard, licking
his wounds... ;-)
--
Dave Roberts, ·············@re-move.droberts.com
Slowly but surely, the programming world is finding Lisp...
http://www.findinglisp.com/blog/
Interestingly...
(defun fast-average (sequence &key (key #'identity))
  (etypecase sequence
    (list (loop for x in sequence
                count x into length
                sum (funcall key x) into sum
                finally (return (/ sum length))))
    (vector (loop for x across sequence
                  sum (funcall key x) into sum
                  finally (return (/ sum (length sequence)))))))

(defun fast-average2 (sequence &key (key #'identity))
  (let ((len 1))
    (/ (reduce #'(lambda (x y) (incf len) (+ x y)) sequence :key key)
       len)))
CL-USER> (time (dotimes (i 10000) (fast-average *prices*)))
Evaluation took:
1.051 seconds of real time
0.95 seconds of user run time
0.1 seconds of system run time
0 page faults and
102,318,080 bytes consed.
NIL
CL-USER> (time (dotimes (i 10000) (fast-average2 *prices*)))
Evaluation took:
1.263 seconds of real time
1.21 seconds of user run time
0.05 seconds of system run time
0 page faults and
102,481,712 bytes consed.
NIL
CL-USER> (length *prices*)
1277
The REDUCE version is 20% slower than the other. I guess that's the effect
of having to call the LAMBDA func every time?
BTW, even the very naive version was fast enough. This was just me trying to
understand the relative cost of various Lisp forms and the tradeoffs in one
solution versus another.
--
Dave Roberts, ·············@re-move.droberts.com
Slowly but surely, the programming world is finding Lisp...
http://www.findinglisp.com/blog/
Dave Roberts wrote:
> The REDUCE version is 20% slower than the other. I guess that's the
> effect of having to call the LAMBDA func every time?
>
> BTW, even the very naive version was fast enough. This was just me
> trying to understand the relative cost of various Lisp forms and
> the tradeoffs in one solution versus another.
I'm curious for your timings on:
(defmethod average ((sequence list) &key (key #'identity))
  (loop for x in sequence
        count x into length
        sum (funcall key x) into sum
        finally (return (/ sum length))))
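And presumably a matching method for vectors, along the same lines (a sketch I haven't timed):

```lisp
;; Sketch: the vector counterpart of the list method above.  LENGTH is
;; O(1) on vectors, so there's no need to count during the traversal.
(defmethod average ((sequence vector) &key (key #'identity))
  (loop for x across sequence
        sum (funcall key x) into sum
        finally (return (/ sum (length sequence)))))
```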
On my Lispworks/Win2k, there's little difference between AVERAGE and
your FAST-AVERAGE's. (AVERAGE is ~2% slower.) However on Clisp/Win2k,
AVERAGE takes about 5X longer.
I've noticed that CLisp is unusual in terms of timing anything,
though.
MfG,
Tayssir
Tayssir John Gabbour wrote:
> Dave Roberts wrote:
>> The REDUCE version is 20% slower than the other. I guess that's the
>> effect of having to call the LAMBDA func every time?
>>
>> BTW, even the very naive version was fast enough. This was just me
>> trying to understand the relative cost of various Lisp forms and
>> the tradeoffs in one solution versus another.
>
> I'm curious for your timings on:
>
> (defmethod average ((sequence list) &key (key #'identity))
>   (loop for x in sequence
>         count x into length
>         sum (funcall key x) into sum
>         finally (return (/ sum length))))
>
> On my Lispworks/Win2k, there's little difference between AVERAGE and
> your FAST-AVERAGE's. (AVERAGE is ~2% slower.) However on Clisp/Win2k,
> AVERAGE takes about 5X longer.
>
> I've noticed that CLisp is unusual in terms of timing anything,
> though.
On SBCL, I get:
CL-USER> (time (dotimes (i 10000) (average3 *prices*)))
Evaluation took:
1.094 seconds of real time
1.03 seconds of user run time
0.06 seconds of system run time
0 page faults and
102,326,272 bytes consed.
(I renamed it to AVERAGE3, just so I wouldn't blow away my original
definition.)
I thought about using a generic function here earlier, too.
So again, to summarize:
AVERAGE: 2.273 seconds
FAST-AVERAGE (my original): 1.051 seconds
FAST-AVERAGE2 (Kenny's REDUCE+LAMBDA/closure version): 1.263 seconds
AVERAGE3 (Tayssir's GF version with LOOP): 1.094 seconds
All timings on SBCL 0.8.16.
I guess that all sort of makes sense to me. It's about what I would expect
in terms of the relative times for each of the various algorithms. My
original naive REDUCE version traversed a list twice, once to compute the
sum and once to compute the length. It takes roughly twice as long to
execute. The most efficient is the iterative version that simply computes
everything in a single traversal. The REDUCE+LAMBDA/closure version takes
slightly longer because of the various function calls that have to be
performed and the fact that the additional function and closure probably
prohibit some amount of optimization. Finally, Tayssir's GF version is
basically the iterative version plus a bit of GF dispatch overhead.
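Out of curiosity, one more variant occurred to me: hoist the KEY test out of the loop so the common no-key path never funcalls at all. A sketch for the list case (the name FAST-AVERAGE4 is just mine, and I haven't timed it):

```lisp
;; Sketch: branch on KEY once per call instead of once per element,
;; so the no-key path does no FUNCALLs at all.
(defun fast-average4 (list &key key)
  (if key
      (loop for x in list
            count t into len
            sum (funcall key x) into total
            finally (return (/ total len)))
      (loop for x in list
            count t into len
            sum x into total
            finally (return (/ total len)))))
```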
--
Dave Roberts, ·············@re-move.droberts.com
Slowly but surely, the programming world is finding Lisp...
http://www.findinglisp.com/blog/
Dave Roberts wrote:
> Interestingly...
>
> (defun fast-average (sequence &key (key #'identity))
>   (etypecase sequence
>     (list (loop for x in sequence
>                 count x into length
>                 sum (funcall key x) into sum
>                 finally (return (/ sum length))))
>     (vector (loop for x across sequence
>                   sum (funcall key x) into sum
>                   finally (return (/ sum (length sequence)))))))
>
> (defun fast-average2 (sequence &key (key #'identity))
>   (let ((len 1))
>     (/ (reduce #'(lambda (x y) (incf len) (+ x y)) sequence :key key)
>        len)))
>
> CL-USER> (time (dotimes (i 10000) (fast-average *prices*)))
> Evaluation took:
> 1.051 seconds of real time
> 0.95 seconds of user run time
> 0.1 seconds of system run time
> 0 page faults and
> 102,318,080 bytes consed.
> NIL
> CL-USER> (time (dotimes (i 10000) (fast-average2 *prices*)))
> Evaluation took:
> 1.263 seconds of real time
> 1.21 seconds of user run time
> 0.05 seconds of system run time
> 0 page faults and
> 102,481,712 bytes consed.
> NIL
> CL-USER> (length *prices*)
> 1277
>
>
> The REDUCE version is 20% slower than the other. I guess that's the effect
> of having to call the LAMBDA func every time?
>
Looks that way. I changed your version so that it doesn't default to
'identity, and instead just uses the sequence element directly when no key
is supplied, avoiding the funcall of identity. First, the original as timed
on my box:
(defun fast-average (sequence &key (key #'identity))
  (etypecase sequence
    (list (loop for x in sequence
                count x into length
                sum (funcall key x) into sum
                finally (return (/ sum length))))
    (vector (loop for x across sequence
                  sum (funcall key x) into sum
                  finally (return (/ sum (length sequence)))))))
;----------------------------------------------------------
(let ((prices (loop for n below 1277 collecting n)))
  (time (dotimes (n 10000)
          (fast-average prices))))
; cpu time (non-gc) 290 msec user, 0 msec system
; cpu time (gc) 0 msec user, 0 msec system
; cpu time (total) 290 msec user, 0 msec system
; real time 290 msec
; space allocation:
; 80,025 cons cells, 176 other bytes, 0 static bytes
;---------------------------------------------------------
(let ((prices (apply 'vector (loop for n below 1277 collecting n))))
  (time (dotimes (n 10000)
          (fast-average prices))))
; cpu time (non-gc) 541 msec user, 0 msec system
; cpu time (gc) 10 msec user, 0 msec system
; cpu time (total) 551 msec user, 0 msec system
; real time 551 msec
; space allocation:
; 80,031 cons cells, 208 other bytes, 0 static bytes
;--------------------------------------------------
Now without always calling 'identity:
(defun fast-average2 (sequence &key key)
  (etypecase sequence
    (list (loop for x in sequence
                count x into length
                sum (if key (funcall key x) x) into sum
                finally (return (/ sum length))))
    (vector (loop for x across sequence
                  sum (if key (funcall key x) x) into sum
                  finally (return (/ sum (length sequence)))))))
;----------------------------------------------------------
(let ((prices (loop for n below 1277 collecting n)))
  (time (dotimes (n 10000)
          (fast-average2 prices))))
; cpu time (non-gc) 170 msec user, 0 msec system
; cpu time (gc) 20 msec user, 0 msec system
; cpu time (total) 190 msec user, 0 msec system
; real time 191 msec
; space allocation:
; 80,022 cons cells, 240 other bytes, 0 static bytes
;---------------------------------------------------------
(let ((prices (apply 'vector (loop for n below 1277 collecting n))))
  (time (dotimes (n 10000)
          (fast-average2 prices))))
; cpu time (non-gc) 420 msec user, 0 msec system
; cpu time (gc) 0 msec user, 0 msec system
; cpu time (total) 420 msec user, 0 msec system
; real time 420 msec
; space allocation:
; 80,028 cons cells, 80 other bytes, 0 static bytes
;--------------------------------------------------
But I do not know much about tuning Lisp, so maybe there is a way to
make those funcalls more efficient?
kenny
--
Cells? Cello? Celtik?: http://www.common-lisp.net/project/cells/
Why Lisp? http://alu.cliki.net/RtL%20Highlight%20Film
Dave Roberts <·············@re-move.droberts.com> writes:
> Interestingly...
> The REDUCE version is 20% slower than the other. I guess that's the effect
> of having to call the LAMBDA func every time?
It depends a lot on your list implementation. But you should not spend the
time you're spending to get only a 20% speed increase. You've already
got an O(n) algorithm with REDUCE; if you can't come up with an O(log n)
or an O(1) one, it's not worth thinking about optimizing it!
In my implementation, the REDUCE version is fastest in run time, then closely
fast-average, and fast-average2 is slower. I could probably get
much faster times by avoiding so much consing (summing a lot of integers
into bignums).
CL-USER> (time (dotimes (i 10000) (average *prices*)))
Real time: 20.12705 sec.
Run time: 17.7 sec.
Space: 204160784 Bytes
GC: 164, GC time: 10.33 sec.
NIL
CL-USER> (time (dotimes (i 10000) (fast-average *prices*)))
Real time: 19.265816 sec.
Run time: 17.92 sec.
Space: 204160784 Bytes
GC: 163, GC time: 10.26 sec.
NIL
CL-USER> (time (dotimes (i 10000) (fast-average2 *prices*)))
Real time: 23.412487 sec.
Run time: 20.39 sec.
Space: 204560784 Bytes
GC: 164, GC time: 10.11 sec.
NIL
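For instance, keeping the running sum in a double-float avoids the bignum consing, at the cost of exactness. A quick sketch (the name FLOAT-AVERAGE is mine, and I haven't timed it here):

```lisp
;; Sketch: accumulate into a double-float so the running sum never
;; grows into a bignum.  Like the originals, it errors (division by
;; zero) on an empty list.
(defun float-average (list)
  (let ((sum 0.0d0)
        (len 0))
    (declare (double-float sum) (fixnum len))
    (dolist (x list (/ sum len))
      (incf sum (coerce x 'double-float))
      (incf len))))
```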
> BTW, even the very naive version was fast enough. This was just me trying to
> understand the relative cost of various Lisp forms and the tradeoffs in one
> solution versus another.
--
__Pascal Bourguignon__ http://www.informatimago.com/
Voting Democrat or Republican is like choosing a cabin in the Titanic.
Kenny Tilton <·······@nyc.rr.com> writes:
> (defun average (sequence &key (key #'identity) &aux (len 1))
>   (/ (reduce (lambda (x y)
>                (incf len)
>                (+ x y))
>              sequence :key key)
>      len))
Interestingly, this is much slower than the naïve version I use in my
statistics code, since LispWorks seems to generate much poorer code
for your function, at least with standard compiler settings.
Let's have a look:
(list-from-hell is a list consisting of 1 million random numbers
(0-10), timings are with LispWorks 4.3.7 on Debian Unstable/linux
2.4.25, AMD Athlon XP 2200+):
;;;;;;;;;;;;;;;;;;
CL-USER 58 > (defun mean (sequence &key (key #'identity))
               (/ (reduce #'+ sequence :key key)
                  (length sequence)))
MEAN
CL-USER 59 > (compile *)
MEAN
NIL
NIL
CL-USER 60 > (time (mean list-from-hell))
Timing the evaluation of (MEAN LIST-FROM-HELL)
user time = 0.030
system time = 0.000
Elapsed time = 0:00:00
Allocation = 16 bytes standard / 0 bytes conses
0 Page faults
Calls to %EVAL 34
562519/125000
CL-USER 61 > (defun average (sequence &key (key #'identity) &aux (len 1))
               (/ (reduce (lambda (x y)
                            (incf len)
                            (+ x y))
                          sequence :key key)
                  len))
AVERAGE
CL-USER 62 > (compile *)
AVERAGE
NIL
NIL
CL-USER 63 > (time (average list-from-hell))
Timing the evaluation of (AVERAGE LIST-FROM-HELL)
user time = 0.450
system time = 0.000
Elapsed time = 0:00:01
Allocation = 1256 bytes standard / 3168 bytes conses
0 Page faults
Calls to %EVAL 34
562519/125000
;;;;;;;;;;;;;;;;;;
Anyway, the naïve version is very, very fast, so I guess that one
lesson to learn is that unless you have very special needs (you're
doing averages /all the time/), this kind of optimization work is a
complete waste of programmer time. (In my case, I average data that are
collected from database tables, so guess where the bottleneck is...)
--
(espen)
Dave Roberts wrote:
> So I wanted to write a function that would average the numbers in a
> sequence: either a list or an array. My first attempt was quite lame, but
> functional:
>
> (defun average (sequence &key (key #'identity))
>   ;; WARNING: Sloppy code. Could compute len and sum at same time.
>   (let ((len (length sequence))
>         (sum (reduce #'+ sequence :key key)))
>     (/ sum len)))
>
> This is obviously slower than it needs to be for a number of reasons. The most
> obvious is that for lists, it traverses things twice, once to find the
> length and then again to create the sum.
>
> I then decided to specialize things a bit and arrived at:
>
> (defun fast-average (sequence &key (key #'identity))
>   (etypecase sequence
>     (list (loop for x in sequence
>                 count x into length
>                 sum (funcall key x) into sum
>                 finally (return (/ sum length))))
>     (vector (loop for x across sequence
>                   sum (funcall key x) into sum
>                   finally (return (/ sum (length sequence)))))))
>
> Now, this works well: the runtime on a long list is about half that of the
> first version, which is what you would expect if you only traverse the list
> once, not twice.
>
> The main problem is that it seems so bifurcated. By that I mean that the
> first version seems to really operate on sequences while the second is
> obviously two routines, each specialized on a data type.
>
> Is there any way to harmonize these two, giving something that has better
> runtime but doesn't get totally data-type specific?
>
What if the sequence is empty?
(defun average (sequence)
  (let ((len 0)
        (sum 0))
    (map nil #'(lambda (elt)
                 (incf len)
                 (incf sum elt))
         sequence)
    (if (zerop len)
        0
        (/ sum len))))
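A quick check of the edge case, plus a couple of ordinary calls, from evaluating the definition above:

```lisp
(average '())      ; => 0, rather than a division-by-zero error
(average '(1 2 3)) ; => 2
(average #(2 4))   ; => 3
```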
David Sletten
David Sletten wrote:
> Dave Roberts wrote:
>
>> So I wanted to write a function that would average the numbers in a
>> sequence: either a list or an array. My first attempt was quite lame, but
>> functional:
>>
>> (defun average (sequence &key (key #'identity))
>>   ;; WARNING: Sloppy code. Could compute len and sum at same time.
>>   (let ((len (length sequence))
>>         (sum (reduce #'+ sequence :key key)))
>>     (/ sum len)))
>>
>> This is obviously slower than it needs to be for a number of reasons. The most
>> obvious is that for lists, it traverses things twice, once to find the
>> length and then again to create the sum.
>>
>> I then decided to specialize things a bit and arrived at:
>>
>> (defun fast-average (sequence &key (key #'identity))
>>   (etypecase sequence
>>     (list (loop for x in sequence
>>                 count x into length
>>                 sum (funcall key x) into sum
>>                 finally (return (/ sum length))))
>>     (vector (loop for x across sequence
>>                   sum (funcall key x) into sum
>>                   finally (return (/ sum (length sequence)))))))
>>
>> Now, this works well: the runtime on a long list is about half that of
>> the
>> first version, which is what you would expect if you only traverse the
>> list
>> once, not twice.
>>
>> The main problem is that it seems so bifurcated. By that I mean that the
>> first version seems to really operate on sequences while the second is
>> obviously two routines, each specialized on a data type.
>>
>> Is there any way to harmonize these two, giving something that has
>> better runtime but doesn't get totally data-type specific?
>>
> What if the sequence is empty?
>
> (defun average (sequence)
>   (let ((len 0)
>         (sum 0))
>     (map nil #'(lambda (elt)
>                  (incf len)
>                  (incf sum elt))
>          sequence)
>     (if (zerop len)
>         0
>         (/ sum len))))
(defun average (sequence &key (key #'identity) &aux (len 1))
  (/ (reduce (lambda (x y)
               (incf len)
               (+ x y))
             sequence :key key :initial-value 0)
     len))
kenny
--
Cells? Cello? Celtik?: http://www.common-lisp.net/project/cells/
Why Lisp? http://alu.cliki.net/RtL%20Highlight%20Film
Rahul Jain <·····@nyct.net> wrote in message news:<··············@nyct.net>...
> Kenny Tilton <·······@nyc.rr.com> writes:
>
> > (defun average (sequence &key (key #'identity) &aux (len 1))
> >   (/ (reduce (lambda (x y)
> >                (incf len)
> >                (+ x y))
> >              sequence :key key :initial-value 0)
> >      len))
>
> Eh?!? The average value of no values is 0? I think not!
Another example of http://www.tilton-technology.com/rogercormannyc3.html
···············@yahoo.com (Mark McConnell) writes:
> Rahul Jain <·····@nyct.net> wrote in message news:<··············@nyct.net>...
>> Kenny Tilton <·······@nyc.rr.com> writes:
>>
>> > (defun average (sequence &key (key #'identity) &aux (len 1)) [...])
>>
>> Eh?!? The average value of no values is 0? I think not!
>
> Another example of http://www.tilton-technology.com/rogercormannyc3.html
I'm just trying to get him to give me back my soul!!
--
Rahul Jain
·····@nyct.net
Professional Software Developer, Amateur Quantum Mechanicist
Dave Roberts <·············@re-move.droberts.com> writes:
> [.....]
> The main problem is that it seems so bifurcated. By that I mean that the
> first version seems to really operate on sequences while the second is
> obviously two routines, each specialized on a data type.
>
> Is there any way to harmonize these two, giving something that has better
> runtime but doesn't get totally data-type specific?
> [.....]
IMO (though I'd like to be mistaken), this can't be helped. If you want
speed, in the general case the code must be bifurcated.
example:
;;; be me.
;;;
(defun split-string-on-sequence (s seq/char)
  (declare (optimize speed)
           (type string s)
           (type (or sequence character) seq/char))
  ;;
  ;; 3-6 times faster than Sam's cllib:split-string.
  ;; For example:
  ;;
  ;;   (cllib:split-string "c sa.fd:Sf: ,slf:s.f:de,a.sf:d " ": ,")
  ;;   ==> 49042 CPU cycles, 328 bytes consed.
  ;;
  ;;   (split-string-on-sequence "c sa.fd:Sf: ,slf:s.f:de,a.sf:d " ": ,")
  ;;   ==> 12450 CPU cycles, 200 bytes consed.
  ;;
  ;;   (cllib:split-string "foo:bar:baz:quux" '(#\:))
  ;;   ==> 48800 CPU cycles, 168 bytes consed.
  ;;
  ;;   (split-string-on-sequence "foo:bar:baz:quux" '(#\:))
  ;;   ==> 9455 CPU cycles, 104 bytes consed.
  ;;
  ;;   (cllib:split-string "foo;bar:baz quux" #(#\: #\; #\Space))
  ;;   ==> 56418 CPU cycles, 168 bytes consed.
  ;;
  ;;   (split-string-on-sequence "foo;bar:baz quux" #(#\: #\; #\Space))
  ;;   ==> 13125 CPU cycles, 104 bytes consed.
  ;;
  (if (typep s 'simple-string)
      (if (typep seq/char 'character)
          (split-simple-string-on-char s seq/char)
          (if (typep seq/char 'list)
              (if (eq (cdr seq/char) nil)
                  (split-simple-string-on-char s (car seq/char))
                  (split-simple-string-on-list-of-chars s seq/char))
              (if (= (length seq/char) 1)
                  (split-simple-string-on-char s (aref seq/char 0))
                  (if (typep seq/char 'simple-string)
                      (split-simple-string-on-simple-string s seq/char)
                      (split-simple-string-on-vector s seq/char)))))
      ;;
      ;; (split-simple-string-on-list-of-chars s (loop for i across seq/char collect i))
      ;; is faster than (split-simple-string-on-vector s seq/char), but first a
      ;; list must be created (so more consing occurs).
      ;;
      (if (typep seq/char 'character)
          (split-string-on-char s seq/char)
          (if (typep seq/char 'list)
              (if (eq (cdr seq/char) nil)
                  (split-string-on-char s (car seq/char))
                  (split-string-on-list-of-chars s seq/char))
              (if (= (length seq/char) 1)
                  (split-simple-string-on-char s (aref seq/char 0))
                  (if (typep seq/char 'simple-string)
                      (split-string-on-simple-string s seq/char)
                      (split-string-on-vector s seq/char)))))))
Regards, Szymon.
Dave Roberts <·············@re-move.droberts.com> writes:
> Is there any way to harmonize these two, giving something that has better
> runtime but doesn't get totally data-type specific?
Yes. Use SERIES. :)
--
Rahul Jain
·····@nyct.net
Professional Software Developer, Amateur Quantum Mechanicist
Rahul Jain wrote:
> Dave Roberts <·············@re-move.droberts.com> writes:
>
>> Is there any way to harmonize these two, giving something that has
>> better runtime but doesn't get totally data-type specific?
>
> Yes. Use SERIES. :)
>
That's something already on my list of things to check out. I simply haven't
had the time yet. Good point, though.
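From a quick look at the docs, something like this sketch, I suppose (untested; SERIES is a separate library, not part of the standard, and whether the two collectors actually fuse into a single traversal depends on its expression optimizer):

```lisp
;; Untested sketch using the SERIES package.
(defun series-average (sequence)
  (let ((xs (series:scan sequence)))
    ;; Ideally the SERIES macrology fuses both collectors into one
    ;; pass over XS; at worst it still computes the right answer.
    (/ (series:collect-sum xs)
       (series:collect-length xs))))
```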
--
Dave Roberts, ·············@re-move.droberts.com
Slowly but surely, the programming world is finding Lisp...
http://www.findinglisp.com/blog/