sum over list?

From: Drew Krause
Subject: sum over list?
Date: Sat, 26 May 2001 16:41:29 +0000
Message-ID: <3B0FDCB8.41FDB9CB@mindspring.com>

Boy is this a dumb question, but ....

I need to create a function which sums all members of a variable-length
list, e.g.

(sum '(2 3 4)) => 9
(sum '(5 6)) => 11

I'm having trouble doing this for variable lengths, as using '+'
directly on the list of course won't work.
Any help appreciated!

Re: sum over list? Jordan Katz
- Re: sum over list? Kent M Pitman
  - Re: sum over list? Jordan Katz
    - Re: sum over list? Kent M Pitman
      - Open-coded vs closed-coded (Was: sum over list?) Harald Hanche-Olsen
        Re: Open-coded vs closed-coded (Was: sum over list?) Kent M Pitman
        Re: Open-coded vs closed-coded (Was: sum over list?) Gareth McCaughan
- Re: sum over list? Marc Battyani
Re: sum over list? Drew Krause
Re: sum over list? Francois Aguet
Re: sum over list? Frode Vatvedt Fjeld
Re: sum over list? Erik Naggum
- Re: sum over list? Kent M Pitman

From: Jordan Katz
Subject: Re: sum over list?
Date: Sat, 26 May 2001 14:07:28 +0000
Message-ID: <m3pucw9pb3.fsf@underlevel.underlevel.net>

Drew Krause <········@mindspring.com> writes:

> Boy is this a dumb question, but ....
> 
> I need to create a function which sums all members of a variable-length
> list, e.g.
> 
> (sum '(2 3 4)) => 9
> (sum '(5 6)) => 11
> 
> I'm having trouble doing this for variable lengths, as using '+'
> directly on the list of course won't work.
> Any help appreciated!

Let me warm you that I am a Lisp newbie as well, but this definition
works for me (using CLISP):

  (defun sum (args)
    (apply #'+ args))
-- 
Jordan Katz <····@underlevel.net>  |  Mind the gap

From: Kent M Pitman
Subject: Re: sum over list?
Date: Sat, 26 May 2001 15:55:28 +0000
Message-ID: <sfwlmnkyuj3.fsf@world.std.com>

Jordan Katz <····@underlevel.net> writes:

> > I need to create a function which sums all members of a variable-length
> > list, e.g.
> > 
> > (sum '(2 3 4)) => 9
> > (sum '(5 6)) => 11
> > 
> > I'm having trouble doing this for variable lengths, as using '+'
> > directly on the list of course won't work.
> > Any help appreciated!
> 
> Let me warm you that I am a Lisp newbie as well, but this definition
> works for me (using CLISP):
> 
>   (defun sum (args)
>     (apply #'+ args))

This definition will indeed work for small lists, but you should know
that since it requires setting up a function call block with as many
args as the given argument, in general it may blow up for very long
lists.  So if you know your list to be short, this is the way to go.

If it's longer, use REDUCE instead.  REDUCE does left-associative (or,
with :FROM-END T, right-associative) pairwise binary application 
of a function in order to achieve a single result.  So while
 (apply  #'+ '(1 2 3)) == (+ 1 2 3)
 (reduce #'+ '(1 2 3)) == (+ (+ 1 2) 3)
See the examples part of the dictionary entry on REDUCE in CLHS for more
samples uses.

CLHS, the Common Lisp HyperSpec, can be found at:
 http://www.xanalys.com/software_tools/reference/HyperSpec/FrontMatter/

Or use LOOP, as described already described by Frode Fjeld in another post.

From: Jordan Katz
Subject: Re: sum over list?
Date: Sun, 27 May 2001 02:53:34 +0000
Message-ID: <m3snhr8pu9.fsf@underlevel.underlevel.net>

Kent M Pitman <······@world.std.com> writes:

[snip]
> If it's longer, use REDUCE instead.  REDUCE does left-associative (or,
> with :FROM-END T, right-associative) pairwise binary application 
> of a function in order to achieve a single result.  So while
>  (apply  #'+ '(1 2 3)) == (+ 1 2 3)
>  (reduce #'+ '(1 2 3)) == (+ (+ 1 2) 3)
> See the examples part of the dictionary entry on REDUCE in CLHS for more
> samples uses.

Is reduce a better solution simply because it handles large lists
correctly or is it also faster?

Thanks,
-- 
Jordan Katz <····@underlevel.net>  |  Mind the gap

From: Kent M Pitman
Subject: Re: sum over list?
Date: Sun, 27 May 2001 07:36:13 +0000
Message-ID: <sfwy9rj6y6q.fsf@world.std.com>

Jordan Katz <····@underlevel.net> writes:

> Kent M Pitman <······@world.std.com> writes:
> 
> [snip]
> > If it's longer, use REDUCE instead.  REDUCE does left-associative (or,
> > with :FROM-END T, right-associative) pairwise binary application 
> > of a function in order to achieve a single result.  So while
> >  (apply  #'+ '(1 2 3)) == (+ 1 2 3)
> >  (reduce #'+ '(1 2 3)) == (+ (+ 1 2) 3)
> > See the examples part of the dictionary entry on REDUCE in CLHS for more
> > samples uses.
> 
> Is reduce a better solution simply because it handles large lists
> correctly or is it also faster?

Keep in mind that fast has no meaning if the result will not be correct.
But with that in mind, here's the long answer...

However, as long as you know the list is suitably short, probably
APPLY is faster.  APPLY is quite primitive and likely to be
open-coded.  REDUCE will probably (thought not necessarily) be a
closed-coded function call and already that will make it slower.

As someone already mentioned, the variable CALL-ARGUMENTS-LIMIT tells
you how many counts as "small".  The value of this variable will be
different in different implementations, but portably it is required to
be at least 50.  Below that number, you should win with APPLY.  I
usually use APPLY for functions of fixed arity or functions where I
know the arglist. e.g., a quoted arglist like
 (apply #'+ '(1 2 3))
you can see the list is small.  Or 
 (apply #'complex x)
because COMPLEX takes only 2 args, so I don't need to know what X is.
I usually use REDUCE when I don't know how long it might be. e.g.,
 (apply #'+ (mapcar #'age (all-citizens))) ;BAD
 (reduce #'+ (mapcar #'age (all-citizens))) ;GOOD
Note, though, that REDUCE has to involve a pairwise operator.  It's
in principle possible that you could construct a list so long that
 (apply #'open x) 
would fail, but that would just be bad programming.  It won't help to
do 
 (reduce #'open x) ;VERY BAD!
because OPEN isn't a pair-wise operator.

Also, back on the isue of speed, REDUCE takes keyword args, and
decoding those at runtime will be slow if the compiler has no special
optimizers for it.  The standard does not require compilers to be
smart, only the marketplace controls that.  So implementations that
tout themselves as high-peformance tend to notice common patterns of
keyword arguments and rewrite them as calls to system-internal fixed
argument functions to avoid the performance hit.  But even so, the
final result would typically be closed-coded.

All of that said, though, my best advice is this: Just write your code
in the way it feels correct and comfortable to write it.  When you
have two functions of equivalent power that seem to express what you
want, use the one you feel happy calling.  Then if/when your program
seems slow later, use the implementation's performance/tuning tools to
find why and optimize only those parts that are killing you.  The
reasons are these: Most of the time, you will throw away your program
long before it ever gets to commercial delivery, so speed of execution
is secondary to your own personal time spent writing and debugging the 
program.  And also, the parts of your program that are slow still mostly
won't really in practice unless you're in what people call a "tight inner
loop"--then the slowness is magnified in a way that it's worth worrying
about ... IF you've gotten to the deployment phase.  

But for the most part, program as if the language will be suitably fast
and report what is not as a bug; do not assume that a poorly compiled piece
of code is "how it should be".  If you discover something poorly compiled,
report it as a bug.  Most vendors want to know about ideas for optimizing
and will be responsive to reported problems.

From: Harald Hanche-Olsen
Subject: Open-coded vs closed-coded (Was: sum over list?)
Date: Sun, 27 May 2001 10:54:21 +0000
Message-ID: <pcovgmnnjtu.fsf_-_@thoth.home>

+ Kent M Pitman <······@world.std.com>:

| However, as long as you know the list is suitably short, probably
| APPLY is faster.  APPLY is quite primitive and likely to be
| open-coded.  REDUCE will probably (thought not necessarily) be a
| closed-coded function call and already that will make it slower.

From this I can make a few educated guesses at what the terms
open-coded and closed-coded mean, but in case I guessed wrong, I'd
appreciate hearing a brief definition from the horse's mouth, as it
were.  Care to explain further?  (Presumably, open-coded means the
compiler knows what the function does, so it can replace the function
call by actual code?)

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- Yes it works in practice - but does it work in theory?

From: Kent M Pitman
Subject: Re: Open-coded vs closed-coded (Was: sum over list?)
Date: Sun, 27 May 2001 14:07:22 +0000
Message-ID: <sfwn17ynaw5.fsf@world.std.com>

Harald Hanche-Olsen <······@math.ntnu.no> writes:

> + Kent M Pitman <······@world.std.com>:
> 
> | However, as long as you know the list is suitably short, probably
> | APPLY is faster.  APPLY is quite primitive and likely to be
> | open-coded.  REDUCE will probably (thought not necessarily) be a
> | closed-coded function call and already that will make it slower.
> 
> From this I can make a few educated guesses at what the terms
> open-coded and closed-coded mean, but in case I guessed wrong, I'd
> appreciate hearing a brief definition from the horse's mouth, as it
> were.  Care to explain further?  (Presumably, open-coded means the
> compiler knows what the function does, so it can replace the function
> call by actual code?)

Yes, exactly. I probably should use the term "inline" because that's
the term CL uses.  closed-coded means compiled as a separate function
call (whose nature is closed to view, I suppose--not sure the
etymology, but must be something like that).

From: Gareth McCaughan
Subject: Re: Open-coded vs closed-coded (Was: sum over list?)
Date: Mon, 28 May 2001 01:32:26 +0000
Message-ID: <slrn9h3ala.11j2.Gareth.McCaughan@g.local>

Kent Pitman wrote:

> Yes, exactly. I probably should use the term "inline" because that's
> the term CL uses.  closed-coded means compiled as a separate function
> call (whose nature is closed to view, I suppose--not sure the
> etymology, but must be something like that).

I think "open-coded" came first, the imagery being that
of opening up the function definition and putting the
contents in its place. "closed-coded" comes later, and
gets its meaning as the opposite of "open-coded". If
this is right, then the point isn't visibility so much
as, er, expandedness.

-- 
Gareth McCaughan  ················@pobox.com
.sig under construc

From: Marc Battyani
Subject: Re: sum over list?
Date: Sat, 26 May 2001 16:14:05 +0000
Message-ID: <9eom3b$qaq$1@reader1.fr.uu.net>

"Jordan Katz" <····@underlevel.net> wrote
> Drew Krause <········@mindspring.com> writes:
>
> > Boy is this a dumb question, but ....
> >
> > I need to create a function which sums all members of a variable-length
> > list, e.g.
> >
> > (sum '(2 3 4)) => 9
> > (sum '(5 6)) => 11
> >
> > I'm having trouble doing this for variable lengths, as using '+'
> > directly on the list of course won't work.
> > Any help appreciated!
>
> Let me warm you that I am a Lisp newbie as well, but this definition
> works for me (using CLISP):
>
>   (defun sum (args)
>     (apply #'+ args))

REDUCE is the prefered way to do this as APPLY could fail if it call the
function with more than the CALL-ARGUMENTS-LIMIT arguments.
CALL-ARGUMENTS-LIMIT  is a constant > 50

(defun sum (args)
    (reduce #'+ args))

or you can also use (loop for x in args sum x).

FRACTAL 274 > CALL-ARGUMENTS-LIMIT
255

FRACTAL 275 > (apply #'+ (loop for i below 1000 collect i))

Error: Argument list too long in APPLY: #<function + 200FF95A> to nil.
  1 (abort) Return to level 0.
  2 Return to top loop level 0.

Type :b for backtrace, :c <option number> to proceed,  or :? for other
options

FRACTAL 276 > (reduce #'+ (loop for i below 1000 collect i))
499500

FRACTAL 278 > (loop for i below 1000 sum i)
499500

Marc

From: Drew Krause
Subject: Re: sum over list?
Date: Sat, 26 May 2001 18:45:06 +0000
Message-ID: <3B0FF9B2.29E39ACE@mindspring.com>

Muchas gracias. Everybody ...

From: Francois Aguet
Subject: Re: sum over list?
Date: Sat, 26 May 2001 13:45:25 +0000
Message-ID: <3b0fb39d$1_5@news.bluewin.ch>

Hello,

You can't use '+' directly on a list, but the fucntion 'apply' has been
designed for this purpose:

: (apply #'+ '(2 3 4))
9
: (apply #'+ '(5 6))
11

Hope this helps...

Francois

From: Frode Vatvedt Fjeld
Subject: Re: sum over list?
Date: Sat, 26 May 2001 14:03:00 +0000
Message-ID: <2hhey8fbsb.fsf@dslab7.cs.uit.no>

Drew Krause <········@mindspring.com> writes:

> (sum '(2 3 4)) => 9
> (sum '(5 6)) => 11

The usual way of doing this would be

  (reduce #'+ '(2 3 4))

which is considered to be preferable to using APPLY. LOOP can do the
job almost as well:

  (loop for x in '(2 3 4) sum x)

The thing about LOOP is that it's much more flexible, in case you need
some slight variation of the theme, like for example:

  (loop for x in '(2 3 a 4) when (numberp x) sum x)

-- 
Frode Vatvedt Fjeld

From: Erik Naggum
Subject: Re: sum over list?
Date: Mon, 28 May 2001 14:19:38 +0000
Message-ID: <3200048369259034@naggum.net>

* Drew Krause <········@mindspring.com>
> Boy is this a dumb question, but ....
> 
> I need to create a function which sums all members of a variable-length
> list, e.g.
> 
> (sum '(2 3 4)) => 9
> (sum '(5 6)) => 11
> 
> I'm having trouble doing this for variable lengths, as using '+'
> directly on the list of course won't work.

  There are a number of reasonably obvious answers to your question, but
  the reasons for choosing one among them might be worth investigating.

(apply <function> [<arg>...] <arglist>)

  calls <function> with the arguments that first include all the
  <args>... and then those that results from "spreading" <arglist> into
  separate arguments.  This function is most applicable in combination with
  the &rest lambda list keyword, which does the reverse operation: it makes
  a bunch of arguments into a list.  apply is a good function use when the
  arguments originate under complete program control.  Otherwise, you may
  run into the system limit on the number of arguments in argument lists,
  you may get run-time errors reflecting unexpected types of arguments, and
  you may in pathological cases come across a circular or dotted list,
  which may have rather severe consequences on your Common Lisp system.
  (Surprisingly, Allegro CL just drops dead if apply gets a circular list.)

(reduce <function> <sequence>)

  calls <function> with the first two elements from the sequence, then uses
  that value as the first argument in the next call, which takes the next
  element from the sequence, and thus it walks through the entire sequence,
  yielding the value of the final call to <function>.  (You can cause it to
  start with an initial value or to walk from the end with keyword args.)
  This is good when the sequence is not a list, and when it is really long,
  as it does not use more than two arguments to <function>.  Mind you, +
  would have to be called (1- n) times for a sequence of length n, anyway,
  so if you can avoid the overhead of calling + with all the arguments at
  once, that is a winner.  However, it will fail with unknown argument
  types, circular lists and dotted lists.

(let ((sum 0)) (dolist (elt <list> sum) (when (numberp elt) (incf sum elt))))

  is an inlined, iterative version that checks for numberness of each
  element before summing them.  This version also dies with circular lists
  and dotted lists.

(loop for elt in <list> when (numberp element) sum elt)

  is a specialized version for summation, which in this case may be the
  most appropriate.  It ensure that non-number elements are excluded from
  the process.  This avoid argument type errors.  The loop will still fail
  with an error if the list is dotted, and will never terminate if the list
  is circular.  One problem here is that it might be an error if there is a
  non-number in the list, and you would never detect that.  This could
  happen if you read numbers from a file that had gotten corrupted or were
  written by another program that, say, had a different floating-point
  format that turned into symbols for some of the not-quite-numbers.

(and (list-length <list>)
     ...)

  using any of the above forms in place of the ellipsis, will ensure that
  circular lists are detected and dotted lists cause an error before
  anything else happens.

(cond ((not (ignore-errors (list-length <list>)))
       ...)
      ((not (every #'numberp <list>))
       ...)
      (t (loop ...)))

  will test the primary conditions for success before trying to do the real
  work.  However, this is a rather expensive approach, so what if we do it
  all in one fell swoop?

(defun list-sum (list)
  (unless (and (listp list)
	       (listp (cdr list)))
    (error "~S is not a proper list." list))
  (do ((accum 0)
       (tortoise list (cdr tortoise))
       (hare (cdr list) (cddr hare)))
      ((null tortoise)
       accum)
    (when (eq tortoise hare)
      (error "~S is a circular list." list))
    (unless (and (consp tortoise)
		 (listp hare)
		 (listp (cdr hare)))
      (error "~S is a dotted list." list))
    (unless (numberp (first tortoise))
      (error "~S is not a number." (first tortoise)))
    (incf accum (first tortoise))))

  Of course, this is suitable for a program that can bitch at the user, but
  what if you want to bitch in a way tha that the _program_ can handle, and
  want to make this idiomatic expression into a standard tool?

(defmacro do-proper-list ((var listform &optional resultform &key if-circular if-dotted)
			  &body body)
  "Traverse the value of listform, binding var to each element over body.
Returns the value of resultform (default nil), upon completion.  If the list is
a circular list and if-circular is not a form that causes a non-local exit,
return nil.  Similarly with dotted lists and if-dotted.  The form establishes a
block named nil, like dolist."
  (let ((list (make-symbol "list"))
	(tortoise (make-symbol "tortoise"))
	(hare (make-symbol "hare")))
    `(do* ((,var nil)
	   (,list ,listform)
	   (,tortoise ,list (cdr ,tortoise))
	   (,hare (cdr ,list) (cddr ,hare)))
	 ((null ,tortoise)
	  ,resultform)
       ;; circular list?
       (when (eq ,tortoise ,hare)
	 ,if-circular
	 (return nil))
       ;; dotted list?
       (unless (and (listp ,tortoise)
		    (listp ,hare)
		    (listp (cdr ,hare)))
	 ,if-dotted
	 (return nil))
       ;; proper list, use element
       (setq ,var (car ,tortoise))
       ,@body)))

(defun proper-list-p (list)
  "If <list> is a proper list, return the number of elements, otherwise nil."
  (and (listp list)
       (listp (cdr list))
       (do-proper-list (ignore list t))))

(deftype proper-list ()
  `(and list (satisfies proper-list-p)))

(defun list-sum (list &key junk-allowed)
  "Return the sum of the elements of list, which must be numbers unless
junk-allowed is true, in which case non-numbers are ignored."
  (let ((sum 0))
    (if (do-proper-list (elt list t)
	  (if (numberp elt)
	      (incf sum elt)
	    (unless junk-allowed
	      (error 'type-error :datum elt :expected-type 'number))))
	sum
      (error 'type-error :datum list :expected-type 'proper-list))))

  Unfortunately, Allegro CL discourages the use of typed conditions by
  lacking any useful default :report method for most of them, but I believe
  the above is in the spirit of the language as specified, so those who
  want this to be more user-friendly may decide to use the internal
  function excl::.type-error with the datum and the exected-type as
  ordinary (not keyword) arguments, or the _non-standard_ arguments
  :format-control and :format-arguments that should only have been in
  simple-type-error.  If you dare to mess with the system internals, this
  also works well:

#+allegro
(defmethod print-object ((condition type-error) stream)
  (if (or *print-readably* *print-escape*)
      (call-next-method)
      (format stream ···@<·@<~S is not of the expected type ····@>··@>"
	      (type-error-datum condition)
	      (type-error-expected-type condition))))

  This is one of those small things that make using the excellent condition
  system in Common Lisp properly less than enticing.

#:Erik
-- 
  Travel is a meat thing.

From: Kent M Pitman
Subject: Re: sum over list?
Date: Mon, 28 May 2001 17:12:16 +0000
Message-ID: <sfwy9rhiej3.fsf@world.std.com>

Erik Naggum <····@naggum.net> writes:

> [much good analysis deleted]
>
> #+allegro
> (defmethod print-object ((condition type-error) stream)
>   (if (or *print-readably* *print-escape*)
>       (call-next-method)
>       (format stream ···@<·@<~S is not of the expected type ····@>··@>"
> 	      (type-error-datum condition)
> 	      (type-error-expected-type condition))))
> 
>   This is one of those small things that make using the excellent condition
>   system in Common Lisp properly less than enticing.

What Erik means, but is too polite to say, is that we seriously goofed here.

In fairness, I've come to believe it's a design error in the language
that almost every print-object method starts with 
 (if *print-escape* ...)
There are two problems with this.  One is that I frequently screw up and
forget to put *print-readably* there--so the design is error-prone.  But
also, easy customization is inhibited by having to populate the definition
with boilerplate.

For some reason, btw, I always use print-unreadable-object instead of
call-next-method, although I suppose the result is generally the same,
and in some implementations maybe I'm heading off a superior method
that has #S(...) and would be readable.  Sigh. I suppose I should
train myself to do better on that, too.

But we DEFINITELY should have made at least the default method for CONDITION
introduce a REPORT method that you could customize by itself without 
involving yourself in readable printings of the condition.

These days I often do:

==============================================================================

(defclass print-object-mixin () ())

(defmethod print-object ((object print-object-mixin) stream)
  (if (or *print-readably* *print-escape*)
      (prin1-object object stream)
      (princ-object object stream)))

(defmethod prin1-object ((object print-object-mixin) stream)
  (print-unreadable-object (object stream :type t :identity t)
    (print-object-summary object stream)))

(defmethod print-object-summary ((object print-object-mixin) stream)
  (princ "-" stream))

(defmethod princ-object ((object print-object-mixin) stream)
  (prin1-object object stream))

==============================================================================

This makes it easier and prettier to do things like:

==============================================================================

(defclass named-object (print-object-mixin)
  ((name :initarg :name :accessor name :initform nil)))

(defmethod print-object-summary ((object named-object) stream)
  (prin1 (name object) stream))

(defmethod princ-object ((object named-object) stream)
  (princ (name object) stream))

==============================================================================

Of course, there's some subtlety lost in here to do with the way 
CALL-NEXT-METHOD was used in your example, Erik, where it was ambiguous
as to whether the parent was selecting #<...> or #S(...).  And adding
a PRINT-OBJECT-SUMMARY method might arguably be useless if you don't have
a way of forcing #S(...) to turn into a #<...> on demand.  I don't have
a good thought about how to arrange that, perhaps partially because I
haven't thought too hard about it and I haven't felt the need.  I mostly
don't mind losing the #S(...) but some people do.

Anyone got any ideas on this?