CL Pitfalls list

From: Jeff Dalton
Subject: CL Pitfalls list
Date: Wed, 11 Oct 1995 00:00:00 +0000
Message-ID: <813378905.21273@iiltd.demon.co.uk>

I posted an earlier version of this list a few years ago, and much
of it was put in the FAQ.  But I still had the list, and someone just
asked me about it, so I've fixed a few things and added some items,
and here it is:

----------------------------------------------------------------------

This is a list of "pitfalls": little traps that can catch even
experienced programmers.  They often involve somewhat counterintuitive
aspects of Common Lisp tend to be revealed only by a careful reading
of CLtL (or of the ANSI standard).

This list is written by Jeff Dalton <········@ed.ac.uk>.  Please
send suggestions and corrections

Another pitfall list, containing parts of this one, can be found in
the "Frequently Asked Questions" list of Comp.lang.lisp.

Contents:
 * Assorted pitfalls
 * Sort pitfalls
 * Iteration vs closures
 * Limits
 * Definitions and declarations

All references to CLtL are to the 2nd edition.

Assorted pitfalls.

  * The result of many non-destructive functions such as REMOVE
    and UNION can share structure with an argument; so you can't
    rely on them to make a completely new list.

  * SORT is (usually) destructive.

    So, for instance, (SORT (REMOVE ...) ...) may not be safe.

  * Destructive functions that you think would modify CDRs might
    modify CARs instead.  (Eg, NREVERSE.)

  * Array elements might not be initialized to NIL.  Eg,

       (make-array 10) => #(0 0 0 0 0 0 0 0 0 0)

    Use (make-array 10 :initial-element nil).

  * READ-FROM-STRING has some optional arguments before the
    keyword parameters.  If you want to supply some keyword
    arguments, you have to give all of the optional ones too.

    Other functions with this property: WRITE-STRING, WRITE-LINE.

  * (flet ((f ...)) (eq #'f #'f)) can return false.  The same is true
    with LABELS instead of FLET or EQL instead of EQ.

  * EQUAL compares both vectors and structs with EQ.  EQUALP descends
    vectors and structs but has other properties (such as ignoring
    character case) that may make it inappropriate.  EQUALP does not
    descend instances of STANDARD-CLASSes

  * The function LIST-LENGTH is not a faster, list-specific version
    of the sequence function LENGTH.  It is list-specific, but it's
    slower than LENGTH because it can handle circular lists.

  * (EXPORT NIL) does not export NIL.  You have to use (EXPORT '(NIL)).

  * (FORMAT T ...) interprets T as *STANDARD-OUTPUT*
    All other I/O functions interpret T as *TERMINAL-IO*.

  * Shared (i.e., class-allocated) slots of standard classes are not
    per-(sub)class.  That is, if a class C has a slot S with :allocation
    :class, that one slot is used by instances of C and of C's subclasses.
    [Note, however, that shared slots can be shadowed.  See CLtL p 779.]

  * If you have a package named P that exports a symbol named P, you
    won't (in another package) be able to say (USE-PACKAGE 'P).  Instead
    of 'P you'd have to write "P" or maybe #:P.  ["Why?" is left as
    an exercise for the reader.]

SORT pitfalls.

  It may seem odd to have a whole section devoted to SORT, but I've
  seen so many erroneous SORT calls that I've decided it's necessary.

  * As mentioned above, SORT is (usually) destructive and so such
    combinations as (SORT (REMOVE ...) ...) may not do as you expect
    (if what you expect is that REMOVE will produce a new list).

  * SORT may not return equal elements in the same order in different
    CL implementations, because different sorting algorithms are used.
    To ensure that you get the same result in all implementations, use
    STABLE-SORT (or write your own).

  * The comparison predicate given to SORT must be a strict and
    _consistent_ "less than".  It's easy to go wrong when writing
    "lexicographic" multi-step comparisons and end up with a predicate
    that says both A < B and B < A for some A and B.  For example:

      (defun fg-lessp (a b)
        (or (< (f a) (f b))    ;first compare f values
            (< (g a) (g b))))  ;then compare g values

    Fg-lessp does the right thing when (f a) is less than (f b):
    it returns true -- and when (f a) is equal to (f b): it goes
    on to compare g values.  But it also compares g values when
    (f a) is _greater than_ (f b), when what it should do is return
    false.

    Also make sure the predicate is _transitive_.  For instance,
    if you're comparing objects of different types and you decide
    (for example) that numbers are < strings and strings are < symbols,
    make sure that numbers are < symbols too.

  * CLtL suggests that using the :KEY argument to SORT may be more
    efficient than calling the key function inside the comparison
    predicate (p 408).  However, it may well be less efficient.
    Implementations may not take advantage of the separate :KEY
    to extract the keys only once; and the key function might be
    compiled in-line when it appears inside the predicate.

Iteration vs closures.

  * DO and DO* update the iteration variables by assignment; DOLIST and
    DOTIMES are allowed to use assignment (rather than a new binding).
    (All CLtL says of DOLIST and DOTIMES is that the variable "is
    bound" which has been taken as _not_ implying that there will be
    separate bindings for each iteration.)

    Consequently, if you make closures over an iteration variable
    in separate iterations they may nonetheless be closures over
    the same variable and hence will all refer to the same value
    -- whatever value the variable was given last.  Eg,

     (let ((fns '()))
       (do ((x '(1 2) (cdr x)))
           ((null x))
         (push #'(lambda () x)
               fns))
       (mapcar #'funcall (reverse fns)))

    returns (nil nil), not (1 2), not even (2 2).

Limits.

  * The array-total-size-limit may be as small as 1024.

  * Such things as (apply #'append list-of-lists) to flatten a list
    of lists may run afoul of call-arguments-limit, which can be as
    small as 50.

    Alternatives include:

      (reduce #'append list-of-lists :from-end t)

      (mapcan #'copy-list list-of-lists)

Definitions and declarations.

  * (DEFVAR var init) assigns to the variable only if it does not
    already have a value.  So if you edit a DEFVAR in a file and
    reload the file only to find that the value has not changed,
    this is the reason.  (Cf DEFPARAMETER.)

  * DEFCONSTANT has several potentially unexpected properties:

     - Once a name has been declared constant, it cannot be used a
       the name of a local variable (lexical or special) or function
       parameter.  Really.  See page 87 of CLtL II.

     - A DEFCONSTANT cannot be re-evaluated (eg, by reloading the
       file in which it appears) unless the new value is EQL to the
       old one.  Strictly speaking, even that may not be allowed.
       (DEFCONSTANT is "like DEFPARAMETER" and hence does an
       assignment, which is not allowed if the name has already
       been declared constant by DEFCONSTANT.)

       Note that the EQL rule makes it difficult to use anything
       other than numbers, symbols, and characters as constants.

     - When compiling (DEFCONSTANT name form) in a file, the form
       may be evaluated at compile-time, load-time, or both.  

       (You might think it would be evaluated at compile-time and
       the _value_ used to obtain the object at load-time, but it
       doesn't have to work that way.)

  * CLtL says that type declarations apply to names rather than to
    particular variables (bindings).  Consequently, they can cross
    lexical scopes.  Eg:

       (let ((x 1))
	 (declare (fixnum x))
	 (let ((x ...))
	   ;; This x must be a fixnum too!
	   ...))

    There is some doubt that this is what Common Lisp was meant to
    do, and the rule may have been changed in the ANSI standard, but
    it's what CLtL II says on page 219.

  * You often have to declare the result type to get the most
    efficient arithmetic.  Eg, 

       (the fixnum (+ (the fixnum e1) (the fixnum e2)))

     rather than

       (+ (the fixnum e1) (the fixnum e2))

  * Declaring the iteration variable of a DOTIMES to have type FIXNUM
    does not guarantee that fixnum arithmetic will be used.  That is,
    implementations that use fixnum-specific arithmetic in the presence
    of appropriate declarations may not think _this_ declaration is
    sufficient.  It may help to declare that the limit is also a
    fixnum, or you may have to write out the loop as a DO and add
    appropriate declarations for each operation involved.

  * INLINE and NOTINLINE (see CLtL II, pages 229-230)

     - You should place an INLINE proclamation _before_ the definition
       of a function you want to have compiled inline.

     - You need a NOTINLINE declaration to make sure SETF of SYMBOL-
       FUNCTION will affect calls between functions defined in the
       same file.

     - Implementations aren't required to compile anything inline
       no matter what declarations you use.

----------------------------------------------------------------------

-- jeff

Re: CL Pitfalls list Marty Hall
Re: CL Pitfalls list Jeff Dalton

From: Marty Hall
Subject: Re: CL Pitfalls list
Date: Wed, 11 Oct 1995 00:00:00 +0000
Message-ID: <DGADJM.1s7@aplcenmp.apl.jhu.edu>

In article <···············@iiltd.demon.co.uk> ····@interactive.co.uk 
(Jeff Dalton) writes:

>This is a list of "pitfalls": little traps that can catch even
>experienced programmers.

Great. If Jeff doesn't object, I'll add this to my Lisp-hints page
(http://www.apl.jhu.edu/~hall/lisp.html).

						- Marty
(proclaim '(inline skates))

From: Jeff Dalton
Subject: Re: CL Pitfalls list
Date: Mon, 16 Oct 1995 00:00:00 +0000
Message-ID: <DGJy6p.GBp@cogsci.ed.ac.uk>

········@Rincewindarch.su.edu.au (Thorsten Schnier) writes:

>In article <···············@iiltd.demon.co.uk>, ····@interactive.co.uk (Jeff Dalton) writes:

>|>   * Declaring the iteration variable of a DOTIMES to have type FIXNUM
>|>     does not guarantee that fixnum arithmetic will be used.  [...]

>You might add that the type of the result of (the ...) declarations is not
>checked, even with optimization for safety. 

Thanks.  I should add something about that, and about
declarations-and-checking in general.  I'll have to look
at the standard to see what the current rules are...

>For fixnum dotimes loops, I wrote the following macro (the only way of getting
>the internal variable used in GCL for dotimes declared as well):

>(defmacro dofix ((var count) . body)
>  (let ((max (gensym))
>	(loopstart (gensym)))
>    `(block ()
>	(let* ((,max ,count) (,var 0))
>	  (declare (fixnum ,max ,var))
>	  (tagbody ,loopstart
>	     (if (>= ,var ,max) (return nil))
>	     (tagbody ,@body)
>	     (setq ,var (1+ ,var))
>	     (go ,loopstart))))))

It's good to see it can be done in GCL with only one declaration.

>(Incidentially, 'fix' means quick in german...)

Cool.

-- jeff