GET / GETF warning.

From: Dan Hoey
Subject: GET / GETF warning.
Date: Fri, 26 Apr 1991 15:35:50 +0000
Message-ID: <641@zogwarg.etl.army.mil>

The fact that GET/GETF use EQ instead of EQL needs to be emphasized
more.  Such as a remark that it is an error to use a number as an
indicator in the property list.

It's all the worse that for many implementations EQL fixnums are EQ,
so you don't see the bug until your problem gets into bignums.

This might be a useful thing to put in the FAQ list, too.

Dan
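
A minimal sketch of the failure mode described above. GETF-EQL is a hypothetical helper, not part of Common Lisp, and *PLIST* is an illustrative name. Whether the plain GETF call finds the bignum indicator depends on the implementation, so no single result is shown for it.

```lisp
;; Hypothetical helper (not a standard function): a GETF look-alike
;; that compares indicators with EQL instead of EQ.
(defun getf-eql (plist indicator &optional default)
  (loop for (key value) on plist by #'cddr
        when (eql key indicator) return value
        finally (return default)))

;; Illustrative property list with a symbol and a bignum as indicators.
(defvar *plist* (list :color 'red (expt 10 30) 'big))

(getf *plist* :color)                     ; => RED
(getf *plist* (expt 10 30) :missing)      ; BIG or :MISSING, implementation-dependent
(getf-eql *plist* (expt 10 30) :missing)  ; => BIG
```

The symbol lookup is safe because a symbol is a unique object. The second call recomputes the bignum, and nothing requires the new object to be EQ to the one stored in the list.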

From: Barry Margolin
Subject: Re: GET / GETF warning.
Date: Fri, 26 Apr 1991 21:24:52 +0000
Message-ID: <1991Apr26.212452.26837@Think.COM>

In article <···@zogwarg.etl.army.mil> ····@zogwarg.etl.army.mil (Dan Hoey) writes:
>The fact that GET/GETF use EQ instead of EQL needs to be emphasized
>more.  Such as a remark that it is an error to use a number as an
>indicator in the property list.

Just this morning I happened to be reviewing the section of the draft ANSI
spec that includes GET, and I made precisely the same comment, so it should
make it into the standard.

Characters also can be EQL but not EQ, so they shouldn't be used as
indicators, either.

[Before anyone asks, the draft spec is not yet in any shape for
distribution outside X3J13.]

>It's all the worse that for many implementations EQL fixnums are EQ,
>so you don't see the bug until your problem gets into bignums.
>
>This might be a useful thing to put in the FAQ list, too.

I doubt that people frequently use non-symbols as indicators; people almost
always use symbols.  In general, it's bad programming practice to use any
global object as an indicator, because you might clash with another program
that makes the same mistake.  If you use symbols, they should be in your
own package (and watch out for inherited symbols -- don't do (setf (get
person 'car) 'datsun)); if you use lists, they should be constructed at
runtime rather than constants (because the compiler may coalesce similar
constant lists).
--
Barry Margolin, Thinking Machines Corp.

······@think.com
{uunet,harvard}!think!barmar
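
A rough illustration of the advice about inherited symbols and runtime-constructed lists, assuming the code is read in a package that uses COMMON-LISP (such as CL-USER). STANDARD-SYMBOL-P, PERSON, VEHICLE and *VEHICLE-TAG* are hypothetical names introduced here.

```lisp
;; Hypothetical helper: true when SYMBOL is one of the external symbols
;; of the COMMON-LISP package, and therefore shared by every program.
(defun standard-symbol-p (symbol)
  (multiple-value-bind (found status)
      (find-symbol (symbol-name symbol) :common-lisp)
    (and (eq found symbol) (eq status :external))))

(standard-symbol-p 'car)      ; => T, so (get person 'car) invites clashes
(standard-symbol-p 'vehicle)  ; => NIL, VEHICLE is interned in the current package

;; A list indicator built at run time is a fresh object, so the compiler
;; cannot coalesce it with a similar constant elsewhere.
(defvar *vehicle-tag* (list 'vehicle))
(setf (get 'person *vehicle-tag*) 'datsun)
(get 'person *vehicle-tag*)   ; => DATSUN
```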

From: David V. Wallace
Subject: GET / GETF warning.
Date: Fri, 26 Apr 1991 19:53:17 +0000
Message-ID: <GUMBY.91Apr26155317@Cygnus.COM>

   Date: 26 Apr 91 21:24:52 GMT
   From: ······@think.com (Barry Margolin)

			In general, it's bad programming practice to use any
   global object as an indicator, because you might clash with another program
   that makes the same mistake.

Which is why I use arbitrary runtime data (e.g. (CONS NIL NIL) or
(make-symbol "Meaningful name")) as a tag whenever possible...

These days you might as well make a symbol; if you're concerned about
the storage overhead you probably shouldn't use plists anyway.
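
A small sketch of the uninterned-symbol tag, with hypothetical names (*OWNER-TAG*, WIDGET, ALICE) chosen only for illustration.

```lisp
;; MAKE-SYMBOL returns a fresh uninterned symbol, so no other program
;; can name this indicator by accident.
(defvar *owner-tag* (make-symbol "OWNER"))

(setf (get 'widget *owner-tag*) 'alice)

(get 'widget *owner-tag*)            ; => ALICE
(get 'widget (make-symbol "OWNER"))  ; => NIL, a different symbol with the same name
(get 'widget 'owner)                 ; => NIL, the interned OWNER is distinct too
```

The tag still prints with a meaningful name when debugging, which a bare (CONS NIL NIL) does not.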

From: Jeff Dalton
Subject: Eternal fixnum question (was Re: GET / GETF warning.)
Date: Tue, 30 Apr 1991 18:08:26 +0000
Message-ID: <4580@skye.ed.ac.uk>

In article <···@zogwarg.etl.army.mil> ····@zogwarg.etl.army.mil (Dan Hoey) writes:
>The fact that GET/GETF use EQ instead of EQL needs to be emphasized
>more.  Such as a remark that it is an error to use a number as an
>indicator in the property list.
>
>It's all the worse that for many implementations EQL fixnums are EQ,
>so you don't see the bug until your problem gets into bignums.

I suppose this isn't the _best_ place to raise this question,
but when EQ, fixnums, and FAQ appear in the same place it's
almost irresistible.

As you all will recall, Common Lisp allows EQ to return false
for fixnums that are EQL.  Indeed, it can do so even in a case
like this:

[1]   (let ((x z) (y z)) (eq x y))  ; CLtL II p 288

or even:

[2]   (let ((x 5)) (eq x x))        ; CLtL II p 104

Indeed, at least one compiler (AKCL 1-505) has EQ always return
NIL when the arguments are known to be fixnums.  There is no 
significant efficiency reason for this.  EQL for known fixnums
compiles into the same code as EQ for objects: (x == y).  The
only difference is that x and y are C ints in the EQL case
and pointers in the EQ one.

A number of people find cases like [1] and especially [2] difficult to
understand, even if they are experienced users or even implementors of
Lisp.  They could understand if the reason were just to keep the rule
simple: EQ can _never_ be relied on for fixnums.  What's hard to see
is why in any reasonable implementation [1] or [2] would actually turn
out to be false.  And, as I will try to show below, the usual answers
to this are not very convincing.  I'm hoping there's a better answer 
and that someone can say what it is.

It's easy to understand a case like this one:

[3]   (eq 1 (- 3 2)) ==> ?

Why?  Well, in some implementations fixnums might be implemented
as pointers to locations that contain the actual values and new
fixnums might be allocated whenever a fixnum result is returned.
In [3], 1 and (- 3 2) might be pointers to different locations
that both contain 1; but since EQ compares only the pointers it
would return NIL.
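
For comparison, a short sketch that reports both predicates side by side. COMPARE-BOTH is a hypothetical helper; only the EQL results are guaranteed, and the EQ result for equal numbers is left to the implementation.

```lisp
;; Hypothetical helper: report the EQ and EQL results for the same pair.
(defun compare-both (a b)
  (list :eq (eq a b) :eql (eql a b)))

(compare-both 1 (- 3 2))                  ; :EQL is T; :EQ may be T or NIL
(compare-both (expt 2 100) (expt 2 100))  ; :EQL is T; :EQ is often NIL for bignums
(compare-both 1 1.0)                      ; :EQL is NIL, the types differ
```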

So much for [3].  But we need a different argument to show why [1] or
[2] might return NIL, because if all fixnums were pointers-to, both
[1] and [2] would be true (unless we imagine that merely referring to
a fixnum causes a new one to be allocated, an implementation that
would be bizarre, to say the least).  Moreover, they would also be
true if all fixnums were always represented as tagged immediate
values, because they would have the same tag (fixnum) and the same
value.  Indeed, it's hard to imagine any reasonable representation
that, if used for all fixnums, would have [1] or [2] be false.

This suggests that the relevant case is when more than one
representation might be used.  The official reason (CLtL II p 288
again) seems to be that the "breakdown" of EQ for numbers is needed
for a compiler to be able to produce "exceptionally efficient
numerical code".  We can imagine an implementation that normally
allocated fixnums on the heap and represented them as pointers
to heap locations but also put fixnums directly in registers or
stack locations for manipulation by efficient compiled code.

But this isn't very convincing either.  KCL uses two representations
in the way suggested and could make [1] and [2] return true without
any efficiency loss -- unless you count the efficiency of being able
to skip the comparison altogether.  It is, moreover, hard to see why
KCL should be a special case.  If fixnum values are being used
directly, the compiler knows and can emit a direct compare.  If
the values are not direct, the compiler knows _that_ and can emit
a direct compare of the pointers.

This seems to account at least for [2], (let ((x 5)) (eq x x)), where
there is only one variable.  But what about [1], (let ((x z) (y z))
(eq x y))?  Maybe x is represented one way and y another.  But again
the compiler has to know this, because if it didn't it would also
be unable to handle cases such as

[4]  (let ((x z) (y z)) (+ x y))

So we seem to be left with the conclusion that there isn't an
efficiency reason after all.

There is still a conceptual argument, namely that if x and y are
different registers or stack locations they shouldn't count as EQ.
This will not work as-is, because it would imply that x and y should
not count as EQ when they contain other values such as (pointers to)
cons cells.  Instead, we'd have to say something like this: when
fixnum values are used directly, they aren't Lisp objects any more
and so it isn't meaningful to compare them with EQ.

That's better, but it still seems to be wrong.  This treatment of
fixnums is an optimization.  We should let optimizations have an
impact on the semantics only when some important efficiency gain is
obtained.  Otherwise, we leave the semantics alone and allow only
those optimizations that respect the semantics to the extent that any
violations have no impact on the behavior of programs.  That some
objects might be manipulated internally as non-objects is not in
itself a reason for saying they weren't objects in the first place.

It therefore seems that examples such as [1] and [2] are useful
only to make the problems due to examples such as [3] more universal.

-- jd

From: Rob MacLachlan
Subject: Re: Eternal fixnum question (was Re: GET / GETF warning.)
Date: Wed, 01 May 1991 01:55:34 +0000
Message-ID: <1991May1.015534.27054@cs.cmu.edu>

In article <····@skye.ed.ac.uk> ····@aiai.UUCP (Jeff Dalton) writes:
>
>I suppose this isn't the _best_ place to raise this question,
>but when EQ, fixnums, and FAQ appear in the same place it's
>almost irresistible.
>
>As you all will recall, Common Lisp allows EQ to return false
>for fixnums that are EQL.  
>
>[2]   (let ((x 5)) (eq x x))        ; CLtL II p 104
>
>What's hard to see
>is why in any reasonable implementation [1] or [2] would actually turn
>out to be false.  

This is more convincing with floats, since fixnums normally have immediate
representations in modern lisps, but here is the scenario:
 -- X is known to be a fixnum, so the compiler allocates in a register
    without any tag bits.  This is an unboxed (or non-descriptor)
    representation. 
 -- The compiler notices some references to X in contexts that require a
    tagged Lisp object.  For each such reference, it heap allocates a copy
    of X.
 -- This results in EQ being passed two different copies of X.
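
The scenario can be imitated in portable code. This is a toy model only, not what any particular compiler emits: BOXED is a hypothetical structure standing in for a heap-allocated tagged number, and TAGGED-REFERENCE stands in for the compiler consing a copy at each tagged use.

```lisp
;; Toy model: BOXED plays the role of a heap-allocated, tagged number.
(defstruct boxed value)

(defun tagged-reference (raw)
  "Simulate a compiler consing a fresh tagged copy of an unboxed value."
  (make-boxed :value raw))

(let* ((x 5)                          ; the untagged value in a register
       (ref1 (tagged-reference x))    ; first tagged use of X
       (ref2 (tagged-reference x)))   ; second tagged use of X
  (list (eq ref1 ref2)                                ; NIL: two heap objects
        (eql (boxed-value ref1) (boxed-value ref2)))) ; T: same number
;; => (NIL T)
```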

Now, you might argue that this is a rather silly thing for the compiler to
do, especially when both arguments are known to be fixnums.  But consider
something like:
    (let ((x 1)) (eq x (if yow x (gloob))))

Basically, there was a language design tradeoff here.  Is it more important
to have a well-defined pointer identity for numbers, or is it more important
to be able to generate tense numeric code?

The only way to ensure that pointer identity is preserved for numbers would
be to always retain the original object pointer, as well as any untagged
value, and then to use the original object pointer in any tagged context.
But this causes nasty problems with set variables:
    (let ((x z)
          (y z))
      (declare (fixnum x y))
      (when (gloob) (setq x (+ x (the fixnum grue))))
      (eq x y))

Now what do you do?  Keep a shadow tagged X, but invalidate it whenever it
is set, and then cons X at the EQ when the shadow X is invalidated?  Cons
the result of + just so that you can store it into the shadow X?

If this were an important problem to solve, I'm sure a solution could be
worked out that wouldn't be too terribly inefficient most of the time, but
why bother?  As near as I can tell, object identity (EQ) is only an
important operation on objects which can be side-effected, and numbers
can't.  

You can also use EQ as a probabilistic optimization of general equality: if
two objects are EQ, they are definitely the same.  If not, they might be the
same.  But number copying only makes this hint slightly less successful, it
doesn't break the semantics.

Robert MacLachlan (···@cs.cmu.edu)
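
The use of EQ as a fast path for general equality might look like the sketch below. QUICK-EQUAL is a hypothetical name; a failed EQ test costs one comparison and falls through to EQUAL, so copied numbers only lose the shortcut and never change the answer.

```lisp
;; Hypothetical helper: try pointer identity first, then structural equality.
(defun quick-equal (a b)
  (or (eq a b)        ; identical objects are certainly equal
      (equal a b)))   ; otherwise do the full comparison

(let ((big (make-list 1000 :initial-element 'x)))
  (quick-equal big big))              ; => T, without walking the list
(quick-equal (list 1 2) (list 1 2))   ; => T, via EQUAL
(quick-equal 1 1.0)                   ; => NIL
```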