Properties of CL symbols (probably an RTFM question)

From: Tim Bradshaw
Subject: Properties of CL symbols (probably an RTFM question)
Date: Mon, 25 Jun 2001 12:31:08 +0000
Message-ID: <nkjk820hhbn.fsf@tfeb.org>

I was asked by someone if it's OK to put properties on symbols in CL.
Both the asker and I thought that it wouldn't be.  But I can't find
such a restriction.  If there was one I'd have expected to find it in
11.1.2.1.2, but it's not there.  So maybe it's OK to do this, but I
can't find a statement that it is, either.  I'm not sure if is an
omission, if it is OK or if I need to RTFM (RTFHS?) better.  Probably
the latter.

Intuitively, it seems to me that it should be OK to add properties
that the user controls to CL symbols - (setf (get 'cons 'mypack:foo)
1) for instance.  But (setf (get 'cons 'type) 1) might be a bit
dubious, and (setf (symbol-plist 'cons) '()) strikes me as positively
foolhardy.  So I'm, slightly nervously, thinking about going with a
restriction to non-CL property names.

Can anyone clarify this, or point out what I should have read?

Thanks

--tim

Re: Properties of CL symbols (probably an RTFM question) Barry Margolin
- Re: Properties of CL symbols (probably an RTFM question) Kent M Pitman
  - Re: Properties of CL symbols (probably an RTFM question) Tim Bradshaw
    - Re: Properties of CL symbols (probably an RTFM question) Barry Margolin
      - Re: Properties of CL symbols (probably an RTFM question) Steven M. Haflich
  - hashtables (was Re: Properties of CL symbols (probably an RTFM question)) Christian Lynbech
    - Re: hashtables (was Re: Properties of CL symbols (probably an RTFM question)) Kent M Pitman
      - Re: hashtables (was Re: Properties of CL symbols (probably an RTFM question)) Kent M Pitman
      - Re: hashtables (was Re: Properties of CL symbols (probably an RTFM question)) Steven M. Haflich
        Re: hashtables (was Re: Properties of CL symbols (probably an RTFM question)) Kent M Pitman

From: Barry Margolin
Subject: Re: Properties of CL symbols (probably an RTFM question)
Date: Mon, 25 Jun 2001 14:55:54 +0000
Message-ID: <_hIZ6.4$dO2.620@burlma1-snr2>

In article <···············@tfeb.org>, Tim Bradshaw  <···@tfeb.org> wrote:
>Intuitively, it seems to me that it should be OK to add properties
>that the user controls to CL symbols - (setf (get 'cons 'mypack:foo)
>1) for instance.  But (setf (get 'cons 'type) 1) might be a bit
>dubious, and (setf (symbol-plist 'cons) '()) strikes me as positively
>foolhardy.  So I'm, slightly nervously, thinking about going with a
>restriction to non-CL property names.
>
>Can anyone clarify this, or point out what I should have read?

As long as either the symbol you're adding the property to or the symbol
being used as the property name are in your own package, you should be
fine.  The point of most of the restrictions on what you can do with CL
symbols is to prevent conflicts between unrelated packages, and using a
property name in your own package should achieve that.

So even though the spec doesn't specifically prohibit (setf (get 'cons
'type) 1), don't do it.

-- 
Barry Margolin, ······@genuity.net
Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

From: Kent M Pitman
Subject: Re: Properties of CL symbols (probably an RTFM question)
Date: Mon, 25 Jun 2001 16:49:26 +0000
Message-ID: <sfwae2wa4ix.fsf@world.std.com>

Barry Margolin <······@genuity.net> writes:

> In article <···············@tfeb.org>, Tim Bradshaw  <···@tfeb.org> wrote:

Not at all an RTFM question.  Quite the contrary.  I think it was McCarthy
at one of the very early X3J13 meetings who asked a very intresting
question about our charter 
 http://world.std.com/~pitman/CL/x3j13-86-020.html
What is it to "establish normative Common Lisp programming practice"?
It sounded good when we wrote it in, but how were we going to accomplish
it.  And it's still the part that I think we occasionally fell short on.

As I recall, one example that was given at the time was that it might
mean "don't add operators to the language that appear to be the thing
to use, but that are really traps".  APPEND was discussed, and the fact
that almost all uses of it were an efficiency trap, for example.  I think
there were not good answers; no one wanted to leave out APPEND at that
time, nor did they want to document it as inefficent or unusable per se.
It was left for the teaching texts how to resolve this.  But it isn't the
only operator that suffers from the need for careful explanation about
how to use it.  Maybe the answer is that this didn't belong in our charter.

> >Intuitively, it seems to me that it should be OK to add properties
> >that the user controls to CL symbols - (setf (get 'cons 'mypack:foo)
> >1) for instance.  But (setf (get 'cons 'type) 1) might be a bit
> >dubious, and (setf (symbol-plist 'cons) '()) strikes me as positively
> >foolhardy.  So I'm, slightly nervously, thinking about going with a
> >restriction to non-CL property names.
> >
> >Can anyone clarify this, or point out what I should have read?
> 
> As long as either the symbol you're adding the property to or the symbol
> being used as the property name are in your own package, you should be
> fine.  The point of most of the restrictions on what you can do with CL
> symbols is to prevent conflicts between unrelated packages, and using a
> property name in your own package should achieve that.
> 
> So even though the spec doesn't specifically prohibit (setf (get 'cons
> 'type) 1), don't do it.

Nothing Barry says is linguistically false, and I've dispensed this same
advice over time, but I'd go a different path here...

The following two are about computationally equivalent:

 (defun foo (x) (get x 'foo))

or

 (defvar *foo-table* (make-hash-table))
 (defun foo (x) (gethash x *foo-table*))

The latter, modulo the use of specials (where they'd use a closed lexical)
is more like what you'd do in Scheme, and I tend to use the Schemish 
solution these days so as not to clutter the property list AND to add a
slight degree of extra data privacy (not the least of which is keeping 
some idiot from using SETF of SYMBOL-PLIST to screw things.  You should 
definitely NEVER EVER using SETF of SYMBOL-PLIST  to do anything other
than *augment* a symbol.  

[Well, I used to once-in-a-while, in single-processing lisps, use 
SETF of SYMBOL-PLIST to implement things like fast SUBLIS by pushing and
popping the plist with gensymed properties to make sure the property was
at the head or not there at all.  The problem is that you have to lock out
interrupts in a multiprocessing Lisp, or at least synchronize the use of
SETF of GET/SYMBOL-PLIST/etc, and you can't.  So that makes applications 
which try to play with the plist order less useful in modern lisps.  It'
obscure trivia now, so I'll leave it at that, but Ask if I haven't made 
that clear and you actually care.]

The reason I tend to go for hash tables, though, is (ironically) that they
are a little slower than the plist.  The plist is directly attached to a
symbol and provides really quite a fast way to get to certain data.

Note, though, that the plist is not threadsafe.  Well, neither is a hash
table for that matter, but it's much more common that a plist would be used
cross-process than a hash table would, since the plist serves multiple masters.
Usually you know when your hash table will need to be threadsafe, but knowing
whether the plist is threadsafe is trickier.  So you're always at some
risk of lossage.  Some people say don't do updates  of plists EXCEPT at
load time and don't do LOAD except when a multiprocessing  system is 
quiescent.  I heard such lore for the Lisp Machine, and it probably related
to this.

Anyway, to go on, I tend to try to save properties in reserve for
things I need to super-optimize, because the number of instructions to
execute (get 'foo 'bar) can be really tiny compared to executing a hash 
key and searching a hash table, IF the plist is short.  If you've used lots
of properties, you may have screwed your ability to use this as a last
ditch speed bum against a hash table.  (Also, of course, GET will not take
the occasional irksome speed hit that GETHASH will due to GC cyclage, and
sometimes one needs to work past that.)

[Ask if you don't know about the "ironic" speed hit GETHASH can take. 
 Everyone should know about that, but I'm too lazy to mention it now.
 It's due to a mis-design in MAKE-HASH-TABLE, which should take a 
 :OPTIMIZE-PERFORMANCE :REALTIME or :OPTIMIZE-PERFORMANCE :TOTAL-TIME,
 but since it doesn't, it tries to guess [usually total-time] and screws
 real-time applications.]

Bottom line: Use GETHASH unless your application is real-time or you need
 a last-minute speed bum.  Do not use GET on symbols otherwise.

So goes my personal style rule.

Your mileage may vary, as with all style rules.

From: Tim Bradshaw
Subject: Re: Properties of CL symbols (probably an RTFM question)
Date: Tue, 26 Jun 2001 10:48:33 +0000
Message-ID: <nkjd77r4iv2.fsf@tfeb.org>

Kent M Pitman <······@world.std.com> writes:

> [Interesting things]

I think I agree with everything you say, although my personal
tradeoffs might be slightly different I think (maybe not).

At the level of `must' rather than `ought' though, I was slightly
surprised that there isn't a requirement in the standard not to, say,
do (setf (symbol-plist 'car) nil).  In at least one implementation
something like this is catastrophic.  I guess the problem is
describing what you are and aren't allowed to do in such a way that
it's not restrictive but still useful - for instance you don't want to
restrict changing properties that are in CL, you also want to prevent
ones that are in implementation-internal packages. Or maybe it's just
because that's such an obviously stupid thing to do...


The issue with hashtables that you mentioned: is this that there's no
way of asking the system `I want operations on hashtables to take
bounded time'? So, for instance, it might elect not to do expensive
rehashes or representation changes which mean that, say, a particular
GETHASH will take a long time, even though it thinks that will make
the amortised cost lower.

--tim

From: Barry Margolin
Subject: Re: Properties of CL symbols (probably an RTFM question)
Date: Tue, 26 Jun 2001 15:41:54 +0000
Message-ID: <532_6.9$0v4.225@burlma1-snr2>

In article <···············@tfeb.org>, Tim Bradshaw  <···@tfeb.org> wrote:
>At the level of `must' rather than `ought' though, I was slightly
>surprised that there isn't a requirement in the standard not to, say,
>do (setf (symbol-plist 'car) nil).  In at least one implementation
>something like this is catastrophic.

The cleanup issue LISP-SYMBOL-REDEFINITION says: At the March 89 X3j13
meeting, a proposed additional constraint ("Altering its property list")
was removed. Presumably this means that conformal programs are allowed to
alter the property list of symbols in the COMMON-LISP package.

This means that implementations are not supposed to put anything they
depend on in the property list.

-- 
Barry Margolin, ······@genuity.net
Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

From: Steven M. Haflich
Subject: Re: Properties of CL symbols (probably an RTFM question)
Date: Fri, 29 Jun 2001 03:04:30 +0000
Message-ID: <3B3BF03E.5AC63BE3@pacbell.net>

Barry Margolin wrote:

> The cleanup issue LISP-SYMBOL-REDEFINITION says: At the March 89 X3j13
> meeting, a proposed additional constraint ("Altering its property list")
> was removed. Presumably this means that conformal programs are allowed to
> alter the property list of symbols in the COMMON-LISP package.
> 
> This means that implementations are not supposed to put anything they
> depend on in the property list.

I believe I can remember something about the reasoning at that meeting,
or perhaps some later meeting where the plist issue came up:  There was
sentiment that the standard _should_ codify proper implementation and
application treatment of symbol property lists, but those present at the
time realized that the issues, while not extremely difficult, were
difficult enough that we were unlikely to get it right the first time
around, and that we were supposed to be finishing up the standard at
whichever meeting it was that I'm remembering.  This is fom someone who
has difficulty remembering what he ate for breakfast yesterday.

I think the central point is fairly clear, that well-behaved program
modules/threads shouldn't mess with the symbol-plist of symbols that
other modules might be using, other than by using cl:set and setf
thereof.  All property indicators used should be symbols in some package
under the control of the particular module.

These rules permit even a multiprocessing implementation to provide
applications with safe plist usage.  Probably most people would agree
with the intent of this specification.

But there are some fine points:  What about a program that uses symbol
property lists for some kind of temporary storage?  That program might
be able to guarantee that all the symbols it uses have no possibility of
use by any other program -- gensyms and private packages are your friends
--  and should not such a program be allowed to bash or pop symbol-plist
when it wants to flush accumulated properties?  Although this style of
Lisp programming feels about 35 years old, I would not want to prohibit
it.

So, if anyone thinks he can draft a workable specification for use of
symbol-plists, write them down and we can get them in the next version
of the ANS.

From: Christian Lynbech
Subject: hashtables (was Re: Properties of CL symbols (probably an RTFM question))
Date: Tue, 26 Jun 2001 10:20:41 +0000
Message-ID: <ofbsnb35l2.fsf_-_@chl.ted.dk.eu.ericsson.se>

>>>>> "Kent" == Kent M Pitman <······@world.std.com> writes:

Kent> [Ask if you don't know about the "ironic" speed hit GETHASH can take. 
Kent>  Everyone should know about that, but I'm too lazy to mention it now.
Kent>  It's due to a mis-design in MAKE-HASH-TABLE, which should take a 
Kent>  :OPTIMIZE-PERFORMANCE :REALTIME or :OPTIMIZE-PERFORMANCE :TOTAL-TIME,
Kent>  but since it doesn't, it tries to guess [usually total-time] and screws
Kent>  real-time applications.]

I would like to hear more about this. 

I am working on an application that uses quite a lot of hashtables,
and allthough this is neither hard realtime nor in the stage where we
need to really squeeze it for speed, we are a bit concerned whether we
are in fact doing something silly.


------------------------+-----------------------------------------------------
Christian Lynbech       | Ericsson Telebit, Skanderborgvej 232, DK-8260 Viby J
Phone: +45 8938 5244    | email: ·················@ted.ericsson.dk
Fax:   +45 8938 5101    | web:   www.ericsson.com
------------------------+-----------------------------------------------------
Hit the philistines three times over the head with the Elisp reference manual.
                                        - ·······@hal.com (Michael A. Petonic)

From: Kent M Pitman
Subject: Re: hashtables (was Re: Properties of CL symbols (probably an RTFM question))
Date: Tue, 26 Jun 2001 15:16:57 +0000
Message-ID: <sfwy9qftgnq.fsf@world.std.com>

Christian Lynbech <·················@ted.ericsson.dk> writes:

> 
> >>>>> "Kent" == Kent M Pitman <······@world.std.com> writes:
> 
> Kent> [Ask if you don't know about the "ironic" speed hit GETHASH can take. 
> Kent>  Everyone should know about that, but I'm too lazy to mention it now.
> Kent>  It's due to a mis-design in MAKE-HASH-TABLE, which should take a 
> Kent>  :OPTIMIZE-PERFORMANCE :REALTIME or :OPTIMIZE-PERFORMANCE :TOTAL-TIME,
> Kent>  but since it doesn't, it tries to guess [usually total-time] and screws
> Kent>  real-time applications.]
> 
> I would like to hear more about this. 
> 
> I am working on an application that uses quite a lot of hashtables,
> and allthough this is neither hard realtime nor in the stage where we
> need to really squeeze it for speed, we are a bit concerned whether we
> are in fact doing something silly.

EQ hash tables have to be "rehashed" on every gc when a relocating gc
is used, since the pointers might have moved.  However, often such a
table is not used between one gc cycle and the next, so many
implementations do you the "favor" of not rehashing.  This definitely
optimizes "total time" since the time to rehash is simply avoided (the
hash table is marked "dirty" and the hash table is rehashed instead on
demand the next time a gethash is done, which might be several gc
cycles later).  HOWEVER, it ignores the fact that most people use
GETHASH thinking they are getting more-or-less O(1) fast lookup time,
and a rehash of a table is not necessarily an O(1) "feel"... It's
*possible* that the gc would have occurred while the machine (or at
least the Lisp process) was idle and that rehash at that time, while
increasing "total time", would not have compromised realtime performance.
IMO, and this is another one of those stupid banners that I seem to be the
only one ever to wave, the Lisp system should CEASE AND DESIST making this
"helpful" decision for users and vendors should offer a keyword
option at MAKE-HASH-TABLE time which allows you to either select or avoid
this "optimization" in a way that is right for the application.  
Especially if you're doing things like writing CLIM interfaces, where the
CLIM interface is often already doing a ton of stuff and running slowly,
there's nothing like finding you clicked on a pull-down menu and you
went to access something in a not-recently-used hash-table only to find
it says "before I pull down this menu, how about I rehash this hash table".
Bleah...  I think right now just about all implementations unconditionally
assume you want to optimize "total time" and don't really understand the
"realtime response" issue.

From: Kent M Pitman
Subject: Re: hashtables (was Re: Properties of CL symbols (probably an RTFM question))
Date: Tue, 26 Jun 2001 15:34:01 +0000
Message-ID: <sfwd77rz252.fsf@world.std.com>

Kent M Pitman <······@world.std.com> writes:

> Christian Lynbech <·················@ted.ericsson.dk> writes:
> 
> > >>>>> "Kent" == Kent M Pitman <······@world.std.com> writes:
> > 
> > Kent> [Ask if you don't know about the "ironic" speed hit GETHASH 
> > Kent> can take. ...]
> > 
> > I would like to hear more about this. 
> 
> EQ hash tables have to be "rehashed" on every gc when a relocating gc
> is used, since the pointers might have moved.  However, often such a
> table is not used between one gc cycle and the next, so many
> implementations do you the "favor" of not rehashing.  This definitely
> optimizes "total time" since the time to rehash is simply avoided (the
> hash table is marked "dirty" and the hash table is rehashed instead on
> demand the next time a gethash is done, which might be several gc
> cycles later).  HOWEVER, it ignores the fact that most people use
> GETHASH thinking they are getting more-or-less O(1) fast lookup time,
> and a rehash of a table is not necessarily an O(1) "feel"... It's
> *possible* that the gc would have occurred while the machine (or at
> least the Lisp process) was idle and that rehash at that time, while
> increasing "total time", would not have compromised realtime performance.
> IMO, and this is another one of those stupid banners that I seem to be the
> only one ever to wave, the Lisp system should CEASE AND DESIST making this
> "helpful" decision for users and vendors should offer a keyword
> option at MAKE-HASH-TABLE time which allows you to either select or avoid
> this "optimization" in a way that is right for the application.  
> Especially if you're doing things like writing CLIM interfaces, where the
> CLIM interface is often already doing a ton of stuff and running slowly,
> there's nothing like finding you clicked on a pull-down menu and you
> went to access something in a not-recently-used hash-table only to find
> it says "before I pull down this menu, how about I rehash this hash table".
> Bleah...  I think right now just about all implementations unconditionally
> assume you want to optimize "total time" and don't really understand the
> "realtime response" issue.

Btw, I wanted to add a couple of remarks.

It's not that I'm not sensitive to the fact that sometimes a gc could happen
in a place that disturbs realtime  response, so it's not like some 
implementations couldn't ignore my request to reshash at a different time.

One way to handle this would be for the "idle process" (or the
last-ditch-before-idle process :-) to do rehashing rather than nothing
on tables known to be needed for realtime response.  Another way to handle
it would be to have a (with-realtime ...) that hinted to the system that 
this would be a bad time to be trading total time for realtime response.

Mostly what I'm arguing for is not countermanding careful metering of
a given implementation, but for the language offering enough information to
know what's in the programmer's head.  Certain gethash's need to be 
especially fast.  Part of the problem may be using hash-tables at all.
We should have just called them tables and let the system pick the 
implementation.  It might be that linear search could be faster and this
might queue the system to know that...  For very small tables, most everyone
knows an alist is probably faster anyway, but it's per-implementation
where the dividing line is, so having a generic table type would hide this
in the implementation's domain...

Just thinking out loud, but I figured if I was going to say half of the stuff,
I should say the other half.

What really bugs me about make-hash-table/gethash as they are now is that
the system makes all its decisions on the basis of no cues from the user,
when the user probably has cues to offer but has no gesture available
for offering them.

From: Steven M. Haflich
Subject: Re: hashtables (was Re: Properties of CL symbols (probably an RTFM  question))
Date: Fri, 29 Jun 2001 03:45:39 +0000
Message-ID: <3B3BF9E3.D8A8A50F@pacbell.net>

Kent M Pitman wrote:

> EQ hash tables have to be "rehashed" on every gc when a relocating gc
> is used, since the pointers might have moved.

I don't mean to pick an argument here, but I want to point out that
hash table implementation is more subtle than realized in 103% of the
messages posted about the subject to comp.lang.lisp.

It is in general true that an EQ (or EQL and maybe even EQUAL/P,
depending on the implementation) hashtable needs to be rehashed before
it can be accessed after any GC, there are possible optimizations to
this requirement that, indeed, make a difference to some applications.

For any hashtable predicate (whether or not one of the standard four
predicates in the ANS) the set of possible Lisp objects can be partitioned
in an implementation-dependent way according to their properties as hash
keys (see ANS 18.1.2 Modifying Hash Table Keys):

 (1) Some objects change their hash lookup as a result of GC.
 (2) Some objects don't.

Now, the implementation of a hash table certainly can know the difference
between these two kinds of objects, and it follows that the hash table can
know (sometimes) that it contains no keys from the first class.  If it
contains none, then it need not rehash after GC.  If it contains some, or
previously contained some and _might_ still contain some, then it can
trivially and efficiently recompute this binary attribute the next time it
is forced to rehash by access after a GC.

Now this might at first seem a tedious and useless optimization applicable
only in absurdly rare circumstances, but this is not the case.  Consider
for example an implementation that assigns immutable hash codes to each
symbol upon its creation.  If all the keys in the hash table are symbols
(or perhaps fixnums or other eq--under-gc-preserving hashcodes) then the
hash table does not need to rehash after GC.  And the hash table can
maintain this attribute with nearly zero cost.

ACL performs this optimization (the implementation is mostly due to Duane)
and it is very important to certain applications.  But there is a meta
point here:  The performance of hash tables is very subtle, and it is
_very_ hard for a naive application to predict and optimize the efficiency
of particular hash table usage.  So, portable applications can't depend
upon this or any other hash table optimization.  For many applications,
efficiency-critical hash table use remains in the domain of portions of
the application tha may need to be customized to the implementation.  This
is, of course, not a good thing, but it is the condition on the ground.

Now, during the X3J13 process Kent was eloquent that the standard should
not merely define certain guaranteed semantics, it must also delineate
(so far as possible) the proper usage that quality implementations with
achieve with efficiency, and not suggest code usage that (while
semantically portable) would not be practically portable because of
uncertain performance behavior between implementations.  (Kent might not
remember making this point, but I remember it well and I remember being
impressed with the point at the time and agreeing with him.  I still
agree.  If Kent wants to deny the point now, that's his privilege, but
I'll point out that Kent probably _also_ doesn't remember what I ate for
breakfast yesterday...)

It is really _really_ hard to capture the possibility of this kind of
optimization in a language standard.  The conditions of applicability
are much too hard to nail down. (Memory addresses change in any sort of
copying GC, but never change in freelist gc implementations.  Few GC
implementations these days use freelists, but it occurs to me that my
monthly salary would today easily buy more semiconductor memory than
ever plugged in all 36xx/LM-xx chassis every built, don't discount the
notion that GC strategies may yet change again.  Maybe there will again
even be viable implementations that never GC.  But I digress...)
I'm here publicizing the optimization because if enough other
implementations provided it, there would be that much less variability
in hash table performance, and the language would be improved.

···@franz.com

From: Kent M Pitman
Subject: Re: hashtables (was Re: Properties of CL symbols (probably an RTFM   question))
Date: Fri, 29 Jun 2001 07:08:14 +0000
Message-ID: <sfwithf69wh.fsf@world.std.com>

"Steven M. Haflich" <·······@pacbell.net> writes:

> Kent M Pitman wrote:
> 
> > EQ hash tables have to be "rehashed" on every gc when a relocating gc
> > is used, since the pointers might have moved.
> 
> I don't mean to pick an argument here,

I don't have too much argument with what you wrote, really.

> Now this might at first seem a tedious and useless optimization applicable
> only in absurdly rare circumstances, but this is not the case.  Consider
> for example an implementation that assigns immutable hash codes to each
> symbol upon its creation.  If all the keys in the hash table are symbols
> (or perhaps fixnums or other eq--under-gc-preserving hashcodes) then the
> hash table does not need to rehash after GC.  And the hash table can
> maintain this attribute with nearly zero cost.

Certainly a reasonable optimization.  It's a pity that keywords are not just
immutable strings (or vice versa) so that Java-like "canonical strings" could
play this game, too.

> ACL performs this optimization (the implementation is mostly due to Duane)
> and it is very important to certain applications.

I think I said explicitly in my message that I had no quarrel with the cases
where the gc-relocation wasn't needed.  e.g., obviously for certain predicates
it's not needed.  My point wasn't that you shouldn't do the optimization, or
that it couldn't work most of the time, but that vendors should still allow
the user to specify at hash table creation time what the preference was for
hash tables where the optimization wasn't going to succeed.  (And since it's
hard to say which ones this matters for, I think it should take this keyword
even in cases where the optimization doesn't apply; simpler to explain that
way.)

> Now, during the X3J13 process Kent was eloquent that the standard should
> not merely define certain guaranteed semantics, it must also delineate
> (so far as possible) the proper usage that quality implementations with
> achieve with efficiency, and not suggest code usage that (while
> semantically portable) would not be practically portable because of
> uncertain performance behavior between implementations.  (Kent might not
> remember making this point, but I remember it well and I remember being
> impressed with the point at the time and agreeing with him.  I still
> agree.  If Kent wants to deny the point now, that's his privilege, but
> I'll point out that Kent probably _also_ doesn't remember what I ate for
> breakfast yesterday...)

No, I'm not denying that. Nice of you to remember.  I'm just not sure how
a language change to say
 (defvar *foo* (make-hash-table :test 'eql :optimize :realtime-access))
will be any harder for you to work with or any less portable.  A vendor can
still ignore the information when they don't think it matters.  But when
faced with the little-known quandary about "rehash on gc" or
"rehash on access", they at least have user-advice to fall back on to help
them.  I *think* that doesn't violate my earlier-stated concern about
portable use, but  if it does, I'm curious as to how.

I don't basically imagine anyone can easily write a test suite that can test
whether the advice got taken.  But then, I remember a lot of fussing about
whether hash tables had to *really* be hash tables and couldn't they just 
be "lists" (internally) for very small situations, growing only to use hash
codes when they get enough elements for it to matter.  My feeling has always
been that if all it buys you is speed, of course they can, and surely no one
will be injured.  Performance is desirable but not semantic.  So similarly
for the hash-table case I'm not asking that vendors do one or the other
optimizations, only that they do what they do (which I recognize to be
implementation-specific and only best decided by vendors) having asked me
the things I can provide portably.

It used to be fashionable to sometimes extend a STRUCT to add extra slots
in a patch by using GETHASH so the pre-existing "instances" didn't have to
be "updated".  But in early CLIM, I saw some really bad places where the
pull-down menus were slow and some of it tracked to this case, where GETHASH
was used for its "near constant-time-ness" and where the user was blindsided 
by it not being.  If the system can otherwise guarantee "near 
constant-time-ness", I'm happy with that.

> It is really _really_ hard to capture the possibility of this kind
> of optimization in a language standard.  The conditions of
> applicability are much too hard to nail down.

I tried through how I proposed the option not to specify either the  
possibility or necessity of the particular optimization, but to keep it
high level.  I agree with this statement on your part.

> I'm here publicizing the optimization because if enough other
> implementations provided it, there would be that much less variability
> in hash table performance, and the language would be improved.

I think you're right here, and I appreciate the mention of such a thing.
But I still kind of wish, until we can guarantee no hash table does this,
that users were polled as to their needs before an optimizaation was chosen
at random for the few cases where it comes up.