From: Sean SCC
Subject: How do Lisp Hash Tables handle clashes?
Date: 
Message-ID: <1156856157.041228.210380@74g2000cwt.googlegroups.com>
Hi,

I will soon be writing an application that depends on generating hash
tables for fast lookup. Rather than rolling my own I intend to use the
Lisp provided hash tables. I have some questions though:

1. How do Lisp Hash Tables handle clashes?
2. If a clash has occurred how will I know I am retrieving the
"correct" data from the table?
3. Is the way Lisp handles Hash Tables standardised and thus constant
across dialects of CL?

Thx
Sean

From: justinhj
Subject: Re: How do Lisp Hash Tables handle clashes?
Date: 
Message-ID: <1156862826.852433.197400@75g2000cwc.googlegroups.com>
Sean SCC wrote:
> Hi,
>
> I will soon be writing an application that depends on generating hash
> tables for fast lookup. Rather than rolling my own I intend to use the
> Lisp provided hash tables. I have some questions though:
>
> 1. How do Lisp Hash Tables handle clashes?
> 2. If a clash has occurred how will I know I am retrieving the
> "correct" data from the table?
> 3. Is the way Lisp handles Hash Tables standardised and thus constant
> across dialects of CL?
>
> Thx
> Sean

Hi Sean

>From the common lisp hyperspec

http://www.cs.rit.edu/~swm/lisp/HyperSpec/Body/18_aa.htm

answer 1) then is
-- A hash table can only associate one value with a given key. If an
attempt is made to add a second value for a given key, the second value
will replace the first. Thus, adding a value to a hash table is a
destructive operation; the hash table is modified.

answer 2) you get the latest data added

answer 3) yes it's standard for any cl implementation that adheres to
the clhs
From: Joe Knapka
Subject: Re: How do Lisp Hash Tables handle clashes?
Date: 
Message-ID: <Vd0Jg.2973$o42.876@tornado.texas.rr.com>
justinhj wrote:

> Sean SCC wrote:
> 
>>Hi,
>>
>>I will soon be writing an application that depends on generating hash
>>tables for fast lookup. Rather than rolling my own I intend to use the
>>Lisp provided hash tables. I have some questions though:
>>
>>1. How do Lisp Hash Tables handle clashes?
>>2. If a clash has occurred how will I know I am retrieving the
>>"correct" data from the table?
>>3. Is the way Lisp handles Hash Tables standardised and thus constant
>>across dialects of CL?
>>
>>Thx
>>Sean
> 
> 
> Hi Sean
> 
>>From the common lisp hyperspec
> 
> http://www.cs.rit.edu/~swm/lisp/HyperSpec/Body/18_aa.htm
> 
> answer 1) then is
> -- A hash table can only associate one value with a given key. If an
> attempt is made to add a second value for a given key, the second value
> will replace the first. Thus, adding a value to a hash table is a
> destructive operation; the hash table is modified.

Yes.  But only if you add a key with the same *value*.
In that case, it's hard to see any other reasonable
behavior.

A "clash" (by which I assume you mean a hash collision)
occurs when two *different* key values hash to the same hash-value.
In that case, CL will do the right thing, storing each
key separately and returning the proper associated value
for each of those keys.

-- JK
From: Rob Thorpe
Subject: Re: How do Lisp Hash Tables handle clashes?
Date: 
Message-ID: <1156868240.977528.106280@i3g2000cwc.googlegroups.com>
Sean SCC wrote:
> Hi,
>
> I will soon be writing an application that depends on generating hash
> tables for fast lookup. Rather than rolling my own I intend to use the
> Lisp provided hash tables. I have some questions though:
>
> 1. How do Lisp Hash Tables handle clashes?
> 2. If a clash has occurred how will I know I am retrieving the
> "correct" data from the table?
> 3. Is the way Lisp handles Hash Tables standardised and thus constant
> across dialects of CL?

In almost every language the way the hash table works is hidden from
the programmer, it isn't really relevant. Some are better than others
but non really asymtotically better or worse.
From: Sean SCC
Subject: Re: How do Lisp Hash Tables handle clashes?
Date: 
Message-ID: <1156870150.318234.296530@m73g2000cwd.googlegroups.com>
Clashes are a show-stopper for me - I presume the way to handle this is
if Lisp allows you to keep track of which hashes produced a clash, you
could creat a seperate, presumably much smaller list of exceptions?
Only pain for this (assuming it is possible) is the need to do a lookup
from the original hash table and the list of exceptions every time.

Would have been much better if the hash table simply returned a list of
all items at that hash address. Mostly the list would contain only 1
item but at least clashes would be handled elegantly.

Cheers,
Sean





Rob Thorpe wrote:
> Sean SCC wrote:
> > Hi,
> >
> > I will soon be writing an application that depends on generating hash
> > tables for fast lookup. Rather than rolling my own I intend to use the
> > Lisp provided hash tables. I have some questions though:
> >
> > 1. How do Lisp Hash Tables handle clashes?
> > 2. If a clash has occurred how will I know I am retrieving the
> > "correct" data from the table?
> > 3. Is the way Lisp handles Hash Tables standardised and thus constant
> > across dialects of CL?
>
> In almost every language the way the hash table works is hidden from
> the programmer, it isn't really relevant. Some are better than others
> but non really asymtotically better or worse.
From: bradb
Subject: Re: How do Lisp Hash Tables handle clashes?
Date: 
Message-ID: <1156870759.408323.304610@p79g2000cwp.googlegroups.com>
Sean SCC wrote:
> Clashes are a show-stopper for me - I presume the way to handle this is
> if Lisp allows you to keep track of which hashes produced a clash, you
> could creat a seperate, presumably much smaller list of exceptions?
> Only pain for this (assuming it is possible) is the need to do a lookup
> from the original hash table and the list of exceptions every time.
>
> Would have been much better if the hash table simply returned a list of
> all items at that hash address. Mostly the list would contain only 1
> item but at least clashes would be handled elegantly.
>
> Cheers,
> Sean

I don't think I understand you here.  Are you talking about hashing
collisions, or key collisions?  Hashing collisions (ie, 2 different
keys produce the same internal address) are an implementation issue.
Key collisions are a programmer issue.

(defvar *table* (make-hash-table))
(setf (gethash 1 *table*) 'hello)
(setf (gethash 1 *table*) 'goodbye)
(gethash 1 *table*) => GOODBYE

This is exactly as I intended.

Cheers
Brad
From: Robert Uhl
Subject: Re: How do Lisp Hash Tables handle clashes?
Date: 
Message-ID: <m3fyffthfq.fsf@NOSPAMgmail.com>
"Sean SCC" <···········@googlemail.com> writes:
>
> Clashes are a show-stopper for me - I presume the way to handle this
> is if Lisp allows you to keep track of which hashes produced a clash,
> you could creat a seperate, presumably much smaller list of
> exceptions?  Only pain for this (assuming it is possible) is the need
> to do a lookup from the original hash table and the list of exceptions
> every time.

Dude, are you worrying about values whose hash function clashes?  In
that case, don't--it's all taken care of under the table.  If you want
to store multiple values for a single key, I don't know of any language
which would do that: hash tables by definition providing a mapping from
a key to a value.  In every language I know of, assigning a value to a
hash key over-writes any previous value stored therein.

> Would have been much better if the hash table simply returned a list
> of all items at that hash address.  Mostly the list would contain only
> 1 item but at least clashes would be handled elegantly.

You could do this if you needed to, simply by using push:

cl-user> (defvar hash (make-hash-table :test 'equal))
hash
cl-user> (push 3 (gethash "foo" hash))
(3)
cl-user> (push 4 (gethash "foo" hash))
(4 3)
cl-user> (push 4 (gethash "bar" hash))
(4)
cl-user> (maphash #'(lambda (key val)
		      (format t "key: ~a; val: ~a~%" key val))
		  hash)
key: foo; val: (4 3)
key: bar; val: (4)
nil
cl-user> 

-- 
Robert Uhl <http://public.xdi.org/=ruhl>
> "global" connectivity (as in, access to OTHERS' PRIVATELY OWNED
> equipment) is a COURTESY and PRIVILEGE granted by the owners of
> that equipment. It is not a birthright.
                         --Chris Stassen
From: Christophe Rhodes
Subject: Re: How do Lisp Hash Tables handle clashes?
Date: 
Message-ID: <squ03vwk3w.fsf@cam.ac.uk>
"Sean SCC" <···········@googlemail.com> writes:

> Clashes are a show-stopper for me - I presume the way to handle this is
> if Lisp allows you to keep track of which hashes produced a clash, you
> could creat a seperate, presumably much smaller list of exceptions?
> Only pain for this (assuming it is possible) is the need to do a lookup
> from the original hash table and the list of exceptions every time.

I think you might be misinterpreting the "clash", or possibly your
respondents are.

If the objects that you are using as keys are "different" under the
test specified with the hash table (eq, eql, equal or equalp for
standard hash tables), then outwardly there is no "clash": they are
different keys, and can be associated with different values in the
same hash table.  Whether they actually happen to have colliding hash
values is irrelevant; the Lisp system must make that invisible to the
user: the only thing that matters is equality under the hash table's
test.

If the objects that you are using as keys are the same under the test,
then they have the same hash bucket.  If that's a problem, choose a
more restrictive test: since one of the standard tests is based on
object identity (loosely, the address of an object, but don't quote me
on that) there should always be a test that can be made to fit your
needs.  (If not, it is possible to extend the set of tests / hash
functions in most CL implementations).

> Would have been much better if the hash table simply returned a list of
> all items at that hash address. Mostly the list would contain only 1
> item but at least clashes would be handled elegantly.

If I've understood this right, this is a non-concern: clashes just do
not happen, ever, in Lisp hash tables.

Christophe
From: Dave Seaman
Subject: Re: How do Lisp Hash Tables handle clashes?
Date: 
Message-ID: <ed1t63$nru$1@mailhub227.itcs.purdue.edu>
On Tue, 29 Aug 2006 18:02:43 +0100, Christophe Rhodes wrote:
> "Sean SCC" <···········@googlemail.com> writes:

>> Would have been much better if the hash table simply returned a list of
>> all items at that hash address. Mostly the list would contain only 1
>> item but at least clashes would be handled elegantly.

> If I've understood this right, this is a non-concern: clashes just do
> not happen, ever, in Lisp hash tables.

To expand on that:  if a hash table becomes so full that a new key cannot
be added without a clash, then the hash table is dynamically expanded and
all the entries are rehashed.  All this is invisible to the programmer.


-- 
Dave Seaman
U.S. Court of Appeals to review three issues 
concerning case of Mumia Abu-Jamal.
<http://www.mumia2000.org/>
From: Pascal Bourguignon
Subject: Re: How do Lisp Hash Tables handle clashes?
Date: 
Message-ID: <87lkp7fdcp.fsf@thalassa.informatimago.com>
Dave Seaman <·······@no.such.host> writes:

> On Tue, 29 Aug 2006 18:02:43 +0100, Christophe Rhodes wrote:
>> "Sean SCC" <···········@googlemail.com> writes:
>
>>> Would have been much better if the hash table simply returned a list of
>>> all items at that hash address. Mostly the list would contain only 1
>>> item but at least clashes would be handled elegantly.
>
>> If I've understood this right, this is a non-concern: clashes just do
>> not happen, ever, in Lisp hash tables.
>
> To expand on that:  if a hash table becomes so full that a new key cannot
> be added without a clash, then the hash table is dynamically expanded and
> all the entries are rehashed.  All this is invisible to the programmer.

Not necessarily. This is an implementation detail.  The standard
doesn't specifies that they be expanded.  (It only specifies some
"hint" parameters).

    rehash-size specifies a minimum amount to increase the size of the
    hash-table when it becomes full enough to require rehashing;

In an implementation that handles collision buckets as dynamic
structures (trees or lists), the table never becomes full enough to
_require_ rehashing.


    The values of rehash-size and rehash-threshold do not constrain
    the  implementation to use any particular method for computing
    when and by how much the size of hash-table should be
    enlarged. Such decisions are  implementation-dependent, and these
    values only hints from the programmer to the implementation, and
    the implementation is permitted to ignore them.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

"Our users will know fear and cower before our software! Ship it!
Ship it and let them flee like the dogs they are!"
From: Ari Johnson
Subject: Re: How do Lisp Hash Tables handle clashes?
Date: 
Message-ID: <m2bqq3iopu.fsf@nibbler.theari.com>
"Sean SCC" <···········@googlemail.com> writes:

> Hi,
>
> I will soon be writing an application that depends on generating hash
> tables for fast lookup. Rather than rolling my own I intend to use the
> Lisp provided hash tables. I have some questions though:
>
> 1. How do Lisp Hash Tables handle clashes?
> 2. If a clash has occurred how will I know I am retrieving the
> "correct" data from the table?
> 3. Is the way Lisp handles Hash Tables standardised and thus constant
> across dialects of CL?

The hash table type in Lisp is part of the ANSI standard.  See
http://www.lispworks.com/documentation/HyperSpec/Body/18_.htm for
information on what the standard requires of a conforming
implementation.

As an aside, to call the various implementations "dialects of CL" is
not very precise, as they do not change Common Lisp at all but rather
add things entirely outside of the standard.
From: Kaz Kylheku
Subject: Re: How do Lisp Hash Tables handle clashes?
Date: 
Message-ID: <1156881473.778848.64280@h48g2000cwc.googlegroups.com>
Sean SCC wrote:
> Hi,
>
> I will soon be writing an application that depends on generating hash
> tables for fast lookup. Rather than rolling my own I intend to use the
> Lisp provided hash tables. I have some questions though:
>
> 1. How do Lisp Hash Tables handle clashes?

Clashes aren't a concept in Common Lisp hash tables.

Of course, you can replace the data item stored under an existing key.

> 2. If a clash has occurred how will I know I am retrieving the
> "correct" data from the table?

Where in the interface specification did you deduce that this is
allowed to happen?

Have you ever seen a hash table module in any language which retrieved
incorrect data for a key, and which was not considered defective for
that reason?

> 3. Is the way Lisp handles Hash Tables standardised and thus constant
> across dialects of CL?

CL is an ANSI-standard language.

How can you be programming in Lisp, yet not be familiar with the
standard, or the HTML-ized derivation of it known as the Common Lisp
HyperSpec?
From: Pascal Bourguignon
Subject: Re: How do Lisp Hash Tables handle clashes?
Date: 
Message-ID: <87irkbha7z.fsf@thalassa.informatimago.com>
"Sean SCC" <···········@googlemail.com> writes:

> Hi,
>
> I will soon be writing an application that depends on generating hash
> tables for fast lookup. Rather than rolling my own I intend to use the
> Lisp provided hash tables. I have some questions though:
>
> 1. How do Lisp Hash Tables handle clashes?

It's implementation dependant.


> 2. If a clash has occurred how will I know I am retrieving the
> "correct" data from the table?

It's the job of the implementation to return you the correct data.

If it doesn't it's called a bug, which you can report to the
maintainers or vendors of your implementation.


> 3. Is the way Lisp handles Hash Tables standardised and thus constant
> across dialects of CL?

What do you mean by "handles".  Implementation is not standardized.
API and behavior is standardized.


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

"You cannot really appreciate Dilbert unless you read it in the
original Klingon"
From: Alex Mizrahi
Subject: Re: How do Lisp Hash Tables handle clashes?
Date: 
Message-ID: <44f48a50$0$75035$14726298@news.sunsite.dk>
(message (Hello 'Sean)
(you :wrote  :on '(29 Aug 2006 05:55:57 -0700))
(

 SS> 1. How do Lisp Hash Tables handle clashes?
 SS> 2. If a clash has occurred how will I know I am retrieving the
 SS> "correct" data from the table?

looks like you think too naively about hash table implementation.
hash table is not merely an array mapping hash to value.
it's more complex structure that never looses data.
basically HTs map hashes to bucket, each bucket can contain from zero to 
'inifinite' (key => value) pairs. so to look up value, it scans all the 
bucket keys for specific key doing eq eql or whatever comparison. thus hash 
clashes can only somewhat slow down lookups.
typically they choose hash space (number of different possible hash values) 
to be multiple of 2 greater than count of objects. so if hashes are 
uniformely distributed buckets typically contain 1, 0, or 2 values. you 
might want less hash space -- i.e. 2..3 objects in each bucket -- if you 
know that key comparison is quite fast, hashes are uniformely distributed, 
and do not want to have big array of hashes.

)
(With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
"People who lust for the Feel of keys on their fingertips (c) Inity") 
From: =?utf-8?b?R2lzbGUgU8ODwqZsZW5zbWk=?= =?utf-8?b?bmRl?=
Subject: Re: How do Lisp Hash Tables handle clashes?
Date: 
Message-ID: <0n4pvvf9te.fsf@apal.ii.uib.no>
"Sean SCC" <···········@googlemail.com> writes:

> Hi,
> 
> I will soon be writing an application that depends on generating hash
> tables for fast lookup. Rather than rolling my own I intend to use the
> Lisp provided hash tables. I have some questions though:
> 
> 1. How do Lisp Hash Tables handle clashes?
> 2. If a clash has occurred how will I know I am retrieving the
> "correct" data from the table?
> 3. Is the way Lisp handles Hash Tables standardised and thus constant
> across dialects of CL?

To understand what a CL hash table is, it is important to understand that it
is not even required for an compiler/implementation to implement it as a 
hash table data structure as known from computer science introductory courses.
In that sense, the name used in e.g python 'dictionary' would be a better name,
as it does not associate the concept with a perticular way of implementing it.

In the CL spec, it is (as others have told) specified how a hash table should
behave, and that include that no distinct keys should ever clash. If the 
underlaying implementation of the hash table actually is a hash table data stucture,
it must have some way to ensure that clashes is handled properly, so that it is
never visible for the programmer/user of the common lisp system. If not the 
implementation violates the CL standard. So the conclusion is that you should 
not worry about clashes.

One thing you should think about however, it the difference between the diffent 
tests you can initialize a hash table with. (eq, eql, equal and equalp). These
effect when two values are considered equal. For example, with an 'eq' hash table
two keys "hello world" may be different, because they by be at two differnt memory
locations, but with an "equal" hash table the keys will always be considered to
be the same. This is esential to understand CL hash tables.

> 
> Thx
> Sean
> 

-- 
Gisle Sælensminde, Phd student, Scientific programmer
Computational biology unit, BCCS, University of Bergen, Norway, 
Email: ·····@cbu.uib.no
The best way to travel is by means of imagination