From: J. Romano
Subject: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <b893f5d4.0212281408.715887c6@posting.google.com>
Hi,

   In the Fall of 2001, I was a Graduate Teaching Assistant for a
class at my University.  Although Lisp was a requirement for the
class, few people knew it as well as they knew C++ or Java.

   When the instructor explained how to make hash tables in Lisp, many
students were confused, partly due to the fact that many of them
didn't even know what a hash table was (and when to use it). 
Therefore, I created the following write-up to help the students in my
class understand how to create and use hash tables in Lisp, as well as
how they could be used.

   I'm posting the write-up here on the chance that somebody somewhere
might find it useful.  Keep in mind that my goal in composing the
write-up was to use simple, understandable wording (instead of aiming
for completeness using technical terms).  Anybody who has basic
knowldege of Lisp and knowledge of arrays in either C++ or Java should
be able to understand my write-up.

   Feel free to distribute my write-up, especially for educational
purposes.  I would ask, however, that my write-up not be changed
unless it is to correct an error.

   Finally, here is my write-up:

------------------------------------------------------

   A Guide to Understanding Lisp Hash-Tables (a little better)

      by Jean-Luc Romano

There is normally some confusion when students try to learn about hash
tables in any language.  Therefore, this file has been made in the
hopes that much of the confusion can be cleared up.


A LITTLE BIT OF C++ CODE

Before we begin discussing hash-tables, let's examine some of the
differences between C++ and Lisp (if you've never used C++, don't
worry -- the code used here will be understood by any Java
programmer).

In C++, if we want to set variable myVar to 5, we write:
   myVar = 5;
In Lisp, we write:
   (setf myVar 5)

If we want myVar to be on the right-hand side, in C++ we write:
   a = myVar;
In Lisp, we write:
   (setf a myVar)

Notice that instead of using the equal sign, Lisp uses the "setf"
function.


C++ ARRAYS

Almost every programmer knows about arrays.  In C++ the syntax is very
simple.  To assign a value to the first element of an array in C++, we
write:

   A[0] = 5;

The following lines are also valid, provided that the array index is
within range:

   A[3] = 55;
   A[42] = 13;

But there are certain constraints to remember when assigning (or
referencing) an array:
   a.  The array index must be in the proper range
   b.  The array index must always be an integer

These two rules make sense to us.  How can you have a floating point
number as an index to an array?  If that were allowed, then any array
would have to have an infinite number of elements and therefore use an
infinite amount of space.  The very idea is too bizarre to even
consider.

Or is it?  What if I wanted to write:
   A[2.71] = 6;
Or maybe something with an array index that is a string, like:
   A["hello"] = 2;
Is this even possible?

This IS possible, ever since the introduction of C++ Standard Template
Libraries (STL).  In these libraries, there is a class type called
"map" which allows a user to create a special kind of array that does
not have to use an integer as the array index.  The following line:

   map<string,float> A;

creates a variable A that behaves like an array, but whose array index
must be a string and whose values must all be floating point numbers. 
Now the following line is perfectly valid:
   A["pi"] = 3.14;
and the expression:
   float b = 2 * A["pi"];
will set b equal to 6.28 .

This explanation of the variable A is one way of looking at a hash
table (it is by no means the only way, however).  A hash table stores
a value based on a "key" which, in C++, looks like an array index. 
Unlike an array, however, the key can be a string, float, character,
integer, or even a user-defined type.

So what is the difference between an array and a hash table?  Here are
a few:
   a.  a hash table will allow any index (or key) to
       be used, not just a positive integer
   b.  since arrays have pre-allocated memory, their
       access time is much quicker than hash tables
   c.  hash tables have to worry about how to
       organize and retreive the values internally;
       however, the programmer does not have to worry
       about this (the code that manages this has
       already been written!)

In case you may be wondering why this might be useful, consider
storing someone's age in a hash table, with his/her name as a key (or
index):

   Age["Joe"] = 24;
   Age["Ann"] = 18;
   cout << "Joe's age is " << Age["Joe"] << endl;
   cout << "Ann's age is " << Age["Ann"] << endl;


HASH TABLES IN LISP

So how do we create hash tables in Lisp?  Very easily.  We use the
function:
   (make-hash-table :test #'equal)

Note that this function RETURNS a hash table, not declares one. 
Therefore, we must place its return value into a variable that we want
to use as a hash-table:
   (setf a  (make-hash-table :test #'equal) )

Now, how do we set/retreive the value at key "pi", for instance?  The
answer is not quite as intuitive as the C++ method, but it is still
rather simple:
   (gethash "pi" a)

Therefore,  (gethash "pi" a)  is Lisp's equivalent code for C++'s 
A["pi"] .

Two things to note:
   a.  "gethash", unlike "make-hash-table", does not
       contain any hyphens
   b.  unlike C++, the key (in this case "pi") comes
       before the hash table's name (here, ''a''
       comes after "pi")

So how do we set a value to a particular key?  Just like we set a
value to any other variable.  If we wanted to set 3.14 to the variable
pi, we would write:
   (setf pi 3.14)
But if we want to set it to A["pi"], we substitute  (gethash "pi" a) 
for pi:
   (setf  (gethash "pi" a)  3.14)

Now, how do we read a value that was set to a particular key?  Simple:
 we just use  (gethash "pi" a)  just like any old variable in Lisp. 
Instead of
   (setf newVar pi)
we write:
   (setf newVar  (gethash "pi" a) )

There are a few things to remember about hash tables in Lisp:

   a.  because Lisp allows any variable to bind
       to any variable (that is, a variable 
       assigned to an integer can also be assigned 
       to a float, string, or a list), neither the 
       key nor the value types have to be specified 
       in Lisp, even though they do have to be 
       specified in C++.  This has both advantages 
       and disadvantages, but at least it is less 
       stuff to remember for the Lisp programmer.

   b.  if two values are written to the same key,
       the old value will be replaced (or 
       "clobbered") by the new value.  For example,
       the lines:
          (setf (gethash "pi" a) 3.14)
          (setf (gethash "pi" a) 7)
       will set A["pi"] (that is, (gethash "pi" a) )
       to 7 .

   c.  if no value has been set for a particular key,
       its value will be NIL.


Well, that's about it for this file.  I hope you find it helpful!

------------------------------------------------------

From: Marc Spitzer
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <86d6nl1jg9.fsf@bogomips.optonline.net>
·······@hotmail.com (J. Romano) writes:

> Hi,
> 
>    In the Fall of 2001, I was a Graduate Teaching Assistant for a
> class at my University.  Although Lisp was a requirement for the
> class, few people knew it as well as they knew C++ or Java.
> 
>    When the instructor explained how to make hash tables in Lisp, many
> students were confused, partly due to the fact that many of them
> didn't even know what a hash table was (and when to use it). 

What level was this course, freshman, sophomore, jr, sr, grad? 
Either way they should have seen a hash function and learned about 
hashing by the end of the freshman year.

> Therefore, I created the following write-up to help the students in my
> class understand how to create and use hash tables in Lisp, as well as
> how they could be used.
> 
>    I'm posting the write-up here on the chance that somebody somewhere
> might find it useful.  Keep in mind that my goal in composing the
> write-up was to use simple, understandable wording (instead of aiming
> for completeness using technical terms).  Anybody who has basic
> knowldege of Lisp and knowledge of arrays in either C++ or Java should
> be able to understand my write-up.
> 
>    Feel free to distribute my write-up, especially for educational
> purposes.  I would ask, however, that my write-up not be changed
> unless it is to correct an error.
> 
>    Finally, here is my write-up:
> 
> ------------------------------------------------------
> 
>    A Guide to Understanding Lisp Hash-Tables (a little better)
> 
>       by Jean-Luc Romano
> 
> There is normally some confusion when students try to learn about hash
> tables in any language.  Therefore, this file has been made in the
> hopes that much of the confusion can be cleared up.
> 
> 
> A LITTLE BIT OF C++ CODE
> 
> Before we begin discussing hash-tables, let's examine some of the
> differences between C++ and Lisp (if you've never used C++, don't
> worry -- the code used here will be understood by any Java
> programmer).
> 
> In C++, if we want to set variable myVar to 5, we write:
>    myVar = 5;
> In Lisp, we write:
>    (setf myVar 5)
> 
> If we want myVar to be on the right-hand side, in C++ we write:
>    a = myVar;
> In Lisp, we write:
>    (setf a myVar)
> 
> Notice that instead of using the equal sign, Lisp uses the "setf"
> function.
> 
> 
> C++ ARRAYS
> 
> Almost every programmer knows about arrays.  In C++ the syntax is very
> simple.  To assign a value to the first element of an array in C++, we
> write:
> 
>    A[0] = 5;
> 
> The following lines are also valid, provided that the array index is
> within range:
> 
>    A[3] = 55;
>    A[42] = 13;
> 
> But there are certain constraints to remember when assigning (or
> referencing) an array:
>    a.  The array index must be in the proper range
>    b.  The array index must always be an integer
> 
> These two rules make sense to us.  How can you have a floating point
> number as an index to an array?  If that were allowed, then any array
> would have to have an infinite number of elements and therefore use an
> infinite amount of space.  The very idea is too bizarre to even
> consider.
> 



> Or is it?  What if I wanted to write:
>    A[2.71] = 6;
> Or maybe something with an array index that is a string, like:
>    A["hello"] = 2;

I think that adding a little syntax might make things clearer
array:
a[2]=6;
hash table:
a[["hello"]]=2;

or something else that is not used by C/C++/JAVA to clarify
visualy when you are talking about an array or a hash.

> Is this even possible?
> 
> This IS possible, ever since the introduction of C++ Standard Template
> Libraries (STL).  In these libraries, there is a class type called
> "map" which allows a user to create a special kind of array that does
> not have to use an integer as the array index.  The following line:
> 
>    map<string,float> A;
> 
> creates a variable A that behaves like an array, but whose array index
> must be a string and whose values must all be floating point numbers. 
> Now the following line is perfectly valid:
>    A["pi"] = 3.14;
> and the expression:
>    float b = 2 * A["pi"];
> will set b equal to 6.28 .
> 
> This explanation of the variable A is one way of looking at a hash
> table (it is by no means the only way, however).  A hash table stores
> a value based on a "key" which, in C++, looks like an array index. 
> Unlike an array, however, the key can be a string, float, character,
> integer, or even a user-defined type.

I think it would be more accurate to state that the print value of the
data type in question not the data type itself is used as the key or
index value. for example lets say I have a float, x = 2.0, and I use
it for an index in to a hash.  The internal print representation was
"+2.0E0" and later I use the string "2.0" to do a search for what is
stored at index "+2.0E0" and I do not get it even though they are the
same value as a float.

> 
> So what is the difference between an array and a hash table?  Here
> are a few: a.  a hash table will allow any index (or key) to be
> used, not just a positive integer

I would chang this to read:

a: A hash table will allow the string representation of any data type
to be used as the index(I prefer key) not just an integer.

  *note* Many languages allow arbitrary integer ranges to be used for
  array indexes, pascal and friends(ada, modula 2, delphi) comes to
  mind.

>    b.  since arrays have pre-allocated memory, their access time is
>    much quicker than hash tables

You could make the argument that array access is faster because it
does not have to hash the index string it will be faster.  But the one
you use above makes no sense, if you were to replace 'pre-allocated'
with 'contiguous' it would be more accurate.

>    c.  hash tables have to worry about how to organize and retreive
>    the values internally; however, the programmer does not have to
>    worry about this (the code that manages this has already been
>    written!)

so do arrays a[4]= 3 and x = a[4] do exactly what you say in point c

> 
> In case you may be wondering why this might be useful, consider
> storing someone's age in a hash table, with his/her name as a key
> (or index):
> 
>    Age["Joe"] = 24; Age["Ann"] = 18; cout << "Joe's age is " <<
>    Age["Joe"] << endl; cout << "Ann's age is " << Age["Ann"] <<
>    endl;

simple printing in C++ is ugly, how bad does complex printing get?

> 
> 
> HASH TABLES IN LISP
> 
> So how do we create hash tables in Lisp?  Very easily.  We use the
> function: (make-hash-table :test #'equal)
> 
> Note that this function RETURNS a hash table, not declares one.
> Therefore, we must place its return value into a variable that we
> want to use as a hash-table: (setf a (make-hash-table :test #'equal)
> )
> 
> Now, how do we set/retreive the value at key "pi", for instance?
> The answer is not quite as intuitive as the C++ method, but it is
> still rather simple: (gethash "pi" a)
> 
> Therefore, (gethash "pi" a) is Lisp's equivalent code for C++'s
> A["pi"] .
> 
> Two things to note: a.  "gethash", unlike "make-hash-table", does
> not contain any hyphens

I think a is silly to point out, because it is so self evident

>    b.  unlike C++, the key (in this case "pi") comes before the hash
>    table's name (here, ''a'' comes after "pi")
> 
> So how do we set a value to a particular key?  Just like we set a
> value to any other variable.  If we wanted to set 3.14 to the
> variable pi, we would write: (setf pi 3.14) But if we want to set it
> to A["pi"], we substitute (gethash "pi" a) for pi: (setf (gethash
> "pi" a) 3.14)
> 
> Now, how do we read a value that was set to a particular key?
> Simple: we just use (gethash "pi" a) just like any old variable in
> Lisp.  Instead of (setf newVar pi) we write: (setf newVar (gethash
> "pi" a) )
> 
> There are a few things to remember about hash tables in Lisp:
> 
>    a.  because Lisp allows any variable to bind to any variable
>    (that is, a variable assigned to an integer can also be assigned
>    to a float, string, or a list), neither the key nor the value
>    types have to be specified in Lisp, even though they do have to
>    be specified in C++.  This has both advantages and disadvantages,
>    but at least it is less stuff to remember for the Lisp
>    programmer.
> 
>    b.  if two values are written to the same key, the old value will
>    be replaced (or "clobbered") by the new value.  For example, the
>    lines: (setf (gethash "pi" a) 3.14) (setf (gethash "pi" a) 7)
>    will set A["pi"] (that is, (gethash "pi" a) ) to 7 .

that is not always true, it can do that or it can keep the orignal
value or it can keep a list of all values assigned to any given key.
In CL the operation is distructive for the organic hash functions.

> 
>    c.  if no value has been set for a particular key, its value will
>    be NIL.
> 

if no value has been set for a particular key it has no value set, you
can still return nill with a value set to nil for a given key, read
this: http://www.lispworks.com/reference/HyperSpec/Body/18_aa.htm

to start with.

> 
> Well, that's about it for this file.  I hope you find it helpful!
> 
> ------------------------------------------------------

There were some other parts that I did not like but did not comment
on, I did not feel I would have done a very good job with them unless
I spent a good deal of time on them and I did not feel like making the
effort tonight.

marc
From: J. Romano
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <b893f5d4.0212290904.23dadc25@posting.google.com>
Marc Spitzer <········@optonline.net> wrote in message news:<··············@bogomips.optonline.net>...
>
> ·······@hotmail.com (J. Romano) writes:
> >    In the Fall of 2001, I was a Graduate Teaching Assistant for a
> > class at my University.  Although Lisp was a requirement for the
> > class, few people knew it as well as they knew C++ or Java.
> 
> What level was this course, freshman, sophomore, jr, sr, grad? 
> Either way they should have seen a hash function and learned about 
> hashing by the end of the freshman year.

   Believe it or not, it was a Senior-level course.  Knowledge of Lisp
was a prerequisite, but most only knew as much as they had learned in
one other class (which wasn't much).

> > Or is it?  What if I wanted to write:
> >    A[2.71] = 6;
> > Or maybe something with an array index that is a string, like:
> >    A["hello"] = 2;
> 
> I think that adding a little syntax might make things clearer
> array:
> a[2]=6;
> hash table:
> a[["hello"]]=2;
> 
> or something else that is not used by C/C++/JAVA to clarify
> visualy when you are talking about an array or a hash.

   I don't think that would make it clearer.  For one thing, these
students' main language was C++ or Java, and to use syntax that is not
supported in either of those languages (or even Lisp) would make
matters much more confusing.

> > Now the following line is perfectly valid:
> >    A["pi"] = 3.14;
> > and the expression:
> >    float b = 2 * A["pi"];
> > will set b equal to 6.28 .
>
> I think it would be more accurate to state that the print value of the
> data type in question not the data type itself is used as the key or
> index value. for example lets say I have a float, x = 2.0, and I use
> it for an index in to a hash.  The internal print representation was
> "+2.0E0" and later I use the string "2.0" to do a search for what is
> stored at index "+2.0E0" and I do not get it even though they are the
> same value as a float.

   I purposefully tried not to complicate things.  As I mentioned
earlier, I was not aiming for technical completeness, but rather for
comprehension.  It's important to note that for many of these students
Lisp is like a foreign or alien language that many of them struggle
with.

> > So what is the difference between an array and a hash table?  Here
> > are a few: a.  a hash table will allow any index (or key) to be
> > used, not just a positive integer
> 
> I would chang this to read:
> 
> a: A hash table will allow the string representation of any data type
> to be used as the index(I prefer key) not just an integer.
> 
>   *note* Many languages allow arbitrary integer ranges to be used for
>   array indexes, pascal and friends(ada, modula 2, delphi) comes to
>   mind.

   That's true.  (But since many of the students were only familiar
with Java or C++, I didn't want to introduce rules that didn't apply
to this lesson... After all, the students were trying to learn Lisp,
not Pascal or Fortran.)  But you do make a good point.

> >    c.  hash tables have to worry about how to organize and retreive
> >    the values internally; however, the programmer does not have to
> >    worry about this (the code that manages this has already been
> >    written!)
> 
> so do arrays a[4]= 3 and x = a[4] do exactly what you say in point c

   I think so.  The reason I added this point is that I wanted the
students to realize that they could write their own hash-table
abstraction if they really wanted to.  In other words, there's nothing
magical or mysterious about Lisp hash tables, or any hash tables. 
It's just that if they decided to write their own, they would have to
devote a significant amount of time before their own hash-table works
anywhere near as fast and efficiently as Lisp's.

   The reason Lisp hash-table lookups work so well is that someone
already wrote the code for it.  Even if you wrote your own code for a
hash-table, you would be in a sense "re-inventing the wheel," but it's
definitely possible to do it.  Lisp will in no way prevent you from
writing and implementing your own hash-table.

> > Two things to note: a.  "gethash", unlike "make-hash-table", does
> > not contain any hyphens
> 
> I think a is silly to point out, because it is so self evident

   I disagree here.  One of the problems that instructors have in
teaching new students is that it's not always obvious to the
instructor what concept are difficult to the students (that's one
reason that having a Graduate Teaching Assistant around is a good
idea).  True, to the seasoned Lisp programmer the spellings of
different function names is trivial, but not to beginners.  A beginner
may spend an hour trying to find the bug in the program that contains
the function call to "get-hash", not realizing that "gethash" has no
hyphen in it.  A seasoned Lisp programmer would probably understand
the error message right away, but you have to understand that
beginners can't always understand the error messages of a language
that they aren't familiar with.

   In fact, I once wasted an hour trying to track down a simple error
in a simple ten-line program in another language.  I can't remember
the language or the exact error, but it had something to do with
whitespaces in an assignment.  I had written something like:

   a = 4

Only to have the program choke (but I didn't know that that was the
line at fault).  By removing each line one-by-one and retesting the
program, I eventually brought the program down to that ONE LINE of
code.  It was then that I discovered that I should have written the
line without whitespaces, like this:

   a=4

I wish my instructor had told me that I couldn't include whitespaces
on the sides of the equal sign.  That hadn't even crossed my mind
until my program was one line long.

   Likewise, a Lisp beginner might not realize that, even though
"make-hash-table" has hyphens, "gethash" does not.  And how long will
it take them to catch that mistake?  Thirty minutes?  One hour?  A
whole day?

   (I don't want to incite a flame war here, but a beginner might
think that this hyphen inconsistency means that Lisp lacks
consistency.  And if they should ever trip on this fact and spend an
hour debugging their program (by looking for bugs on lines that din't
have any), that certainly isn't going to encourage them to use Lisp.)

   And in a time when students are learning languages like C++ and
Java over Lisp, I don't want to discourage them from using Lisp.

> There were some other parts that I did not like but did not comment
> on, I did not feel I would have done a very good job with them unless
> I spent a good deal of time on them and I did not feel like making the
> effort tonight.

   I'm sorry that my write-up didn't please you much, but like I said,
I wasn't going for a technical manual, but rather a guide to help
struggling students understand a new concept.

   In my experience I have found that, in general, reference manuals
don't make good teaching guides, and teaching guides make lousy
reference manuals.  So there's a balance:  How much technical material
should I put in, and how simple and to the point should I keep the
lesson?  And since I was a Graduate Teaching Assistant, my duty was
towards my instructor-of-record and my students, so I felt obligated
to make my write-up easy to understand even if that meant referring to
C++, a language most of them already knew.  In fact, I didn't have to
compose the write-up at all (much less post it here); I just thought
it would help my students overcome an obstacle they were obviously
having trouble with.

   My instructor liked my write-up enough that he had me explain it to
the students in class before they had a chance to read it themselves. 
And as some of you know, it's very easy to make a mistake on the
white/blackboard at the front of the room.  So I'm pleased to say that
when I did make mistakes on the whiteboard teaching them about Lisp
hash-tables, most students were able to catch, point out, and correct
my mistakes.  That's conforting to an instructor, because that means
that the students are actually understanding the concept.

   ...and that's the best kind of feedback.

  -- Jean-Luc
From: Marc Spitzer
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <86lm28zjcx.fsf@bogomips.optonline.net>
·······@hotmail.com (J. Romano) writes:

> Marc Spitzer <········@optonline.net> wrote in message news:<··············@bogomips.optonline.net>...
> >
> > ·······@hotmail.com (J. Romano) writes:

> > What level was this course, freshman, sophomore, jr, sr, grad? 
> > Either way they should have seen a hash function and learned about 
> > hashing by the end of the freshman year.
> 
>    Believe it or not, it was a Senior-level course.  Knowledge of Lisp
> was a prerequisite, but most only knew as much as they had learned in
> one other class (which wasn't much).

The problem I had was not that they do not know lisp, but that they
got confused by the idea of a hash table.  So they have not completed
there first year of CS and are getting ready to graduate.

> 
> > > Or is it?  What if I wanted to write:
> > >    A[2.71] = 6;
> > > Or maybe something with an array index that is a string, like:
> > >    A["hello"] = 2;
> > 
> > I think that adding a little syntax might make things clearer
> > array:
> > a[2]=6;
> > hash table:
> > a[["hello"]]=2;
> > 
> > or something else that is not used by C/C++/JAVA to clarify
> > visualy when you are talking about an array or a hash.
> 
>    I don't think that would make it clearer.  For one thing, these
> students' main language was C++ or Java, and to use syntax that is not
> supported in either of those languages (or even Lisp) would make
> matters much more confusing.
> 

but in lisp make-has-table is just a function and it looks like other
functions, ditto for gethash.  Operator overloading is a bad idea, generally.
What does + do here said the QA guy?  We will get back to you on that said
the programmer. 

> > > Now the following line is perfectly valid:
> > >    A["pi"] = 3.14;
> > > and the expression:
> > >    float b = 2 * A["pi"];
> > > will set b equal to 6.28 .
> >
> > I think it would be more accurate to state that the print value of the
> > data type in question not the data type itself is used as the key or
> > index value. for example lets say I have a float, x = 2.0, and I use
> > it for an index in to a hash.  The internal print representation was
> > "+2.0E0" and later I use the string "2.0" to do a search for what is
> > stored at index "+2.0E0" and I do not get it even though they are the
> > same value as a float.
> 
>    I purposefully tried not to complicate things.  As I mentioned
> earlier, I was not aiming for technical completeness, but rather for
> comprehension.  It's important to note that for many of these students
> Lisp is like a foreign or alien language that many of them struggle
> with.
> 

They are Seniors, dealing with a little complexity now might help them
keep there jobs later and not much later.

> > > So what is the difference between an array and a hash table?  Here
> > > are a few: a.  a hash table will allow any index (or key) to be
> > > used, not just a positive integer
> > 
> > I would chang this to read:
> > 
> > a: A hash table will allow the string representation of any data type
> > to be used as the index(I prefer key) not just an integer.
> > 
> >   *note* Many languages allow arbitrary integer ranges to be used for
> >   array indexes, pascal and friends(ada, modula 2, delphi) comes to
> >   mind.
> 
>    That's true.  (But since many of the students were only familiar
> with Java or C++, I didn't want to introduce rules that didn't apply
> to this lesson... After all, the students were trying to learn Lisp,
> not Pascal or Fortran.)  But you do make a good point.

That is not the point.  The simple fact is that it does not have to be
this way, its an implementation choice,no more or less.  Telling them
this is how it alway works is wrong.

> 
> > >    c.  hash tables have to worry about how to organize and retreive
> > >    the values internally; however, the programmer does not have to
> > >    worry about this (the code that manages this has already been
> > >    written!)
> > 
> > so do arrays a[4]= 3 and x = a[4] do exactly what you say in point c
> 
>    I think so.  The reason I added this point is that I wanted the
> students to realize that they could write their own hash-table
> abstraction if they really wanted to.  In other words, there's nothing
> magical or mysterious about Lisp hash tables, or any hash tables. 
> It's just that if they decided to write their own, they would have to
> devote a significant amount of time before their own hash-table works
> anywhere near as fast and efficiently as Lisp's.
> 

You can write your own array functions also and the same rules apply 
to arrays, especially in CL.

>    The reason Lisp hash-table lookups work so well is that someone
> already wrote the code for it.  Even if you wrote your own code for a
> hash-table, you would be in a sense "re-inventing the wheel," but it's
> definitely possible to do it.  Lisp will in no way prevent you from
> writing and implementing your own hash-table.
> 

ditto for arrays

> > > Two things to note: a.  "gethash", unlike "make-hash-table", does
> > > not contain any hyphens
> > 
> > I think a is silly to point out, because it is so self evident
> 
>    I disagree here.  One of the problems that instructors have in
> teaching new students is that it's not always obvious to the
> instructor what concept are difficult to the students (that's one
> reason that having a Graduate Teaching Assistant around is a good
> idea).  True, to the seasoned Lisp programmer the spellings of
> different function names is trivial, but not to beginners.  A beginner
> may spend an hour trying to find the bug in the program that contains
> the function call to "get-hash", not realizing that "gethash" has no
> hyphen in it.  A seasoned Lisp programmer would probably understand
> the error message right away, but you have to understand that
> beginners can't always understand the error messages of a language
> that they aren't familiar with.
> 

First they are Seniors in college, they have 3+ years of programming
experience, right?  And iff your students cannot figure out that it is
spelled one way on the handout and book and hyperspec, I spelled it
another way and it did not work perhaps I should spell it like on the
handout and see what happens?  Or perhaps they should change their
majors?  For G_d sake they should have the training wheels mostly off
by now.  And I am in no way shape or form a seasoned lisp programmer,
but with that said I can read a manual and understand this is how we
spell it, for computer languages any way.

 
>    In fact, I once wasted an hour trying to track down a simple error
> in a simple ten-line program in another language.  I can't remember
> the language or the exact error, but it had something to do with
> whitespaces in an assignment.  I had written something like:
> 
>    a = 4
> 
> Only to have the program choke (but I didn't know that that was the
> line at fault).  By removing each line one-by-one and retesting the
> program, I eventually brought the program down to that ONE LINE of
> code.  It was then that I discovered that I should have written the
> line without whitespaces, like this:
> 
>    a=4

That looks like a unix shell gotya.  And it is part of learning any
language.  But once you found the error you knew what to look for
next time and it went from a gotya to a typo easily caught and 
corrected.  It got me also once.

> 
> I wish my instructor had told me that I couldn't include whitespaces
> on the sides of the equal sign.  That hadn't even crossed my mind
> until my program was one line long.
> 

Well that is called learning, I have done it my self( that exact one).
It is one of the reason I like reading books instead of rediscovering
them.

>    Likewise, a Lisp beginner might not realize that, even though
> "make-hash-table" has hyphens, "gethash" does not.  And how long will
> it take them to catch that mistake?  Thirty minutes?  One hour?  A
> whole day?
> 

I have done exactly that, I once spent 2-3 days trying to figure out 
how to get something to print to a file in perl(v4) when I was starting
out I swore up and down that I was doing it right and I was wrong.  I 
finally found my mistake, when I opened the file i forgot to start the 
name string with a '>' to open it for output.  I have never had trouble
finding that particular mistake again.

And in programming paying attention to details is important, if you
want things to work anyway.  Let them learn.  CL is much easier to do
this in because it sticks them in your face, thing like with-open-file
come to mind.

>    (I don't want to incite a flame war here, but a beginner might
> think that this hyphen inconsistency means that Lisp lacks
> consistency.  And if they should ever trip on this fact and spend an
> hour debugging their program (by looking for bugs on lines that din't
> have any), that certainly isn't going to encourage them to use Lisp.)
> 

Compared to C++ lisp is inconsistent?  It can even be considered
inconsistent in comparison to C++

>    And in a time when students are learning languages like C++ and
> Java over Lisp, I don't want to discourage them from using Lisp.
> 
> > There were some other parts that I did not like but did not comment
> > on, I did not feel I would have done a very good job with them unless
> > I spent a good deal of time on them and I did not feel like making the
> > effort tonight.
> 
>    I'm sorry that my write-up didn't please you much, but like I said,
> I wasn't going for a technical manual, but rather a guide to help
> struggling students understand a new concept.

It is not your job to please me so do not worry about it.  And you 
do not have to agree with me either.  

My real problem with this is that students that are on there final
yeare of there undergraduate degree and have passed the first 3 do not
know and lack the abbility to figure out what a hash table is.  
That this situation exist makes me think that you are wise to not
post from your univ account, for their reputations sake anyway.

> 
>    In my experience I have found that, in general, reference manuals
> don't make good teaching guides, and teaching guides make lousy
> reference manuals.  So there's a balance:  How much technical material
> should I put in, and how simple and to the point should I keep the
> lesson?  And since I was a Graduate Teaching Assistant, my duty was
> towards my instructor-of-record and my students, so I felt obligated
> to make my write-up easy to understand even if that meant referring to
> C++, a language most of them already knew.  In fact, I didn't have to
> compose the write-up at all (much less post it here); I just thought
> it would help my students overcome an obstacle they were obviously
> having trouble with.

In many ways lisp is very different from c, don't know c++, so relating
things to the other language will probably do more harm then good. 

> 
>    My instructor liked my write-up enough that he had me explain it to
> the students in class before they had a chance to read it themselves. 
> And as some of you know, it's very easy to make a mistake on the
> white/blackboard at the front of the room.  So I'm pleased to say that
> when I did make mistakes on the whiteboard teaching them about Lisp
> hash-tables, most students were able to catch, point out, and correct
> my mistakes.  That's conforting to an instructor, because that means
> that the students are actually understanding the concept.
> 

That is always nice

marc 

>    ...and that's the best kind of feedback.
> 
>   -- Jean-Luc
From: Joe Marshall
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <smwfl6ui.fsf@ccs.neu.edu>
·······@hotmail.com (J. Romano) writes:

> Marc Spitzer <········@optonline.net> wrote in message news:<··············@bogomips.optonline.net>...
> >
> > ·······@hotmail.com (J. Romano) writes:
> > >    In the Fall of 2001, I was a Graduate Teaching Assistant for a
> > > class at my University.  Although Lisp was a requirement for the
> > > class, few people knew it as well as they knew C++ or Java.
> > 
> > What level was this course, freshman, sophomore, jr, sr, grad? 
> > Either way they should have seen a hash function and learned about 
> > hashing by the end of the freshman year.
> 
>    Believe it or not, it was a Senior-level course.

Given that a hash-table is an implementation of a `table', this is
astounding.  The description you gave doesn't even *touch* on the
details of `hashing' and `equivalence', it could just as easily apply
to an association list, property list, binary tree, etc.

How could a college, in good faith, give a degree to someone who
doesn't understand a data structure as fundamental as a `table'?


(As for the description itself, it's fine.  I'm just horrified that a
Senior level student would find it of interest.)
From: Pascal Bourguignon
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <87r8c1jiek.fsf@thalassa.informatimago.com>
Marc Spitzer <········@optonline.net> writes:

> ·······@hotmail.com (J. Romano) writes:
> 
> > Or is it?  What if I wanted to write:
> >    A[2.71] = 6;
> > Or maybe something with an array index that is a string, like:
> >    A["hello"] = 2;
> 
> I think that adding a little syntax might make things clearer
> array:
> a[2]=6;
> hash table:
> a[["hello"]]=2;
> or something else that is not used by C/C++/JAVA to clarify
> visualy when you are talking about an array or a hash.

The point is that the map C++ class implements the [ operator, not the
[[ operator.   Its designer  wanted to  have a map  look just  like an
array.  That's the joys of operator overloading.

 
> > Is this even possible?
> > 
> > This IS possible, ever since the introduction of C++ Standard Template
> > Libraries (STL).  In these libraries, there is a class type called
> > "map" which allows a user to create a special kind of array that does
> > not have to use an integer as the array index.  The following line:
> > 
> >    map<string,float> A;
> > 
> > creates a variable A that behaves like an array, but whose array index
> > must be a string and whose values must all be floating point numbers. 
> > Now the following line is perfectly valid:
> >    A["pi"] = 3.14;
> > and the expression:
> >    float b = 2 * A["pi"];
> > will set b equal to 6.28 .
> > 
> > This explanation of the variable A is one way of looking at a hash
> > table (it is by no means the only way, however).  A hash table stores
> > a value based on a "key" which, in C++, looks like an array index. 
> > Unlike an array, however, the key can be a string, float, character,
> > integer, or even a user-defined type.
> 
> I think it would be more accurate to state that the print value of the
> data type in question not the data type itself is used as the key or
> index value. for example lets say I have a float, x = 2.0, and I use
> it for an index in to a hash.  The internal print representation was
> "+2.0E0" and later I use the string "2.0" to do a search for what is
> stored at index "+2.0E0" and I do not get it even though they are the
> same value as a float.
 
Never.  It's the data _value_, not  the data _type_ much less the data
representation as  a string that's used  as key in  hash tables, maps,
dictionaries,  whatever  the name  in  the  language you're  currently
using.

[17]> (setf map (make-hash-table :test (function equal)))
#S(HASH-TABLE EQUAL)
[18]> (setf (gethash "2.0" map) :deux)
:DEUX
[19]> (setf (gethash  2.0  map) :two)
:TWO
[20]> (gethash "2.0" map)
:DEUX ;
T
[21]> (gethash  2.0  map)
:TWO ;
T

> > So what is the difference between an array and a hash table?  Here
> > are a few: a.  a hash table will allow any index (or key) to be
> > used, not just a positive integer
> 
> I would chang this to read:
> 
> a: A hash table will allow the string representation of any data type
> to be used as the index(I prefer key) not just an integer.

No a hash tables accepts anything for a key:

[26]> (setf key '(really (any thing) #(1 2)))
(REALLY (ANY THING) #(1 2))
[27]> (setf (gethash  key  map) :complicated)
:COMPLICATED
[28]> (gethash  key  map)
:COMPLICATED ;
T

>   *note* Many languages allow arbitrary integer ranges to be used for
>   array indexes, pascal and friends(ada, modula 2, delphi) comes to
>   mind.
> 
> >    b.  since arrays have pre-allocated memory, their access time is
> >    much quicker than hash tables
> 
> You could make the argument that array access is faster because it
> does not have to hash the index string it will be faster.  But the one
s/index string/key/
> you use above makes no sense, if you were to replace 'pre-allocated'
> with 'contiguous' it would be more accurate.

Note however that the complexity of  a lookup in a hash table does not
depend on the key, it's a constant, like for an array.

> >    c.  hash tables have to worry about how to organize and retreive
> >    the values internally; however, the programmer does not have to
> >    worry about this (the code that manages this has already been
> >    written!)
> 
> so do arrays a[4]= 3 and x = a[4] do exactly what you say in point c

Yes.   The  difference  is that  hash  table  may  have to  do  memory
management (allocating memory for  a new couple) when inserting, while
the array has allocated all it's memory beforehand.



> > Two things to note: a.  "gethash", unlike "make-hash-table", does
> > not contain any hyphens
> 
> I think a is silly to point out, because it is so self evident

Not  always when you're  addressing innocent  pupils.  People  are not
used  to make the  difference between  the-president-of-the-states and
The President of The States and The President of the state.

 
> >    b.  if two values are written to the same key, the old value will
> >    be replaced (or "clobbered") by the new value.  For example, the
> >    lines: (setf (gethash "pi" a) 3.14) (setf (gethash "pi" a) 7)
> >    will set A["pi"] (that is, (gethash "pi" a) ) to 7 .
> 
> that is not always true, it can do that or it can keep the orignal
> value or it can keep a list of all values assigned to any given key.
> In CL the operation is distructive for the organic hash functions.

COMMON-LISP  replaces the  value, as  do all  the implementation  of a
similar   data   structure   I   know   of.    Could   you   point   a
language/implementation where that's not true?


> >    c.  if no value has been set for a particular key, its value will
> >    be NIL.
> 
> if no value has been set for a particular key it has no value set, you
> can still return nill with a value set to nil for a given key, read
> this: http://www.lispworks.com/reference/HyperSpec/Body/18_aa.htm
> 
> to start with.
> 
> > 
> > Well, that's about it for this file.  I hope you find it helpful!
> > 
> > ------------------------------------------------------
> 
> There were some other parts that I did not like but did not comment
> on, I did not feel I would have done a very good job with them unless
> I spent a good deal of time on them and I did not feel like making the
> effort tonight.
> 
> marc

-- 
__Pascal_Bourguignon__                   http://www.informatimago.com/
----------------------------------------------------------------------
There is a fault in reality. Do not adjust your minds. -- Salman Rushdie
From: Kalle Olavi Niemitalo
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <87adipvy44.fsf@Saastamoduuli.y2000.kon.iki.fi>
Pascal Bourguignon <···@informatimago.com> writes:

> Never.  It's the data _value_, not  the data _type_ much less the data
> representation as  a string that's used  as key in  hash tables, maps,
> dictionaries,  whatever  the name  in  the  language you're  currently
> using.

Perl coerces all hash-keys to strings, though.

  $ perl -We '%x=([1],2); %y=([3],4); print keys(%x),"\n", keys(%y),"\n"'
  ARRAY(0x120013d00)
  ARRAY(0x120013d00)

The expression [1] constructs a fresh array containing the value
1 and returns a reference to it.  When Perl uses this reference
as a hash-key, it first converts it to the string
"ARRAY(0x120013d00)" which apparently shows the memory address of
the array.  It then frees the array itself, as there are no
actual references to it; the string doesn't count.  In this run,
Perl allocated the second array at the same address.
From: Pascal Bourguignon
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <87n0mojttl.fsf@thalassa.informatimago.com>
Kalle Olavi Niemitalo <···@iki.fi> writes:

> Pascal Bourguignon <···@informatimago.com> writes:
> 
> > Never.  It's the data _value_, not  the data _type_ much less the data
> > representation as  a string that's used  as key in  hash tables, maps,
> > dictionaries,  whatever  the name  in  the  language you're  currently
> > using.
> 
> Perl coerces all hash-keys to strings, though.
> 
>   $ perl -We '%x=([1],2); %y=([3],4); print keys(%x),"\n", keys(%y),"\n"'
>   ARRAY(0x120013d00)
>   ARRAY(0x120013d00)
> 
> The expression [1] constructs a fresh array containing the value
> 1 and returns a reference to it.  When Perl uses this reference
> as a hash-key, it first converts it to the string
> "ARRAY(0x120013d00)" which apparently shows the memory address of
> the array.  It then frees the array itself, as there are no
> actual references to it; the string doesn't count.  In this run,
> Perl allocated the second array at the same address.

Perl is not a language, it's a program!
 ;-)

-- 
__Pascal_Bourguignon__                   http://www.informatimago.com/
----------------------------------------------------------------------
There is a fault in reality. Do not adjust your minds. -- Salman Rushdie
From: Lieven Marchand
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <878yy8275f.fsf@wyrd.be>
Pascal Bourguignon <···@informatimago.com> writes:

> COMMON-LISP  replaces the  value, as  do all  the implementation  of a
> similar   data   structure   I   know   of.    Could   you   point   a
> language/implementation where that's not true?

A lot of both strict and lazy functional languages. Look for Chris
Okasaki's book on them or his PhD thesis which is online.

-- 
People don't bore me. I like people. - Really? All of them? - All of them. -
Even the creepy ones? - Nobody's creepy from the inside, Hazel.  Some of them 
are sad, and some of them hurt, and some of them think they're the only real 
thing in the whole world. But they're not creepy.         --  Hazel and Death
From: Marc Spitzer
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <86ptrkzmq7.fsf@bogomips.optonline.net>
Pascal Bourguignon <···@informatimago.com> writes:

> Marc Spitzer <········@optonline.net> writes:
> 
> > ·······@hotmail.com (J. Romano) writes:
> > 
> > > Or is it?  What if I wanted to write:
> > >    A[2.71] = 6;
> > > Or maybe something with an array index that is a string, like:
> > >    A["hello"] = 2;
> > 
> > I think that adding a little syntax might make things clearer
> > array:
> > a[2]=6;
> > hash table:
> > a[["hello"]]=2;
> > or something else that is not used by C/C++/JAVA to clarify
> > visualy when you are talking about an array or a hash.
> 
> The point is that the map C++ class implements the [ operator, not the
> [[ operator.   Its designer  wanted to  have a map  look just  like an
> array.  That's the joys of operator overloading.

Well that is a stupid thing to do

> > I think it would be more accurate to state that the print value of the
> > data type in question not the data type itself is used as the key or
> > index value. for example lets say I have a float, x = 2.0, and I use
> > it for an index in to a hash.  The internal print representation was
> > "+2.0E0" and later I use the string "2.0" to do a search for what is
> > stored at index "+2.0E0" and I do not get it even though they are the
> > same value as a float.
>  
> Never.  It's the data _value_, not  the data _type_ much less the data
> representation as  a string that's used  as key in  hash tables, maps,
> dictionaries,  whatever  the name  in  the  language you're  currently
> using.
> 
> [17]> (setf map (make-hash-table :test (function equal)))
> #S(HASH-TABLE EQUAL)
> [18]> (setf (gethash "2.0" map) :deux)
> :DEUX
> [19]> (setf (gethash  2.0  map) :two)
> :TWO
> [20]> (gethash "2.0" map)
> :DEUX ;
> T
> [21]> (gethash  2.0  map)
> :TWO ;
> T

your example does not work with just (make-hash-table)

> 
> > > So what is the difference between an array and a hash table?  Here
> > > are a few: a.  a hash table will allow any index (or key) to be
> > > used, not just a positive integer
> > 
> > I would chang this to read:
> > 
> > a: A hash table will allow the string representation of any data type
> > to be used as the index(I prefer key) not just an integer.
> 
> No a hash tables accepts anything for a key:
> 
> [26]> (setf key '(really (any thing) #(1 2)))
> (REALLY (ANY THING) #(1 2))
> [27]> (setf (gethash  key  map) :complicated)
> :COMPLICATED
> [28]> (gethash  key  map)
> :COMPLICATED ;
> T
> 

most of the has tables I have used, wrote my own in pascal and used
perl, tcl, awk and C libraries always took the stringified value
of the data or as I said before the print value.  If your data
does not have a string representation you will damn well give it
one.  The default test for a hash eql operates closely to this
model:
http://www.lispworks.com/reference/HyperSpec/Body/f_eql.htm


> >   *note* Many languages allow arbitrary integer ranges to be used for
> >   array indexes, pascal and friends(ada, modula 2, delphi) comes to
> >   mind.
> > 
> > >    b.  since arrays have pre-allocated memory, their access time is
> > >    much quicker than hash tables
> > 
> > You could make the argument that array access is faster because it
> > does not have to hash the index string it will be faster.  But the one
> s/index string/key/
> > you use above makes no sense, if you were to replace 'pre-allocated'
> > with 'contiguous' it would be more accurate.
> 
> Note however that the complexity of  a lookup in a hash table does not
> depend on the key, it's a constant, like for an array.
> 

Well everything is constant for a large enough value of K and for the
hash you used above with a test function of equal it is false because
of the potentially large amount of work necessary to compare the keys
to the submitted lookup key to get a match

> > >    c.  hash tables have to worry about how to organize and retreive
> > >    the values internally; however, the programmer does not have to
> > >    worry about this (the code that manages this has already been
> > >    written!)
> > 
> > so do arrays a[4]= 3 and x = a[4] do exactly what you say in point c
> 
> Yes.   The  difference  is that  hash  table  may  have to  do  memory
> management (allocating memory for  a new couple) when inserting, while
> the array has allocated all it's memory beforehand.

That has nothing to do with what he said and is not true in CL.  I can
at run time declare an array by calling make-array with the needed
arguments.  And hash tables in CL do not need to use dynamic memory management 
to do what they need to do.  I can at hash creation preallocate all the needed 
items when I call make-hash.

> 
> 
> 
> > > Two things to note: a.  "gethash", unlike "make-hash-table", does
> > > not contain any hyphens
> > 
> > I think a is silly to point out, because it is so self evident
> 
> Not  always when you're  addressing innocent  pupils.  People  are not
> used  to make the  difference between  the-president-of-the-states and
> The President of The States and The President of the state.
> 

No, it is still stupid to point it out unless you are using it to illustrate
some rule.  No rule was pointed out so it serves no purpose.

>  
> > >    b.  if two values are written to the same key, the old value will
> > >    be replaced (or "clobbered") by the new value.  For example, the
> > >    lines: (setf (gethash "pi" a) 3.14) (setf (gethash "pi" a) 7)
> > >    will set A["pi"] (that is, (gethash "pi" a) ) to 7 .
> > 
> > that is not always true, it can do that or it can keep the orignal
> > value or it can keep a list of all values assigned to any given key.
> > In CL the operation is distructive for the organic hash functions.
> 
> COMMON-LISP  replaces the  value, as  do all  the implementation  of a
> similar   data   structure   I   know   of.    Could   you   point   a
> language/implementation where that's not true?
> 

berkelydb, works in tcl, perl, c, CL, java, c++.  It is a very nice
package for managing key value data.  And I have a problem with
someone saying this is the only way it works when it is just not so.
If it was phrased this is how CL does it fine no problem.  But to say
this is the way it works just because a given language(or set of them)
chose to implement it that way is sloppy an wrong.


marc
From: Scott McKay
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <vmKP9.157706$qF3.11644@sccrnsc04>
"Marc Spitzer" <········@optonline.net> wrote in message
···················@bogomips.optonline.net...
> Pascal Bourguignon <···@informatimago.com> writes:
>
> > Marc Spitzer <········@optonline.net> writes:
> >
> > > ·······@hotmail.com (J. Romano) writes:
> > >
> > > > Or is it?  What if I wanted to write:
> > > >    A[2.71] = 6;
> > > > Or maybe something with an array index that is a string, like:
> > > >    A["hello"] = 2;
> > >
> > > I think that adding a little syntax might make things clearer
> > > array:
> > > a[2]=6;
> > > hash table:
> > > a[["hello"]]=2;
> > > or something else that is not used by C/C++/JAVA to clarify
> > > visualy when you are talking about an array or a hash.
> >
> > The point is that the map C++ class implements the [ operator, not the
> > [[ operator.   Its designer  wanted to  have a map  look just  like an
> > array.  That's the joys of operator overloading.
>
> Well that is a stupid thing to do

Why is it stupid for element access -- [] -- to use the same
generic for tables and arrays or vectors?  If that is your claim,
I'd be interested in hearing you defend it.
From: Marc Spitzer
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <86d6nkz9pv.fsf@bogomips.optonline.net>
"Scott McKay" <···@attbi.com> writes:

> "Marc Spitzer" <········@optonline.net> wrote in message
> ···················@bogomips.optonline.net...
> > Pascal Bourguignon <···@informatimago.com> writes:
> >
> > > Marc Spitzer <········@optonline.net> writes:
> > >
> > > > ·······@hotmail.com (J. Romano) writes:
> > > >
> > > > > Or is it?  What if I wanted to write:
> > > > >    A[2.71] = 6;
> > > > > Or maybe something with an array index that is a string, like:
> > > > >    A["hello"] = 2;
> > > >
> > > > I think that adding a little syntax might make things clearer
> > > > array:
> > > > a[2]=6;
> > > > hash table:
> > > > a[["hello"]]=2;
> > > > or something else that is not used by C/C++/JAVA to clarify
> > > > visualy when you are talking about an array or a hash.
> > >
> > > The point is that the map C++ class implements the [ operator, not the
> > > [[ operator.   Its designer  wanted to  have a map  look just  like an
> > > array.  That's the joys of operator overloading.
> >
> > Well that is a stupid thing to do
> 
> Why is it stupid for element access -- [] -- to use the same
> generic for tables and arrays or vectors?  If that is your claim,
> I'd be interested in hearing you defend it.

I think in a syntax heavy language like Perl for example you should
have a special syntax for hash vs array because the access methods
are different for how you would walk them.

for example to print a ordered reading of an array:
start at low index goto high index and print each thing
and the index

you must do more work to get the same in a hash:
get all keys then sort them then for each  key get 
the value and print key and value

I guess my fundamental issues is arrays and vectors are sequence
type things and hashes are by definition not.  And since you have
all this syntax there to point out all the language features anyway
that should be one of them to be consistent with the rest of the crud.

marc
From: Scott McKay
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <Fi_P9.503190$P31.161733@rwcrnsc53>
"Marc Spitzer" <········@optonline.net> wrote in message
···················@bogomips.optonline.net...
> "Scott McKay" <···@attbi.com> writes:
>
> > "Marc Spitzer" <········@optonline.net> wrote in message
> > ···················@bogomips.optonline.net...
> > > Pascal Bourguignon <···@informatimago.com> writes:
> > >
> > > > Marc Spitzer <········@optonline.net> writes:
> > > >
> > > > > ·······@hotmail.com (J. Romano) writes:
> > > > >
> > > > > > Or is it?  What if I wanted to write:
> > > > > >    A[2.71] = 6;
> > > > > > Or maybe something with an array index that is a string, like:
> > > > > >    A["hello"] = 2;
> > > > >
> > > > > I think that adding a little syntax might make things clearer
> > > > > array:
> > > > > a[2]=6;
> > > > > hash table:
> > > > > a[["hello"]]=2;
> > > > > or something else that is not used by C/C++/JAVA to clarify
> > > > > visualy when you are talking about an array or a hash.
> > > >
> > > > The point is that the map C++ class implements the [ operator, not
the
> > > > [[ operator.   Its designer  wanted to  have a map  look just  like
an
> > > > array.  That's the joys of operator overloading.
> > >
> > > Well that is a stupid thing to do
> >
> > Why is it stupid for element access -- [] -- to use the same
> > generic for tables and arrays or vectors?  If that is your claim,
> > I'd be interested in hearing you defend it.
>
> I think in a syntax heavy language like Perl for example you should
> have a special syntax for hash vs array because the access methods
> are different for how you would walk them.
>
> for example to print a ordered reading of an array:
> start at low index goto high index and print each thing
> and the index
>
> you must do more work to get the same in a hash:
> get all keys then sort them then for each  key get
> the value and print key and value
>
> I guess my fundamental issues is arrays and vectors are sequence
> type things and hashes are by definition not.  And since you have
> all this syntax there to point out all the language features anyway
> that should be one of them to be consistent with the rest of the crud.
>

I can't speak to Perl.

In a language like Lisp, overloading the 'element' operator -- [] --
to function on all sorts of collections strikes me as eminently
reasonable.  It happens that tables are keyed by complex keys
and arrays/vectors are keyed by integers, but that seems less
relevant than the fact that all such things are collections.

It still causes me no end of frustration in Lisp that I need to use
different forms of iteration to walk over the elements of lists,
vectors, and tables, and if I define my own collection class, I
have to write yet another form of iteration (and am then forced
not to be able to use 'loop').
From: Christopher C. Stacy
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <ubs33e08c.fsf@dtpq.com>
>>>>> On Mon, 30 Dec 2002 16:15:33 GMT, Scott McKay ("Scott") writes:
 Scott> It still causes me no end of frustration in Lisp that I need to use
 Scott> different forms of iteration to walk over the elements of lists,
 Scott> vectors, and tables, and if I define my own collection class, I
 Scott> have to write yet another form of iteration (and am then forced
 Scott> not to be able to use 'loop').

The Lisp philosophy is schizophrenic in this way.
Lisp provides for "generic" functionality in places like arithmetic,
and the object system is oriented around generic function protocols.
But for primitive data types, you pick your data structures and are
then stuck with them, as in most programming languages.  You choose
lists or arrays or hashtables based on the algorithm you plan to use.
This is only somewhat ameliorated by generic "sequence" operators.
I am only slightly annoyed by the lack of a generic iteration
protocol (eg. where is DOSEQ?).  On the LOOP side of things: before
ANSI, the macro was extensible.  Anyway, mostly I don't change the kind
of sequences that I use, and am not bothered by explicitly coding for
a list versus an array.

Although I don't object to Hash[Key] and agree that is a collection,
I don't mind saying AREF versus ASSOC versus GETHASH in my code.
Maybe I like seeing that level of implementation detail when I am
reading a random fragment of code.
From: Bruce Hoult
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <bruce-DD7495.13102331122002@copper.ipg.tsnz.net>
In article <·············@dtpq.com>,
 ······@dtpq.com (Christopher C. Stacy) wrote:

> >>>>> On Mon, 30 Dec 2002 16:15:33 GMT, Scott McKay ("Scott") writes:
>  Scott> It still causes me no end of frustration in Lisp that I need to use
>  Scott> different forms of iteration to walk over the elements of lists,
>  Scott> vectors, and tables, and if I define my own collection class, I
>  Scott> have to write yet another form of iteration (and am then forced
>  Scott> not to be able to use 'loop').
> 
> The Lisp philosophy is schizophrenic in this way.
> Lisp provides for "generic" functionality in places like arithmetic,
> and the object system is oriented around generic function protocols.
> But for primitive data types, you pick your data structures and are
> then stuck with them, as in most programming languages.

What makes that part of the "Lisp philosophy", rather than just an 
accident of history?


> You choose lists or arrays or hashtables based on the algorithm you 
> plan to use. This is only somewhat ameliorated by generic "sequence" 
> operators. I am only slightly annoyed by the lack of a generic 
> iteration protocol (eg. where is DOSEQ?).  On the LOOP side of 
> things: before ANSI, the macro was extensible.  Anyway, mostly I 
> don't change the kind of sequences that I use, and am not bothered by 
> explicitly coding for a list versus an array.

Sounds like a Whorfian thing.  I very frequently change the type of 
sequence I'm using, because Dylan makes it transparent to do so.  I 
especially find myself switching from a list to a stretchy vector, or 
vice-versa, depending on the expected (or measured) average number of 
elements.


> Although I don't object to Hash[Key] and agree that is a collection,
> I don't mind saying AREF versus ASSOC versus GETHASH in my code.
> Maybe I like seeing that level of implementation detail when I am
> reading a random fragment of code.

Those things can provide a clue as to the intended usage of the 
collection, but I'd rather see the variable declared with a suitably 
mnemonic user-defined type.  The user can easily look up the declaration 
(as can the compiler, to see whether it should be doing AREF or GETHASH 
or NTH rather than a GF dispatch), and it's easy to change every use at 
a stroke.

-- Bruce
From: Christopher C. Stacy
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <uof724q52.fsf@dtpq.com>
>>>>> On Tue, 31 Dec 2002 13:10:23 +1300, Bruce Hoult ("Bruce") writes:

 Bruce> In article <·············@dtpq.com>,
 Bruce>  ······@dtpq.com (Christopher C. Stacy) wrote:

 >> >>>>> On Mon, 30 Dec 2002 16:15:33 GMT, Scott McKay ("Scott") writes:
 Scott> It still causes me no end of frustration in Lisp that I need to use
 Scott> different forms of iteration to walk over the elements of lists,
 Scott> vectors, and tables, and if I define my own collection class, I
 Scott> have to write yet another form of iteration (and am then forced
 Scott> not to be able to use 'loop').
 >> 
 >> The Lisp philosophy is schizophrenic in this way.
 >> Lisp provides for "generic" functionality in places like arithmetic,
 >> and the object system is oriented around generic function protocols.
 >> But for primitive data types, you pick your data structures and are
 >> then stuck with them, as in most programming languages.

 Bruce> What makes that part of the "Lisp philosophy", rather than just an 
 Bruce> accident of history?

I think that the philosophy is partly an accident of history, and was
even considering writing that in my paragraph; but some thought has
gone into it, and it has been fully embraced and codified into the
standard definition of the language.
From: Harald Hanche-Olsen
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <pco7kdrcl70.fsf@thoth.math.ntnu.no>
+ "Scott McKay" <···@attbi.com>:

| It still causes me no end of frustration in Lisp that I need to use
| different forms of iteration to walk over the elements of lists,
| vectors, and tables, and if I define my own collection class, I have
| to write yet another form of iteration (and am then forced not to be
| able to use 'loop').

I don't understand your parenthetical comment here.  I would have
guessed you could utilize the for-as-equals-then loop clause together
with a bit of destructuring to loop across just about any sort of
collection.  But if you provide a specific example, maybe I would get
your point.

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- Yes it works in practice - but does it work in theory?
From: Scott McKay
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <bw5Q9.564001$QZ.80555@sccrnsc02>
"Harald Hanche-Olsen" <······@math.ntnu.no> wrote in message
····················@thoth.math.ntnu.no...
> + "Scott McKay" <···@attbi.com>:
>
> | It still causes me no end of frustration in Lisp that I need to use
> | different forms of iteration to walk over the elements of lists,
> | vectors, and tables, and if I define my own collection class, I have
> | to write yet another form of iteration (and am then forced not to be
> | able to use 'loop').
>
> I don't understand your parenthetical comment here.  I would have
> guessed you could utilize the for-as-equals-then loop clause together
> with a bit of destructuring to loop across just about any sort of
> collection.  But if you provide a specific example, maybe I would get
> your point.

The following seems excessive, given that we now all understand
about generic functions and protocols:

  (loop for x in list
           for y across vector
           for z being the hash-elements of table
           ...)

Furthermore, if I implement a new sort of collection, I am screwed.
I can't use 'loop' at all, which is a real shame since 'loop' is the only
thing in Common Lisp that provides a "collecting" facility.
From: Louis Theran
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <a2503e25.0212310948.4953c08d@posting.google.com>
"Scott McKay" <···@attbi.com> wrote in message news:<·····················@sccrnsc02>...

> Furthermore, if I implement a new sort of collection, I am screwed.
> I can't use 'loop' at all, which is a real shame since 'loop' is the only
> thing in Common Lisp that provides a "collecting" facility.

You're probably aware of this, but with the version of loop that's in
at least Allegro and MCL, you can add an iteration path that makes it
similar to what Dylan has.  (Disclaimer: I've never written a Dylan
program, just looked at the manual a little.)

I wrote this a while back as part of a similar discussion, though I'm
not sure I really understand how to write my own loop iteration path
properly.  Maybe somebody will find it amusing, but it's not really
been tested.  (There's also the issue that without hash table
iterators that have indefinite extent, you can't use this trick on
hash tables...)


(defgeneric forward-iteration-protocol (thing)
  (:documentation
   "Returns a number of values that may be useful for general ~
purpose iteration: begin end this set-this next donep."))

;; example
(defmethod forward-iteration-protocol ((thing cons))
  (values thing nil
          (lambda (state col)
            (declare (ignore col))
            (car state))
          (lambda (val state  col)
            (declare (ignore col))
            (setf (car state) val))
          (lambda (state col)
            (declare (ignore col))
            (cdr state))
          (lambda (state limit col)
            (declare (ignore limit col))
            (null state))))

;; The idea is to allow
;;
;; (loop for blah as the protocol of thing)
;;

(defun iteration-protocol-path (var type prep-phrases &key which)
  (declare (ignore which))
  (with-unique-names (collection state end this set-this next donep
last-state)
    (push
     `(multiple-value-bind (,state ,end ,this ,set-this ,next ,donep)
          (forward-iteration-protocol ,collection))
     excl::*loop-wrappers*)
    (push `(flet ((,(form-symbol 'set- var '!) (x)
                 (funcall ,set-this x ,last-state ,collection))))
       excl::*loop-wrappers*)

    ;; now return the iteration stuff
    `(((,var nil ,type) (,collection ,(cadar prep-phrases))
(,last-state nil)) ;binds
      ()                                ; prologue
      (funcall ,donep ,state ,end ,collection)
      ()                                ; parallel
      ()
      (,var (funcall ,this ,state ,collection) ,last-state
            ,state ,state (funcall ,next ,state ,collection)))))

(excl::add-loop-path '(protocol)
'common-lisp-user::iteration-protocol-path excl::*loop-ansi-universe*
                                 :preposition-groups '((:of :in))
                                 :inclusive-permitted nil
                                 :user-data '(:which protocol))
From: Harald Hanche-Olsen
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <pco3coecpp2.fsf@thoth.math.ntnu.no>
+ "Scott McKay" <···@attbi.com>:

| The following seems excessive, given that we now all understand
| about generic functions and protocols:
| 
|   (loop for x in list
|            for y across vector
|            for z being the hash-elements of table
|            ...)

This I can understand and agree with, though I don't see it as a major
problem.

| Furthermore, if I implement a new sort of collection, I am screwed.
| I can't use 'loop' at all, which is a real shame since 'loop' is the
| only thing in Common Lisp that provides a "collecting" facility.

This is the part I don't get.  What you need to do is to define a
function, or method, that given a collection will return a closure
which when called repeatedly will return the members of the
collection, one by one.  Then do

  (loop for x in list
        for y across vector
        for z being the hash-elements of table
        with collection-next = (collection-iterator collection)
        for w = (funcall collection-next)
        ...)

To make the idea somewhat clearer, below is an implementation of a
tree iterator that might be used in the above scheme.  It is very
simple-minded, traverses the tree in a strange order, will not cope
with recursive structures or dotted sublists, and provides no
mechanism for detecting the end of the tree (it just returns nil
forever after the end is reached).  But these shortcomings are easily
remedied.  I offer the code only as a counterargument to your
statement that one cannot use loop to iterate over nonstandard forms
of collections.

(defun tree-iterator (tree)
  (lambda ()
    (when tree
      (let ((head (pop tree)))
	(loop
	 (cond ((consp head)
		(let ((new-head (first head)))
		  (dolist (item (rest head))
		    (push item tree))
		  (setf head new-head)))
	       (t (return head))))))))

Of course, there is always ITERATE: It has a built-in mechanism for
making your own iterators.  But iterate seems never to have caught on
for some reason.

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- Yes it works in practice - but does it work in theory?
From: Pascal Costanza
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <aus4pa$11p8$1@f1node01.rhrz.uni-bonn.de>
Harald Hanche-Olsen wrote:

> Of course, there is always ITERATE: It has a built-in mechanism for
> making your own iterators.  But iterate seems never to have caught on
> for some reason.

...and there are also Tim Bradshaw's excellent collecting macros at 
http://www.tfeb.org/lisp/hax.html#COLLECTING


Pascal

-- 
Pascal Costanza               University of Bonn
···············@web.de        Institute of Computer Science III
http://www.pascalcostanza.de  R�merstr. 164, D-53117 Bonn (Germany)
From: Lennart Staflin
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <m2isxa9qmt.fsf@saturn.local>
"Scott McKay" <···@attbi.com> writes:
> Furthermore, if I implement a new sort of collection, I am screwed.
> I can't use 'loop' at all, which is a real shame since 'loop' is the only
> thing in Common Lisp that provides a "collecting" facility.

I feel your pain and point taken :) 
But this is not entirely true. You could to something used in some
lesser languages, an iterator.

To iterate x over a collection cx you could write

    (loop for ix = (iterator cx) then (next ix) while ix
          for x  = (this ix)
          collect (something x))

assuming there is suitable functions interator, next and this. If the
collection is a list these functions would be:
        iterator = identity, next = cdr, this = car

This suggest, that if we can accept to use cons as iterator object
with the current element in car and whatever in cdr we could write:

(loop for x in (iterator cx) by #'next
      collect x)


//Lennart
From: Scott McKay
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <9WDQ9.190084$qF3.13190@sccrnsc04>
"Lennart Staflin" <·····@lysator.liu.se> wrote in message
···················@saturn.local...
> "Scott McKay" <···@attbi.com> writes:
> > Furthermore, if I implement a new sort of collection, I am screwed.
> > I can't use 'loop' at all, which is a real shame since 'loop' is the
only
> > thing in Common Lisp that provides a "collecting" facility.
>
> I feel your pain and point taken :)
> But this is not entirely true. You could to something used in some
> lesser languages, an iterator.

Yes, but you're missing the point that CL has failed to separate the
concerns
of iterating and collecting, and does not have a first-class protocol for
either,
even tho' such a protocol is trivial.

> To iterate x over a collection cx you could write
>
>     (loop for ix = (iterator cx) then (next ix) while ix
>           for x  = (this ix)
>           collect (something x))
>
> assuming there is suitable functions interator, next and this. If the
> collection is a list these functions would be:
>         iterator = identity, next = cdr, this = car

The allocation cost for this solution is prohibitive for any non-trivial
collection.  "Consing for effect" is a bete noir of mine.

> This suggest, that if we can accept to use cons as iterator object
> with the current element in car and whatever in cdr we could write:
>
> (loop for x in (iterator cx) by #'next
>       collect x)
>
>
> //Lennart
From: Carl Shapiro
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <ouyk7hqnnu2.fsf@panix3.panix.com>
"Scott McKay" <···@attbi.com> writes:

> It still causes me no end of frustration in Lisp that I need to use
> different forms of iteration to walk over the elements of lists,
> vectors, and tables, and if I define my own collection class, I
> have to write yet another form of iteration (and am then forced
> not to be able to use 'loop').

Have you considered writing a specialized loop path for the collection
types loop is unable to traverse?
From: Gabe Garza
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <87u1gwwa8k.fsf@ix.netcom.com>
"Scott McKay" <···@attbi.com> writes:
> > > The point is that the map C++ class implements the [ operator, not the
> > > [[ operator.   Its designer  wanted to  have a map  look just  like an
> > > array.  That's the joys of operator overloading.
> >
> > Well that is a stupid thing to do
> 
> Why is it stupid for element access -- [] -- to use the same
> generic for tables and arrays or vectors?  If that is your claim,
> I'd be interested in hearing you defend it.

In my inexperienced opinion, overloading [] in C++ is bad because, in
addition to whatever high level, abstract reference-type tasks you
assign it, it still has the very low-level task from C: evaluate the
index, add it to the base, and "return" the "object" at that address.

It may be just me, but when I see foo[bar] = 3 (in C/C++) I think of
it has *(foo + bar) = 3. Overloading [] to manipulate hash tables
seems a bit like overloading DPB to manipulate class instances...

Gabe Garza
From: Pascal Bourguignon
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <87vg1cezly.fsf@thalassa.informatimago.com>
Gabe Garza <·······@ix.netcom.com> writes:

> "Scott McKay" <···@attbi.com> writes:
> > > > The point is that the map C++ class implements the [ operator, not the
> > > > [[ operator.   Its designer  wanted to  have a map  look just  like an
> > > > array.  That's the joys of operator overloading.
> > >
> > > Well that is a stupid thing to do
> > 
> > Why is it stupid for element access -- [] -- to use the same
> > generic for tables and arrays or vectors?  If that is your claim,
> > I'd be interested in hearing you defend it.
> 
> In my inexperienced opinion, overloading [] in C++ is bad because, in
> addition to whatever high level, abstract reference-type tasks you
> assign it, it still has the very low-level task from C: evaluate the
> index, add it to the base, and "return" the "object" at that address.
> 
> It may be just me, but when I see foo[bar] = 3 (in C/C++) I think of
> it has *(foo + bar) = 3. Overloading [] to manipulate hash tables
> seems a bit like overloading DPB to manipulate class instances...
> 
> Gabe Garza

This  is a  problem  of  C/C++, not  of  overloading.  Actually,  it's
bar[foo]=4.

  int foo[10]; 0[foo]=1; 1[foo]=2; 2[foo]=0[foo]+1[foo]; foo[2[foo]]=3;
  int bar=1;  /* foo[bar]==bar[foo]==*(foo+bar)==*(bar+foo) */





-- 
__Pascal_Bourguignon__                   http://www.informatimago.com/
----------------------------------------------------------------------
There is a fault in reality. Do not adjust your minds. -- Salman Rushdie
From: Gareth McCaughan
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <slrnb1203p.11v7.Gareth.McCaughan@g.local>
Gabe Garza wrote:

>  In my inexperienced opinion, overloading [] in C++ is bad because, in
>  addition to whatever high level, abstract reference-type tasks you
>  assign it, it still has the very low-level task from C: evaluate the
>  index, add it to the base, and "return" the "object" at that address.
>  
>  It may be just me, but when I see foo[bar] = 3 (in C/C++) I think of
>  it has *(foo + bar) = 3. Overloading [] to manipulate hash tables
>  seems a bit like overloading DPB to manipulate class instances...

You're at odds with the people who wrote the C++ standard.
The STL associative containers let you use [] for element
reference.

It's not clear to me whether operator overloading is a good
thing, but *if* it is then it seems to me that this is exactly
the sort of thing it should be used for.

But, um, why are we having this long thread about C++ style
in comp.lang.lisp, again? Is Lisp dead or something? :-)

-- 
Gareth McCaughan  ················@pobox.com
.sig under construc
From: Joe Marshall
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <65tbl3zm.fsf@ccs.neu.edu>
"Scott McKay" <···@attbi.com> writes:

> "Marc Spitzer" <········@optonline.net> wrote in message
> ···················@bogomips.optonline.net...
> > Pascal Bourguignon <···@informatimago.com> writes:
> > >
> > > The point is that the map C++ class implements the [ operator, not the
> > > [[ operator.   Its designer  wanted to  have a map  look just  like an
> > > array.  That's the joys of operator overloading.
> >
> > Well that is a stupid thing to do
> 
> Why is it stupid for element access -- [] -- to use the same
> generic for tables and arrays or vectors?  

Because element access (be it a table, array, or vector) is a function
from the appropriate key (be it string, symbol, or integer) to a
value.

You *should* be overloading the () operator.
From: Jochen Schmidt
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <auo0g9$rst$07$1@news.t-online.com>
Marc Spitzer wrote:


>> The point is that the map C++ class implements the [ operator, not the
>> [[ operator.   Its designer  wanted to  have a map  look just  like an
>> array.  That's the joys of operator overloading.
> 
> Well that is a stupid thing to do

Really?
You can imagine arrays as a special form of tables optimized for packed 
integer keys. One could think of a type "table" which would be the union of 
for example lists, arrays, hash-tables and so on. If you think that "table" 
reminds to much on tables in a relational database - those can be thought 
to be of the same kind: A mapping of a tuple of keys to a tuple of values.

ciao,
Jochen

-- 
http://www.dataheaven.de
From: Pascal Bourguignon
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <87isxcjs0y.fsf@thalassa.informatimago.com>
Marc Spitzer <········@optonline.net> writes:

> Pascal Bourguignon <···@informatimago.com> writes:
> > The point is that the map C++ class implements the [ operator, not the
> > [[ operator.   Its designer  wanted to  have a map  look just  like an
> > array.  That's the joys of operator overloading.
> 
> Well that is a stupid thing to do

May be.

But I  don't feel it's  more stupid than  to overload the  + operator,
defining it both for: N={x|x={} or (Ey  y in N and x={y})} and for the
equivalence class of infinite converging series of rationals (yes, the
one called R).  Note that there  is a bijection between N and a subset
R  which justify  the "extension"  of +  from N  to R  (or  merely the
projection of + of R onto N).  Similarly, there is a bijection between
the  C++ arrays  (which  are  maps of  natural  numbers onto  whatever
element type  declared for  a given  array) and a  subset of  the hash
table maps,  therefore the  [] operator  in the set  of C++  array can
easily  be extended  to  the C++  map  objects, and  []  in arrays  be
considered just  like the projection  of the []  in the maps  onto the
arrays.

Now,  overloading []  onto  the  set of  Apples,  taking Elephants  as
indicies  and  returning  frequencies,  yes,  that  would  be  stupid.
Perhaps.

 
> > > I think it would be more accurate to state that the print value of the
> > > data type in question not the data type itself is used as the key or
> > > index value. for example lets say I have a float, x = 2.0, and I use
> > > it for an index in to a hash.  The internal print representation was
> > > "+2.0E0" and later I use the string "2.0" to do a search for what is
> > > stored at index "+2.0E0" and I do not get it even though they are the
> > > same value as a float.
> >  
> > Never.  It's the data _value_, not  the data _type_ much less the data
> > representation as  a string that's used  as key in  hash tables, maps,
> > dictionaries,  whatever  the name  in  the  language you're  currently
> > using.
> > 
> > [17]> (setf map (make-hash-table :test (function equal)))
> > #S(HASH-TABLE EQUAL)
> > [18]> (setf (gethash "2.0" map) :deux)
> > :DEUX
> > [19]> (setf (gethash  2.0  map) :two)
> > :TWO
> > [20]> (gethash "2.0" map)
> > :DEUX ;
> > T
> > [21]> (gethash  2.0  map)
> > :TWO ;
> > T
> 
> your example does not work with just (make-hash-table)

For  a  different reason.  You're  nit-picking.   Here  is my  example
working with just (make-hash-table):

(setf k1 "2.0" k2 2.0 map (make-hash-table))
(setf (gethash k1 map) :deux)
(setf (gethash k2 map) :two)
(gethash k1 map)
(gethash k2 map)


[1]> (setf k1 "2.0" k2 2.0 map (make-hash-table))
#S(HASH-TABLE EQL)
[2]> (setf (gethash k1 map) :deux)
:DEUX
[3]> (setf (gethash k2 map) :two)
:TWO
[4]> (gethash k1 map)
:DEUX ;
T
[5]> (gethash k2 map)
:TWO ;
T

 

> > > > So what is the difference between an array and a hash table?  Here
> > > > are a few: a.  a hash table will allow any index (or key) to be
> > > > used, not just a positive integer
> > > 
> > > I would chang this to read:
> > > 
> > > a: A hash table will allow the string representation of any data type
> > > to be used as the index(I prefer key) not just an integer.
> > 
> > No a hash tables accepts anything for a key:
> > 
> > [26]> (setf key '(really (any thing) #(1 2)))
> > (REALLY (ANY THING) #(1 2))
> > [27]> (setf (gethash  key  map) :complicated)
> > :COMPLICATED
> > [28]> (gethash  key  map)
> > :COMPLICATED ;
> > T
> > 
> 
> most of the has tables I have used, wrote my own in pascal and used
> perl, tcl, awk and C libraries always took the stringified value
> of the data or as I said before the print value.  If your data
> does not have a string representation you will damn well give it
> one.  

Because of  a defect in these  languages: it's awkward  to implement a
hask  value function  for  all  types, so  it's  only implemented  for
strings.  In high level  languages, (I can't even say object-oriented,
could  we say  that lisp  is object-oriented?),  you have  or  you can
implement   easily  a   hash  function   over  all   values  therefore
implementing correct hash tables is trivial.


> The default test for a hash eql operates closely to this
> model:
> http://www.lispworks.com/reference/HyperSpec/Body/f_eql.htm

Not at all. See my example above.
 
 
> > Note however that the complexity of  a lookup in a hash table does not
> > depend on the key, it's a constant, like for an array.
> > 
> 
> Well everything is constant for a large enough value of K and for the
> hash you used above with a test function of equal it is false because
> of the potentially large amount of work necessary to compare the keys
> to the submitted lookup key to get a match

So you don't know what a hash table is and why it's called a "hash" table.


> > > >    c.  hash tables have to worry about how to organize and retreive
> > > >    the values internally; however, the programmer does not have to
> > > >    worry about this (the code that manages this has already been
> > > >    written!)
> > > 
> > > so do arrays a[4]= 3 and x = a[4] do exactly what you say in point c
> > 
> > Yes.   The  difference  is that  hash  table  may  have to  do  memory
> > management (allocating memory for  a new couple) when inserting, while
> > the array has allocated all it's memory beforehand.
> 
> That has nothing to do with what he said and is not true in CL.  I
> can at run time declare an array by calling make-array with the
> needed arguments.  And hash tables in CL do not need to use dynamic
> memory management  to do what they need to do.  I can at hash
> creation preallocate all the needed  items when I call make-hash.

Hash table implementations  may pre-allocate a few or  many cells, but
since  they have provisions  to accomodate  any number  of (key,value)
couples, sooner or later they will have to allocate some more memory.


> > > > Two things to note: a.  "gethash", unlike "make-hash-table", does
> > > > not contain any hyphens
> > > 
> > > I think a is silly to point out, because it is so self evident
> > 
> > Not  always when you're  addressing innocent  pupils.  People  are not
> > used  to make the  difference between  the-president-of-the-states and
> > The President of The States and The President of the state.
> 
> No, it is still stupid to point it out unless you are using it to illustrate
> some rule.  No rule was pointed out so it serves no purpose.

Clearly, this  "document" was elaborated  for a known public  in given
circumstances.   It was not  originally a  reference document  for the
general public.


> > > >    b.  if two values are written to the same key, the old value will
> > > >    be replaced (or "clobbered") by the new value.  For example, the
> > > >    lines: (setf (gethash "pi" a) 3.14) (setf (gethash "pi" a) 7)
> > > >    will set A["pi"] (that is, (gethash "pi" a) ) to 7 .
> > >
> > > that is not always true, it can do that or it can keep the orignal
> > > value or it can keep a list of all values assigned to any given key.
> > > In CL the operation is distructive for the organic hash functions.
> > 
> > COMMON-LISP  replaces the  value, as  do all  the implementation  of a
> > similar   data   structure   I   know   of.    Could   you   point   a
> > language/implementation where that's not true?
> > 
> 
> berkelydb, works in tcl, perl, c, CL, java, c++.  It is a very nice
> package for managing key value data.  And I have a problem with
> someone saying this is the only way it works when it is just not so.
> If it was phrased this is how CL does it fine no problem.  But to say
> this is the way it works just because a given language(or set of them)
> chose to implement it that way is sloppy an wrong.
> 
> 
> marc

Perhaps 'berkelydb' is more database oriented, which would explain its
reticence  to destroy  an existing  association?  Anyway  the document
explicitely  spoke of  the  hash tables  as  specified in  COMMON-LISP
comparing them to  the hash table as implemented in  the map C++ class
template.   There  is  no   absolute  rule,  no  absolute  universally
recognized hash table abstrat type,  but you can't reproach a specific
document to speak only about what it purports.




-- 
__Pascal_Bourguignon__                   http://www.informatimago.com/
----------------------------------------------------------------------
There is a fault in reality. Do not adjust your minds. -- Salman Rushdie
From: Marius Bernklev
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <3cfk7hs73wh.fsf@sex.ifi.uio.no>
Pascal Bourguignon <···@informatimago.com> writes:

> Now, overloading [] onto the set of Apples, taking Elephants as
> indicies and returning frequencies, yes, that would be stupid.
> Perhaps.

Before overloading, [] had the following semantics:

    Retrieve an element from a set in constant time (mod cache/OS)

That's not what a hash table does.  Even if you _want_ to get
frequencies from elephants, it's still not the same.



Marius
-- 
<URL: http://www.stud.ifi.uio.no/~mariube/ >

Life sucks.  Death swallows.
From: Tom Lord
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <v0v6p9lsilaf98@corp.supernews.com>
        Before overloading, [] had the following semantics:

            Retrieve an element from a set in constant time (mod cache/OS)

	That's not what a hash table does.  Even if you _want_ to get
	frequencies from elephants, it's still not the same.


[] had that semantic in C.  It never had that semantic in C++ (unless
maybe _very_ early on).

In C++, [] has a semantic something like:

	Invoke an lvalue-producing binary function or built-in.

and a "usage recommendation" that is something like:

	Use [] overloading for functions that access (creating
        if necessary) entries in an associative data structure.

Hash-table look-up/enter is right in keeping with that.

Operator overloading in C++ let's people select syntax that reflects
the algebraic structure of their functions.  (It also permits
abominations, but bad programming is possible in any language (and, to
respect Erik, more likely in some languages than others).)

If you like `setf' or generics, and adopt the premise that in-line,
punctuation-heavy syntaxes are desirable, then C++-style overloading
can't be _that_ bad.

That said, C++ overloading isn't quite right, even you are into
in-line, punctuation-heavy syntaxes.  For example, rather
than requiring [] to always return lvalues, it would be better to
separately overload lvalue and rvalue uses of [], so that the 
[] implementations would not always have to prepare for a possible
assignment to the given key.

(No, I don't much like C++.  Yes, I too am appalled that senior
 CS majors need to be taught what a hash table is.  On the other 
 hand, it's often interesting to read from-the-field reports 
 about the needs of effective, mass-audience pedagogy.)

-t
From: Marius Bernklev
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <3cfel80w94q.fsf@sex.ifi.uio.no>
····@emf.emf.net (Tom Lord) writes:

>         Before overloading, [] had the following semantics:
>
>             Retrieve an element from a set in constant time (mod
>             cache/OS)

> [] had that semantic in C.  It never had that semantic in C++
> (unless maybe _very_ early on).

my bad.  Let me correct 'overloading' to 'applied to anything but
arrays'.  Even for C++, it is, IMO, ugly and confusing.  (My vote
doesn't really count, since I left that world some time ago.)

> Operator overloading in C++ let's people select syntax that reflects
> the algebraic structure of their functions.

Yes it does.  Yet, in my experience, people seem to have distanced
themselves from C++ by the time they know what 'algebraic structure'
is.  Then again, most of the C++ coders I've met have never taken a
course in algebra since high school.


Marius
-- 
<URL: http://www.stud.ifi.uio.no/~mariube/ >

Life sucks.  Death swallows.
From: Marc Spitzer
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <868yy8yyke.fsf@bogomips.optonline.net>
····@emf.emf.net (Tom Lord) writes:

> 
> (No, I don't much like C++.  Yes, I too am appalled that senior
>  CS majors need to be taught what a hash table is.  On the other 
>  hand, it's often interesting to read from-the-field reports 
>  about the needs of effective, mass-audience pedagogy.)
> 
> -t

I know it appears that more and more colleges are turning into 4 or
more years of 13th grade.  I adjuncted for a school (one course only)
that did not make a stab at 12th grade.  The were appalled that I was
perfectly willing to fail people who did not meet standards, their
standards, for the class.

marc
From: Pascal Bourguignon
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <87znqoezsz.fsf@thalassa.informatimago.com>
Marius Bernklev <············@ifi.uio.no> writes:

> Pascal Bourguignon <···@informatimago.com> writes:
> 
> > Now, overloading [] onto the set of Apples, taking Elephants as
> > indicies and returning frequencies, yes, that would be stupid.
> > Perhaps.
> 
> Before overloading, [] had the following semantics:
> 
>     Retrieve an element from a set in constant time (mod cache/OS)

That's exactly what a hash tables does too. On average.

 
> That's not what a hash table does.  Even if you _want_ to get
> frequencies from elephants, it's still not the same.

-- 
__Pascal_Bourguignon__                   http://www.informatimago.com/
----------------------------------------------------------------------
There is a fault in reality. Do not adjust your minds. -- Salman Rushdie
From: Hannah Schroeter
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <auve9q$si8$2@c3po.schlund.de>
Hello!

Marc Spitzer  <········@optonline.net> wrote:
>[...]

>I think that adding a little syntax might make things clearer
>array:
>a[2]=6;
>hash table:
>a[["hello"]]=2;

>or something else that is not used by C/C++/JAVA to clarify
>visualy when you are talking about an array or a hash.

But then, it's not C++.

>[...]

>I think it would be more accurate to state that the print value of the
>data type in question not the data type itself is used as the key or
>index value. for example lets say I have a float, x = 2.0, and I use
>it for an index in to a hash.  The internal print representation was
>"+2.0E0" and later I use the string "2.0" to do a search for what is
>stored at index "+2.0E0" and I do not get it even though they are the
>same value as a float.

Not in C++'s map.

>[...]

>a: A hash table will allow the string representation of any data type
>to be used as the index(I prefer key) not just an integer.

Not in C++ or Lisp. In both, the value itself is used, according
to the less-then-predicate (C++'s map) or equivalence test used (Lisp).

>[...]

>simple printing in C++ is ugly, how bad does complex printing get?

You can of course also use printf(), which is *a bit* like Lisp's
format, but less powerful.

>[...]

>>    b.  if two values are written to the same key, the old value will
>>    be replaced (or "clobbered") by the new value.  For example, the
>>    lines: (setf (gethash "pi" a) 3.14) (setf (gethash "pi" a) 7)
>>    will set A["pi"] (that is, (gethash "pi" a) ) to 7 .

>that is not always true, it can do that or it can keep the orignal
>value or it can keep a list of all values assigned to any given key.
>In CL the operation is distructive for the organic hash functions.

If you (setf (gethash foo bar) baz), the value will be overwritten
and (eql (gethash foo bar) baz) will be true.

Of course, you can use different modifications *yourself*, but that's
no property of the Lisp hash table proper.

>[...]

Kind regards,

Hannah.
From: Marc Spitzer
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <86wuloty6e.fsf@bogomips.optonline.net>
······@schlund.de (Hannah Schroeter) writes:

> Hello!
> 
> Marc Spitzer  <········@optonline.net> wrote:
> >[...]
> 
> >I think that adding a little syntax might make things clearer
> >array:
> >a[2]=6;
> >hash table:
> >a[["hello"]]=2;
> 
> >or something else that is not used by C/C++/JAVA to clarify
> >visualy when you are talking about an array or a hash.
> 
> But then, it's not C++.

true

> 
> >[...]
> 
> >I think it would be more accurate to state that the print value of the
> >data type in question not the data type itself is used as the key or
> >index value. for example lets say I have a float, x = 2.0, and I use
> >it for an index in to a hash.  The internal print representation was
> >"+2.0E0" and later I use the string "2.0" to do a search for what is
> >stored at index "+2.0E0" and I do not get it even though they are the
> >same value as a float.
> 
> Not in C++'s map.

I was talking about hashes in general.  From my experience with Perl,
TLC, AWK and from what I remember from school.  I have no problem with
someone saying this is how it is done in lisp, C++ or whatever.  What
is the problem is when because it is done in A and B this way making a
general statement that this is how it is done period, it is wrong and
lazy or ignorant.

Lisp is a bit of a special case with hashes because of the way it
handles and accesses memory, ie the eq test before any other.
Most/all of the other languages I have some knowledge of do not have
the infrastructure to allow this to happen so transparently.  
  
> 
> >[...]
> 
> >a: A hash table will allow the string representation of any data type
> >to be used as the index(I prefer key) not just an integer.
> 
> Not in C++ or Lisp. In both, the value itself is used, according
> to the less-then-predicate (C++'s map) or equivalence test used (Lisp).
> 
> >[...]
> 
> >simple printing in C++ is ugly, how bad does complex printing get?
> 
> You can of course also use printf(), which is *a bit* like Lisp's
> format, but less powerful.

But then it is not C++.

> 
> >[...]
> 
> >>    b.  if two values are written to the same key, the old value will
> >>    be replaced (or "clobbered") by the new value.  For example, the
> >>    lines: (setf (gethash "pi" a) 3.14) (setf (gethash "pi" a) 7)
> >>    will set A["pi"] (that is, (gethash "pi" a) ) to 7 .
> 
> >that is not always true, it can do that or it can keep the orignal
> >value or it can keep a list of all values assigned to any given key.
> >In CL the operation is distructive for the organic hash functions.
> 
> If you (setf (gethash foo bar) baz), the value will be overwritten
> and (eql (gethash foo bar) baz) will be true.
> 

I said that that is how it works in CL. but that is not how it always
works everywhere.  

> Of course, you can use different modifications *yourself*, but that's
> no property of the Lisp hash table proper.
> 

But if I can do it it is a property of hash tables in general.  Look
at it this way, the CL hash table is a proper subset of hash tables
in general and saying that CL hash table is the set of options
available to any hash table is wrong and needs to be fixed.

marc
From: Hannah Schroeter
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <b1e3h2$om3$5@c3po.schlund.de>
Hello!

[old posting]

Marc Spitzer  <········@optonline.net> wrote:
>······@schlund.de (Hannah Schroeter) writes:

>[...]

>> >[...]

>> >I think it would be more accurate to state that the print value of the
>> >data type in question not the data type itself is used as the key or
>> >index value. for example lets say I have a float, x = 2.0, and I use
>> >it for an index in to a hash.  The internal print representation was
>> >"+2.0E0" and later I use the string "2.0" to do a search for what is
>> >stored at index "+2.0E0" and I do not get it even though they are the
>> >same value as a float.

>> Not in C++'s map.

>I was talking about hashes in general.  From my experience with Perl,
>TLC, AWK and from what I remember from school.

Okay. Hashes in general are defined thus that the key type is just
some abstract type with a hash function (key type --> some integer
subrange) and an equality comparison (key type * key type --> boolean).

Mapping all those keys to strings is definitely NOT a feature
of "hashes in general" but of a *few* particular scripting languages
and their use of hashes.

> I have no problem with
>someone saying this is how it is done in lisp, C++ or whatever.  What
>is the problem is when because it is done in A and B this way making a
>general statement that this is how it is done period, it is wrong and
>lazy or ignorant.

Okay. And that's what you have done: As awk and perl map the
hash keys to strings, you have just generalized that incorrectly.

>Lisp is a bit of a special case with hashes because of the way it
>handles and accesses memory, ie the eq test before any other.
>Most/all of the other languages I have some knowledge of do not have
>the infrastructure to allow this to happen so transparently.  

Hey? In about all programming languages with a real type system
(regardless whether run-time or compile-time or both), the hash
keys *keep* their type, regardless whether that's some integer type,
some pointer type, some structured type or, just accidentally,
string, or even some union type.

>[...]

>> >simple printing in C++ is ugly, how bad does complex printing get?

>> You can of course also use printf(), which is *a bit* like Lisp's
>> format, but less powerful.

>But then it is not C++.

Yes, it is. Cf. page 666 (section 27.8.2) of the ISO C++ standard.
Table 94, there, says:

Header <cstdio> synopsis:

[...]

Functions: [...] printf [...]

>[...]

Kind regards,

Hannah.
From: Hannah Schroeter
Subject: Re: Understanding Lisp Hash-Tables (a little better)
Date: 
Message-ID: <auve0m$si8$1@c3po.schlund.de>
Hello!

J. Romano <·······@hotmail.com> wrote:
>[...]

>   map<string,float> A;

"std::"!

>[...]

>This explanation of the variable A is one way of looking at a hash
>table (it is by no means the only way, however).  A hash table stores
>a value based on a "key" which, in C++, looks like an array index. 
>Unlike an array, however, the key can be a string, float, character,
>integer, or even a user-defined type.

In fact, you've rather defined the term "associative array", or even
better "finite map".

In fact, C++'s std::map template is *not* a hash table, but a search
tree (this is about the only reasonable implementation technique,
given the requirements of the standard: it requires a less-than
comparison operator for the key type, and O(log N) performance for
insert/delete/lookup).

>So what is the difference between an array and a hash table?  Here are
>a few:
>   a.  a hash table will allow any index (or key) to
>       be used, not just a positive integer

Not any, but key types must have a hash function plus an equivalence
relation.

>[...]

>   (setf a  (make-hash-table :test #'equal) )

Or (let ((a (make-hash-table)))
     ...body...)

>[...]

>   c.  if no value has been set for a particular key,
>       its value will be NIL.

In fact, the primary value of (gethash foo bar) is nil in that
case, but only if you haven't given the optional argument to
gethash!

And there's the second return value of gethash which allows you
to distinguish a present entry whose value happens to be eql the
default value from a key which is not present.

Kind regards,

Hannah.