Why can't a hash table be read/written as #S(HASH-TABLE ...)?

From: Joerg-Cyril Hoehle
Subject: Why can't a hash table be read/written as #S(HASH-TABLE ...)?
Date: Thu, 08 Apr 1999 00:00:00 +0000
Message-ID: <qkpzp4jns6n.fsf@tzd.dont.telekom.spam.de.me>

Hi,

sorry to bring this old topic again, but I'd like to understand why
there can't be a #S(HASH-TABLE ...) representation of hash tables.

To me, a hash table is an association from keys to values, that is, it
could even be implemented internally as a-list or p-list and searched
linearly, using the appropriate test function (wasn't Genera recently
reported here as dynamically switching internal representations?).

A-lists and p-lists have an external representation (as far as the
keys and values have one), so why can't hash tables?

Is that a matter of *print-readably* which is well-defined for conses?
Or object identity?

Kent M. Pitman wrote on the topic of "Re: Constants and DEFCONSTANT"
>It's better
>for the language to make it clear that you can't save a complex object
>at all and expect to get it back unless you write the code to do it because
>it would be lying to you if it declared otherwise.
A hash table may be a complex object, but I'm not sure this thought
from a completely different thread applies here equally.

Thanks for any enlightenment,
	Jorg Hohle
Telekom Research Center -- SW-Reliability

Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)? Tim Bradshaw
- Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)? Erik Naggum
  - Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)? Barry Margolin
    - Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)? Tim Bradshaw
  - Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)? Kent M Pitman
    - Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)? Erik Naggum
- Serialization (WAS Why can't a hash table be read/written as #S(HASH-TABLE ...)) Dobes Vandermeer
  - Re: Serialization (WAS Why can't a hash table be read/written as #S(HASH-TABLE ...)) Tim Bradshaw
  - Re: Serialization (WAS Why can't a hash table be read/written as #S(HASH-TABLE ...)) Vassil Nikolov
    - Re: Serialization (WAS Why can't a hash table be read/written as #S(HASH-TABLE ...)) Barry Margolin
      - Re: Serialization (WAS Why can't a hash table be read/written as #S(HASH-TABLE ...)) Vassil Nikolov
        Re: Serialization (WAS Why can't a hash table be read/written as #S(HASH-TABLE ...)) Barry Margolin
        Re: Serialization (WAS Why can't a hash table be read/written as #S(HASH-TABLE ...)) Vassil Nikolov
    - Re: Serialization (WAS Why can't a hash table be read/written as #S(HASH-TABLE ...)) Kent M Pitman
- Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)? Kent M Pitman
  - Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)? Vassil Nikolov
    - Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)? Vassil Nikolov
      - Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)? Kent M Pitman
        Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)? Vassil Nikolov
    - Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)? Barry Margolin
    - What's a Hollerith? (was Re: Why can't a hash table be...) His Holiness the Reverend Doktor Xenophon Fenderson, the Carbon(d)ated
      - Re: What's a Hollerith? (was Re: Why can't a hash table be...) Frank A. Adrian
        Re: What's a Hollerith? (was Re: Why can't a hash table be...) Vassil Nikolov
        Re: What's a Hollerith? (was Re: Why can't a hash table be...) Frank A. Adrian
        Re: What's a Hollerith? (was Re: Why can't a hash table be...) Vassil Nikolov
        Re: What's a Hollerith? (was Re: Why can't a hash table be...) Frank A. Adrian
      - Re: What's a Hollerith? (was Re: Why can't a hash table be...) Erik Naggum
  - Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)? Tim Bradshaw

From: Tim Bradshaw
Subject: Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)?
Date: Fri, 09 Apr 1999 00:00:00 +0000
Message-ID: <ey3k8vm3b1p.fsf@lostwithiel.tfeb.org>

* Joerg-Cyril Hoehle wrote:
> Hi,
> sorry to bring this old topic again, but I'd like to understand why
> there can't be a #S(HASH-TABLE ...) representation of hash tables.

I don't think it would be reasonable for it to be #S(HASH-TABLE ...)
because that commits you to either representing hashtables as
structures, or special-casing #S, neither of which is very nice.  So
you'd need another read macro.

I think the other argument against it is that you might not want to
commit implementations to saying what information there is about a
hashtable.  As well as the keys & values you might want other kinds of
information -- what is the test function (and what if an
implementation allows test functions other than the standard set --
now you need some way of printing arbitrary functions perhaps)?  How
big is it, how fast does it grow?

Finally, if you really want a read syntax for whatever purpose it's
not *that* hard to invent one.

--tim

From: Erik Naggum
Subject: Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)?
Date: Fri, 09 Apr 1999 00:00:00 +0000
Message-ID: <3132642870691300@naggum.no>

* Tim Bradshaw <···@tfeb.org>
| I think the other argument against it is that you might not want to
| commit implementations to saying what information there is about a
| hashtable.  As well as the keys & values you might want other kinds of
| information -- what is the test function (and what if an implementation
| allows test functions other than the standard set -- now you need some
| way of printing arbitrary functions perhaps)?  How big is it, how fast
| does it grow?

  I don't see a problem with #.(make-hash-table ...) except it would have
  been nice if one could "prime" a hash table in the MAKE-HASH-TABLE form,
  as in keyword arguments named INITIAL-CONTENTS (like MAKE-ARRAY) and/or
  INITIAL-KEYS and INITIAL-VALUES.  the lack of such an argument is a
  liability, not just when loading saved hash tables, but when building
  them the first time.  if we had a standard list notation for hash table
  initial contents (as we have for arrays), it would be trivial to create a
  print syntax for hash-tables, too.  so let's attack the basic problem.

#:Erik
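
A minimal sketch of the kind of "priming" described above, written as an
ordinary function because standard MAKE-HASH-TABLE accepts no such
argument.  The name MAKE-HASH-TABLE-WITH-CONTENTS and the choice of an
a-list for the initial contents are assumptions made for illustration;
nothing in this thread fixes either.

```lisp
;; Hypothetical helper: not part of Common Lisp.
;; INITIAL-CONTENTS is assumed to be an a-list of (key . value) pairs;
;; OPTIONS are passed unchanged to MAKE-HASH-TABLE.
(defun make-hash-table-with-contents (initial-contents &rest options)
  (let ((table (apply #'make-hash-table options)))
    (loop for (key . value) in initial-contents
          do (setf (gethash key table) value))
    table))

;; Usage: an EQUAL table primed with two string keys.
(defparameter *numbers*
  (make-hash-table-with-contents '(("one" . 1) ("two" . 2))
                                 :test 'equal))
```

This is no more than MAKE-HASH-TABLE followed by SETF of GETHASH; an
implementation could presumably do better on memory management, but
the observable result should be the same.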

From: Barry Margolin
Subject: Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)?
Date: Fri, 09 Apr 1999 00:00:00 +0000
Message-ID: <NuqP2.445$kM2.48705@burlma1-snr2>

In article <················@naggum.no>, Erik Naggum  <····@naggum.no> wrote:
>  I don't see a problem with #.(make-hash-table ...) except it would have
>  been nice if one could "prime" a hash table in the MAKE-HASH-TABLE form,
>  as in keyword arguments named INITIAL-CONTENTS (like MAKE-ARRAY) and/or
>  INITIAL-KEYS and INITIAL-VALUES.  the lack of such an argument is a
>  liability, not just when loading saved hash tables, but when building
>  them the first time.  if we had a standard list notation for hash table
>  initial contents (as we have for arrays), it would be trivial to create a
>  print syntax for hash-tables, too.  so let's attack the basic problem.

How about:

#.(let ((#1=#:h (make-hash-table ...)))
    (setf (gethash #1# '<key1>) '<val1>
          (gethash #1# '<key2>) '<val2>
          ...)
    #1#)

:-)

A better syntax would probably be something like:

#H((<creation options>) <key1> <val1> <key2> <val2> ...)

-- 
Barry Margolin, ······@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

From: Tim Bradshaw
Subject: Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)?
Date: Fri, 09 Apr 1999 00:00:00 +0000
Message-ID: <ey3r9ptabni.fsf@lostwithiel.tfeb.org>

* Barry Margolin wrote:

> A better syntax would probably be something like:

> #H((<creation options>) <key1> <val1> <key2> <val2> ...)

What worries me about all of these things is that it makes it
(possibly) hard for an implementation to provide extensions.  For
instance perhaps an implementation might allow user-defined functions
as TEST arguments, or possibly a user defined hash function.  Printing
those would be hard.

Of course it's very easy to provide a read/print syntax in an
application -- for instance it wouldn't take very long to implement
the above syntax.

So I think, what I'm trying to say is that a lot of things like this
are very easy to do in an application, but perhaps very hard to do in
a sufficiently general and useful way that it would really be good to
standardise them.  Things like copy constructors (C++), general
equality predicates, serialisation (of which this is an example I
guess) fall into this area (for me) too.  Unfortunately this doesn't
stop lesser languages from leaping in and standardising the stuff --
if you have it, it's a feature, even if it is no use.

--tim
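
As a rough illustration of how little application code the
#H((<creation options>) <key1> <val1> ...) syntax needs, here is one
possible sketch.  The names READ-HASH-TABLE and *HASH-READTABLE* are
invented for this example, and the sketch assumes that the creation
options, keys and values are all taken literally as read (nothing is
evaluated), so it can only express tests that are named by symbols.

```lisp
;; Hypothetical reader for #H((options...) key1 val1 key2 val2 ...).
(defun read-hash-table (stream subchar arg)
  (declare (ignore subchar arg))
  (let ((form (read stream t nil t)))
    (unless *read-suppress*
      (destructuring-bind (options &rest contents) form
        ;; OPTIONS is a literal keyword list such as (:test equal).
        (let ((table (apply #'make-hash-table options)))
          (loop for (key value) on contents by #'cddr
                do (setf (gethash key table) value))
          table)))))

;; Install #H in a private copy of the standard readtable, so the
;; global readtable is left alone.
(defvar *hash-readtable* (copy-readtable nil))
(set-dispatch-macro-character #\# #\H #'read-hash-table *hash-readtable*)

;; Usage: works even with *READ-EVAL* turned off.
(defun example-table ()
  (let ((*readtable* *hash-readtable*)
        (*read-eval* nil))
    (values (read-from-string "#H((:test equal) \"one\" 1 \"two\" 2)"))))
```

A user-defined test or hash function, as discussed above, has no obvious
literal form here, which is exactly the part that resists
standardisation.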

From: Kent M Pitman
Subject: Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)?
Date: Sat, 10 Apr 1999 00:00:00 +0000
Message-ID: <sfwd81d6jwo.fsf@world.std.com>

Erik Naggum <····@naggum.no> writes:

>   I don't see a problem with #.(make-hash-table ...) except [...]

Its problem is that people quite reasonably want to disable *read-eval*
to avoid trojan horses.  I'd make an isomorphism between this concern and
the "security" Java provides.  We all know that Java security doesn't go
very deep, but there is a small set of applications for which you can
prove that Java isn't very dangerous, and it would be similarly nice
if there were a small set of Lisp files you could load that were similarly
not opening your implementation to attack.  Asking that hash tables
be among the set of "safe" things to load is not unreasonable.
It may be a small point in the grand scheme of things, but that doesn't
make it not worth caring about.

> if we had a standard list notation for hash table
> initial contents (as we have for arrays), it would be trivial to create a
> print syntax for hash-tables, too.  so let's attack the basic problem.

i'm all for that.
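
To make the *read-eval* concern concrete, here is a small sketch.  The
function name READ-DATA-ONLY is made up for this example; the behaviour
it relies on is the standard one, namely that #. signals a READER-ERROR
when *READ-EVAL* is false.

```lisp
;; Hypothetical helper: read one object from STRING with #. disabled.
;; Returns the object and T on success, or NIL and NIL if the reader
;; signalled a READER-ERROR (for instance because the text used #.).
(defun read-data-only (string)
  (handler-case
      (let ((*read-eval* nil))
        (values (read-from-string string) t))
    (reader-error ()
      (values nil nil))))

;; Usage:
;;   (read-data-only "(1 2 3)")                 plain data is accepted
;;   (read-data-only "#.(make-hash-table)")     the #. form is refused
```

So a hash table written as #.(make-hash-table ...) cannot be loaded by
anyone who reads with *READ-EVAL* bound to NIL, whereas a dedicated
syntax could be.  Binding *READ-EVAL* is of course only one part of
reading untrusted text safely.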

From: Erik Naggum
Subject: Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)?
Date: Sat, 10 Apr 1999 00:00:00 +0000
Message-ID: <3132732178997230@naggum.no>

* Erik Naggum <····@naggum.no>
| I don't see a problem with #.(make-hash-table ...) except [...]

* Kent M Pitman <······@world.std.com>
| Its problem is that people quite reasonably want to disable *read-eval*
| to avoid trojan horses.

  yes, *READ-EVAL* is sometimes too powerful, but as a first approximation
  (of which Barry's tongue-in-cheek form would be a first approximation :),
  until we get to the point where we know how to represent these things, it
  would at least provide _some_ mechanism to do it, instead of being impossible.
  #H appears to be free, so if only calls to MAKE-HASH-TABLE were to be
  desired, it could be used for that purpose, once we know what it takes.

#:Erik

From: Dobes Vandermeer
Subject: Serialization (WAS Why can't a hash table be read/written as #S(HASH-TABLE ...))
Date: Fri, 09 Apr 1999 00:00:00 +0000
Message-ID: <370D9F7E.DFE81D37@mindless.com>

Tim Bradshaw wrote:
> 
> * Joerg-Cyril Hoehle wrote:
> > Hi,
> > sorry to bring this old topic again, but I'd like to understand why
> > there can't be a #S(HASH-TABLE ...) representation of hash tables.
> 
> I don't think it would be reasonable for it to be #S(HASH-TABLE ...)
> because that commits you to either representing hashtables as
> structures, or special-casing #S, neither of which is very nice.  So
> you'd need another read macro.
> 
> I think the other argument against it is that you might not want to
> commit implementations to saying what information there is about a
> hashtable.  As well as the keys & values you might want other kinds of
> information -- what is the test function (and what if an
> implementation allows test functions other than the standard set --
> now you need some way of printing arbitrary functions perhaps)?  How
> big is it, how fast does it grow?

It does raise an interesting issue in LISP, though, which is that while
it would be very easy to add some kind of standard for serializing
various objects like hash tables and so forth, there is none.  Perhaps
the next version/revision of CL should take that into consideration.

For example, the serialization of a hash table might be reasonably
simple:

(make-hash-table :size X :initial-contents (...) ...)

Basically, the ability to generate from an object a LISP expression to
create that object (which would, of course work recursively expanding
objects until it reached the atom level) would be a very useful thing to
have built into a language, and is typically something that is not
useful when coded as something outside of the standard library.

CU
Dobes
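
A deliberately naive sketch of "generate an expression that creates the
object" may help show where the difficulty lies.  The generic function
CREATION-FORM is invented for this example and is not part of Common
Lisp.  The sketch assumes tables that use one of the four standard
tests, and it ignores both shared substructure and circularity: a
circular list would make it loop forever, and two references to the
same cons come back as two separate conses.

```lisp
;; Hypothetical protocol: return a form that rebuilds OBJECT when
;; evaluated.
(defgeneric creation-form (object))

;; Default: treat the object as a literal and quote it.
(defmethod creation-form ((object t))
  `(quote ,object))

;; Conses are rebuilt recursively (no sharing or cycle detection).
(defmethod creation-form ((object cons))
  `(cons ,(creation-form (car object))
         ,(creation-form (cdr object))))

;; Hash tables: only the test is preserved; size and growth
;; parameters are dropped.
(defmethod creation-form ((object hash-table))
  `(let ((table (make-hash-table :test ',(hash-table-test object))))
     ,@(loop for key being the hash-keys of object
               using (hash-value value)
             collect `(setf (gethash ,(creation-form key) table)
                            ,(creation-form value)))
     table))
```

Even this toy has to decide what to throw away (identity of keys,
sharing, implementation-specific table parameters), and a general
facility would need an answer for each of those.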

From: Tim Bradshaw
Subject: Re: Serialization (WAS Why can't a hash table be read/written as #S(HASH-TABLE ...))
Date: Fri, 09 Apr 1999 00:00:00 +0000
Message-ID: <nkj3e2a5b94.fsf@tfeb.org>

Dobes Vandermeer <·····@mindless.com> writes:


> It does raise an interesting issue in LISP, though, which is that while
> it would be very easy to add some kind of standard for serializing
> various objects like hash tables and so forth, there is none.  Perhaps
> the next version/revision of CL should take that into consideration.
> 
Do you really think it's simple?

--tim

From: Vassil Nikolov
Subject: Re: Serialization (WAS Why can't a hash table be read/written as #S(HASH-TABLE ...))
Date: Fri, 09 Apr 1999 00:00:00 +0000
Message-ID: <7elm1a$312$1@nnrp1.dejanews.com>

In article <·················@mindless.com>,
  Dobes Vandermeer <·····@mindless.com> wrote:
> > * Joerg-Cyril Hoehle wrote:
> > > Hi,
> > > sorry to bring this old topic again, but I'd like to understand why
> > > there can't be a #S(HASH-TABLE ...) representation of hash tables.
(...)
> It does raise an interesting issue in LISP, though, which is that while
> it would be very easy to add some kind of standard for serializing
> various objects like hash tables and so forth, there is none.

With all this talk about object identity and various headaches
related to it, one might think it would be clear why it is *not*
`very easy.'

Someone wrote that since there is a printed representation
for a-lists and p-lists, there should be no problem to have
a printed representation for hash tables.  I disagree.  There
is no printed representation for a-lists or p-lists.  There is
a printed representation for _conses_, and since a-lists and
p-lists are represented by conses, it appears that there is
a printed representation for them as well.  (Perhaps this
illustration is useful: implement a Lisp in Lisp; implement
environments by a-lists; can you print and then read
environments?)

Hash tables are not mere collections of objects.  They represent
_relationships_.  Externalising relationships so that they can
be then faithfully internalised back---in a way that works in
_general_, not just some of the time, which is an obvious
requirement for a language mechanism---is anything but easy.
(Externalising objects is not easy by itself, and externalising
relationships would encounter all difficulties in externalising
objects, plus some of its own.)

I agree that #.(MAKE-HASH-TABLE ...) can be very useful and
proper sometimes---but not at other times.  Therefore, it is
left for the (hopefully competent) programmers to decide when
they want that, and it is not incorporated into what the PRINT
function produces.

Note that it might be useful to be able to print hash tables
or to have a readable representation of hash tables---but
different ones^1 so that if one tries to read back in a hash
table that has been printed, one knows that there may be trouble
ahead.
______
^1 i.e. the printed representation is unreadable, and the
   readable representation is unprintable

As to Erik Naggum's suggestion that MAKE-HASH-TABLE is equipped
with :INITIAL-CONTENTS or some such, I take it that this would
be entirely equivalent to making the hash table and then
calling SETF of GETHASH, modulo some efficiency related
e.g. to memory management,---or am I missing something?

> Perhaps
> the next version/revision of CL should take that into consideration.
(...)

I don't think it is hard to see that such a thing would happen if
there already _is_ a _usable_ facility implemented by someone
(this being a necessary not a sufficient condition).  I am not
holding my breath.

Vassil Nikolov <········@poboxes.com> www.poboxes.com/vnikolov
(You may want to cc your posting to me if I _have_ to see it.)
   LEGEMANVALEMFVTVTVM  (Ancient Roman programmers' adage.)

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/       Search, Read, Discuss, or Start Your Own

From: Barry Margolin
Subject: Re: Serialization (WAS Why can't a hash table be read/written as #S(HASH-TABLE ...))
Date: Fri, 09 Apr 1999 00:00:00 +0000
Message-ID: <rutP2.453$kM2.49197@burlma1-snr2>

In article <············@nnrp1.dejanews.com>,
Vassil Nikolov  <········@poboxes.com> wrote:
>Hash tables are not mere collections of objects.  They represent
>_relationships_.  Externalising relationships so that they can
>be then faithfully internalised back---in a way that works in
>_general_, not just some of the time, which is an obvious
>requirement for a language mechanism---is anything but easy.

Everything you said above is just as true for conses, arrays, and
structures.  The post that asked for all the different representations that
((A B) (A B)) could be the printed representation of is an illustration of
this.  By your reasoning, we shouldn't provide a standard printed
representation of any of these collections, since reading them back
in won't necessarily reproduce the original relationships.

But instead, we provide a printer and reader, and document the limitations.
Lisp's I/O architecture is nice and modular: if we provided a printed
representation for hash tables, the relationships between the objects that
it "contains" before and after reading would be identical to the
relationships those objects would have if we printed and read them
individually.  If you intend to print and read a hash table, you should
take this into account in designing the objects that will be contained.

In some cases, the problems of serialization can be dealt with by using
*PRINT-CIRCLE* to ensure that relationships within a particular I/O
operation are maintained.  If programmers are forced to create their own
serialization functions, there's a good chance they won't do this right.
It's nice to be able to take advantage of built-in functionality for
complex processes like this.

-- 
Barry Margolin, ······@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
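
A small sketch of the *PRINT-CIRCLE* point, using the ((A B) (A B))
example mentioned above.  The helper name ROUND-TRIP is an assumption of
this example; everything else is standard printer and reader behaviour.

```lisp
;; Hypothetical helper: print OBJECT to a string and read it back,
;; optionally with *PRINT-CIRCLE* enabled.
(defun round-trip (object &key circle)
  (let ((*print-circle* circle))
    (values (read-from-string (prin1-to-string object)))))

;; A list whose two elements are the very same cons.
(defun sharing-report ()
  (let* ((shared (list 'a 'b))
         (data (list shared shared))
         (plain (round-trip data))
         (circular (round-trip data :circle t)))
    (list (eq (first plain) (second plain))          ; sharing lost
          (eq (first circular) (second circular))))) ; sharing kept
```

The sharing survives only within the single object being printed;
anything shared with objects outside that one PRINT call is still lost.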

From: Vassil Nikolov
Subject: Re: Serialization (WAS Why can't a hash table be read/written as #S(HASH-TABLE ...))
Date: Fri, 09 Apr 1999 00:00:00 +0000
Message-ID: <7em2ev$d4s$1@nnrp1.dejanews.com>

In article <···················@burlma1-snr2>,
  Barry Margolin <······@bbnplanet.com> wrote:
> In article <············@nnrp1.dejanews.com>,
> Vassil Nikolov  <········@poboxes.com> wrote:
> >Hash tables are not mere collections of objects.  They represent
> >_relationships_.  Externalising relationships so that they can
> >be then faithfully internalised back---in a way that works in
> >_general_, not just some of the time, which is an obvious
> >requirement for a language mechanism---is anything but easy.
>
> Everything you said above is just as true for conses, arrays, and
> structures.

Not quite; perhaps I didn't express myself well enough.

A cons by itself is a structure (not in the DEFSTRUCT sense, of
course).  It makes reasonable sense to take it as standalone and
print it, using the print-circle machinery for shared substructure
if necessary.

A hash table is more than a structure; its very existence implies
much more strongly that some relationships are involved---not just
the relationship between the keys and values within the hash table
but also relationships between things in the hash table and other
things outside the hash table.

For example, how would an EQ hash table be printed so that when it
is read back in the identical keys are produced so that the hash
table is still usable?  Would the printer signal an error if the
hash table's test function is EQ and not all keys are interned
symbols?  Or maybe nobody uses EQ hash tables for non-interned
keys?

> The post that asked for all the different representations that
> ((A B) (A B)) could be the printed representation of is an illustration of
> this.  By your reasoning, we shouldn't provide a standard printed
> representation of any of these collections, since reading them back
> in won't necessarily reproduce the original relationships.

When it comes to the `conventional' collections, the original
irreproducible relationships are above the threshold that defines
what the programmer would naturally expect from the Lisp system
to take care of.  With hash tables, these are below the threshold.
(Well, I hope I wouldn't have to formalise this notion of
threshold.  What is below the threshold can be naturally expected
to work in `the right way.')  In other words, what is the implicit
promise behind the commitment to provide a readable printed
representation?  Isn't it `there might be cases when reading it
back won't really work, but it will be clear when such a case is
at hand'?  Can such a promise be kept for hash tables?

(...)
> In some cases, the problems of serialization can be dealt with by using
> *PRINT-CIRCLE* to ensure that relationships within a particular I/O
> operation are maintained.  If programmers are forced to create their own
> serialization functions, there's a good chance they won't do this right.
> It's nice to be able to take advantage of built-in functionality for
> complex processes like this.

Provided that we understand correctly and fully all relevant issues.
After all, wasn't *PRINT-READABLY*---something very obvious in
retrospect---added late to the language, after people saw the need?

Perhaps I should just shut up and wait until people start reporting
dissatisfaction with the printed representation of hash tables...

Anyway, let's experiment.  Maybe someone already has some experience
with printing and reading hash tables which they would be willing
to take the time to share?

Besides, maybe I have the wrong mindset.  Perhaps I should not be
thinking about printing a _hash table_ which I consider a data
structure tightly coupled with other structures in memory, but
printing a _mapping_ from keys to values which, when internalised,
has an efficient internal representation.  From this viewpoint,
printing a hash table is like printing #H or whatever, then
printing the a-list that can be obtained by a composition of
CONS and MAPHASH (modulo hash-table specifics like test function).

Vassil Nikolov <········@poboxes.com> www.poboxes.com/vnikolov
(You may want to cc your posting to me if I _have_ to see it.)
   LEGEMANVALEMFVTVTVM  (Ancient Roman programmers' adage.)
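
The "mapping" view in the last paragraph can be sketched as follows.
The function names and the exact output layout are assumptions of this
example: it prints #H, then a list whose first element records the test
and whose remaining elements are the a-list obtained from CONS and
MAPHASH.  Note that this layout is an a-list, not the flat key/value
sequence suggested elsewhere in the thread.

```lisp
;; Collect the entries of TABLE as an a-list (order is unspecified).
(defun hash-table-alist (table)
  (let ((alist '()))
    (maphash (lambda (key value)
               (push (cons key value) alist))
             table)
    alist))

;; Hypothetical printer: writes #H((:TEST <test>) (k1 . v1) ...).
;; Size and growth parameters are not recorded.
(defun print-hash-table (table &optional (stream *standard-output*))
  (write-string "#H" stream)
  (prin1 (cons (list :test (hash-table-test table))
               (hash-table-alist table))
         stream)
  table)
```

Whether the text reads back into something usable still depends on the
keys: with an EQ or EQL table, keys such as freshly read strings or
lists will not be identical to the originals.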


From: Barry Margolin
Subject: Re: Serialization (WAS Why can't a hash table be read/written as #S(HASH-TABLE ...))
Date: Sat, 10 Apr 1999 00:00:00 +0000
Message-ID: <qpBP2.469$kM2.49719@burlma1-snr2>

In article <············@nnrp1.dejanews.com>,
Vassil Nikolov  <········@poboxes.com> wrote:
>For example, how would an EQ hash table be printed so that when it
>is read back in the identical keys are produced so that the hash
>table is still usable?  Would the printer signal an error if the
>hash table's test function is EQ and not all keys are interned
>symbols?  Or maybe nobody uses EQ hash tables for non-interned
>keys?

Perhaps the better question is: *why* would someone try to read in an EQ
hash table whose keys don't retain EQ-ness across PRINT/READ.  Are you
concerned that by providing a standard printed representation, we're going
to encourage people to make inappropriate use of it?

-- 
Barry Margolin, ······@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
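
For readers who want to see the EQ issue in isolation, here is a small
sketch.  The function name LOOKUP-AFTER-ROUND-TRIP is invented for this
example; it stores a value under a list key, pushes the key through
PRIN1 and READ, and looks the re-read key up again.

```lisp
;; Hypothetical demonstration: does a key still find its entry after
;; it has been printed and read back?
(defun lookup-after-round-trip (test)
  (let* ((key (list 1 2))
         (table (make-hash-table :test test))
         (reread-key (read-from-string (prin1-to-string key))))
    (setf (gethash key table) :found)
    ;; REREAD-KEY is EQUAL to KEY but is a freshly consed list.
    (gethash reread-key table)))

;; Usage:
;;   (lookup-after-round-trip 'eq)      the entry is not found
;;   (lookup-after-round-trip 'equal)   the entry is found
```

Interned symbols are the usual case where EQ-ness does survive
PRINT/READ, provided the same package is current on both sides.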

From: Vassil Nikolov
Subject: Re: Serialization (WAS Why can't a hash table be read/written as #S(HASH-TABLE ...))
Date: Sat, 10 Apr 1999 00:00:00 +0000
Message-ID: <7eoa14$2no$1@nnrp1.dejanews.com>

In article <···················@burlma1-snr2>,
  Barry Margolin <······@bbnplanet.com> wrote:
> In article <············@nnrp1.dejanews.com>,
> Vassil Nikolov  <········@poboxes.com> wrote:
> >For example, how would an EQ hash table be printed so that when it
> >is read back in the identical keys are produced so that the hash
> >table is still usable?  Would the printer signal an error if the
> >hash table's test function is EQ and not all keys are interned
> >symbols?  Or maybe nobody uses EQ hash tables for non-interned
> >keys?
>
> Perhaps the better question is: *why* would someone try to read in an EQ
> hash table whose keys don't retain EQ-ness across PRINT/READ.

Yes, maybe this question is better; I understand it in the following
sense: which is this particular problem that (i) calls for EQ hash
tables and (ii) calls for printing/reading them.

But I still wouldn't mind getting an answer to my questions too
(I feel I have begun to use the phrase about (not) holding one's
breath too often).

> Are you
> concerned that by providing a standard printed representation, we're going
> to encourage people to make inappropriate use of it?

Yes, though I wouldn't use the word `encourage'---I'd say that
it would appear to people, or that people might be induced to
think, that they can just go ahead and print and read hash tables
and things will be fine each and every time.  This is of course
about the balance between (A) ensuring the length of the rope is
not too great and (B) trusting that the programmer knows how to
handle the rope; I admit that perhaps I have been leaning too much
to the `shortest rope' approach.  It would be nice to see particular
problems and solutions where hash tables are printed and read, and
whether there were any complications---does anybody have any case
studies handy?

Vassil Nikolov <········@poboxes.com> www.poboxes.com/vnikolov
(You may want to cc your posting to me if I _have_ to see it.)
   LEGEMANVALEMFVTVTVM  (Ancient Roman programmers' adage.)


From: Kent M Pitman
Subject: Re: Serialization (WAS Why can't a hash table be read/written as #S(HASH-TABLE ...))
Date: Sat, 10 Apr 1999 00:00:00 +0000
Message-ID: <sfwbtgx6jjp.fsf@world.std.com>

Vassil Nikolov <········@poboxes.com> writes:

> Someone wrote that since there is a printed representation
> for a-lists and p-lists, there should be no problem to have
> a printed representation for hash tables.  I disagree.  There
> is no printed representation for a-lists or p-lists.  There is
> a printed representation for _conses_, and since a-lists and
> p-lists are represented by conses, it appears that there is
> a printed representation for them as well.  (Perhaps this
> illustration is useful: implement a Lisp in Lisp; implement
> environments by a-lists; can you print and then read
> environments?)

I both agree and disagree with this.

If the role of the alist or plist is not to communicate identity but
only structure, then in a sense the fact that there is only a printed
representation for conses is accommodated by the fact that sometimes
identity is not what the alist or plist contains.  sometimes objects
are not used for their identity, and so sometimes alist and plist
representations are "good enough".  note, too, that even in the cons
notation (foo . bar) you lose the "identity" of the cons and if it
shares with other objects not printed in that pass, you lose the
sharing.

So the issues of "sharing" and "identity" are in there with the other
object structure issues, but the issue of printing is messier since
the real question in printing is "what was to be communicated".

Note that the fact is that the fasloader can coalesce even conses,
not just plists and alists, and that there is some isomorphism between
its ability to do this and the visual confusions that happen when
sharing is lost by print.  PRINT/READ can't coalesce but can split identity,
but even so the fact that identity can be lost is really the relevant 
factor--whether it is lost to widening or narrowing is sort of
secondary.

From: Kent M Pitman
Subject: Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)?
Date: Sat, 10 Apr 1999 00:00:00 +0000
Message-ID: <sfwemlt6k2l.fsf@world.std.com>

Tim Bradshaw <···@tfeb.org> writes:

> * Joerg-Cyril Hoehle wrote:
> > Hi,
> > sorry to bring this old topic again, but I'd like to understand why
> > there can't be a #S(HASH-TABLE ...) representation of hash tables.
> 
> I don't think it would be reasonable for it to be #S(HASH-TABLE ...)
> because that commits you to either representing hashtables as
> structures,

Nothing about this choice of notation commits to a representation
other than a bogus insinuation by the implementation that #S implies
structure class.  I have long advocated that #S should be used for all
kinds of things like *print-readably*=t versions of arrays so that
you could write #S(ARRAY :CONTENTS (1 2 3) :ADJUSTABLE T :FILL-POINTER 1
:ELEMENT-TYPE FIXNUM) and stuff like that.  I think I also tried to
get a special #H... for hash tables just as a shorthand.

> or special-casing #S, neither of which is very nice.  So
> you'd need another read macro.

Doesn't require special casing.  Just make a structure-creator
facility and say that (defstruct foo...) among its other things just
sets up (structure-creator 'foo) but that the facility for doing
#S(foo ...) is more general.

- - - - -

I mention this because I want to distance my rejection of the whole
equality issue from this issue of dumping specific details of
representation.  I think it's good that there are ways of "serializing"
datastructures within the limits of the definitions of those structures.
I think this should not be confused with preserving "identity", though.
I do think you should be able to get an object to a file and back out
in a structurally isomorphic fashion; I don't think that is the same
as saying you should be able to reliably get identity-matching, which
is what the big debate was over with defconstant.

So count me as a friend to Joerg-Cyril's wish here for hash tables
to have a printed representation.  I can dig around for my notes on
the hash table printed representation proposal to find out why it failed
if anyone is really curious.  My guess would be that it was just
"too late in the cycle; after feature freeze" though it's possible it
was something else.  I haven't powered my macivory back up since returning
from my vacation this week, though, and the data is all on there and
I'm too lazy to look it up now.

From: Vassil Nikolov
Subject: Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)?
Date: Sat, 10 Apr 1999 00:00:00 +0000
Message-ID: <7eoco0$4me$1@nnrp1.dejanews.com>

In article <···············@world.std.com>,
  Kent M Pitman <······@world.std.com> wrote:
(...)
> I have long advocated that #S should be used for all
> kinds of things like *print-readably*=t versions of arrays so that
> you could write #S(ARRAY :CONTENTS (1 2 3) :ADJUSTABLE T :FILL-POINTER 1
> :ELEMENT-TYPE FIXNUM) and stuff like that.  I think I also tried to
> get a special #H... for hash tables just as a shorthand.
(...)

Regarding syntax, why not go the /fn/ way of FORMAT, and do something
like #/MAKE-ARRAY/(:INITIAL-CONTENTS (1 2 3) ...) or, with a UNIX
flavour that is too strong perhaps, #!MAKE-ARRAY(:INITIAL-CONTENTS ...).
For hash tables, one would need also to add an :INITIAL-CONTENTS
parameter to MAKE-HASH-TABLE.  Defining #H as shorthand should perhaps
be left to the programmer since someone might want to use #H for
something else.  (Hollerith strings spring to mind...)
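
Standard MAKE-HASH-TABLE has no :INITIAL-CONTENTS argument, so the idea
can only be approximated with a wrapper.  The sketch below assumes a
hypothetical function MAKE-HASH-TABLE* and assumes the contents are given
as an a-list; both choices are illustrative rather than anything proposed
in the thread.

```lisp
(defun make-hash-table* (&rest options
                         &key initial-contents &allow-other-keys)
  "Like MAKE-HASH-TABLE, plus :INITIAL-CONTENTS, an a-list of (key . value)."
  (let ((table (apply #'make-hash-table
                      ;; Pass every option except :INITIAL-CONTENTS through.
                      (loop for (key value) on options by #'cddr
                            unless (eq key :initial-contents)
                              collect key and collect value))))
    (loop for (key . value) in initial-contents
          do (setf (gethash key table) value))
    table))

;; Usage:
(make-hash-table* :test 'equal
                  :initial-contents '(("one" . 1) ("two" . 2)))
```

If the a-list mentions a key twice, the later entry silently wins here;
a real design would have to decide whether that is an error.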

While I am at it, let me enumerate (some of) the cases when one might
use a printed representation of a hash table or whatever, lest
something is overlooked:

(1) literal hash table object in source code;
(2) hash table printed to a data file to be read back into the same image;
(3) hash table printed to a data file to be read by another Lisp image,
    another program, or humans;
(4) hash table in a data file produced by another image, another program,
    or a human which file is to be read into a Lisp image.

All of these should work equally well.

Vassil Nikolov <········@poboxes.com> www.poboxes.com/vnikolov
(You may want to cc your posting to me if I _have_ to see it.)
   LEGEMANVALEMFVTVTVM  (Ancient Roman programmers' adage.)

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/       Search, Read, Discuss, or Start Your Own

From: Vassil Nikolov
Subject: Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)?
Date: Sun, 11 Apr 1999 00:00:00 +0000
Message-ID: <7er5ip$856$1@nnrp1.dejanews.com>

Sorry for following-up on self, but the addition occurred to me later.

In article <············@nnrp1.dejanews.com>,
  Vassil Nikolov <········@poboxes.com> wrote:
(...)
> While I am at it, let me enumerate (some of) the cases when one might
> use a printed representation of a hash table or whatever, lest
> something is overlooked:
>
> (1) literal hash table object in source code;
> (2) hash table printed to a data file to be read back into the same image;
> (3) hash table printed to a data file to be read by another Lisp image,
>     another program, or humans;
> (4) hash table in a data file produced by another image, another program,
>     or a human which file is to be read into a Lisp image.

In addition to the above:

IWBN if every time a hash table was printed, the order of key-value pairs
was the same.  What I mean is essentially this: if one key-value pair
preceded another at the first printing, then in spite of any intervening
rehashings the former should precede the latter at subsequent printings
(though later invocations of MAPHASH could see a different order).

Perhaps this requirement might appear too stringent,^1 but I believe it
would be very useful if, for example, the printed representations are
to be diffed in some way (by eye, by diff(1), by whatever).  Also
note that every time a list, an array, or a structure is printed, the
order of elements is the same (naturally), so I believe this is what
many people would intuitively expect with hash-tables too (_especially_
if some uniform syntax like #S is used).  Besides, without such a
requirement, how would *PRINT-LENGTH* make sense for hash tables?
_________________________________________________________________
^1 it could be relaxed if, as a first approximation, it is required
   only with the pretty-printer
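
One way to approximate a repeatable order in user code is to sort the
pairs before printing.  Note that this is a different technique from the
one asked for above: it orders by the printed form of each key rather
than by first-printing order, and keys whose printed forms coincide
(possible in an EQ table) remain in unspecified relative order.  The
function names and the one-pair-per-line output format are assumptions
made for this sketch.

```lisp
(defun hash-table-sorted-pairs (table)
  "Return TABLE's entries as a list of (key . value), sorted by printed key."
  (let ((pairs '()))
    (maphash (lambda (key value) (push (cons key value) pairs)) table)
    (sort pairs #'string<
          :key (lambda (pair) (prin1-to-string (car pair))))))

(defun print-hash-table-sorted (table &optional (stream *standard-output*))
  "Print TABLE one entry per line, so that line-based diffs are usable."
  (format stream "#<HASH-TABLE :TEST ~S" (hash-table-test table))
  (dolist (pair (hash-table-sorted-pairs table))
    (format stream "~%  ~S => ~S" (car pair) (cdr pair)))
  (format stream ">")
  table)

;; Usage:
(let ((table (make-hash-table :test 'equal)))
  (setf (gethash "b" table) 2
        (gethash "a" table) 1)
  (print-hash-table-sorted table))
```

The cost is a full collect-and-sort on every printing, which is why it
is offered here as an opt-in function rather than a printer default.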

Vassil Nikolov <········@poboxes.com> www.poboxes.com/vnikolov
(You may want to cc your posting to me if I _have_ to see it.)
   LEGEMANVALEMFVTVTVM  (Ancient Roman programmers' adage.)

From: Kent M Pitman
Subject: Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)?
Date: Sun, 11 Apr 1999 00:00:00 +0000
Message-ID: <sfw4smmyf6j.fsf@world.std.com>

Vassil Nikolov <········@poboxes.com> writes:

> IWBN if every time a hash table was printed, the order of key-value
> pairs was the same. [...] it would be very useful if, for example,
> the printed representations are to be diffed in some way (by eye, by
> diff(1), by whatever) ...

The printer prints all on one line.  Any change and diff will be mush
since the one line of output will differ.  You're better off to write
a custom pretty-printer that suits your needs.

I'm skeptical a visual diff will be worth accommodating since hash tables
are often quite large and eyes can only see so much.

The real problem with this proposal is that it suggests that you CAN
visually compare.  Consider an EQ hash table initialized as follows:
 (dotimes (i 5) (setf (gethash (list 1 2) table) t))
In what order should the keys be printed in order to make sure nothing
had changed?  What happens if someone remhash's one of the elements and
inserts another using the same setf?  This case is extreme, but the general
problem will occur and what you're asking for is probably beyond the
limits of what is worth perturbing the language design for.
Printing elements in a certain order would potentially slow things a lot.
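
The EQ example can be tried directly.  The sketch below only adds a table
definition (the variable name *EQ-TABLE* is an assumption) and two
observations: the table really holds five entries, and under default
printer settings all five keys have the same printed form.

```lisp
(defparameter *eq-table* (make-hash-table :test 'eq))

;; Each call to LIST conses a fresh key, so no two keys are EQ.
(dotimes (i 5) (setf (gethash (list 1 2) *eq-table*) t))

(hash-table-count *eq-table*)            ; => 5

;; Collect the distinct printed forms of the keys.
(defun distinct-printed-keys (table)
  (let ((printed '()))
    (maphash (lambda (key value)
               (declare (ignore value))
               (pushnew (prin1-to-string key) printed :test #'string=))
             table)
    printed))

(distinct-printed-keys *eq-table*)       ; => ("(1 2)")

;; A freshly consed (1 2) is not EQ to any stored key.
(gethash (list 1 2) *eq-table*)          ; => NIL, NIL
```

Five entries, one printed form: nothing in the output distinguishes the
keys, so no print order can show whether one of them was replaced.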

From: Vassil Nikolov
Subject: Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)?
Date: Mon, 12 Apr 1999 00:00:00 +0000
Message-ID: <7erd2q$eag$1@nnrp1.dejanews.com>

In article <···············@world.std.com>,
  Kent M Pitman <······@world.std.com> wrote:
> Vassil Nikolov <········@poboxes.com> writes:
>
> > IWBN if every time a hash table was printed, the order of key-value
> > pairs was the same. [...] it would be very useful if, for example,
> > the printed representations are to be diffed in some way (by eye, by
> > diff(1), by whatever) ...
>
> The printer prints all on one line.  Any change and diff will be mush
> since the one line of output will differ.  You're better off to write
> a custom pretty-printer that suits your needs.

Yes, of course.  My mistake; should have just written that this is
relevant only to the pretty-printer instead of putting a footnote
that it might be left only for the pretty-printer to support.
(I have been pampered by having *PRINT-PRETTY* set to T all the time.)

> I'm skeptical a visual diff will be worth accommodating since hash tables
> are often quite large and eyes can only see so much.

I agree that one would be forced to do visual diffs very rarely, but if it
does happen, one would be very grateful for any convenience.

> The real problem with this proposal is that it suggests that you CAN
> visually compare.  Consider an EQ hash table initialized as follows:
>  (dotimes (i 5) (setf (gethash (list 1 2) table) t))
> In what order should the keys be printed in order to make sure nothing
> had changed?  What happens if someone remhash's one of the elements and
> inserts another using the same setf?  This case is extreme, but the general
> problem will occur and what you're asking for is probably beyond the
> limits of what is worth perturbing the language design for.

In my opinion (though I don't have proof at hand), having a printed
representation by itself suggests one can visually compare (modulo a
potentially large size of the data), and as I already wrote, printing EQ
hash tables is rather problematic anyway.  (Maybe it isn't, or there is a
solution to that, but I can't see it.)

Taking the above example, what is the hash table reader to do if it
encounters two occurrences of the same key (same under the predicate of
having the same printed representation): create two fresh keys that are
EQUAL but not EQ if it is an EQ hash table, but signal an error if it is
an EQUAL hash table?  (I don't know.  Analogies to a-lists or p-lists
don't seem to be very helpful.)

> Printing elements in a certain order would potentially slow things a lot.

Yes.  I don't know if this really is a problem.  If it is, then I'd
only ask that this is an option to the programmer (like there is SORT
and there is STABLE-SORT).

Or, in the worst case, it should be made very explicit in the spec
that the order of key-value pairs produced by the hash-table printer is
unspecified and that the hash-table reader is insensitive to the order
it sees.

Vassil Nikolov <········@poboxes.com> www.poboxes.com/vnikolov
(You may want to cc your posting to me if I _have_ to see it.)
   LEGEMANVALEMFVTVTVM  (Ancient Roman programmers' adage.)

From: Barry Margolin
Subject: Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)?
Date: Mon, 12 Apr 1999 00:00:00 +0000
Message-ID: <nFpQ2.494$kM2.50743@burlma1-snr2>

In article <············@nnrp1.dejanews.com>,
Vassil Nikolov  <········@poboxes.com> wrote:
>Regarding syntax, why not go the /fn/ way of FORMAT, and do something
>like #/MAKE-ARRAY/(:INITIAL-CONTENTS (1 2 3) ...) or, with a UNIX
>flavour that is too strong perhaps, #!MAKE-ARRAY(:INITIAL-CONTENTS ...).

If this is allowed to call any function, then it has almost the same
security implications as #., and would presumably be controlled by
*READ-EVAL*.
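
For reference, this is how the existing control behaves for #. when
reading data from an untrusted source.  The helper name READ-UNTRUSTED is
an assumption for this sketch; *READ-EVAL*, WITH-STANDARD-IO-SYNTAX and
READER-ERROR are standard.  A function-calling #/fn/ syntax would
presumably need the same check in its reader function.

```lisp
(defun read-untrusted (string)
  "Read one object from STRING with read-time evaluation disabled."
  (with-standard-io-syntax
    ;; WITH-STANDARD-IO-SYNTAX binds *READ-EVAL* to T, so rebind it inside.
    (let ((*read-eval* nil))
      (read-from-string string))))

;; Plain data reads normally.
(read-untrusted "(1 2 3)")

;; #. is refused: the reader signals READER-ERROR instead of evaluating.
(handler-case (read-untrusted "#.(+ 1 2)")
  (reader-error () :rejected))
```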

Kent's suggestion of allowing #S(ARRAY ...) and #S(HASH-TABLE ...) has the
feature that these are upward compatible with the existing spec, without
wasting another sharpsign prefix character.  Currently, those syntaxes
can't occur, since the user can't define a structure named ARRAY or
HASH-TABLE.  So these could be recognized as special cases.

-- 
Barry Margolin, ······@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

From: His Holiness the Reverend Doktor Xenophon Fenderson, the Carbon(d)ated
Subject: What's a Hollerith? (was Re: Why can't a hash table be...)
Date: Sat, 01 May 1999 00:00:00 +0000
Message-ID: <w4ou2twgvgo.fsf_-_@nemesis.irtnog.org>

>>>>> "VN" == Vassil Nikolov <········@poboxes.com> writes:

    VN> Defining #H as shorthand should perhaps be left to the
    VN> programmer since someone might want to use #H for something
    VN> else.  (Hollerith strings spring to mind...)

Just out of curiosity, what's a Hollerith?  The only references I have
found (so far) on the Net are to one Herman Hollerith, who founded the
company that later became IBM.  An on-line ANSI FORTRAN spec mentions
it only to say that such things are not allowed in ANSI FORTRAN.

Thanks,
Mystified in Indianapolis

P.S. Somebody should keep an archive of all these weird programming
do-hickies, like trampolines and such.  There are "numerical recipes"
books out there, yes, but I'd like something along the lines of
"algorithm recipes".

-- 
Rev. Dr. Xenophon Fenderson, the Carbon(d)ated, KSC, DEATH, SubGenius, mhm21x16
Pope, Patron Saint of All Things Plastic fnord, and Salted Litter of r.g.s.b
In a lecture, Werner von Braun once said "Ve haf alvays been aiming for zer
stars" and a little voice at the back replied "But ve keep hittink London".

From: Frank A. Adrian
Subject: Re: What's a Hollerith? (was Re: Why can't a hash table be...)
Date: Sat, 01 May 1999 00:00:00 +0000
Message-ID: <tRQW2.281$_x.56056@news.uswest.net>

In the old FORTRANs (Pre F70), one could define a string in a FORMAT
specification as nHxxxx... (where n is the number of x's), e.g., 4HABCD,
15HThis is a test!, etc. These strings were known as Hollerith constants
and were named after the famed Herman (although I do not know the reason for
naming this particular structure for him). By the time of Fortran IV, most
commercial implementations had delimited string constants. F70 specified a
standard form for delimited strings, obviating the need for Hollerith
constants and they were done away with in F90 (though some compilers may
still surprise you)...

faa

From: Vassil Nikolov
Subject: Re: What's a Hollerith? (was Re: Why can't a hash table be...)
Date: Mon, 03 May 1999 00:00:00 +0000
Message-ID: <7gku4b$h5b$1@nnrp1.dejanews.com>

In article <··················@news.uswest.net>,
  "Frank A. Adrian" <·······@uswest.net> wrote:
> In the old FORTRANs (Pre F70), one could define a string in a FORMAT
> specification as nHxxxx... (where n is the number of x's), e.g., 4HABCD,
> 15HThis is a test!, etc.  These strings were known as Hollerith constants

And Real Lisp Programmers use nothing but Just Common Lisp and write

  #16HTHIS IS A STRING
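
Standard Common Lisp leaves #H undefined, so the line above only reads
if the user installs a dispatch macro for it.  A minimal sketch, assuming
the hypothetical names READ-HOLLERITH and *HOLLERITH-READTABLE* and a
private copy of the standard readtable so the global one is untouched:

```lisp
(defun read-hollerith (stream subchar count)
  "Reader function for #nH: read exactly COUNT characters as a string."
  (declare (ignore subchar))
  (unless count
    (error "#H requires a character count, as in #4HABCD."))
  (let ((string (make-string count)))
    (dotimes (i count string)
      (setf (char string i) (read-char stream t nil t)))))

(defparameter *hollerith-readtable*
  (let ((readtable (copy-readtable nil)))   ; fresh copy of standard syntax
    (set-dispatch-macro-character #\# #\H #'read-hollerith readtable)
    readtable))

;; Usage: the numeric prefix is passed to the reader function as COUNT.
(let ((*readtable* *hollerith-readtable*))
  (read-from-string "#16HTHIS IS A STRING"))   ; => "THIS IS A STRING"
```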

> and were named after the famed Herman (although I do not know the reason for
> naming this particular structure for him).  By the time of Fortran IV, most
> commercial implementations had delimited string constants.  F70 specified a
> standard form for delimited strings, obviating the need for Hollerith
> constants and they were done away with in F90 (though some compilers may
> still surprise you)...

It has been said that a Fortran compiler is any good only if it
has a switch that turns FORTRAN IV mode on.

By the way, it is an instructive story how tabulating machines
running on punched cards `saved the day' for the US census.

--
Vassil Nikolov <········@poboxes.com> www.poboxes.com/vnikolov
(You may want to cc your posting to me if I _have_ to see it.)
   LEGEMANVALEMFVTVTVM  (Ancient Roman programmers' adage.)

From: Frank A. Adrian
Subject: Re: What's a Hollerith? (was Re: Why can't a hash table be...)
Date: Mon, 03 May 1999 00:00:00 +0000
Message-ID: <6TtX2.248$Dl4.27474@news.uswest.net>

Vassil Nikolov wrote in message <7gku4b$h5b$1@nnrp1.dejanews.com>...
>By the way, it is an instructive story how tabulating machines
>running on punched cards `saved the day' for the US census.


As well as how the Hollerith card got its size...

faa

From: Vassil Nikolov
Subject: Re: What's a Hollerith? (was Re: Why can't a hash table be...)
Date: Thu, 06 May 1999 00:00:00 +0000
Message-ID: <7gridc$amq$1@nnrp1.deja.com>

In article <···················@news.uswest.net>,
  "Frank A. Adrian" <·······@uswest.net> wrote:
(...)
> As well as how the Hollerith card got its size...

Indeed.

I have a few boxes of new (unpunched) punched cards which I find
very useful for taking notes (saved them from what had been
stocked for the mainframe, now retired).  I would be very rich
if I could substitute them for those same-sized pieces of paper...

--
Vassil Nikolov <········@poboxes.com> www.poboxes.com/vnikolov
(You may want to cc your posting to me if I _have_ to see it.)
   LEGEMANVALEMFVTVTVM  (Ancient Roman programmers' adage.)

From: Frank A. Adrian
Subject: Re: What's a Hollerith? (was Re: Why can't a hash table be...)
Date: Sat, 08 May 1999 00:00:00 +0000
Message-ID: <zW7Z2.1271$2w4.57758@news.uswest.net>

Perhaps... If you could print the proper pictures on them (though the
government might frown upon that)...

faa


From: Erik Naggum
Subject: Re: What's a Hollerith? (was Re: Why can't a hash table be...)
Date: Sun, 02 May 1999 00:00:00 +0000
Message-ID: <3134619269885350@naggum.no>

* "His Holiness the Reverend Doktor Xenophon Fenderson, the Carbon(d)ated" <········@irtnog.org>
| Just out of curiosity, what's a Hollerith?

  a Hollerith _string_ is a string constant in early FORTRAN.  a Hollerith
  _card_ is the familiar punched card with 12 rows and 80 columns and one
  corner cut to keep orientation straight.  there might have been more old
  stuff named after its inventor, Herman Hollerith.

#:Erik

From: Tim Bradshaw
Subject: Re: Why can't a hash table be read/written as #S(HASH-TABLE ...)?
Date: Mon, 12 Apr 1999 00:00:00 +0000
Message-ID: <nkj4smmrstq.fsf@tfeb.org>

Kent M Pitman <······@world.std.com> writes:

> 
> Doesn't require special casing.  Just make a structure-creator
> facility and say that (defstruct foo...) among its other things just
> sets up (structure-creator 'foo) but that the facility for doing
> #S(foo ...) is more general.

I think I agree with this, although I'd like not to use #S (since #S
is wired into my brain as `thing made with DEFSTRUCT').  On the other
hand, #S is probably better than any other choice because anything
else is either used already or decreases the number of #x available to
the user.

I actually wrote the beginnings of an implementation of this on
Friday, so it's not very hard (but hard to get really right I'm sure).

--tim