ANSI spec brain damaged wrt case in dispatch macro char

From: Richard Fateman
Subject: ANSI spec brain damaged wrt case in dispatch macro char
Date: Thu, 30 Nov 2000 00:00:00 +0000
Message-ID: <3A268F3E.5F4CA36A@cs.berkeley.edu>

At the risk of tossing gasoline on a fire that has
almost burned out...


I occasionally use Lisp to parse other languages, and had occasion
to write a program to read an ASCII file in "xdoc" format. This
is output from an optical character recognition program, TextBridge.

In any case, there are phrases that look like 
[F;10;15;1;b;7]  or [s;234;324;4;0]

interspersed with text like Hello World

I thought: what a great application for readtable hacking.
Clearly I should make [ into a dispatch character, and I should
use set-dispatch-macro-character to do the processing of 
the next modifier character, e.g. F  or s  in the examples
above, and collect the arguments.
This clear win leaves a sour taste when debugging, since Lisp forbids
you from separately doing

(set-dispatch-macro-character #\[ #\s .... )
(set-dispatch-macro-character #\[ #\S .... )
 
From the ANSI spec:
 set-dispatch-macro-character causes new-function to be called 
        when disp-char followed by sub-char is read. If sub-char is a
        lowercase letter, it is converted to its uppercase equivalent. 
        It is an error if sub-char is one of the ten decimal digits. 


Thus the ANSI spec insists on changing #\s  into #\S.

This seems totally brain-damaged to me.  It requires that I dispatch
to #\S and then look again at the sub-char to see if it was #\s or #\S,
in effect dispatching AGAIN.
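
The collision is easy to reproduce in a scratch readtable. What follows is only a minimal sketch in standard Common Lisp; the names *xdoc-readtable*, lower-handler, upper-handler and read-xdoc are made up for illustration and are not part of any XDOC tooling.

```lisp
;; Work in a copy so the global readtable is left untouched.
(defparameter *xdoc-readtable* (copy-readtable nil))

;; Hypothetical handlers: each only returns a tag saying which one ran.
(defun lower-handler (stream sub-char arg)
  (declare (ignore stream sub-char arg))
  :lower)

(defun upper-handler (stream sub-char arg)
  (declare (ignore stream sub-char arg))
  :upper)

;; Make [ a dispatching macro character in the scratch readtable.
(make-dispatch-macro-character #\[ nil *xdoc-readtable*)

;; The second call replaces the first, because #\s is folded to #\S.
(set-dispatch-macro-character #\[ #\s #'lower-handler *xdoc-readtable*)
(set-dispatch-macro-character #\[ #\S #'upper-handler *xdoc-readtable*)

(defun read-xdoc (string)
  "Read one object from STRING using the scratch readtable."
  (let ((*readtable* *xdoc-readtable*))
    (read-from-string string)))
```

With this loaded, both (read-xdoc "[s") and (read-xdoc "[S") end up in upper-handler, so the lower-case registration is silently lost.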

I wonder if there are other places where the ANSI standard specifies
that, even though you wrote the backslashed character in lower case,
it will be changed to upper case.

While I suppose there is some theological argument defending this
decision, (and, by the way, Franz seems to abide by ANSI in this
instance), I think it would be nice to have a
set-dispatch-macro-character-retaining-case function, which could
trivially be used to implement the brain-damaged version by
upper-casing its 2nd argument. The reverse transformation is
not trivial.

Regards,
  Richard Fateman


From: Kent M Pitman
Subject: Re: ANSI spec brain damaged wrt case in dispatch macro char
Date: Thu, 30 Nov 2000 00:00:00 +0000
Message-ID: <sfw66l5p9wt.fsf@world.std.com>

Richard Fateman <·······@cs.berkeley.edu> writes:

> Thus the ANSI spec insists on changing #\s  into #\S.

It's not trying to force case on you, it's just neutralizing the case
to make sure that you don't write a sharpsign readmacro that distinguishes
upper and lower case, since all the predefined characters work for #A and #a
equally and it would be confusing for you to pick some other letter that
was unused and make it case-sensitive.

I agree that for other dispatch characters not defined by CL, it might be
reasonable to allow a mode saying whether the character after the dispatch
character was to be canonicalized in case.  This is a limitation of the
language, but hardly a fatal one, since you yourself know how to work around
it and could even implement a shadowed version of set-dispatch-macro-character
that did the right thing if you really cared.  Had you observed this limitation
earlier, I would definitely have voted to allow you the capability, by adding 
a :canonicalize-case argument, probably defaulting to T for compatibility
with existing code that expected this.  I don't think it's a bad thing to
have asked for at language design time; I just think we're no longer at
that point unless someone discovers a really fatal flaw, and I think it's
fortunate that no one has.

> I wonder if there are other places that the ANSI standard specifies
> that --even though you wrote the backslashed character in lower case--
> that it will change it to upper case.

Sure.  How about EQUALP? (equalp "foo" "FOO").  Or maybe it's downcasing it.
But at least it's ignoring it.  But not because it's got a deathwish for 
the case you typed--just because the operator in question is defined to
put "f" and "F" in an equivalence class, which is no more than what
the dispatch macro character stuff does.

> While I suppose there is some theological argument defending this
> decision, (and, by the way, Franz seems to abide by ANSI in this
> instance), I think it would be nice to have a
> set-dispatch-macro-character-retaining-case function, which could
> trivially be used to implement the brain-damaged version by
> upper-casing its 2nd argument. The reverse transformation is
> not trivial.

Why not make a FATEMAN-CL which offers just SET-DISPATCH-MACRO-CHARACTER
to your liking?  I'm not being facetious.  Each of us wants the language to
be different in some way.  I have ways I'd like it to be different.  But
every time I think of changing something these days, I think about just
making my own package and publishing it for others to use, since that requires
no change to the language.  I don't see that you can't do the same.
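
One way such a package could look, as a rough sketch rather than a finished design. Everything here is an assumption made for illustration: the package layout, the helper names, and the choice to make both READTABLE and CANONICALIZE-CASE keyword arguments (the standard function takes the readtable as an optional argument). It sidesteps the built-in dispatch table by installing an ordinary macro character that reads the sub-char itself, and it does not parse a numeric infix argument.

```lisp
(defpackage :fateman-cl
  (:use :common-lisp)
  (:shadow #:set-dispatch-macro-character)
  (:export #:set-dispatch-macro-character))

(in-package :fateman-cl)

;; (readtable disp-char) -> hash table from sub-char to handler.
;; Characters are compared with EQL, so #\s and #\S are distinct keys.
(defvar *case-tables* (make-hash-table :test #'equal))

(defun sub-table (readtable disp-char)
  (let ((key (list readtable disp-char)))
    (or (gethash key *case-tables*)
        (setf (gethash key *case-tables*) (make-hash-table)))))

(defun case-preserving-reader (stream disp-char)
  "Read the sub-char ourselves; try the exact case first, then the upcased one."
  (let* ((sub-char (read-char stream t nil t))
         (table (sub-table *readtable* disp-char))
         (handler (or (gethash sub-char table)
                      (gethash (char-upcase sub-char) table))))
    (if handler
        ;; No infix digits are parsed in this sketch, so ARG is always NIL.
        (funcall handler stream sub-char nil)
        (error "No handler for ~C~C" disp-char sub-char))))

(defun set-dispatch-macro-character (disp-char sub-char function
                                     &key (readtable *readtable*)
                                          (canonicalize-case t))
  "Like the CL function, but keeps SUB-CHAR's case when CANONICALIZE-CASE is NIL."
  (let ((key (if canonicalize-case (char-upcase sub-char) sub-char)))
    (setf (gethash key (sub-table readtable disp-char)) function)
    (set-macro-character disp-char #'case-preserving-reader nil readtable)
    t))

(in-package :cl-user)
```

Calling fateman-cl:set-dispatch-macro-character with :canonicalize-case nil once for #\s and once for #\S then keeps the two handlers apart, while the default behaves like the standard function for letters.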

From: Erik Naggum
Subject: Re: ANSI spec brain damaged wrt case in dispatch macro char
Date: Sun, 03 Dec 2000 00:00:00 +0000
Message-ID: <3184871207489902@naggum.net>

* Richard Fateman <·······@cs.berkeley.edu>
| At the risk of tossing gasoline on a fire that has
| almost burned out...

  I think using "braindamaged" about a technical decision like this is
  indicative of a lack of interest in solving a problem and a desire to
  toss gasoline on the embers to start a new fire aflaming.

| I thought what a great application for read table hacking.

  Really?  That doesn't look awfully smart to me, but OK, let's assume
  it's a possible application of read tables, not contrived gasoline.

| Clearly I should make [ into a dispatch character, and I should use
| set-dispatch-macro-character to do the processing of  the next
| modifier character, e.g. F  or s  in the examples above, and collect
| the arguments.

  I'd think it would be a lot smarter to read the "list" in as a whole
  since you would have to read up to the enclosing ] in every "macro",
  associate functions with the characters, strings, or symbols that make
  up the first object, and call the function associated with the first
  object when you hit that final ].

| This clear win turns into a sour taste, when debugging since lisp
| forbids you from separately doing
| 
| (set-dispatch-macro-character #\[ #\s .... )
| (set-dispatch-macro-character #\[ #\S .... )
|  
| from the ANSI spec.
|  set-dispatch-macro-character causes new-function to be called 
|         when disp-char followed by sub-char is read. If sub-char is a
|         lowercase letter, it is converted to its uppercase equivalent. 
|         It is an error if sub-char is one of the ten decimal digits. 

  This is a valid _complaint_, of course, even though the application
  still seems contrived to cause an unnecessary problem and conflict.

  A proposal that should be both backward compatible and easy to
  implement: The _standard_ dispatch macro characters are defined for
  the standard-case letters ("standard-case" being defined in a proposal
  yet to be published, pretend it reads "uppercase" if you really have
  to) and the case of the character _read_ should be adjusted according
  to the readtable-case of the readtable prior to dispatching.  That is,
  it was a design flaw to introduce case knowledge in the function that
  sets up the dispatch tables.
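
  One possible reading of that proposal, sketched as a helper the reader
  could apply to the sub-char before looking it up. The function name is
  hypothetical, and the treatment of :invert is an assumption of this
  sketch, since that mode is defined on whole tokens rather than on a
  single character.

```lisp
(defun canonical-sub-char (char &optional (readtable *readtable*))
  "Fold CHAR according to READTABLE's case mode before dispatching."
  (ecase (readtable-case readtable)
    (:upcase   (char-upcase char))
    (:downcase (char-downcase char))
    ;; :PRESERVE keeps the case as typed.  :INVERT is defined per token,
    ;; so a lone sub-char is left alone here (assumption of this sketch).
    ((:preserve :invert) char)))
```

  Under the default :upcase mode this gives the behaviour existing code
  expects, while a readtable set to :preserve would see #\s and #\S as
  different sub-chars.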

  The problem of case gets bigger according as the character sets get
  bigger.  Since modern languages should support Unicode natively and
  efficiently and this is hard to accomplish if you want to maintain a
  case-insensitivity throughout the entire character set and system, we
  need to rethink our character sensitivity needs a little bit.  It is
  _not_ hard to implement case-insensitivity in the Lisp reader as we
  read character by character and already have to do a number of costly
  conversions, anyway.  E.g., a character object could easily contain
  information about the case and the distance in the code space to the
  corresponding character of the other case.  (This holds for everything
  but three _stupidly_ transliterated characters from Cyrillic to Latin
  or whatever, but people who actually _use_ those should just be shot,
  anyway, so I'll pull a Foderaro and say that "in practice" it is not a
  problem, with nary a thought for the doofuses who wanted those wacky
  "characters" in the first place.)

| Thus the ANSI spec insists on changing #\s  into #\S.

  That's an amazingly inaccurate statement.

| This seems totally brain-damaged to me.  It requires that I dispatch
| to #\S and then look again at the sub-char to see if it was #\s or
| #\S, in effect dispatching AGAIN.

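(defun macro-case-dispatch (uppercase lowercase)
  ;; Returns a dispatch function that picks UPPERCASE or LOWERCASE
  ;; according to the case of the sub-char it is called with.
  (lambda (stream char arg)
    (let ((call (if (upper-case-p char) uppercase lowercase)))
      (if call
          (funcall call stream char arg)
        #+allegro (excl::dispatch-char-error stream char arg)))))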

  Now you can do straightforward stuff like this:

(set-dispatch-macro-character #\# #\Z
  (macro-case-dispatch #'FOOBAR #'foobar))

  I don't think this seems brain-damaged at all, but it probably still
  seems "totally brain-damaged" to you, since you aren't going to accept
  any other solution than "the standard is broken because upper-case is
  insane", isn't that right?

| I wonder if there are other places that the ANSI standard specifies
| that --even though you wrote the backslashed character in lower case--
| that it will change it to upper case.

  Well, char-upcase is a rather obvious candidate.  :)

| While I suppose there is some theological argument defending this
| decision, (and, by the way, Franz seems to abide by ANSI in this
| instance), I think it would be nice to have a
| set-dispatch-macro-character-retaining-case function, which could
| trivially be used to implement the brain-damaged version by
| upper-casing its 2nd argument. The reverse transformation is not
| trivial.

  However dubious your "braindamage" label about the case problems in
  general, _that_ is a truly braindamaged solution to your "problem".

  After having watched not one, but two, astonishingly stupid ways _not_
  to deal with the upper-case issue in ANSI Common Lisp, I must wonder
  whether the desire to be _annoyed_ by upper-case issues has caused a
  few old Lispers to go slightly bananas, making them miss some obvious
  technical solutions to their problems in preference to broken, stupid
  non-solutions that do nothing but show people how irrationally annoyed
  _they_ are with the upper-case decision.  This is more than bitterness
  at having to deal with a standard and people who voted them down in a
  committee, which probably would never have happened if they had been
  only _somewhat_ less antagonistic to upper-case to begin with.  How
  can you _fail_ to get the majority of a committee to agree with you
  that lower-case is better than upper-case, when that clearly is the
  simplest, best, most obvious solution?  My take: Accuse your opponents
  of being braindamaged _before_ you have even presented your case,
  and prefer to continue to call them braindamaged while you utterly
  fail to present a better technical solution to _their_ concerns.

  _I'm_ annoyed with the proponents of a lower-case Common Lisp that had
  a chance to affect the standard, because if _they_ had not been so
  braindamaged, maybe we would've had lower-case symbol names today.

#:Erik
-- 
  "When you are having a bad day and it seems like everybody is trying
   to piss you off, remember that it takes 42 muscles to produce a
   frown, but only 4 muscles to work the trigger of a good sniper rifle."
								-- Unknown

From: Richard Fateman
Subject: Re: ANSI spec brain damaged wrt case in dispatch macro char
Date: Mon, 04 Dec 2000 00:00:00 +0000
Message-ID: <3A2C1A23.99961B49@cs.berkeley.edu>

Erik Naggum wrote:
> 
> * Richard Fateman <·······@cs.berkeley.edu>
> | At the risk of tossing gasoline on a fire that has
> | almost burned out...
> 
>   I think using "braindamaged" about a technical decision like this is
>   indicative of a lack of interest in solving a problem and a desire to
>   toss gasoline on the embers to start a new fire aflaming.
> 
> | I thought what a great application for read table hacking.
> 
>   Really?  That doesn't look awfully smart to me, but OK, let's assume
>   it's a possible application of read tables, not contrived gasoline.

Well, thanks for allowing me to state for the record what occurred
to my own mind!
> 
> | Clearly I should make [ into a dispatch character, and I should use
> | set-dispatch-macro-character to do the processing of  the next
> | modifier character, e.g. F  or s  in the examples above, and collect
> | the arguments.
> 
>   I'd think it would be a lot smarter to read the "list" in as a whole
>   since you would have to read up to the enclosing ] in every "macro",

Sad to say, I learned that not every [ has a matching ].  In fact some
of the "macros" have no arguments and so they look like  [B   ... no
terminator.



>   associate functions with the characters, strings, or symbols that make
>   up the first object, and call the function associated with the first
>   object when you hit that final ].
> 
> | This clear win turns into a sour taste, when debugging since lisp
> | forbids you from separately doing
> |
> | (set-dispatch-macro-character #\[ #\s .... )
> | (set-dispatch-macro-character #\[ #\S .... )
> |
> | from the ANSI spec.
> |  set-dispatch-macro-character causes new-function to be called
> |         when disp-char followed by sub-char is read. If sub-char is a
> |         lowercase letter, it is converted to its uppercase equivalent.
> |         It is an error if sub-char is one of the ten decimal digits.
> 
>   This is a valid _complaint_, of course, even though the application
>   still seems contrived to cause an unnecessary problem and conflict.

If you want to look at the XDOC specification, I can direct you to a copy.


> 
>  <snip>  ....


>   "characters" in the first place.)
> 
> | Thus the ANSI spec insists on changing #\s  into #\S.
> 
>   That's an amazingly inaccurate statement.
> 
> | This seems totally brain-damaged to me.  It requires that I dispatch
> | to #\S and then look again at the sub-char to see if it was #\s or
> | #\S, in effect dispatching AGAIN.
> 
> (defun macro-case-dispatch (uppercase lowercase)
>   (lambda (stream char arg)
>     (let ((call (if (upper-case-p char) uppercase lowercase)))
>       (if call
>           (funcall call stream char arg)
>         #+allegro (excl::dispatch-char-error stream char arg)))))
> 
>   Now you can do straightforward stuff like this:
> 
> (set-dispatch-macro-character #\# #\Z
>   (macro-case-dispatch #'FOOBAR #'foobar))

This is true, and a nice solution for this particular application
when I can gather up all the pieces for #\Z and #\z. Thanks.

However, it does not allow me to set the dispatch function separately
for each character: it forces me to do this in pairs.  If I wanted
to aggregate characters I could write a single dispatch function for
all the characters of interest.

And if I wanted to change the meaning of #\z without also knowing
the meanings of #\Z it would be worrisome.  I could extract the old
meaning via get-dispatch-macro-character and embed it in the
new one, I suppose.
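
Roughly, that could look like the sketch below. The name set-one-case is made up. Like the macro-case-dispatch code quoted above, it assumes the reader hands the dispatch function the sub-char as typed; an implementation that upcases the character before calling the function would defeat it.

```lisp
(defun set-one-case (disp-char sub-char function
                     &optional (readtable *readtable*))
  "Install FUNCTION for SUB-CHAR's case only, keeping the old meaning
for the other case."
  (let ((old (get-dispatch-macro-character
              disp-char (char-upcase sub-char) readtable))
        (want-upper (upper-case-p sub-char)))
    (flet ((same-case-p (char)
             (if want-upper (upper-case-p char) (lower-case-p char))))
      (set-dispatch-macro-character
       disp-char sub-char
       (lambda (stream char arg)
         (cond ((same-case-p char) (funcall function stream char arg))
               ;; Fall back to whatever was installed before, if anything.
               (old (funcall old stream char arg))
               (t (error "No dispatch function for ~C~C" disp-char char))))
       readtable))))
```

Calling it once for #\Z and once for #\z stacks the two meanings, at the cost of one extra closure per redefinition.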
<snip>


> (Erik...)

>   _I'm_ annoyed with the proponents of a lower-case Common Lisp that had
>   a chance to affect the standard, because if _they_ had not been so
>   braindamaged, maybe we would've had lower-case symbol names today.

Interesting thought, but I guess you just had to be there at the time.

RJF

From: Erik Naggum
Subject: Re: ANSI spec brain damaged wrt case in dispatch macro char
Date: Tue, 05 Dec 2000 01:31:36 +0000
Message-ID: <3184968696769937@naggum.net>

* Richard Fateman <·······@cs.berkeley.edu>
| If you want to look at the XDOC specification, I can direct you to a copy.

  Not really.

| This is true, and a nice solution for this particular application when
| I can gather up all the pieces for #\Z and #\z.  Thanks.

  You're welcome.

| However, it does not allow me to set the dispatch function separately
| for each character:  it forces me to do this in pairs.  If
| I wanted to aggregate characters I could write a single dispatch
| function for all the characters of interest.

  I was looking into a solution to this.  I found it hard to access the
  shared code of the closure, which I would have been able to compare
  with the newly created closure, and it seemed very wrong to try to
  call a function I did not know was a function that was prepared to
  accept different types or numbers of arguments.  I could have relied
  on the arglist or the function-lambda-expression, but these are not
  reliable in a production system.  I could register the functions in a
  hash table of some sort and compare with the stored value and believe
  I could call the function to find out.  The latter _could_ work.  The
  function would then be able to return the upper- or lowercase function
  it would call with an additional argument it would be known to accept,
  such as a keyword argument, like :retrieve, and char being :upper or
  :lower or the actual character.
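
  A simpler variant of the registration idea, as a sketch only: keep
  the per-case functions in a table outside the closure, so either one
  can be looked up or replaced on its own. All names are hypothetical,
  the registry is global rather than per readtable, and it again assumes
  the reader passes the sub-char as typed.

```lisp
;; (disp-char . upcased sub-char) -> (upper-fn . lower-fn)
(defvar *case-registry* (make-hash-table :test #'equal))

(defun register-case-handler (disp-char sub-char function
                              &optional (readtable *readtable*))
  "Record FUNCTION for SUB-CHAR's case and install a shared dispatcher."
  (let* ((key (cons disp-char (char-upcase sub-char)))
         (slot (or (gethash key *case-registry*)
                   (setf (gethash key *case-registry*) (cons nil nil)))))
    (if (upper-case-p sub-char)
        (setf (car slot) function)
        (setf (cdr slot) function))
    (set-dispatch-macro-character
     disp-char sub-char
     (lambda (stream char arg)
       ;; The slot is consulted at read time, so later registrations count.
       (let ((fn (if (upper-case-p char) (car slot) (cdr slot))))
         (if fn
             (funcall fn stream char arg)
             (error "No handler registered for ~C~C" disp-char char))))
     readtable)
    slot))

(defun retrieve-case-handler (disp-char sub-char)
  "Return the function registered for exactly SUB-CHAR's case, or NIL."
  (let ((slot (gethash (cons disp-char (char-upcase sub-char))
                       *case-registry*)))
    (and slot
         (if (upper-case-p sub-char) (car slot) (cdr slot)))))
```

  The table plays the role of the :retrieve protocol: the stored
  functions are found without calling a closure of unknown origin.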

| And if I wanted to change the meaning of #\z without also knowing the
| meanings of #\Z it would be worrisome.  I could extract the old
| meaning via get-dispatch-macro-character and embed it in the new one,
| I suppose.

  You're welcome to extend the functionality to do that, but I found it
  too much work to finalize for a freebie.

#:Erik