READ-PRESERVING-WHITESPACE in recursive calls

From: Edi Weitz
Subject: READ-PRESERVING-WHITESPACE in recursive calls
Date: Sun, 19 Oct 2003 23:04:58 +0000
Message-ID: <87n0bw25id.fsf@bird.agharta.de>

The CLHS entry for READ-PRESERVING-WHITESPACE says:

  "READ-PRESERVING-WHITESPACE is exactly like READ when the
   RECURSIVE-P argument to READ-PRESERVING-WHITESPACE is true."

In other words, I can't use this function for its intended effect
(preserving whitespace characters after parsing an object) in my own
reader macros. Does someone out there know what the rationale for this
behaviour is?
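
For reference, the top-level difference between the two functions can
be seen with a string stream. This is only a minimal sketch; the
helper name NEXT-CHAR-AFTER is made up for illustration.

```lisp
;; Hypothetical helper: read one object from STRING with READER-FN and
;; return the next character still waiting on the stream.
(defun next-char-after (reader-fn string)
  (with-input-from-string (stream string)
    (funcall reader-fn stream)
    (peek-char nil stream nil nil)))

;; A top-level READ discards the whitespace character that ended the token.
(next-char-after #'read "foo bar")                        ; => #\b

;; READ-PRESERVING-WHITESPACE leaves that character on the stream.
(next-char-after #'read-preserving-whitespace "foo bar")  ; => #\Space
```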

Thank you,
Edi.

From: Thomas A. Russ
Subject: Re: READ-PRESERVING-WHITESPACE in recursive calls
Date: Mon, 20 Oct 2003 17:39:08 +0000
Message-ID: <ymir817g66b.fsf@sevak.isi.edu>

Edi Weitz <···@agharta.de> writes:

> The CLHS entry for READ-PRESERVING-WHITESPACE says:
> 
>   "READ-PRESERVING-WHITESPACE is exactly like READ when the
>    RECURSIVE-P argument to READ-PRESERVING-WHITESPACE is true."
> 
> In other words, I can't use this function for its intended effect
> (preserving whitespace characters after parsing an object) in my own
> reader macros. Does someone out there know what the rationale for this
> behaviour is?

Why not?  Can't you just set the RECURSIVE-P argument to NIL when you
call it in your own reader macro?  Is there some subtlety I don't
understand?

-- 
Thomas A. Russ,  USC/Information Sciences Institute

From: Edi Weitz
Subject: Re: READ-PRESERVING-WHITESPACE in recursive calls
Date: Mon, 20 Oct 2003 19:49:20 +0000
Message-ID: <874qy3hepr.fsf@bird.agharta.de>

On 20 Oct 2003 10:39:08 -0700, ···@sevak.isi.edu (Thomas A. Russ) wrote:

> Edi Weitz <···@agharta.de> writes:
> 
> > The CLHS entry for READ-PRESERVING-WHITESPACE says:
> > 
> >   "READ-PRESERVING-WHITESPACE is exactly like READ when the
> >    RECURSIVE-P argument to READ-PRESERVING-WHITESPACE is true."
> > 
> > In other words, I can't use this function for its intended effect
> > (preserving whitespace characters after parsing an object) in my
> > own reader macros. Does someone out there know what the rationale
> > for this behaviour is?
> 
> Why not?  Can't you just set the RECURSIVE-P argument to NIL when
> you call it in your own reader macro?  Is there some subtlety I
> don't understand?

Er, cough, cough - I don't /know/, actually. I'm new to reader macros
and I thought that I /must/ set it to T.

CLtL2 in 22.2.1 says

  "Many input functions also take an argument called recursive-p. If
   specified and not nil, this argument specifies that this call is
   not a ``top-level'' call to read but an imbedded call, typically
   from the function for a macro character. It is important to
   distinguish such recursive calls [...]"

On the other hand, Steele himself provides the following example, also
in 22.2.1,

  (defun slash-reader (stream char) 
    (declare (ignore char)) 
    (do ((path (list (read-preserving-whitespace stream)) 
               (cons (progn (read-char stream nil nil t) 
                            (read-preserving-whitespace 
                               stream)) 
                     path))) 
        ((not (char= (peek-char nil stream nil nil t) #\/)) 
         (cons 'path (nreverse path))))) 
  (set-macro-character #\/ #'slash-reader)

which sets RECURSIVE-P to T for the calls to READ-CHAR and PEEK-CHAR
but to NIL for READ-PRESERVING-WHITESPACE.

I'm confused...

Edi.

From: james anderson
Subject: Re: READ-PRESERVING-WHITESPACE in recursive calls
Date: Mon, 20 Oct 2003 20:25:21 +0000
Message-ID: <3F9443DD.D7936CA2@setf.de>

Edi Weitz wrote:
> 
>...
> ?
> 
> I'm confused...
> 

with or without the benefit of the discussion in section 23 of the argument
conventions for reader functions?



> Edi.

From: Edi Weitz
Subject: Re: READ-PRESERVING-WHITESPACE in recursive calls
Date: Mon, 20 Oct 2003 23:07:58 +0000
Message-ID: <873cdnjynl.fsf@bird.agharta.de>

On Mon, 20 Oct 2003 22:25:21 +0200, james anderson <··············@setf.de> wrote:

> Edi Weitz wrote:
> > 
> > I'm confused...
> 
> with or without the benefit of the discussion in section 23 of the
> argument conventions for reader functions?

That's the same text as the one from CLtL2 I was citing. I have to
admit that I hadn't read this thoroughly when I posted my initial
question about the rationale yesterday. My question remains, though,
but maybe I can frame it better now:

I understand the first (context for #n= and #n# syntax) and third
(possible EOF within objects) reasons why the RECURSIVE-P argument is
important, but I have problems with the second one:

The rationale there is that the top-level call to READ determines
whether whitespace should be preserved or not. But I think this is
only reasonable at the end of the object which is read by the
top-level call. I'm talking about preserving whitespace while you are
"in the middle" of the object.

Steele provides an example that I'll quote again:

  (defun slash-reader (stream char) 
    (declare (ignore char)) 
    (do ((path (list (read-preserving-whitespace stream)) 
               (cons (progn (read-char stream nil nil t) 
                            (read-preserving-whitespace 
                               stream)) 
                     path))) 
        ((not (char= (peek-char nil stream nil nil t) #\/)) 
         (cons 'path (nreverse path))))) 
  (set-macro-character #\/ #'slash-reader)

It is intended to convert something like

  /usr/games/zork

into 

  (path usr games zork)

and he continues that the call to READ-PRESERVING-WHITESPACE is
essential here because

  (zyedh /usr/games/zork /usr/games/boggle)

should be

  (zyedh (path usr games zork) (path usr games boggle))

and not

  (zyedh (path usr games zork usr games boggle)).
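
To try this without touching the global readtable, something like the
following sketch should do. READ-WITH-SLASH-SYNTAX is a name invented
here, and CHAR= has been swapped for EQL so that PEEK-CHAR returning
NIL at end of input does not signal a type error; otherwise the reader
function is Steele's.

```lisp
;; Steele's reader function, with EQL instead of CHAR= (see above).
(defun slash-reader (stream char)
  (declare (ignore char))
  (do ((path (list (read-preserving-whitespace stream))
             (cons (progn (read-char stream nil nil t)   ; skip the slash
                          (read-preserving-whitespace stream))
                   path)))
      ;; Stop as soon as the next character is not another slash.
      ((not (eql (peek-char nil stream nil nil t) #\/))
       (cons 'path (nreverse path)))))

;; Hypothetical wrapper: install the macro character in a private copy
;; of the standard readtable and read one object from STRING.
(defun read-with-slash-syntax (string)
  (let ((*readtable* (copy-readtable nil)))
    (set-macro-character #\/ #'slash-reader)
    (read-from-string string)))

(read-with-slash-syntax "(zyedh /usr/games/zork /usr/games/boggle)")
```

If the whitespace after ZORK is preserved, the result should be the
list with two separate PATH forms.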

But here the top-level call reads the whole list starting with ZYEDH
and it can (should?) only determine whether it's interested in
preserving the whitespace at the end of this list.[1]

On the other hand (that's the whole point of Steele's example) the
writer of SLASH-READER _must_ preserve whitespace. In fact, it must
preserve whitespace the top-level caller isn't interested in. And the
only way Steele can achieve this is by breaking the RECURSIVE-P rule
he has just established a couple of paragraphs earlier:

Either he calls READ-PRESERVING-WHITESPACE without a true RECURSIVE-P
argument and is thus subject to wrong #n=/#n# context or EOF in the
middle of an object or he sets RECURSIVE-P to true and can't control
whitespace anymore.
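
The #n=/#n# half of the dilemma can be probed with a sketch like the
one below. The names FRESH-CONTEXT-READER and TRY-LABEL-ACROSS-CALLS
and the choice of #\! are made up for illustration, and I would not
rely on any particular outcome: the code only reports what the Lisp at
hand does when a label defined by the outer call is referenced inside
a call made without a true RECURSIVE-P.

```lisp
;; Hypothetical reader function: reads the next object as if it were a
;; top-level call (RECURSIVE-P is NIL), which may start a new #n= context.
(defun fresh-context-reader (stream char)
  (declare (ignore char))
  (read-preserving-whitespace stream t nil nil))

;; Returns the object read, or (:ERROR <condition-type>) if the
;; implementation rejects the reference to label 1.
(defun try-label-across-calls ()
  (let ((*readtable* (copy-readtable nil)))
    (set-macro-character #\! #'fresh-context-reader)
    (handler-case (read-from-string "(#1=foo !#1#)")
      (error (condition) (list :error (type-of condition))))))

(try-label-across-calls)
```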

I think there's a dilemma. (And please don't tell me the answer is "So
don't do that...")

Thanks,
Edi.

[1] which happens to be a list and thus does not need a delimiting
    whitespace character at the end but the caller of the top-level
    READ might not know which kind of object he is reading

From: james anderson
Subject: Re: READ-PRESERVING-WHITESPACE in recursive calls
Date: Tue, 21 Oct 2003 08:45:10 +0000
Message-ID: <3F94F123.7FA22240@setf.de>

Edi Weitz wrote:
> 
> On Mon, 20 Oct 2003 22:25:21 +0200, james anderson <··············@setf.de> wrote:
> 
> > Edi Weitz wrote:
> > >
> > > I'm confused...
> >
> > with or without the benefit of the discussion in section 23 of the
> > argument conventions for reader functions?
> 
> That's the same text as the one from CLtL2 I was citing. I have to
> admit that I hadn't read this thoroughly when I posted my initial
> question about the rationale yesterday. My question remains, though,
> but maybe I can frame it better now:
> 
> I understand the first (context for #n= and #n# syntax) and third
> (possible EOF within objects) reasons why the RECURSIVE-P argument is
> important, but I have problems with the second one:
> 
> The rationale there is that the top-level call to READ determines
> whether whitespace should be preserved or not. But I think this is
> only reasonable at the end of the object which is read by the
> top-level call. I'm talking about preserving whitespace while you are
> "in the middle" of the object.
> 
> Steele provides an example that I'll quote again:
> 
>   (defun slash-reader (stream char)
>     (declare (ignore char))
>     (do ((path (list (read-preserving-whitespace stream))
>                (cons (progn (read-char stream nil nil t)
>                             (read-preserving-whitespace
>                                stream))
>                      path)))
>         ((not (char= (peek-char nil stream nil nil t) #\/))
>          (cons 'path (nreverse path)))))
>   (set-macro-character #\/ #'slash-reader)
> 
> It is intended to convert something like
> 
>   /usr/games/zork
> 
> into
> 
>   (path usr games zork)
> 
> and he continues that the call to READ-PRESERVING-WHITESPACE is
> essential here because
> 
>   (zyedh /usr/games/zork /usr/games/boggle)
> 
> should be
> 
>   (zyedh (path usr games zork) (path usr games boggle))
> 
> and not
> 
>   (zyedh (path usr games zork usr games boggle)).
> 
> But here the top-level call reads the whole list starting with ZYEDH
> and it can (should?) only determine whether it's interested in
> preserving the whitespace at the end of this list.[1]

this parses variously. while it may be just reiterating the above to do so,
perhaps it helps to restate the paragraph in the two most significant variations:

a [only] [the top-level call] [can, and therefore only it should] determine whether
  it's interested in preserving the whitespace at the end of this list.

and

b [top-level call] [should] determine whether it's interested in
  preserving the whitespace at the end of this list [only].

are two possibilities, which describe different aspects of the situation. in
any case, while both are true, the distinction suggests a motivation for the
various parameters. 

in agreement with the discussion cited above from section 23, one allows that
the "outer" reader have the prerogative to specify whether whitespace is to be
retained at the end of the form which it intends to read. thus the facility
supported by recursive-p to say, "what he said". that's aspect [a]. while i
cannot give an example, one should keep in mind that a given call to
read-preserving-whitespace might well pass a non-constant argument for recursive-p.

on the other hand, when an "internal" reader is aware of context-sensitive
syntax rules, it should well be able to override the whitespace handling. as
in the example above, where it knows that the outer form is unambiguously
delimited and will not consume whitespace at its close.

> ...
> 
> Either he calls READ-PRESERVING-WHITESPACE without a true RECURSIVE-P
> argument and is thus subject to wrong #n=/#n# context or EOF in the
> middle of an object or he sets RECURSIVE-P to true and can't control
> whitespace anymore.

the consequence of which is that, the syntax rules a read macro imposes when
reading a special syntax must be observed when generating the expression.
which does impose some restrictions on internal expressions in the string
reader which you posted earlier.

if this does in fact have to do with that string syntax, i don't understand
what problem it causes: it appeared that a recursive read would be indicated
within the "{  }" subform only - in which case one need not preserve
whitespace, while within the text one would not want to use the reader anyway.

i'm confused...

...

From: Edi Weitz
Subject: Re: READ-PRESERVING-WHITESPACE in recursive calls
Date: Wed, 22 Oct 2003 18:30:35 +0000
Message-ID: <87znftcegk.fsf@bird.agharta.de>

On Tue, 21 Oct 2003 10:45:10 +0200, james anderson <··············@setf.de> wrote:

> on the other hand, when an "internal" reader is aware of
> context-sensitive syntax rules, it should well be able to override
> the whitespace handling. as in the example above, where it knows
> that the outer form is unambiguously delimited and will not consume
> whitespace at its close.

So, do I understand you correctly that you're saying Steele did the
right thing in using READ-PRESERVING-WHITESPACE without a RECURSIVE-P
argument?

> the consequence of which is that, the syntax rules a read macro
> imposes when reading a special syntax must be observed when
> generating the expression.  which does impose some restrictions on
> internal expressions in the string reader which you posted earlier.

For example? Sorry for pestering, but I still don't quite get
it. Unless you're saying "some things aren't possible - that's it."

> if this does in fact have to do with that string syntax, i don't
> understand what problem it causes: it appeared that a recursive
> read would be indicated within the "{ }" subform only - in which
> case one need not preserve whitespace, while within the text one
> would not want to use the reader anyway.

If someone else reads this: I think James is referring to something I
sent to cmucl-help and the LW mailing list. FWIW this was about
CL-INTERPOL[1], see my announcement from a couple of hours earlier.

Yes, you are right that I don't need to preserve whitespace for the {}
case where

  (let ((a 42)) #?"${a}")

is expected to yield "42". What I had in mind (I have since abandoned
this idea because I don't like it anymore) was support for
interpolation without the curly brackets - like "$a " or "$a" in Perl.

Of course, it's another story in Lisp 'cause virtually every character
can be part of a symbol, so "$a.$b" wouldn't evaluate to what a Perl
programmer might expect but the basic idea was:

1. Once the reader has seen a dollar sign it checks if the next
   character is a curly bracket.

2. If it's _not_ a curly bracket it temporarily sets the syntax of #\"
   (or whatever is the outer delimiter of the string) to that of
   #\Space and then uses READ-PRESERVING-WHITESPACE directly after the
   dollar sign. Thus, things like #?"$a " as well as #?"$a" should
   work as expected - or so I thought.

It turned out that it worked in CMUCL, AllegroCL, CLISP, and in
LispWorks' REPL, but _not_ if LispWorks was loading a file. That's
when I began to read the spec more carefully... :)

As I said, I don't want this anymore. But _if_ I wanted it, I still
wouldn't know how to do it correctly.

Thanks,
Edi.

[1] <http://weitz.de/cl-interpol/>

From: james anderson
Subject: Re: READ-PRESERVING-WHITESPACE in recursive calls
Date: Wed, 22 Oct 2003 19:54:52 +0000
Message-ID: <3F96E079.F1578679@setf.de>

Edi Weitz wrote:
> 
> On Tue, 21 Oct 2003 10:45:10 +0200, james anderson <··············@setf.de> wrote:
> 
> > on the other hand, when an "internal" reader is aware of
> > context-sensitive syntax rules, it should well be able to override
> > the whitespace handling. as in the example above, where it knows
> > that the outer form is unambiguously delimited and will not consume
> > whitespace at its close.
> 
> So, do I understand you correctly that you're saying Steele did the
> right thing in using READ-PRESERVING-WHITESPACE without a RECURSIVE-P
> argument?

i don't recall ever being there, but that's the logical conclusion.
> 
> > the consequence of which is that, the syntax rules a read macro
> > imposes when reading a special syntax must be observed when
> > generating the expression.  which does impose some restrictions on
> > internal expressions in the string reader which you posted earlier.
> 
> For example? Sorry for pestering, but I still don't quite get
> it. Unless you're saying "some things aren't possible - that's it."

here i have been and, yup, it would mean that one could not have both the
global circle id's and different whitespace handling.

> 
> > if this does in fact have to do with that string syntax, i don't
> > understand what problem it causes: it appeared that a recursive
> > read would be indicated within the "{ }" subform only - in which
> > case one need not preserve whitespace, while within the text one
> > would not want to use the reader anyway.
> 
> If someone else reads this: I think James is referring to something I
> sent to cmucl-help and the LW mailing list. FWIW this was about
> CL-INTERPOL[1], see my announcement from a couple of hours earlier.
> 
> Yes, you are right that I don't need to preserve whitespace for the {}
> case where
> 
>   (let ((a 42)) #?"${a}")
> 
> is expected to yield "42". What I had in mind (I have since abandoned
> this idea because I don't like it anymore) was support for
> interpolation without the curly brackets - like "$a " or "$a" in Perl.
> 
> Of course, it's another story in Lisp 'cause virtually every character
> can be part of a symbol, so "$a.$b" wouldn't evaluate to what a Perl
> programmer might expect but the basic idea was:
> 
> 1. Once the reader has seen a dollar sign it checks if the next
>    character is a curly bracket.
> 
> 2. If it's _not_ a curly bracket it temporarily sets the syntax of #\"
>    (or whatever is the outer delimiter of the string) to that of
>    #\Space and then uses READ-PRESERVING-WHITESPACE directly after the
>    dollar sign. Thus, things like #?"$a " as well as #?"$a" should
>    work as expected - or so I thought.
> 
it's a moot point if you're not thinking this way anymore, but you could also
make #\" a breaking macro character. sort of like #\)

> It turned out that it worked in CMUCL, AllegroCL, CLISP, and in
> LispWorks' REPL, but _not_ if LispWorks was loading a file. That's
> when I began to read the spec more carefully... :)
> 
> As I said, I don't want this anymore. But _if_ I wanted it, I still
> wouldn't know how to do it correctly.
> 
> Thanks,
> Edi.
> 
> [1] <http://weitz.de/cl-interpol/>

From: Edi Weitz
Subject: Re: READ-PRESERVING-WHITESPACE in recursive calls
Date: Wed, 22 Oct 2003 20:02:19 +0000
Message-ID: <87u161avn8.fsf@bird.agharta.de>

On Wed, 22 Oct 2003 21:54:52 +0200, james anderson <··············@setf.de> wrote:

> here i have been and, yup, it would mean that one could not have
> both the global circle id's and different whitespace handling.

OK, I see. (Although I don't understand what "global circle id's"
are. Is this something like having your cake and eating it too?)

> > Yes, you are right that I don't need to preserve whitespace for
> > the {} case where
> > 
> >   (let ((a 42)) #?"${a}")
> > 
> > is expected to yield "42". What I had in mind (I have since
> > abandoned this idea because I don't like it anymore) was support
> > for interpolation without the curly brackets - like "$a " or "$a"
> > in Perl.
> > 
> > Of course, it's another story in Lisp 'cause virtually every
> > character can be part of a symbol, so "$a.$b" wouldn't evaluate to
> > what a Perl programmer might expect but the basic idea was:
> > 
> > 1. Once the reader has seen a dollar sign it checks if the next
> >    character is a curly bracket.
> > 
> > 2. If it's _not_ a curly bracket it temporarily sets the syntax of #\"
> >    (or whatever is the outer delimiter of the string) to that of
> >    #\Space and then uses READ-PRESERVING-WHITESPACE directly after the
> >    dollar sign. Thus, things like #?"$a " as well as #?"$a" should
> >    work as expected - or so I thought.
> > 
> it's a moot point if you're not thinking this way anymore, but you
> could also make #\" a breaking macro character. sort of like #\)

Yes, I briefly thought about that. That would fix "$a" but it would
make "$a " unpredictable - some Lisps would evaluate it to "42 ", some
others to "42" - same old problem again.

Edi.

From: David Lichteblau
Subject: Re: READ-PRESERVING-WHITESPACE in recursive calls
Date: Wed, 22 Oct 2003 18:04:15 +0000
Message-ID: <oelwuaxjgio.fsf@vanilla.rz.fhtw-berlin.de>

Edi Weitz <···@agharta.de> writes:
>   "READ-PRESERVING-WHITESPACE is exactly like READ when the
>    RECURSIVE-P argument to READ-PRESERVING-WHITESPACE is true."

Yes, because both READ and (of course) READ-P-W preserve whitespace when
RECURSIVE-P is true.  Not sure where the standard says that, but
23.1.3.2 (2.) has some explanation.

> In other words, I can't use this function for its intended effect
> (preserving whitespace characters after parsing an object) in my own
> reader macros. Does someone out there know what the rationale for this
> behaviour is?

Inner calls never consume whitespace.

The outermost call decides, since if READ is the outermost call
(RECURSIVE-P is false), it consumes whitespace for you.
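
A way to check this on a given implementation might look like the
sketch below. PROBE-READER, PROBE-RECURSIVE-READ and the use of #\!
are assumptions made for the example, not anything from the standard.
The inner call is a plain READ with RECURSIVE-P true, and the outer
call made by READ-FROM-STRING behaves like READ by default.

```lisp
;; Hypothetical reader function: read one object recursively, then
;; report which character is next on the stream.
(defun probe-reader (stream char)
  (declare (ignore char))
  (let ((object (read stream t nil t)))
    (list object (peek-char nil stream nil nil t))))

;; Hypothetical wrapper using a private copy of the standard readtable.
(defun probe-recursive-read (string)
  (let ((*readtable* (copy-readtable nil)))
    (set-macro-character #\! #'probe-reader)
    (read-from-string string)))

;; The character paired with FOO is #\Space if the inner call left the
;; whitespace alone, and #\b if it consumed it.
(probe-recursive-read "(!foo bar)")
```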

(Unless I'm very confused, which may well be the case today.)

From: Edi Weitz
Subject: Re: READ-PRESERVING-WHITESPACE in recursive calls
Date: Wed, 22 Oct 2003 18:53:23 +0000
Message-ID: <87vfqhcdek.fsf@bird.agharta.de>

On Wed, 22 Oct 2003 20:04:15 +0200, David Lichteblau <············@lichteblau.com> wrote:

> Inner calls never consume whitespace.

Well, I think this can happen. See the rest of the thread and the
Steele example I posted.

> (Unless I'm very confused, which may well be the case today.)

I'm still confused a bit, FWIW... :)

Cheers,
Edi.