From: Daniel Pittman
Subject: Read table modification question.
Date: 
Message-ID: <87vglsfw9x.fsf@inanna.rimspace.net>
I am using CMU Common Lisp (2.5.2) to write (well, prototype) a little
language parser/compiler.

As part of this, I would like to use the Lisp reader to read in the
contents of the input files. This input is similar to:

,----
| # '#' is a comment character, to the end of the line.
| command arg1 arg2 arg3;
| command arg1 { opt1 opt2 } arg2;
`----

So, to get the right `read' behavior, I need to create a modified
readtable with the syntax of #\# and #\; (at least) changed.[1]

Is there any way to do this other than to use `set-syntax-from-char'?
Reading the HyperSpec, it seems that there are many functions for
modifying the reader macro dispatch function associated with a
character, but no way to actually set the syntax type of a character.

So, can anyone offer guidance here? Am I taking this the wrong way --
should I be avoiding the Lisp reader in deference to something else?

I would like to keep the syntax of the input, of course, as that's half
the point of the prototype. :)

        Daniel

Footnotes: 
[1]  Correct me, of course, if I have started out the wrong way with
     this.

-- 
Life is not all lovely thorns and singing vultures, you know...
        -- Morticia Addams, _The Addams Family_

From: Kent M Pitman
Subject: Re: Read table modification question.
Date: 
Message-ID: <sfwn174lbit.fsf@world.std.com>
Daniel Pittman <······@rimspace.net> writes:

> 
> I am using CMU Common Lisp (2.5.2) to write (well, prototype) a little
> language parser/compiler.
> 
> As part of this, I would like to use the Lisp reader to read in the
> contents of the input files. This input is similar to:
> 
> ,----
> | # '#' is a comment character, to the end of the line.
> | command arg1 arg2 arg3;
> | command arg1 { opt1 opt2 } arg2;
> `----
> 
> So, to get the right `read' behavior, I need to create a modified
> readtable with the syntax of #\# and #\; (at least) changed.[1]
> 
> Is there any way to do this other than to use `set-syntax-from-char'?

I don't understand.  Are you trying to work around a bug or did you not
read the two entries from CLHS on set-syntax-from-char and set-macro-character,
both of which offer you examples of how to make a comment work?
How many ways do you need to do this?

As to modifying #\;, you don't say what you want it changed to.
If you need to see it as a token, I recommend just
 (defvar *end-of-command* (list '*end-of-command*))
 ;; The third argument is NON-TERMINATING-P; the readtable comes fourth.
 (set-macro-character #\; #'(lambda (stream char)
                              (declare (ignore stream char))
                              *end-of-command*)
                      nil *my-readtable*)
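
A minimal sketch of how such a token could be used, assuming a private
readtable in which #\# copies the standard line-comment syntax of #\;.
The name READ-COMMAND is hypothetical and not from the thread, and the
braces are left as ordinary symbol constituents:

```lisp
(defvar *end-of-command* (list '*end-of-command*))

(defvar *my-readtable*
  (let ((rt (copy-readtable nil)))      ; fresh copy of the standard readtable
    ;; '#' starts a comment running to end of line, like standard ';'.
    (set-syntax-from-char #\# #\; rt)
    ;; ';' now yields a unique marker object instead of starting a comment.
    (set-macro-character #\;
                         (lambda (stream char)
                           (declare (ignore stream char))
                           *end-of-command*)
                         nil rt)
    rt))

(defun read-command (stream)
  "Collect objects up to the next ';'. Returns NIL at end of input
\(and also for an empty command)."
  (let ((*readtable* *my-readtable*)
        (*read-eval* nil))
    (loop for token = (read stream nil stream) ; the stream doubles as EOF marker
          until (or (eq token stream) (eq token *end-of-command*))
          collect token)))
```

Called repeatedly on a stream, READ-COMMAND returns one list of symbols
per command; `{` and `}` come through as plain symbols.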

> Reading the HyperSpec, it seems that there are many functions for
> modifying the reader macro dispatch function associated with a
> character, but no way to actually set the syntax type of a character.

This is right.  It's implementation-dependent what those bits even are.
 
> So, can anyone offer guidance here? Am I taking this the wrong way --

If you re-read your message, you'll see you didn't show any examples of
what way you were doing it.

> should I be avoiding the Lisp reader in deference to something else?

That depends on how effectively you are making use of the readtable.  
Personally, I wouldn't use the readtable for anything but Lisp forms, for
which it is designed, but I'm sure Erik Naggum will say I'm losing out 
for not using it for more.  It's just a personal preference thing.
 
> I would like to keep the syntax of the input, of course, as that's half
> the point of the prototype. :)

If it were me, I'd just write the command parser from scratch myself, using
read-char and peek-char.  It's not that hard.  But that's just a personal
call.  But it lets you completely control what aspects of the parsed form
you retain and can display later.
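
A rough sketch of what such a hand-written tokenizer might look like,
assuming the input syntax quoted above. The function names and the
keyword tokens (:semicolon, :open, :close) are hypothetical choices for
illustration, not anything from the thread:

```lisp
(defun skip-blanks-and-comments (stream)
  "Skip whitespace and '#' comments. Return the next character without
consuming it, or NIL at end of input."
  (loop
    (let ((c (peek-char t stream nil nil)))   ; T means skip whitespace first
      (cond ((null c) (return nil))
            ((char= c #\#) (read-line stream nil)) ; drop the rest of the line
            (t (return c))))))

(defun read-word (stream)
  "Collect characters up to whitespace or one of ; # { }."
  (with-output-to-string (out)
    (loop for c = (peek-char nil stream nil nil)
          while (and c
                     (not (find c '(#\Space #\Tab #\Newline #\Return
                                    #\; #\# #\{ #\}))))
          do (write-char (read-char stream) out))))

(defun next-token (stream)
  "Return a word as a string, one of :SEMICOLON :OPEN :CLOSE, or NIL at EOF."
  (let ((c (skip-blanks-and-comments stream)))
    (case c
      ((nil) nil)
      (#\; (read-char stream) :semicolon)
      (#\{ (read-char stream) :open)
      (#\} (read-char stream) :close)
      (t (read-word stream)))))
```

Because words stay as strings, their original case and spelling are
preserved for later display.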

>         Daniel
> 
> Footnotes: 
> [1]  Correct me, of course, if I have started out the wrong way with
>      this.
From: Daniel Pittman
Subject: Re: Read table modification question.
Date: 
Message-ID: <87wv67fr8x.fsf@inanna.rimspace.net>
On Tue, 19 Jun 2001, Kent M. Pitman wrote:
> Daniel Pittman <······@rimspace.net> writes:

[...]

>> As part of this, I would like to use the Lisp reader to read in the
>> contents of the input files. This input is similar to:

[...]

>> So, to get the right `read' behavior, I need to create a modified
>> readtable with the syntax of #\# and #\; (at least) changed.[1]
>> 
>> Is there any way to do this other than to use `set-syntax-from-char'?
> 
> I don't understand. Are you trying to work around a bug or did you not
> read the two entries from CLHS on set-syntax-from-char and
> set-macro-character, both of which offer you examples of how to make
> a comment work?

Er, neither? 

> How many ways do you need to do this?

I wasn't sure that `set-syntax-from-char' (which was the one that I was
after) was the *right* way to do what I wanted.

[...]

>> So, can anyone offer guidance here? Am I taking this the wrong way --
> 
> If you re-read your message, you'll see you didn't show any examples
> of what way you were doing it.

Sorry, I was after more general guidance, rather than specific "is this
code correct" guidance. An "is this the Lisp way to do it" question.

>> should I be avoiding the Lisp reader in deference to something else?
> 
> That depends on how effectively you are making use of the readtable. 
> Personally, I wouldn't use the readtable for anything but Lisp forms,
> for which it is designed, but I'm sure Erik Naggum will say I'm losing
> out for not using it for more. It's just a personal preference thing.

Erik's comments were very helpful to me, specifically that I was
probably wasting time trying to shoehorn a non-Lisp language through the
Lisp reader.

>> I would like to keep the syntax of the input, of course, as that's
>> half the point of the prototype. :)
> 
> If it were me, I'd just write the command parser from scratch myself,
> using read-char and peek-char. It's not that hard. But that's just a
> personal call. But it lets you completely control what aspects of the
> parsed form you retain and can display later.

Cool. That's pretty much what I intend now. Sorry the question was so
vague -- and thanks for helping anyway. :)

        Daniel

-- 
Democracy is ever eager for rapid progress, and the only
progress which can be rapid is progress made down hill.
        -- Sir James Jeans
From: Steven D. Majewski
Subject: Re: Read table modification question.
Date: 
Message-ID: <9go406$ghm$1@murdoch.acc.Virginia.EDU>
In article <··············@inanna.rimspace.net>,
Daniel Pittman  <······@rimspace.net> wrote:
>
>As part of this, I would like to use the Lisp reader to read in the
>contents of the input files. This input is similar to:
>
>,----
>| # '#' is a comment character, to the end of the line.
>| command arg1 arg2 arg3;
>| command arg1 { opt1 opt2 } arg2;
>`----
>
>So, to get the right `read' behavior, I need to create a modified
>readtable with the syntax of #\# and #\; (at least) changed.[1]
>
>Is there any way to do this other than to use `set-syntax-from-char'?
>Reading the HyperSpec, it seems that there are many functions for
>modifying the reader macro dispatch function associated with a
>character, but no way to actually set the syntax type of a character.

Is there some reason you don't want to use 'set-syntax-from-char' ? 
( There are lower level routines you can use, but that would seem 
  the appropriate method here. )

I use (*):
	( set-syntax-from-char #\, #\Space )
	( set-syntax-from-char #\# #\; )

in a function to read in numerical data files -- 
  the first makes commas read as whitespace so it will accept 
     comma delimited lists,
  the second makes '#' read as a comment char -- so I can read
  in files from unix programs that use that convention. 

As long as they are data files and not generic lisp files that 
 might use character literals, it works. ( In my compiled code,
the order of those two lines didn't seem to matter, but if you
type it in the terminal, it's likely that the second line has
to come last! ;-) 

Just be sure to wrap it all in an UNWIND-PROTECT block that resets
the original readtable (with COPY-READTABLE) whether it exits normally
or not. 
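
A minimal sketch of the same idea in Common Lisp. Note the swap: it
rebinds *READTABLE* with LET instead of using UNWIND-PROTECT, which
likewise restores the previous readtable however the function exits.
The name READ-NUMBERS is hypothetical:

```lisp
(defun read-numbers (stream)
  "Read every object from STREAM, treating ',' as whitespace and '#' as
a comment-to-end-of-line character."
  (let ((*readtable* (copy-readtable nil)) ; private copy of the standard table
        (*read-eval* nil))
    (set-syntax-from-char #\, #\Space)     ; commas now separate tokens
    (set-syntax-from-char #\# #\;)         ; '#' behaves like standard ';'
    (loop for x = (read stream nil stream) ; the stream doubles as EOF marker
          until (eq x stream)
          collect x)))
```

Since the changes are made to a copy bound only inside the function,
the global readtable is never touched.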

>So, can anyone offer guidance here? Am I taking this the wrong way --
>should I be avoiding the Lisp reader in deference to something else?

Why do you think the reader would be so programmable if you weren't
supposed to use it? 

-- Steve Majewski <·····@Virginia.EDU> 



(* "I use..." :
   Actually, this was in XlispStat, not Common Lisp, and until I 
   added an implementation of #'set-syntax-from-char, I had to 
   hack a lower level method, but I plan to rewrite it using
   #'set-syntax-from-char. In XlispStat, the readtables happen 
    to be implemented by vectors, so the actual code used was
    more like:
	(setf (elt *readtable* ( char-code #\# )) ... 
*)
From: ···············@solibri.com
Subject: Re: Read table modification question.
Date: 
Message-ID: <un174a1cd.fsf@solibri.com>
Daniel Pittman <······@rimspace.net> writes:

> I am using CMU Common Lisp (2.5.2) to write (well, prototype) a little
> language parser/compiler.
> 
> As part of this, I would like to use the Lisp reader to read in the
> contents of the input files. This input is similar to:
> 
> ,----
> | # '#' is a comment character, to the end of the line.
> | command arg1 arg2 arg3;
> | command arg1 { opt1 opt2 } arg2;
> `----
> 
> So, to get the right `read' behavior, I need to create a modified
> readtable with the syntax of #\# and #\; (at least) changed.[1]


A possibly simpler brute-force way of doing this would be to run the
data through a filter which would exchange the troublesome characters for
some other character (if your language syntax has left something free to
use). Either externally, for example using the Unix tr command, or
internally in lisp.

-- 
From: Erik Naggum
Subject: Re: Read table modification question.
Date: 
Message-ID: <3201974391024564@naggum.net>
* Daniel Pittman <······@rimspace.net>
> I am using CMU Common Lisp (2.5.2) to write (well, prototype) a little
> language parser/compiler.

  If that language has Lisp nature, it is a good idea to use the Lisp
  reader.  If it does not have the Lisp nature, it is a very, very bad
  novice mistake to use the Lisp reader.  In general, few syntaxes have the
  Lisp nature.  The primary criterion is that the first character (possibly
  the first two) should determine the type of the object and the method of
  converting a character stream (text) representation into an in-memory
  representation (object).  The exceptions are symbols, which are whatever
  is left after a sequence of characters not otherwise starting an object
  is determined not to be a number.  This rule is part of the Lisp nature,
  and it is _not_ part of most other syntaxes.
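
  A small illustration of that rule with the standard readtable.  The
  helper name CLASSIFY is hypothetical; it only reports what kind of
  object the reader produced for a given piece of text:

```lisp
(defun classify (text)
  "Read one object from TEXT with standard syntax and name its kind."
  (with-standard-io-syntax
    (let ((*read-eval* nil))
      (typecase (read-from-string text)
        (string :string)   ; text began with a double quote
        (number :number)   ; token matched number syntax
        (cons   :list)     ; text began with an open parenthesis
        (symbol :symbol)   ; any other token
        (t      :other)))))
```

  For example, "42" is classified as a number, while "4two" starts the
  same way but is not a number, so it falls through to a symbol.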

> I would like to keep the syntax of the input, of course, as that's half
> the point of the prototype. :)

  If you care to know my opinion, I think semicolon-and-braces-oriented
  syntaxes suck and that it is a very, very bad idea to use them at all.
  It is far easier to write a parser for a syntax with the Lisp nature in
  any language than it is to write a parser for that stupid semiconcoction.
  Whoever decided to use the semicolon to _end_ something should just be
  taken out and have his colon semified.  (At least COBOL and SQL managed
  to use a period.)
  
#:Erik
-- 
  Travel is a meat thing.
From: Daniel Pittman
Subject: Re: Read table modification question.
Date: 
Message-ID: <8766drh62v.fsf@inanna.rimspace.net>
On Tue, 19 Jun 2001, Erik Naggum wrote:
> * Daniel Pittman <······@rimspace.net>
>> I am using CMU Common Lisp (2.5.2) to write (well, prototype) a
>> little language parser/compiler.
> 
>   If that language has Lisp nature, it is a good idea to use the Lisp
>   reader. If it does not have the Lisp nature, it is a very, very bad
>   novice mistake to use the Lisp reader. 

Right. That makes sense. It goes a lot of the way to explaining why I
felt that it was a fight against the reader, not working with it.

I was misled by the `use the reader as a tokenizer' comments elsewhere
in the HyperSpec, I suspect. Er, that and being too lazy to write my
own.

Thanks,
        Daniel

-- 
Forsan et haec olim meminisse juvabit.
Some day, perhaps, even this will be pleasant to remember.
        -- Vergil, Aeneid