From: a314658
Subject: Newbie: input file with commas like this: (, ,)
Date: 
Message-ID: <497cf18e$0$3340$6e1ede2f@read.cnntp.org>
One of the sexp's in a file that I'm trying to read has the following:

(foo (bar baz) (, ,) )

and when I try to read it in I get the error: comma is illegal outside of 
backquote. What is a way around that?

thanks,
-Victor Piousbox, a student

From: D Herring
Subject: Re: Newbie: input file with commas like this: (, ,)
Date: 
Message-ID: <497cf248$0$3339$6e1ede2f@read.cnntp.org>
a314658 wrote:
> One of the sexp's in a file that I'm trying to read has the following:
> 
> (foo (bar baz) (, ,) )
> 
> and when I try to read it in I get error: comma is illegal outside of 
> backquote. what is a way around that?

What does foo expect the commas to be?
(foo (bar baz) (", ,"))
(foo (bar baz) (#\, #\,))
(foo (bar baz) (|, ,|))
...

- Daniel
From: a314658
Subject: Re: Newbie: input file with commas like this: (, ,)
Date: 
Message-ID: <497cf425$0$3339$6e1ede2f@read.cnntp.org>
Er... I can't change the dataset. But it's just a tree of constants, no 
functions. It goes something like this:

( (S
     (VP (VBD brought)
       (PP
         (NP (PRP him) ))
       (NP
         (NP (DT a) (NN mixture) )
         (, ,)
         (PP (IN of))))
     (. .) ))

And so the (. .) and (, ,) part is giving me trouble.

From: Pascal J. Bourguignon
Subject: Re: Newbie: input file with commas like this: (, ,)
Date: 
Message-ID: <87r62r2ame.fsf@galatea.local>
a314658 <······@gmail.com> writes:

> Er... I can't change the dataset. But it's just a tree of constants,
> no functions. It goes something like this:
>
> ( (S
>     (VP (VBD brought)
>       (PP
>         (NP (PRP him) ))
>       (NP
>         (NP (DT a) (NN mixture) )
>         (, ,)
>         (PP (IN of))))
>     (. .) ))
>
> And so the (. .) and (, ,) part is giving me trouble.

If you cannot change the data, then you will have to implement your
own reader (or use a library reader), because unfortunately, the
Common Lisp standard doesn't allow changing the constituent traits of
characters, therefore nothing can be done to read the dot not as an
invalid token by CL:READ.  For the comma, you can just remove the
reader macro normally attached to it.  But for the dot, you need to
change the whole reader.

(It would probably be much easier to copy your data file substituting
these characters:

      sed -e 's/\([.,]\)/\\\1/g' < data  > data-for-cl

and use data-for-cl with the normal readtable, but if you cannot
change the dataset, perhaps you are forbidden to do that too...)
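
A minimal sketch of the second half of that approach, assuming the sed
step has already produced backslash-escaped data. The name READ-ESCAPED
and the sample string are illustrative and not from the thread. With the
escapes in place, the standard readtable reads \, and \. as ordinary
symbols:

```lisp
;; Assumption: sed has already turned "," into "\," and "." into "\.".
;; (The backslashes are doubled here only because this is a Lisp string.)
(defparameter *escaped-data* "(foo (bar baz) (\\, \\,) (\\. \\.))")

(defun read-escaped (string)
  "Read one form from STRING using the standard readtable."
  (with-standard-io-syntax
    (let ((*read-eval* nil))          ; the data is constants only, so disable #.
      (values (read-from-string string)))))

;; The third and fourth elements are lists of symbols named "," and ".".
(read-escaped *escaped-data*)
```

Note that the sed expression escapes every dot and comma, including those
inside longer tokens, so a token such as 1.5 would be read as a symbol
rather than as a number.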


Have a look at:
http://darcs.informatimago.com/lisp/common-lisp/reader.lisp
http://www.informatimago.com/develop/lisp/index.html

-- 
__Pascal Bourguignon__
From: ······@corporate-world.lisp.de
Subject: Re: Newbie: input file with commas like this: (, ,)
Date: 
Message-ID: <6ba420ee-8508-411a-b57e-73c5761cedd6@s9g2000prg.googlegroups.com>
On 26 Jan., 01:07, ····@informatimago.com (Pascal J. Bourguignon)
wrote:
> a314658 <······@gmail.com> writes:
> > Er... I can't change the dataset. But it's just a tree of constants,
> > no functions. It goes something like this:
>
> > ( (S
> >     (VP (VBD brought)
> >       (PP
> >         (NP (PRP him) ))
> >       (NP
> >         (NP (DT a) (NN mixture) )
> >         (, ,)
> >         (PP (IN of))))
> >     (. .) ))
>
> > And so the (. .) and (, ,) part is giving me trouble.
>
> If you cannot change the data, then you will have to implement your
> own reader (or use a library reader), because unfortunately, the
> Common Lisp standard doesn't allow changing the constituent traits of
> characters, therefore nothing can be done to read the dot not as an
> invalid token by CL:READ.  For the comma, you can just remove the
> reader macro normally attached to it.  But for the dot, you need to
> change the whole reader.
>
> (It would probably be much easier to copy your data file substituting
> these characters:
>
>       sed -e 's/\([.,]\)/\\\1/g' < data  > data-for-cl
>
> and use data-for-cl with the normal readtable, but if you cannot
> change the dataset, perhaps you are forbidden to do that too...)
>
> Have a look at:
> http://darcs.informatimago.com/lisp/common-lisp/reader.lisp
> http://www.informatimago.com/develop/lisp/index.html
>
> --
> __Pascal Bourguignon__

(defparameter *data* "(foo (bar baz) (, ,) (. .))")
(defparameter *rt* (copy-readtable))

(set-macro-character #\. (lambda (stream char) '|.|) nil *rt*)
(set-macro-character #\, (lambda (stream char) '|,|) nil *rt*)
(set-macro-character #\( (lambda (stream char)
                           (read-delimited-list #\) stream))
                     t *rt*)

(let ((*readtable* *rt*))
  (with-input-from-string (s *data*)
    (read s)))

->  (FOO (BAR BAZ) (\, \,) (\. \.))



Works for me...
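
Since the original question was about reading a file, here is a sketch
along the same lines that collects every top-level form from a stream.
The function names MAKE-PUNCTUATION-READTABLE and READ-ALL-FORMS are
made up for illustration; the readtable setup follows the code above,
except that the symbols are interned at read time and #\( is kept as a
terminating macro character.

```lisp
(defun make-punctuation-readtable ()
  "Return a copy of the standard readtable in which , and . read as symbols."
  (let ((rt (copy-readtable nil)))
    (flet ((as-symbol (stream char)
             (declare (ignore stream))
             (values (intern (string char)))))   ; #\, becomes the symbol |,|
      (set-macro-character #\. #'as-symbol nil rt)
      (set-macro-character #\, #'as-symbol nil rt))
    ;; Replace the list reader so that a dot inside a list is not
    ;; treated as consing-dot notation.
    (set-macro-character #\(
                         (lambda (stream char)
                           (declare (ignore char))
                           (read-delimited-list #\) stream t))
                         nil rt)
    rt))

(defun read-all-forms (stream)
  "Collect every top-level form on STREAM until end of file."
  (let ((*readtable* (make-punctuation-readtable))
        (*read-eval* nil)                        ; data only, so disable #.
        (eof (list nil)))                        ; unique end-of-file marker
    (loop for form = (read stream nil eof)
          until (eq form eof)
          collect form)))

;; Two top-level trees in one input, as a data file might contain.
(with-input-from-string (in "((S (NP (DT a) (NN mixture)) (, ,) (. .)))
((S (VP (VBD brought)) (. .)))")
  (read-all-forms in))
```

For a real file, the same function can be called on the stream from
WITH-OPEN-FILE. Because , and . are terminating macro characters here,
a token such as 1,000 is split into several objects.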
From: Pascal J. Bourguignon
Subject: Re: Newbie: input file with commas like this: (, ,)
Date: 
Message-ID: <87k58i3kos.fsf@galatea.local>
"······@corporate-world.lisp.de" <······@corporate-world.lisp.de> writes:

> On 26 Jan., 01:07, ····@informatimago.com (Pascal J. Bourguignon)
> wrote:
>> a314658 <······@gmail.com> writes:
>> > Er... I can't change the dataset. But it's just a tree of constants,
>> > no functions. It goes something like this:
>>
>> > ( (S
>> >     (VP (VBD brought)
>> >       (PP
>> >         (NP (PRP him) ))
>> >       (NP
>> >         (NP (DT a) (NN mixture) )
>> >         (, ,)
>> >         (PP (IN of))))
>> >     (. .) ))
>>
>> > And so the (. .) and (, ,) part is giving me trouble.
>>
>> If you cannot change the data, then you will have to implement your
>> own reader (or use a library reader), because unfortunately, the
>> Common Lisp standard doesn't allow changing the constituent traits of
>> characters, therefore nothing can be done to read the dot not as an
>> invalid token by CL:READ.  For the comma, you can just remove the
>> reader macro normally attached to it.  But for the dot, you need to
>> change the whole reader.
>>
>> (It would probably be much easier to copy your data file substituting
>> these characters:
>>
>>       sed -e 's/\([.,]\)/\\\1/g' < data  > data-for-cl
>>
>> and use data-for-cl with the normal readtable, but if you cannot
>> change the dataset, perhaps you are forbidden to do that too...)
>>
>> Have a look at:
>> http://darcs.informatimago.com/lisp/common-lisp/reader.lisp
>> http://www.informatimago.com/develop/lisp/index.html
>>
>> --
>> __Pascal Bourguignon__
>
> (defparameter *data* "(foo (bar baz) (, ,) (. .))")
> (defparameter *rt* (copy-readtable))
>
> (set-macro-character #\. (lambda (stream char) '|.|) nil *rt*)
> (set-macro-character #\, (lambda (stream char) '|,|) nil *rt*)
> (set-macro-character #\( (lambda (stream char)
>                            (read-delimited-list #\) stream))
>                      t *rt*)
>
> (let ((*readtable* *rt*))
>   (with-input-from-string (s *data*)
>     (read s)))
>
> ->  (FOO (BAR BAZ) (\, \,) (\. \.))
>
>
>
> Works for me...

Yes, it works here too.  I must have done something wrong in my own tests...

-- 
__Pascal Bourguignon__
From: a314658
Subject: Re: Newbie: input file with commas like this: (, ,)
Date: 
Message-ID: <497e4ba2$0$3339$6e1ede2f@read.cnntp.org>
······@corporate-world.lisp.de wrote:
> On 26 Jan., 01:07, ····@informatimago.com (Pascal J. Bourguignon)
> wrote:
>> a314658 <······@gmail.com> writes:
>>> Er... I can't change the dataset. But it's just a tree of constants,
>>> no functions. It goes something like this:
>>> ( (S
>>>     (VP (VBD brought)
>>>       (PP
>>>         (NP (PRP him) ))
>>>       (NP
>>>         (NP (DT a) (NN mixture) )
>>>         (, ,)
>>>         (PP (IN of))))
>>>     (. .) ))
>>> And so the (. .) and (, ,) part is giving me trouble.
>> If you cannot change the data, then you will have to implement your
>> own reader (or use a library reader), because unfortunately, the
>> Common Lisp standard doesn't allow changing the constituent traits of
>> characters, therefore nothing can be done to read the dot not as an
>> invalid token by CL:READ.  For the comma, you can just remove the
>> reader macro normally attached to it.  But for the dot, you need to
>> change the whole reader.
>>
>> (It would probably be much easier to copy your data file substituting
>> these characters:
>>
>>       sed -e 's/\([.,]\)/\\\1/g' < data  > data-for-cl
>>
>> and use data-for-cl with the normal readtable, but if you cannot
>> change the dataset, perhaps you are forbidden to do that too...)
>>
>> Have a look at:
>> http://darcs.informatimago.com/lisp/common-lisp/reader.lisp
>> http://www.informatimago.com/develop/lisp/index.html
>>
>> --
>> __Pascal Bourguignon__
> 
> (defparameter *data* "(foo (bar baz) (, ,) (. .))")
> (defparameter *rt* (copy-readtable))
> 
> (set-macro-character #\. (lambda (stream char) '|.|) nil *rt*)
> (set-macro-character #\, (lambda (stream char) '|,|) nil *rt*)
> (set-macro-character #\( (lambda (stream char)
>                            (read-delimited-list #\) stream))
>                      t *rt*)
> 
> (let ((*readtable* *rt*))
>   (with-input-from-string (s *data*)
>     (read s)))
> 
> ->  (FOO (BAR BAZ) (\, \,) (\. \.))
> 
> 
> 
> Works for me...

  It works! I feared that I'd have to read the files a char at a time, 
or do pre-parsing, which would not be graceful. Your solution is 
graceful. Thanks a lot ;-)

-Victor
From: Kaz Kylheku
Subject: Re: Newbie: input file with commas like this: (, ,)
Date: 
Message-ID: <20090131184642.756@gmail.com>
On 2009-01-26, a314658 <······@gmail.com> wrote:
> Er... I can't change the dataset. But it's just a tree of constants, no 
> functions. It goes something like this:
>
> ( (S
>      (VP (VBD brought)
>        (PP
>          (NP (PRP him) ))
>        (NP
>          (NP (DT a) (NN mixture) )
>          (, ,)
>          (PP (IN of))))
>      (. .) ))
>
> And so the (. .) and (, ,) part is giving me trouble.

What D Herring is asking is what these commas and periods represent.

You can't solve this problem if you don't know what these are.

Is it okay to just discard these forms, so they turn into nothing?

If they mean nothing, why are they in the data?


Let's put it this way: can you fill in this blank?

   Syntax:            Reads as this Lisp object:

   a                  A
   (a . nil)          (A)
   #xFF               255
   (, ,)              ______ ???


If /you/ can't fill in this blank, how can you write a computer
program which fills in the blank?
From: a314658
Subject: Re: Newbie: input file with commas like this: (, ,)
Date: 
Message-ID: <497e4d07$0$3339$6e1ede2f@read.cnntp.org>
Kaz Kylheku wrote:
> On 2009-01-26, a314658 <······@gmail.com> wrote:

> What D Herring is asking is what these commas and periods represent.
> 
> You can't solve this problem if you don't know what these are.
> 
> Is it okay to just discard these forms, so they turn into nothing?
> 
> If they mean nothing, why are they in the data?
> 
> 
> Let's put it this way: can you fill in this blank?
> 
>    Syntax:            Reads as this Lisp object:
> 
>    a                  A
>    (a . nil)          (A)
>    #xFF               255
>    (, ,)              ______ ???
> 
> 
> If /you/ can't fill in this blank, how can you write a computer
> program which fills in the blank?

Discarding data is a very subtle business ;-) I'd be content with 
translating (, ,) as (\, \,) and (. .) as (\. \.).
From: Kaz Kylheku
Subject: Re: Newbie: input file with commas like this: (, ,)
Date: 
Message-ID: <20090201181528.616@gmail.com>
On 2009-01-27, a314658 <······@gmail.com> wrote:
> Discarding data is a very subtle business ;-) I'd be content with 
> translating (, ,) as (\, \,) and (. .) as (\. \.).

I.e. commas and periods are token constituent characters. Easily arranged in
the readtable.

Any tricky cases? Could the data contain instances of the dot which are in fact
consing dot notation and must be treated as such?

Is the lack of whitespace, in your examples, between the left or right
parenthesis, and the following or leading dot, significant? Does ( . . ) mean
something different from (. .) ?

Also, what is the lexical analysis to make of this?

 (. .)
  \ /
  / \
 ( v )

Hmm ....
From: Rainer Joswig
Subject: Re: Newbie: input file with commas like this: (, ,)
Date: 
Message-ID: <joswig-033545.00570426012009@news-europe.giganews.com>
In article <························@read.cnntp.org>,
 a314658 <······@gmail.com> wrote:

> One of the sexp's in a file that I'm trying to read has the following:
> 
> (foo (bar baz) (, ,) )
> 
> and when I try to read it in I get error: comma is illegal outside of 
> backquote. what is a way around that?
> 
> thanks,
> -Victor Piousbox, a student

See readtables in Common Lisp.

To make , a whitespace character, use:

CL-USER 60 > (defparameter *data* "(foo (bar baz) (, ,) )")
*DATA*

CL-USER 61 > (defparameter *rt* (copy-readtable))
*RT*

CL-USER 62 > (set-syntax-from-char #\, #\space *rt* *rt*)
T

CL-USER 63 > (let ((*readtable* *rt*)) (read-from-string *data*))
(FOO (BAR BAZ) NIL)
22



You can also have the commas read as something.
Here I read the character #\, as the symbol |,|.
The printer may also write the symbol |,| as \, .



CL-USER 64 > (defparameter *rt* (copy-readtable))
*RT*

CL-USER 65 > (set-macro-character #\, (lambda (stream char) '|,|) nil *rt*)
T

CL-USER 66 > (let ((*readtable* *rt*)) (read-from-string *data*))
(FOO (BAR BAZ) (\, \,))
22


You should not change the existing readtable, but create your
own and change that - as I did in the example.

Then binding *readtable* to your readtable
around calls to the read functions (read, read-from-string, ...)
will make them use the new readtable - thanks to dynamic
binding.
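
A small sketch of that advice, packaged as a function. The names
*COMMA-READTABLE* and READ-WITH-COMMAS are illustrative only. The
binding of *readtable* lasts just for the dynamic extent of the LET, so
ordinary backquote and comma syntax keeps working everywhere else:

```lisp
(defparameter *comma-readtable*
  (let ((rt (copy-readtable nil)))       ; private copy of the standard readtable
    (set-macro-character #\,
                         (lambda (stream char)
                           (declare (ignore stream char))
                           '|,|)
                         nil rt)
    rt))

(defun read-with-commas (string)
  "Read one form from STRING with , reading as the symbol |,|."
  (let ((*readtable* *comma-readtable*))  ; binding ends when the LET exits
    (values (read-from-string string))))

(read-with-commas "(foo (bar baz) (, ,))")

;; The global readtable was never modified, so backquote and comma
;; still have their usual meaning in source code:
(let ((x 1))
  `(a ,x))
```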

-- 
http://lispm.dyndns.org/
From: Robert Maas
Subject: Re: Newbie: input file with commas like this: (, ,)
Date: 
Message-ID: <REM-2009jan28-001@Yahoo.Com>
> From: a314658 <······@gmail.com>
> One of the sexp's in a file that I'm trying to read has the following:
> (foo (bar baz) (, ,) )

I don't know what you mean by "sexp", but that's not a valid
s-expression in any version of lisp I've ever seen/used. You need
to contact whoever made that file to find out what his/her
intention was. Only when you know the intention of those commas
will you be able to convert them to something usable in CL.

If this is a class assignment, ask your instructor.
If this is a work assignment, ask your boss/supervisor.
If your instructor/boss/supervisor refuses to tell you the
intention of those commas, but still expects you to deal with them,
file a grievance.

If this is a random file you found on the net, you're wasting your
time, and our time.
If this is a file you generated yourself, please tell us how you
generated it.
From: Pascal J. Bourguignon
Subject: Re: Newbie: input file with commas like this: (, ,)
Date: 
Message-ID: <7chc3jihu6.fsf@pbourguignon.anevia.com>
········································@MaasInfo.Org (Robert Maas) writes:

>> From: a314658 <······@gmail.com>
>> One of the sexp's in a file that I'm trying to read has the following:
>> (foo (bar baz) (, ,) )
>
> I don't know what you mean by "sexp", but that's not a valid
> s-expression in any version of lisp I've ever seen/used. 

Old Lisps did not yet have backquote, so you could name a symbol with a
single comma.  For the dot, I don't remember whether LISP 1.5 already
had the pair syntax; I'll have to check the sources.


> You need
> to contact whoever made that file to find out what his/her
> intention was. Only when you know the intention of those commas
> will you be able to convert them to something usable in CL.
>
> If this is a class assignment, ask your instructor.
> If this is a work assignment, ask your boss/supervisor.
> If your instructor/boss/supervisor refuses to tell you the
> intention of those commas, but still expects you to deal with them,
> file a grievance.


Customer is king.  You don't file grievances when a customer comes
with his requirements, however strange they may seem to you.


> If this is a random file you found on the net, you're wasting your
> time, and our time.

Not really, CL is perfectly able to read this file, as other answers
demonstrated.


> If this is a file you generated yourself, please tell us how you
> generated it.

-- 
__Pascal Bourguignon__
From: Andrew Philpot
Subject: Re: Newbie: input file with commas like this: (, ,)
Date: 
Message-ID: <slrngo1051.a3s.philpot@ubirr.isi.edu>
> ········································@MaasInfo.Org (Robert Maas) writes:
>
>>> From: a314658 <······@gmail.com>
>>> One of the sexp's in a file that I'm trying to read has the following:
>>> (foo (bar baz) (, ,) )
>>
>> You need
>> to contact whoever made that file to find out what his/her
>> intention was. Only when you know the intention of those commas
>> will you be able to convert them to something usable in CL.

From inspection, it's Penn treebank (parsed natural language).  CARs
are tags, CADRs are subtrees or leaves (lexemes), etc.  The first ","
thus is the class (tag) of all comma-like punctuation sequences, the
second is the actual comma encountered.  These files also often have
bare : and ; and . characters, and may have the lexemes "'s" and "n't";
and don't forget the embedded comma as digit separator, as in "1,000".  I've
also seen extended versions where the lexeme tokens may themselves
include colons, weirdly escaped whitespace, potnums, etc.

I've read simple versions of these successfully with the stock CL
reader by modifying the syntax of a few characters, but now I use
Pascal Bourguignon's portable CL reader with a custom token
recognizer.
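
For the "simple versions" case, a hedged sketch of what modifying the
syntax of a few characters might look like with the stock reader. This
is not the code referred to above; MAKE-PTB-READTABLE and
READ-PTB-FROM-STRING are hypothetical names, and the choice of
characters is an assumption based on the tokens listed in this message.

```lisp
(defun make-ptb-readtable ()
  "Sketch of a readtable for simple Penn Treebank style input."
  (let ((rt (copy-readtable nil)))
    ;; Keep lexemes as written: brought stays lowercase, NN stays uppercase.
    (setf (readtable-case rt) :preserve)
    ;; Give comma, quote and semicolon the syntax of an ordinary letter,
    ;; so that "," and "1,000" and "'s" and ";" each read as one symbol.
    (dolist (ch '(#\, #\' #\;))
      (set-syntax-from-char ch #\a rt))
    ;; A lone dot is still special to the token reader, so catch it with a
    ;; NON-terminating macro character: it fires only at the start of a
    ;; token, which leaves a token like U.S. in one piece.
    (set-macro-character #\.
                         (lambda (stream char)
                           (declare (ignore stream char))
                           '|.|)
                         t rt)
    ;; Custom list reader, so that no dot is taken as consing-dot notation.
    (set-macro-character #\(
                         (lambda (stream char)
                           (declare (ignore char))
                           (read-delimited-list #\) stream t))
                         nil rt)
    rt))

(defun read-ptb-from-string (string)
  "Read one tree from STRING using the sketch readtable."
  (let ((*readtable* (make-ptb-readtable))
        (*read-eval* nil))                ; data only, so disable #.
    (values (read-from-string string))))

(read-ptb-from-string
 "(S (NP (NNP U.S.) (CD 1,000)) (, ,) (POS 's) (. .))")
```

Known gaps in this sketch: a token that begins with a dot (such as ...)
is split into separate |.| symbols, and a bare : is not handled at all,
because the package-marker trait of the colon cannot be changed through
the readtable. Cases like those are where a replacement reader with a
custom token recognizer becomes attractive.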

-- 
Andrew Philpot
USC Information Sciences Institute
·······@isi.edu