From: Robin Boswell
Subject: How to modify the behaviour of (read)?
Date: 
Message-ID: <6dj9kt$qfj$1@roadkill.scms.rgu.ac.uk>
   I'm trying to parse successive lines from a Clips 
trace file, and I find that the function (with-input-from-string) 
enables me to do this quite efficiently, when each line is 
presented as a string.

(defun string_to_symbols (string)
  (let ((word_list nil))
    (with-input-from-string (stream string)
      (do ((word (read stream nil) (read stream nil)))
          ((null word) (reverse word_list))
        (setq word_list (cons word word_list))))))

For example, 

  (string_to_symbols "==> f-7     (persues sharon john)")

returns

   (==> F-7 (PERSUES SHARON JOHN))

However, this goes wrong when the name of a rule is followed
by a ":", as in

      FIRE    1 john_rule: f-3

because Lisp then complains that john_rule isn't a package.
At the moment, I'm having to avoid using (read), and instead
use low-level string manipulation functions to parse each line, 
which strikes me as a bit messy and inefficient.  Is there some way
of modifying the behaviour of (read), possibly by temporarily
re-defining macro-characters?  (I HAVE read the sections in 
Steele that deal with macro-characters, but am not much the 
wiser).

  I'm using Allegro CL 4.3, in case that's relevant.  I'd
prefer to write portable Common Lisp code if possible.

  Thanks for any suggestions,

             Robin Boswell.

From: Christopher J. Vogt
Subject: Re: How to modify the behaviour of (read)?
Date: 
Message-ID: <34FDE1EC.A5CAB7D9@computer.org>
Robin Boswell wrote:
> 
>    I'm trying to parse successive lines from a Clips
> trace file, and I find that the function (with-input-from-string)
> enables me to do this quite efficiently, when each line is
> presented as a string.
> 
> (defun string_to_symbols (string)
> 
>   (let ((word_list nil))
>     (with-input-from-string (stream string)
>       (do ((word (read stream nil) (read stream nil)))
>           ((null word) (reverse word_list))
>         (setq word_list (cons word word_list))))))
> 
> For example,
> 
>   (string_to_symbols "==> f-7     (persues sharon john)")
> 
> returns
> 
>    (==> F-7 (PERSUES SHARON JOHN))
> 
> However, this goes wrong when the name of a rule is followed
> by a ":", as in
> 
>       FIRE    1 john_rule: f-3
> 
> because Lisp then complains that john_rule isn't a package.
> At the moment, I'm having to avoid using (read), and instead
> use low-level string manipulation functions to parse each line,
> which strikes me as a bit messy and inefficient.  Is there some way
> of modifying the behaviour of (read), possibly by temporarily
> re-defining macro-characters?  (I HAVE read the sections in
> Steele that deal with macro-characters, but am not much the
> wiser).
> 
>   I'm using Allegro CL 4.3, in case that's relevant.  I'd
> prefer to write portable Common Lisp code if possible.
> 
>   Thanks for any suggestions,
> 
>              Robin Boswell.

You might try something like this:

(defmacro with-modified-readtable (&rest rest)
  `(let ((*readtable* (copy-readtable nil)))
     (set-syntax-from-char #\: #\space)
     ,@rest))

(defun string_to_symbols (string)
  (let ((word_list nil))
    (with-modified-readtable
      (with-input-from-string (stream string)
        (do ((word (read stream nil) (read stream nil)))
            ((null word) (reverse word_list))
          (setq word_list (cons word word_list)))))))

Be warned that ":" is not the only character that can give you trouble
when reading text in from a file like this. Another example is ".": a
token consisting only of dots makes the reader signal an error.
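
A slightly more defensive variant of the same idea, offered only as a
sketch: READ-TRACE-LINE is a made-up name, and the choice to treat ":"
as whitespace is the same assumption as in the macro above. It uses a
unique end-of-file marker, so a literal NIL in the trace does not end
the loop early, and it binds *READ-EVAL* to NIL so that "#." in the
input cannot evaluate code.

```lisp
(defun read-trace-line (line)
  "Read every object in LINE, treating #\: as whitespace (an assumption)."
  (let ((*readtable* (copy-readtable nil)) ; fresh copy of the standard readtable
        (*read-eval* nil)                  ; refuse #. in untrusted input
        (eof (gensym "EOF")))              ; unique marker, distinct from NIL
    ;; Modifies only the copy bound above, not the global readtable.
    (set-syntax-from-char #\: #\Space)
    (with-input-from-string (stream line)
      (loop for object = (read stream nil eof)
            until (eq object eof)
            collect object))))
```

With the standard readtable case, (read-trace-line "FIRE    1 john_rule: f-3")
should return (FIRE 1 JOHN_RULE F-3). Other troublesome characters
would still need their own treatment.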
From: Martti Halminen
Subject: Re: How to modify the behaviour of (read)?
Date: 
Message-ID: <34FE6900.400A@dpe.fi>
Robin Boswell wrote:
> 
>    I'm trying to parse successive lines from a Clips
> trace file, and I find that the function (with-input-from-string)
> enables me to do this quite efficiently, when each line is
> presented as a string.
<snip>
> However, this goes wrong when the name of a rule is followed
> by a ":", as in
> 
>       FIRE    1 john_rule: f-3
> 
> because Lisp then complains that john_rule isn't a package.
> At the moment, I'm having to avoid using (read), and instead
> use low-level string manipulation functions to parse each line,
> which strikes me as a bit messy and inefficient.  Is there some way
> of modifying the behaviour of (read), possibly by temporarily
> re-defining macro-characters?  (I HAVE read the sections in
> Steele that deal with macro-characters, but am not much the
> wiser).
> 
>   I'm using Allegro CL 4.3, in case that's relevant.  I'd
> prefer to write portable Common Lisp code if possible.


Of course, if you just want the job done regardless of style, there is
the quick-and-dirty trick (probably considered heretical in this
newsgroup) of using an external tool to massage the file into an
easily readable form.

The easiest route would be the usual Unix tools: tr and sed are
usually adequate for this kind of job.

If you want to stay strictly in CL, other readers of this group will
probably help.
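
For what it's worth, the same massaging trick can be sketched inside CL
by cleaning each line before handing it to READ. MASSAGE-LINE and its
BAD-CHARS parameter are hypothetical names, and the default of
replacing only ":" is an assumption about what the trace contains.

```lisp
(defun massage-line (line &optional (bad-chars ":"))
  "Return a copy of LINE with every character in BAD-CHARS replaced by a space."
  (substitute-if #\Space
                 (lambda (ch) (find ch bad-chars))
                 line))
```

The cleaned string could then be passed to the original
string_to_symbols, e.g. (string_to_symbols (massage-line line)).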

-- 
________________________________________________________________
    ^.          Martti Halminen
   / \`.        Design Power Europe Oy
  /   \ `.      Tekniikantie 12, FIN-02150 Espoo, Finland
 /\`.  \ |      Tel:+358 9 4354 2306, Fax:+358 9 455 8575
/__\|___\|      ······················@dpe.fi   http://www.dpe.fi
From: Larry Hunter
Subject: Re: How to modify the behaviour of (read)?
Date: 
Message-ID: <rbyaynhfmf.fsf@work.nlm.nih.gov>
Robin Boswell writes:

   I'm trying to parse successive lines from a Clips... [using READ]
   However, this goes wrong when the name of a rule is followed by a ":" ...
   because Lisp then complains that john_rule isn't a package.  At the
   moment, I'm having to avoid using (read), and instead use low-level
   string manipulation functions to parse each line, which strikes me as a
   bit messy and inefficient.

Actually, it's much MORE efficient to use string functions than to use READ,
which has to do everything necessary to parse any Common Lisp expression
(e.g. parse numbers).

Here's one way to do it:

  (defun lex-string (string &optional
                            (whitespace-chars '(#\space #\newline)))
    "Separates a string at whitespace and returns a list of strings"
    (flet ((whitespace-char? (char)
             (member char whitespace-chars :test #'char=)))
      (let ((tokens nil))
        ;; DO* binds and steps sequentially, so TOKEN-END always sees the
        ;; TOKEN-START computed just before it.
        (do* ((token-start (position-if-not #'whitespace-char? string)
                           (when token-end
                             (position-if-not #'whitespace-char? string
                                              :start (1+ token-end))))
              ;; Guard the initial form too: TOKEN-START is NIL for an empty
              ;; or all-whitespace string, and NIL is not a valid :START.
              (token-end (when token-start
                           (position-if #'whitespace-char? string
                                        :start token-start))
                         (when token-start
                           (position-if #'whitespace-char? string
                                        :start token-start))))
             ((null token-start) (nreverse tokens))
          (push (subseq string token-start token-end) tokens)))))

Or, if you want symbols rather than strings (note that this will NOT parse
numbers):

  (mapcar #'intern (lex-string *my-string*))

Or if you want uppercase symbols:

  (mapcar #'intern (mapcar #'string-upcase (lex-string *my-string*)))
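
If the numbers do matter, a sketch along the following lines could
convert tokens one at a time. TOKEN-TO-OBJECT is a made-up name, not
part of the code above; it assumes that only integers need to be
recognized (via PARSE-INTEGER) and interns everything else as an
uppercase symbol in the current package.

```lisp
(defun token-to-object (token)
  "Return an integer if all of TOKEN parses as one, else an uppercase symbol."
  (multiple-value-bind (value end)
      ;; :JUNK-ALLOWED T makes PARSE-INTEGER return NIL instead of
      ;; signalling an error when TOKEN is not an integer.
      (parse-integer token :junk-allowed t)
    (if (and value (= end (length token)))
        value
        (intern (string-upcase token)))))
```

For example, (mapcar #'token-to-object '("FIRE" "1" "john_rule:" "f-3"))
should give (FIRE 1 |JOHN_RULE:| F-3); note that the colon stays part of
the symbol name here.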

Although it is tempting to use read for your own parsing needs, it is often
much more efficient to do it yourself.  If you really want all of the power
(and cost) of READ, then you can set the readtable entries for special
characters like #\: #\, etc. so that you get the results you desire.

Good luck,

Larry