From: Toomas Altosaar
Subject: Regular expressions for objects
Date: 
Message-ID: <3ef90b62.0411251137.5d5f8cb3@posting.google.com>
Regular expressions are most frequently applied to detect patterns in
strings. However, I am in need of operating on object sequences.

Has anyone implemented a Lisp regexp parser that accepts a sequence of
objects instead of characters? If not, what difficulties, if any,
might lie ahead?

From: Pascal Bourguignon
Subject: Re: Regular expressions for objects
Date: 
Message-ID: <87653tk5yw.fsf@thalassa.informatimago.com>
···············@hut.fi (Toomas Altosaar) writes:

> Regular expressions are most frequently applied to detect patterns in
> strings. However, I am in need of operating on object sequences.
> 
> Has anyone implemented a Lisp regexp parser that accepts a sequence of
> objects instead of characters? If not, what difficulties, if any,
> might lie ahead?

None really.

If the number of distinct objects is smaller than the number of
characters your implementation supports, you can establish an encoding
and still use strings.
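
A minimal sketch of such an encoding, assuming the set of distinct
objects is small: each new object is assigned the next character,
starting at #\a. MAKE-ENCODER and ENCODE-SEQUENCE are illustrative
names, not part of any regex package. The encoded string can then be
handed to whatever string matcher you already use; SEARCH merely
stands in for it here.

```lisp
(defun make-encoder (&key (test 'equal) (first-code (char-code #\a)))
  "Return two closures as multiple values: ENCODE maps an object to a
character (allocating a new one on first sight), DECODE maps back."
  (let ((object->char (make-hash-table :test test))
        (char->object (make-hash-table :test 'eql))
        (next-code first-code))
    (values
     (lambda (object)
       (or (gethash object object->char)
           ;; Assumption: the alphabet is small enough that CODE-CHAR
           ;; keeps returning ordinary, non-special characters.
           (let ((ch (code-char next-code)))
             (incf next-code)
             (setf (gethash ch char->object) object
                   (gethash object object->char) ch))))
     (lambda (ch)
       (gethash ch char->object)))))

(defun encode-sequence (encoder objects)
  "Encode a list or vector of OBJECTS as a string."
  (map 'string encoder objects))

;; Usage: locate the object subsequence (:verb :noun) in a longer one.
(multiple-value-bind (encode decode) (make-encoder)
  (let ((text   (encode-sequence encode '(:noun :verb :noun :adj)))
        (needle (encode-sequence encode '(:verb :noun))))
    (list (search needle text)                  ; start index of the match
          (funcall decode (char needle 0)))))   ; character back to object
```

With a real regex package the pattern itself would have to be written
in terms of the same encoded characters, so keep the encoder around
for building patterns as well as inputs.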

Otherwise, you'll have to fetch the sources of a regex package, and
modify it to handle more "characters".

(Note that in any case, one important optimization step is to replace
characters with character classes, so the DFA generated from the
regexp doesn't work on individual characters either.)
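
To illustrate the point about classes, here is a rough sketch of a
hand-written DFA whose transitions are keyed on the class of an object
rather than on the object itself. The classes, the state names and the
transition table are all made up for the example; a regex compiler
would generate the table from the pattern.

```lisp
(defun classify (object)
  "Map OBJECT to a class keyword (hypothetical classes)."
  (typecase object
    (integer :int)
    (string  :str)
    (symbol  :sym)
    (t       :other)))

;; Transition table: ((state . class) . next-state).
;; It accepts one integer followed by any number of strings.
(defparameter *transitions*
  '(((:start    . :int) . :seen-int)
    ((:seen-int . :str) . :seen-int)))

(defun dfa-accepts-p (objects)
  "Run the DFA over OBJECTS; true if it ends in the accepting state."
  (let ((state :start))
    (map nil
         (lambda (object)
           ;; A missing transition sends the automaton to NIL (dead state).
           (setf state
                 (and state
                      (cdr (assoc (cons state (classify object))
                                  *transitions*
                                  :test #'equal)))))
         objects)
    (eq state :seen-int)))
```

The automaton never inspects the objects beyond asking CLASSIFY for
their class, which is why the size of the underlying "alphabet" stops
mattering.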

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
The world will now reboot; don't bother saving your artefacts.

From: Thomas Schilling
Subject: Re: Regular expressions for objects
Date: 
Message-ID: <opsh3bssmq1gy3cn@news.cis.dfn.de>
On 25 Nov 2004 11:37:48 -0800, Toomas Altosaar
<···············@hut.fi> wrote:

FWIW, I recently hacked up a little regexp matcher for lists. I put it
up at

  <http://www.inf.tu-dresden.de/~s3815210/lisp/matcher.lisp>

Feel free to modify it to fit your needs. It currently uses EQUALP to
compare atoms, so it should be easy to adapt it to any kind of object.
Note, however, that it currently supports only lists as input. That is
very inefficient for greedy matching, because you have to cons up a lot
of sublists, starting with the longest one (all the rest of your
input). I also haven't added any optimizations apart from using linked
closures for matching. (Edi's CL-PPCRE takes the same approach and is
faster than Perl's regexps, so you might look at it for optimization
ideas.)
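
For readers who just want the flavour of the closure technique, here
is a small independent sketch; it is not the code in matcher.lisp. The
pattern syntax (:SEQ, :ALT, :STAR) and the names COMPILE-PATTERN and
MATCHES-P are assumptions made for the example. Each pattern is
compiled into a closure taking the remaining input and a continuation,
literals are compared with EQUALP, and a function used as a pattern
behaves like a character class.

```lisp
(defun compile-pattern (pattern)
  "Compile PATTERN into a closure of two arguments: the remaining input
list and a continuation K that is called with the unmatched tail."
  (cond
    ;; A function acts like a character class: any object satisfying it.
    ((functionp pattern)
     (lambda (input k)
       (and (consp input)
            (funcall pattern (first input))
            (funcall k (rest input)))))
    ;; Any other atom is a literal, compared with EQUALP.
    ((atom pattern)
     (lambda (input k)
       (and (consp input)
            (equalp pattern (first input))
            (funcall k (rest input)))))
    ;; A list is an operator applied to sub-patterns.
    (t
     (let ((subs (mapcar #'compile-pattern (rest pattern))))
       (ecase (first pattern)
         (:seq                      ; all sub-patterns, in order
          (lambda (input k)
            (labels ((run (ms tail)
                       (if (null ms)
                           (funcall k tail)
                           (funcall (first ms) tail
                                    (lambda (next) (run (rest ms) next))))))
              (run subs input))))
         (:alt                      ; first alternative that leads to success
          (lambda (input k)
            (some (lambda (m) (funcall m input k)) subs)))
         (:star                     ; greedy repetition of one sub-pattern
          (let ((m (first subs)))
            (lambda (input k)
              (labels ((try (tail)
                         (or (funcall m tail
                                      (lambda (next)
                                        ;; Refuse empty iterations so the
                                        ;; loop always terminates.
                                        (and (not (eq next tail))
                                             (try next))))
                             (funcall k tail))))
                (try input))))))))))

(defun matches-p (pattern list)
  "True if PATTERN matches the whole of LIST."
  (funcall (compile-pattern pattern) list #'null))

;; Usage: literals, and predicates standing in for character classes.
(matches-p '(:seq a (:star (:alt b c)) d) '(a b c b d))
(matches-p (list :seq #'integerp (list :star #'stringp)) '(1 "a" "b"))
```

Because every closure hands the remaining tail of the same list to its
continuation, this version backtracks without copying sublists. The
price is that a list can never appear as a literal, since lists are
reserved for operators.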

It should be quite well-documented. It also spits out quite a bunch of
debugging output (since closures are hard to trace). So if you have any
questions, drop me a note.

-ts

-- 
      ,,
     \../   /  <<< The LISP Effect
    |_\\ _==__
__ | |bb|   | _________________________________________________