From: Thaddeus L Olczyk
Subject: regular expression library for lispworks.
Date: 
Message-ID: <3c800145.2127625@nntp.interaccess.com>
I'm looking for a regular expression library for LispWorks that works
somewhat like CLISP's regexp does.
I.e., there is a function called match which returns multiple values
that contain the beginning and end positions (including
backreferences) of the matched part of the string.

From: Dr. Edmund Weitz
Subject: Re: regular expression library for lispworks.
Date: 
Message-ID: <m31yf66gb9.fsf@bird.agharta.de>
······@interaccess.com (Thaddeus L Olczyk) writes:

> I'm looking for a regular expression library for LispWorks that
> works somewhat like CLISP's regexp does.  I.e., there is a function
> called match which returns multiple values that contain the
> beginning and end positions (including backreferences) of the matched
> part of the string.

<http://ww.telent.net/cliki/Text>

-- 

Dr. Edmund Weitz
Hamburg
Germany

The Common Lisp Cookbook
<http://cl-cookbook.sourceforge.net/>
From: Thaddeus L Olczyk
Subject: Re: regular expression library for lispworks.
Date: 
Message-ID: <3c7d0a10.4378843@nntp.interaccess.com>
On 27 Feb 2002 17:27:22 +0100, ···@agharta.de (Dr. Edmund Weitz)
wrote:

>······@interaccess.com (Thaddeus L Olczyk) writes:
>
>> I'm looking for a regular expression library for LispWorks that
>> works somewhat like CLISP's regexp does.  I.e., there is a function
>> called match which returns multiple values that contain the
>> beginning and end positions (including backreferences) of the matched
>> part of the string.
>
><http://ww.telent.net/cliki/Text>
Yes, but do any of them work like CLISP's regexp package?
From: Dr. Edmund Weitz
Subject: Re: regular expression library for lispworks.
Date: 
Message-ID: <m3sn7m50gx.fsf@bird.agharta.de>
······@interaccess.com (Thaddeus L Olczyk) writes:

> On 27 Feb 2002 17:27:22 +0100, ···@agharta.de (Dr. Edmund Weitz)
> wrote:
> 
> >······@interaccess.com (Thaddeus L Olczyk) writes:
> >
> >> I'm looking for a regular expression library for LispWorks that
> >> works somewhat like CLISP's regexp does.  I.e., there is a function
> >> called match which returns multiple values that contain the
> >> beginning and end positions (including backreferences) of the matched
> >> part of the string.
> >
> ><http://ww.telent.net/cliki/Text>
> Yes, but do any of them work like CLISP's regexp package?

Why don't you try it out yourself or at least read the documentation?
If you want other people to evaluate software for you, you should pay
them.

IIRC Dorai Sitaram's package does more or less exactly what you
want. (Which doesn't mean that the others don't.) It's no more than
two mouse-clicks away from the link above:

<http://www.ccs.neu.edu/home/dorai/pregexp/pregexp-Z-H-2.html#%_sec_2.2>

Edi.

From: Michael Parker
Subject: Re: regular expression library for lispworks.
Date: 
Message-ID: <1B01E591D9A091E3.58B4B6CEBA447014.326F15A69FA825D7@lp.airnews.net>
Thaddeus L Olczyk wrote:
> 
> On 27 Feb 2002 17:27:22 +0100, ···@agharta.de (Dr. Edmund Weitz)
> wrote:
> 
> >······@interaccess.com (Thaddeus L Olczyk) writes:
> >
> >> I'm looking for a regular expression library for LispWorks that
> >> works somewhat like CLISP's regexp does.  I.e., there is a function
> >> called match which returns multiple values that contain the
> >> beginning and end positions (including backreferences) of the matched
> >> part of the string.
> >
> ><http://ww.telent.net/cliki/Text>
> Yes, but do any of them work like CLISP's regexp package?

Beats me, I've never used CLISP's regexp.  But the extant regex
libraries are all pretty similar, and none of them are that difficult
to use.  Shouldn't take more than a few minutes to whip up a wrapper
that matches CLISP's regexp pretty closely.

The regex matcher in CL-AWK, for example, will probably do what you
want with the following simple definition (disclaimer: CL-AWK is one
of my babies).

(defun match (str pat &optional (start 0))
  (when (clawk:$match str pat start)
     (values t clawk:*RSTART* clawk:*REND* clawk:*REGS*)))

OTOH, there's a lot more than just regexes in CLAWK that really come
in handy if you're doing lots of text munging.
From: Thaddeus L Olczyk
Subject: Re: regular expression library for lispworks.
Date: 
Message-ID: <3c7fc13c.51270390@nntp.interaccess.com>
On Wed, 27 Feb 2002 18:27:43 -0600, Michael Parker <·······@pdq.net>
wrote:

>Thaddeus L Olczyk wrote:
>> 
>> On 27 Feb 2002 17:27:22 +0100, ···@agharta.de (Dr. Edmund Weitz)
>> wrote:
>> 
>> >······@interaccess.com (Thaddeus L Olczyk) writes:
>> >
>> >> I'm looking for a regular expression library for LispWorks that
>> >> works somewhat like CLISP's regexp does.  I.e., there is a function
>> >> called match which returns multiple values that contain the
>> >> beginning and end positions (including backreferences) of the matched
>> >> part of the string.
>> >
>> ><http://ww.telent.net/cliki/Text>
>> Yes, but do any of them work like CLISP's regexp package?
>
>Beats me, I've never used CLISP's regexp.  But the extant regex
>libraries are all pretty similar, and none of them are that difficult
>to use.  Shouldn't take more than a few minutes to whip up a wrapper
>that matches CLISP's regexp pretty closely.
>
>The regex matcher in CL-AWK, for example, will probably do what you
>want with the following simple definition (disclaimer: CL-AWK is one
>of my babies).
>
>(defun match (str pat &optional (start 0))
>  (when (clawk:$match str pat start)
>     (values t clawk:*RSTART* clawk:*REND* clawk:*REGS*)))
>
>OTOH, there's a lot more than just regexes in CLAWK that really come
>in handy if you're doing lots of text munging.
I've been meaning to look at it. The problem is that at CLiki the
citation does not have a link to the code.

The thing about CLISP's regexp package is that it returns locations,
rather than matches, so I could write a function

(defun my-match (regex str)
  (let ((matches (multiple-value-list (regexp:match regex str))))
    (loop for x in matches
          collect (list (regexp:match-start x)
                        (regexp:match-end x)))))

then something like

(my-match "gh\(ij\(k\)\)lmno" "abcdefghijklmnopqrstuv")
produces
( (6 14) (8 10) (10 10))

Many regexp packages (like pregexp) would produce
("ghijklmno" "ijk" "k")

( obviously not using my function ).

Unfortunately I have too much code that relies on this to change
it now. If something comes close I can write wrappers, but that's
about it.
From: Michael Parker
Subject: Re: regular expression library for lispworks.
Date: 
Message-ID: <FF334665FA1E9D30.CE85AB44369D59B1.7B153D070CC32B4E@lp.airnews.net>
> (defun my-match (regex str)
>   (let ((matches (multiple-value-list (regexp:match regex str))))
>     (loop for x in matches
>           collect (list (regexp:match-start x)
>                         (regexp:match-end x)))))
> 
> then something like
> 
> (my-match "gh\(ij\(k\)\)lmno" "abcdefghijklmnopqrstuv")
> produces
> ( (6 14) (8 10) (10 10))

This is pretty much what the CL-AWK regex does.  It was written as a
replacement for GNU regex to be used in a commercial product, so it has
some of the same behavior as CLISP's (which sounds like it's an FFI
interface to GNU regex?). It returns (values t start length regs) on a
successful match, and nil on failure.  The regs is an array of conses,
where the CAR is the start and the CDR is the end of the submatch.
Register 0 is the entire match, like GNU regex.  One minor note:
REGEX:MATCH is an anchored match, REGEX:SCAN is really more like the
GNU REGEX match.  In the CLAWK package, match is an unanchored match
like in AWK.
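
A minimal sketch of turning a register array of that shape into the
((start end) ...) list shown above.  It uses only standard Common Lisp;
the name regs->positions and the sample vector are made up for
illustration and are not part of CL-AWK.

```lisp
;; Hypothetical helper, not part of CL-AWK: convert a vector of
;; (start . end) conses into a list of (start end) lists.
(defun regs->positions (regs)
  (loop for reg across regs
        when reg                        ; skip empty (NIL) slots
          collect (list (car reg) (cdr reg))))

;; Hand-written sample data standing in for a real register array.
(regs->positions (vector '(6 . 15) '(8 . 11) '(10 . 11)))
```

Element 0 of the result corresponds to register 0, the entire match,
as long as no earlier slot was empty.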

The CL-AWK package itself is pretty comprehensive, but if all you need
is a basic replacement for GNU regex, the REGEX package in CL-AWK is
probably all you need.  The CL-AWK package has a bunch of wrappers
around this regex function, so you can do stuff like split strings
based on a regular expression, substitute one or more occurrences of a
regex with something else, do a sort of CASE using regex patterns,
as well as the higher-level file operations that were directly modeled
on AWK.

> Many regexp packages (like pregexp) would produce
> ("ghijklmno" "ijk" "k")
> 
> ( obviously not using my function ).

You can do this with CL-AWK as well.  After a successful match, the
variables (or at least the defsyms) %0 - %20 will return the strings
corresponding to the submatches.  I haven't put in a setf form for
these yet -- that's still on my list.

> Unfortunately I have too much code that relies on this to change
> it now. If something comes close I can write wrappers, but that's
> about it.

I just noticed -- your example escaped the special chars to get their
magic meaning, while the CL-AWK regex uses the unescaped form for magic
and the escaped form for normal.  Don't know if this is a problem, but it
doesn't currently have the compiler syntax flags that GNU regex does.
I'll fix it here in a bit so the lexer can flip between the escaped and
unescaped syntaxes.
From: Michael Parker
Subject: Re: regular expression library for lispworks.
Date: 
Message-ID: <9f023346.0202280955.33890649@posting.google.com>
Michael Parker <·······@pdq.net> wrote in message news:<··················································@lp.airnews.net>...
> I just noticed -- your example escaped the special chars to get their
> magic meaning, while the CL-AWK regex uses the unescaped form for magic
> and the escaped form for normal.  Don't know if this is a problem, but it
> doesn't currently have the compiler syntax flags that GNU regex does.
> I'll fix it here in a bit so the lexer can flip between the escaped and
> unescaped syntaxes.

Done.  There's now a variable REGEX:*escape-special-chars* that is nil
by default, but if T will cause the lexer to use escaped special chars.
From: Dr. Edmund Weitz
Subject: Re: regular expression library for lispworks.
Date: 
Message-ID: <m3sn7mf37h.fsf@bird.agharta.de>
······@interaccess.com (Thaddeus L Olczyk) writes:

> On Wed, 27 Feb 2002 18:27:43 -0600, Michael Parker <·······@pdq.net>
> wrote:
> >The regex matcher in CL-AWK, for example, will probably do what you
> >want with the following simple definition (disclaimer: CL-AWK is one
> >of my babies).
> >
> >(defun match (str pat &optional (start 0))
> >  (when (clawk:$match str pat start)
> >     (values t clawk:*RSTART* clawk:*REND* clawk:*REGS*)))
> >
> >OTOH, there's a lot more than just regexes in CLAWK that really come
> >in handy if you're doing lots of text munging.
> I've been meaning to look at it. The problem is that at CLiki the
> citation does not have a link to the code.

Huh? The CLiki citation links to the CLAWK site and from there you'll
find a link to the code (at the bottom). Maybe you should grow the
habit of reading more than the first ten words of a web page... :)

Edi.

From: Thaddeus L Olczyk
Subject: Re: regular expression library for lispworks.
Date: 
Message-ID: <3c823a22.82220515@nntp.interaccess.com>
On 28 Feb 2002 08:57:53 +0100, ···@agharta.de (Dr. Edmund Weitz)
wrote:

>······@interaccess.com (Thaddeus L Olczyk) writes:
>
>> On Wed, 27 Feb 2002 18:27:43 -0600, Michael Parker <·······@pdq.net>
>> wrote:
>> >The regex matcher in CL-AWK, for example, will probably do what you
>> >want with the following simple definition (disclaimer: CL-AWK is one
>> >of my babies).
>> >
>> >(defun match (str pat &optional (start 0))
>> >  (when (clawk:$match str pat start)
>> >     (values t clawk:*RSTART* clawk:*REND* clawk:*REGS*)))
>> >
>> >OTOH, there's a lot more than just regexes in CLAWK that really come
>> >in handy if you're doing lots of text munging.
>> I've been meaning to look at it. The problem is that at CLiki the
>> citation does not have a link to the code.
>
>Huh? The CLiki citation links to the CLAWK site and from there you'll
>find a link to the code (at the bottom). Maybe you should grow the
>habit of reading more than the first ten words of a web page... :)
>
>Edi.
Correction: the problem is that the link to the source is buried deep
within the description of CLAWK (which CLiki forwards to; I didn't
realise I had left CLiki).
I must have visited that page five or six times before I found it.
You (Michael) should do something like move the link to the top
and put it in 100pt font.
From: Michael Parker
Subject: Re: regular expression library for lispworks.
Date: 
Message-ID: <B5D019D013FA4510.A50E73391E65FDF1.70A9E6FB6167421E@lp.airnews.net>
Thaddeus L Olczyk wrote:
> 
> On 28 Feb 2002 08:57:53 +0100, ···@agharta.de (Dr. Edmund Weitz)
> wrote:
> 
> >······@interaccess.com (Thaddeus L Olczyk) writes:
> >
> >> On Wed, 27 Feb 2002 18:27:43 -0600, Michael Parker <·······@pdq.net>
> >> wrote:
> >> >The regex matcher in CL-AWK, for example, will probably do what you
> >> >want with the following simple definition (disclaimer: CL-AWK is one
> >> >of my babies).
> >> >
> >> >(defun match (str pat &optional (start 0))
> >> >  (when (clawk:$match str pat start)
> >> >     (values t clawk:*RSTART* clawk:*REND* clawk:*REGS*)))
> >> >
> >> >OTOH, there's a lot more than just regexes in CLAWK that really come
> >> >in handy if you're doing lots of text munging.
> >> I've been meaning to look at it. The problem is that at CLiki the
> >> citation does not have a link to the code.
> >
> >Huh? The CLiki citation links to the CLAWK site and from there you'll
> >find a link to the code (at the bottom). Maybe you should grow the
> >habit of reading more than the first ten words of a web page... :)
> >
> >Edi.
> Correction: the problem is that the link to the source is buried deep
> within the description of CLAWK (which CLiki forwards to; I didn't
> realise I had left CLiki).
> I must have visited that page five or six times before I found it.
> You (Michael) should do something like move the link to the top
> and put it in 100pt font.

But that would require web-design skills, which I obviously don't
possess :-)
From: Thomas A. Russ
Subject: Re: regular expression library for lispworks.
Date: 
Message-ID: <ymi3czlfhxz.fsf@sevak.isi.edu>
······@interaccess.com (Thaddeus L Olczyk) writes:

> The thing about CLISP's regexp package is that it returns locations,
> rather than matches, so I could write a function
> 
> (defun my-match (regex str)
>   (let ((matches (multiple-value-list (regexp:match regex str))))
>     (loop for x in matches
>           collect (list (regexp:match-start x)
>                         (regexp:match-end x)))))
> 
> then something like
> 
> (my-match "gh\(ij\(k\)\)lmno" "abcdefghijklmnopqrstuv")
> produces
> ( (6 14) (8 10) (10 10))
> 
> Many regexp packages (like pregexp) would produce
> ("ghijklmno" "ijk" "k")

Well, if you use the regex package (called nregex?) written by Lawrence
Freil, then you can get the result you don't like simply by

  (regex:regex "gh\(ij\(k\)\)lmno" "abcdefghijklmnopqrstuv")
    =>
  ("ghijklmno" "ijk" "k")

and the result you want with the following function:

(defun my-match (expression string)
  (let ((regex::*regex-groups* (make-array 10))
        (regex::*regex-groupings* 0))
    (if (not (funcall (compile nil (regex:regex-compile expression)) string))
        nil
        (loop for i from 0 below regex::*regex-groupings*
              collect (aref regex::*regex-groups* i)))))

(my-match "gh\(ij\(k\)\)lmno" "abcdefghijklmnopqrstuv")
  => 
((6 15) (8 11) (10 11))

I note that the offsets are a bit different.  The end positions here are
exclusive, so each pair is suitable for passing to the subseq function
to get the strings in question.  (In fact, that is what the built-in
regex:regex function does.)
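
As a sketch of that last point: the standard subseq function takes an
exclusive end index, so (6 15) selects the nine characters of the whole
match.  The helper name positions->strings is hypothetical and does not
come from any of the regex packages discussed here.

```lisp
;; Hypothetical helper: map a list of (start end) pairs, with END
;; exclusive, to the corresponding substrings of STRING.
(defun positions->strings (string positions)
  (loop for (start end) in positions
        collect (subseq string start end)))

(positions->strings "abcdefghijklmnopqrstuv" '((6 15) (8 11) (10 11)))
;; => ("ghijklmno" "ijk" "k")
```

With inclusive end positions such as (6 14), each end would need 1
added before the call to subseq.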

I've extended this particular code a little.  I submitted it a while
back to the CMU Archives, but it seems to have been lost in the aether.
I suppose I should provide the updated version to CLiki or some such
place.

-Tom.

-- 
Thomas A. Russ,  USC/Information Sciences Institute          ···@isi.edu