Hi,
I'm trying to use a regular-expression package with Lisp.
I'm working with CLISP on a PC. I found pregexp by Dorai
Sitaram (http://www.ccs.neu.edu/home/dorai/pregexp/pregexp.html).
It has the regexp features I'm looking for, especially lookaheads.
But I have two problems. With very long strings (> 1000 characters)
I often get a stack overflow (with the compiled version of the file;
LispWorks has the same stack overflow). And the performance is bad.
I have a Python script which I want to recode and extend in Lisp.
The Python script takes half a second to tokenize a file, the
Lisp implementation 30 seconds. Maybe my code is wrong, but the
main work is done by the pregexp module; my code is only a small
wrapper.
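
For illustration only, a regex-driven tokenizer of the kind described
above might look like the following minimal Python sketch. The token
classes, the pattern, and the function name tokenize are assumptions
made for this example; the original script is not shown in the post.
It uses a negative lookahead and walks the input by passing an offset
to match() rather than slicing off the consumed prefix.

```python
import re

# Hypothetical token grammar; the original script's patterns are not shown.
TOKEN_RE = re.compile(r"""
    (?P<NUMBER>\d+(?!\w))   # digits not followed by a word character (negative lookahead)
  | (?P<WORD>\w+)           # any other run of word characters
  | (?P<PUNCT>[^\w\s])      # a single punctuation character
""", re.VERBOSE)


def tokenize(text):
    """Return a list of (kind, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        # Match at offset pos; the string itself is never copied or sliced.
        m = TOKEN_RE.match(text, pos)
        if m is None:
            pos += 1  # whitespace (or anything unmatched): skip one character
            continue
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Here tokenize("abc 123 12ab, x") classifies 123 as NUMBER but 12ab as
WORD, because the lookahead rejects digits that run into a letter.
Matching at an offset avoids copying the remaining text for every
token. pregexp's documentation describes optional start and end
arguments to pregexp-match-positions that could serve the same purpose
in a Lisp wrapper, though whether substring copying is what slows the
Lisp version here cannot be told from the post.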
Now my questions:
1. Does anybody else have these problems with pregexp?
2. Is there a better regexp package for Lisp with all these
features (on Windows, with open-source CLs)?
best regards
jens himmelreich
"Jens Himmelreich" <····@uni-bremen.de> writes:
> I'm trying to use a regular-expression package with Lisp.
> I'm working with CLISP on a PC. I found pregexp by Dorai
> Sitaram (http://www.ccs.neu.edu/home/dorai/pregexp/pregexp.html).
> It has the regexp features I'm looking for, especially lookaheads.
> But I have two problems. With very long strings (> 1000 characters)
> I often get a stack overflow (with the compiled version of the file;
> LispWorks has the same stack overflow). And the performance is bad.
> I have a Python script which I want to recode and extend in Lisp.
> The Python script takes half a second to tokenize a file, the
> Lisp implementation 30 seconds. Maybe my code is wrong, but the
> main work is done by the pregexp module; my code is only a small
> wrapper.
>
> Now my questions:
> 1. Does anybody else have these problems with pregexp?
> 2. Is there a better regexp package for Lisp with all these
> features (on Windows, with open-source CLs)?
You could look at CLISP's regexp package:
http://clisp.sourceforge.net/impnotes.html#regexp
[warning, this is an anchor in CLISP's 1 MB horror of an
implementation notes file]
It has the advantage of POSIX (i.e., standards-conformant) semantics,
and its guts are in C (which, on CLISP, means speed). Because it uses
the host C system's regexp facilities, it's "Unix"-only, but I put the
scare-quotes in there because I'd be surprised if "Unix" didn't
include Cygwin.
Thomas F. Burdick wrote:
>
> You could look at CLISP's regexp package:
> http://clisp.sourceforge.net/impnotes.html#regexp
> [warning, this is an anchor in CLISP's 1 MB horror of an
> implementation notes file]
>
There is also a one-chapter-per-page version of the CLISP implementation
notes at http://clisp.sourceforge.net/impnotes/ -- the regexp docs are
at http://clisp.sourceforge.net/impnotes/modules.html#regexp.
Joe
Jens Himmelreich schrieb:
> Hi,
>
> I'm trying to use a regular-expression package with Lisp.
> I'm working with CLISP on a PC. I found pregexp by Dorai
> Sitaram (http://www.ccs.neu.edu/home/dorai/pregexp/pregexp.html).
> It has the regexp features I'm looking for, especially lookaheads.
> But I have two problems. With very long strings (> 1000 characters)
> I often get a stack overflow (with the compiled version of the file;
> LispWorks has the same stack overflow). And the performance is bad.
> I have a Python script which I want to recode and extend in Lisp.
> The Python script takes half a second to tokenize a file, the
> Lisp implementation 30 seconds. Maybe my code is wrong, but the
> main work is done by the pregexp module; my code is only a small
> wrapper.
>
AFAIK pregexp is written in Scheme / CL; it's not a wrapper.
There is a regexp package which doesn't have all the features pregexp
has, but in my experience it is faster
(http://ww.telent.net/cliki/Regular%20Expression).
>
> Now my questions:
> 1. Does anybody else have these problems with pregexp?
> 2. Is there a better regexp package for Lisp with all these
> features (on Windows, with open-source CLs)?
>
I guess the fastest regexp-package is the PCRE wrapper (CMUCL only).
Rolf Wester