From: Dave Bakhash
Subject: PortableAserve should use regular expressions
Date: 
Message-ID: <c291y1ch7hs.fsf@nerd-xing.mit.edu>
hey,

it's good to see a syntactic improvement of the META syntax.  However,
most people are fairly comfortable with widely accepted regular
expressions.

I emailed Jochen twice about using one of the Lisp-based regexp packages
in place of the META stuff, e.g.:

 http://www.geocities.com/mparker762/clawk.html#regex

I got no response.  However, I think that this would drastically
simplify the codebase, as well as making it more readable and usable by
the majority of people who would like to use it.

dave

From: Jochen Schmidt
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <b4o4jk$g8t$05$1@news.t-online.com>
Dave Bakhash wrote:

> hey,
> 
> it's good to see a syntactic improvement of the META syntax.  However,
> most people are fairly comfortable with widely accepted regular
> expressions.
> 
> I emailed Jochen twice about using one of the Lisp-based regexp packages
> in place of the META stuff, e.g.:
> 
>  http://www.geocities.com/mparker762/clawk.html#regex
> 
> I got no response.  However, I think that this would drastically
> simplify the codebase, as well as making it more readable and usable by
> the majority of people who would like to use it.

Hm... I remember having answered that we try to have as few dependencies as
possible with paserve. Using an external regex package for the few places
where META is used would be a bit overkill (IMHO).

I don't see in what way adding another external package dependency would
make the codebase simpler.

I'm willing to discuss the merits of changing the codebase in such ways but
if there are only aesthetic arguments then I don't know if it is worth the
effort.

ciao,
Jochen
From: Dave Bakhash
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <c29wuj4ff1k.fsf@nerd-xing.mit.edu>
Jochen Schmidt <···@dataheaven.de> writes:

> Hm... I remember having answered that we try to have as few
> dependencies as possible with paserve. Using an external regex package
> for the few places where META is used would be a bit overkill (IMHO).

Because the CL community, with its fragmented vendor support, never had
a decent system of "standard libraries", lots of programs end up
reproducing what should be either part of the language, or part of a
standard set of libraries.

So what ends up happening is that you start out with some non-ideal
piece of code that does just enough of what a more mature and powerful
library can do, and then feel that it's the better way, because it makes
the overall code smaller.

In this case, the webserver makes sufficient use of what anyone else
would see as an obvious application of regular expressions, but you have
chosen to implement a non-standard, virtually unusable-by-the-masses,
complicated way to do something rather easy.

If you wanted, you could have used the simple nregex.lisp package -- a
single-file, relatively small, program, and it still would have been
better.

The purpose of writing something like "Portable Allegroserve" is not
just so that it runs on many CL implementations, but that it be
comprehensible by most people.  But it's open-source, and of course,
anyone can dive in, and rip out the vile stuff, replace it with the more
standard, readable, usable stuff, etc.

This is probably one case where a Lisp programmer has an obsession
with an academic article and wants to use it, at the expense of the
value of the program.

The most value you could add to a program these days is by making it
more readable, extensible, powerful, understandable, and flexible.  If
your META implementation is 1/10 the size, code-wise, as what I
suggested using, that is a negligible cost, compared to making your code
depend on something that few understand (nor need to understand), and
even fewer can improve upon (and not because there's no room for
improvement, but because most people can't grok it in the first place).

If this is an ego thing, then I suggest that it take the back seat.

> if there are only aesthetic arguments then I don't know if it is worth
> the effort.

If you were sensitive to your audience, you'd find that using META
vs. regular expressions is far from an "aesthetic" deviation.

dave
From: Jochen Schmidt
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <b4ogr7$iij$07$1@news.t-online.com>
Dave Bakhash wrote:

> Jochen Schmidt <···@dataheaven.de> writes:
> 
>> Hm... I remember having answered that we try to have as few
>> dependencies as possible with paserve. Using an external regex package
>> for the few places where META is used would be a bit overkill (IMHO).
> 
> Because the CL community, with its fragmented vendor support, never had
> a decent system of "standard libraries", lots of programs end up
> reproducing what should be either part of the language, or part of a
> standard set of libraries.

I think Daniel Barlow's ASDF Defsystem facility and the effort around
vn-cclan look promising for solving this "third party library problem".

But even if it is easy, I see no need to include nontrivial libraries like a
regex package if, for the very few places where I need it, META is good
enough.

 
> So what ends up happening is that you start out with some non-ideal
> piece of code that does just enough of what a more mature and powerful
> library can do, and then feel that it's the better way, because it makes
> the overall code smaller.

Be assured that there are a lot more cases where META-like parsers can be
applied - not only the two I used in Portable AServe.
 
> In this case, the webserver makes sufficient use of what anyone else
> would see as an obvious application of regular expressions, but you have
> chosen to implement a non-standard, virtually unusable-by-the-masses,
> complicated way to do something rather easy.

Is this really so obvious?
IMHO META is not really complicated compared to regex packages.

> If you wanted, you could have used the simple nregex.lisp package -- a
> single-file, relatively small, program, and it still would have been
> better.

No, there are license problems with this library.

> The purpose of writing something like "Portable Allegroserve" is not
> just so that it runs on many CL implementations, but that it be
> comprehensible by most people.  But it's open-source, and of course,
> anyone can dive in, and rip out the vile stuff, replace it with the more
> standard, readable, usable stuff, etc.

The purpose of writing something like "Portable Allegroserve" was and is,
first and foremost, to fit the needs of those who joined the effort to
actively work on it.
 
> This is probably one case where a Lisp programmer has an obsession
> with an academic article and wants to use it, at the expense of the
> value of the program.

Bzzt! Wrong - this is one case where a Lisp programmer pragmatically decided
not to use another complicated system just to solve a problem which can be
solved with a few lines of code.

> The most value you could add to a program these days is by making it
> more readable, extensible, powerful, understandable, and flexible.  If
> your META implementation is 1/10 the size, code-wise, as what I
> suggested using, that is a negligible cost, compared to making your code
> depend on something that few understand (nor need to understand), and
> even fewer can improve upon (and not because there's no room for
> improvement, but because most people can't grok it in the first place).

I think META is widely understood in the Lisp community - the fact that
every second Lisp programmer seems to have his own implementation lying
around might imply this.

> If this is an ego thing, then I suggest that it take the back seat.

Huh? In what way should this be an ego thing? Everyone is free to either
join our effort or create his own project if our goals absolutely do not
fit together.

Please try to understand that Open Source (and particularly Free Software)
means getting your own hands dirty if you really want to get something done.

There are somewhere around 34 lines of META code in Portable Allegroserve.
I really do not think that this is such a critical point, and since there
are no technical problems (that I know of) with this code, I see no good
reason to replace it with something untested.

Allegroserve makes heavy use of IF* and LOOPs with explicit calls to RETURN
which _I_ and AFAIK most other developers of Portable AllegroServe do not
really like. But as long as the code using that stuff works there is no
reason to change that. The only thing we informally agreed on is that we do
not necessarily follow the advice to use those facilities for new code we
write.

>> if there are only aesthetic arguments then I don't know if it is worth
>> the effort.
> 
> If you were sensitive to your audience, you'd find that using META
> vs. regular expressions is far from an "aesthetic" deviation.

We do not work particularly for this audience. If the goals of the audience
meet those of the guys who actually get their hands dirty, then both profit
from it. If they differ, the audience is welcome to join or, if they do not
want that, to compensate the developers for doing work they did not really
plan to do for their own good.

To bring the question to the point:
I personally do not yet have any problems with the META code in Portable
AllegroServe. I definitely would have a problem if Portable Allegroserve
suddenly depended on another non-trivial library (with no actual gain
in terms of new features). This does _not_ mean in any way that I would
"forbid" such a move - it's just a fact that I'm not convinced enough to
get *my* hands dirty.

ciao,
Jochen
From: Chris Double
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <u7kb48c74.fsf@double.co.nz>
Dave Bakhash <·····@alum.mit.edu> writes:

> In this case, the webserver makes sufficient use of what anyone else
> would see as an obvious application of regular expressions, but you
> have chosen to implement a non-standard, virtually
> unusable-by-the-masses, complicated way to do something rather easy.

From what I've seen usage of Meta is fairly common in Common Lisp. I
see mention of it here and of people using it fairly often. What do
you see as the advantages of regular expressions over the Meta
approach?

Meta has some advantages over regular expressions. From the Henry
Baker paper:

  "If all META did was recognize regular expressions, it would not be
  very useful. It is a programming language, however, and the
  operations [], {} and $ correspond to the Common Lisp control
  structures AND, OR, and DO. Therefore, we can utilize META to not
  only parse, but also to transform. In this way, META is analogous to
  "attributed grammars" [Aho86], but it is an order of magnitude
  simpler and more efficient. Thus, with the addition of the "escape"
  operation "!", which allows us to incorporate arbitrary Lisp
  expressions into META, we can not only parse integers, but produce
  their integral value as a result."

What are your thoughts on this?

Chris.
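
For readers who have not seen the idea in the Baker quote, here is a rough
sketch of "parse and produce the value in one pass". It is in Python, not
Lisp, and it is not META itself nor code from Portable AllegroServe: the
function name `parse_integer` and its return convention are made up for
illustration. The optional sign plays the role of META's alternation, the
"at least one digit" check is the sequence, and the loop is the repetition,
with ordinary code (the "escape") accumulating the value as it goes.

```python
DIGITS = "0123456789"

def parse_integer(text, pos=0):
    """Match an optional sign followed by one or more ASCII digits.

    Returns (value, next_position) on success, or None if no integer
    starts at `pos`. Hypothetical helper, for illustration only.
    """
    sign = 1
    # Alternation: "+" or "-" or nothing.
    if pos < len(text) and text[pos] in "+-":
        if text[pos] == "-":
            sign = -1
        pos += 1
    # Sequence: at least one digit has to follow.
    if pos >= len(text) or text[pos] not in DIGITS:
        return None
    value = 0
    # Repetition, with the "action" (building the value) inline.
    while pos < len(text) and text[pos] in DIGITS:
        value = value * 10 + DIGITS.index(text[pos])
        pos += 1
    return sign * value, pos

print(parse_integer("-42abc"))  # (-42, 3)
print(parse_integer("abc"))     # None
```

A plain regular expression would tell you that "-42" matched; turning the
match into the number -42 is a separate step.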
From: Dave Bakhash
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <c29bs0ftbyk.fsf@no-knife.mit.edu>
Chris Double <·····@double.co.nz> writes:

> Meta has some advantages over regular expressions. From the Henry
> Baker paper..

The original AllegroServe program, by Franz, does not use Meta.  It uses
regular expressions.  There are many reasons (standard, simple,
universal).

As an example, suppose you wanted to write a regular expression to match
a phone number.  A simple yet powerful regular expression might be as
simple as:

 [^\d]*(\d{3})[^\d]*(\d{3})[^\d]*(\d{4})

This will parse most phone numbers, as well as store the contents in
registers (so you have the area code, exchange, etc.)

To do this using a META-style parser is more work, more lines of code,
and of course not language-neutral.  Parsing is an area of programming
that is common enough that (IMO) programmers should leverage standard
parsing tools and methodologies where possible.  In the example of
parsing a URI, phone number, etc, it's easy to look up a tested regexp
and then use it.

dave
From: Marc Spitzer
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <86adfzumpx.fsf@bogomips.optonline.net>
Dave Bakhash <·····@alum.mit.edu> writes:

> Chris Double <·····@double.co.nz> writes:
> 
> > Meta has some advantages over regular expressions. From the Henry
> > Baker paper..
> 
> The original AllegroServe program, by Franz, does not use Meta.  It uses
> regular expressions.  There are many reasons (standard, simple,
> universal).

Regular expressions are not: 

Standard: perl, perl with extended re syntax, tcl, awk, python use
different formatting conventions to describe the same pattern.  I will
admit that for a small subset of easy problems (is this an integer?) they
are close.

Simple: Write a re to correctly parse an 822-defined email address,
all the legal possibilities

Universal: no, it is not a standard part of the following languages:
C, C++, fortran, cobol, ada, Common Lisp, delphi...  You can get a lib 
that will do re's, but that is not even close to universal, and each lib
uses a different syntax to describe the re in question.

Also, parsers solve a larger set of problems than re's do.  


> 
> As an example, suppose you wanted to write a regular expression to match
> a phone number.  A simple yet powerful regular expression might be as
> simple as:
> 
>  [^\d]*(\d{3})[^\d]*(\d{3})[^\d]*(\d{4})

and it is wrong for many inputs, for example with a leading 1:

1 800 555 1212

works, but

18005551212

gives:

180 055 5121

this is not good data validation
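
The two cases above can be reproduced with a few lines of Python. This is a
sketch under assumptions: the thread does not say which regexp engine or
which matching mode is meant, so the code below uses Python's `re` module
(Perl-style syntax) with an unanchored search, and `split_phone` is a
made-up helper name.

```python
import re

# The pattern from the post, unchanged.
PHONE_RE = re.compile(r"[^\d]*(\d{3})[^\d]*(\d{3})[^\d]*(\d{4})")

def split_phone(text):
    """Return the three captured groups, or None if nothing matches."""
    match = PHONE_RE.search(text)
    return match.groups() if match else None

print(split_phone("1 800 555 1212"))  # ('800', '555', '1212')
print(split_phone("18005551212"))     # ('180', '055', '5121')
```

In the second call the pattern still matches; it just groups the wrong
digits, so nothing signals that the result is bad.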

> 
> This will parse most phone numbers, as well as store the contents in
> registers (so you have the area code, exchange, etc.)
> 
> To do this using a META-style parser is more work, more lines of code,
> and of course not language-neutral.  Parsing is an area of programming
> that is common enough that (IMO) programmers should leverage standard
> parsing tools and methodologies where possible.  In the example of
> parsing a URI, phone number, etc, it's easy to look up a tested regexp
> and then use it.

To write a parser for anything is work, so do it once and stick it in
a lib.  And using Common Lisp is inherently non-portable to other 
programming languages; GCC will barf on CL code.

Regular expressions are a strict subset of parsers; they cannot do
everything that a parser can.

marc

> 
> dave
From: Thomas F. Burdick
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <xcvbs0ekb69.fsf@apocalypse.OCF.Berkeley.EDU>
Marc Spitzer <········@optonline.net> writes:

> Dave Bakhash <·····@alum.mit.edu> writes:
> 
> > Chris Double <·····@double.co.nz> writes:
> > 
> > > Meta has some advantages over regular expressions. From the Henry
> > > Baker paper..
> > 
> > The original AllegroServe program, by Franz, does not use Meta.  It uses
> > regular expressions.  There are many reasons (standard, simple,
> > universal).
> 
> Regular expressions are not: 

I'm with you on everything except:

> Standard:

POSIX 1003.2 specifies a regular expression syntax.  Unfortunately,
none of:

> perl, perl with extended re syntax, tcl, awk, python

use it.  At least Emacs supports it, though.

-- 
           /|_     .-----------------------.                        
         ,'  .\  / | No to Imperialist war |                        
     ,--'    _,'   | Wage class war!       |                        
    /       /      `-----------------------'                        
   (   -.  |                               
   |     ) |                               
  (`-.  '--.)                              
   `. )----'                               
From: Marc Spitzer
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <8665qmvh97.fsf@bogomips.optonline.net>
···@apocalypse.OCF.Berkeley.EDU (Thomas F. Burdick) writes:

> Marc Spitzer <········@optonline.net> writes:
> 
> > Dave Bakhash <·····@alum.mit.edu> writes:
> > 
> > > Chris Double <·····@double.co.nz> writes:
> > > 
> > > > Meta has some advantages over regular expressions. From the Henry
> > > > Baker paper..
> > > 
> > > The original AllegroServe program, by Franz, does not use Meta.  It uses
> > > regular expressions.  There are many reasons (standard, simple,
> > > universal).
> > 
> > Regular expressions are not: 
> 
> I'm with you on everything except:
> 
> > Standard:
> 
> POSIX 1003.2 specifies a regular expression syntax.  Unfortunately,
> none of:

Drat, I knew I forgot something.  Although if I were being silly I 
could claim that although they have a standard they are not standard
because no 2 (or more) major players use the same syntax, but I would
never do that.  

> 
> > perl, perl with extended re syntax, tcl, awk, python
> 
> use it.  At least Emacs supports it, though.

Well, it's a start.

marc

From: Thomas F. Burdick
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <xcv3clp3i6g.fsf@apocalypse.OCF.Berkeley.EDU>
Marc Spitzer <········@optonline.net> writes:

> ···@apocalypse.OCF.Berkeley.EDU (Thomas F. Burdick) writes:
>
> > POSIX 1003.2 specifies a regular expression syntax.  Unfortunately,
> > none of:
> 
> Drat, I knew I forgot something.  Although if I were being silly I 
> could claim that although they have a standard they are not standard
> because no 2 (or more) major players use the same syntax, but I would
> never do that.  

They don't?  How about C/POSIX and awk (on a POSIX-conforming system)?
grep and egrep, too.  There are quite a few tools that use POSIX
regexps.  Unfortunately, there are more that seem to think that having
a non-conforming regexp syntax is a competitive advantage.  And
apparently none of these implementors thought to support POSIX regexps
as a subset of their own.

> > > perl, perl with extended re syntax, tcl, awk, python

Er, I missed "awk" in there the first time.  Oops.

From: Dave Bakhash
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <8a3667a0.0303150633.b7fd3b6@posting.google.com>
> I'm with you on everything except:
> 
> > Standard:

well, let's put it this way.  There's POSIX, and I'm sure that Perl
has support for POSIX regexps somewhere (though maybe in a module). 
Also, many languages (CL, C/C++, Java) support Perl regular
expressions.  So if you wanted to go with Perl regexps as a de-facto
standard, you could do that with relative ease.

Given a regexp in one language, it's pretty easy to get it to conform
to another.  Can you do that with a META-style parser?

dave
From: Raymond Wiker
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <867kb03ag8.fsf@raw.grenland.fast.no>
·····@alum.mit.edu (Dave Bakhash) writes:

> > I'm with you on everything except:
> > 
> > > Standard:
> 
> well, let's put it this way.  There's POSIX, and I'm sure that Perl
> has support for POSIX regexps somewhere (though maybe in a module). 
> Also, many languages (CL, C/C++, Java) support Perl regular
> expressions.  So if you wanted to go with Perl regexps as a de-facto
> standard, you could do that with relative ease.
> 
> Given a regexp in one language, it's pretty easy to get it to conform
> to another.  Can you do that with a META-style parser?

        Well, given that META is part of the PortableAserve
distribution, there's *no reason* to make it conform to anything else.

        For PortableAserve, I guess the options are:

        1) Pack it up with META.
        2) Pack it up with a regexp package with acceptable licensing.
        3) Make it use whatever regexp package the platform provides,
           with conditional code to gloss over the differences.

        I cannot see that 2) or 3) provides any advantages over 1).
The META package is smaller than a regexp package would be, and is
probably also smaller than a set of conditionals would be.

-- 
Raymond Wiker                        Mail:  ·············@fast.no
Senior Software Engineer             Web:   http://www.fast.no/
Fast Search & Transfer ASA           Phone: +47 23 01 11 60
P.O. Box 1677 Vika                   Fax:   +47 35 54 87 99
NO-0120 Oslo, NORWAY                 Mob:   +47 48 01 11 60

Try FAST Search: http://alltheweb.com/
From: Jochen Schmidt
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <b4vosq$2um$01$1@news.t-online.com>
Raymond Wiker wrote:

> ·····@alum.mit.edu (Dave Bakhash) writes:
> 
>> > I'm with you on everything except:
>> > 
>> > > Standard:
>> 
>> well, let's put it this way.  There's POSIX, and I'm sure that Perl
>> has support for POSIX regexps somewhere (though maybe in a module).
>> Also, many languages (CL, C/C++, Java) support Perl regular
>> expressions.  So if you wanted to go with Perl regexps as a de-facto
>> standard, you could do that with relative ease.
>> 
>> Given a regexp in one language, it's pretty easy to get it to conform
>> to another.  Can you do that with a META-style parser?
> 
>         Well, given that META is part of the PortableAserve
> distribution, there's *no reason* to make it conform to anything else.
> 
>         For PortableAserve, I guess the options are:
> 
>         1) Pack it up with META.
>         2) Pack it up with a regexp package with acceptable licensing.
>         3) Make it use whatever regexp package the platform provides,
>            with conditional code to gloss over the differences.
> 
>         I cannot see that 2) or 3) provides any advantages over 1).
> The META package is smaller than a regexp package would be, and is
> probably also smaller than a set of conditionals would be.

Yes definitely.

Actually the META package came into ACL-COMPAT because my URI package used
it for parsing. The date parsing functions in AllegroServe use ACL's built-in
regular expressions. NREGEX was not compatible with those regular
expressions, and then issues about licensing made clear that using it might
not be such a good idea. Since META was already there and we wanted to get
it working, I reimplemented the date parsing function using META (we talk
about ~17 lines of META code).

If I were to redo my META package today I would choose a simple s-expr syntax
like others did too, because the reader syntax of META is IMHO the only
thing that sucks a bit about it.

META is so simple to understand (and implement) completely that one is able
to write META parsers off the top of one's head without even needing any
"META package".

The _only_ case in which I, from a technical standpoint, would want to change
the date parsing function again would be to reintroduce the _original_ one.
This will only happen when ACL-COMPAT supports a bulletproof
implementation of regular expressions which is 100% compatible with those of
ACL. Even then I would first open a discussion about this on the Portable
AllegroServe mailing list to see if the other developers agree.

The thing I disliked a bit about the way this issue here came up is that it
got articulated in a way as if I or the other Portable AllegroServe
developers owe anyone something. I'm absolutely open for suggestions but I
should be free to disagree. I do not stand in the way of anyone who thinks
it's worth his time to implement this stuff and integrate it in a way that
_nobody_ suffers from it. As long as I have no problem using it the way it
is done now I do not see a need (or obligation) to spend my working time on
this.

ciao,
Jochen
From: Marc Spitzer
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <86ptosih7h.fsf@bogomips.optonline.net>
Jochen Schmidt <···@dataheaven.de> writes:

> 
> The thing I disliked a bit about the way this issue here came up is that it
> got articulated in a way as if I or the other Portable AllegroServe
> developers owe anyone something. I'm absolutely open for suggestions but I
> should be free to disagree. I do not stand in the way of anyone who thinks
> it's worth his time to implement this stuff and integrate it in a way that
> _nobody_ suffers from it. As long as I have no problem using it the way it
> is done now I do not see a need (or obligation) to spend my working time on
> this.

I agree completely.  With that said, lisp-nyc is going to be using 
Portable AllegroServe for its web server.

Thanks to the Portable AllegroServe team and Franz for this
project.

marc
From: Chris Double
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <uk7f0je8q.fsf@double.co.nz>
Jochen Schmidt <···@dataheaven.de> writes:

> NREGEX was not compatible with those regular expressions and then
> issues about licensing made clear that using it might not be such a
> good idea. Since META was already there and we wanted to get it
> working I reimplemented the date parsing function using META (we
> talk about ~17 lines of META code).

What were the licensing issues with NREGEX? Note that I fully support
the use of META, I'm just asking out of curiosity.

Chris.
From: Marc Spitzer
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <86smtoa5aw.fsf@bogomips.optonline.net>
·····@alum.mit.edu (Dave Bakhash) writes:

> > I'm with you on everything except:
> > 
> > > Standard:
> 
> well, let's put it this way.  There's POSIX, and I'm sure that Perl
> has support for POSIX regexps somewhere (though maybe in a module). 
> Also, many languages (CL, C/C++, Java) support Perl regular
> expressions.  So if you wanted to go with Perl regexps as a de-facto
> standard, you could do that with relative ease.

Are you stating that perl has support for posix REs or are you just
making something up in the hopes of "winning" a flawed and already lost 
argument?  If perl does have posix RE support please post the link, as
part of the standard perl; we are talking about standard things after all.

Now on to the claim that perl REs are standard: which one?  As has been
stated previously by me and can be verified by using perl's documentation,
perl has 2 syntactically different RE forms; here are the docs:

http://www.perldoc.com/perl5.6/pod/perlre.html

And why would you want to put forward the proposition that standards
are important and useful, so let's disregard the one we have for this
other non-standard package, which has the added benefit of changing
from release to release at the implementor's whim?

Also keep in mind much of the utility of perl is bound up in the
native RE engine/syntax.  And if you junk that you junk much of the
desire to use perl; perl without REs sucks more than perl with them.

> 
> Given a regexp in one language, it's pretty easy to get it to conform
> to another.  Can you do that with a META-style parser?

Why do I care if I can do that with a meta-style parser?  What I would
care about is that it is harder to fuck up and not catch it with a
parser (meta or otherwise) than it is with REs.  By not catch it I
mean the RE matches the data but just does not give you good results,
ye olde false positive.

Another benefit of parsers over REs is that parsers of reasonably
complex things are much, much easier to maintain than REs of the same.
The reason is that parsers are much more discrete in what they do
(match this pattern of tokens, do this to it), whereas an RE is a big
honkin' string that you hope works correctly after the change.  

marc
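
To make the "false positive" point concrete, here is a small hand-written
check for the phone-number example from earlier in the thread. It is only a
sketch: it assumes North American numbers (ten digits, optionally preceded
by a country code of 1), and `parse_phone` is a hypothetical name, not
something from Portable AllegroServe or any regexp package. Each rule is a
separate, explicit step, so an input that does not fit is rejected instead
of being silently mis-grouped.

```python
def parse_phone(text):
    """Return (area_code, exchange, line) or None if the input is not
    a plausible ten-digit number (hypothetical helper, NANP assumed)."""
    # Step 1: keep only the ASCII digits, ignoring spaces and punctuation.
    digits = "".join(ch for ch in text if ch in "0123456789")
    # Step 2: drop an optional leading country code.
    if len(digits) == 11 and digits[0] == "1":
        digits = digits[1:]
    # Step 3: anything that is not exactly ten digits is an error.
    if len(digits) != 10:
        return None
    # Step 4: split into the three fields.
    return digits[:3], digits[3:6], digits[6:]

print(parse_phone("18005551212"))     # ('800', '555', '1212')
print(parse_phone("1 800 555 1212"))  # ('800', '555', '1212')
print(parse_phone("555 1212"))        # None
```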
From: Thomas F. Burdick
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <xcvhea3yu7p.fsf@conquest.OCF.Berkeley.EDU>
Marc Spitzer <········@optonline.net> writes:

> ·····@alum.mit.edu (Dave Bakhash) writes:
> 
> > > I'm with you on everything except:
> > > 
> > > > Standard:
> > 
> > well, let's put it this way.  There's POSIX, and I'm sure that Perl
> > has support for POSIX regexps somewhere (though maybe in a module). 
> > Also, many languages (CL, C/C++, Java) support Perl regular
> > expressions.  So if you wanted to go with Perl regexps as a de-facto
> > standard, you could do that with relative ease.
> 
> Are you stating that perl has support for posix REs or are you just
> making something up in the hopes of "winning" a flawed and already lost 
> argument?  If perl does have posix RE support please post the link, as
> part of the standard perl; we are talking about standard things after all.

I would think that if there was a POSIX::Regexp module on CPAN, and you
could do:

  use POSIX::Regexp;

and from then on, your re's were standard, that would count.  Of
course, I don't see how you could even begin to do that, with the way
that regexps are so intimately integrated into Perl.  The best you
could hope for would be something like:

  my $re = new POSIX::Regexp ("...");
  my $ismatch = $re->match($some_string);
  if ($ismatch) {
    my ($foo, $bar, $baz) = ($1, $2, $3);
    ...

but given how easy Perl regexps are in perl, that's kind of like
loading a bignum library into C, and saying, "See, I can do bignums in
C!".  Well, yes, painfully, and so long as you don't have to touch
anyone else's code with them.

From: Kalle Olavi Niemitalo
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <873clnoxmd.fsf@Astalo.kon.iki.fi>
···@conquest.OCF.Berkeley.EDU (Thomas F. Burdick) writes:

> Of course, I don't see how you could even begin to do that,
> with the way that regexps are so intimately integrated into Perl.

I suppose one would overload regexp constants.  (perldoc overload)
From: Thomas F. Burdick
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <xcvel56jtpa.fsf@famine.OCF.Berkeley.EDU>
Kalle Olavi Niemitalo <···@iki.fi> writes:

> ···@conquest.OCF.Berkeley.EDU (Thomas F. Burdick) writes:
> 
> > Of course, I don't see how you could even begin to do that,
> > with the way that regexps are so intimately integrated into Perl.
> 
> I suppose one would overload regexp constants.  (perldoc overload)

Huh, possibly.  Of course, Perl isn't strongly typed, so there's no
way you could tell your POSIX regexps from other people's Perl
regexps, so I can't imagine this working well in practice.

From: Florian Weimer
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <87hea4zkef.fsf@deneb.enyo.de>
Marc Spitzer <········@optonline.net> writes:

> Simple: Write a re to correctly parse a 822 defined email address,
> all the legal possibilities

Email addresses do not form a regular language. 8-(
From: Marc Spitzer
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <86wuj0a7ox.fsf@bogomips.optonline.net>
Florian Weimer <··@deneb.enyo.de> writes:

> Marc Spitzer <········@optonline.net> writes:
> 
> > Simple: Write a re to correctly parse a 822 defined email address,
> > all the legal possibilities
> 
> Email addresses do not form a regular language. 8-(

yes, but you can write a *parser* for them.

marc
From: Tim Bradshaw
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <ey3k7f0pds6.fsf@cley.com>
* Florian Weimer wrote:

> Email addresses do not form a regular language. 8-(

Tell me about it.  A long time ago I used to have a situation where
emacs would mysteriously explode while doing some mail thing.  It
turned out to be because some cretin had thought that they could parse
RFC822 with regexps, and it fell over with nested parens in something
like ···@bar.com (Dr. (Med) Crunworthy F. Foo).

Somewhere there's a quote, isn't there: you have a problem, so you try
to solve it with regexps: now you have two problems.

--tim
From: Marc Spitzer
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <86of4cuq0g.fsf@bogomips.optonline.net>
Tim Bradshaw <···@cley.com> writes:


> Somewhere there's a quote, isn't there: you have a problem, so you try
> to solve it with regexps: now you have two problems.
> 
> --tim

in message <··············@kappa.unlambda.com>:

I recall Jamie Zawinski's quote:

"Sometimes a hacker has a problem, and he thinks to himself 'I know,
I'll solve it with a regular expression!'.  Now he has two problems."

marc
From: Rob Warnock
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <lYWcnZ9mfdOP5emjXTWc-w@speakeasy.net>
Marc Spitzer  <········@optonline.net> wrote:
+---------------
| Tim Bradshaw <···@cley.com> writes:
| > Somewhere there's a quote, isn't there: you have a problem, so you try
| > to solve it with regexps: now you have two problems.
| 
| I recall Jamie Zawinski's quote:
| 
| "Sometimes a hacker has a problem, and he thinks to himself 'I know,
| I'll solve it with a regular expression!'.  Now he has two problems."
+---------------

Yup. Halfway down this page <URL:http://www.jwz.org/hacks/marginal.html>
one sees:

	(Some people, when confronted with a problem, think ``I know,
	I'll use regular expressions.'' Now they have two problems.)

Some say it was originally posted by him in comp.lang.emacs, which
probably explains its position on this page.  ;-}


-Rob

-----
Rob Warnock, PP-ASEL-IA		<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Yarden Katz
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <86fzpm95yh.fsf@underlevel.net>
····@rpw3.org (Rob Warnock) writes:

> Marc Spitzer  <········@optonline.net> wrote:
> +---------------
> | Tim Bradshaw <···@cley.com> writes:
> | > Somewhere there's a quote, isn't there: you have a problem, so you try
> | > to solve it with regexps: now you have two problems.
> | 
> | I recall Jamie Zawinski's quote:
> | 
> | "Sometimes a hacker has a problem, and he thinks to himself 'I know,
> | I'll solve it with a regular expression!'.  Now he has two problems."
> +---------------
>
> Yup. Halfway down this page <URL:http://www.jwz.org/hacks/marginal.html>
> one sees:
>
> 	(Some people, when confronted with a problem, think ``I know,
> 	I'll use regular expressions.'' Now they have two problems.)
>
> Some say it was originally posted by him in comp.lang.emacs, which
> probably explains its position on this page.  ;-}

What are some obvious alternatives to regexps other than a META
parser, like the ones introduced in the Baker papers?
-- 
Yarden Katz <····@underlevel.net>  |  Mind the gap
From: Marc Spitzer
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <86wuiyd8g8.fsf@bogomips.optonline.net>
Yarden Katz <····@underlevel.net> writes:

> ····@rpw3.org (Rob Warnock) writes:
> 
> > Marc Spitzer  <········@optonline.net> wrote:
> > +---------------
> > | Tim Bradshaw <···@cley.com> writes:
> > | > Somewhere there's a quote, isn't there: you have a problem, so you try
> > | > to solve it with regexps: now you have two problems.
> > | 
> > | I recall Jamie Zawinski's quote:
> > | 
> > | "Sometimes a hacker has a problem, and he thinks to himself 'I know,
> > | I'll solve it with a regular expression!'.  Now he has two problems."
> > +---------------
> >
> > Yup. Halfway down this page <URL:http://www.jwz.org/hacks/marginal.html>
> > one sees:
> >
> > 	(Some people, when confronted with a problem, think ``I know,
> > 	I'll use regular expressions.'' Now they have two problems.)
> >
> > Some say it was originally posted by him in comp.lang.emacs, which
> > probably explains its position on this page.  ;-}
> 
> What are some obvious alternatives to regexps other than a META
> parser, like the ones introduced in the Baker papers?

I think the problem is not the use of regular expressions but the
improper use of them.  In other words, I should not use lex to do
yacc's work, and I have paid for doing just that in the past.

marc
From: Rob Warnock
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <Roidneocmqekl-qjXTWc-w@speakeasy.net>
Yarden Katz  <····@underlevel.net> wrote:
+---------------
| ····@rpw3.org (Rob Warnock) writes:
| > Yup. Halfway down this page <URL:http://www.jwz.org/hacks/marginal.html>
| > one sees:
| >
| > 	(Some people, when confronted with a problem, think ``I know,
| > 	I'll use regular expressions.'' Now they have two problems.)
| 
| What are some obvious alternatives to regexps other than a META
| parser, like the ones introduced in the Baker papers?
+---------------

As mentioned elsewhere in the group recently, quite often
simple ad-hoc recursive descent parsers are best, if the
language being parsed is anywhere near being LL(1). [Most
things with some kind of keywords in front are, or can be
faked into being with some fixups behind the scenes...]
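
A minimal sketch of that approach in Common Lisp, applied to the nested
parenthesised comments that came up earlier in the thread.  SKIP-COMMENT
is a made-up name, not part of any library, and the sketch assumes that
backslash-quoted characters inside comments can be ignored:

```lisp
;; Hypothetical helper, for illustration only.
;; Grammar:  comment := "(" { comment | any-char-except-parens } ")"
(defun skip-comment (string start)
  "Return the index just past the comment starting at START, or NIL
if STRING has no well-formed comment there."
  (when (and (< start (length string))
             (char= (char string start) #\())
    (let ((pos (1+ start)))
      (loop
        (when (>= pos (length string))
          (return nil))                      ; ran out of input: unbalanced
        (case (char string pos)
          (#\) (return (1+ pos)))            ; end of this comment
          (#\( (setf pos (or (skip-comment string pos)
                             (return nil)))) ; nested comment: recurse
          (t (incf pos)))))))                ; ordinary character

;; (skip-comment "(Dr. (Med) Crunworthy F. Foo)" 0)  => 29
;; (skip-comment "(a (b)" 0)                         => NIL
```

Each nesting level is handled by one recursive call, which is the part
a true regular expression cannot express.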


-Rob

-----
Rob Warnock, PP-ASEL-IA		<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Daniel Barlow
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <87el5ae77h.fsf@noetbook.telent.net>
Dave Bakhash <·····@alum.mit.edu> writes:

> As an example, suppose you wanted to write a regular expression to match
> a phone number.  A simple yet powerful regular expression might be as
> simple as:
>
>  [^\d]*(\d{3})[^\d]*(\d{3})[^\d]*(\d{4})
>
> This will parse most phone numbers, as well as store the contents in
> registers (so you have the area code, exchange, etc.)

YM "most US phone numbers" HTH

It leaves the rest of the world out in the cold.


-dan

-- 

   http://www.cliki.net/ - Link farm for free CL-on-Unix resources 
From: Ingvar Mattsson
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <87llziz10i.fsf@gruk.tech.ensign.ftech.net>
Daniel Barlow <···@telent.net> writes:

> Dave Bakhash <·····@alum.mit.edu> writes:
> 
> > As an example, suppose you wanted to write a regular expression to match
> > a phone number.  A simple yet powerful regular expression might be as
> > simple as:
> >
> >  [^\d]*(\d{3})[^\d]*(\d{3})[^\d]*(\d{4})
> >
> > This will parse most phone numbers, as well as store the contents in
> > registers (so you have the area code, exchange, etc.)
> 
> YM "most US phone numbers" HTH
> 
> It leaves the rest of the world out in the cold.

Amusingly enough, I do actually use regular expressions in one new
thing I'm doodling on (transforming ISBN from canonical form to
delimited form, for easier looking-at).

Saying that, I might just as well have used a triplet of numbers, one
for how many digits in the group, one for how many digits in the
publisher ID and one for how many digits in the publication ID.

However, as-is, I can (theoretically) do data validation in the regexp
and I find that neat.
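
The triplet alternative could look roughly like this in Common Lisp.
DELIMIT-ISBN and the field widths in the example call are invented for
illustration; real widths depend on the group and publisher ranges, and
the sketch does no validation at all:

```lisp
;; Hypothetical helper, for illustration only.
(defun delimit-isbn (isbn group-len publisher-len title-len)
  "Split the ISBN string at the given field widths and join the parts
with hyphens.  Whatever follows the three fields (the check digit)
becomes the last part."
  (let* ((a group-len)
         (b (+ a publisher-len))
         (c (+ b title-len)))
    (format nil "~A-~A-~A-~A"
            (subseq isbn 0 a)
            (subseq isbn a b)
            (subseq isbn b c)
            (subseq isbn c))))

;; (delimit-isbn "0123456789" 1 5 3)  => "0-12345-678-9"
```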

//Ingvar
-- 
(defun m (a b) (cond ((or a b) (cons (car a) (m b (cdr a)))) (t ())))
From: Tim Bradshaw
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <ey3fzpqjiws.fsf@cley.com>
* Daniel Barlow wrote:

> YM "most US phone numbers" HTH

And more excitingly, most phone numbers formatted the conventional
way.  I don't know about the US, but in the UK it's not quite clear
how phone numbers are formatted.  I write mobile numbers as:

    07xy abc defg

But most people write them as


    07xya bcd efg

or

    07xya bcdefg

I'm not sure which is `right' though I suspect I'm wrong.  Certainly
most phone numbers in the UK were traditionally exchange and three
digits (Markyate 601, say).  Later many of them became Main exchange,
sub-exchange (three digits) and three digits (so the previous number
changed to Luton 840 601 sometime in the 70s).  I think London
numbers, though, may well have been London, London exchange (three
digits), four digits (01 123 4567, say) for a very long time (and
that's where my grouping comes from).

--tim
From: Thomas F. Burdick
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <xcvznnx234f.fsf@apocalypse.OCF.Berkeley.EDU>
Tim Bradshaw <···@cley.com> writes:

> * Daniel Barlow wrote:
> 
> > YM "most US phone numbers" HTH
> 
> And more excitingly, most phone numbers formatted the conventional
> way.  I don't know about the US, but in the UK it's not quite clear
> how phone numbers are formatted.

Hee hee, y'all might have the metric system, but at least we can agree
on how to write phone numbers (small victories).  In the US, there's
one canonical representation:

  (aaa) ppp-nnnn

Where aaa is the area-code, ppp is the prefix, and nnnn is four
digits.  Sometimes numbers are also written in dialing-instructions
format:

  1-aaa-ppp-nnnn

because that's how you actually dial them, if it's a non-local call.
Any format other than those two is gratuitously weird.
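
Both layouts reduce to the same ten digits, so one way to read them
without a regexp is to keep only the digits and cut at fixed positions.
A rough Common Lisp sketch; PARSE-US-PHONE is a hypothetical name and
the 555 number is a placeholder:

```lisp
;; Hypothetical helper, for illustration only.
(defun parse-us-phone (string)
  "Return (values area prefix number) as strings, or NIL if STRING does
not contain exactly ten digits (eleven with a leading 1)."
  (let ((digits (remove-if-not #'digit-char-p string)))
    (when (and (= (length digits) 11)
               (char= (char digits 0) #\1))
      (setf digits (subseq digits 1)))   ; drop the long-distance 1
    (when (= (length digits) 10)
      (values (subseq digits 0 3)
              (subseq digits 3 6)
              (subseq digits 6)))))

;; (parse-us-phone "(555) 555-0123")  => "555", "555", "0123"
;; (parse-us-phone "1-555-555-0123")  => "555", "555", "0123"
;; (parse-us-phone "555-0123")        => NIL
```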

From: Raymond Toy
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <4nptotvf43.fsf@edgedsp4.rtp.ericsson.se>
>>>>> "Thomas" == Thomas F Burdick <···@apocalypse.OCF.Berkeley.EDU> writes:

    Thomas> Tim Bradshaw <···@cley.com> writes:
    >> * Daniel Barlow wrote:
    >> 
    >> > YM "most US phone numbers" HTH
    >> 
    >> And more excitingly, most phone numbers formatted the conventional
    >> way.  I don't know about the US, but in the UK it's not quite clear
    >> how phone numbers are formatted.

    Thomas> Hee hee, y'all might have the metric system, but at least we can agree
    Thomas> on how to write phone numbers (small victories).  In the US, there's
    Thomas> one canonical representation:

    Thomas>   (aaa) ppp-nnnn

Yes, but when I was a little kid, I usually only had to remember and
use 5 digits to make a local call:  p-nnnn.

These all went away, I think, when digital switches came.

Ray
From: Ingvar Mattsson
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <878yviyvo9.fsf@gruk.tech.ensign.ftech.net>
Tim Bradshaw <···@cley.com> writes:

> * Daniel Barlow wrote:
> 
> > YM "most US phone numbers" HTH
> 
> And more excitingly, most phone numbers formatted the conventional
> way.  I don't know about the US, but in the UK it's not quite clear
> how phone numbers are formatted.  I write mobile numbers as:
> 
>     07xy abc defg

If I could be bothered, I'd head off and check the prefixes assigned
to the operators off Oftel's web pages. Just to be sure, you see.

If the operators have varying 4-5 digit prefixes, I'd go with
whatever's right for the prefix, if there seems to be an easy
discriminator.

In Sweden, the general recommended rule was <areacode> and then group
phone numbers as:
xxx yy
xx yy zz
xxx yy zz
xxx yyy zz

//Ingvar
-- 
(defun m (f)
  (let ((db (make-hash-table :test #'equal)))
    #'(lambda (&rest a)
        (or (gethash a db) (setf (gethash a db) (apply f a))))))
From: Tim Bradshaw
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <ey3u1e6i033.fsf@cley.com>
* Ingvar Mattsson wrote:

> If I could be bothered, I'd head off and check the prefixes assigned
> to the operators off Oftel's web pages. Just to be sure, you see.

So would I.  But what you need to parse is the data you have, not the
data you *should* have.

--tim
From: Nils Goesche
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <lybs0e82j6.fsf@cartan.de>
Ingvar Mattsson <······@cathouse.bofh.se> writes:

> Tim Bradshaw <···@cley.com> writes:
> 
> > I write mobile numbers as:
> > 
> >     07xy abc defg
> 
> If I could be bothered, I'd head off and check the prefixes assigned
> to the operators off Oftel's web pages. Just to be sure, you see.
> 
> If the operators have varying 4-5 digit prefixes, I'd go with
> whatever's right for the prefix, if there seems to be an easy
> discriminator.
> 
> In Sweden, the general recommended rule was <areacode> and then group
> phone numbers as:
> xxx yy
> xx yy zz
> xxx yy zz
> xxx yyy zz

There is a standard:

``Notation for national and international telephone numbers, e-mail
  addresses and Web addresses''

(ITU-T E.123)

It doesn't say much about grouping, though.  Mainly:

2.9 Grouping the digits of a telephone number is advisable for reasons
    of memorizing, oral presentation, and printing.

and

9.1 Grouping of digits in a telephone number should be accomplished by
    means of spaces unless an agreed upon explicit symbol
    (e.g. hyphen) is necessary for procedural purposes. Only spaces
    should be used in an international number.

Regards,
-- 
Nils Gösche
"Don't ask for whom the <CTRL-G> tolls."

PGP key ID 0x0655CFA0
From: Jochen Schmidt
Subject: Re: PortableAserve should use regular expressions
Date: 
Message-ID: <b4ohi3$nnl$03$1@news.t-online.com>
Dave Bakhash wrote:

> hey,
> 
> it's good to see a syntactic improvement of the META syntax.  However,
> most people are fairly comfortable with widely accepted regular
> expressions.
> 
> I emailed Jochem twice about using one of the Lisp-based regexp packages
> in place of the META stuff, e.g.:

For the sake of correctness and to test if there are any problems with one
of our accounts:

Please search in your Mailbox for a Message with 

  Message-Id: <······················@dataheaven.de>

Written on Wed, 25 Dec 2002 13:48:00 +0100

With the following content:
-----------------------------------------------------------------------
On Tuesday, 24 December 2002 19:58, Dave Bakhash wrote:
> Jochen,
>
> I don't know if you've updated PortableAserve, but now that there are
> several good alternatives for regular expressions, it would be nice to
> replace the meta-based stuff in portableaserve with one of the two
> regular expression packages that have recently been created.  I haven't
> played with either of them, but they are apparently pretty fast, and
> have a more standard interface.
>
> One was posted recently by Edi Weitz (message ID:
> <··············@bird.agharta.de>), and the other is cl-regex.

Yes I've seen that.

If it were only a matter of rewriting the parts that use Meta (or even
just reverting them to the original AServe sources; I think only some
date parsing routines use it), then I would be all for it.

The good thing about meta is that it is only a few lines of code and
therefore easy to integrate into acl-compat. A package like cl-regex
would have to be used as an add-on package. I don't want to take on
responsibility for maintaining a separate branch of such a non-trivial
package. Making it an add-on would mean having a dependency that was not
there before. New dependencies like this do not make maintenance, or
even user installation, any easier.

These reasons would carry no weight if there were a substantial benefit
in changing those parts to use regexes, even if that means introducing
new dependencies.

Until now the date parsing routines have never caused any problems, so I
don't see an immediate reason to replace them. Did you run into any
problems with them?

I'm open to discussing this change and I'm not really against it. I
would make it depend on what the other PAserve developers think of it.

Maybe a compromise solution would be to leave the meta-based code in but
add conditionals that take effect if a regex package like cl-regex is
present?

ciao,
Jochen
------------------------------------------------------------------------------

I have not received any answer from you since then.

ciao,
Jochen