From: Xah Lee
Subject: how to use parsing expressing grammar
Date: 
Message-ID: <6b8a1070-1a89-48b0-9287-343b673b5758@a29g2000pra.googlegroups.com>
There are 2 parsing expression grammars in elisp.

    * http://www.emacswiki.org/cgi-bin/wiki/ParserCompiler, (2008) by
Mike Mattie.
    * http://www.emacswiki.org/emacs/ParsingExpressionGrammars (2008)
by Helmut Eller.

The second one seems simpler, and i'm trying to learn it as a regex
replacement, but don't know how to use it.

Could anyone give concrete example in the following scenario?

For example, on my website i have things like:

 <hr>
 <p>Related essays:</p>
 <ul>
 <li><a href="someFilePath1">SomeTitleString1</a> someSring1</li>
 <li><a href="someFilePath2">SomeTitleString2</a> someSring2</li>
 ...
 </ul>

Suppose i want to change them to:

 <hr>
 <p>See also:</p>
 <p>
 <a href="someFilePath1">SomeTitleString1</a> someSring1<br>
 <a href="someFilePath2">SomeTitleString2</a> someSring2<br>
 ...
 </p>

How do i do it?

Thanks.

  Xah
∑ http://xahlee.org/

☄

From: budden
Subject: Re: how to use parsing expressing grammar
Date: 
Message-ID: <87fc41cd-e672-4ca0-9421-2b75173e8d89@w1g2000prk.googlegroups.com>
There are some HTML/XML parsing libraries around, you don't need to
write the parser.
E.g., http://www.cl-user.net/asp/libs/cl-html-parse (not sure this
is the best one).
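
A minimal sketch of the library route, for illustration only. It assumes
Python's standard html.parser as a stand-in for the Lisp libraries named
above, and it assumes the list after "Related essays:" is flat (no nested
<ul>). The names RelatedListRewriter and rewrite are made up for this
example.

```python
from html.parser import HTMLParser


class RelatedListRewriter(HTMLParser):
    """Re-emit HTML, rewriting the <ul> that follows 'Related essays:'."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []         # output fragments, joined at the end
        self.armed = False    # True once the "Related essays:" text was seen
        self.in_list = False  # True while inside the list being rewritten

    def handle_starttag(self, tag, attrs):
        if tag == "ul" and self.armed:
            self.armed, self.in_list = False, True
            self.out.append("<p>")
        elif tag == "li" and self.in_list:
            pass  # the <li> wrapper is dropped
        else:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "ul" and self.in_list:
            self.in_list = False
            self.out.append("</p>")
        elif tag == "li" and self.in_list:
            self.out.append("<br>")  # each list item becomes a <br> line
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if data.strip() == "Related essays:":
            self.armed = True
            data = data.replace("Related essays:", "See also:")
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def rewrite(html):
    """Return `html` with the related-essays list converted to a paragraph."""
    parser = RelatedListRewriter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling rewrite() on the text of a page returns the converted text.
Everything outside the marked list is passed through as it was parsed.
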
From: Xah Lee
Subject: Re: how to use parsing expressing grammar
Date: 
Message-ID: <72396362-7322-4b8e-bb15-9f629602d021@a26g2000prf.googlegroups.com>
On Dec 17, 4:46 am, budden <···········@mail.ru> wrote:
> There are some HTML/XML parsing libraries around, you don't need to
> write the parser.
> E.g.,http://www.cl-user.net/asp/libs/cl-html-parse(not surely this
> is the best one).

On Dec 17, 3:53 am, Xah Lee <······@gmail.com> wrote:
> There are 2 parsing expression grammars in elisp.
>
>     * http://www.emacswiki.org/cgi-bin/wiki/ParserCompiler, (2008) by
> Mike Mattie.
>     * http://www.emacswiki.org/emacs/ParsingExpressionGrammars(2008)
> by Helmut Eller.
>
> The second one seems simpler, and i'm trying to learn it as a regex
> replacement, but don't know how to use it.
>
> Could anyone give concrete example in the following scenario?
>
> For example, on my website i have things like:
>
>  <hr>
>  <p>Related essays:</p>
>  <ul>
>  <li><a href="someFilePath1">SomeTitleString2</a> someSring1</li>
>  <li><a href="someFilePath2">SomeTitleString2</a> someSring2</li>
>  ...
>  </ul>
>
> Suppose i want to change them to:
>
>  <hr>
>  <p>See also:</p>
>  <p>
>  <a href="someFilePath1">SomeTitleString1</a> someSring1<br>
>  <a href="someFilePath2">SomeTitleString2</a> someSring2<br>
>  ...
>  </p>
>
> How do i do it?

For those interested in parsing expression grammars,
Helmut Eller has given a detailed answer to this problem here:

http://groups.google.com/group/comp.emacs/browse_frm/thread/acb19cb47f2e632f

------------------

For the non sequiturs from “budden” and “Torsten Zühlsdorff”, perhaps
the following article of mine will be pertinent:

• http://groups.google.com/group/gnu.emacs.help/msg/34b3ad89e9d089e9

relevant excerpt:

I'm mainly interested in learning about parsing expression grammar,
because its power is an order of magnitude higher than regex. In
practice, this just means it's the next generation of regex.

the sed code won't work in general... because as soon as you have some
other chars or a slight variation, it stops working. You'll need to code
up a lot of variations and conditionals. All the ~4000 html pages on my
website are valid html4. Even at the level of strictness of valid xml,
regex simply can't work.

if i wanted to, i could've used Perl, of which i'm a master, and which
is far more powerful than sed. Emacs regex is less powerful than
perl's, but in my opinion, because you can move the cursor about
freely thanks to elisp's buffer datatype, its power as a text processing
lang beats Perl despite Perl's more powerful regex. But still, they
are all inferior to parsing expression grammar.

i think parsing expression grammar is so important that it should be
in the core of emacs soon, perhaps coded in C.

See also:

• Text Processing: Elisp vs Perl
  http://xahlee.org/emacs/elisp_text_processing_lang.html

• Pattern Matching vs Lexical Grammar Specification
  http://xahlee.org/cmaci/notation/pattern_matching_vs_pattern_spec.html

  Xah
∑ http://xahlee.org/

☄
From: Torsten Zühlsdorff
Subject: Re: how to use parsing expressing grammar
Date: 
Message-ID: <giauq6$67e$2@news.motzarella.org>
Xah Lee wrote:
> There are 2 parsing expression grammars in elisp.
> 
>     * http://www.emacswiki.org/cgi-bin/wiki/ParserCompiler, (2008) by
> Mike Mattie.
>     * http://www.emacswiki.org/emacs/ParsingExpressionGrammars (2008)
> by Helmut Eller.
> 
> The second one seems simpler, and i'm trying to learn it as a regex
> replacement, but don't know how to use it.
> 
> Could anyone give concrete example in the following scenario?
> 
> For example, on my website i have things like:
> 
>  <hr>
>  <p>Related essays:</p>
>  <ul>
>  <li><a href="someFilePath1">SomeTitleString2</a> someSring1</li>
>  <li><a href="someFilePath2">SomeTitleString2</a> someSring2</li>
>  ...
>  </ul>
> 
> Suppose i want to change them to:
> 
>  <hr>
>  <p>See also:</p>
>  <p>
>  <a href="someFilePath1">SomeTitleString1</a> someSring1<br>
>  <a href="someFilePath2">SomeTitleString2</a> someSring2<br>
>  ...
>  </p>
> 
> How do i do it?

By using DOMDocument. cl-dom is a good starting point.

Greetings from Germany,
Torsten
-- 
http://www.dddbl.de - a database layer that abstracts working with 8
different database systems,
separates queries from applications, and can automatically evaluate
query results.
From: Xah Lee
Subject: Re: how to use parsing expressing grammar
Date: 
Message-ID: <067151a4-b892-4e9a-8600-e2dcd92cb0ad@v18g2000pro.googlegroups.com>
On Mar 3, 9:59 am, Mike Mattie <···········@gmail.com> wrote:
> On Tue, 03 Mar 2009 17:34:41 +0000
>
> Leo <·······@gmail.com> wrote:
> > On 2008-12-21 09:49 +0000, Helmut Eller wrote:
> > >> I'm pretty sure if you create it, more and more people will join
> > >> it. I'm very interested in PEG and think it is of critical
> > >> importance.
>
> > > I'll try to set up project at savannah.
>
> > I have seen the project on
> >http://savannah.nongnu.org/projects/emacs-peg. I wonder it might be a
> > good idea to make a newsgroup on gmane to link to the mailing list. It
> > will make more Emacs users subscribe to it.
>
> > Best wishes,
>
> I was working on a PEG/CFG parser compiler:http://www.emacswiki.org/cgi-bin/wiki/ParserCompiler
>
> I will be resuming the development soon. Please keep me in the loop on such efforts.

Folks,

when you create a PEG parser, please please make it a user-oriented one,
so that any user of emacs familiar with regex find-replace will be
able to use PEG for find-replace, in particular when doing
find-replace on nested text such as XML.

regex is powerful, but it doesn't do nested text. PEG comes to the
rescue. However, it needs to be regex-like, in the sense that the
program interface will be a simple source text and replacement text.
e.g. a function peg-replace that takes 2 args: the pattern text and the
text source. The pattern text can be the region, a buffer, or a
filename; the text source to work on can be given similarly. (thus,
maybe peg-replace-region, peg-replace-buffer, peg-replace-file etc.)
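
A rough sketch of what such a regex-like interface could look like,
written in Python only to keep it short and runnable. Every name here
(lit, seq, choice, star, not_, any_char, div, peg_replace) is made up for
illustration and is not the API of either elisp package. A rule is a
function taking (text, pos) and returning the end position of the match,
or None on failure.

```python
import re


def lit(s):
    """Match the literal string s."""
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None


def seq(*rules):
    """Match every rule in order."""
    def run(text, pos):
        for rule in rules:
            pos = rule(text, pos)
            if pos is None:
                return None
        return pos
    return run


def choice(*rules):
    """Ordered choice: the first rule that matches wins."""
    def run(text, pos):
        for rule in rules:
            end = rule(text, pos)
            if end is not None:
                return end
        return None
    return run


def star(rule):
    """Greedy repetition, zero or more times."""
    def run(text, pos):
        while True:
            end = rule(text, pos)
            if end is None or end == pos:
                return pos
            pos = end
    return run


def not_(rule):
    """Negative lookahead: succeed, consuming nothing, if rule fails."""
    return lambda text, pos: pos if rule(text, pos) is None else None


def any_char(text, pos):
    """Match any single character."""
    return pos + 1 if pos < len(text) else None


def div(text, pos):
    """div <- '<div>' (div / !'</div>' .)* '</div>'  (recursive rule)."""
    return _div(text, pos)


_div = seq(lit("<div>"),
           star(choice(div, seq(not_(lit("</div>")), any_char))),
           lit("</div>"))


def peg_replace(rule, text, replacement):
    """Replace each leftmost match of rule with replacement(matched_text)."""
    out, pos = [], 0
    while pos < len(text):
        end = rule(text, pos)
        if end is not None and end > pos:
            out.append(replacement(text[pos:end]))
            pos = end
        else:
            out.append(text[pos])
            pos += 1
    return "".join(out)


sample = "a<div>x<div>y</div>z</div>b<div>q</div>"
with_peg = peg_replace(div, sample, lambda matched: "[DIV]")
with_regex = re.sub(r"<div>.*?</div>", "[DIV]", sample)
```

Here with_peg replaces each outermost div as a whole, while the lazy
regex stops at the first closing tag and leaves a stray "z</div>" behind.
That difference on nested text is the point of asking for a peg-replace.
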

last time i was looking at PEG, i opted to try Helmut Eller's version
because it seems simpler. (mike's version is far more compiler-geeky
and incomprehensible to me.) But it was still problematic to use. I got
busy with other things so i didn't continue studying it, and i dropped
out of this thread (haven't read Helmut's last message in detail).
Really, i simply want to use it. Last time in this thread, he mentioned
stacks and i went huh... and just didn't have time to go further.

Regex is extremely useful in practice, a tool every programer uses
today. However, regex cannot work with nested text such as XML/HTML,
which is used extensively, probably more so than any programing lang
or text. So, bringing regex power to html/xml will have a major impact
on not just emacs, but the whole programing industry. PEG, practically
speaking, is basically just the next generation of so-called regex.
Emacs can be the first to have such a feature. (existing PEG
implementations in various langs, at this point, as far as i know, are
all tech geeking toys, done by geekers interested in language parsing
and so on.)

Personally, i have a huge need for regex that can work on html.
PEG is of course not just a regex replacement, but a BNF replacement
in the sense that it is actually for machines to read. That's
how i got heavily interested in PEG. (see:
• Pattern Matching vs Lexical Grammar Specification
  http://xahlee.org/cmaci/notation/pattern_matching_vs_pattern_spec.html
)

Please make your PEG in emacs with a regex-like API. Something any
emacs user familiar with regex will be able to use brainlessly. This
will be huge...

i had plans to open a mailing list and stuff... but got busy with
other things. I'll come back to this. But i hope you are convinced
about making PEG usable as a text-editing tool, as opposed to a tool
for computer scientists or compiler/parser writers.

Also, Mike & Helmut, please consider putting your code on Google Code.
Google Code is very popular, probably today the most popular code
hosting service, and extremely easy to use, and from my studies
Google's products and services are all extremely high quality. It
would help your software a lot, at least in the marketing aspect, if
you use Google Code. Also, opening a google group is very useful and
popular. (yasnippet is a successful example of an emacs project on
google code. There are several others, including e.g. js2, ejacs, the
erlang one, etc.) Going to Savannah or anything on FSF services tends
to be a dead end. (yeah, controversy, but whatever.)

  Xah
∑ http://xahlee.org/

☄
From: W Dan Meyer
Subject: Re: how to use parsing expressing grammar
Date: 
Message-ID: <87fxhukvuu.fsf@gmail.com>
Xah Lee <······@gmail.com> writes:

> On Mar 3, 9:59 am, Mike Mattie <···········@gmail.com> wrote:
>> On Tue, 03 Mar 2009 17:34:41 +0000
>>
>> Leo <·······@gmail.com> wrote:
>> > On 2008-12-21 09:49 +0000, Helmut Eller wrote:
>> > >> I'm pretty sure if you create it, more and more people will join
>> > >> it. I'm very interested in PEG and think it is of critical
>> > >> importance.
>>
>> > > I'll try to set up project at savannah.
>>
>> > I have seen the project on
>> >http://savannah.nongnu.org/projects/emacs-peg. I wonder it might be a
>> > good idea to make a newsgroup on gmane to link to the mailing list. It
>> > will make more Emacs users subscribe to it.
>>
>> > Best wishes,
>>
>> I was working on a PEG/CFG parser compiler:http://www.emacswiki.org/cgi-bin/wiki/ParserCompiler
>>
>> I will be resuming the development soon. Please keep me in the loop on such efforts.

Are we talking about a memoising PEG parser (e.g. Packrat) in elisp?
There might be more people interested.  I do realise that Emacs doesn't
have a decent parsing facility, and that makes its regular-expression
based engine useless in more complex areas (the nested tags mentioned
above). Since it's automata based you cannot go beyond simple patterns.
That also means that everybody is reinventing the wheel by implementing
all those recursive descent parsers when it comes to analysing, pretty
printing, or syntax highlighting in modes like the Haskell one.  A
Packrat parser is the best option out there currently because it
recognises much more complex grammars than LALR(1), and there are no
ambiguities in expressing these grammars. It also works well as a
standalone combinator-based solution rather than a parser generator,
plus it is lexer-less, i.e. it allows recognising several different
languages in _one_go_ and in one context without marking them explicitly!
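
To make the memoising idea concrete, here is a small sketch in Python
(not elisp, and not taken from any of the packages discussed). The
grammar, the Packrat class and the recognize function are assumptions
made up for this example. Each (rule, position) pair is evaluated at most
once and its result is stored in a table, so the backtracking second
alternative of a rule reuses the work of the first.

```python
class Packrat:
    """Recognizer for a toy arithmetic grammar with a memo table.

    additive       <- multiplicative '+' additive / multiplicative
    multiplicative <- primary '*' multiplicative / primary
    primary        <- '(' additive ')' / [0-9]+
    """

    def __init__(self, text):
        self.text = text
        self.memo = {}  # (rule name, position) -> end position or None

    def apply(self, name, pos):
        """Evaluate rule `name` at `pos`, at most once per pair."""
        key = (name, pos)
        if key not in self.memo:
            self.memo[key] = getattr(self, name)(pos)
        return self.memo[key]

    def additive(self, pos):
        left = self.apply("multiplicative", pos)
        if left is not None and self.text.startswith("+", left):
            right = self.apply("additive", left + 1)
            if right is not None:
                return right
        # Second alternative: this lookup is answered from the memo table.
        return self.apply("multiplicative", pos)

    def multiplicative(self, pos):
        left = self.apply("primary", pos)
        if left is not None and self.text.startswith("*", left):
            right = self.apply("multiplicative", left + 1)
            if right is not None:
                return right
        return self.apply("primary", pos)

    def primary(self, pos):
        if self.text.startswith("(", pos):
            inner = self.apply("additive", pos + 1)
            if inner is not None and self.text.startswith(")", inner):
                return inner + 1
        end = pos
        while end < len(self.text) and self.text[end].isdigit():
            end += 1
        return end if end > pos else None


def recognize(text):
    """Return True if the whole of `text` matches `additive`."""
    return Packrat(text).apply("additive", 0) == len(text)
```

The table never holds more than one entry per rule and input position,
which is where both the linear running time and the linear memory use of
the technique come from.
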
cheers Dan W
From: Miles Bader
Subject: Re: how to use parsing expressing grammar
Date: 
Message-ID: <87d4cyqg5n.fsf@catnip.gol.com>
W Dan Meyer <······@gmail.com> writes:
> Are we talking about memoising PEG (e.g Packrat) in elisp? ...
> Packrat parser is the best option out there currently because it
> recognises much complex grammar then LALR(1), and it has no
> ambiguities with expressing these grammars.

You also might be interested in Roberto Ierusalimschy's paper on the
implementation of LPEG, which is a PEG implementation for Lua:

   http://www.inf.puc-rio.br/~roberto/docs/peg.pdf


Note that LPEG does _not_ use the packrat algorithm, as apparently it
presents some serious practical problems for common uses of parsing
tools:

      In 2002, Ford proposed Packrat [5], an adaptation of the original
   algorithm that uses lazy evaluation to avoid that inefficiency.

      Even with this improvement, however, the space complexity of the
   algorithm is still linear on the subject size (with a somewhat big
   constant), even in the best case. As its author himself recognizes,
   this makes the algorithm not befitting for parsing “large amounts of
   relatively flat” data ([5], p. 57). However, unlike parsing tools,
   regular-expression tools aim exactly at large amounts of relatively
   flat data.

      To avoid these difficulties, we did not use the Packrat algorithm
   for LPEG. To implement LPEG we created a virtual parsing machine, not
   unlike Knuth’s parsing machine [15], where each pattern is
   represented as a program for the machine. The program is somewhat
   similar to a recursive-descendent parser (with limited backtracking)
   for the pattern, but it uses an explicit stack instead of recursion.
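
To give a feel for the "program for a machine with an explicit stack"
idea, here is a heavily simplified sketch in Python. It is not LPEG's
code: the instruction names follow the general style the paper describes,
but the tuple encoding, the run function and the PROGRAM listing are
assumptions made for illustration. The program encodes the pattern
B <- '(' B* ')' for balanced parentheses.

```python
# Each instruction is a tuple; jump targets are absolute indices.
PROGRAM = [
    ("call", 2),     # 0: start by calling rule B
    ("end",),        # 1: success, report how far the match went
    ("char", "("),   # 2: B: match '('
    ("choice", 6),   # 3: B*: on failure resume at 6 with the saved offset
    ("call", 2),     # 4:     try one more nested B
    ("commit", 3),   # 5:     it matched: drop the choice point, loop
    ("char", ")"),   # 6: match ')'
    ("return",),     # 7: return to the caller
]


def run(program, subject):
    """Run `program` on `subject`; return the match end offset or None."""
    pc, i = 0, 0
    stack = []  # ("ret", address) or ("bt", address, saved offset)
    while True:
        op = program[pc]
        name = op[0]
        ok = True
        if name == "char":
            if i < len(subject) and subject[i] == op[1]:
                pc, i = pc + 1, i + 1
            else:
                ok = False
        elif name == "choice":
            stack.append(("bt", op[1], i))
            pc += 1
        elif name == "commit":
            stack.pop()
            pc = op[1]
        elif name == "call":
            stack.append(("ret", pc + 1))
            pc = op[1]
        elif name == "return":
            pc = stack.pop()[1]
        elif name == "end":
            return i
        else:
            raise ValueError(f"unknown instruction: {name}")
        if not ok:
            # Failure: discard pending return addresses, then resume at
            # the most recent backtrack entry, or give up if none is left.
            while stack and stack[-1][0] == "ret":
                stack.pop()
            if not stack:
                return None
            _, pc, i = stack.pop()
```

Because the stack is an ordinary list, deeply nested input does not run
into the host language's recursion limit, and the only memory used is
proportional to the nesting depth rather than to the subject size.
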


The general LPEG page is here:

   http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html


-Miles

-- 
Justice, n. A commodity which in a more or less adulterated condition the
State sells to the citizen as a reward for his allegiance, taxes and personal
service.