From: Dario Lah
Subject: parsing C like syntax
Date: 
Message-ID: <d1ojk7$9l1$1@sunce.iskon.hr>
Hi everyone,
Can anyone give me some pointers, ideas, etc. about how to parse C like
syntax?

What are the advantages/disadvantages of particular ways of doing it?

I suppose lots of people here have some experience with parsing some C like
files (configs, etc.).

Care to share your experiences/ideas with the world? :)

Tnx,
Dario

From: Harald Hanche-Olsen
Subject: Re: parsing C like syntax
Date: 
Message-ID: <pcou0n458xa.fsf@shuttle.math.ntnu.no>
+ Dario Lah <····@tis.hr>:

| I suppose lots of people here have some experience with parsing some
| C like files (configs, etc.).

You mean, like <http://www.cliki.net/trivial-configuration-parser>?

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- Debating gives most of us much more psychological satisfaction
  than thinking does: but it deprives us of whatever chance there is
  of getting closer to the truth.  -- C.P. Snow
From: Pascal Bourguignon
Subject: Re: parsing C like syntax
Date: 
Message-ID: <87d5tryfnp.fsf@thalassa.informatimago.com>
Dario Lah <····@tis.hr> writes:

> Hi everyone,
> Can anyone give me some pointers, ideas, etc. about how to parse C like
> syntax?

Use tools to automatically generate a lexer and a parser.


> What are the advantages/disadvantages of particular ways of doing it?
> 
> I suppose lots of people here have some experience with parsing some C like
> files (configs, etc.).
> 
> Care to share your experiences/ideas with the world? :)

If it were exactly C you wanted to parse, I'd advise one of the free
projects that parse C or C++, such as:

    1. "OpenC++ is a C++ frontend library (lexer+parser+DOM/MOP) and
        source-to-source translator ...."
       http://opencxx.sourceforge.net/

    2.  "Synopsis is a general source code documentation tool ...."
       http://synopsis.fresco.org/

    3.  Keystone: A parser and front-end for ISO C++
       http://keystone.sourceforge.net/research.shtml

    4. "Meta-Compilation for C++ transformation and filtering of a
        superset to the target language" by Edward D. Willink
       http://www.computing.surrey.ac.uk/research/dsrg/fog/

    5. "Parsing C++" by Andrew Birkett
       http://www.nobugs.org/developer/parsingcpp/

    6. compiler tool "Simplified Wrapper and Interface Generator"
       http://www.swig.org/

If you want to do it in Common-Lisp, have a look at:

    http://www.cliki.net/Zebu
    http://common-lisp.net/project/cparse/     
    ftp://thalassa.informatimago.com/pub/ZETA-C-PD.tgz


Otherwise, you may still want to use flex and bison to generate the
parser.  It depends on the complexity of the C-like language.


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
Small brave carnivores
Kill pine cones and mosquitoes
Fear vacuum cleaner
From: Dario Lah
Subject: Re: parsing C like syntax
Date: 
Message-ID: <d1pa38$7ff$1@sunce.iskon.hr>
> If you want to do it in Common-Lisp, have a look at:
> 
>     http://www.cliki.net/Zebu
>     http://common-lisp.net/project/cparse/
>     ftp://thalassa.informatimago.com/pub/ZETA-C-PD.tgz

I'm interested in Common-Lisp. Yacc is ok if it exists for common lisp.

Tnx for links.

Dario
From: Paul F. Dietz
Subject: Re: parsing C like syntax
Date: 
Message-ID: <yIadncePH8rlsd3fRVn-gg@dls.net>
Dario Lah wrote:

> I'm interested in Common-Lisp. Yacc is ok if it exists for common lisp.

Zebu

	Paul
From: Pascal Bourguignon
Subject: Re: parsing C like syntax
Date: 
Message-ID: <877jjzxu7z.fsf@thalassa.informatimago.com>
Dario Lah <····@tis.hr> writes:

> > If you want to do it in Common-Lisp, have a look at:
> > 
> >     http://www.cliki.net/Zebu
> >     http://common-lisp.net/project/cparse/
> >     ftp://thalassa.informatimago.com/pub/ZETA-C-PD.tgz
> 
> I'm interested in Common-Lisp. Yacc is ok if it exists for common lisp.
> 
> Tnx for links.

Well, concerning flex and bison (lex and yacc), it's conceivable to
port the driver to Common-Lisp and use the tables generated by the
tools.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
Wanna go outside.
Oh, no! Help! I got outside!
Let me back inside!
From: ············@gmail.com
Subject: Re: parsing C like syntax
Date: 
Message-ID: <1111820461.093003.236710@f14g2000cwb.googlegroups.com>
> > >     http://www.cliki.net/Zebu
> > >     http://common-lisp.net/project/cparse/
> > >     ftp://thalassa.informatimago.com/pub/ZETA-C-PD.tgz
> >
> > I'm interested in Common-Lisp. Yacc is ok if it exists for common lisp.
> >
> > Tnx for links.
>
> Well, concerning flex and bison (lex and yacc), it's conceivable to
> port the driver to Common-Lisp and use the tables generated by the
> tools.


What's wrong with combinator parsers? Easy to implement and less mess
to carry along.

David
From: Juliusz Chroboczek
Subject: Re: parsing C like syntax
Date: 
Message-ID: <7ifyyehhay.fsf@lanthane.pps.jussieu.fr>
"············@gmail.com" <············@gmail.com>:

> What's wrong with combinator parsers? Easy to implement and less mess
> to carry along.

Do you mean Haskell-style combinator parsers or something else?

If the former, then as far as I know they require either left-factoring
your grammars (which is not always easy) or unbounded amounts of
backtracking, which is going to interact horribly with actions that
are not purely functional.  (Imperative effects in actions are
necessary if you're trying to parse something that's not context-free...
C for example, which is what I'm working on right now.)
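
To make the grammar-rewriting point concrete, here is a minimal sketch
in Python (not Haskell, and not any real combinator library).  All the
names in it (number, char, seq, alt, many, expr, left_recursive_expr)
are made up for illustration.  It assumes a parser is a function from
(text, pos) to (value, new_pos), or None on failure.  The rule
"expr := expr '-' number | number" written directly calls itself before
consuming any input; the rewritten rule "expr := number ('-' number)*"
terminates, but left associativity has to be rebuilt by hand.

```python
def number(text, pos):
    """Parse one or more digits; return (value, new_pos) or None."""
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    return (int(text[pos:end]), end) if end > pos else None


def char(c):
    """Parser that accepts exactly the character c."""
    def parser(text, pos):
        return (c, pos + 1) if text.startswith(c, pos) else None
    return parser


def seq(*parsers):
    """Run parsers one after another; fail if any of them fails."""
    def parser(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parser


def alt(*parsers):
    """Try each alternative from the same position; first success wins."""
    def parser(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parser


def many(p):
    """Apply p zero or more times (p must consume input when it succeeds)."""
    def parser(text, pos):
        values = []
        while (result := p(text, pos)) is not None:
            value, pos = result
            values.append(value)
        return values, pos
    return parser


def left_recursive_expr(text, pos=0):
    # expr := expr '-' number | number
    # The first alternative re-enters this function at the same position,
    # so the recursion never bottoms out (Python raises RecursionError).
    return alt(seq(left_recursive_expr, char("-"), number), number)(text, pos)


def expr(text, pos=0):
    # Rewritten grammar: expr := number ('-' number)*
    result = seq(number, many(seq(char("-"), number)))(text, pos)
    if result is None:
        return None
    (first, rest), pos = result
    total = first
    for _minus, n in rest:  # fold from the left to keep '-' left-associative
        total -= n
    return total, pos
```

Calling expr("10-3-2") evaluates the subtraction left to right, while
left_recursive_expr on the same input never gets as far as reading a
character.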

                                        Juliusz
From: Tomasz Zielonka
Subject: Re: parsing C like syntax
Date: 
Message-ID: <slrnd4ioah.9f1.tomasz.zielonka@localhost.localdomain>
Juliusz Chroboczek wrote:
> Do you mean Haskell-style combinator parsers or something else?
>
> If the former, then as far as I know they require either left-factoring
> your grammars (which is not always easy) or unbounded amounts of
> backtracking, which is going to interact horribly with actions that
> are not purely functional.  (Imperative effects in actions are
> necessary if you're trying to parse something that's not context-free...
> C for example, which is what I'm working on right now.)

Do you mean that it's tedious/error-prone or inefficient? If you mean
the former - you can combine parsing monads with other effects, for
example, Parsec's Parser monad is also a state monad. The state changes
are "rolled back" when the parser backtracks.
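
A rough sketch of the idea in Python, as an illustration only: this is
not Parsec's implementation, and every name in it (literal, declare,
seq, alt, decl_or_expr) is hypothetical.  It assumes the user state is
an immutable value threaded through each parser, so that when an
alternative fails, the next one simply starts again from the old
position and the old state.

```python
def literal(s):
    """Parser that consumes the string s and leaves the state unchanged."""
    def parser(text, pos, state):
        if text.startswith(s, pos):
            return s, pos + len(s), state
        return None
    return parser


def declare(name):
    """Parser that consumes name and records it in the (immutable) state."""
    def parser(text, pos, state):
        if text.startswith(name, pos):
            # Build a new frozenset; the caller's state is never mutated.
            return name, pos + len(name), state | {name}
        return None
    return parser


def seq(*parsers):
    """Run parsers in order, threading position and state through them."""
    def parser(text, pos, state):
        values = []
        for p in parsers:
            result = p(text, pos, state)
            if result is None:
                return None
            value, pos, state = result
            values.append(value)
        return values, pos, state
    return parser


def alt(*parsers):
    """Try each alternative from the SAME position and the SAME state."""
    def parser(text, pos, state):
        for p in parsers:
            result = p(text, pos, state)
            if result is not None:
                return result
        return None  # state changes made by failed branches are discarded
    return parser


# First branch records "T" and then requires ';'.
# Second branch reads "T" followed by '*' without recording anything.
decl_or_expr = alt(
    seq(declare("T"), literal(";")),
    seq(literal("T"), literal("*")),
)
```

On the input "T*" the first branch records T and then fails at ';'; the
second branch succeeds and the resulting state is still empty, because
the failed branch's state was never kept.  This only works because the
state is a plain value; an effect that has already happened in the
outside world cannot be discarded this way.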

Best regards
Tomasz
From: Juliusz Chroboczek
Subject: Re: parsing C like syntax
Date: 
Message-ID: <7iekdxe3gw.fsf@lanthane.pps.jussieu.fr>
Oh, hi Tomek!

>> [...] Haskell-style combinator parsers [...]

>> as far as I know they require either left-factoring your grammars
>> [...] or unbounded amounts of backtracking

> Do you mean that it's tedious/error-prone or inefficient?

The former is the former and the latter the latter ;-)

Bottom-up parsing is magic.  LALR(1) takes a context-free grammar in
very general form (it's far from easy to find a context-free grammar
that's not LALR(1)), munges it incomprehensibly, then produces an
automaton that runs in linear time and space.  Plus there's no grammar
rewriting involved, which means that you've got no restriction on the
actions you can use.
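
As a small illustration of the driver side only, here is a sketch in
Python of a table-driven shift-reduce parser.  The table below was
written out by hand for a two-rule grammar, not produced by any
generator, and the names (ACTION, GOTO, RULES, parse) and the token
format are assumptions of this sketch.  It shows a left-recursive
grammar used as is, with each token handled once and an arbitrary
action run at each reduction.

```python
# Grammar (left-recursive, used without rewriting):
#   rule 1: E -> E '-' n
#   rule 2: E -> n
# ACTION maps (state, token kind) to a shift, a reduce, or accept.
ACTION = {
    (0, "n"): ("shift", 1),
    (1, "-"): ("reduce", 2), (1, "$"): ("reduce", 2),
    (2, "-"): ("shift", 3),  (2, "$"): ("accept", None),
    (3, "n"): ("shift", 4),
    (4, "-"): ("reduce", 1), (4, "$"): ("reduce", 1),
}
# GOTO maps (state, nonterminal) to the state entered after a reduction.
GOTO = {(0, "E"): 2}
# Each rule: (left-hand side, length of right-hand side, semantic action).
RULES = {
    1: ("E", 3, lambda e, _minus, n: e - n),
    2: ("E", 1, lambda n: n),
}


def parse(tokens):
    """Parse a list of (kind, value) tokens and return the computed value."""
    tokens = list(tokens) + [("$", None)]  # end-of-input marker
    states = [0]   # stack of automaton states
    values = []    # stack of semantic values, parallel to the symbols
    i = 0
    while True:
        kind, value = tokens[i]
        entry = ACTION.get((states[-1], kind))
        if entry is None:
            raise SyntaxError(f"unexpected {kind!r} at token {i}")
        op, arg = entry
        if op == "shift":
            states.append(arg)
            values.append(value)
            i += 1
        elif op == "reduce":
            lhs, length, action = RULES[arg]
            args = values[-length:]
            del values[-length:]
            del states[-length:]
            values.append(action(*args))
            states.append(GOTO[(states[-1], lhs)])
        else:  # accept
            return values[0]
```

Feeding it the tokens for 10 - 3 - 2, that is
[("n", 10), ("-", None), ("n", 3), ("-", None), ("n", 2)], reduces
10 - 3 first and then subtracts 2, so left associativity falls out of
the left-recursive rule.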

The main flaw is that the LALR(1) algorithm performs a global
transformation, which means that debugging a parser requires a fairly
deep grokking of the algorithm -- just reading the error messages
requires knowing what a conflict is.

To speak frankly, I do not understand why the functional languages
community have been searching for a purely functional approach to the
issue of parsing rather than recognising the sheer brilliance of
LALR(1) and working on how to integrate it into a functional language.

In case you're looking for a research subject, I think it would be
interesting to work out how the LALR(1) algorithm can be tweaked so
that it produces human-readable debugging information.  There is some
work on the subject (for example the end of DeRemer and Pennello, 1982),
but I don't know whether it has been tried on real users.

> you can combine parsing monads with other effects, for example,
> Parsec's Parser monad is also a state monad. The state changes are
> "rolled back" when the parser backtracks.

Interesting -- composing monads ``the wrong way''.  How do they
roll back I/O?

                                        Juliusz
From: Juliusz Chroboczek
Subject: Re: LALR(1) parser generators [was: parsing C like syntax]
Date: 
Message-ID: <7ir7i35g51.fsf_-_@lanthane.pps.jussieu.fr>
> I'm interested in Common-Lisp.  Yacc is ok if it exists for common lisp.

There are a few LALR(1) parser generators freely available for CL.
You'll find all of them through Google.

Lalr.cl is Free Software, lovely code, clean, small and elegant.  It
is okay for small grammars (up to a hundred rules or so), but too slow
to be used with grammars the size of C's.  It doesn't support error
recovery, and has no support for ambiguous grammars.

Zebu is not Free Software (some of the distribution files carry the
notice ``experimental; do not distribute''), but you're unlikely to
run into trouble if you use it.  It's a large package, but is
reasonably fast.  I don't know whether it supports error recovery, but
its support for ambiguous grammars is IMHO somewhat clumsy[1]: it uses
implicit production precedence rather than explicit operator
precedence, as Yacc does.

Lispworks' Defparser is not Free Software.  It is included with
Xanalys Lispworks (nee Harlequin Lispworks, nee Harlequin Common
Lisp).  Defparser appears to have support for both error recovery
(Yacc-style or automatically generated), and ambiguous grammars.  As I
haven't tried it out myself, I cannot speak about its performance.

The above situation is far from satisfactory, but will hopefully
improve soon.  Watch this space over the next few weeks.

                                        Juliusz
From: Christophe Rhodes
Subject: Re: LALR(1) parser generators
Date: 
Message-ID: <sq4qey3hxk.fsf@cam.ac.uk>
Juliusz Chroboczek <···@pps.jussieu.fr> writes:

>> I'm interested in Common-Lisp.  Yacc is ok if it exists for common lisp.
> [...]
> The above situation is far from satisfactory, but will hopefully
> improve soon.  Watch this space over the next few weeks.

I've just been playing with the syntax support (essentially, a
modified Earley parser framework) in the Climacs editor project.  It
seems to work well, but I only have limited experience with parsers in
general.

Screenshot:
  <http://www-jcsu.jesus.cam.ac.uk/~csr21/pretty-prolog.png>
(the coloring is driven from a complete parse of the Prolog
text).

I don't know about the needs of the Original Poster; I simply thought
it might be worth adding another option into the mix.

Cheers,

Christophe