From: Lowell
Subject: looking for a portable html parser
Date: 
Message-ID: <bh7n1l$q34$1@mughi.cs.ubc.ca>
Does anyone know where I can find a simple, portable html parser? It 
would be great if it could parse html into an S-expression like:

(:html (:head (:title "Example HTML input"))
   (:body (:p ...

There is a parser available at http://opensource.franz.com/xmlutils/ 
which looks great but seems to only work with Allegro (I use Clisp).

Lowell

From: Marco Antoniotti
Subject: Re: looking for a portable html parser
Date: 
Message-ID: <3F37A6F8.7040803@cs.nyu.edu>
You should check CL-XML

Cheers

Marco

Lowell wrote:
> Does anyone know where I can find a simple, portable html parser? It 
> would be great if it could parse html into an S-expression like:
> 
> (:html (:head (:title "Example HTML input"))
>   (:body (:p ...
> 
> There is a parser available at http://opensource.franz.com/xmlutils/ 
> which looks great but seems to only work with Allegro (I use Clisp).
> 
> Lowell
> 
From: Lowell
Subject: Re: looking for a portable html parser
Date: 
Message-ID: <bh8o0r$31$1@mughi.cs.ubc.ca>
CL-XML is not portable - it does not work with Clisp :-(

Lowell

Marco Antoniotti wrote:

> You should check CL-XML
>
> Lowell wrote:
> 
>> Does anyone know where I can find a simple, portable html parser? 
From: Marco Antoniotti
Subject: Re: looking for a portable html parser
Date: 
Message-ID: <3F380F2C.4050909@cs.nyu.edu>
Lowell wrote:
> CL-XML is not portable - it does not work with Clisp :-(

I'd say it has not been "ported".  I do not know why it shouldn't be 
"portable".

Cheers

--
Marco
From: Chris Riesbeck
Subject: Re: looking for a portable html parser
Date: 
Message-ID: <riesbeck-C9D781.13092212082003@news.it.northwestern.edu>
In article <················@cs.nyu.edu>,
 Marco Antoniotti <·······@cs.nyu.edu> wrote:

>Lowell wrote:
>> CL-XML is not portable - it does not work with Clisp :-(
>
>I'd say it has not been "ported".  I do not know why it shouldn't be 
>"portable".

Hmm - I usually interpret "portable" to be "runs on compliant
platforms" not "can be ported" -- that would be
"port-able"
From: james anderson
Subject: Re: looking for a portable html parser
Date: 
Message-ID: <3F3972BF.28499343@setf.de>
Chris Riesbeck wrote:
> 
> In article <················@cs.nyu.edu>,
>  Marco Antoniotti <·······@cs.nyu.edu> wrote:
> 
> >Lowell wrote:
> >> CL-XML is not portable - it does not work with Clisp :-(
> >
> >I'd say it has not been "ported".  I do not know why it shouldn't be
> >"portable".
> 
> Hmm - I usually interpret "portable" to be "runs on compliant
> platforms" not "can be ported" -- that would be
> "port-able"


a program can purport to be not only portable but also a conforming program.
yet, in practice, "to depend for its correctness only upon documented aspects
of common lisp" will not suffice to guarantee that it functions correctly in a
new implementation: there are any number of things which common lisp does not
specify, and there is always a gap between a purportedly conforming common
lisp implementation and the standard.
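
A minimal sketch of the point, not taken from any of the libraries discussed in this thread: the standard leaves values such as most-positive-fixnum and char-code-limit to the implementation, and code that must adapt usually dispatches on *features* with reader conditionals. The function names are hypothetical; the feature keywords are the ones these implementations are generally understood to advertise.

```lisp
;; Hypothetical helpers for illustration only.
(defun implementation-keyword ()
  "Return a keyword naming the running Lisp, or :UNKNOWN."
  ;; Each #+ form is read only when that feature is on *FEATURES*,
  ;; so OR returns the first implementation keyword that is present.
  (or #+clisp :clisp
      #+cmu :cmucl
      #+allegro :allegro
      #+openmcl :openmcl
      :unknown))

(defun report-limits ()
  "Print a few values that the standard leaves implementation-dependent."
  (format t "~&~A ~A~%fixnum max: ~D~%char-code-limit: ~D~%"
          (lisp-implementation-type)
          (lisp-implementation-version)
          most-positive-fixnum
          char-code-limit))
```

Calling (report-limits) in two different Lisps will generally print different numbers, which is one way a program that uses only documented operators can still behave differently after a move.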

...
From: james anderson
Subject: Re: looking for a portable html parser
Date: 
Message-ID: <3F37E023.7C492B44@setf.de>
Marco Antoniotti wrote:
> 
> You should check CL-XML
> 
> Cheers
> 
> Marco
> 
> Lowell wrote:
> > Does anyone know where I can find a simple, portable html parser?

it depends on what one means by "html". cl-xml, as released, is intended for
encodings like xhtml. that is, for html which is "well-formed". if you need to
parse earlier versions of html, with abbreviated attributes and unterminated
elements, you'll have to change the bnf to permit undelimited or absent
attribute values and include a heuristic to recognize unterminated elements.
the bnf directory contains a start at an html grammar, but html exhibits a
wide range of variations and you may need to change what's there to suit your source.
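
A minimal sketch of the kind of heuristic described above for unterminated elements. It is not cl-xml code and does not touch its bnf. It assumes the input has already been split into a flat token list, where each token is a string, (:open NAME), or (:close NAME). That token format, the function name, and the two element lists are all hypothetical. Attributes are not handled.

```lisp
(defparameter *void-elements* '(:br :hr :img)
  "Assumed set of elements that never take content or an end tag.")

(defparameter *implicitly-closed* '(:p :li)
  "Assumed set of elements closed when a sibling of the same name opens.")

(defun build-tree (tokens)
  "Turn a flat list of TOKENS into a list of nested s-expressions."
  ;; Each stack frame is (NAME . REVERSED-CHILDREN); the last frame is the root.
  (let ((stack (list (cons :root '()))))
    (labels ((add-child (child)
               (push child (cdr (first stack))))
             (close-top ()
               (let ((frame (pop stack)))
                 (add-child (cons (car frame) (reverse (cdr frame))))))
             (open-p (name)
               (member name (butlast stack) :key #'car)))
      (dolist (token tokens)
        (cond ((stringp token)
               (add-child token))
              ((eq (first token) :open)
               (let ((name (second token)))
                 ;; <p>one<p>two: the second <p> ends the first one.
                 (when (and (member name *implicitly-closed*)
                            (eq (car (first stack)) name))
                   (close-top))
                 (if (member name *void-elements*)
                     (add-child (list name))
                     (push (cons name '()) stack))))
              ((eq (first token) :close)
               (let ((name (second token)))
                 ;; Stray end tags are ignored; a matching end tag also
                 ;; closes every element left open inside it.
                 (when (open-p name)
                   (loop until (eq (car (first stack)) name)
                         do (close-top))
                   (close-top))))))
      ;; End of input closes whatever is still open.
      (loop while (rest stack) do (close-top))
      (reverse (cdr (first stack))))))
```

For example, the tokens for <body><p>one<br><p>two</body> come out as a single :body element holding two :p elements, even though neither <p> was closed in the source.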

> >    It
> > would be great if it could parse html into an S-expression like:
> >
> > (:html (:head (:title "Example HTML input"))
> >   (:body (:p ...
> >

as released, cl-xml is intended to parse to a clos document model. if you need
to parse to s-expressions, look in the directory #p"xml:demos;xcl;". it shows
one way to augment the parser's internal construction functions to generate
s-expressions rather than instances. again, as opinions vary on the
appropriate s-expression representation for markup, you may need to modify
what's there.
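
To make the representation question concrete, here is a minimal sketch of one possible convention. It is not the code in the xcl demo directory, and every function name below is hypothetical. An element without attributes is (NAME . CHILDREN), as in the example quoted above; an element with attributes is ((NAME . ATTRIBUTE-PLIST) . CHILDREN).

```lisp
(defun make-element (name attributes children)
  "Build an s-expression for an element.
ATTRIBUTES is a plist such as (:href \"x\"); CHILDREN is a list."
  (if attributes
      (cons (cons name attributes) children)
      (cons name children)))

(defun element-name (element)
  "Return the tag keyword of ELEMENT in either form."
  (let ((head (first element)))
    (if (consp head) (first head) head)))

(defun element-attributes (element)
  "Return the attribute plist of ELEMENT, or NIL if it has none."
  (let ((head (first element)))
    (if (consp head) (rest head) '())))

(defun element-children (element)
  "Return the list of child nodes of ELEMENT."
  (rest element))
```

With this layout (make-element :p nil '("text")) gives (:p "text"), and an anchor with an :href attribute gives ((:a :href "x") "link"). Other layouts, such as always wrapping the name in a list, are equally defensible, which is why a construction function may need adjusting to taste.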

> > There is a parser available at http://opensource.franz.com/xmlutils/
> > which looks great but seems to only work with Allegro (I use Clisp).

i've never tried to port it to clisp. if one needs to use it with an open
source lisp, openmcl and cmucl are known to work. if someone has experience
with clisp, please let me know.

...
From: Simon András
Subject: Re: looking for a portable html parser
Date: 
Message-ID: <vcdlltz7xms.fsf@proof.math.bme.hu>
I `de-Allegroized' (`ported' would be an overstatement) phtml.cl some
time ago, so I could use it with CMUCL. Send me a mail if you want me
to dig it up for you. But be warned that it is based on v 1.18, dated
2000/08/16. Not to mention that it is untested, may only work with
CMUCL, or not at all.

Andras 

Lowell <······@cs.ubc.ca> writes:

> Does anyone know where I can find a simple, portable html parser? It
> would be great if it could parse html into an S-expression like:
> 
> (:html (:head (:title "Example HTML input"))
>    (:body (:p ...
> 
> There is a parser available at http://opensource.franz.com/xmlutils/
> which looks great but seems to only work with Allegro (I use Clisp).
> 
> Lowell