I have to use XML (i.e., to parse XML strings) within Common Lisp and
I would like to know which XML parsers exist for Lisp.
I'm aware of a couple of efforts:
- the code written by James Anderson and distributed in the contrib
directory of CL-HTTP (it seems not able to deal with
parameter-entity references, however)
- the ACL5 wrappers for the expat API by Kaelin Colclasure
Is there anything else? Do you have comments and/or first-hand
experiences with the tools I mentioned?
Thanks in advance
alberto
In article <············@fe2.cs.interbusiness.it>,
·······@irst.it (Alberto Lavelli) wrote:
>
> I have to use XML (i.e., to parse XML strings) within Common Lisp and
> I would like to know which XML parsers exist for Lisp.
http://clocc.sourceforge.net
clocc/src/cllib/xml.lisp
This will parse, but it will not validate against your DTD.
·······@irst.it (Alberto Lavelli) writes:
> I have to use XML (i.e., to parse XML strings) within Common Lisp and
> I would like to know which XML parsers exist for Lisp.
>
> I'm aware of a couple of efforts:
>
> - the code written by James Anderson and distributed in the contrib
> directory of CL-HTTP (it seems not able to deal with
> parameter-entity references, however)
>
> - the ACL5 wrappers for the expat API by Kaelin Colclasure
>
> Is there anything else? Do you have comments and/or first-hand
> experiences with the tools I mentioned?
Depending on how sophisticated your use of XML is, and your
performance requirements, a simple approach to XML or even SGML
parsing is to use one of the ready-made parsers, put a simple wrapper
around it that produces Lisp-friendly output, and use this wrapper as
a subprocess for parsing.
Using expat, and restricting myself to the ISO Latin1 subset of
Unicode, it took me a couple of hours and 150 lines of C code to produce
the wrapper.
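To give an idea of what such a wrapper does, here is a minimal sketch. It is not the 150 lines of C mentioned above: it uses Python's standard-library expat binding (xml.parsers.expat) as a stand-in, and the output shape ("tag" (attributes) children...) as well as the names lisp_string and xml_to_sexp are assumptions made for illustration only. The Lisp side would then READ the printed form from the subprocess.

```python
import xml.parsers.expat


def lisp_string(text):
    """Quote text as a Lisp string literal (escape backslash and double quote)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def xml_to_sexp(xml_text):
    """Parse xml_text with expat and return a single s-expression as a string.

    Assumed (hypothetical) output shape:
        ("tag" (("attr" . "value") ...) child ...)
    """
    out = []
    pending = []  # expat may deliver character data in several chunks

    def flush_text():
        text = "".join(pending)
        pending.clear()
        if text.strip():  # drop whitespace-only runs between elements
            out.append(" " + lisp_string(text))

    def start(name, attrs):
        flush_text()
        pairs = " ".join(
            "(%s . %s)" % (lisp_string(k), lisp_string(v))
            for k, v in attrs.items()
        )
        out.append(" (%s (%s)" % (lisp_string(name), pairs))

    def end(name):
        flush_text()
        out.append(")")

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = pending.append
    parser.Parse(xml_text, True)  # raises ExpatError if input is not well-formed
    return "".join(out).strip()
```

For example, xml_to_sexp('<a x="1"><b>hi</b></a>') returns ("a" (("x" . "1")) ("b" () "hi")), which the Lisp reader can consume directly.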
Of course, this form of double parsing is prohibitive if your
performance requirements are medium to high (currently, parsing a 3.5MB
file of moderately structured XML takes around 6 seconds on an AMD
K6-2 550). In that case you'll need to either provide for a tighter
coupling or write an XML parser in CL.
Depending on your requirements in XML parsing, the XML parser itself
is easily written (modulo charset handling), unless you need/want a
validating parser. OTOH, IMHO validation should be left to
specialized validators, and application parsers should be checking
for well-formedness only (and even that only with minimum effort).
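As an illustration of checking well-formedness only, here is a minimal sketch using a non-validating parser. It again uses Python's standard-library expat binding purely as a stand-in; the function name well_formed and the shape of its return value are assumptions for the example, not part of any tool mentioned in this thread.

```python
import xml.parsers.expat


def well_formed(xml_text):
    """Return (True, None) if xml_text is well-formed, else (False, message).

    No validation against a DTD is attempted: expat is a non-validating
    parser, so only the well-formedness rules are checked.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(xml_text, True)
    except xml.parsers.expat.ExpatError as err:
        message = "line %d, column %d: %s" % (
            err.lineno,
            err.offset,
            xml.parsers.expat.ErrorString(err.code),
        )
        return False, message
    return True, None
```

A document with mismatched tags such as '<a><b></a>' is rejected, while a document that is well-formed but violates its own DTD (for instance an element declared EMPTY that nevertheless has children) still passes. Catching the latter is the job left to a separate validator.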
OTOH, the double parsing gives you quite a bit of latitude in the
way you do your Lisp-side parsing, allowing you to have multiple,
dynamically varying internal representations for XML elements and
their sub-elements, in a top-down manner. While this could be
achieved with a more integrated parser, the API design would be quite
a bit more effort than I expended... ;)
Regs, Pierre.
--
Pierre Mai <····@acm.org> PGP and GPG keys at your nearest Keyserver
"One smaller motivation which, in part, stems from altruism is Microsoft-
bashing." [Microsoft memo, see http://www.opensource.org/halloween1.html]