I am looking for a better XML parser to use with ACL. The parser in ACL
doesn't work for me.
What I am trying to do is process an xml file containing the instance
variables from smalltalk objects stored in xml. I would like to create a
list of structures or objects in lisp.
The xml file looks like:
<target version="1">
<customer>Frank</customer>
<product>LW5</product>
</target>
<target version="1">
<customer>.......
The problem with the ACL parser is that it creates an s-expression like
((target version "1") (customer "frank")(product "lw5"))
However I expect:
(target (customer "frank") (product "lw5"))
It turns out that the attribute "version" messes things up. Manually
removing it results in the expected result. I believe this is an
inappropriate way to work for the parser because it does not recognize the
relation between the 'parent' and 'child' tags. So I am looking for an
alternative parser, or maybe a better approach to import the information.
Best Regards,
Frank
Frank Sonnemans wrote:
> I am looking for a better XML parser to use with ACL. The parser in ACL
> doesn't work for me.
>
The ACL XML parser produces structure similar to the stuff in ACL's htmlgen, where
<foo bar="baz"><mumble>grumble</mumble></foo>
produces
((foo bar "baz") (mumble "grumble"))
The parser has to do *something* with the attributes, or you couldn't
reproduce the original XML from the Lisp structure. The htmlgen-ish
syntax is OK for writing HTML, but I found it painful for processing XML
because it makes it harder to find the tags. I wrote two little
converter functions that change the above structure to and from:
(foo (bar "baz") (mumble nil "grumble"))
i.e., XML is always represented as (tag attribute-list content...).
Enjoy:
;; For structural processing purposes, I prefer XML rendered as
;;
;; (foo (att val...) stuff)
;; (foo nil stuff)
;;
;; rather than as
;;
;; ((foo att val...) stuff)
;; (foo stuff)
;;
;; These functions convert between these two representations,
;; consing minimally by smashing structure. Nasty, but fast.
(defun flatten-lxml (lxml)
  (typecase lxml
    (cons
     (typecase (car lxml)
       (symbol
        (mapc #'flatten-lxml (cdr lxml))
        (setf (cdr lxml) (cons nil (cdr lxml))))
       (cons
        (if (symbolp (caar lxml))
            (let ((subnode (car lxml))
                  (stuff (cdr lxml)))
              (mapc #'flatten-lxml stuff)
              (setf (car lxml) (car subnode))
              (setf (cdr lxml) subnode)
              (setf (car subnode) (cdr subnode))
              (setf (cdr subnode) stuff))
            (error "Malformed lxml: ~s" lxml)))
       (t (error "Malformed lxml: ~s" lxml)))))
  lxml)
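;; Inverse of flatten-lxml: turns (tag attribute-list content...) back
;; into (tag content...) or ((tag att val...) content...), destructively.
(defun unflatten-lxml (uxml)
  (typecase uxml
    (cons
     (unless (and (symbolp (car uxml))
                  (consp (cdr uxml)))
       (error "Malformed flat uxml: ~s" uxml))
     ;; Convert the children first, whether or not this node has attributes.
     (mapc #'unflatten-lxml (cddr uxml))
     (if (cadr uxml)
         (let* ((subnode (cdr uxml))
                (stuff (cdr subnode)))
           (setf (cdr subnode) (car subnode))
           (setf (car subnode) (car uxml))
           (setf (car uxml) subnode)
           (setf (cdr uxml) stuff))
         (setf (cdr uxml) (cddr uxml)))))
  uxml)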
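If smashing structure is a worry (say the parser output is shared, or you
are feeding in a quoted constant), the same conversion can be done by
copying. This is only a rough sketch, not the code above: the name
copy-flatten-lxml is made up, and it assumes every node is either
(tag content...) or ((tag att val...) content...) with strings as leaves.

```lisp
;; Hypothetical non-destructive counterpart to FLATTEN-LXML.
;; Assumption: anything that is not a cons (strings etc.) is leaf content.
(defun copy-flatten-lxml (lxml)
  (cond ((atom lxml) lxml)                      ; leaf content: return as-is
        ((symbolp (car lxml))                   ; (tag content...)
         (list* (car lxml)
                nil                             ; empty attribute list
                (mapcar #'copy-flatten-lxml (cdr lxml))))
        ((and (consp (car lxml))                ; ((tag att val...) content...)
              (symbolp (caar lxml)))
         (list* (caar lxml)
                (copy-list (cdar lxml))         ; fresh attribute list
                (mapcar #'copy-flatten-lxml (cdr lxml))))
        (t (error "Malformed lxml: ~s" lxml))))
```

With this, (copy-flatten-lxml '((foo bar "baz") (mumble "grumble")))
returns a fresh list equal to (foo (bar "baz") (mumble nil "grumble"))
and leaves its argument untouched. It conses more, which is the price of
not being nasty.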
From: Kenny Tilton
Subject: Re: Alternative (better) xml parser available for ACL??
Date:
Message-ID: <3E70B406.9080902@nyc.rr.com>
Frank Sonnemans wrote:
> I am looking for a better XML parser to use with ACL. The parser in ACL
> doesn't work for me.
>
> What I am trying to do is process an xml file containing the instance
> variables from smalltalk objects stored in xml. I would like to create a
> list of structures or objects in lisp.
>
> The xml file looks like:
>
> <target version="1">
> <customer>Frank</customer>
> <product>LW5</product>
> </target>
> <target version="1">
> <customer>.......
>
> The problem with the ACL parser is that it creates an s-expression like
>
> ((target version "1") (customer "frank")(product "lw5"))
>
> However I expect:
>
> (target (customer "frank") (product "lw5"))
>
> It turns out that the attribute "version" messes things up. Manually
> removing it results in the expected result. I believe this is an
> inappropriate way to work for the parser because it does not recognize the
> relation between the 'parent' and 'child' tags.
I don't see any difference between ((target <mo stuff>) ...) and your
preferred (target ....) in re showing the parent relationship. The
parent relationship does not come from target being un-nested, it comes
from it being first. 'target' is still first in ((target version "1")
...), you just have to excavate it from a long-form (if you will)
description.
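To make the excavation concrete, here is a rough sketch of accessors that
work on either form. The names node-tag, node-attributes and node-children
are invented for illustration; the only assumption is that a node is
(tag content...) or ((tag att val...) content...).

```lisp
;; Hypothetical accessors; the names are made up for illustration.
;; A node is either (tag content...) or ((tag att val...) content...).
(defun node-tag (node)
  (let ((head (car node)))
    (if (consp head) (car head) head)))

(defun node-attributes (node)
  ;; Returns the (att val...) list, or NIL for the short form.
  (let ((head (car node)))
    (if (consp head) (cdr head) nil)))

(defun node-children (node)
  ;; The children are the rest of the list in both forms.
  (cdr node))
```

Both (node-tag '((target version "1") (customer "frank"))) and
(node-tag '(target (customer "frank"))) give back the symbol target, so
code written against these accessors need not care which form it gets.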
--
kenny tilton
clinisys, inc
http://www.tilton-technology.com/
---------------------------------------------------------------
"Cells let us walk, talk, think, make love and realize
the bath water is cold." -- Lorraine Lee Cudmore
From: Tim Bradshaw
Subject: Re: Alternative (better) xml parser available for ACL??
Date:
Message-ID: <ey3isun59e1.fsf@cley.com>
* Frank Sonnemans wrote:
> It turns out that the attribute "version" messes things up. Manually
> removing it results in the expected result. I believe this is an
> inappropriate way to work for the parser because it does not
> recognize the relation between the 'parent' and 'child' tags. So I
> am looking for an alternative parser, or maybe a better approach to
> import the information.
Erm? Surely:
(defun de-attributize (form)
  (typecase form
    (cons
     (destructuring-bind (tag/attr . body) form
       (cons (typecase tag/attr
               (cons (car tag/attr))
               (t tag/attr))
             (mapcar #'de-attributize body))))
    (t form)))
--tim
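Once the attributes are out of the way, getting from the plain form to
"a list of structures" is a short step. The following is only a sketch
under assumptions: the struct target, the helper child-text and the
function form->target are invented names, the slots are taken from the
sample file, and the tags are assumed to be symbols in the current
package (adjust the comparisons if your parser interns them elsewhere).

```lisp
;; Hypothetical target structure; slot names follow the sample file.
(defstruct target customer product)

;; Hypothetical helper: text of the first child named NAME, or NIL.
;; Non-cons children (e.g. whitespace strings) are skipped.
(defun child-text (name children)
  (second (find-if (lambda (child)
                     (and (consp child) (eq (car child) name)))
                   children)))

;; Assumes FORM is already attribute-free, e.g.
;; (target (customer "frank") (product "lw5")).
(defun form->target (form)
  (destructuring-bind (tag . children) form
    (unless (eq tag 'target)
      (error "Expected a target form, got ~s" form))
    (make-target :customer (child-text 'customer children)
                 :product (child-text 'product children))))
```

Mapping form->target over the list of attribute-free forms then yields
the list of structures; a missing child simply leaves its slot NIL.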