From: Torkel Holm
Subject: Processing lxml (xml sexprs)
Date: 
Message-ID: <871xsyeb9k.fsf@uib.no>
Hi 

I'm using the XML parser from Franz
(http://www.franz.com/support/documentation/6.2/doc/pxml.htm).
Some parsing samples are given below:

Ex1.
net.xml.parser(60): (parse-xml "<foo><bar/><bar/></foo>")
((foo (bar) (bar))) ; lxml

Ex2.
net.xml.parser(61): (parse-xml "<foo att='val'><bar/><bar/></foo>")
(((foo att "val") (bar) (bar))) ; lxml
 
Ex3. 
net.xml.parser(62): (parse-xml "<foo att='val'><bar><zot/></bar></foo>")
(((foo att "val") (bar (zot)))) ; lxml

Ex4. 
net.xml.parser(113): (parse-xml "<foo att='val'>content<bar>bar content</bar></foo>")
(((foo att "val") "content" (bar "bar content")))

Note how attributes are represented in Ex2 and how content is represented in Ex4.

I want to do some processing with the functions start-elem and end-elem. 
My first take on this is given below:


(defparameter *stack* nil)
(defun process-lxml (lxml)
  (flet ((start-elem (element attributes)
           (push element *stack*)
           ;; do something with element...
           (format t "~%element:~S attributes:~S" element attributes))
         (end-elem ()
           (format t "~%end of element:~S" (pop *stack*))))
    (loop for x in lxml do
          (cond 
           ((stringp x)  nil) ; do nothing with element content
           ((atom x) ; element without attributes
            (start-elem x nil))
           ((listp x)
            (let ((next (rest x))
                  (attributes nil))
              (when (symbolp (first next)) ; any attributes?
                (setf attributes next))
              (when (stringp (first next)) ; any content
                (format t "~%Content:~S" (first next)))
              (cond ((symbolp (first x)) 
                     (start-elem (first x) attributes)
                     (when (listp (first next)) 
                       (process-lxml next)
                       (end-elem)))
                    (t (process-lxml x)
                       (end-elem)))))))))


Other solutions, bug reports, and comments on style and efficiency are welcome.

Thank you.


Torkel Holm, ((norway (bergen)))
From: james anderson
Subject: Re: Processing lxml (xml sexprs)
Date: 
Message-ID: <3F9D6B92.3CAA7143@setf.de>
i had thought that franz made a destructuring macro available in connection
with their parser. from the examples i recall, it would make code for this
kind of task much easier to read than does the approach used in the example
below. as an alternative, you could look at
http://www.cl-xml.org/documentation/howto/destructure.lisp, which is intended
to accomplish much the same thing. it's meant to be used with an instance model
rather than lists, but would need only the appropriate accessors in order to
work with lists. given the plist form of lxml attributes, something based on
destructuring-bind is another alternative.
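
as a minimal sketch of the destructuring-bind route, assuming only the list
shapes shown in the examples: element-parts and att-value are hypothetical
names, not part of pxml, and the attribute name att is taken from Ex2. it also
assumes the parser interned att in the package where this code is read.

```lisp
;; hypothetical helper, not part of pxml: split one lxml element into
;; its tag, its attribute plist, and its children (three values).
(defun element-parts (element)
  (if (atom element)
      (values element nil nil)          ; a bare tag symbol, if one appears
      (destructuring-bind (head &rest children) element
        (if (consp head)
            ;; shape: ((tag att "val" ...) child ...)
            (destructuring-bind (tag &rest attributes) head
              (values tag attributes children))
            ;; shape: (tag child ...)
            (values head nil children)))))

;; the attributes form a plist keyed by plain symbols, so a &key lambda
;; list with an explicit keyword name can pick out a single attribute.
(defun att-value (element)
  (multiple-value-bind (tag attributes) (element-parts element)
    (declare (ignore tag))
    (destructuring-bind (&key ((att value)) &allow-other-keys) attributes
      value)))
```

(getf attributes 'att) would do the same lookup without the lambda list; the
&key form pays off mainly when several attributes are wanted at once.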

the approach illustrated below tends to leave my head spinning, around two
axes at once.
first, it relies on first/rest/car/cdr operators to incrementally destructure
a tree which has a consistent, known structure. the least one might consider
is to either define mnemonic accessors or use destructuring-bind directly.
second, it reiterates the parse which parse-xml has already done once, albeit
at a different level. one wonders whether there is not a better way
to express the intended transformation directly.
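
to make the first point concrete, here is a rough sketch with mnemonic
accessors and a recursive walk. the names lxml-tag, lxml-attributes,
lxml-children and walk-lxml are made up for illustration, and the sketch
assumes the tree holds only strings and elements of the shapes in the
examples. the recursion itself keeps track of the open element, so no global
stack is needed.

```lisp
;; hypothetical accessors for the list form shown in the examples
(defun lxml-head (element)
  (if (consp element) (first element) element))

(defun lxml-tag (element)
  (let ((head (lxml-head element)))
    (if (consp head) (first head) head)))

(defun lxml-attributes (element)
  (let ((head (lxml-head element)))
    (if (consp head) (rest head) nil)))

(defun lxml-children (element)
  (if (consp element) (rest element) nil))

(defun walk-lxml (node start-elem end-elem &optional (content #'identity))
  "call START-ELEM and END-ELEM around each element of NODE, depth first.
strings are passed to CONTENT."
  (cond ((stringp node)
         (funcall content node))
        (t
         (funcall start-elem (lxml-tag node) (lxml-attributes node))
         (dolist (child (lxml-children node))
           (walk-lxml child start-elem end-elem content))
         (funcall end-elem (lxml-tag node)))))

;; usage: parse-xml returns a list of top-level nodes, so walk each one.
;; the literal list stands in for the result of Ex4.
(dolist (node '(((foo att "val") "content" (bar "bar content"))))
  (walk-lxml node
             (lambda (tag attributes)
               (format t "~%element:~S attributes:~S" tag attributes))
             (lambda (tag)
               (format t "~%end of element:~S" tag))
             (lambda (string)
               (format t "~%Content:~S" string))))
```

since end-elem is called for every element, (bar "bar content") gets closed
as well; as far as i can tell the quoted loop skips end-elem when an element's
first child is a string.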


Torkel Holm wrote:
> 
> Hi
> 
> [lxml structure examples]
> 
> Note the treatment of attributes in ex2 and content in ex4.
> 
> I want to do some processing with the functions start-elem and end-elem.
> My first take on this is given below:
> 
> (defparameter *stack* nil)
> (defun process-lxml (lxml)
>   (flet ((start-elem (element attributes)
>            (push element *stack*)
>            ;; do something with element...
>            (format t "~%element:~S attributes:~S" element attributes))
>          (end-elem ()
>            (format t "~%end of element:~S" (pop *stack*))))
>     (loop for x in lxml do
>           (cond
>            ((stringp x)  nil) ; do nothing with [parsed character data]
>            ((atom x) ; element without attributes
>             (start-elem x nil))
>            ((listp x)
>             (let ((next (rest x))
>                   (attributes nil))
>               (when (symbolp (first next)) ; any attributes?
>                 (setf attributes next))
>               (when (stringp (first next)) ; any content
>                 (format t "~%Content:~S" (first next)))
>               (cond ((symbolp (first x))
>                      (start-elem (first x) attributes)
>                      (when (listp (first next))
>                        (process-lxml next)
>                        (end-elem)))
>                     (t (process-lxml x)
>                        (end-elem)))))))))
> 
...