Using Lisp to process "XML-like" documents...

From: Christopher Browne
Subject: Using Lisp to process "XML-like" documents...
Date: Tue, 28 Sep 1999 00:00:00 +0000
Message-ID: <17UH3.17003$%4.517525@news6.giganews.com>

[Note: followup redirected to comp.lang.lisp as it's neither a Linux
matter nor directly an XML matter...]
On 24 Sep 1999 07:26:49 -0400, William F. Hammond
<·······@hilbert.math.albany.edu> wrote: 
>Arthur Lemmens <·······@simplex.nl> writes:
>
>> By using these conventions I don't have to use a parser at all (not
>> even Lisp's READ function). I just start defining the macros that
>> do whatever needs to be done with the document. Lisp's compiler takes
>> care of parsing the document for me.
>
>Could you elaborate on this with a small, simple example?

Indeed.

The reference to the use of DESTRUCTURING-BIND as a significant part
of the process was interesting; I'd be *real* interested in seeing how
this is made to work as DESTRUCTURING-BIND is one of those operators
for which I have Not Yet Seen Relevance...  

A coherent example of its usefulness would be worth seeing.

-- 
Programming is one of the most difficult branches of applied mathematics; the
poorer mathematicians had better remain pure mathematicians.
-- Edsger W. Dijkstra
········@hex.net- <http://www.hex.net/~cbbrowne/lsf.html>

From: Arthur Lemmens
Subject: Re: Using Lisp to process "XML-like" documents...
Date: Tue, 28 Sep 1999 00:00:00 +0000
Message-ID: <37F121F4.95EB9EE3@simplex.nl>

I wrote (in comp.text.xml):
>
> By using these conventions I don't have to use a parser at all (not
> even Lisp's READ function). I just start defining the macros that
> do whatever needs to be done with the document. Lisp's compiler takes
> care of parsing the document for me.

William F. Hammond asked:
>
> Could you elaborate on this with a small, simple example?

Christopher Browne switched the thread to comp.lang.lisp and wrote:
>
> The reference to the use of DESTRUCTURING-BIND as a significant part
> of the process was interesting;

I don't think I ever mentioned DESTRUCTURING-BIND. If someone else did,
I'm afraid I missed that.

It's all much simpler than that. The trick is that, in the right context, 
a document like:

(book (:title "My book"
       :author "Arthur Lemmens")
  (chapter (:title "Introduction")
    (paragraph () "This is a very short introduction")
    (paragraph (:justify :right) "It won't get much longer."))
  (chapter (:title "Chapter One")))

can be a legal Lisp program.
Here is such a context:

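;; Each macro expands into a literal string, computed at macroexpansion time.
(defmacro book ((&key title author) &body chapters)
  (format nil "This is a book by ~A about ~A.~% It contains ~D chapters."
          author title (length chapters)))

(defmacro chapter ((&key title) &body children)
  (declare (ignore children))
  title)

(defmacro paragraph ((&key (justify :left)) &body text)
  ;; JUSTIFY is accepted (and defaulted) but not used yet.
  (declare (ignore justify))
  (first text))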

After evaluating these macros, I can let my Lisp compiler evaluate
the document/program above. In fact, I just did this. Running
my document/program resulted in the string:
"This is a book by Arthur Lemmens about My book.
 It contains 2 chapters."

The Lisp compiler takes care of parsing the document for me. In fact, 
it will complain if I misspell a keyword parameter (eh, sorry, 
attribute) and it will also fill in default values if I ask it to. 
You could say I get a poor man's DTD validation for free as well.
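
A minimal sketch of that "poor man's validation", under stated assumptions: DEMO-PARAGRAPH and EXPANSION-REJECTED-P are hypothetical names introduced here for illustration, and the check relies on the implementation signalling an error for an unrecognized keyword in a macro lambda list, which the common implementations do.

```lisp
;; Hypothetical variant of PARAGRAPH that returns its attribute and text,
;; so that the defaulting of JUSTIFY is visible in the result.
(defmacro demo-paragraph ((&key (justify :left)) &body text)
  `(list ,justify ,(first text)))

(defun expansion-rejected-p (form)
  "Return true if macroexpanding FORM signals an error."
  (handler-case (progn (macroexpand-1 form) nil)
    (error () t)))

;; No attribute given: JUSTIFY falls back to its default, :LEFT.
(demo-paragraph () "Short text.")

;; Attribute given explicitly.
(demo-paragraph (:justify :right) "Short text.")

;; Misspelled attribute: the macro lambda list does not accept :JUSTFY,
;; so the expansion is expected to be rejected.
(expansion-rejected-p '(demo-paragraph (:justfy :right) "Oops."))
```

The check happens when the form is macroexpanded, so a bad attribute is reported before any of the document "runs".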

--
Arthur Lemmens

From: ·······@neodesic.com
Subject: Re: Using Lisp to process "XML-like" documents...
Date: Wed, 29 Sep 1999 00:00:00 +0000
Message-ID: <7stf5p$ab6$1@nnrp1.deja.com>

In article <·················@simplex.nl>,
  Arthur Lemmens <·······@simplex.nl> wrote:
>
> I wrote (in comp.text.xml):
> >
> > By using these conventions I don't have to use a parser at all
> > (not even Lisp's READ function). I just start defining the macros
> > that do whatever needs to be done with the document. Lisp's
> > compiler takes care of parsing the document for me.

I've written a bit of XML in Lisp s-expression syntax.  There is an
ugly problem with trying to get READ to understand these forms,
however: XML namespaces.

My novice understanding of XML is that in the tag <HTML:P>, HTML is
the namespace.  But READ tries to parse it as a symbol P that is
external in an HTML package.  Unless you've set up packages ahead of
time to mirror what namespaces & tags can appear in your XML document
you're kind of screwed.
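
The failure can be reproduced with a small sketch. TRY-READ is a hypothetical helper, and the package name is chosen only because it is assumed not to exist in the running image; typical implementations signal an error when a token names a package that is not there.

```lisp
(defun try-read (string)
  "Return the form READ produces from STRING, or :READ-FAILED on a reader error."
  (handler-case (values (read-from-string string))
    (error () :read-failed)))

;; An unqualified tag reads as an ordinary symbol in the current package.
(try-read "(p \"text\")")

;; A qualified tag makes the reader look for a package with that name.
;; NO-SUCH-NAMESPACE-1999 is assumed not to exist, so the read is
;; expected to fail.
(try-read "(no-such-namespace-1999:p \"text\")")
```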

I can think of kludges to get around this.  With a little readtable
manipulation you can convince READ to convert "NAMESPACE:TAG" strings
to symbols whose symbol-name is "NAMESPACETAG" (see
<http://www.deja.com/getdoc.xp?AN=403294371>, by Matthias Holzl).
This might be enough in some contexts but is not generally acceptable.

Another idea is to define a readmacro that would just gobble up
characters until it hits whitespace and return a symbol with the
corresponding name.  If the read macro were named #$, you'd use it
like this: (#$HTML:P ...).  This is just too ugly.
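
A rough sketch of such a read macro, for what it's worth. READ-RAW-TAG, *TAG-READTABLE* and READ-TAGGED are hypothetical names; treating parentheses as delimiters in addition to whitespace, and interning the result in the KEYWORD package, are assumptions made for the example.

```lisp
(defun read-raw-tag (stream subchar arg)
  "Collect characters up to whitespace or a parenthesis and return them as a keyword."
  (declare (ignore subchar arg))
  (let ((chars '()))
    (loop for c = (peek-char nil stream nil nil t)
          while (and c (not (member c '(#\Space #\Tab #\Newline #\( #\)))))
          do (push (read-char stream t nil t) chars))
    ;; The colon is kept as part of the symbol name, so no package lookup happens.
    (intern (coerce (nreverse chars) 'string) :keyword)))

;; Install #$ in a private copy of the standard readtable,
;; leaving the global readtable untouched.
(defparameter *tag-readtable*
  (let ((readtable (copy-readtable nil)))
    (set-dispatch-macro-character #\# #\$ #'read-raw-tag readtable)
    readtable))

(defun read-tagged (string)
  "Read one form from STRING with the #$ read macro enabled."
  (let ((*readtable* *tag-readtable*))
    (values (read-from-string string))))

;; The first element is a symbol whose name is the string "HTML:P".
(read-tagged "(#$HTML:P \"text\")")
```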

If you wanted to get really fancy you could examine the document to
determine what namespaces and tags it uses, then set up packages to
mirror those namespaces before calling READ.  At this point you almost
might as well just write a real parser.
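
A sketch of the package setup step only, assuming the prescan has already produced a list of (namespace . tag) string pairs. ENSURE-TAG and READ-WITH-NAMESPACES are hypothetical names, and the prescan itself is not shown.

```lisp
(defun ensure-tag (namespace tag)
  "Make sure a package named NAMESPACE exists and exports a symbol named TAG."
  (let ((pkg (or (find-package namespace)
                 (make-package namespace :use '()))))
    (export (intern tag pkg) pkg)
    pkg))

(defun read-with-namespaces (string tags)
  "Read one form from STRING after creating the packages described by TAGS.
TAGS is a list of (namespace . tag) pairs of strings, as a hypothetical
prescan of the document might produce."
  (loop for (namespace . tag) in tags
        do (ensure-tag namespace tag))
  (values (read-from-string string)))

;; With the package in place, the qualified tag reads as an external symbol.
(read-with-namespaces "(XMLDEMO-HTML:P \"text\")"
                      '(("XMLDEMO-HTML" . "P")))
```

The names are given in upper case because the standard reader upcases tokens by default; a case-sensitive document would need a different readtable case.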

XML namespaces show up everywhere, they're not just some obscure
feature you can ignore.  The fact that XML uses ":" to separate
namespaces from tags seems Lisp-like, so it's ironic that it makes
things so much harder to parse from Lisp.

John Wiseman


From: Rainer Joswig
Subject: Re: Using Lisp to process "XML-like" documents...
Date: Wed, 29 Sep 1999 00:00:00 +0000
Message-ID: <joswig-2909991916180001@194.163.195.67>

In article <············@nnrp1.deja.com>, ·······@neodesic.com wrote:

> If you wanted to get really fancy you could examine the document to
> determine what namespaces and tags it uses, then set up packages to
> mirror those namespaces before calling READ.  At this point you almost
> might as well just write a real parser.

Hmm, would a condition handler do the trick?

From: Kent M Pitman
Subject: Re: Using Lisp to process "XML-like" documents...
Date: Wed, 29 Sep 1999 00:00:00 +0000
Message-ID: <sfwogelxqrq.fsf@world.std.com>

······@lavielle.com (Rainer Joswig) writes:

> In article <············@nnrp1.deja.com>, ·······@neodesic.com wrote:
> 
> > If you wanted to get really fancy you could examine the document to
> > determine what namespaces and tags it uses, then set up packages to
> > mirror those namespaces before calling READ.  At this point you almost
> > might as well just write a real parser.
> 
> Hmm, would a condition handler do the trick?

He should not be using READ.

From: Martin Rodgers
Subject: Re: Using Lisp to process "XML-like" documents...
Date: Wed, 29 Sep 1999 00:00:00 +0000
Message-ID: <MPG.125c7c451483d50a98a039@news.demon.co.uk>

In article <············@nnrp1.deja.com>, ·······@neodesic.com says...

> At this point you almost might as well just write a real parser.
 
That's why I've used expat. I wrote a simple C program to map XML to 
s-exprs. Tag names can be handled any way you like. Map namespace:name 
to whatever kind of Lisp object you like, say a structure. There's no 
reason why you can't make the mapping data defined. Use an s-expr!
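
On the Lisp side, the "say a structure" option might look roughly like this. QNAME and PARSE-QNAME are hypothetical names, not part of expat or of the C program mentioned above, and the part before the colon is stored as a prefix only; resolving a prefix to its namespace URI is left out of the sketch.

```lisp
(defstruct qname
  prefix       ; string before the colon, or NIL when there is none
  local-name)  ; string after the colon, or the whole name

(defun parse-qname (string)
  "Split a tag name such as \"HTML:P\" at its first colon into a QNAME."
  (let ((colon (position #\: string)))
    (if colon
        (make-qname :prefix (subseq string 0 colon)
                    :local-name (subseq string (1+ colon)))
        (make-qname :prefix nil :local-name string))))

;; A qualified name keeps both parts; no package lookup is involved.
(parse-qname "HTML:P")

;; An unqualified name has no prefix.
(parse-qname "P")
```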
-- 
Remove insect from address | You can never browse enough
will write code that writes code that writes code for food

From: james anderson
Subject: Re: Using Lisp to process "XML-like" documents...
Date: Sat, 02 Oct 1999 00:00:00 +0000
Message-ID: <37F646DE.25FCE6E5@ibm.net>

there is a rudimentary xml parser packaged with the cl-http release.
take a look at its namespace handling to get an idea of what is necessary.
the mappings suggested below are not correct.


·······@neodesic.com wrote:
> 
> In article <·················@simplex.nl>,
>   Arthur Lemmens <·······@simplex.nl> wrote:
> >
> > I wrote (in comp.text.xml):
> > >
> > > By using these conventions I don't have to use a parser at all
> > > (not even Lisp's READ function). I just start defining the macros
> > > that do whatever needs to be done with the document. Lisp's
> > > compiler takes care of parsing the document for me.
> 
> I've written a bit of XML in Lisp s-expression syntax.  There is an
> ugly problem with trying to get READ to understand these forms,
> however: XML namespaces.
> 
> My novice understanding of XML is that in the tag <HTML:P>, HTML is
> the namespace.  But READ tries to parse it as a symbol P that is
> external in an HTML package.  Unless you've set up packages ahead of
> time to mirror what namespaces & tags can appear in your XML document
> you're kind of screwed.
> 
> I can think of kludges to get around this.  With a little readtable
> manipulation you can convince READ to convert "NAMESPACE:TAG" strings
> to symbols whose symbol-name is "NAMESPACETAG" (see
> <http://www.deja.com/getdoc.xp?AN=403294371>, by Matthias Holzl).
> This might be enough in some contexts but is not generally acceptable.

From: Kenny Tilton
Subject: Re: Using Lisp to process "XML-like" documents...
Date: Tue, 28 Sep 1999 00:00:00 +0000
Message-ID: <37F05C2E.D5D5F3D4@liii.com>

> I'd be *real* interested in seeing how
> this is made to work as DESTRUCTURING-BIND is one of those operators
> for which I have Not Yet Seen Relevance...
> 
> A coherent example of its usefulness would be worth seeing.

Ah, I love this one, because 'destructuring-bind is my latest crush in
CL, and it took me a loonggggg time to appreciate. In the meantime, I
was doing stuff like:

(defun myFunc (p1 p1Info)
  (let ((p1Size (car p1Info))
        (p1Color (cdr p1Info))
   ...yada yada...

and:

   (cadr (member :allocation slotSpec))


The first because the 'p1Info was just a quicky cons to shoot two pieces
of info around and I did not want to go thru a whole defstruct for that.
One can argue I was lazy, and so this does not justify d/bind. :)

But d/bind also lets me do neat stuff in a macro I wrote which
preprocesses if you will a 'defclass. The defclass code is just data to
my macro, and it parses it using d/bind.

It was great fun and lots of laughing at myself as I ripped out all the
code (such as the second example) that tediously looked through a list
looking for a keyword and then plucked out the next thing in the list.
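
For comparison, here is roughly what the two snippets above become with DESTRUCTURING-BIND. This is a sketch, not the actual macro: DESCRIBE-PART and SLOT-ALLOCATION are hypothetical names, and the :INSTANCE default is an assumption made for the example.

```lisp
;; First example: take a cons apart in one step instead of CAR and CDR.
(defun describe-part (p1 p1-info)
  (destructuring-bind (size . color) p1-info
    (list p1 size color)))

(describe-part :widget (cons 10 :red))

;; Second example: pull a keyword option out of a slot specification
;; instead of (cadr (member :allocation slot-spec)).
(defun slot-allocation (slot-spec)
  (destructuring-bind (name &key (allocation :instance) &allow-other-keys)
      slot-spec
    (declare (ignore name))
    allocation))

(slot-allocation '(counter :initform 0 :allocation :class))
(slot-allocation '(counter :initform 0))
```

The lambda list states the expected shape of the data, so the defaulting and the keyword search no longer need hand-written list walking.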

Hope that helps.

kt

From: Martin Rodgers
Subject: Re: Using Lisp to process "XML-like" documents...
Date: Tue, 28 Sep 1999 00:00:00 +0000
Message-ID: <MPG.125ad4e8a20927b498a034@news.demon.co.uk>

In article <·················@liii.com>, ····@liii.com says...

> Ah, I love this one, because 'destructuring-bind is my latest crush in
> CL, and it took me a loonggggg time to appreciate. In the meantime, I
> was doing stuff like:

Curiously, this was something that took me no time at all to 
appreciate. Possibly because I'd been writing loads of code like:
 
> (defun myFunc (p1 p1Info)
>   (let ((p1Size (car p1Info))
>         (p1Color (cdr p1Info))
>    ...yada yada...

The example below uses a WITH-PAIR macro in Scheme. As you can see, 
this also checks for a pair and branches.

(define (treat-or e*)
    (with-pair test next e* #f
        (with-pair a b next test
            (with-gensym (v "or")
                `((lambda (,v)
                    (if ,v
                        ,v
                        ,(treat-or next)))
                    ,test)))))
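
The definition of WITH-PAIR is not shown in the post. A Common Lisp sketch of a macro with the same apparent shape, guessed purely from the usage above (two variables, the form to take apart, a form to return when it is not a pair, then the body), might be:

```lisp
(defmacro with-pair (car-var cdr-var form else-form &body body)
  "If FORM evaluates to a cons, bind CAR-VAR and CDR-VAR to its parts and
run BODY; otherwise evaluate ELSE-FORM.  Argument order is an assumption."
  (let ((value (gensym "PAIR")))
    `(let ((,value ,form))
       (if (consp ,value)
           (let ((,car-var (car ,value))
                 (,cdr-var (cdr ,value)))
             (declare (ignorable ,car-var ,cdr-var))
             ,@body)
           ,else-form))))

;; A pair is taken apart and the body runs.
(with-pair head tail '(1 2 3) nil
  (list head tail))

;; Anything else takes the else branch.
(with-pair head tail 42 :not-a-pair
  (list head tail))
```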

BTW, converting XML documents to Lisp data is neat. I use expat and a 
little C code to do the conversion, then Lisp does the processing.
Would an XML parser written in Lisp be faster? This was fast to write.
-- 
Please note: my email address is munged; You can never browse enough
         "There are no limits." -- tagline for Hellraiser

From: Kenny Tilton
Subject: Re: Using Lisp to process "XML-like" documents...
Date: Tue, 28 Sep 1999 00:00:00 +0000
Message-ID: <37F171C4.6564AB85@liii.com>

Martin Rodgers wrote:
> 
> In article <·················@liii.com>, ····@liii.com says...
> 
> > Ah, I love this one, because 'destructuring-bind is my latest crush in
> > CL, and it took me a loonggggg time to appreciate. In the meantime, I
> > was doing stuff like:
> 
> Curiously, this was something that took me no time at all to
> appreciate. 

OK, I confess: I took one look at the size of the name
'destructuring-bind and did not even look to see what it did! :)

Then I was developing a CLOS metaclass and saw some examples somewhere
that were leveraging that bad boy and I put two and two together.

I really learned a lot during that exercise, at a point when I thought I
knew CL. Not! :)

kt