From: Ron Parker
Subject: Processing XML-formatted documents in CL
Date: 
Message-ID: <f68e3853-9ae2-41d5-8882-1153fe62bd68@v4g2000yqa.googlegroups.com>
I need to analyze and manipulate several multi-megabyte documents that
are stored in various XML formats. There appears to be a fairly long
list of XML parsers and tools on the CLiki but I have no frame of
reference to judge them.

Some of the documents I need to access will probably contain Unicode,
although I am not positive of this.

Half of me wants to hack it from scratch, but this is my first CL
project so I don't know if this would be wise even though I've hacked
Elisp for a couple decades off and on.

Any recommendations in this area would be appreciated.

From: Xah Lee
Subject: Re: Processing XML-formatted documents in CL
Date: 
Message-ID: <e5a867a1-32bf-45b1-a102-58777c74371c@a26g2000prf.googlegroups.com>
On Jan 13, 7:57 pm, Ron Parker <········@gmail.com> wrote:
> I need to analyze and manipulate several multi-megabyte documents that
> are stored in various XML formats. There appears to be a fairly long
> list of XML parsers and tools on the CLiki but I have no frame of
> reference to judge them.
>
> Some of the documents I need to access will probably contain Unicode,
> although I am not positive of this.
>
> Half of me wants to hack it from scratch, but this is my first CL
> project so I don't know if this would be wise even though I've hacked
> Elisp for a couple decades off and on.
>
> Any recommendations in this area would be appreciated.

Thought I'd mention that there's nxml-mode, written by the XML expert
James Clark. It features an XML parser and XML validation as you type.

Since it contains a complete parser, I think it could be used for your
project, but I'm not sure how the code is structured to be used like
that. (The code is over 10k lines.)

(James is also the one who wrote the widely used XML parser expat in
C, which happened to be used as part of our app server software
written in Perl back in 1999; I only realized he was its author in
about 2007.)

  Xah
∑ http://xahlee.org/

☄
From: Volkan YAZICI
Subject: Re: Processing XML-formatted documents in CL
Date: 
Message-ID: <b708605d-5299-49f4-b847-aedffa2ca55e@s9g2000prg.googlegroups.com>
On Jan 14, 5:57 am, Ron Parker <········@gmail.com> wrote:
> I need to analyze and manipulate several multi-megabyte documents that
> are stored in various XML formats. There appears to be a fairly long
> list of XML parsers and tools on the CLiki but I have no frame of
> reference to judge them.
>
> Some of the documents I need to access will probably contain Unicode,
> although I am not positive of this.

I think Closure XML[1] is pretty good; it has good support and a good
community.

You didn't specify much about what you mean by "multi-megabyte", but
the size could be a problem. Anyway, just try it and see. If it
fails for memory-related reasons, you can first convert your XML
files into s-expressions using some sort of XSLT and then easily
parse those s-expressions from Lisp.
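
To make the s-expression route concrete, here is a minimal sketch of the
Lisp side only. It assumes the XSLT step has already produced one list
per element in the shape (name (attributes...) children...). That shape,
the sample data, and the function names are all made up for illustration;
they are not what any particular stylesheet emits.

```lisp
;; Hypothetical output of an XML-to-s-expression conversion.
;; Assumed shape: (element-name (attribute-plist) child...)
(defparameter *converted-document*
  "(log () (entry (:id \"1\")) (entry (:id \"2\")) (note ()))")

(defun read-converted-document (text)
  "Read one s-expression from TEXT with read-time evaluation disabled."
  (let ((*read-eval* nil))   ; refuse #. forms coming from a data file
    (values (read-from-string text))))

(defun element-children (element)
  "Return everything after the name and attribute list of ELEMENT."
  (cddr element))

(defun count-children-named (name element)
  "Count the direct child elements of ELEMENT whose name is the symbol NAME.
Non-list children (for example text strings) are ignored."
  (count name (element-children element)
         :key (lambda (child) (and (consp child) (first child)))))
```

With the sample above, (count-children-named 'entry
(read-converted-document *converted-document*)) returns 2. Element names
are read as symbols in the current package, so the reading and the
comparison have to happen in the same package.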


Regards.

[1] http://common-lisp.net/project/cxml/
From: Andy Chambers
Subject: Re: Processing XML-formatted documents in CL
Date: 
Message-ID: <1f3e4d39-393f-4850-b17c-5809c5b54dce@b38g2000prf.googlegroups.com>
On Jan 14, 8:26 am, Volkan YAZICI <·············@gmail.com> wrote:
> On Jan 14, 5:57 am, Ron Parker <········@gmail.com> wrote:
>
> > I need to analyze and manipulate several multi-megabyte documents that
> > are stored in various XML formats. There appears to be a fairly long
> > list of XML parsers and tools on the CLiki but I have no frame of
> > reference to judge them.
>
> > Some of the documents I need to access will probably contain Unicode,
> > although I am not positive of this.
>
> I think Closure XML[1] is pretty good; have a good support and
> community.
>
> You didn't specify much about what you mean with "multi-megabyte", but
> this issue could be a problem.

Not really.  Closure XML can process documents as streams or trees, and
its stream processor has a really nice interface (klacks).
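
A rough sketch of the two styles, assuming the cxml system is already
loaded (for example with ASDF) on a Unicode-capable Lisp such as SBCL.
The sample document and the two function names are hypothetical. The
tree version builds a whole DOM in memory before counting, while the
klacks version pulls one event at a time, so memory use does not grow
with document size.

```lisp
;; Assumes cxml is already loaded, e.g. (asdf:load-system :cxml).
;; The sample data and function names are made up for illustration.
(defparameter *sample*
  "<log><entry id=\"1\"/><entry id=\"2\"/><note/></log>")

;; Tree style: parse the whole document into a DOM, then query it.
(defun count-entries-dom (xml)
  (let ((document (cxml:parse xml (cxml-dom:make-dom-builder))))
    (dom:length (dom:get-elements-by-tag-name document "entry"))))

;; Stream style: pull events from a klacks source one at a time.
(defun count-entries-klacks (xml)
  (klacks:with-open-source (source (cxml:make-source xml))
    (loop for event = (klacks:peek source)   ; NIL once the input is exhausted
          while event
          count (and (eq event :start-element)
                     (string= (klacks:current-lname source) "entry"))
          do (klacks:consume source))))
```

For a file on disk, passing a pathname such as #p"big.xml" instead of
the string should work with both cxml:parse and cxml:make-source.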

--
Andy
From: GP lisper
Subject: Re: Processing XML-formatted documents in CL
Date: 
Message-ID: <slrngmrgvm.5du.spambait@phoenix.clouddancer.com>
On Tue, 13 Jan 2009 19:57:47 -0800 (PST), <········@gmail.com> wrote:
> I need to analyze and manipulate several multi-megabyte documents that
> are stored in various XML formats. There appears to be a fairly long
> list of XML parsers and tools on the CLiki but I have no frame of
> reference to judge them.

If it is standards-compliant XML, any of the fancy code will work.

When I faced this problem about 3 years ago, "s-xml" solved my
non-standard XML problems nicely.  I still use it for everything,
since I didn't need to learn any buzzwords and specs to apply it.

You'll probably try a few parsers anyway; it sounds like speed will be
an issue.

-- 
"Most programmers use this on-line documentation nearly all of the
time, and thereby avoid the need to handle bulky manuals and perform
the translation from barbarous tongues."  CMU CL User Manual
From: Zach Beane
Subject: Re: Processing XML-formatted documents in CL
Date: 
Message-ID: <m38wpevx1d.fsf@unnamed.xach.com>
Ron Parker <········@gmail.com> writes:

> I need to analyze and manipulate several multi-megabyte documents that
> are stored in various XML formats. There appears to be a fairly long
> list of XML parsers and tools on the CLiki but I have no frame of
> reference to judge them.
>
> Some of the documents I need to access will probably contain Unicode,
> although I am not positive of this.
>
> Half of me wants to hack it from scratch, but this is my first CL
> project so I don't know if this would be wise even though I've hacked
> Elisp for a couple decades off and on.
>
> Any recommendations in this area would be appreciated.

I've been a happy user of Closure XML for some time now. It's very
capable and the documentation is good.

Zach
From: game_designer
Subject: Re: Processing XML-formatted documents in CL
Date: 
Message-ID: <e8daa490-165e-45c5-a9da-eda1bc61805c@z27g2000prd.googlegroups.com>
On Jan 13, 8:57 pm, Ron Parker <········@gmail.com> wrote:
>
> Any recommendations in this area would be appreciated.

If the goal is to read the file and to map XML elements into similarly
structured CLOS objects, you may want to explore XMLisp. A new version
was released recently:

http://www.agentsheets.com/lisp/XMLisp/

If it works, great! If not, let me know why not.

Alex