From: Tayssir John Gabbour
Subject: Re: XML
Date: 
Message-ID: <cgvc62$h9d@odbk17.prod.google.com>
Joe Marshall wrote:
> "Jeff" <···@nospam.insightbb.com> writes:
> > 5. Error checking is a real pain (i.e. close tags in the wrong
> > order). I can't count how many malformed XML files broke game code.
>
> Not surprising.  A certain amount of redundancy is necessary for
> error correction, but it is not sufficient.  Just adding redundancy
> strictly *increases* the likelihood of error.  You need an additional
> mechanism to recover from errors.

Indeed. I have never seen code which takes advantage of this
"redundancy." If one needs a checksum, they just add an explicit
checksum, or more likely rely on the checksumming of a lower transport
layer.
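
A minimal sketch of the explicit-checksum approach, in Python. The names
frame and unframe are hypothetical, and CRC-32 is only one possible choice
of checksum; nothing in this thread specifies one.

```python
import zlib


def frame(payload: bytes) -> bytes:
    """Append a 4-byte big-endian CRC-32 of the payload (assumed layout)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def unframe(framed: bytes) -> bytes:
    """Return the payload, or raise ValueError if the CRC-32 does not match."""
    payload, stored = framed[:-4], framed[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != stored:
        raise ValueError("checksum mismatch")
    return payload
```

The cost here is a fixed four bytes per message regardless of how the
payload itself is written, which is the sense in which the check is
"explicit" rather than a side effect of the markup.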

Also, sexps can co-opt any advantages of end-tags:
(foo 42) ;</foo>

We see that comments can trivially add any sort of end-tag
requirements. XML however can't trivially add sexp's terseness
advantages, since there's always that minimum bulk.

Anyway, XML is only a communication medium. Using it as the internal
representation of one's important data is probably a recipe for
disaster.

MfG,
Tayssir

--
Video, audio and other lispish odds & ends.
http://alu.cliki.net/Education

From: Stefan Scholl
Subject: Re: XML
Date: 
Message-ID: <ymojtbd3f57j.dlg@parsec.no-spoon.de>
On 2004-08-30 16:04:50, Tayssir John Gabbour wrote:

> Anyway, XML is only a communication medium. Using it as the internal
> representation of one's important data is probably a recipe for
> disaster.

Maybe this is the reason I don't really grok XML databases.
From: Christopher C. Stacy
Subject: Re: XML
Date: 
Message-ID: <ubrgssd58.fsf@news.dtpq.com>
>>>>> On Mon, 30 Aug 2004 17:26:44 +0200, Stefan Scholl ("Stefan") writes:
 Stefan> On 2004-08-30 16:04:50, Tayssir John Gabbour wrote:
 >> Anyway, XML is only a communication medium. Using it as the
 >> internal representation of one's important data is probably a
 >> recipe for disaster.
 Stefan> Maybe this is the reason I don't really grok XML databases.

I think what's going on there in some cases may be an attempt
at doing object-oriented database access on an SQL engine.
From: Russell Wallace
Subject: Re: XML
Date: 
Message-ID: <4133a2c0.257796439@news.eircom.net>
On 30 Aug 2004 07:04:50 -0700, "Tayssir John Gabbour"
<···········@yahoo.com> wrote:

>Anyway, XML is only a communication medium. Using it as the internal
>representation of one's important data is probably a recipe for
>disaster.

If by "internal" you mean "working copy in memory", I'd certainly
agree. But for anything you want to store on disk for an indefinite
period, using XML is a Good Idea. For some of the reasons why, see:

http://www.longnow.org/10klibrary/darkage.htm

-- 
"Sore wa himitsu desu."
To reply by email, remove
the small snack from address.
From: Alain Picard
Subject: Re: XML
Date: 
Message-ID: <87acwcw9o9.fsf@memetrics.com>
················@eircom.net (Russell Wallace) writes:

> But for anything you want to store on disk for an indefinite
> period, using XML is a Good Idea. For some of the reasons why, see:
>
> http://www.longnow.org/10klibrary/darkage.htm

Either there is sufficient meta-data to make sense of the
bits on the disk, or there isn't.  XML tags don't magically
tell an application (or a person) how some field is to
be handled.  For a digital library, for example, it'd probably
be shorter and faster to include a full specification of the
data format (in ASCII) on the CDROM and to highly compress
the contents of the library.  

Storing megabytes and megabytes of XML seems insane.
From: Rob Warnock
Subject: Re: XML
Date: 
Message-ID: <w8Kdnap2l6JTd67cRVn-ug@speakeasy.net>
Alain Picard  <············@memetrics.com> wrote:
+---------------
| ················@eircom.net (Russell Wallace) writes:
| > But for anything you want to store on disk for an indefinite
| > period, using XML is a Good Idea. For some of the reasons why, see:
| > http://www.longnow.org/10klibrary/darkage.htm
| 
| Either there is sufficient meta-data to make sense of the
| bits on the disk, or there isn't.  XML tags don't magically
| tell an application (or a person) how some field is to
| be handled.  For a digital library, for example, it'd probably
| be shorter and faster to include a full specification of the
| data format (in ASCII) on the CDROM and to highly compress
| the contents of the library.  
+---------------

Or if political considerations *require* XML somewhere, then write
that "full specification of the data format (in ASCII)" in XML and
include it on the CDROM, and leave the data itself in binary (possibly
even compressed, as you say).

+---------------
| Storing megabytes and megabytes of XML seems insane.
+---------------

Indeed.


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Russell Wallace
Subject: Re: XML
Date: 
Message-ID: <4133ffbb.281603081@news.eircom.net>
On Tue, 31 Aug 2004 08:19:50 +1000, Alain Picard
<············@memetrics.com> wrote:

>Either there is sufficient meta-data to make sense of the
>bits on the disk, or there isn't.  XML tags don't magically
>tell an application (or a person) how some field is to
>be handled.

They provide a far better starting point than other formats, enough so
that in many cases it's likely to make the difference between the data
being retrievable at affordable cost versus being abandoned.

>For a digital library, for example, it'd probably
>be shorter and faster to include a full specification of the
>data format (in ASCII) on the CDROM and to highly compress
>the contents of the library.  

You could certainly store the data in that way - good luck to your
successor trying to get it back again.

>Storing megabytes and megabytes of XML seems insane.

<only partly tongue in cheek> Sometimes I wonder if further progress
in computer technology is waiting for those of us whose minds were
permanently warped by having to fit things in 3.5K of memory to die
off :P </only partly tongue in cheek>

-- 
"Sore wa himitsu desu."
To reply by email, remove
the small snack from address.
From: John Thingstad
Subject: Re: XML
Date: 
Message-ID: <opsdk3i4xlpqzri1@mjolner.upc.no>
I think it is more a question of being old enough to see
that doing the same old thing (storing data) with more sluggish data
retrieval and taking 4 times the room is not progress.

>
> <only partly tongue in cheek> Sometimes I wonder if further progress
> in computer technology is waiting for those of us whose minds were
> permanently warped by having to fit things in 3.5K of memory to die
> off :P </only partly tongue in cheek>
>



-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
From: Christophe Rhodes
Subject: Re: XML
Date: 
Message-ID: <sqwtzfajhw.fsf@cam.ac.uk>
"John Thingstad" <··············@chello.no> writes:

> I think it is more a question of being old enough to see that doing
> the same old thing (storing data) with more sluggis[h] data retrieval
> and taking 4 times the room is not progress.

How do you think error-correcting codes work?  How would you
characterize the speed of retrieval and space requirements of such
codes?

Christophe
-- 
http://www-jcsu.jesus.cam.ac.uk/~csr21/       +44 1223 510 299/+44 7729 383 757
(set-pprint-dispatch 'number (lambda (s o) (declare (special b)) (format s b)))
(defvar b "~&Just another Lisp hacker~%")    (pprint #36rJesusCollegeCambridge)
From: Rob Warnock
Subject: Re: XML
Date: 
Message-ID: <0fGdnQKX3epD7qncRVn-qg@speakeasy.net>
Christophe Rhodes  <·····@cam.ac.uk> wrote:
+---------------
| "John Thingstad" <··············@chello.no> writes:
| > I think it is more a question of being old enough to see that doing
| > the same old thing (storing data) with more sluggis[h] data retrieval
| > and taking 4 times the room is not progress.
| 
| How do you think error-correcting codes work?  How would you
| characterize the speed of retrieval and space requirements of such
| codes?
+---------------

1. A heck of a lot more efficient than XML!!  You'd have to be dealing
   with a really *high* error rate -- multiple percent per byte! -- to
   need an error-correcting code with overhead equivalent to XML's bloat!
   [E.g., a rate-3/4 or rate-1/2 convolutional code.] And even at that,
   you'd be *getting* something for your overhead, namely, significant
   error *correction*. XML's trailing brackets provide at most some minor
   error *detection*, not much better than an extra parity bit per element.

2. Practical error-correcting codes have *less* overhead than XML!
   For example, a Reed-Solomon code capable of *correcting* up to
   three bytes in error anywhere within each XML element would only
   cost 7 bytes per element (for elements <248 bytes long). Compare
   that to the sum of the lengths of typical XML opening & closing tags!

3. But all that is moot, since XML doesn't provide any error correction!
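
A small Python sketch of the detection-versus-correction point, using the
standard library's ElementTree parser. The helper names are made up for
illustration, and the 7-byte figure is simply taken from point 2 above,
not derived here.

```python
import xml.etree.ElementTree as ET

RS_COST_FROM_POST = 7  # bytes per element, as claimed in point 2 above


def check_well_formed(xml_text):
    """Return None if the text parses, else the parser's error message.

    The parser can only report the problem; it offers nothing to repair it.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


def tag_overhead(name):
    """Bytes spent on one element's opening and closing tags."""
    return len(f"<{name}>") + len(f"</{name}>")


good = "<a><b>1</b></a>"
bad = "<a><b>1</a></b>"  # close tags in the wrong order

assert check_well_formed(good) is None
assert check_well_formed(bad) is not None  # detected, but not corrected
assert tag_overhead("price") > RS_COST_FROM_POST
```

Under these assumptions, any element name longer than one character
already spends more than seven bytes on its two tags.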


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Christophe Rhodes
Subject: Re: XML
Date: 
Message-ID: <sq4qmjw6wv.fsf@cam.ac.uk>
····@rpw3.org (Rob Warnock) writes:

> Christophe Rhodes  <·····@cam.ac.uk> wrote:
> +---------------
> | "John Thingstad" <··············@chello.no> writes:
> | > I think it is more a question of being old enough to see that doing
> | > the same old thing (storing data) with more sluggis[h] data retrieval
> | > and taking 4 times the room is not progress.
> | 
> | How do you think error-correcting codes work?  How would you
> | characterize the speed of retrieval and space requirements of such
> | codes?
> +---------------
>
> 1. A heck of a lot more efficient than XML!! [...]
> 2. Practical error-correcting codes have *less* overhead than XML! [...]
> 3. But all that is moot, since XML doesn't provide any error correction!

Well, duh.

Christophe
-- 
http://www-jcsu.jesus.cam.ac.uk/~csr21/       +44 1223 510 299/+44 7729 383 757
(set-pprint-dispatch 'number (lambda (s o) (declare (special b)) (format s b)))
(defvar b "~&Just another Lisp hacker~%")    (pprint #36rJesusCollegeCambridge)
From: William Bland
Subject: Re: XML
Date: 
Message-ID: <pan.2004.08.31.16.40.11.499468@abstractnonsense.com>
On Tue, 31 Aug 2004 04:37:53 +0000, Russell Wallace wrote:
 
> <only partly tongue in cheek> Sometimes I wonder if further progress
> in computer technology is waiting for those of us whose minds were
> permanently warped by having to fit things in 3.5K of memory to die
> off :P </only partly tongue in cheek>

No, progress in computer technology is waiting for people to stop
re-inventing the wheel (often badly).

Cheers,
	Bill.
-- 
Dr. William Bland.
It would not be too unfair to any language to refer to Java as a
stripped down Lisp or Smalltalk with a C syntax.   (Ken Anderson).
From: Jeff
Subject: Re: XML
Date: 
Message-ID: <WL6Zc.93402$Fg5.14608@attbi_s53>
William Bland wrote:

> On Tue, 31 Aug 2004 04:37:53 +0000, Russell Wallace wrote:
>  
> > <only partly tongue in cheek> Sometimes I wonder if further progress
> > in computer technology is waiting for those of us whose minds were
> > permanently warped by having to fit things in 3.5K of memory to die
> > off :P </only partly tongue in cheek>
> 
> No, progress in computer technology is waiting for people to stop
> re-inventing the wheel (often badly).
> 

This, I think, will be my new .sig =) With credit, of course.

Jeff
From: Coby Beck
Subject: Re: XML
Date: 
Message-ID: <qJ9Zc.83303$X12.11219@edtnps84>
"Russell Wallace" <················@eircom.net> wrote in message
·······················@news.eircom.net...
> On Tue, 31 Aug 2004 08:19:50 +1000, Alain Picard
> <············@memetrics.com> wrote:
> >For a digital library, for example, it'd probably
> >be shorter and faster to include a full specification of the
> >data format (in ASCII) on the CDROM and to highly compress
> >the contents of the library.
>
> You could certainly store the data in that way - good luck to your
> successor trying to get it back again.

What if his successor is able to read a data specification...?

> >Storing megabytes and megabytes of XML seems insane.
>
> <only partly tongue in cheek> Sometimes I wonder if further progress
> in computer technology is waiting for those of us whose minds were
> permanently warped by having to fit things in 3.5K of memory to die
> off :P </only partly tongue in cheek>

I think that is pretty non sequitur.  My first development computer was a
Windows NT machine and I can clearly see that XML is a no-brainer that has
been widely touted as the deliverer of all your data sharing dreams.  XML
may provide good hints for a human reader who has nothing else to go on but
it buys a computer program exactly *nothing*.  (/Except/ that now everyone
has bought into it, the free tools are everywhere to get you quickly past
the part where you actually deal with it.  And that part of your data
consuming or producing program, ie parsing or formatting it, is the trivial
part anyway.)

People, and by people I really mean a large proportion of the world's code
monkeys, have unfortunately come to equate parsing something with knowing
what to do with it.  This is XML's Big Lie, this is why XML does not deliver
what people expect of it, and expectations are a big factor in product
satisfaction.  If I buy a nice 5 litre box of wine, I might have a glass and
say to myself, "Hmm, that's not bad (for wine out of a box)" but when
someone pours it into an empty $35/750ml bottle and then offers me a glass,
I will probably say to myself, "yech!"

XML is wine from a box in a $35 bottle.

-- 
Coby Beck
(remove #\Space "coby 101 @ big pond . com")
From: Russell Wallace
Subject: Re: XML
Date: 
Message-ID: <4135f11e.64768057@news.eircom.net>
On Wed, 01 Sep 2004 01:37:26 GMT, "Coby Beck" <·····@mercury.bc.ca>
wrote:

>What if his successor is able to read a data specification...?

Being able to read a data specification is one thing - being able to
get assigned a budget for writing decoding software based on it (which
includes the time to figure out by trial and error all the mistakes,
ambiguities and omissions in the specification - trust me, I've done
that sort of work, it is _not_ anywhere near as trivial as writing a
parser for data of which you yourself are the producer) is another.

>I think that is pretty non sequitur.  My first development computer was a
>Windows NT machine and I can clearly see that XML is a no-brainer that has
>been widely touted as the deliverer of all your data sharing dreams.

Well, I agree. "Use XML" is a no-brainer; it's something that (in most
cases) should clearly and obviously be done. At the same time, of
course it won't solve all the problems of interpreting data as
meaningful information, and yes, any hype that says it will should be
disregarded.

>XML is wine from a box in a $35 bottle.

Heh, not a bad analogy. But the solution isn't to pour the wine down
the drain, it's to drink it while understanding you're not getting the
elixir of life.

-- 
"Sore wa himitsu desu."
To reply by email, remove
the small snack from address.
From: Steven E. Harris
Subject: Re: XML
Date: 
Message-ID: <jk48ybut056.fsf@W003275.na.alarismed.com>
················@eircom.net (Russell Wallace) writes:

> being able to get assigned a budget for writing decoding software
> based on it

[...]

> it won't solve all the problems of interpreting data as meaningful
> information

What is the value in getting "decoding" for free when "interpreting"
is still necessary and usually much more difficult? I'm trying to
imagine some case where one opens a file in a foreign XML vocabulary
just to gaze at the structure, or to pluck out a few text strings.

For an XML application, the "parser" is really more like a traditional
lexer, and the application still does all the parsing. Hence, it's
wrong to sell XML on the grounds that one need not write a parser. For
line-oriented record files, does read-line count as a parser?
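
As a rough Python illustration of that split, the sketch below separates
what the XML library hands back from what the application still has to do.
The "order" vocabulary and both function names are invented for this
example.

```python
import xml.etree.ElementTree as ET


def generic_tokens(xml_text):
    """What the library gives us: tags and text, with no meaning attached."""
    root = ET.fromstring(xml_text)
    return [(el.tag, (el.text or "").strip()) for el in root.iter()]


def interpret_order(xml_text):
    """Application-level parsing for a hypothetical <order> vocabulary.

    Checks the structure and assigns types; raises ValueError when the
    document is well-formed XML but not a meaningful order.
    """
    root = ET.fromstring(xml_text)
    if root.tag != "order":
        raise ValueError("not an order document")
    sku = root.findtext("sku")
    qty = root.findtext("qty")
    if sku is None or qty is None:
        raise ValueError("order needs both <sku> and <qty>")
    quantity = int(qty)  # raises ValueError for text such as "three"
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return {"sku": sku, "qty": quantity}
```

A document such as <order><qty>three</qty></order> passes through
generic_tokens without complaint and is rejected only by interpret_order,
which is the part the application had to write either way.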

I'll concede that I admire XML for tackling Unicode and character
encoding decisions head-on, but I'd rather not trade all its baggage
just to get an agreed-upon means of serializing directed graphs.

-- 
Steven E. Harris
From: Russell Wallace
Subject: Re: XML
Date: 
Message-ID: <413681e2.25450957@news.eircom.net>
On Wed, 01 Sep 2004 09:38:29 -0700, "Steven E. Harris" <···@panix.com>
wrote:

>What is the value in getting "decoding" for free when "interpreting"
>is still necessary and usually much more difficult?

1) Getting decoding for free. XML doesn't magically do the
interpreting for you, but eschewing XML certainly doesn't either! You
might as well take the decoding solution.

2) Having the fields labelled in English rather than some
binary-number crap is a big help with interpreting.

>I'm trying to
>imagine some case where one opens a file in a foreign XML vocabulary
>just to gaze at the structure

I've spent plenty of time gazing at the structure of this or that file
in some proprietary format in order to figure out how to extract the
data so the customer could ditch some obsolete DOS program whose
programmer had long since moved on, but which was the only thing that
knew how to read said files. If they'd been in XML format, my job
would have been far easier.

>I'll concede that I admire XML for tackling Unicode and character
>encoding decisions head-on, but I'd rather not trade all its baggage
>just to get an agreed-upon means of serializing directed graphs.

Try doing the above job a few times and see if you still feel that
way!

-- 
"Sore wa himitsu desu."
To reply by email, remove
the small snack from address.
From: John Thingstad
Subject: Re: XML
Date: 
Message-ID: <opsdpcfsvjpqzri1@mjolner.upc.no>
On Thu, 02 Sep 2004 02:19:34 GMT, Russell Wallace  
<················@eircom.net> wrote:

> On Wed, 01 Sep 2004 09:38:29 -0700, "Steven E. Harris" <···@panix.com>
> wrote:
>
>> What is the value in getting "decoding" for free when "interpreting"
>> is still necessary and usually much more difficult?
>
> 1) Getting decoding for free. XML doesn't magically do the
> interpreting for you, but eschewing XML certainly doesn't either! You
> might as well take the decoding solution.
>
> 2) Having the fields labelled in English rather than some
> binary-number crap is a big help with interpreting.
>
>> I'm trying to
>> imagine some case where one opens a file in a foreign XML vocabulary
>> just to gaze at the structure
>
> I've spent plenty of time gazing at the structure of this or that file
> in some proprietary format in order to figure out how to extract the
> data so the customer could ditch some obsolete DOS program whose
> programmer had long since moved on, but which was the only thing that
> knew how to read said files. If they'd been in XML format, my job
> would have been far easier.
>
>> I'll concede that I admire XML for tackling Unicode and character
>> encoding decisions head-on, but I'd rather not trade all its baggage
>> just to get an agreed-upon means of serializing directed graphs.
>
> Try doing the above job a few times and see if you still feel that
> way!
>

There are two basic ways to read an XML file.

The first uses SAX.
Each tag generates an event and you tie a handler to it.
Then you build the structure in memory yourself.

The second uses DOM.
Here the library builds a tree structure in memory for you.
Most people know this style from JavaScript.

Both these approaches do save time compared to building your own parser.
In Lisp it is probably better to use read and let that build the
data structure.
The data format is then sexpr.
(I am recounting this mostly so you see that I know.)
In a language like Java there is a more obvious saving of time using XML.
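
A minimal Python sketch of the two styles, using the standard library's
xml.sax and xml.dom.minidom modules. The inventory document, the handler
class and the function names are made up for illustration.

```python
import xml.sax
from xml.dom import minidom

DOC = (b"<inventory>"
       b"<item name='sword'>3</item>"
       b"<item name='shield'>1</item>"
       b"</inventory>")


class ItemHandler(xml.sax.ContentHandler):
    """SAX style: react to events and build the structure ourselves."""

    def __init__(self):
        super().__init__()
        self.items = {}
        self._current = None  # name of the <item> being read, if any
        self._text = []

    def startElement(self, name, attrs):
        if name == "item":
            self._current = attrs["name"]
            self._text = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect and join later.
        if self._current is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "item":
            self.items[self._current] = int("".join(self._text))
            self._current = None


def read_with_sax(data):
    handler = ItemHandler()
    xml.sax.parseString(data, handler)
    return handler.items


def read_with_dom(data):
    """DOM style: the library builds the whole tree, then we query it."""
    doc = minidom.parseString(data)
    return {node.getAttribute("name"): int(node.firstChild.data)
            for node in doc.getElementsByTagName("item")}
```

Both functions return the same dictionary for DOC; the difference is who
holds the intermediate state, the handler or the library's tree.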

XML has an advantage over sexpr when the amount of data is large and the
number of tags small. XHTML is a good example of where XML is actually
an advantage (over HTML). I find sexp-encoded HTML more difficult to read
than plain HTML. The alignment of sexps seems to make it more difficult to
see how blocks of text align.
For records in a database, however, I find the excessive tags to be a huge
waste of space and time. Lisp sexpr beats XML hands down here. Also, there
can be no semantic information encoded in XML.
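
The size argument can be made concrete with a small Python sketch. The
two renderers below are simplified assumptions (no escaping, no string
quoting, flat records only), and the field names are invented.

```python
def record_to_xml(tag, fields):
    """Render a flat record as XML, one child element per field."""
    body = "".join(f"<{k}>{v}</{k}>" for k, v in fields.items())
    return f"<{tag}>{body}</{tag}>"


def record_to_sexp(tag, fields):
    """Render the same record as a nested s-expression."""
    body = " ".join(f"({k} {v})" for k, v in fields.items())
    return f"({tag} {body})"


def markup_fraction(rendered, fields):
    """Share of the rendered text that is markup rather than field values."""
    payload = sum(len(str(v)) for v in fields.values())
    return (len(rendered) - payload) / len(rendered)


row = {"id": 17, "qty": 3, "price": 250}
xml_row = record_to_xml("row", row)
sexp_row = record_to_sexp("row", row)

# Short database-style fields: markup dominates, and XML pays more of it.
assert len(sexp_row) < len(xml_row)
assert markup_fraction(sexp_row, row) < markup_fraction(xml_row, row)

# Long text with few tags: the markup share becomes negligible for both.
prose = {"p": "x" * 1000}
assert markup_fraction(record_to_xml("doc", prose), prose) < 0.02
```

For records with many short fields the tags make up most of the bytes,
while for long runs of text the difference all but disappears, which
matches the distinction drawn above.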

XML 1.0, DTD and XRPC I can appreciate.

SOAP, XSL, XSLT I find to be an abomination.
I am still appalled that more people don't see that their abstraction is
breaking down and are just keeping on adding to it. It seems a huge waste
of space and time. Equally bad for both the computer (too large, too slow)
and the human (too difficult to learn, too verbose).
I never thought I would say this, but I prefer using Perl to transform one
text format to another to using XSLT. It is faster, easier to use and MUCH
more compact.



From: Marco Antoniotti
Subject: Re: XML
Date: 
Message-ID: <ohMZc.69$D5.79809@typhoon.nyu.edu>
John Thingstad wrote:
> 
> There are two basic ways to read an XML file.
> 
> The first uses SAX.
> Each tag generates an event and you tie a handler to it.
> Then you build the structure in memory yourself.
> 
> The second uses DOM.
> Here the library builds a tree structure in memory for you.
> Most people know this style from JavaScript.
> 
> Both these approaches do save time compared to building your own parser.
...

Well, technically both ways use an "XML parser".  The distinction is kind
of bogus IMHO.

Cheers
--
marco
From: John Thingstad
Subject: Re: XML
Date: 
Message-ID: <opsdp1ktcgpqzri1@mjolner.upc.no>
On Thu, 02 Sep 2004 17:29:56 -0400, Marco Antoniotti <·······@cs.nyu.edu>  
wrote:

> Well, technically both ways use an "XML parser".  The distinction is kind
> of bogus IMHO.
>
> Cheers
> --
> marco
>

Yes. I meant ways to access the data in an XML file from the reader.

From: Alain Picard
Subject: Re: XML
Date: 
Message-ID: <87y8js96s3.fsf@memetrics.com>
················@eircom.net (Russell Wallace) writes:

>
> Try doing the above job a few times and see if you still feel that
> way!

Well, what can I say.  I have, and I do.
From: Coby Beck
Subject: Re: XML
Date: 
Message-ID: <MCnZc.43617$A8.13390@edtnps89>
"Russell Wallace" <················@eircom.net> wrote in message
······················@news.eircom.net...
> On Wed, 01 Sep 2004 01:37:26 GMT, "Coby Beck" <·····@mercury.bc.ca>
> wrote:
> >I think that is pretty non sequitur.  My first development computer was a
> >Windows NT machine and I can clearly see that XML is a no-brainer that has
> >been widely touted as the deliverer of all your data sharing dreams.
>
> Well, I agree. "Use XML" is a no-brainer; it's something that (in most
> cases) should clearly and obviously be done. At the same time, of

That wasn't what I meant.  I meant the concept behind the creation of XML is
a no-brainer.  But worse, XML is one of those all too common "reinvented
wheels"  that is in fact elliptical.  It makes fundamental errors in already
solved problem spaces.

> >XML is wine from a box in a $35 bottle.
>
> Heh, not a bad analogy. But the solution isn't to pour the wine down
> the drain, it's to drink it while understanding you're not getting the
> elixir of life.

Yes, but why is it always being served at my formal banquets? ;-)

-- 
Coby Beck
(remove #\Space "coby 101 @ big pond . com")
From: Russell Wallace
Subject: Re: XML
Date: 
Message-ID: <413680c0.25160236@news.eircom.net>
On Wed, 01 Sep 2004 17:26:04 GMT, "Coby Beck" <·····@mercury.bc.ca>
wrote:

>That wasn't what I meant.  I meant the concept behind the creation of XML is
>a no-brainer.  But worse, XML is one of those all too common "reinvented
>wheels"  that is in fact elliptical.  It makes fundamental errors in already
>solved problem spaces.

*shrug* I've yet to see a format that didn't have its disadvantages...
but the thing is, you're thinking of the problem as "serializing data
in a transparent format". But that problem is trivial, of course it's
been solved before. The real problem is *getting people to agree* to
serialize data in a transparent format. And *that* is a problem that
was *not* solved before XML.

>> Heh, not a bad analogy. But the solution isn't to pour the wine down
>> the drain, it's to drink it while understanding you're not getting the
>> elixir of life.
>
>Yes, but why is it always being served at my formal banquets? ;-)

Because it's the only drink in the world that doesn't have molecules
of flipped chirality relative to the biochemistry of 99% of the
attendants? (Okay, the analogy is getting very very thin at this stage
:))

-- 
"Sore wa himitsu desu."
To reply by email, remove
the small snack from address.