XML, Lisp and Meaning

From: Boris Schaefer
Subject: XML, Lisp and Meaning
Date: Thu, 14 Sep 2000 00:00:00 +0000
Message-ID: <87bsxq4p2d.fsf@qiwi.uncommon-sense.net>

A few days ago, I was talking with a friend about XML and about some
things I don't like about it.  Well, this made me remember a thread of
last year titled `data structure for markup text' and a more recent
thread titled `Lisp XML parser'.

What I didn't understand was Erik's comment about SGML documents
having no meaning besides the one the renderer assigns to them, whereas
a Lisp program has a defined meaning.

Here's what Erik wrote in the parser-thread:

# To be blunt: *ML documents derive meaning from sources external to
# the documents.  Even if you use XSL to obtain meaning as far as
# _presentation_ is concerned, you still don't have a clue what you're
# dealing with unless you're actually the _same_ application as the
# writer of the XML document.  *ML is no better than random chunks of
# binary data, but it also is no worse -- it could easily have been.

The discussion then drifted to the difference between the meaning of a
Lisp form and that of an XML form; specifically, it was about how
`(car whatever)' has a meaning whereas
`<title>Introduction</title>' has none.

Something with which I agree.

I guess this is mainly addressed at Erik.

What meaning should an XML document have?  As far as I can see
(possibly I'm mentally short sighted) an expression has meaning only
relative to more primitive expressions.  What should these primitive
expressions be in a markup language?  I could imagine something akin
to HTML as a basic rendering language, that could give more abstract
XML forms a meaning.

Or, maybe you could declare the primitive expressions yourself and
these would have no defined meaning.  Then you could define higher
level abstractions in the terms of these lower level primitives.  This
IMO _is_ useful, but I'm not sure whether this is what Erik meant.

Well, maybe somebody can help me out here.
Boris

PS: Erik mentioned `modern hypertext theory' in one of these threads.
    Does anybody have any references where I can read something about
    this?  Thanks.

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

Put no trust in cryptic comments.

From: Rahul Jain
Subject: Re: XML, Lisp and Meaning
Date: Thu, 14 Sep 2000 00:00:00 +0000
Message-ID: <8prfnm$670$1@joe.rice.edu>

In article <··············@qiwi.uncommon-sense.net>, Boris Schaefer <·····@uncommon-sense.net> wrote:
> What meaning should an XML document have?  As far as I can see
> (possibly I'm mentally short sighted) an expression has meaning only
> relative to more primitive expressions.  What should these primitive
> expressions be in a markup language?  I could imagine something akin
> to HTML as a basic rendering language, that could give more abstract
> XML forms a meaning.

XML is just like sexprs, as I see it. They have no meaning without an
interpreter, just like (car foo) means nothing without saying that it's
an sexpr to be interpreted by a lisp interpreter. You can consider scheme
and CL to be different "DTD"s for sexpr languages. The main difference
between SGML and sexprs is that SGML is designed for representing
documents, while sexprs are typically designed for representing code.
In SGML, the tags (code) are secondary citizens in that you need special
syntax to show that something is a tag. In lisp, words (strings) are
likewise secondary citizens.

> PS: Erik mentioned `modern hypertext theory' in one of these threads.
>     Does anybody have any references where I can read something about
>     this?  Thanks.
> 

I think he's talking about hytime: www.hytime.org

From: Tim Bradshaw
Subject: Re: XML, Lisp and Meaning
Date: Thu, 14 Sep 2000 23:32:20 +0000
Message-ID: <ey3og1qo8hn.fsf@cley.com>

* Rahul Jain wrote:
> XML is just like sexprs, as I see it. They have no meaning without an
> interpreter, just like (car foo) means nothing without saying that it's
> an sexpr to be interpreted by a lisp interpreter. You can consider scheme
> and CL to be different "DTD"s for sexpr languages. The main difference
> between SGML and sexprs is that SGML is designed for representing
> documents, while sexprs are typically designed for representing code.
> In SGML, the tags (code) are secondary citizens in that you need special
> syntax to show that something is a tag. In lisp, words (strings) are
> likewise secondary citizens.

No, you can't treat them as different DTDs.  A DTD just assigns a
structure to a string (or fails to), lisp does a lot more than that:
after assigning a structure it evaluates the resulting structure.
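
The distinction between assigning a structure and evaluating it can be
sketched roughly as follows.  This is only an illustration in Python,
not anything from the thread: the `add'/`mul'/`num' vocabulary and the
`evaluate' function are invented for the example.  Parsing yields a
tree of labels; a value only appears once a separate evaluator says
what the labels mean.

```python
import operator
import xml.etree.ElementTree as ET

# Step 1: parsing only assigns structure. The tags mean nothing yet.
tree = ET.fromstring("<add><num>1</num><num>2</num></add>")

# Step 2: an evaluator (hypothetical, defined here) assigns meaning to
# each tag. Without this table the tree is just nested labels.
OPERATORS = {"add": operator.add, "mul": operator.mul}

def evaluate(node):
    """Evaluate a tiny arithmetic vocabulary invented for this example."""
    if node.tag == "num":
        return int(node.text)
    func = OPERATORS[node.tag]
    values = [evaluate(child) for child in node]
    result = values[0]
    for value in values[1:]:
        result = func(result, value)
    return result

# What the parser alone knows: names and nesting.
structure = (tree.tag, [child.tag for child in tree])
# What the evaluator adds: a value.
value = evaluate(tree)
```

The parser never needed `OPERATORS'; everything that makes `<add>' mean
addition lives outside the document, which is roughly the point being
made about DTDs.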

--tim

From: Rahul Jain
Subject: Re: XML, Lisp and Meaning
Date: Thu, 14 Sep 2000 00:00:00 +0000
Message-ID: <8prpjs$fsh$1@joe.rice.edu>

In article <···············@cley.com>, Tim Bradshaw <···@cley.com> wrote:
> No, you can't treat them as different DTDs.  A DTD just assigns a
> structure to a string (or fails to), lisp does a lot more than that:
> after assigning a structure it evaluates the resulting structure.

Correct, that's a difference I missed. DTDs define syntax, but not semantics.
However, in both cases, the interpreter assigns the semantics according to
whatever standard it's following (or creating).

-- Rahul

From: Arvid Grøtting
Subject: Re: XML, Lisp and Meaning
Date: Fri, 15 Sep 2000 00:00:00 +0000
Message-ID: <l8hf7h7t07.fsf@gorgon.netfonds.no>

Rahul Jain <·····@rice.edu> writes:

> DTDs define syntax, but not semantics.

That's not entirely true.

Traditionally, an SGML DTD consists of a machine-readable part and a
human-readable part.  The human-readable part should (and often does)
define semantics.

-- 
(let (hh mm ss)
  (do () () (multiple-value-setq (ss mm hh) (get-decoded-time)) 
    (format t "~C~@R ~@R ~@R                            "
	    #\return hh mm ss) (sleep 1)))

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Fri, 15 Sep 2000 00:00:00 +0000
Message-ID: <87snr13dbc.fsf@qiwi.uncommon-sense.net>

······@regina.uio.no (Arvid Grøtting) writes:

| That's not entirely true.
| 
| Traditionally, an SGML DTD consists of a machine-readable part and a
| human-readable part.  The human-readable part should (and often does)
| define semantics.

If you mean that a human reader understands that <title>foo</title> is
something like a title for something, well, ok, most probably would.
But, I don't see what that buys you, if your documents are to be
processed by a program that was written by someone who didn't know the
human-readable semantics of the DTD in the first place.

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

Seize the day, put no trust in the morrow!
		-- Quintus Horatius Flaccus (Horace)

From: Rahul Jain
Subject: Re: XML, Lisp and Meaning
Date: Fri, 15 Sep 2000 00:00:00 +0000
Message-ID: <8ptj37$dll$1@joe.rice.edu>

In article <··············@gorgon.netfonds.no>, ······@regina.uio.no (Arvid Grøtting) wrote:
> Rahul Jain <·····@rice.edu> writes:
> 
>> DTDs define syntax, but not semantics.
> 
> That's not entirely true.
> 
> Traditionally, an SGML DTD consists of a machine-readable part and a
> human-readable part.  The human-readable part should (and often does)
> define semantics.
> 

That's true, but someone needs to write an interpreter (or style sheet)
in order to actually have the document interpreted according to those
semantics. The style sheet is what really assigns the semantics to the
document. Think of the style sheet as a set of "macros" that "compiles"
the document into a lower-level representation, just like the LOOP
macro does.

-- Rahul

From: Craig Brozefsky
Subject: Re: XML, Lisp and Meaning
Date: Fri, 15 Sep 2000 00:00:00 +0000
Message-ID: <87k8cdo8cf.fsf@piracy.red-bean.com>

Rahul Jain <·····@rice.edu> writes:

> That's true, but someone needs to write an interpreter (or style sheet)
> in order to actually have the document interpreted according to those
> semantics. The style sheet is what really assigns the semantics to the
> document. Think of the style sheet as a set of "macros" that "compiles"
> the document into a lower-level representation, just like the LOOP
> macro does.

Stylesheets are not really for adding semantics, rather they are for
performing transformations.  Granted that with DSSSL you could do some
interpretation of the document because you had a full Scheme
interpreter, but all of the XML stylesheet languages are really only
good for defining transformations, or describing page-based layout of
elements and their contents.  This is hardly assigning semantics.

What is brewing in the XML world now is this horrible mishmash of an
object model called DOM, which has all the worst parts of the lame OO
systems you can imagine, rolled into a least common denominator
interface to an object-tree representation of documents in memory.
Supposedly through this you can define your semantics in various OO
languages, not by manipulating the markup, but by manipulating its
object representation.  The stylesheets are somewhat orthogonal to
this, but are retrofitted to conform to the rhetoric of the DOM
model.

All I can think of when confronted with this monstrosity is the scene
from The Life of Brian, where they start worshipping Brian's sandal.
It handily squelches any notion one had of the computer industry
having revolutionary potential.

-- 
Craig Brozefsky               <·····@red-bean.com>
it's alright 'cos the historical pattern has shown / how the
economical cycle tends to revolve / in a round of decades
three stages stand out in a loop / a slump and war then peel
back to square one and back for more -- Stereolab "Ping Pong"

From: Rahul Jain
Subject: Re: XML, Lisp and Meaning
Date: Fri, 15 Sep 2000 00:00:00 +0000
Message-ID: <8ptvv1$ptv$1@joe.rice.edu>

In article <··············@piracy.red-bean.com>, Craig Brozefsky <·····@red-bean.com> wrote:
> Stylesheets are not really for adding semantics, rather they are for
> performing transformations.  Granted that with DSSSL you could do some
> interpretation of the document because you had a full Scheme
> interpreter, but all of the XML stylesheet languages are really only
> good for defining transformations, or describing page-based layout of
> elements and their contents.  This is hardly assigning semantics.

As far as documents are concerned, I'd consider those transformations to
be the semantics.

> What is brewing in the XML world now is this horrible mishmash of an
> object model called DOM, which has all the worst parts of the lame OO
> systems you can imagine, rolled into a least common denominator
> interface to an object-tree representation of documents in memory.
> Supposedly through this you can define your semantics in various OO
> languages, not by manipulating the markup, but by manipulating its
> object representation.  The stylesheets are somewhat orthogonal to
> this, but are retrofitted to conform to the rhetoric of the DOM
> model.

Ok, now I'm just confused. I don't think I want to learn anything more
about this... :P

> All I can think of when confronted with this monstrosity is the scene
> from The Life of Brian, where they start worshipping Brian's sandal.
> It handily squelches any notion one had of the computer industry
> having revolutionary potential.

I guess that's just human nature. Oh well, life goes on...

-- Rahul

From: Michael Schuerig
Subject: Re: XML, Lisp and Meaning
Date: Fri, 15 Sep 2000 00:00:00 +0000
Message-ID: <1egzy5o.13pwm3016z2jggN%schuerig@acm.org>

Rahul Jain <·····@rice.edu> wrote:

> In article <···············@cley.com>, Tim Bradshaw <···@cley.com> wrote:
> > No, you can't treat them as different DTDs.  A DTD just assigns a
> > structure to a string (or fails to), lisp does a lot more than that:
> > after assigning a structure it evaluates the resulting structure.
> 
> Correct, that's a difference I missed. DTDs define syntax, but not semantics.
> However, in both cases, the interpreter assigns the semantics according to
> whatever standard it's following (or creating).

Guess: In the case of XML there is no standard interpreter. In the case
of Lisp there is.

Anyway, here's a nice rant about XML that I found recently:
<http://www.interlog.com/~gray/markup-abuse.html>


Michael

-- 
Michael Schuerig
···············@acm.org
http://www.schuerig.de/michael/

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Fri, 15 Sep 2000 00:00:00 +0000
Message-ID: <87wvgd3e6o.fsf@qiwi.uncommon-sense.net>

········@acm.org (Michael Schuerig) writes:

| Guess: In the case of XML there is no standard interpreter. In the
| case of Lisp there is.

Just to nitpick: no there isn't.  There is a standard which defines
the language down to a level such that interpreters following that
standard produce the same results (except where the standard says the
result is implementation-dependent).  Well, I understand that XML
doesn't give you anything close to that.

Erik seems to say that XML _should_ have an interpreter and my
question then really is:

What kind of results should such an interpreter produce?

| Anyway, here's a nice rant about XML that I found recently:
| <http://www.interlog.com/~gray/markup-abuse.html>

Which seems to me to be about the same that Erik is saying, except for
the fact that Erik's also saying that XML _should_ be interpretable.

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

The kind of danger people most enjoy is the kind they can watch from
a safe place.

From: Christopher Browne
Subject: Re: XML, Lisp and Meaning
Date: Sat, 16 Sep 2000 04:35:45 +0000
Message-ID: <BMCw5.11451$SN6.311450@news4.giganews.com>

In our last episode (15 Sep 2000 16:47:11 +0200),
the artist formerly known as Boris Schaefer said:
>········@acm.org (Michael Schuerig) writes:
>
>| Guess: In the case of XML there is no standard interpreter. In the
>| case of Lisp there is.
>
>Just to nitpick: no there isn't.  There is a standard which defines
>the language down to a level such that interpreters following that
>standard produce the same results (except where the standard says the
>result is implementation-dependent).  Well, I understand that XML
>doesn't give you anything close to that.

That seems to me to be picking more at nits than anyone might _ever_
accuse Erik of...  _Obviously_ there's not a "standard interpreter"
for Lisp _as such_; what you get is a standard that describes the
_generic behaviour_ of a Lisp environment that is good enough that you
could loosely pretend that there is a "standard interpreter."

With XML, there's an API (DOM) that provides a scheme for predicting
how one might create an interpretation scheme, but you need a whole
lot more "convention" to have anything useful.

>Erik seems to say that XML _should_ have an interpreter and my
>question then really is:
>
>What kind of results should such an interpreter produce?

It probably needs to have some form of "native" output model; if that
needs to interface to another language, that would imply generating
something "not entirely unlike CORBA;" happily, there do exist
interface schemes like CORBA to deal with that.

>| Anyway, here's a nice rant about XML that I found recently:
>| <http://www.interlog.com/~gray/markup-abuse.html>
>
>Which seems to me to be about the same that Erik is saying, except
>for the fact that Erik's also saying that XML _should_ be
>interpretable.

Not quite.

Graydon Hoare is basically saying that XML needs some sort of
"semantic evaluation" on top of the DTD.  

I'd parallel this with relational databases somewhat like the
following:
- DBMSes all allow you to define tables containing fields, and to
  indicate type information about those fields.

  That parallels the way a DTD allows you to define what elements and
  attributes are permissible.

- DBMSes provide some limited ways of describing how tables relate to
  one another; generally not terribly sophisticated.

  DTDs provide some of the same, indicating what elements may be
  contained in other elements.

The amount of validation that is provided by DTDs and by SQL typing is
not unuseful; the problem is that it is _INCOMPLETE_.

Darwen&Date's "Third Manifesto" book discusses the incompleteness
issue from the perspective of SQL; the critical point, which I'm
_certain_ Graydon would agree with, is that these "definition schemes"
provide only a very shallow set of validation information.

If you give me a DTD, or some SQL DDL, that provides me only a fairly
shallow understanding of what you truly intend as the data
representation.
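
To make the "shallow" part concrete, here is a rough sketch in Python.
Everything in it is an assumption made for illustration: the
`ALLOWED_CHILDREN' table is a much-simplified stand-in for a DTD's
content models, and the order/item/quantity vocabulary is invented.
The document below passes the structural check and still fails a rule
that only the application knows about.

```python
import xml.etree.ElementTree as ET

# Hypothetical, much-simplified stand-in for a DTD: for each element,
# the set of child elements it may contain.
ALLOWED_CHILDREN = {
    "order": {"item"},
    "item": {"name", "quantity"},
    "name": set(),
    "quantity": set(),
}

def structurally_valid(node):
    """Check only which elements appear inside which, as a DTD would."""
    allowed = ALLOWED_CHILDREN.get(node.tag)
    if allowed is None:
        return False
    return all(child.tag in allowed and structurally_valid(child)
               for child in node)

def semantically_valid(order):
    """An application-level rule the structural check cannot express."""
    for item in order.iter("item"):
        text = item.findtext("quantity", default="")
        if not text.isdigit() or int(text) == 0:
            return False
    return True

# Well-formed and structurally fine, but the quantity is nonsense.
doc = ET.fromstring(
    "<order><item><name>widget</name>"
    "<quantity>minus three</quantity></item></order>"
)
```

Here `structurally_valid(doc)' holds while `semantically_valid(doc)'
does not; the second function is exactly the kind of knowledge that has
to live outside the DTD.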

The "Third Manifesto" suggests providing a "more truly relational"
database definition scheme that allows describing a more comprehensive
set of rules as to what the DBMS permits.

Graydon doesn't go that far; he basically just says, "Here's the
dilemma; the DTD just isn't enough."

In contrast, Erik's usual thesis on the XML/SGML issue has little to
do with the above (he might well _agree_ with all this, but he doesn't
_talk_ about it...).  

Rather, he basically says something somewhat simpler:

SGML, as it is usually used, and XML, as it is defined, don't provide
anything that is _fundamentally_ more sophisticated than
S-expressions.  (There's a parallel here to the "Graydon" issue
insofar as this implies that SGML/XML are not vastly sophisticated...)

One of the more interesting things you might want to do with SGML/XML
information is to transform it into a different set of SGML/XML, much
as one might take DocBook input and turn it into HTML.  

Crucial point: Such transformations are entirely outside the scope of
either XML or SGML, as neither SGML nor XML provides any mechanism for
_implementing_ transformations.

If we used Lisp, instead, we'd have a language that is quite capable
of:
a) Representing data, as well as
b) Transforming that data.

The situation with XML/SGML as they are is that you wind up having to
work with one or more additional languages in conjunction with parsers
and such, when the "Lispy" approach would require none of that.
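
A rough sketch of what "one language for both" might look like, written
in Python with nested lists standing in for S-expressions.  The tag
names and the `RENAME' table are hypothetical, loosely modelled on the
DocBook-to-HTML case mentioned above; nothing here is an existing
tool.

```python
# A document as nested lists: [tag, child, child, ...]; strings are text.
doc = ["article",
       ["title", "XML, Lisp and Meaning"],
       ["para", "Data and ", ["emphasis", "transformation"],
        " in one language."]]

# Hypothetical mapping from DocBook-like names to HTML-like names.
RENAME = {"article": "div", "title": "h1", "para": "p", "emphasis": "em"}

def transform(node):
    """Return a new tree with every tag renamed; text is left untouched."""
    if isinstance(node, str):
        return node
    tag, *children = node
    return [RENAME.get(tag, tag)] + [transform(child) for child in children]

def render(node):
    """Serialize the nested-list tree as markup (no escaping; sketch only)."""
    if isinstance(node, str):
        return node
    tag, *children = node
    return "<{0}>{1}</{0}>".format(tag, "".join(render(c) for c in children))

html = render(transform(doc))
```

The data, the transformation and the serializer are all ordinary values
and functions of the same language; no separate parser or stylesheet
language is involved.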
-- 
(concatenate 'string "cbbrowne" ·@" "acm.org")
<http://www.ntlug.org/~cbbrowne/xml.html>
There are few personal problems which can't be solved by the suitable
application of high explosives.

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Sat, 16 Sep 2000 00:00:00 +0000
Message-ID: <87aed8fysn.fsf@qiwi.uncommon-sense.net>

········@news.hex.net (Christopher Browne) writes:

| That seems to me to be picking more at nits than anyone might _ever_
| accuse Erik of...  _Obviously_ there's not a "standard interpreter"
| for Lisp _as such_; what you get is a standard that describes the
| _generic behaviour_ of a Lisp environment that is good enough that you
| could loosely pretend that there is a "standard interpreter."

I made that comment because there exist languages which have a
reference implementation that you might call the "standard
interpreter".  Anyway, as I said, I was picking nits.

| Graydon Hoare is basically saying that XML needs some sort of
| "semantic evaluation" on top of the DTD.  
| 
| [...]
| 
| In contrast, Erik's usual thesis on the XML/SGML issue has little to
| do with the above (he might well _agree_ with all this, but he doesn't
| _talk_ about it...).  

Well, here's the quote of Erik's again that I put in the post that
started this thread:

# To be blunt: *ML documents derive meaning from sources external to
# the documents.  Even if you use XSL to obtain meaning as far as
# _presentation_ is concerned, you still don't have a clue what you're
# dealing with unless you're actually the _same_ application as the
# writer of the XML document.  *ML is no better than random chunks of
# binary data, but it also is no worse -- it could easily have been.

Which to me seems to be the same thing that Graydon's talking about.

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

That's odd.  That's very odd.  Wouldn't you say that's very odd?

From: Sunil Mishra
Subject: Re: XML, Lisp and Meaning
Date: Mon, 18 Sep 2000 00:00:00 +0000
Message-ID: <39C66861.90004@everest.com>

I wish I had found this thread before the conversation appeared to peter 
out... I've been (at work) using XML for a variety of things, which can 
be broadly described in two categories: presentation and data transfer. 
My experience so far: XML has been severely abused.

Boris Schaefer wrote:

> A few days ago, I was talking with a friend about XML and about some
> things I don't like about it.  Well, this made me remember a thread of
> last year titled `data structure for markup text' and a more recent
> thread titled `Lisp XML parser'.
> 
> What I didn't understand was Erik's comment about SGML documents
> having no meaning besides the one the renderer assigns to them, whereas
> a Lisp program has a defined meaning.
> 
> Here's what Erik wrote in the parser-thread:
> 
> # To be blunt: *ML documents derive meaning from sources external to
> # the documents.  Even if you use XSL to obtain meaning as far as
> # _presentation_ is concerned, you still don't have a clue what you're
> # dealing with unless you're actually the _same_ application as the
> # writer of the XML document.  *ML is no better than random chunks of
> # binary data, but it also is no worse -- it could easily have been.

The only *ML language I have had any experience with before XML has been 
HTML, so my perspective on them is somewhat more limited than Erik's. 
For marking up documents for presentation, HTML did OK. XML took a step 
back, picked out more facilities from SGML, and tried to create a better 
document markup language. Note that I say *document*. XML, so far as I 
can tell, was never meant for annotating program data and program processes.

The difference I see between documents and program data is very subtle, 
and may appear to be arbitrary at times. Consider that XML (and SGML) do 
not have any notion of a string. They do not have any notion of a 
number. Or any such. Any such restrictions have to be imposed from above.

Consider on the other hand "true" program data. S-expressions have been 
used for a long, long time to express programmatic data. Lists and 
strings and numbers can be written out (and read in) very naturally 
using a lisp or scheme interpreter. In fact, s-expressions encode 
semantics to the extent that data can be freely exchanged between lisp 
and scheme. (There are additional facilities in lisp, such as #1# forms, 
but I am ignoring those for now.) One can use a well defined reader, 
encoded in the function READ, to read in this data and then work with it 
as program data. I consider the ability to be able to painlessly input a 
piece of data through a function *without* relying on any additional 
machinery to be an important aspect of program data.
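
As a rough illustration of the contrast (in Python rather than Lisp, so
`ast.literal_eval' is only an analogue of READ, and the item/count/price
element names are made up for the example):

```python
import ast
import xml.etree.ElementTree as ET

# A literal reader returns typed objects directly, much as READ does
# for s-expressions.
typed = ast.literal_eval('["widget", 3, 2.5]')

# An XML parser returns the same information as untyped text; the
# element names are invented for this example.
root = ET.fromstring(
    "<item><name>widget</name><count>3</count><price>2.5</price></item>"
)
untyped = [child.text for child in root]

# The application must impose the types itself, using outside knowledge
# about which element holds an integer and which a float.
converted = [root.findtext("name"),
             int(root.findtext("count")),
             float(root.findtext("price"))]
```

Only the last step recovers what the reader produced in one call, and
it does so with knowledge that is nowhere in the XML itself.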

Consider again XML. There are currently multiple data encoding formats 
around. (XSchema and the SOAP encoding subset are examples.) A 
non-validating parser can read in a document just fine. But to be able 
to figure out what to do with it, the parser must consult a number of 
other documents, and then *apply their semantics*. That is the key issue 
here. XML does not have any semantics of its own, a consequence of 
having descended from a document representation system. A document is 
something that is opaque to the underlying machine. Program data is not 
opaque.

Just as we can talk about the s-expression READ function, we may some 
day be able to talk about the SOAP READ function or the XSchema READ 
function. Once these functions are available, implemented and tested, I 
might admit XML as a reasonable data exchange mechanism. Though one that 
I will still consider to be inferior in its abilities to s-expressions 
for data exchange.

The reason I had initially stated that XML has been abused is that there 
are no constructs in XML for handling data types natively. They have to 
be imposed through an external entity. The fact that they *can* be 
imposed through an external entity is in fact a problem. XML is *too* 
general to be a good program data representation language. It is not 
adequately constrained to economically express interesting program data 
constructs such as numbers and strings. Sure, you can use s-expressions 
to hold document data as well. But I don't think that is an appropriate 
use of s-expressions. But then that's just my opinion...

I have probably been unable to fully express why I think XML should not 
be used for everything that it has been used for. I would love to have 
this discussion continue though, it will probably force me to clarify my 
thoughts on various issues.

Sunil

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Tue, 19 Sep 2000 23:59:36 +0000
Message-ID: <87r96gvspj.fsf@qiwi.uncommon-sense.net>

Sunil Mishra <············@everest.com> writes:

| The difference I see between documents and program data is very
| subtle, and may appear to be arbitrary at times.  Consider that XML
| (and SGML) do not have any notion of a string.  They do not have any
| notion of a number.  Or any such.  Any such restrictions have to be
| imposed from above.

Well, I also have mentally different conceptions of program data and
documents, though it has not so much to do with data types[1].  I can
hardly explain it.  I think I see documents as being long paragraphs
of text, whereas I see program data as being more deeply nested,
usually, and probably also having more structuring information in
relation to content.

This seems to me a completely arbitrary differentiation and I don't
think it is very useful when talking about markup.  The main reason I
consider this harmful is that I cannot tell you at all where to draw
the line.

I think HTML actually "turns" most documents into program data,
because of its lack of abstraction capabilities so that people have to
litter their documents with markup that makes them essentially
non-readable for humans, while they are still machine readable.

Boris

[1] Actually, I think that the notion that documents contain only text
    is misguided.  Having everything of type "string" makes it very
    hard to build abstractions that have to compute something and as I
    said, I think the lack of abstraction capabilities turns
    documents into an unreadable mess.

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

What's a cult?  It just means not enough people to make a minority.
		-- Robert Altman

From: Marco Antoniotti
Subject: Re: XML, Lisp and Meaning
Date: Wed, 20 Sep 2000 00:00:00 +0000
Message-ID: <y6cem2f14as.fsf@octagon.mrl.nyu.edu>

I have been following this discussion about XML etc etc and just came
to the following very prosaic realization.

XML is "bad" for the Lisp community because it gives "the rest of the
world" what it had lacked for ages: a common parsing substrate for
"external data".  In other words, finally the C/C++ programmer will be
able to just link in an XML parsing library without having to write
her/his own 'operator>>' stuff.

Of course they had to reimplement Lisp to achieve this effect, but
that is another story.

Cheers

-- 
Marco Antoniotti =============================================================
NYU Bioinformatics Group			 tel. +1 - 212 - 998 3488
719 Broadway 12th Floor                          fax  +1 - 212 - 995 4122
New York, NY 10003, USA				 http://galt.mrl.nyu.edu/valis
             Like DNA, such a language [Lisp] does not go out of style.
			      Paul Graham, ANSI Common Lisp

From: Sunil Mishra
Subject: Re: XML, Lisp and Meaning
Date: Wed, 20 Sep 2000 00:00:00 +0000
Message-ID: <39C90A21.3020207@everest.com>

I don't think they are quite there yet. I have been looking at 
exchanging data through XML, and it is still very inconvenient. It will 
get there after a fashion, but there will still be one thing they will 
lack: the code-data equivalence. (I suppose they *will* have to 
reimplement lisp to get that.)

Sunil

Marco Antoniotti wrote:

> I have been following this discussion about XML etc etc and just came
> to the following very prosaic realization.
> 
> XML is "bad" for the Lisp community because it gives "the rest of the
> world" what it had lacked for ages: a common parsing substrate for
> "external data".  In other words, finally the C/C++ programmer will be
> able to just link in an XML parsing library without having to write
> her/his own 'operator>>' stuff.
> 
> Of course they had to reimplement Lisp to achieve this effect, but
> that is another story.
> 
> Cheers

From: Marco Antoniotti
Subject: Re: XML, Lisp and Meaning
Date: Thu, 21 Sep 2000 00:00:00 +0000
Message-ID: <y6caed1vnra.fsf@octagon.mrl.nyu.edu>

Sunil Mishra <············@everest.com> writes:

> I don't think they are quite there yet. I have been looking at 
> exchanging data through XML, and it is still very inconvenient. It will 
> get there after a fashion, but there will still be one thing they will 
> lack: the code-data equivalence. (I suppose they *will* have to 
> reimplement lisp to get that.)

That is a good point, but I believe that the "unified parsing engine"
(READ in CL and Lisps, XML in CL, Lisps and the rest of the world :) )
*is* what really will get XML accepted.  After all, that is enough to
save a humongous amount of work in writing specialized parsers for a
number of "external formats".

Cheers


From: Sunil Mishra
Subject: Re: XML, Lisp and Meaning
Date: Wed, 20 Sep 2000 00:00:00 +0000
Message-ID: <39C90C7F.2050802@everest.com>

I've thought and thought and thought about this, and have come to the 
realization that documents and data are not after all that different. 
One program's document is another program's data. What appears to turn 
an ordinary document into data is whether the program is able to make 
semantic sense of the document. In other words, for a document to be 
data, there must be an interpretive process associated with the document.

That said, the problem with XML is that it is a document, through and 
through, to any program that does not understand the encoding 
conventions. We may contrast this with sexp's in lisp, where through the 
READ function the lisp can parse integers and strings and so on from the 
sexp. The basic encoding is substantially richer when it comes to 
describing data types, but is arguably poorer for differentiating data 
roles. (We need to add symbols to our sexp's to express the role of the 
sexp, whereas in XML this is encoded in the tag name.)

Another way to look at the difference between sexp's and XML is that 
there is a common reader for lisp that does substantially more 
interpretation than the common reader for XML (which only gives you back 
a hierarchically structured document with string content). An 
additional layer of standardized interpretation must be imposed on top 
of the bare XML parser to do what the READ function does for SEXP's. 
Several such methods exist already, at least in the form of proposals. 
(Examples are SOAP and XSchema.) I don't like them all that much, given 
the complexity of their implementations etc, but they are there.
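
A rough sketch of that contrast, written in Python rather than Lisp so 
that both readers can sit side by side. The element names are made up, 
and ast.literal_eval is only a stand-in for READ, not an equivalent:

```python
import ast
import xml.etree.ElementTree as ET

# A generic XML parser hands back a tree whose leaves are all strings.
doc = ET.fromstring("<point><x>3</x><y>4.5</y><label>origin</label></point>")
xml_values = [child.text for child in doc]

# A reader with literal syntax (standing in for READ) recovers typed objects.
read_values = ast.literal_eval('[3, 4.5, "origin"]')

print([type(v).__name__ for v in xml_values])   # every leaf is a str
print([type(v).__name__ for v in read_values])  # int, float, str
```

Turning "3" into an integer on the XML side is exactly the extra layer 
of interpretation that has to be agreed on separately.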

Sunil

Boris Schaefer wrote:

> Sunil Mishra <············@everest.com> writes:
> 
> | The difference I see between documents and program data is very
> | subtle, and may appear to be arbitrary at times.  Consider that XML
> | (and SGML) do not have any notion of a string.  They do not have any
> | notion of a number.  Or any such.  Any such restrictions have to be
> | imposed from above.
> 
> Well, somehow I also have mentally different conceptions of program
> data and documents, it has not so much to do with data types
> though[1].  I can hardly explain it though.  I think I see documents
> as being long paragraphs of text, whereas I think I see program data
> as being more deeply nested, usually, and probably also having more
> structuring information in relation to content.
> 
> This seems to me a completely arbitrary differentiation and I don't
> think it is very useful when talking about markup.  The main reason I
> consider this harmful is that I cannot tell you at all where to draw
> the line.
> 
> I think HTML actually "turns" most documents into program data,
> because of its lack of abstraction capabilities so that people have to
> litter their documents with markup that makes them essentially
> non-readable for humans, while they are still machine readable.
> 
> Boris
> 
> [1] Actually, I think that the notion that documents contain only text
>     is misguided.  Having everything of type "string" makes it very
>     hard to build abstractions that have to compute something and as I
>     said, I think the lack of abstractional capabilities turns
>     documents into an unreadable mess.

From: Barry Margolin
Subject: Re: XML, Lisp and Meaning
Date: Wed, 20 Sep 2000 00:00:00 +0000
Message-ID: <Mp9y5.68$ii3.2936@burlma1-snr2>

In article <················@everest.com>,
Sunil Mishra  <············@everest.com> wrote:
>That said, the problem with XML is that it is a document, through and 
>through, to any program that does not understand the encoding 
>conventions. We may contrast this with sexp's in lisp, where through the 
>READ function the lisp can parse integers and strings and so on from the 
>sexp. The basic encoding is substantially richer when it comes to 
>describing data types, but is arguably poorer for differentiating data 
>roles. (We need to add symbols to our sexp's to express the role of the 
>sexp, whereas in XML this is encoded in the tag name.)

Wouldn't you use defstruct types, so they would be annotated as #S(type
:field value :field value ...)?  Isn't this essentially analogous to XML's
<tag field=value field=value ...> notation?
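
To make the analogy concrete, here is a small sketch in Python, assuming
a dataclass as a stand-in for a defstruct type.  The Point type and the
to_element helper are invented for illustration:

```python
from dataclasses import dataclass, asdict
import xml.etree.ElementTree as ET

@dataclass
class Point:
    x: int
    y: int

def to_element(obj):
    """Map a record to <typename field="value" ...>; every value becomes a string."""
    attrs = {name: str(value) for name, value in asdict(obj).items()}
    return ET.Element(type(obj).__name__.lower(), attrs)

p = Point(3, 4)
elem = to_element(p)
print(repr(p))                                # the record's printed form names its type and fields
print(ET.tostring(elem, encoding="unicode"))  # the tag name plays the role of the type name
```

Note that the integers survive in the record's printed form but are
plain strings once they become XML attributes.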

One big difference is that XML is embedded into textual documents, because
it's trying to serve multiple purposes.  The plain text is there for the
benefit of humans reading the document with a browser, while the XML stuff
is annotation for the benefit of programs, possibly telling them how to
interpret the regular text.  Lisp S-expressions don't really have any
concept of this dual role, it's all oriented towards programs.  SGML could
have been designed in such a way that everything is tagged, instead of
having the text/annotation distinction, but I believe the idea of the way
it's done is that a renderer may not recognize all tags, so they can just
ignore them and display the raw text in between.

-- 
Barry Margolin, ······@genuity.net
Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Fri, 22 Sep 2000 00:00:00 +0000
Message-ID: <877l84yveb.fsf@qiwi.uncommon-sense.net>

Barry Margolin <······@genuity.net> writes:

| [...] SGML could have been designed in such a way that everything is
| tagged, instead of having the text/annotation distinction, but I
| believe the idea of the way it's done is that a renderer may not
| recognize all tags, so they can just ignore them and display the raw
| text in between.

A more Lisp-like interpretation method could easily achieve the same
effect.  Consider having ((foo :bar baz) "content") as the basic
markup style.  An interpreter for this could treat any "function call"
that it can't recognize as the identity function.
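
A minimal sketch of such an interpreter, in Python with nested tuples
standing in for the s-expressions.  The node shape ((tag, attrs),
child, ...) and the single "em" handler are assumptions made up for
this example:

```python
# Hypothetical handlers: only "em" is known to this renderer.
HANDLERS = {
    "em": lambda attrs, body: "*" + body + "*",
}

def render(node):
    """Render a node shaped like ((tag, attrs), child, ...); strings render as themselves."""
    if isinstance(node, str):
        return node
    (tag, attrs), *children = node
    body = "".join(render(child) for child in children)
    # An unrecognized tag behaves like the identity function on its content.
    handler = HANDLERS.get(tag, lambda attrs, body: body)
    return handler(attrs, body)

doc = (("foo", {"bar": "baz"}), "some ", (("em", {}), "content"))
print(render(doc))
```

The unknown foo wrapper disappears and its content is shown, which is
the same fallback a renderer gets by ignoring unknown tags.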

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

There's got to be more to life than compile-and-go.

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Fri, 22 Sep 2000 00:00:00 +0000
Message-ID: <873disyuuc.fsf@qiwi.uncommon-sense.net>

Sunil Mishra <············@everest.com> writes:

| We may contrast this with sexp's in lisp, where through the READ
| function the lisp can parse integers and strings and so on from the
| sexp. The basic encoding is substantially richer when it comes to
| describing data types, but is arguably poorer for differentiating
| data roles. (We need to add symbols to our sexp's to express the
| role of the sexp, whereas in XML this is encoded in the tag name.)

I don't understand what you mean here.  What "data role" does XML's
tag name encode?  And from what other "data roles" do you want to
distinguish it?

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

Non-Reciprocal Laws of Expectations:
	Negative expectations yield negative results.
	Positive expectations yield negative results.

From: Sunil Mishra
Subject: Re: XML, Lisp and Meaning
Date: Mon, 25 Sep 2000 00:00:00 +0000
Message-ID: <39CF8B28.4060908@everest.com>

Well, consider alists. They have an attribute/value structure. There is 
nothing in the alist that says it is an alist, we recognize them by 
convention. The list itself is thus untyped, which can be both good and bad.

In XML, you would probably have to encode the entire alist in specific 
tags, something like:

<alist>
<entry>
<key>...</key>
<value>...</value>
</entry>
...
</alist>

This is surely a lot more verbose, but the structure of the data is made 
explicit by the document's tag structure. And assuming the existence of 
an XML "reader", each of these tags and their content would have 
specific semantics. In contrast, the sexp structured alist is left at 
the mercy of the application program to be treated appropriately.

Again, I'm not sure which would be better... I generally dislike strong 
typing, but this is not typing at a programmatic level. It is typing the 
data structures, which is in fact much more in the spirit of CL.

Sunil

Boris Schaefer wrote:

> Sunil Mishra <············@everest.com> writes:
> 
> | We may contrast this with sexp's in lisp, where through the READ
> | function the lisp can parse integers and strings and so on from the
> | sexp. The basic encoding is substantially richer when it comes to
> | describing data types, but is arguably poorer for differentiating
> | data roles. (We need to add symbols to our sexp's to express the
> | role of the sexp, whereas in XML this is encoded in the tag name.)
> 
> I don't understand what you mean here.  What "data role" does XML's
> tag name encode?  And from what other "data roles" do you want to
> distinguish it?

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Mon, 25 Sep 2000 00:00:00 +0000
Message-ID: <87aecww5oc.fsf@qiwi.uncommon-sense.net>

Sunil Mishra <············@everest.com> writes:

| Well, consider alists. They have an attribute/value structure. There
| is nothing in the alist that says it is an alist, we recognize them
| by convention. The list itself is thus untyped, which can be both
| good and bad.

Well, yes and no.  Being an alist is a structural property of a list
in Lisp.  You can always check whether an object is an alist or not.
No convention needed.  However, since being an alist is a structural
property, a program can never know whether some list was meant to be
treated as an alist or an ordinary list, or a tree, or whatever, unless
the programmer knew the semantics of the data beforehand and built the
program accordingly.  This is pretty much the same situation that
exists in XML.
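
A small sketch of the point, in Python with 2-tuples standing in for
conses.  The predicate name and the sample data are invented:

```python
def looks_like_alist(obj):
    """Structural test only: a list whose elements are all 2-tuples (stand-ins for conses)."""
    return isinstance(obj, list) and all(
        isinstance(entry, tuple) and len(entry) == 2 for entry in obj
    )

settings = [("width", 80), ("height", 24)]   # meant as key/value pairs
segment = [(0, 0), (3, 4)]                   # meant as two points on a line

print(looks_like_alist(settings), looks_like_alist(segment))
```

Both pass the structural check, yet only one was meant as an alist.
The check tells you about shape, not about intent.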

| In XML, you would probably have to encode the entire alist in
| specific tags, something like:
| 
| <alist>
| <entry>
| <key>...</key>
| <value>...</value>
| </entry>
| ...
| </alist>
| 
| This is surely a lot more verbose, but the structure of the data is
| made explicit by the document's tag structure. And assuming the
| existence of an XML "reader", each of these tags and their content
| would have specific semantics. In contrast, the sexp structured
| alist is left at the mercy of the application program to be treated
| appropriately.

Well, the XML structured alist above is left at the mercy of an XML
reader that knows that the above encodes an alist.  What's the
difference to the Lisp alist?

| Again, I'm not sure which would be better... I generally dislike strong 
| typing, but this is not typing at a programmatic level. It is typing the 
| data structures, which is in fact much more in the spirit of CL.

I think you mean _static_ typing.  Anyway, to me the above document is
not typed.  An XML reader could add type semantics, but XML itself
doesn't.

Boris

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

The famous politician was trying to save both his faces.

From: Sunil Mishra
Subject: Re: XML, Lisp and Meaning
Date: Mon, 25 Sep 2000 00:00:00 +0000
Message-ID: <39CFC641.7050704@everest.com>

Boris Schaefer wrote:

> Sunil Mishra <············@everest.com> writes:
> 
> | Well, consider alists. They have an attribute/value structure. There
> | is nothing in the alist that says it is an alist, we recognize them
> | by convention. The list itself is thus untyped, which can be both
> | good and bad.
> 
> Well, yes and no.  Being an alist is a structural property of a list
> in Lisp.  You can always check, whether an object is an alist or not.
> No convention needed.  However, since alists are a structural property
> a program can never know, whether some list was meant to be treated as
> an alist or an ordinary list, or a tree, or whatever, unless the
> programmer knew the semantics of the data beforehand and built the
> program accordingly.  This is pretty much the same situation that
> exists in XML.
> 
> | In XML, you would probably have to encode the entire alist in
> | specific tags, something like:
> | 
> | <alist>
> | <entry>
> | <key>...</key>
> | <value>...</value>
> | </entry>
> | ...
> | </alist>
> | 
> | This is surely a lot more verbose, but the structure of the data is
> | made explicit by the document's tag structure. And assuming the
> | existence of an XML "reader", each of these tags and their content
> | would have specific semantics. In contrast, the sexp structured
> | alist is left at the mercy of the application program to be treated
> | appropriately.
> 
> Well, the XML structured alist above is left at the mercy of an XML
> reader that knows that the above encodes an alist.  What's the
> difference to the Lisp alist?
> 
> | Again, I'm not sure which would be better... I generally dislike strong 
> | typing, but this is not typing at a programmatic level. It is typing the 
> | data structures, which is in fact much more in the spirit of CL.
> 
> I think you mean _static_ typing.  Anyway, to me the above document is
> not typed.  An XML reader could add type semantics, but XML itself
> doesn't.
> 
> Boris

As Marco has pointed out, and as I did not make explicit, one would need 
an XML reader geared toward the reading of alists to be able to magically 
do the right thing with my representation. I think I had previously 
stated (perhaps I failed to actually state it :-) ) that what XML lacks 
is standard reader syntax. As XML evolves we shall see some of these 
standard readers come into existence. SOAP and xschema are early efforts 
and I think we are going to see some rather interesting things happen.

The key point of my example still stands-- you don't have to look at the 
structure of the XML fragment to decide if it is an alist given the 
right reader. (After all, what is lisp syntax without the right reader?) 
With sexp's you would have no choice but to examine the structure and 
make an educated guess. You would likely have to add some decoration to 
the alist (such as adding a :alist keyword at the head of the data) to 
make the representation explicit.

So, the primary difference between XML and sexp's would be that there is 
no information *implicit* in the list that identifies it as an alist. In 
XML, such information can be encoded into the XML tags which at least 
makes available the possibility of the right XML reader automatically 
turning the document into the appropriate data structure. The current 
state of the technology is not as important to me, only the possibility 
of the existence of such a technology.
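
As a sketch of what such a reader might look like (in Python, and 
purely hypothetical: the READERS registry and the function names are 
mine, not part of any XML standard), one can dispatch on the tag name:

```python
import xml.etree.ElementTree as ET

def read_alist(elem):
    """Turn <alist><entry><key/><value/></entry>...</alist> into a list of pairs."""
    return [(e.findtext("key"), e.findtext("value")) for e in elem.findall("entry")]

# Hypothetical registry: the tag name selects the data structure to build.
READERS = {"alist": read_alist}

def read_xml(text):
    root = ET.fromstring(text)
    reader = READERS.get(root.tag)
    # Without a registered reader, all we can return is the bare parse tree.
    return reader(root) if reader else root

source = (
    "<alist>"
    "<entry><key>foo</key><value>bar</value></entry>"
    "<entry><key>foobar</key><value>baz</value></entry>"
    "</alist>"
)
print(read_xml(source))
```

The tag name is what lets the reader pick read_alist without 
inspecting the shape of the content first.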

Sunil

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Tue, 26 Sep 2000 00:38:33 +0000
Message-ID: <87wvg0vvg6.fsf@qiwi.uncommon-sense.net>

Sunil Mishra <············@everest.com> writes:

| So, the primary difference between XML and sexp's would be that there is 
| no information *implicit* in the list that identifies it as an alist. In 
| XML, such information can be encoded into the XML tags which at least 
| makes available the possibility of the right XML reader automatically 
| turning the document into the appropriate data structure. The current 
| state of the technology is not as important to me, only the possibility 
| of the existence of such a technology.

I think you're comparing apples to oranges.  You say nothing
*implicit* in a list identifies it as an alist and in the next
sentence you say that in XML this information can be *explicitly*
encoded in the tag name.

Of course explicit encoding with a tag is different from implicit
encoding, but what in the world stops you from using explicit
encoding in a Lisp-like syntax?  How is

((:alist) ((:entry) ((:key) "foo")
                    ((:value) "bar"))
          ((:entry) ((:key) "foobar")
                    ((:value) "baz")))

structurally different from the XML that you wrote?  Notice that in a
more Lisp-like interpretation of XML, you could even do the following:

((:alist) '("foo" . "bar") '("foobar" . "baz"))

and save you a *lot* of typing, without the danger of confusing
ordinary lists with alists.

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

"Every morning, I get up and look through the 'Forbes' list of the
richest people in America.  If I'm not there, I go to work"
		-- Robert Orben

From: Marco Antoniotti
Subject: Re: XML, Lisp and Meaning
Date: Mon, 25 Sep 2000 00:00:00 +0000
Message-ID: <y6c8zsgff4s.fsf@octagon.mrl.nyu.edu>

Sunil Mishra <············@everest.com> writes:

> Well, consider alists. They have an attribute/value structure. There is 
> nothing in the alist that says it is an alist, we recognize them by 
> convention. The list itself is thus untyped, which can be both good and bad.
> 
> In XML, you would probably have to encode the entire alist in specific 
> tags, something like:
> 
> <alist>
> <entry>
> <key>...</key>
> <value>...</value>
> </entry>
> ...
> </alist>
>
	...

> And assuming the existence of 
> an XML "reader", each of these tags and their content would have 
> specific semantics.

	...

Yes and no.  XML assumes that a *specific* ALISTML capable parser
will be available.  AFAIU, the same tag structure may not be
correctly "interpretable" by another XML reader if the proper DTD
stuff is not available.

Which kind of supports my claim that XML is the equivalent of Sexpr
for the rest of the world, in the sense of being a "universally
parsable external representation of whatever".  This is the key to its
true success.

> Again, I'm not sure which would be better... I generally dislike strong 
> typing, but this is not typing at a programmatic level. It is typing the 
> data structures, which is in fact much more in the spirit of CL.

Yes.  I agree.

Cheers


From: Karl Eichwalder
Subject: Re: XML, Lisp and Meaning
Date: Wed, 27 Sep 2000 02:31:42 +0000
Message-ID: <shg0mmin01.fsf@tux.gnu.franken.de>

Marco Antoniotti <·······@cs.nyu.edu> writes:

> AFAIU, the same tag structure may not be correctly
> "interpretable" by another XML reader if the proper dtd stuff is not
> available.

This is only true for SGML with some FEATURES enabled; XML is so
verbose that it's unambiguous (XML doesn't allow minimization or
omission, etc.).

-- 
work : ··@suse.de                          |          ------    ,__o
     : http://www.suse.de/~ke/             |         ------   _-\_<,
home : ·······@gmx.net                     |        ------   (*)/'(*)

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Wed, 27 Sep 2000 00:00:00 +0000
Message-ID: <87em262qn5.fsf@qiwi.uncommon-sense.net>

Karl Eichwalder <·······@gmx.net> writes:

| > AFAIU, the same tag structure may not be correctly
| > "interpretable" by another XML reader if the proper dtd stuff is not
| > available.
| 
| This is only true for SGML with some FEATURES enabled; XML is so
| verbose that it's unambiguous (XML doesn't allow minimization or
| omission, etc.).

("this" "is" "not" "interpretable" "by" "a" "Lisp" "program")

unless the Lisp program knows _beforehand_ what it should do with the
above.  It is however _readable_ with (read).  The same is true with
XML.  It is readable, but not interpretable without some further
knowledge beforehand.

The DTD stuff alone isn't sufficient either to interpret an arbitrary
XML document ... oh well, I gotta go work.

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

genealogy, n.:
	An account of one's descent from an ancestor
	who did not particularly care to trace his own.
		-- Ambrose Bierce

From: Tim Bradshaw
Subject: Re: XML, Lisp and Meaning
Date: Wed, 27 Sep 2000 00:00:00 +0000
Message-ID: <ey3d7hqb2zo.fsf@cley.com>

* Boris Schaefer wrote:

> ("this" "is" "not" "interpretable" "by" "a" "Lisp" "program")

> unless the Lisp program knows _beforehand_ what it should do with the
> above.  It is however _readable_ with (read).  The same is true with
> XML.  It is readable, but not interpretable without some further
> knowledge beforehand.

Yes, and the significant thing about XML (as I understand it, Erik for
instance might want to correct me!) is that it *is* readable without a
DTD -- you can read a document and build a corresponding parse tree
which you know is unique & correct without a DTD, although you
obviously don't know if it corresponds to any given DTD.  This is
*not* true for SGML in general -- without a DTD you may not even
be able to read the document, let alone establish a unique parse tree.
That is just as losing as it sounds as far as I can see.
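
A quick sketch of the XML half of this, using Python's standard
xml.etree.ElementTree parser.  The document is invented and declares no
DTD at all:

```python
import xml.etree.ElementTree as ET

def shape(elem):
    """Return the element structure as nested (tag, [children]) pairs."""
    return (elem.tag, [shape(child) for child in elem])

# No DOCTYPE, no DTD: well-formedness alone determines the tree.
text = "<memo><to>reader</to><body>See <ref id='a1'/> below.</body></memo>"
tree = shape(ET.fromstring(text))
print(tree)
```

The parser never needs to be told that ref is empty or where body ends;
the explicit end tags and the "/>" carry that.  With SGML minimization
those facts could live only in the DTD.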

--tim

From: Sunil Mishra
Subject: Re: XML, Lisp and Meaning
Date: Wed, 27 Sep 2000 00:00:00 +0000
Message-ID: <39D22AFE.9070201@everest.com>

That I think is an excellent observation... One of the big reasons for 
the evolution of XML AFAIR was the promise of being able to parse any 
XML document using some common semantics. That works fine, but a lot of 
people appear to have interpreted that position as being able to 
*interpret* any document. And then came the horror of imposing lots and 
lots of meta-information to make XML readable in the lisp sense, which 
most sane programming languages do by using double quotes for strings 
and so on. (I have yet to see anyone propose <foo type=string>This is 
the value of foo</foo> for defining a variable in a programming 
language. Lisp based on XML rather than sexp's, anyone?) Which is why I 
had initially taken a strong stance against XML. Now I'm not sure where 
I stand on the matter...

Sunil

Tim Bradshaw wrote:

> * Boris Schaefer wrote:
> 
> 
>> ("this" "is" "not" "interpretable" "by" "a" "Lisp" "program")
> 
> 
>> unless the Lisp program knows _beforehand_ what it should do with the
>> above.  It is however _readable_ with (read).  The same is true with
>> XML.  It is readable, but not interpretable without some further
>> knowledge beforehand.
> 
> 
> Yes, and the significant thing about XML (as I understand it, Erik for
> instance might want to correct me!) is that it *is* readable without a
> DTD -- you can read a document and build a corresponding parse tree
> which you know is unique & correct without a DTD, although you
> obviously don't know if it corresponds to any given DTD.  This is
> *not* true for SGML in general -- without a DTD you may not even
> be able to read the document, let alone establish a unique parse tree.
> That is just as losing as it sounds as far as I can see.
> 
> --tim

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Thu, 28 Sep 2000 00:00:00 +0000
Message-ID: <871yy4tmyr.fsf@qiwi.uncommon-sense.net>

Sunil Mishra <············@everest.com> writes:

| That I think is an excellent observation...  One of the big reasons
| for the evolution of XML AFAIR was the promise of being able to
| parse any XML document using some common semantics.  That works
| fine, but a lot of people appear to have interpreted that position
| as being able to *interpret* any document.

Well, as far as I remember, one of the original goals of SGML was that
document formats should be implementation independent to ensure the
longevity of the documents.  I believe XML shared this goal, at least
originally.  Maybe people have forgotten about it, or more probably,
they think they have reached that goal.

Now, how in the world can a document's interpretation be
implementation independent, if you can only interpret it if you know
something that is not included in the document, not in the DTD and not
in the XML standard?

I think the reason many people say that XML reaches this goal is that
they come up with examples like

  <table><entry key="foo">bar</entry></table>

and then they say, ``of course everybody can make sense of this''.
Well, maybe humans can; a machine can't, unless it is programmed
with some knowledge about what that should mean.
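
To illustrate with a sketch in Python: the two "applications" below are
both invented, and both accept the very same parse tree.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<table><entry key="foo">bar</entry></table>')

def as_lookup_table(table):
    """One application: 'table' is a mapping from keys to values."""
    return {e.get("key"): e.text for e in table.findall("entry")}

def as_furniture_inventory(table):
    """Another application: 'table' is furniture, 'key' opens a drawer."""
    return [
        f"table item {e.text!r}, opened with key {e.get('key')!r}"
        for e in table.findall("entry")
    ]

print(as_lookup_table(root))
print(as_furniture_inventory(root))
```

Nothing in the document, a DTD, or the XML standard says which of the
two readings is the intended one.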

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

Stenderup's Law:
    The sooner you fall behind, the more time you will have to catch up.

From: Barry Margolin
Subject: Re: XML, Lisp and Meaning
Date: Thu, 28 Sep 2000 00:00:00 +0000
Message-ID: <XXLA5.26$Ej2.260@burlma1-snr2>

In article <··············@qiwi.uncommon-sense.net>,
Boris Schaefer  <·····@uncommon-sense.net> wrote:
>Now, how in the world can a document's interpretation be
>implementation independent, if you can only interpret it if you know
>something that is not included in the document, not in the DTD and not
>in the XML standard.

The whole point of XML is to be an extensible, general-purpose
representation of real-world data.  It would be impossible for the XML
standard to contain the details of every application domain that it might
be applied to.

If it were in the DTD, the DTD would probably have to be a full-fledged,
general-purpose programming language.  Browsers and agents that deal with
XML documents would have to contain an interpreter/compiler for that
language.  But in order for this to be safe, it would need to be designed
and implemented similar to Java or Javascript, with lots of protection,
sandboxing, etc.; it would presumably suffer from the same security
problems that have plagued Java/Javascript over the years.

The alternative that's being used with XML is to keep the smarts in the
applications.  XML just serves as a common file format that happens to be
similar to HTML, so that web applications that don't support XML can also
make use of them.  S-expressions would probably be a nice file format if
HTML-compatibility weren't needed, but that's not the case.  It would have
been nice if Tim Berners-Lee had chosen Lisp, rather than SGML, as the
basis for web documents a decade ago, but he didn't and we're stuck with
his design choice now.


From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Thu, 28 Sep 2000 00:00:00 +0000
Message-ID: <87hf702rkc.fsf@qiwi.uncommon-sense.net>

Barry Margolin <······@genuity.net> writes:

| If it were in the DTD, the DTD would probably have to be a full-fledged,
| general-purpose programming language.  Browsers and agents that deal with
| XML documents would have to contain an interpreter/compiler for that
| language.  But in order for this to be safe, it would need to be designed
| and implemented similar to Java or Javascript, with lots of protection,
| sandboxing, etc.; it would presumably suffer from the same security
| problems that have plagued Java/Javascript over the years.

You may well be right about that.  Point taken.

| The alternative that's being used with XML is to keep the smarts in the
| applications.  XML just serves as a common file format that happens to be
| similar to HTML, so that web applications that don't support XML can also
| make use of them.  S-expressions would probably be a nice file format if
| HTML-compatibility weren't needed, but that's not the case.  It would have
| been nice if Tim Berners-Lee had chosen Lisp, rather than SGML, as the
| basis for web documents a decade ago, but he didn't and we're stuck with
| his design choice now.

I think we agree here.  XML is just a file format.  Nothing more,
nothing less.  And, as you said, we're pretty much stuck with it.  I
agree completely.

My point though is, that XML actually wasn't intended just as a plain
file format.  It was intended to give a document a life-span
independent of the programs to read it.  Well, I'm not sure about XML,
but as far as I know that was at least one of SGML's goals.

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

Life exists for no known purpose.

From: Tim Bradshaw
Subject: Re: XML, Lisp and Meaning
Date: Fri, 29 Sep 2000 00:14:45 +0000
Message-ID: <ey3og188362.fsf@cley.com>

* Boris Schaefer wrote:

> My point though is, that XML actually wasn't intended just as a plain
> file format.  It was intended to give a document a life-span
> independent of the programs to read it.  Well, I'm not sure about XML,
> but as far as I know that was at least one of SGML's goals.

I think you need to remember what the other options for document file
formats are.  Restricting ourselves to `text' documents (I fully
realise that SGML/XML can deal with many other kinds of object), you
immediately start looking at things like Word.  A Word document is in
an essentially entirely undocumented format, defined entirely by an
application, which is deliberately changed every year or so to
encourage upgrades.  In 30 years you are not going to have much chance
of being able to reliably decode a document in Word 95 format, unless
you are lucky enough to have a seriously antique machine around which
will run Windows 95, and the right CDROMS and so on.  At least with an
SGML document there is an ISO standard out there which will still
exist and be available, which tells you enough that you can write a
parser which you can use to unambiguously recover the structure.
That's a significant win.

--tim

From: Chris Page
Subject: Re: XML, Lisp and Meaning
Date: Tue, 03 Oct 2000 00:00:00 +0000
Message-ID: <B5FF0780.14C07%page@best-OMIT-THIS-.com>

in article ···············@cley.com, Tim Bradshaw at ···@cley.com wrote on
2000.09.28 5:14 PM:

> A Word document is in an essentially entirely undocumented format, defined
> entirely by an application, which is deliberately changed every year or so to
> encourage upgrades.

[Your point is valid, but I think this is a bad example. Pardon me while I
get "detail oriented" for a moment.]

I spent several years implementing file translators for word processors.
More than once during those years, Microsoft quite readily provided
documentation for various versions of the Word file format (as well as other
MS applications). I do not know if they still do this, however.

-- 
Chris Page                        *** MAKE ME MONEY FAST!                ***
Mac OS Guy                        *** SEND ME ALL YOUR MONEY, THEN SEND  ***
Palm, Inc.                        *** THIS MESSAGE TO EVERYONE YOU KNOW! ***

let mailto = concatenate( "page", ·@", "best.com" );

From: Espen Vestre
Subject: Re: XML, Lisp and Meaning
Date: Tue, 03 Oct 2000 00:00:00 +0000
Message-ID: <w6k8bqw4d6.fsf@wallace.ws.nextra.no>

Chris Page <····@best-OMIT-THIS-.com> writes:

> I spent several years implementing file translators for word processors.
> More than once during those years, Microsoft quite readily provided
> documentation for various versions of the Word file format (as well as other
> MS applications). I do not know if they still do this, however.

they even provide automatic translators to make older versions cooperate
reasonably with newer versions, BUT:

- the format is completely incomprehensible without a program to interpret
  it (whereas e.g. a LaTeX document or an XML document *not* generated by
  Word is to a large extent readable by a human not knowing the format).

- the way the format is used by most users, the documents become almost
  completely unstructured

- the documentation and even the plug-ins for older versions (or vice
  versa) are not very readily accessible to ordinary users.

  "Now, in the context of business (the chief market for Word) this
   sort of thing is only an annoyance - one of the routine hassles
   that go along with using computers. It's easy to buy a little file
   converter that will take care of this problem. But if you are a
   writer whose career is words, whose professional identity is a
   corpus of written documents, this kind of thing is extremely dis-
   quieting."

 (Neal Stephenson, in 'In the Beginning... Was the Command Line', 
  after describing his own experiences trying to read old Word 
  documents with a new version)

-- 
  (espen)

From: Barry Margolin
Subject: Re: XML, Lisp and Meaning
Date: Mon, 02 Oct 2000 00:00:00 +0000
Message-ID: <nb4C5.19$GM3.385@burlma1-snr2>

In article <··············@qiwi.uncommon-sense.net>,
Boris Schaefer  <·····@uncommon-sense.net> wrote:
>My point though is, that XML actually wasn't intended just as a plain
>file format.  It was intended to give a document a life-span
>independent of the programs to read it.  Well, I'm not sure about XML,
>but as far as I know that was at least one of SGML's goals.

This is true of both SGML and XML.  But the "meaning" of the file is still
intended to be described separately, in natural language oriented towards
the humans that write the programs that read it.

I remember when SGML first came out in the early 80's.  My impression was
that it was mainly intended at the time to replace text formatter source
file formats (e.g. nroff, TeX, Scribe) as well as all the proprietary word
processor file formats of the time.  Most of them used (and still use)
directives oriented towards specific layout, fonts, etc., although Scribe
was oriented towards logical organization and some of the more advanced
text formatters allowed definition of macros that can be used that way
(LaTeX is an example built on top of TeX, and Unix's man-page macros are an
example on top of nroff).  When GML went through the ANSI standardization
process and became SGML it got super-generalized so that it could
potentially be used for anything, hence its adoption as the underlying
basis of XML.

-- 
Barry Margolin, ······@genuity.net
Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

From: David Combs
Subject: Re: XML, Lisp and Meaning
Date: Sat, 07 Oct 2000 05:52:06 +0000
Message-ID: <8rmdm6$1pq$2@news.panix.com>

In article <················@burlma1-snr2>,
Barry Margolin  <······@genuity.net> wrote:
>In article <··············@qiwi.uncommon-sense.net>,
<SNIP>
>I remember when SGML first came out in the early 80's.  My impression was
>that it was mainly intended at the time to replace text formatter source
>file formats (e.g. nroff, TeX, Scribe) as well as all the proprietary word
...

Do you still use Scribe?

Do you know anyone who still does?

(I use it, and love it!)

There seems no way for us users to get together and
trade tricks...

David

From: Tim Bradshaw
Subject: Re: XML, Lisp and Meaning
Date: Thu, 28 Sep 2000 00:00:00 +0000
Message-ID: <ey3snqk88oa.fsf@cley.com>

* Boris Schaefer wrote:
> Well, as far as I remember, one of the original goals of SGML was that
> document formats should be implementation independent to ensure the
> longevity of the documents.  I believe XML shared this goal, at least
> originally.  Maybe people have forgotten about it, or more probably,
> they think they have reached that goal.

> Now, how in the world can a document's interpretation be
> implementation independent, if you can only interpret it if you know
> something that is not included in the document, not in the DTD and not
> in the XML standard.

SGML as such certainly has nothing to say about the *interpretation*
of the document, it merely describes the syntax, and has nothing to
say about semantics.

--tim

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Fri, 29 Sep 2000 01:30:37 +0000
Message-ID: <87d7hornlu.fsf@qiwi.uncommon-sense.net>

Tim Bradshaw <···@cley.com> writes:

| SGML as such certainly has nothing to say about the *interpretation*
| of the document, it merely describes the syntax, and has nothing to
| say about semantics.

Yes, I know it doesn't say anything about the semantics.  My point is
that, if one of SGML's goals was to improve implementation
independence and thus longevity of documents, then it _should_ have
said something about the semantics.

I _don't_ know, whether implementation independence and longevity
really was one of the main goals of SGML, but that's something I kept
hearing from different sources.

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

Aim for the moon.  If you miss, you may hit a star.
		-- W. Clement Stone

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Fri, 29 Sep 2000 01:40:20 +0000
Message-ID: <878zsc562j.fsf@qiwi.uncommon-sense.net>

Boris Schaefer <·····@uncommon-sense.net> writes:

| Yes, I know it doesn't say anything about the semantics.  My point is
| that, if one of SGML's goals was to improve implementation
| independence and thus longevity of documents, then it _should_ have
| said something about the semantics.
| 
| I _don't_ know, whether implementation independence and longevity
| really was one of the main goals of SGML, but that's something I kept
| hearing from different sources.

I'm replying to myself, because I just want to clarify something that
I think is not clear from most of my last posts.

Although I know that XML/SGML documents do not have a defined meaning,
I started out this thread questioning _what_ meaning an XML document
should have.  Although I am perfectly aware that I am all the time
arguing that XML documents have no inherent meaning, I think they
_should_ have one.  The question really is, how to specify this
meaning.  I don't really know an answer to that one and that's why I
started this thread.

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

By perseverance the snail reached the Ark.
		-- Charles Spurgeon

From: Sunil Mishra
Subject: Re: XML, Lisp and Meaning
Date: Thu, 28 Sep 2000 00:00:00 +0000
Message-ID: <39D3EF37.9000508@everest.com>

Meaning... It is a big bag of worms... Appears to have sunk a generation 
of companies (AI stuff from the 80's)... And in despair many, many 
researchers have turned to more amenable things, like finding patterns 
in 1000x1000 matrices. As you can tell, I'm rather skeptical that anyone 
will ever be able to introduce some unambiguous meaning into any kind of 
document. Even those written in English intended for a human audience 
are often misunderstood, so how can we expect to be able to encode 
something that squishy into a machine oriented document? Come to think 
of it, do we even want to? People have made careers (academic ones at 
least) in finding additional interpretations of existing documents 
(books, scientific papers, etc.) As human beings we routinely disregard 
the intended meaning of a document, depending on the task at hand. Why 
would we ever want to limit our software when it comes to working with 
XML documents?

Random ponderings from a mind deranged by a temporary move to Java...

Sunil

Boris Schaefer wrote:

> Boris Schaefer <·····@uncommon-sense.net> writes:
> 
> | Yes, I know it doesn't say anything about the semantics.  My point is
> | that, if one of SGML's goals was to improve implementation
> | independence and thus longevity of documents, then it _should_ have
> | said something about the semantics.
> | 
> | I _don't_ know, whether implementation independence and longevity
> | really was one of the main goals of SGML, but that's something I kept
> | hearing from different sources.
> 
> I'm replying to myself, because I just want to clarify something that
> I think is not clear from most of my last posts.
> 
> Although I know that XML/SGML documents do not have a defined meaning,
> I started out this thread questioning _what_ meaning an XML document
> should have.  Although I am perfectly aware that I am all the time
> arguing that XML documents have no inherent meaning, I think they
> _should_ have one.  The question really is, how to specify this
> meaning.  I don't really know an answer to that one and that's why I
> started this thread.

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Fri, 29 Sep 2000 00:00:00 +0000
Message-ID: <87n1grf7gr.fsf@qiwi.uncommon-sense.net>

Sunil Mishra <············@everest.com> writes:

| Meaning... It is a big bag of worms... Appears to have sunk a generation 
| of companies (AI stuff from the 80's)... And in despair many, many 
| researchers have turned to more amenable things, like finding patterns 
| in 1000x1000 matrices. As you can tell, I'm rather skeptical that anyone 
| will ever be able to introduce some unambiguous meaning into any kind of 
| document. Even those written in English intended for a human audience 
| are often misunderstood, so how can we expect to be able to encode 
| something that squishy into a machine oriented document?

Oh well, I wasn't talking about the meaning of the contents of the
document, just about the "meaning" of the document's structure.  See
my reply to Tim's post for details.

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

How you look depends on where you go.

From: Sunil Mishra
Subject: Re: XML, Lisp and Meaning
Date: Fri, 29 Sep 2000 00:00:00 +0000
Message-ID: <39D4D57A.50801@everest.com>

Boris Schaefer wrote:

> Sunil Mishra <············@everest.com> writes:
> 
> | Meaning... It is a big bag of worms... Appears to have sunk a generation 
> | of companies (AI stuff from the 80's)... And in despair many, many 
> | researchers have turned to more amenable things, like finding patterns 
> | in 1000x1000 matrices. As you can tell, I'm rather skeptical that anyone 
> | will ever be able to introduce some unambiguous meaning into any kind of 
> | document. Even those written in English intended for a human audience 
> | are often misunderstood, so how can we expect to be able to encode 
> | something that squishy into a machine oriented document?
> 
> Oh well, I wasn't talking about the meaning of the contents of the
> document, just about the "meaning" of the document's structure.  See
> my reply to Tim's post for details.

I shall posit that there isn't a significant difference between the two. 
Your example for alists for instance still suffers from problems. To 
copy-and-paste:

----
((document :name "my document name")
  ((section :name "my first section")
    ((paragraph) "this" "is" "one" "paragraph")
    ((paragraph) "this is another one"))
  ((section :name "my second section")
    ((paragraph) "This document's name is: "
                 (document-name))))

This could be reduced to:

(set-of
  "my document name"
  (list-of
    (set-of
      "my first section"
      (list-of (list-of "this" "is" "one" "paragraph")
               (list-of "this is another one")))
    (set-of
      "my second section"
      (list-of (list-of "This document's name is: "
                        "my document name")))))

----

You are assuming that the meaning of a document is self-evident from 
this structure you have described. To us indeed it is, we have grown up 
in an environment where this document has "the right meaning". Take this 
to your local neighborhood plumber and they might have a little more 
trouble with it. So, where's the meaning?

By asserting the existence of a DTD, you have merely pushed the problem 
one level of abstraction higher. Now it is the job of the DTD's consumer 
to figure out what the DTD is talking about. Most of us would actually 
perform rather poorly at a higher level of abstraction. The concrete 
instance you have presented above however is fairly straightforward.

When I see your attempt, I see someone abstracting their knowledge about 
a domain into a DTD, which *they* implicitly know is shared by the 
community at large, and (correctly) assuming that any conforming 
document will be correctly parsed. But that is not even remotely an 
absolute expression of meaning. This goal I believe is unattainable.

The best an XML reader can do is to ensure the document is structurally 
correct. Beyond that, it *must* be the application's task to figure out 
what to do with it. The author of the application generally decides how 
general the reader is going to be. A very specific reader would not 
understand anything but alists. A more general reader might be able to 
make sense of a number of different DTD's, perhaps relating them to each 
other in interesting ways, and even figuring out where in an HTML 
document a table tag is being used to represent a DTD. But this general 
program only has a more abstract notion of what the community's agreed 
representation of an alist is. Is such a program useful? Of course! 
Which is why I'm no longer doing research...
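
A minimal Common Lisp sketch of this division of labour (an illustration, not code from the thread). It assumes a simplified `(tag child ...)` notation and a hypothetical table `*allowed-children*` standing in for a DTD. The function can confirm that a document is structurally correct, and nothing more:

```lisp
;; Hypothetical stand-in for a DTD: each entry is an element name
;; followed by the kinds of children it may contain.  TEXT means a string.
(defparameter *allowed-children*
  '((document section)
    (section paragraph)
    (paragraph text)))

(defun child-kind (child)
  "Return TEXT for a string, otherwise the element name of CHILD."
  (if (stringp child) 'text (first child)))

(defun structurally-valid-p (node)
  "True if NODE and everything below it obeys *ALLOWED-CHILDREN*."
  (or (stringp node)
      (let ((entry (assoc (first node) *allowed-children*)))
        (and entry
             (every (lambda (child)
                      (and (member (child-kind child) (rest entry))
                           (structurally-valid-p child)))
                    (rest node))
             t))))

;; Usage:
;; (structurally-valid-p '(document (section (paragraph "hello"))))
;; (structurally-valid-p '(document (paragraph "hello")))
```

The first call passes and the second fails, but in neither case does the checker know what a section or a paragraph is for. A list of stock prices tagged with the same three names would pass just as well.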

Sunil

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Fri, 29 Sep 2000 00:00:00 +0000
Message-ID: <87og163mm9.fsf@qiwi.uncommon-sense.net>

Sunil Mishra <············@everest.com> writes:

| You are assuming that the meaning of a document is self-evident from
| this structure you have described.  To us indeed it is, we have
| grown up in an environment where this document has "the right
| meaning". Take this to your local neighborhood plumber and they
| might have a little more trouble with it. So, where's the meaning?

What does my local neighborhood plumber have to do with this?  Of
course somebody who does not know what a `set' is won't understand the
source code of the document, but what do I care?  I want an arbitrary
program to be able to interpret the language the document is written
in and do a decent job of rendering it in some way that someone (it
might be my local plumber) can understand the contents of the document
and also the relationships of the different parts of the document.

Let's see my example again:

((document :name "my document name")
  ((section :name "my first section")
    ((paragraph) "this" "is" "one" "paragraph")
    ((paragraph) "this is another one"))
  ((section :name "my second section")
    ((paragraph) "This document's name is: "
                 (document-name))))

which would reduce to:

(set-of
  "my document name"
  (list-of
    (set-of
      "my first section"
      (list-of (list-of "this" "is" "one" "paragraph")
               (list-of "this is another one")))
    (set-of
      "my second section"
      (list-of (list-of "This document's name is: "
                        "my document name")))))

What does a program have to know to give a reasonable representation
of this document?  It has to know what a `set' is and what a `list'
is.  Here are two example renderings of this document:

1.

   my document name

     my first section

     this is one paragraph

     this is another one

     my second section

     This document's name is: my document name

2.

  /-------------------------------------------------------------+
  |                                                             |
  |  +----------------+  +----------------------------------+   |
  |  |my document name|  |                                  |   |
  |  +----------------+  | +------------------------------+ |   |
  |                      | | +----------------+           | |   |
  |                      | | |my first section|           | |   |
  |                      | | +----------------+           | |   |
  |                      | |                              | |   |
  |                      | |                              | |   |
  |                      | | +--------------------------+ | |   |
  |                      | | | +----------------------+ | | |   |
  |                      | | | | this is one paragraph| | | |   |
  |                      | | | +----------------------+ | | |   |
  |                      | | |         |                | | |   |
  |                      | | |         V                | | |   |
  |                      | | | +---------------------+  | | |   |
  |                      | | | | this is another one |  | | |   |
  |                      | | | +---------------------+  | | |   |
  |                      | | +--------------------------+ | |   |
  |                      | +------------------------------+ |   |
  |                      |       |                          |   |
  |                      |       |                          |   |
  |                      |       V                          |   |
  |                      | +------------------------------+ |   |
  |                      | | +--------------------------+ | |   |
  |                      | | | +----------------------+ | | |   |
  |                      | | | | This document's name | | | |   |
  |                      | | | | is my document name  | | | |   |
  |                      | | | +----------------------+ | | |   |
  |                      | | +--------------------------+ | |   |
  |                      | |                              | |   |
  |                      | | +------------------+         | |   |
  |                      | | | my second section|         | |   |
  |                      | | +------------------+         | |   |
  |                      | |                              | |   |
  |                      | +------------------------------+ |   |
  |                      +----------------------------------+   |
  |                                                             |
  \-------------------------------------------------------------/

Both these renderings are vastly different, and which one is better
depends on what you want to do; for reading, probably the first is
better (especially, because the title of the second section appears
below the section's contents).
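
The two steps in this example, reducing the tagged form to sets and lists and then rendering those, can be sketched in Common Lisp. This is an illustration under stated assumptions, not code from the thread: the names `reduce-form` and `render-lines` are made up, and the reduction rules (a named element becomes a `set-of` its name and a `list-of` its children, an unnamed element becomes a plain `list-of`, and `(document-name)` is replaced by the enclosing document's name) are simply read off the two forms above.

```lisp
;; Sketch only: function names and rules are assumptions for illustration.

(defparameter *tagged-doc*
  '((document :name "my document name")
    ((section :name "my first section")
     ((paragraph) "this" "is" "one" "paragraph")
     ((paragraph) "this is another one"))
    ((section :name "my second section")
     ((paragraph) "This document's name is: "
      (document-name)))))

(defun reduce-form (form &optional doc-name)
  "Rewrite a tagged form into nested SET-OF / LIST-OF forms."
  (cond ((stringp form) form)
        ;; A reference back to the name of the enclosing document.
        ((equal form '(document-name)) doc-name)
        (t
         (destructuring-bind ((tag &key name) &rest children) form
           (let* ((doc-name (if (eq tag 'document) name doc-name))
                  (body (cons 'list-of
                              (mapcar (lambda (child)
                                        (reduce-form child doc-name))
                                      children))))
             ;; Named elements pair their name with their contents.
             (if name (list 'set-of name body) body))))))

(defun render-lines (form)
  "Return the text lines of a reduced form, knowing only sets and lists."
  (cond ((stringp form) (list form))
        ;; A LIST-OF holding only strings is read as running text.
        ((and (eq (first form) 'list-of)
              (every #'stringp (rest form)))
         (list (format nil "~{~A~^ ~}"
                       (mapcar (lambda (s) (string-trim " " s))
                               (rest form)))))
        ;; Otherwise emit the members in the order they were written.
        ((member (first form) '(list-of set-of))
         (loop for child in (rest form)
               append (render-lines child)))
        (t (error "Unknown form: ~S" form))))

;; Usage:
;; (render-lines (reduce-form *tagged-doc*))
```

Printing the resulting lines one per paragraph gives something close to the first rendering. Indentation and boxes are left out; those are layout choices a particular renderer would add on top.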

| By asserting the existence of a DTD, you have merely pushed the
| problem one level of abstraction higher. Now it is the job of the
| DTD's consumer to figure out what the DTD is talking about.  Most of
| us would actually perform rather poorly at a higher level of
| abstraction.  The concrete instance you have presented above however
| is fairly straightforward.

The point is _not_ that arbitrary humans should be able to understand
the DTD.  The point is that a machine can understand the DTD, so that
it can give reasonable representations of documents, without any
further knowledge, except for the DTD and some basic data types like
sets, lists, tables and so on.

| When I see your attempt, I see someone abstracting their knowledge
| about a domain into a DTD, which *they* implicitly know is shared by
| the community at large, and (correctly) assuming that any conforming
| document will be correctly parsed. But that is not even remotely an
| absolute expression of meaning. This goal I believe is unattainable.

I don't know what an _absolute representation of meaning_ is, but I am
sure it does not exist.  I think what you see as meaning is what I
call the _human_ interpretation.  Of course humans will interpret a
given document differently, some people might not understand what the
second rendering above should mean.  I suppose many people would think
because of the many boxes that there is more to it than there really
is.  I think a renderer that uses the second form of representation
should allow people to remove some unnecessary boxes to reduce
clutter.  Such a renderer might also allow the user to display the
original structuring tags with the boxes so that it would look more or
less like this:

           +--(section:contents)------+
           |                          |
           | +-(paragraph)----------+ |
           | | this is one paragraph| |
           | +----------------------+ |
           |         |                |
           |         V                |
           | +-(paragraph)---------+  |
           | | this is another one |  |
           | +---------------------+  |
           |                          |
           +--------------------------+

Which is helpful if people don't understand the rendering, or they
find it overly complicated and they might want to say something like:
"join all linked paragraphs" so that the above would become:

           +--(section:contents)------+
           |                          |
           | this is one paragraph    |
           |                          |
           | this is another one      |
           |                          |
           +--------------------------+

There is probably some middle ground for a renderer to give an
acceptable representation that is readable and where the relationships
of the data are still mostly recognizable.

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

I read the newspaper avidly.  It is my one form of continuous fiction.
		-- Aneurin Bevan

From: Karl Eichwalder
Subject: Re: XML, Lisp and Meaning
Date: Sat, 30 Sep 2000 04:46:46 +0000
Message-ID: <shu2ay8p1l.fsf@tux.gnu.franken.de>

Boris Schaefer <·····@uncommon-sense.net> writes:

> without any further knowledge, except for the DTD and some basic data
> types like sets, lists, tables and so on.

You can always use Architectural Forms Processing to map special
elements (HTML example: <ol>, <ul>) to more generic elements (<a-list>);
or you can add an attribute (<ol class="list">).  Cf. David Megginson's
book on XML (Prentice Hall).

SGML offers the info you need to do the different renderings you
described.
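
As a loose analogy only (this is not the actual Architectural Forms mechanism, and nothing here comes from the thread), the idea of mapping specific elements onto generic ones can be sketched in Common Lisp over a simplified `(tag child ...)` tree. The table `*generic-names*` and the function `generalize` are hypothetical names:

```lisp
;; Hypothetical table: specific element name -> generic element name.
(defparameter *generic-names*
  '((ol . a-list)
    (ul . a-list)))

(defun generalize (tree)
  "Replace every element name that has a generic equivalent."
  (if (consp tree)
      (cons (or (cdr (assoc (first tree) *generic-names*))
                (first tree))
            (mapcar #'generalize (rest tree)))
      tree))

;; Usage:
;; (generalize '(body (ol (li "one") (li "two")) (ul (li "x"))))
```

The mapping table still has to be supplied by someone; the sketch only shows that once it exists, applying it is mechanical.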

-- 
work : ··@suse.de                          |          ------    ,__o
     : http://www.suse.de/~ke/             |         ------   _-\_<,
home : ·······@gmx.net                     |        ------   (*)/'(*)

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Sat, 30 Sep 2000 00:00:00 +0000
Message-ID: <87wvfufcqi.fsf@qiwi.uncommon-sense.net>

Karl Eichwalder <·······@gmx.net> writes:

| You can always use Architectural Forms Processing to map special
| elements (HTML example: <ol>, <ul>) to more generic elements
| (<a-list>); or you can add an attribute (<ol class="list">).
| Cf. David Megginson's book on XML (Prentice Hall).

I can do many things, but SGML per se does not specify what I can do,
so *my* application may be able to do this, but another application
that does not know about this _beforehand_ will not be able to do
this.  That's my entire point.

| SGML offers the info you need to do the different renderings you
| described.

No, it does not.  How an SGML reader interprets the resulting
structure is entirely up to the application, thus, the information
does not come from SGML, but from the programmer.

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

Two wrights don't make a rong, they make an airplane.  Or bicycles.

From: Barry Margolin
Subject: Re: XML, Lisp and Meaning
Date: Mon, 02 Oct 2000 00:00:00 +0000
Message-ID: <zr4C5.20$GM3.508@burlma1-snr2>

In article <··············@qiwi.uncommon-sense.net>,
Boris Schaefer  <·····@uncommon-sense.net> wrote:
>What does a program have to know to give a reasonable representation
>of this document?  It has to know what a `set' is and what a `list'
>is.  Here are two example renderings of this document:
...
>Both these renderings are vastly different, and which one is better
>depends on what you want to do; for reading, probably the first is
>better (especially, because the title of the second section appears
>below the section's contents).

Both of the renderings you showed make the bold assumption that the
document represents text that should be formatted.  It could just as easily
be stock market data, prices in an online catalogue, pieces of the human
genome, etc.  In order to process the data in any useful way, the program
has to know what its intent is, and current technology pretty much requires
that to be hard-coded into the program, it can't be represented in the DTD.

And once you realize this, the particular file format (S-expression
vs. XML) becomes a pretty unimportant detail.  For any popular, open file
format there will generally be libraries that can translate it into an
internal, structured format isomorphic to your LIST-OF and SET-OF lists,
and then the application can do what it wishes with that.
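
To illustrate how thin the difference between the two notations is, here is a minimal Common Lisp sketch (not from the thread; `to-xml` is a made-up name) that writes a simplified `(tag child ...)` tree as XML text. It assumes elements without attributes and does no escaping of `<` or `&`:

```lisp
(defun to-xml (node)
  "Return NODE, a string or a (tag child ...) list, as XML text.
Sketch only: no attributes, no escaping of special characters."
  (if (stringp node)
      node
      (format nil "<~(~A~)>~{~A~}</~(~A~)>"
              (first node)
              (mapcar #'to-xml (rest node))
              (first node))))

;; Usage:
;; (to-xml '(section (paragraph "this is one paragraph")
;;                   (paragraph "this is another one")))
```

Going the other way needs a real XML parser, but the tree it hands back has the same shape, and either way the application still has to decide what a section or a paragraph means.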

-- 
Barry Margolin, ······@genuity.net
Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Wed, 04 Oct 2000 00:00:00 +0000
Message-ID: <87g0mc4jl0.fsf@qiwi.uncommon-sense.net>

Barry Margolin <······@genuity.net> writes:

| Both of the renderings you showed make the bold assumption that the
| document represents text that should be formatted.  It could just as
| easily be stock market data, prices in an online catalogue, pieces
| of the human genome, etc.  In order to process the data in any
| useful way, the program has to know what its intent is, and current
| technology pretty much requires that to be hard-coded into the
| program, it can't be represented in the DTD.

I don't know exactly what you mean by "processing the data in any
useful way", but it doesn't sound like "formatting it into some
human-readable form".  Of course a program that receives a piece of
DNA in my example format, won't be able to do anything on that piece
of data that requires knowledge of genetics.  It also won't be able to
generate a double-helix rendering or something like that.

But maybe it could still capture the important relationships between
the data, so that some rendering could be generated that a human could
understand.  I am not sure about this, though, possibly because I
don't know much about genetics and what would constitute a "rendering
a human could understand" in this domain.

I see a difference between presenting data and "doing useful stuff"
with the data (well, presenting data is useful, but I think you know
what I mean).  You may be able to give an at least somewhat reasonable
presentation of some piece of data without any specialized knowledge
of the domain.  "Doing useful stuff", though, usually requires this
knowledge.

Having not thought much about this, I don't know whether this
separation is a good idea actually.  It might be very stupid.

So, yes, I made a bold assumption in that the data is something to be
formatted; about the assumption that it is text, I'm not so sure.

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

Nobody said computers were going to be polite.

From: Sunil Mishra
Subject: Re: XML, Lisp and Meaning
Date: Tue, 03 Oct 2000 00:00:00 +0000
Message-ID: <39DA2636.6090909@everest.com>

Sorry, I've been away from the ng for a few days, and the thread has 
become a little stale. I shall attempt a reasonable response :-)

Let me recap my arguments, so that I know where I stand on the matter 
:-) For me this thread started off with the question of whether XML was 
a reasonable medium for representing the semantics of a document. I've 
claimed that XML, sexp's, or any other format, cannot represent meaning 
in any reasonable way. The only thing that a particular document format 
can do is to make it easier or harder to get at the syntactic structure 
of the document. From there on the application programmer ensures that 
the program's manipulations preserve and/or enrich the content the 
document conveys. Only the final consumer of the document 
can read meaning into the output form. If we generalize the document 
structure through some meta-information (a DTD in the case of XML/SGML), 
then the application has to know how to interpret the symbols in the 
meta-information and apply the appropriate processing to an input 
document. There is still no concept of meaning anywhere in the document 
or the application.

Having gone over the last couple of your posts, I think we have fallen 
into an argument over terminology. To me what you are describing is a 
processing model. It has relatively little to do with the document's 
meaning per se, from the point of view of content or structure.

Boris Schaefer wrote:

> Sunil Mishra <············@everest.com> writes:
> 
> | You are assuming that the meaning of a document is self-evident from
> | this structure you have described.  To us indeed it is, we have
> | grown up in an environment where this document has "the right
> | meaning". Take this to your local neighborhood plumber and they
> | might have a little more trouble with it. So, where's the meaning?
> 
> What has my local neighborhood plumber have to do with this?  Of
> course somebody who does not know what a `set' is won't understand the
> source code of the document, but what do I care.  I want that an
> arbitrary program can interpret the language the document is written
> in and do a decent job of rendering it in some way that someone (it
> might be my local plumber) can understand the contents of the document
> and also the relationships of the different parts of the document.

The source is not for your plumber to understand. But even the concept 
of a set is a relatively sophisticated one. I expand on this below.

> Let's see my example again:
> 
> ((document :name "my document name")
>   ((section :name "my first section")
>     ((paragraph) "this" "is" "one" "paragraph")
>     ((paragraph) "this is another one"))
>   ((section :name "my second section")
>     ((paragraph) "This document's name is: "
>                  (document-name))))
> 
> which would reduce to:
> 
> (set-of
>   "my document name"
>   (list-of
>     (set-of
>       "my first section"
>       (list-of (list-of "this" "is" "one" "paragraph")
>                (list-of "this is another one")))
>     (set-of
>       "my second section"
>       (list-of (list-of "This document's name is: "
>                         "my document name")))))
> 
> What does a program have to know to give a reasonable representation
> of this document?  It has to know what a `set' is and what a `list'
> is.  Here are two example renderings of this document:
> 
> 1.
> 
>    my document name
> 
>      my first section
> 
>      this is one paragraph
> 
>      this is another one
> 
>      my second section
> 
>      This document's name is: my document name
> 
> 2.
> 
>   /-------------------------------------------------------------+
>   |                                                             |
>   |  +----------------+  +----------------------------------+   |
>   |  |my document name|  |                                  |   |
>   |  +----------------+  | +------------------------------+ |   |
>   |                      | | +----------------+           | |   |
>   |                      | | |my first section|           | |   |
>   |                      | | +----------------+           | |   |
>   |                      | |                              | |   |
>   |                      | |                              | |   |
>   |                      | | +--------------------------+ | |   |
>   |                      | | | +----------------------+ | | |   |
>   |                      | | | | this is one paragraph| | | |   |
>   |                      | | | +----------------------+ | | |   |
>   |                      | | |         |                | | |   |
>   |                      | | |         V                | | |   |
>   |                      | | | +---------------------+  | | |   |
>   |                      | | | | this is another one |  | | |   |
>   |                      | | | +---------------------+  | | |   |
>   |                      | | +--------------------------+ | |   |
>   |                      | +------------------------------+ |   |
>   |                      |       |                          |   |
>   |                      |       |                          |   |
>   |                      |       V                          |   |
>   |                      | +------------------------------+ |   |
>   |                      | | +--------------------------+ | |   |
>   |                      | | | +----------------------+ | | |   |
>   |                      | | | | This document's name | | | |   |
>   |                      | | | | is my document name  | | | |   |
>   |                      | | | +----------------------+ | | |   |
>   |                      | | +--------------------------+ | |   |
>   |                      | |                              | |   |
>   |                      | | +------------------+         | |   |
>   |                      | | | my second section|         | |   |
>   |                      | | +------------------+         | |   |
>   |                      | |                              | |   |
>   |                      | +------------------------------+ |   |
>   |                      +----------------------------------+   |
>   |                                                             |
>   \-------------------------------------------------------------/
> 
> Both these renderings are vastly different, and which one is better
> depends on what you want to do; for reading, probably the first is
> better (especially, because the title of the second section appears
> below the section's contents).

My objections come from studies in perception, reasoning etc. The most 
profound bit of knowledge that I acquired in graduate school was that 
logical reasoning is thoroughly unnatural. Modus ponens (given that "a 
implies b" is true, and "a" is true, then "b" must be true) is a learned 
trick. Even basic geometric reasoning (such as depth perception) is 
unnatural to those that have grown up in claustrophobic environments, 
such as a dense equatorial rain forest. What I see in your argument is 
the claim that anyone will be able to translate a set of lines with 
relative indentation into structured bits of data. That is true within 
the context of the US, certainly, and most likely a large majority of 
the world's population. Which is the population that you are most likely 
concerned with.

What I'm trying to emphasize is that you have merely transformed 
representations. You have an input format, a transformation model, and 
one or more output formats. To me there is no self-evident meaning in 
any of this, which I suspect you might agree with. But I've found the 
implications of this knowledge often appear to be tossed aside in the 
course of writing a program. Let me stress again that I'm not saying 
your program isn't going to be useful or interesting, only that imbuing 
either your source, your transformations, or your output with meaning is 
dangerous.

Sunil

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Wed, 04 Oct 2000 00:00:00 +0000
Message-ID: <87bsx04hv7.fsf@qiwi.uncommon-sense.net>

Sunil Mishra <············@everest.com> writes:

| Sorry, I've been away from the ng for a few days, and the thread has 
| become a little stale. I shall attempt a reasonable response :-)

So shall I.

| I've claimed that XML, sexp's, or any other format, cannot represent
| meaning in any reasonable way. The only thing that a particular
| document format can do is to make processing easier or harder to get
| at the syntactic structure of the document.

Well, I disagree partially.  The format itself is only structure, but
you can assign semantics to structural elements, whether this _is_
meaning itself or not, I don't care.  Let's say that structure with
semantics has _more_ meaning than just structure.

Let me quote Tim Bradshaw from a while ago:

# If you start worrying about just what exactly it means to `have an
# intent' or `have a meaning' you will rapidly fall into a quagmire of
# philosophy and probably be doomed to spend the rest of your life as an
# embittered cognitive scientist or something. But you can stay away
# from that by asking much more specific questions.
# 
# Take the string.
# 
# 	"(lambda () (let ((x '(1 2))) (car x)))"
# 
# Then there are several things you can do:
# 
# 	READ (well, READ-FROM-STRING) will accept this and return an
# 	object.  So you know that it's well-formed as a lisp form.
# 
# 	COMPILE (something like (compile nil ...)) will accept what
# 	READ gave you and return another object.  So you know that
# 	it's well-formed as a lisp program.
# 
# 	FUNCALL will accept that object and return 1.  So you know
# 	that that lisp program actually does something.
# 
# Cognitive scientists will disagree with all this, because CL probably
# isn't formally enough specified, but I don't care about them.  And
# language lawyers will point out that you have to be in the right
# package and the readtable has to be sane, and I've carefully chosen
# the string not to have anything that might be a macro in it which
# makes it possibly-indeterminate whether it's a well-formed program,
# but I don't care about them either.
# 
# Now the point is that SGML and XML only give you the first two stages,
# at best.  In fact I think they give only partial bits of them:
# 
# 	I'm not sure (someone will know) if either assign a structure
# 	to the string rather than just saying that it's well-formed.
# 	I presume they do.
# 
# 	SGML only gives you the second stage in general: without the
# 	grammar, you can't even tell if a string is readable the way
# 	you can in Lisp.  XML, I think, aimed to give both first and
# 	second stages, so you should be able to check an XML document
# 	for first-stage well-formedness even without a grammar.  I
# 	don't know if it succeeds.
# 
# 
# All this will not satisfy people who care about formal semantics and
# so on.  All I'm trying to get at is that it's clear that Lisp programs
# do have a whole bunch more `meaning' than *ML documents, in a sense
# that can be made formal.
# 
# --tim
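
Tim's three stages are easy to try at a REPL.  A minimal sketch,
assuming a conforming Common Lisp with a sane readtable; the variable
names *form* and *fn* are mine, chosen only for illustration:

```lisp
;; Stage 1: READ-FROM-STRING turns the text into an ordinary list object.
(defparameter *form*
  (read-from-string "(lambda () (let ((x '(1 2))) (car x)))"))

;; Stage 2: COMPILE accepts that lambda expression and returns a function.
(defparameter *fn* (compile nil *form*))

;; Stage 3: FUNCALL runs the function.
(funcall *fn*)   ; => 1
```

An XML parser gets you roughly as far as *form*; nothing in XML itself
plays the role of COMPILE or FUNCALL.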

And in exactly this sense, my example has _more_ meaning than ordinary
XML documents.

 ((document :name "foo") "bar")

will return an object of type `set' with the elements "foo" and "bar".
The only thing that this object does not have is an external
representation, but that would be the task of the renderer.

Maybe I'm just confusing meaning, semantics and interpretation all the
time, but I'm not a cognitive scientist, and as far as I know
cognitive scientists also don't all agree on the meaning of these
terms.

| What I'm trying to emphasize is that you have merely transformed 
| representations. You have an input format, a transformation model, and 
| one or more output formats. To me there is no self-evident meaning in 
| any of this, which I suspect you might agree with.

Self-evident meaning, probably not.
More meaning than without the transformations, yes.

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

What on earth would a man do with himself if something did not stand in his way?
		-- H.G. Wells

From: Sunil Mishra
Subject: Re: XML, Lisp and Meaning
Date: Thu, 05 Oct 2000 00:00:00 +0000
Message-ID: <39DCBBB2.3090100@everest.com>

> Well, I disagree partially.  The format itself is only structure, but
> you can assign semantics to structural elements, whether this _is_
> meaning itself or not, I don't care.  Let's say that structure with
> semantics has _more_ meaning than just structure.

Well, at least now I understand where you are coming from; until now I 
was not entirely sure :-)

> 
> Let me quote Tim Bradshaw from a while ago:
> 
> # If you start worrying about just what exactly it means to `have an
> # intent' or `have a meaning' you will rapidly fall into a quagmire of
> # philosophy and probably be doomed to spend the rest of your life as an
> # embittered cognitive scientist or something. But you can stay away
> # from that by asking much more specific questions.

An "embittered cognitive scientist" was exactly where I was heading 
before I decided that path would turn out to be ultimately less 
interesting :-)


> 
> And in exactly this sense, my example has _more_ meaning than ordinary
> XML documents.
> 
>  ((document :name "foo") "bar")
> 
> will return an object of type `set' with the elements "foo" and "bar".
> The only thing that this object does not have is an external
> representation, but that would be the task of the renderer.
> 
> Maybe I'm just confusing meaning, semantics and interpretation all the
> time, but I'm not a cognitive scientist, and as far as I know
> cognitive scientists also don't all agree on the meaning of these
> terms.
> 
> | What I'm trying to emphasize is that you have merely transformed 
> | representations. You have an input format, a transformation model, and 
> | one or more output formats. To me there is no self-evident meaning in 
> | any of this, which I suspect you might agree with.
> 
> Self-evident meaning, probably not.
> More meaning than without the transformations, yes.

May I suggest the phrase "easier for humans to interpret" rather than 
"more meaning"? Meaning is *most definitely* a loaded term, and I would 
recommend you avoid it to avoid lengthy debates :-)

Sunil

From: Espen Vestre
Subject: Re: XML, Lisp and Meaning
Date: Fri, 29 Sep 2000 00:00:00 +0000
Message-ID: <w6ya0bmv4v.fsf@wallace.ws.nextra.no>

Boris Schaefer <·····@uncommon-sense.net> writes:

> Although I am perfectly aware that I am all the time
> arguing that XML documents have no inherent meaning, I think they
> _should_ have one.

When I first heard about XML, it didn't even occur to me that it could
be used as a pure *data* format, which is, IMHO, not always a good
idea, not only because s-exprs are better, but because it gets so
voluminous (one of the examples which really disturbs me is the 'ip
call record' work going on, where you get extremely voluminous
representations of records which are created in enormous volumes by
large ISPs - i.e. which really call for extremely compact
representation and *very* fast parsing).

I rather thought: Hey, with XML, you can tag words and sentences in
*written documents* with hints on their *semantics and pragmatics*.
The meanings of those tags would then be agreed upon within an 
application domain, but would only really be describable with *words*.
Actual applications could only implement a facet of that meaning
(at least before we have found the Holy Grail Of AI), but since
*humans* would know what the intention of the tag was, they would
still be usable to the applications for specific tasks.
-- 
  (espen)

From: Tim Bradshaw
Subject: Re: XML, Lisp and Meaning
Date: Fri, 29 Sep 2000 00:00:00 +0000
Message-ID: <ey3k8bv8qe5.fsf@cley.com>

* Boris Schaefer wrote:

> Although I know that XML/SGML documents do not have a defined meaning,
> I started out this thread questioning _what_ meaning an XML document
> should have.  Although I am perfectly aware that I am all the time
> arguing that XML documents have no inherent meaning, I think they
> _should_ have one.  The question really is, how to specify this
> meaning.  I don't really know an answer to that one and that's why I
> started this thread.

I think you've probably answered your own question.  I suspect that
the SGML people would have liked to be able to specify some kind of
meaning.  But, given the range of possible applications of SGML, they
were looking at a vast open research problem.  Instead they opted to
specify syntax, which was at least something that could be done (and
even that was almost too hard, witness the complexity of SGML).

To bring this back in context, it's probably not insignificant that
almost no programming language designs do any kind of formal
specification of meaning: it's just too hard.

--tim

From: Philip Lijnzaad
Subject: Re: XML, Lisp and Meaning
Date: Fri, 29 Sep 2000 00:00:00 +0000
Message-ID: <u766nfv4k2.fsf@o2-3.ebi.ac.uk>

Tim> To bring this back in context, it's probably not insignificant that
Tim> almost no programming language designs do any kind of formal
Tim> specification of meaning: it's just too hard.

I vaguely remember that there was an (to me utterly incomprehensible) attempt
to define semantics for Scheme; wasn't this at one point even an appendix to
some R^nR report? Or it may have been in connection with security certifying
some scheme implementation.

Incidentally, I find it interesting that this thread keeps talking about
meaning rather than semantics. The former, to me, is something that by
definition only can exist in wetware. Kind of like knowledge <-> information. 

                                                                      Philip
-- 
Time passed, which, basically, is its job. -- Terry Pratchett (in: Equal Rites)
-----------------------------------------------------------------------------
Philip Lijnzaad, ········@ebi.ac.uk \ European Bioinformatics Institute,rm A2-24
+44 (0)1223 49 4639                 / Wellcome Trust Genome Campus, Hinxton
+44 (0)1223 49 4468 (fax)           \ Cambridgeshire CB10 1SD,  GREAT BRITAIN
PGP fingerprint: E1 03 BF 80 94 61 B6 FC  50 3D 1F 64 40 75 FB 53

From: Espen Vestre
Subject: Re: XML, Lisp and Meaning
Date: Fri, 29 Sep 2000 00:00:00 +0000
Message-ID: <w666nfl9qo.fsf@wallace.ws.nextra.no>

Philip Lijnzaad <········@ebi.ac.uk> writes:

> Incidentally, I find it interesting that this thread keeps talking about
> meaning rather than semantics. The former, to me, is something that by
> definition only can exist in wetware. Kind of like knowledge <-> information. 

In some theories of semantics, the 'interpretation', but not the
'meaning' is the wetware, e.g. in situation semantics, 'meaning' is
the relation between the expressions, the circumstances and the real
world. So I think the use of the word 'meaning' as it has been used
here is reasonable.
-- 
  (espen)

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Fri, 29 Sep 2000 00:00:00 +0000
Message-ID: <87r963f7op.fsf@qiwi.uncommon-sense.net>

Philip Lijnzaad <········@ebi.ac.uk> writes:

| I vaguely remember that there was an (to me utterly
| incomprehensible) attempt to define semantics for Scheme; wasn't this
| at one point even an appendix to some R^nR report? Or it may have
| been in connection with security certifying some scheme
| implementation.

Yes, R5RS (probably earlier RxRS's as well) has an appendix specifying
the formal semantics.  Tim's point is valid though, _almost_ no
languages have that.

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

And that's the way it is...
		-- Walter Cronkite

From: Barry Margolin
Subject: Re: XML, Lisp and Meaning
Date: Mon, 02 Oct 2000 00:00:00 +0000
Message-ID: <DE4C5.21$GM3.474@burlma1-snr2>

In article <··············@qiwi.uncommon-sense.net>,
Boris Schaefer  <·····@uncommon-sense.net> wrote:
>Philip Lijnzaad <········@ebi.ac.uk> writes:
>
>| I vaguely remember that there was an (to me utterly
>| incomprehensible) attempt to define semantics for Scheme; wasn't this
>| at one point even an appendix to some R^nR report? Or it may have
>| been in connection with security certifying some scheme
>| implementation.
>
>Yes, R5RS (probably earlier RxRS's as well) has an appendix specifying
>the formal semantics.  Tim's point is valid though, _almost_ no
>languages have that.

Because writing denotational semantics is very hard.  It can generally be
done only with small languages that don't have very complex semantics.
Scheme has a very minimal core, upon which just about everything else is
based, so it's amenable to this reduction (X3J13 considered doing a DS for
Common Lisp, but the only ones among us who were familiar enough to attempt
it realized that it would be a practically impossible task).  And the DS
doesn't even attempt to address some issues, such as the distinction
between exact and inexact numbers -- all it really defines is the execution
and memory model, while other things are taken as axioms that are built
into DS (e.g. what it means to add numbers).

-- 
Barry Margolin, ······@genuity.net
Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

From: Lieven Marchand
Subject: Re: XML, Lisp and Meaning
Date: Fri, 29 Sep 2000 00:00:00 +0000
Message-ID: <m3zokrxhuq.fsf@localhost.localdomain>

Philip Lijnzaad <········@ebi.ac.uk> writes:

> I vaguely remember that there was an (to me utterly incomprehensible) attempt
> to define semantics for Scheme; wasn't this at one point even an appendix to
> some R^nR report? Or it may have been in connection with security certifying
> some scheme implementation.

The denotational semantics are still a part of R^nRS. If you know the
notation, it's not that bad. It's basically an interpreter for an
abstract syntax tree of the language, written in pure lambda calculus
with both the interaction with memory and the continuation explicitly
represented. It would be fairly easy to derive a (very very very) slow
interpreter for the language from it.
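
To give a flavour of that style, here is a toy evaluator of my own in
Common Lisp.  It is only a sketch and is not the R^nRS equations: it
covers numbers, variables, binary +, one-argument LAMBDA and
application, and it makes the continuation explicit but leaves the
store out entirely.  The name MEANING is an assumption for
illustration:

```lisp
;; meaning : expression x environment x continuation -> answer
;; ENV is an alist of (symbol . value); K is a one-argument function.
(defun meaning (expr env k)
  (cond ((numberp expr) (funcall k expr))
        ((symbolp expr) (funcall k (cdr (assoc expr env))))
        ((eq (first expr) 'lambda)
         (destructuring-bind (params body) (rest expr)
           ;; A procedure value takes its argument AND a continuation.
           (funcall k (lambda (arg k2)
                        (meaning body (acons (first params) arg env) k2)))))
        ((eq (first expr) '+)
         (meaning (second expr) env
                  (lambda (a)
                    (meaning (third expr) env
                             (lambda (b) (funcall k (+ a b)))))))
        (t
         ;; Application: evaluate operator, then operand, then call.
         (meaning (first expr) env
                  (lambda (f)
                    (meaning (second expr) env
                             (lambda (arg) (funcall f arg k))))))))

(meaning '((lambda (x) (+ x 1)) 41) '() #'identity)   ; => 42
```

Every step hands its result to K rather than returning it, which is
the part of the notation that tends to look alien at first.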

-- 
Lieven Marchand <···@bewoner.dma.be>
Lambda calculus - Call us a mad club

From: Marco Antoniotti
Subject: Re: XML, Lisp and Meaning
Date: Fri, 29 Sep 2000 00:00:00 +0000
Message-ID: <y6c8zsan6yo.fsf@octagon.mrl.nyu.edu>

Lieven Marchand <···@bewoner.dma.be> writes:

> Philip Lijnzaad <········@ebi.ac.uk> writes:
> 
> > I vaguely remember that there was an (to me utterly incomprehensible) attempt
> > to define semantics for Scheme; wasn't this at one point even an appendix to
> > some R^nR report? Or it may have been in connection with security certifying
> > some scheme implementation.
> 
> The denotational semantics are still a part of R^nRS. If you know the
> notation, it's not that bad. It's basically an interpreter for an
> abstract syntax tree of the language, written in pure lambda calculus
> with both the interaction with memory and the continuation explicitly
> represented. It would be fairly easy to derive a (very very very) slow
> interpreter for the language from it.

Does it mean that if R^(n+1)RS will drop the denotational semantics
part, then it may be able to accommodate, say, multi-dimensional
arrays?  Or DEFSTRUCT?

Cheers

-- 
Marco Antoniotti =============================================================
NYU Bioinformatics Group			 tel. +1 - 212 - 998 3488
719 Broadway 12th Floor                          fax  +1 - 212 - 995 4122
New York, NY 10003, USA				 http://galt.mrl.nyu.edu/valis
             Like DNA, such a language [Lisp] does not go out of style.
			      Paul Graham, ANSI Common Lisp

From: Lieven Marchand
Subject: Re: XML, Lisp and Meaning
Date: Sat, 30 Sep 2000 00:00:00 +0000
Message-ID: <m34s2y2kep.fsf@localhost.localdomain>

Marco Antoniotti <·······@cs.nyu.edu> writes:

> Does it mean that if R^(n+1)RS will drop the denotational semantics
> part, then it may be able to accommodate, say, multi-dimensional
> arrays?  Or DEFSTRUCT?

In what sense do you mean your question? There is no problem in
describing these features in denotational semantics.

If you are referring to the reluctance of the RnRS authors to expand
the number of pages of the spec, then dropping the semantics would
help.

-- 
Lieven Marchand <···@bewoner.dma.be>
Lambda calculus - Call us a mad club

From: Marco Antoniotti
Subject: Re: XML, Lisp and Meaning
Date: Fri, 06 Oct 2000 00:00:00 +0000
Message-ID: <y6c3diam276.fsf@octagon.mrl.nyu.edu>

Lieven Marchand <···@bewoner.dma.be> writes:

> Marco Antoniotti <·······@cs.nyu.edu> writes:
> 
> > Does it mean that if R^(n+1)RS will drop the denotational semantics
> > part, then it may be able to accommodate, say, multi-dimensional
> > arrays?  Or DEFSTRUCT?
> 
> In what sense do you mean your question? There is no problem in
> describing these features in denotational semantics.

Of course not.

> 
> If you are referring to the reluctance of the RnRS authors to expand
> the number of pages of the spec, then dropping the semantics would
> help.

They should bite the bullet, shouldn't they? :)

Cheers

-- 
Marco Antoniotti =============================================================
NYU Bioinformatics Group			 tel. +1 - 212 - 998 3488
719 Broadway 12th Floor                          fax  +1 - 212 - 995 4122
New York, NY 10003, USA				 http://galt.mrl.nyu.edu/valis
             Like DNA, such a language [Lisp] does not go out of style.
			      Paul Graham, ANSI Common Lisp

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Fri, 29 Sep 2000 00:00:00 +0000
Message-ID: <87vgvff7vv.fsf@qiwi.uncommon-sense.net>

Tim Bradshaw <···@cley.com> writes:

| > The question really is, how to specify this meaning.  I don't
| > really know an answer to that one and that's why I started this
| > thread.
| 
| I think you've probably answered your own question.  [...]

I don't think I did.  Today at work, I suddenly realized what I want,
and I think it should be possible.  I want a document's interpretation
to be a reduction to some primitively typed contents.

We had the example of an alist before in this thread.  Now what is an
alist other than a mapping of values from one set to some other set?
The <ul> unordered list in HTML is nothing other than a set.  The <ol>
ordered list is a mapping from natural numbers to values.  I would
like to specify such mappings in a DTD.  Probably the only needed
primitive data structure would be the `set', if you can specify
arbitrary relations, but this would probably be too low-level, so you
might want to include other data structures, like `list', `vector',
`string', etc.

You could define something like this in your DTD-equivalent:

  (deftag document (attributes (name 'string))
                   (&rest contents)
    (set-of (apply #'set-of attributes)
            (apply #'list-of contents)))

  (deftag paragraph (attributes ())
                    (&rest contents)
    (apply #'list-of contents))

  (deftag section (attributes (name 'string))
                  (&rest (contents '(list-of-type paragraph)))
    (set-of (apply #'set-of attributes)
            (apply #'list-of contents)))

  (deftag unordered-list (&rest elements)
    (apply #'set-of elements))

  (deftag ordered-list (&rest elements)
    (apply #'list-of elements))

and then you could write a document like this:

  ((document :name "my document name")
     ((section :name "my first section")
        ((paragraph) "this" "is" "one" "paragraph")
        ((paragraph) "this is another one"))
     ((section :name "my second section")
        ((paragraph) "This document's name is: "
                   (document-name))))

This could be reduced to:

  (set-of
    "my document name"
    (list-of
      (set-of
        "my first section"
        (list-of (list-of "this" "is" "one" "paragraph")
                 (list-of "this is another one")))
      (set-of
        "my second section"
        (list-of (list-of "This document's name is: "
                          "my document name")))))

The above is actually pretty good.  You could render it in lots of
different styles and it would probably still be understandable.  E.g.,
I believe humans would understand this, even if a renderer says: oh,
that's a set, I'm going to render it like an ellipsoid, because that's
what sets are usually rendered as.  Probably adding some more
primitive data types like `table' would be a good idea, though.
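
A rough sketch of such a reduction in plain Common Lisp, without any
DEFTAG machinery.  Everything here is an assumption for illustration:
SET-OF and LIST-OF just build tagged lists (:set ...) and (:list ...)
instead of real set and list types, REDUCE-FORM is a made-up name, and
the name attribute is placed directly in the set, as in the reduced
form above:

```lisp
;; Hypothetical primitives: tagged lists standing in for real types.
(defun set-of (&rest elements) (cons :set elements))
(defun list-of (&rest elements) (cons :list elements))

;; Reduce a form shaped like ((tag :name "...") content...) to
;; primitives.  Strings reduce to themselves.
(defun reduce-form (form)
  (if (stringp form)
      form
      (destructuring-bind ((tag &key name) &rest contents) form
        (let ((reduced (mapcar #'reduce-form contents)))
          (ecase tag
            ((document section)
             (set-of name (apply #'list-of reduced)))
            (paragraph
             (apply #'list-of reduced)))))))

(reduce-form
 '((document :name "doc")
   ((section :name "s1")
    ((paragraph) "this" "is" "one" "paragraph"))))
;; => (:SET "doc" (:LIST (:SET "s1" (:LIST (:LIST "this" "is" "one" "paragraph")))))
```

A renderer would then only need to know what to do with :set and
:list, not with `document', `section' or `paragraph'.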

The execution model must be different from that of an ordinary Lisp,
because you might want to get the document's name, like I did with
(document-name).  But that name was specified in an enclosing scope
and inside the paragraph there is nothing that tells you that you are
enclosed in a (document ...) form.
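
One way this might be approximated in Common Lisp is a special
(dynamically bound) variable that the enclosing document form binds
while its contents are reduced.  This is only a sketch under that
assumption; *document-name* and REDUCE-WITH-SCOPE are hypothetical
names, and the output is simplified to plain nested lists:

```lisp
(defvar *document-name* nil
  "Name of the enclosing document while its contents are being reduced.")

(defun document-name () *document-name*)

(defun reduce-with-scope (form)
  (cond ((stringp form) form)
        ;; A bare (document-name) looks up the enclosing document.
        ((equal form '(document-name)) (document-name))
        (t (destructuring-bind ((tag &key name) &rest contents) form
             ;; Only a DOCUMENT form rebinds the name; other tags
             ;; inherit whatever binding is currently in effect.
             (let* ((*document-name*
                      (if (eq tag 'document) name *document-name*))
                    (reduced (mapcar #'reduce-with-scope contents)))
               (if name (list name reduced) reduced))))))

(reduce-with-scope
 '((document :name "my document name")
   ((paragraph) "This document's name is: " (document-name))))
;; => ("my document name"
;;     (("This document's name is: " "my document name")))
```

The paragraph never mentions the document, yet the lookup succeeds
because the binding follows the dynamic extent of the reduction rather
than the text of the paragraph form.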

To come back to the example of the `alist'.  I don't know exactly how
to describe it in the above syntax, but it should be feasible to
describe relations between sets, and then it should be possible to
specify that an alist is a mapping from one set to another.  If you
could define relations, then the renderer could add little arrows, or
whatever between the elements of the related sets or something
similar.

Something in this direction would be orders of magnitude better than XML, IMHO.

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

Outside of a dog, a book is a man's best friend.  Inside a dog it's too
dark to read.
		-- Groucho Marx

From: Rob Warnock
Subject: Re: XML, Lisp and Meaning
Date: Sun, 01 Oct 2000 00:34:31 +0000
Message-ID: <8r60qn$13gng5$1@fido.engr.sgi.com>

Boris Schaefer  <·····@uncommon-sense.net> wrote:
+---------------
| The <ul> unordered list in HTML is nothing other than a set.
+---------------

Doesn't the "u" in <ul> *really* mean "unnumbered", not "unordered"??
Yes, I know that the standard refers to <ul> as "unordered", but lists
introduced with <ul> *are* still ordered -- the 1st element comes first,
the 2nd comes second, etc. -- but they're just not displayed with an
explicit ordinal tag like <ol>. And in any case, a browser is certainly
*not* allowed to re-order elements of a <ul>, so I really can't agree
with calling a <ul> a "set"...


-Rob

-----
Rob Warnock, 31-2-510		····@sgi.com
Network Engineering		http://reality.sgi.com/rpw3/
Silicon Graphics, Inc.		Phone: 650-933-1673
1600 Amphitheatre Pkwy.		PP-ASEL-IA
Mountain View, CA  94043

From: Weiqi Gao
Subject: Re: XML, Lisp and Meaning
Date: Thu, 28 Sep 2000 00:00:00 +0000
Message-ID: <39D3FD7C.1AB72FED@networkusa.net>

Boris Schaefer wrote:
> 
> ("this" "is" "not" "interpretable" "by" "a" "Lisp" "program")

Wait a minute.  Doesn't Lisp complain ' "this" is not a function name '?

> unless the Lisp program knows _beforehand_ what it should do with the
> above.  It is however _readable_ with (read).  The same is true with
> XML.  It is readable, but not interpretable without some further
> knowledge beforehand.
> 
> The DTD stuff alone isn't sufficient either to interpret an arbitrary
> XML document ... oh well, I gotta go work.

-- 
Weiqi Gao
········@networkusa.net

From: Marco Antoniotti
Subject: Re: XML, Lisp and Meaning
Date: Fri, 29 Sep 2000 00:00:00 +0000
Message-ID: <y6cog17w8d6.fsf@octagon.mrl.nyu.edu>

Weiqi Gao <········@networkusa.net> writes:

> Boris Schaefer wrote:
> > 
> > ("this" "is" "not" "interpretable" "by" "a" "Lisp" "program")
> 
> Wait a minute.  Doesn't Lisp complain ' "this" is not a function
> name '?

It depends.

Common Lisp Prompt > (read)
("this" "is" "not" "interpretable" "by" "a" "Lisp" "program")
("this" "is" "not" "interpretable" "by" "a" "Lisp" "program")
Common Lisp Prompt >

XML basically tells the rest of the world how to implement XML-READ.
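
The same point without typing at the prompt, as a small sketch (the
variable name *data* is mine).  READ-FROM-STRING only builds the list;
nothing is evaluated, so nothing complains about "this":

```lisp
(defparameter *data*
  (read-from-string
   "(\"this\" \"is\" \"not\" \"interpretable\" \"by\" \"a\" \"Lisp\" \"program\")"))

(length *data*)   ; a plain list of eight strings
(first *data*)    ; the string "this"

;; Handing *data* to EVAL is a different matter: a string is not a
;; function name, so implementations will typically signal an error.
```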

Cheers

-- 
Marco Antoniotti =============================================================
NYU Bioinformatics Group			 tel. +1 - 212 - 998 3488
719 Broadway 12th Floor                          fax  +1 - 212 - 995 4122
New York, NY 10003, USA				 http://galt.mrl.nyu.edu/valis
             Like DNA, such a language [Lisp] does not go out of style.
			      Paul Graham, ANSI Common Lisp

From: Boris Schaefer
Subject: Re: XML, Lisp and Meaning
Date: Fri, 29 Sep 2000 00:00:00 +0000
Message-ID: <87itrff72j.fsf@qiwi.uncommon-sense.net>

Weiqi Gao <········@networkusa.net> writes:

| Boris Schaefer wrote:
| > 
| > ("this" "is" "not" "interpretable" "by" "a" "Lisp" "program")
| 
| Wait a minute.  Doesn't Lisp complain ' "this" is not a function name '?

Not if you don't funcall it.  I said "not interpretable _by_ a Lisp
program", rather than "not an interpretable Lisp program".

-- 
·····@uncommon-sense.net - <http://www.uncommon-sense.net/>

If everything seems to be going well, you have obviously overlooked something.

From: Barry Margolin
Subject: Re: XML, Lisp and Meaning
Date: Mon, 25 Sep 2000 00:00:00 +0000
Message-ID: <4_Qz5.44$O96.1296@burlma1-snr2>

In article <················@everest.com>,
Sunil Mishra  <············@everest.com> wrote:
>Well, consider alists. They have an attribute/value structure. There is 
>nothing in the alist that says it is an alist, we recognize them by 
>convention. The list itself is thus untyped, which can be both good and bad.

IMHO, alists are an archaic concept in Lisp.  Alists and plists were
invented in the days before Lisp had structured data types, hash tables,
etc.  In those days, conses and lists were used for just about everything.
While this has obvious simplicity, it's lacking something in elegance and
maintainability.  Notice the dichotomy between abstract and concrete data
types: the latter have manifest types, so PRINT will display them in a
unique way, and programs can dispatch on the type using TYPECASE (or TYPEP
in the early days), whereas types like "alist" don't manifest them.

While I admit that there's some value to being able to use such lists, such
as being able to use both the sequence functions and ASSOC to search them,
or being able to CONS things onto the list, in the general scheme of things
I think mechanisms like DEFSTRUCT and CLOS are better.
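
A small sketch of the difference.  The names ENTRY and DESCRIBE-THING
and the sample data are made up for illustration:

```lisp
;; An alist is just a list of conses; nothing marks it as an alist.
(defparameter *alist* '((name . "foo") (size . 3)))

;; A structure carries its type with it.
(defstruct entry name size)
(defparameter *entry* (make-entry :name "foo" :size 3))

;; TYPECASE can pick out the structure, but the best it can say
;; about the alist is that it is some list.
(defun describe-thing (x)
  (typecase x
    (entry :entry)
    (list :some-list)
    (t :other)))

(describe-thing *alist*)      ; => :SOME-LIST
(describe-thing *entry*)      ; => :ENTRY
(cdr (assoc 'size *alist*))   ; => 3
(entry-size *entry*)          ; => 3
```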

-- 
Barry Margolin, ······@genuity.net
Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

From: Marco Antoniotti
Subject: Re: XML, Lisp and Meaning
Date: Tue, 26 Sep 2000 00:00:00 +0000
Message-ID: <y6c3dinw85i.fsf@octagon.mrl.nyu.edu>

Barry Margolin <······@genuity.net> writes:

> In article <················@everest.com>,
> Sunil Mishra  <············@everest.com> wrote:
> >Well, consider alists. They have an attribute/value structure. There is 
> >nothing in the alist that says it is an alist, we recognize them by 
> >convention. The list itself is thus untyped, which can be both good and bad.
> 
> IMHO, alists are an archaic concept in Lisp.  Alists and plists were
> invented in the days before Lisp had structured data types, hash tables,
> etc.  In those days, conses and lists were used for just about everything.
> While this has obvious simplicity, it's lacking something in elegance and
> maintainability.  Notice the dichotomy between abstract and concrete data
> types: the latter have manifest types, so PRINT will display them in a
> unique way, and programs can dispatch on the type using TYPECASE (or TYPEP
> in the early days), whereas types like "alist" don't manifest them.
> 
> While I admit that there's some value to being able to use such lists, such
> as being able to use both the sequence functions and ASSOC to search them,
> or being able to CONS things onto the list, in the general scheme of things
> I think mechanisms like DEFSTRUCT and CLOS are better.

Yes of course.  But really allow me to reiterate.  IMHO the crucial
point of Sexpr *and* XML is that they give you a "pragmatically
universally parsable" format.

Lisp has had this concept for ages.  The rest of the world reinvented
the wheel: XML.

E.g. in the project I am working on, I needed to exchange files with a
C/C++ application (it could have been Java).  Since I am a very sneaky
guy *and* I almost have the driver's seat in this instance, I devised
a "common interchange format", which contains - by absolute pure
chance :) - a lot of parentheses. :)

I can READ the files thus produced in one fell swoop.  Of course on
the other side they had to write a parser for Sexpr :)
Had I come up with a XML format they would have "parsed" the files with
an XML library.

Of course, using the CLOCC and other CL code, doing XML is trivial in
CL as well.

The CL program is a rather complex aggregate of various data
structures (classes, defstructs, vectors, priority-queues etc etc),
but I am using very simple Sexpr to exchange data.  I will use XML in
the near future.
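
As a rough sketch of that split between rich internal structures and
a plain Sexpr on the wire, one can flatten a structure into nested
lists before printing and rebuild it after reading.  The PROBE
structure, its slots and the two conversion functions below are
invented for the example and are not taken from the actual program.

```lisp
;; Hypothetical internal data type with a vector-valued slot.
(defstruct probe name offset scores)

(defun probe->sexpr (p)
  "Flatten a PROBE into a plain list that PRIN1 can write and READ can read."
  (list 'probe
        (list 'name (probe-name p))
        (list 'offset (probe-offset p))
        (list 'scores (coerce (probe-scores p) 'list))))

(defun sexpr->probe (form)
  "Rebuild a PROBE from the list produced by PROBE->SEXPR."
  (flet ((field (key)
           ;; Compare keys by name so the reader's package does not matter.
           (second (assoc key (rest form) :test #'string=))))
    (make-probe :name (field 'name)
                :offset (field 'offset)
                :scores (coerce (field 'scores) 'vector))))
```

The exchanged text is then just (prin1-to-string (probe->sexpr p)),
and the receiving side needs nothing more than a Sexpr reader to get
the same nested lists back.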

Cheers

Marco

PS. You think my devilish sneakiness stopped at devising a Lispy
    interchange format? :)  Since the parser needed on the C side is
    essentially a Sexpr reader, I just linked a Lisp library into the
    C code, appropriately named CSRCL ("Contains a Subset
    Reimplementation of Common Lisp").

-- 
Marco Antoniotti =============================================================
NYU Bioinformatics Group			 tel. +1 - 212 - 998 3488
719 Broadway 12th Floor                          fax  +1 - 212 - 995 4122
New York, NY 10003, USA				 http://galt.mrl.nyu.edu/valis
             Like DNA, such a language [Lisp] does not go out of style.
			      Paul Graham, ANSI Common Lisp

From: Barry Margolin
Subject: Re: XML, Lisp and Meaning
Date: Tue, 26 Sep 2000 00:00:00 +0000
Message-ID: <U43A5.12$aT.453@burlma1-snr2>

In article <···············@octagon.mrl.nyu.edu>,
Marco Antoniotti  <·······@cs.nyu.edu> wrote:
>Lisp has had this concept for ages.  The rest of the world reinvented
>the wheel: XML.

To give them the benefit of the doubt, XML was a pretty natural solution in
their context.  It was invented for the WWW, which had already
standardized on SGML as the basis for its interchange formats, and XML is
based on SGML.

>E.g. in the project I am working on, I needed to exchange files with a
>C/C++ application (it could have been Java).  Since I am a very sneaky
>guy *and* I almost have the driver's seat in this instance, I devised
>a "common interchange format", which contains - by absolute pure
>chance :) - a lot of parentheses. :)

You're certainly not the first.  Have you ever looked at WAIS source
definitions?  Or Sun calendar manager .callog files (although for some
reason they threw in lots of unnecessary commas in these).

-- 
Barry Margolin, ······@genuity.net
Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

From: Marco Antoniotti
Subject: Re: XML, Lisp and Meaning
Date: Wed, 27 Sep 2000 00:00:00 +0000
Message-ID: <y6cbsx952ba.fsf@octagon.mrl.nyu.edu>

Barry Margolin <······@genuity.net> writes:

> In article <···············@octagon.mrl.nyu.edu>,
> Marco Antoniotti  <·······@cs.nyu.edu> wrote:
> >Lisp has had this concept for ages.  The rest of the world reinvented
> >the wheel: XML.
> 
> To give them the benefit of the doubt, XML was a pretty natural solution in
> their context.  It was invented for the WWW, which had already
> standardized on SGML as the basis for its interchange formats, and XML is
> based on SGML.

Yes.  One thing that kind of strikes me is that SGML itself has been
around for about 20 years.

> >E.g. in the project I am working on, I needed to exchange files with a
> >C/C++ application (it could have been Java).  Since I am a very sneaky
> >guy *and* I almost have the driver's seat in this instance, I devised
> >a "common interchange format", which contains - by absolute pure
> >chance :) - a lot of parentheses. :)
> 
> You're certainly not the first.  Have you ever looked at WAIS source
> definitions?  Or Sun calendar manager .callog files (although for some
> reason they threw in lots of unnecessary commas in these).

Yep.  Have seen that.  PRCS (a pretty good revision control system)
also has a Lispy syntax (they made the big mistake of not going the
whole way - so their files are not directly readable from CL without
some extra hacks).

Cheers


From: Rainer Joswig
Subject: Re: XML, Lisp and Meaning
Date: Wed, 27 Sep 2000 00:00:00 +0000
Message-ID: <joswig-E8312D.17434127092000@news.is-europe.net>

In article <···············@octagon.mrl.nyu.edu>, Marco Antoniotti 
<·······@cs.nyu.edu> wrote:

> Yep.  Have seen that.  PRCS (a pretty good revision control system)
> also has a Lispy syntax (they made the big mistake of not going the
> whole way - so their files are not directly readable from CL without
> some extra hacks).

Same for the configuration files of Firewall 1.

-- 
Rainer Joswig, Hamburg, Germany
Email: ·············@corporate-world.lisp.de
Web: http://corporate-world.lisp.de/