Machine-editable DOM/sexpr model specifying HTML output to be generated?

From: Robert Maas, http://tinyurl.com/uh3t
Subject: Machine-editable DOM/sexpr model specifying HTML output to be generated?
Date: Sat, 09 Aug 2008 02:56:32 +0000
Message-ID: <rem-2008aug08-002@yahoo.com>

I'm currently planning work on a project for a CGI/CMUCL
application, where I would like to build a data structure
representing the parse tree of HTML output, i.e. a DOM
(Document-Object Model). But rather than build all the pieces
bottom-up and put them together at the last moment, I want to build
an editable DOM, which starts out with a generic ABORT message
which is later replaced by either a more specific ABORT message or
a specific user-error message or normal valid output or a mix of
normal output and a user-error message. I'll have a toplevel CATCH
and/or CATCH-ALL-ERRORS around my entire application, and if
anybody ever throws to that CATCH then the DOM will be quickly
patched to include something expressing the parameter to the THROW
or ERROR, and then the entire contents of the DOM after that edit
will be sent out immediately.

The purpose is to allow abort (deliberate, or due to program bug,
it shouldn't matter which) at any time in the middle of a
HTTP/CGI/CMUCL transaction to be handled by a single unified
mechanism, thereby establishing failsafe operation in all cases.
And to unify the code, normal successful program termination would
share the same code, somewhat analogous to the way the Unix shell
treats return code of 0 to indicate successful return from
application but no matter whether there was an error or not there's
exactly the same graceful return from the application to the Unix
shell.
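
A minimal sketch of that single exit path, under stated assumptions:
the names *DOM*, RENDER-DOM and CALL-WITH-FAILSAFE are hypothetical,
RENDER-DOM is only a stand-in that prints the sexpr, and HANDLER-CASE
is used where the text above speaks of a toplevel CATCH.

```lisp
;; Hypothetical names throughout; not taken from any existing package.
(defvar *dom* nil
  "The editable page; always holds something that is safe to send.")

(defun render-dom (dom)
  "Stand-in renderer: just prints the sexpr readably to a string."
  (prin1-to-string dom))

(defun call-with-failsafe (app-function)
  "Run APP-FUNCTION; whatever happens, return the rendering of *DOM*."
  ;; Start from the generic abort page.
  (setf *dom* (list :html (list :body "Some unknown fatal error happened.")))
  (handler-case (funcall app-function)
    (error (condition)
      ;; Patch the DOM to mention the error, then fall through to render.
      (nconc (second *dom*)
             (list (format nil " Abort: ~A" condition)))))
  ;; Normal completion and abort share this one exit.
  (render-dom *dom*))
```

The application only ever edits *DOM*; it never writes output itself,
so there is no header-already-sent state to track.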

Before I go on, does my purpose (second paragraph) sound reasonable
and obtainable, and does my proposed mechanism (first paragraph)
sound like a reasonable way to achieve the goal implied by the
purpose? Or is there a better way to think this through?

In all my CGI applications to date, I have functions that I call to
generate HTML output to standard output live at the moment I call
the HTML-generation functions, which are buffered by the Apache
server until my application quits. Unfortunately this makes coding
tricky for dealing with deciding whether to generate text/html or
text/plain output, since the header for that must be generated,
hence that decision committed, before my program reaches any point
where subsequent output might be generated. I hack my way past it
by keeping a flag as to whether the HTTP header has been generated
yet, and on ABORT I generate the header if it wasn't yet
generated. I would like *NOT* to continue that tradition in my next
major CGI application, hence my thinking about the editable-DOM
method as an alternative.

In recent years I've seen mention of various sexpr syntax for
representing the structure of HTML files, which may be used by a
parser or by a generator. It wasn't clear to me which of these
open-source and other software packages allowed editing within a
structure and which allowed *only* bottom-up building of structure
with no further editing supported. So my purpose in posting today
is to ask anyone with experience with any of those packages to
please identify the name of the package you know about and tell me
how well the package supports automated editing to build from a
skeleton abort message towards the final DOM product just prior to
translating to actual HTML text. I.e. how suitable is each
HTML/DOM/sexpr package for my intended use.

Names that I recall for file-extensions on static DOM/sexpr pages
that trigger automatic translation to HTML are .LHP (analogous to
PHP) and .LSP (analogous to JSP and ASP). The one that most
appealed to me was one that used (in CAR position of sexpr) keyword
symbols for HTML tags, such as (:B "This text is bold."), and
ordinary symbols for Lisp functions to call to generate the sub-form
to insert at this place in the overall expression, such as
(FORMAT "~&The current time is ~D seconds since the era began.~%"
  (GET-UNIVERSAL-TIME))
But since those discussions were about static overall Web pages
that just happened to be expressed in DOM/sexpr form instead of
SGML/XML form, and allowed small pieces to be generated by live
Lisp code, there wasn't any discussion of the ability to build such
a structure in a Lisp core-image by multiple edits until it's all
finished and ready to convert to HTML, where Lisp could
traverse the *entire* DOM structure instead of only working with the
tiny place where the code was embedded. Of course I suppose
somebody could "cheat" by wrapping a DOM-editor around the entire
document, or around a failsafe stub, like this:
(MAIN-EDITOR '((:HTTP "text/plain")
               (:HTML (:HEAD (:TITLE "Abort"))
                      (:BODY "Some unknown fatal error happened."))))
MAIN-EDITOR could then do whatever edits were required per my idea.
But unless the DOM software had good utilities to support finding
the desired place in the structure to do each edit, it would be a
bear to write the code for MAIN-EDITOR.
So I'm asking if somebody has experience with hacking one of those
packages as summarized just above, or whether any of the packages
actually *supports* my overall design in a non-hackish way.
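
For illustration only, here is a sketch of the two utilities such a
MAIN-EDITOR would lean on: a search for a place in the structure and
a renderer. FIND-ELEMENT and RENDER-NODE are hypothetical names. The
sketch assumes proper lists with a keyword tag in the CAR, ignores
attributes and the :HTTP header, and does no HTML escaping.

```lisp
(defun find-element (tag node)
  "Depth-first search for the first sub-list of NODE whose CAR is TAG."
  (when (consp node)
    (if (eq (car node) tag)
        node
        (some (lambda (child) (find-element tag child)) (cdr node)))))

(defun render-node (node stream)
  "Write NODE to STREAM as HTML. Strings are written as-is (no
escaping); a list whose CAR is a keyword becomes an element."
  (cond ((stringp node) (write-string node stream))
        ((and (consp node) (keywordp (car node)))
         (let ((tag (string-downcase (symbol-name (car node)))))
           (format stream "<~A>" tag)
           (dolist (child (cdr node))
             (render-node child stream))
           (format stream "</~A>" tag)))
        (t (error "Don't know how to render ~S" node))))

(defun render-to-string (node)
  "Convenience wrapper returning the HTML text as a string."
  (with-output-to-string (s) (render-node node s)))
```

Because FIND-ELEMENT returns the actual sub-list rather than a copy,
an editor can destructively modify what it finds.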

In particular, one of my ideas if I needed to write the whole thing
myself, "re-inventing the wheel", if none of the available packages
can do the job reasonably, is that various places in the structure
would have a continuation tag in the CDR of the last cell in the
level of list, which would be replaced by NCONCing more statements
at the same level followed by that continuation tag at the new end.
Thus (:BODY . :BODYMORE)
might later be spliced to read
     (:BODY (:H1 "My DOM Hello World attempt")
            . :BODYMORE)
then later:
     (:BODY (:H1 "My DOM Hello World attempt")
            (:P "First line of test paragraph."
                . :PMORE)
            . :BODYMORE)
then later:
     (:BODY (:H1 "My DOM Hello World attempt")
            (:P "First line of test paragraph."
                "Testing 1 2 3 ... infinity!"
                . :PMORE)
            . :BODYMORE)
then later:
     (:BODY (:H1 "My DOM Hello World attempt")
            (:P "First line of test paragraph."
                "Testing 1 2 3 ... infinity!"
                . :PMORE)
            (:P "Here's a second paragraph, just one line, all done already.")
            . :BODYMORE)
then if that form as it stands there is suddenly submitted for
conversion to HTML, all the continuation tags are immediately
flushed first, so this is the form that gets converted to HTML:
     (:BODY (:H1 "My DOM Hello World attempt")
            (:P "First line of test paragraph."
                "Testing 1 2 3 ... infinity!"
                )
            (:P "Here's a second paragraph, just one line, all done already.")
            )
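
The splice and flush steps walked through above could be sketched as
follows. SPLICE-AT-CONTINUATION and FLUSH-CONTINUATIONS are
hypothetical names, and the sketch assumes that any non-NIL atom in
the CDR of a last cell is a continuation tag.

```lisp
(defun splice-at-continuation (node new-items)
  "Destructively insert NEW-ITEMS just before NODE's continuation tag."
  (let* ((last-cell (last node))          ; LAST accepts dotted lists
         (continuation (cdr last-cell)))  ; e.g. :BODYMORE
    ;; APPEND copies NEW-ITEMS and ends the copy with the old tail.
    (setf (cdr last-cell) (append new-items continuation))
    node))

(defun flush-continuations (node)
  "Return a copy of NODE in which every dotted tail is replaced by NIL."
  (cond ((consp node)
         (cons (flush-continuations (car node))
               (let ((tail (cdr node)))
                 (if (consp tail)
                     (flush-continuations tail)
                     nil))))
        (t node)))
```

FLUSH-CONTINUATIONS copies rather than mutates, so the structure
under construction keeps its tags if rendering is only a preview.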

But maybe one of the existing HTML/DOM/sexpr packages has a better
idea for making it easy to edit a growing DOM as the application
computes more items to insert into it various places?

From: Kent M Pitman
Subject: Re: Machine-editable DOM/sexpr model specifying HTML output to be generated?
Date: Sat, 09 Aug 2008 12:42:17 +0000
Message-ID: <umyjmxsuu.fsf@nhplace.com>

··················@spamgourmet.com.remove (Robert Maas, http://tinyurl.com/uh3t) writes:

> Thus (:BODY . :BODYMORE)
> might later be spliced to read
>      (:BODY (:H1 "My DOM Hello World attempt")
>             . :BODYMORE)
...
> But maybe one of the existing HTML/DOM/sexpr packages has a better
> idea for making it easy to edit a growing DOM as the application
> computes more items to insert into it various places?

Why would you actually insert :BODYMORE there?

In a list, would you want :LISTMORE at the end of a list if someone
aborted out of a LOOP that was collecting elements into it?  That just
doesn't smell like anything I've ever much wanted to do.

I understand your desire to have an intact structure but not your
desire to have that structure have garbage in it.  Isn't a cursor
something you usually store on the outside, even in an editable
structure?  As I recall, the TECO text editor used to show /\ as
the cursor, as in:

   this is/\some text

but it wasn't like it actually inserted those characters into the buffer.

As an example, the XML parser I wrote for HyperMeta maintains XML
structure incrementally as it's parsed but it doesn't put garbage in
to be removed as it goes, mostly to handle the case of
 <foo><bar>...</foo>
where you want to throw out of the parsing of BAR to FOO without losing
the fact that you've been accumulating a BAR element.   (And no, that
parser isn't publicly available.  I'm just saying that it's something 
I've thought about a bit, which is why I have this question.)

From: Robert Maas, http://tinyurl.com/uh3t
Subject: Re: Machine-editable DOM/sexpr model specifying HTML output to be generated?
Date: Sun, 10 Aug 2008 20:46:31 +0000
Message-ID: <rem-2008aug10-001@yahoo.com>

REM> Thus (:BODY . :BODYMORE)
REM> might later be spliced to read
REM>      (:BODY (:H1 "My DOM Hello World attempt")
REM>             . :BODYMORE)
REM> ...
REM> But maybe one of the existing HTML/DOM/sexpr packages has a better
REM> idea for making it easy to edit a growing DOM as the application
REM> computes more items to insert into it various places?

Kent M Pitman <······@nhplace.com> wrote:
KMP> Why would you actually insert :BODYMORE there?

It was just an extremely sketchy idea of how I might do it myself
if none of the existing DOMs for representation of SGML/HTML/XML as
nested-list structure within a Lisp environment turned out to
provide the kind of incremental modify-in-place construction that I
seek. The idea was to have a bunch of tags within the structure
which could be auto-replaced by warnings of the form (stub: some more
[elements|text] within this [body|paragraph] might have been
included here, were it not for the premature abort). It was just an
idea to help people who have used existing DOM modules trigger their
memories to perhaps come up with something like "aha, I remember a
DOM module that worked somewhat like that, in fact I remember a
keyword involved in that stub-markup that might help me find it
again via Google or cliki search engine".

> In a list, would you want :LISTMORE at the end of a list if
> someone aborted out of a LOOP that was collecting elements into it?
> That just doesn't smell like anything I've ever much wanted to do.

That's something *I* would want to do for this specific kind of
application. When I look at the fragmentary output, I want to see
where my algorithm believes the structure is already complete and
where it believes it was in the middle of building something and
*would*have* put more there later if the abort hadn't precluded
further constructive processing. This information may help me
diagnose the bug that caused the premature abort.

> I understand your desire to have an intact structure but not your
> desire to have that structure have garbage in it.  Isn't a cursor
> something you usually store on the outside, even in an editable
> structure?  As I recall, the TECO text editor used to show /\ as
> the cursor, as in:
>    this is/\some text
> but it wasn't like it actually inserted those characters into the buffer.

Hmm, indeed that's another way to do it. Perhaps I was thinking
that it's better to have a single global that contains the
under-construction object with stub marks within it, than to have
two globals which contain a no-mark object and a separate set of
marks that point into the first structure. Of course these *two*
globals would be encapsulated into a list of two objects, something
like:
(:VERYTOP
  (:DOM 
    (:HTTP ...)
    (:HTML (:HEAD ...) (:BODY ...))
    )
  (:MARKS ...)
  )
This is unclean in the sense of having information for one object
in two places, even though those two places are encapsulated into a large
single place. But as a matter of efficiency, indeed it might be
better to have the separate marks. Instead of doing a deep search
in the single marked DOM to find all the places that need fixups in
case of unexpected abort, the code would MAP down the list of marks
to find all those fixup places.

Thanks for the reminder of the separate-DOM+MARKS way of doing it.
I think if nobody knows of any existing DOM software and I have to
write it from scratch, I'll do it that way.

Of course another reason for suggesting the single-DOM-with-internal-stubs
way was that it's a lot easier to illustrate that in a newsgroup article.
I would have had to go to a lot of work to build something with
cross-links and then pretty-print it to include in the newsgroup
article, whereas the embedded-stubs way I could manually edit by
copy-and-paste without having to build anything inside Lisp.

Also with embedded stub marks I could display just *part* of the
entire DOM in legible form, in this case the :BODY, ignoring the
:HTML around it including :TITLE also, and ignoring the :DOM around
it with :HTTP also.

So now that I'm switched over to the separate-DOM+marks
methodology, let me hack up a structure representing the DOM+marks
situation at one point in time during the construction process,
with those other branches stubbed out too:

     (:BODY (:H1 "My DOM Hello World attempt")
            (:P "First line of test paragraph."
                . :PMORE)
            . :BODYMORE)
;Prefixes: D- = DOM object, P- = cons pair whose CDR is a NIL stub
;          M- = formal mark object (within list of marks) points at P-
(setq D-h1 (list :H1 "My DOM Hello World attempt"))
(setq D-para (list :P "First line of test paragraph."))
(setq P-para (last D-para))  ;In practice call to LAST function never needed,
                             ; because we're maintaining last ptr incrementally
(setq M-para (list :PMORE P-para))
(setq D-body (list :BODY D-h1 D-para))
(setq P-body (last D-body))  ;Ditto
(setq M-body (list :BODYMORE P-body))
(setq D-http (list :HTTP))
(setq P-http (last D-http))
(setq M-http (list :HTTPMORE P-http))
(setq D-title (list :TITLE))
(setq P-title (last D-title))
(setq M-title (list :TITLEMORE P-title))
(setq D-html (list :HTML D-title D-body)) ;Nothing more at this level!

(setq D-top (list :DOM D-http D-html))    ;Nothing more at this level!
=>
(:DOM (:HTTP)
 (:HTML (:TITLE)
  (:BODY (:H1 "My DOM Hello World attempt")
   (:P "First line of test paragraph."))))

(setq M-all (list :MARKS M-http M-title M-body M-para))
=>
(:MARKS (:HTTPMORE (:HTTP)) (:TITLEMORE (:TITLE))
 (:BODYMORE ((:P "First line of test paragraph.")))
 (:PMORE ("First line of test paragraph.")))

(setq $dom+marks$ (list :VERYTOP D-top M-all))
=>
(:VERYTOP
 (:DOM #1=(:HTTP)
  (:HTML #2=(:TITLE)
   (:BODY (:H1 "My DOM Hello World attempt")
    . #3=((:P . #4=("First line of test paragraph."))))))
 (:MARKS (:HTTPMORE #1#) (:TITLEMORE #2#) (:BODYMORE #3#) (:PMORE #4#)))

There, is that more to your liking?
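
With marks of the (name cell) form shown above, the emergency fixup
mentioned earlier (mapping down the list of marks and leaving a stub
warning at each open place) might look like this. PATCH-LOOSE-ENDS is
a hypothetical name and the wording of the stub text is made up.

```lisp
(defun patch-loose-ends (marks)
  "For each (name cell) in MARKS, append a stub warning after CELL.
MARKS is the list of marks without the leading :MARKS tag."
  (dolist (mark marks)
    (destructuring-bind (name cell) mark
      ;; CELL is the last cons at its level, so its CDR is NIL here.
      (setf (cdr cell)
            (list (format nil "[stub: ~A, more might have followed here]"
                          (symbol-name name)))))))
```

For the structure above it would be called as
(patch-loose-ends (cdr M-all)) just before rendering.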

By the way, after seeing your response, while composing my response
here, I remembered that in fact for the CAI-Calculus project we had
an expression editor that had an external "cursor" that could be
moved around within the overall expression, and the mathprinter
(MRPP5) would draw a box around whatever sub-expression the
"cursor" happened to point to at the moment. The user could then
move the "cursor" to the desired sub-expression, then apply some
algebraic transformation (such as distributive law) directly to
that particular sub-expression and have the result (if successful)
patched back in to replace what had been there, and the new
*overall* expression with the cursor still sitting at the same
location would be printed. Something like this (making up the names
for the commands because after 19 years I don't recall the details
any longer):

14:  diff(y,x) = a*x + x*a

cmd> 14 edit

     +---------------------+
15:  |diff(y,x) = a*x + x*a|
     +---------------------+

cmd> 15 in

     +---------+
16:  |diff(y,x)| = a*x + x*a
     +---------+

cmd> 16 right 2

                       +---+
17:  diff(y,x) = a*x + |x*a|
                       +---+

cmd> 17 commute

                       +---+
18:  diff(y,x) = a*x + |a*x|
                       +---+

cmd> 18 out

                 +---------+
19:  diff(y,x) = |a*x + a*x|
                 +---------+

cmd> 19 distribute right

                 +-------+
20:  diff(y,x) = |(a+a)*x|
                 +-------+

cmd> 20 in

                 +---+
21:  diff(y,x) = |a+a|*x    ;Parens redundant when they exactly match the box
                 +---+

cmd> 21 simplify

                 +---+
22:  diff(y,x) = |2*a|*x
                 +---+

cmd> 22 exit

23:  diff(y,x) = 2*a*x     ;Parens not needed because * is associative

So indeed, it's just been so long since I worked on that, it sorta
slipped my mind when I posted the very first article in this
thread.

> As an example, the XML parser I wrote for HyperMeta maintains XML
> structure incrementally as it's parsed but it doesn't put garbage
> in to be removed as it goes ...

Note this is the *opposite* usage, building a DOM of SGML/XML from
application to be rendered when done (or when aborted) rather than
taking SGML/XML input to parse to feed into an application. I.e.
I'm wanting LazyEVAL-PRINT rather than LazyREAD-EVAL.
But of course for both applications it might be useful to abstract
out a common methodology that works like this:
 Once: (MAKE-EMPTY-DOM-OBJECT+EMPTY-MARK-LIST)
 Many: (SELECT-MARK-DO-EDIT domMarkObject markKey specificEditParam)
Thus signals from a SAX-like tokenizer would generate edit tasks
which would be performed, or signals from my CGI application would
likewise generate edit tasks to perform, building up exactly the
same kind of DOM in a similar incremental way for either parsing
or generation.
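
A rough sketch of that two-call interface, assuming the
(:VERYTOP (:DOM ...) (:MARKS ...)) layout used earlier, a single
initial mark named :DOMMORE, and "edit" meaning only "append one
item at the mark". All of those are simplifying assumptions.

```lisp
(defun make-empty-dom-object+empty-mark-list ()
  "Return a fresh (:VERYTOP (:DOM) (:MARKS (:DOMMORE <cell>)))."
  (let ((dom (list :dom)))
    ;; The only cons of the empty DOM is also its last cell.
    (list :verytop dom (list :marks (list :dommore dom)))))

(defun select-mark-do-edit (dom+marks mark-key item)
  "Find the mark named MARK-KEY, append ITEM there, advance the mark."
  (let ((mark (assoc mark-key (cdr (assoc :marks (cdr dom+marks))))))
    (unless mark
      (error "No open mark named ~S" mark-key))
    (let ((new-cell (list item)))
      (setf (cdr (second mark)) new-cell   ; splice into the DOM
            (second mark) new-cell))       ; mark now points at new tail
    dom+marks))
```

A SAX-like tokenizer and a CGI application would both drive the same
SELECT-MARK-DO-EDIT calls; only the source of the events differs.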

So was your package nicely abstracted like that, so it could be
used for *any* event-driven edit sequence, regardless of whether
generated by SAX-like events or what I have in mind? If so, is it
available for me to try?

By the way, several years ago, shortly after I took classes to
learn advanced Java including J2EE etc., I tried my hand at using
the SAX package of Java to generate events and then handled each of
those events to build a DOM. But it was just a first stab at such
programming, to get a feel for what kinds of events were generated
by SAX and what was needed to handle them, and it was outside the
requirements of the course I was taking, so I'm sure I didn't do it
really nicely. (And of course it was written in Java, so even if it
had been nice it'd be of no use to me here, except if I want to go
to all the trouble of setting up a FIFO linking Lisp to Java, and
an external representation for edit tasks that both Lisp and Java
can understand, so that Lisp can generate events and Java can
process them, like why bother.)

> And no, that parser isn't publicly available.

But if it's nicely abstracted, do you like me enough to let me
beta-test it and maybe fix minor bugs to get it in usable shape?

<OT>: How do you feel about my use of your "intention" essay as a
starting point for developing the idea that intentional datatypes
should be the unifying concept behind both of:
- learning to write/use software at all levels
  - bare machine:
    - electric voltage levels qua bits of binary data
    - bytes qua integers or characters
  - greenspun hackery in not-so-high-level-languages
    - short integers qua characters in ELISP (104 101 108 108 111 32 119 111 114 108 100)
    - short integers qua characters in C
    - Integers qua UniCodeCharacters in Java
    - short integers qua booleans in C  (0 or nonzero)
    - symbols qua booleans in Lisp  (NIL or T|anythingElse)
  - DSLs (Domain-Specific Languages)
- organizing software modules in a coherent way so that people can
   find which modules are available to do the task they need done
   (</OT>Hey, I need a module that deals with intentional type
         mutable-DOM of SGML; I don't care how the structure is
         implemented internally so long as it provides the
         representation and editability I seek; I need to be able
         to make a new empty object, incrementally mutate it, and
         suddenly patch all loose ends then render the whole thing
         as SGML i.e. HTML output.)

From: Robert Maas, http://tinyurl.com/uh3t
Subject: Re: Machine-editable DOM/sexpr model specifying HTML output to be generated?
Date: Sun, 10 Aug 2008 23:17:22 +0000
Message-ID: <rem-2008aug10-002@yahoo.com>

After I posted my previous article, I did some re-thinking, and did
two flip-flops on a major design decision:

Suppose we're working with the design of having a DOM together with
a separate list of edit marks. Suppose you create a rather
complicated smaller object, which you want to then embed in the
larger object? So you need to SETF the smaller object somewhere
inside the larger object. But what if you want to use our
mutable-DOM technology for building the smaller object
incrementally? Then you need to maintain a separate list of edit
marks for the small object, which is not a problem. When the small
object is finished, the edit list is empty, and you can discard it.
But what if you are *not* finished with the small object at the
point when you want to embed it into the larger object? Then you
must not only SETF the smaller object itself into the large object,
but you must NCONC the smaller-object's remaining edit-mark list
onto the end of the main edit-mark list to obtain a complete set of
edit marks for the merged object. Royal pain! Better to use my
original idea of having edit-stub blips right in the object itself,
so that each object-with-marks is self-contained, and all you need
is a single SETF to embed one object-with-marks inside another.

So now we're using the self-contained object-with-marks design. One
big problem: Suppose we have a partly completed big object, and we
are in the middle of building a separate self-contained smaller
object when an abort happens. We haven't yet embedded the small
object in the main object, so all work on the small object is lost,
so our output doesn't show any of our partial work, so we have no
idea what parts of the small object were completed before the abort
happened, making it difficult to figure out where the problem
happened.

The solution is to *never* build a separate object. *Always* build
sub-objects *from*the*start* already embedded within the larger
object. Then *whenever* an abort happens, *all* our work to that
point is visible inside the large object that will be
emergency-rendered to HTML.

This solution also simplifies the way we advance an edit-mark past
the newly-embedded structure. We always embed a new sub-structure,
a totally empty one initially, at the edit-mark position within the
next larger object, and immediately advance the edit-mark within
that next larger object to be past the newly added object. Since
the only way we know where to embed something is by having a
pointer to the relevant edit mark in the first place, the code for
embedding is quite simple
(note that our global $dom+marks$ is a tagged assoc list of the
 form (:VERYTOP (:DOM (:HTTP ...) (:HTML ...))
                (:MARKS ...))
 and each mark is of the form (:fooMORE ptrToConsCellWhoseCdrNeedsSetf)
 )

(defun mark+newtag-embed (mark newtag)
  (let* ((oldConsCellNeedsSetf (cadr mark))
         (newSubObject (list newtag))
         (newConsCellNeedsSetf (list newSubObject))
         (newMarkWithinEmptySubObject
            (list (tag-to-marktag newtag) newSubObject))
         (aresAllMarks (assoc :MARKS (cdr $dom+marks$)))
         )
    (setf (cdr oldConsCellNeedsSetf) newConsCellNeedsSetf) ;Embed sub-object
    (setf (cadr mark) newConsCellNeedsSetf) ;Update mark to advance past it
    (nconc aresAllMarks (list newMarkWithinEmptySubObject)) ;Register new mark
    newMarkWithinEmptySubObject))
;Return value is new edit mark within new object, because that's
; the only thing the calling program needs to know that it didn't
; already know.

If edit marks are kept in *reverse* order of creation, then the NCONC
line can be replaced by:
    (push newMarkWithinEmptySubObject (cdr aresAllMarks)) ;Register new mark

To discard an edit mark later because that particular level of
structure is all done:
(defun discard-mark (mark)
  (let* ((aresAllMarks (assoc :MARKS (cdr $dom+marks$))))
    (setf (cdr aresAllMarks)     ;For newbies: the SETF is needed in case we're
          (delete mark (cdr aresAllMarks)))   ; deleting the very first element
    ))

FYI: tag-to-marktag simply takes a DOM tag and converts it to the
corresponding fooMORE tag. Thus :P gets converted to :PMORE etc.
Trivial exercise in SYMBOL-NAME and FORMAT and INTERN (into the
KEYWORD package) for newbies to try.

<caveat>None of the above code has been tested beyond checking
        matching parens</caveat>
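
One possible TAG-TO-MARKTAG, as a sketch. It interns into the KEYWORD
package so that the result is EQ to a literal :PMORE; a symbol built
with MAKE-SYMBOL would be uninterned and ASSOC on the mark list would
never find it.

```lisp
(defun tag-to-marktag (tag)
  "Map a DOM tag keyword to its mark keyword, e.g. :P => :PMORE."
  (intern (format nil "~AMORE" (symbol-name tag)) :keyword))
```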

Why have a different kind of MORE tag for edit mark of each
different type of DOM object that's still mutable? Typically we
have several pending edit marks but only one pending edit for each
different entity type, so we can use ASSOC to quickly find the
desired edit mark for something we want to edit at the moment. For
example, we might keep the HTTP header and the HEAD and the BODY
open (mutable) indefinitely, but there's only one of each in any
DOM. And we might keep an A and some formatting on the label open
simultaneously, but wouldn't have two A's open simultaneously. We
will probably have a P open for a long time while adding various
elements to it, but close that P (and all elements within it)
before opening another P. We'll probably close each table row
before working on the next, and probably close each table before
working on another. So being able to use ASSOC to find the
currently open paragraph or table or table-row etc. solves most of
our find-the-mark problems, so we don't need to explicitly pass
around mark pointers everywhere, we can find the one we need when
we need it. In the rare case where we need two marks for the same
kind of entity open simultaneously, either we pass around the two
marks explicitly, or we find one or the other by a deep-search
pattern-match algorithm.
(<OT> for Harrop: Can OCaml search for a pattern of TABLE entity
      with something deep inside as opposed to TABLE entity with
      something else deep inside it, when the search is restricted
      to the list of marks, none of which point to the actual
      object but only to the fooMORE point?</OT>
 Hmm, maybe in Lisp we also need a pointer to the object itself for
 such occasional deep-search purposes, so a mark object would be of
 the form:  (:fooMORE ptrLastCellNeedsSetfCdr ptrWholeObject)
 <OT> for Harrop: With the extra pointer, can OCaml do the search easily?</OT>
 )

Actually maybe the :fooMORE labels aren't necessary now that marks
are kept separately. Just use :foo by itself in the mark.

Also there's a big advantage to using a stack (reverse sequence)
instead of NCONC for keeping the list of marks: Most of the time we
want to edit the most local object of a given type, never mind
whether it's deeply embedded in another object of a similar type.
For example, UL with LI with another level of UL with more LI deep
inside, we want to edit that last LI and get it all finished, maybe
even finish the UL above it, before getting back to add more to the
topmost LI. In that case ASSOC gives us that most-local match and
ignores anything higher up in structure *if* we keep things as a
stack. (This is why the original implementation of Lisp used an
assoc list of special-variable bindings that was PUSHed instead of
NCONCed.)
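
A small sketch of the stack discipline, with hypothetical names
*MARKS*, OPEN-MARK, CLOSE-MARK and CURRENT-MARK. The "cells" in a
real mark would be conses inside the DOM; any object works for the
purpose of showing the lookup order.

```lisp
(defvar *marks* '()
  "Open marks, newest first; each mark is (tag cell).")

(defun open-mark (tag cell)
  "Push a new mark so that it shadows older marks with the same TAG."
  (push (list tag cell) *marks*))

(defun current-mark (tag)
  "ASSOC returns the first match, i.e. the most recently opened TAG."
  (assoc tag *marks*))

(defun close-mark (tag)
  "Drop the innermost mark for TAG, uncovering any outer one."
  (setf *marks* (remove (assoc tag *marks*) *marks* :count 1)))
```

Opening an outer LI, a nested UL and an inner LI, CURRENT-MARK on :LI
gives the inner one until it is closed, then the outer one again.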


Now something important I've been wanting to ask in this article:
In XML there's a sort of equivalence between attributes and entities:

  <gorilla> <mood>angry</mood>
            <name>Koko</name>
            <speak>My name is a secret. Try to guess.</speak>
  </gorilla>

  <gorilla mood="angry" name="Koko">
           <speak>My name is a secret. Try to guess.</speak>
  </gorilla>

Is that correct?
Is the same true in SGML? Can we write our anchors like this if we wish?
  <a> <href>http://tinyurl.com/uh3t</href>
      Click here to get to my Web site.
  </a>
instead of the usual way
  <a href="http://tinyurl.com/uh3t">Click here to get to my Web site.</a>
?

The general question is whether it's absolutely necessary to
distinguish between attributes and entities.

I ask because if it's necessary to keep the two separate, so that
we will render each in its proper format
 (:TAG "val") can be rendered as either of:
   tag="val"
   <tag>val</tag>
then we need some kind of splitter-mark in our DOM that separates
the attributes that precede the splitter-mark from the entities and
inline text that follow the splitter-mark.
  (:P (:COLOR "red") (:ALIGN "right") :END-OF-ATTRIBUTES
      "Please click "
      (:A (:HREF "http://tinyurl.com/uh3t") :END-OF-ATTRIBUTES
          (:B :END-OF-ATTRIBUTES "here")
          )
      " to get to my Web site."
      )
It would be a royal pain to have to keep around a complete list of
all allowed HTML entity tags
 (HTML HEAD TITLE BODY A UL OL LI TABLE TR TD B I EM STRONG etc.)
and all allowed attribute keywords
 (HREF ALIGN COLOR NAME LABEL CLASS etc.)
and check both lists to see how to render something.
Better to just keep two separate lists of attributes on the one
hand and sub-entities + inline text on the other hand, right?
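
To make the splitter-mark idea concrete, here is a sketch of a
renderer that treats everything before :END-OF-ATTRIBUTES as
attributes and everything after it as content. RENDER-ELEMENT is a
hypothetical name; the sketch assumes attribute values are strings,
treats a node with no splitter as having no attributes, and does no
escaping.

```lisp
(defun render-element (node stream)
  "Write NODE to STREAM, splitting attributes from content at the
:END-OF-ATTRIBUTES marker."
  (cond ((stringp node) (write-string node stream))
        (t
         (let* ((tag (string-downcase (symbol-name (first node))))
                (split (position :end-of-attributes node))
                (attributes (if split (subseq node 1 split) '()))
                (children (if split (subseq node (1+ split)) (rest node))))
           (format stream "<~A" tag)
           ;; Each attribute is (:NAME "value") and becomes name="value".
           (dolist (attr attributes)
             (format stream " ~A=\"~A\""
                     (string-downcase (symbol-name (first attr)))
                     (second attr)))
           (format stream ">")
           (dolist (child children)
             (render-element child stream))
           (format stream "</~A>" tag)))))
```

No table of known tags or attribute names is consulted; position
relative to the splitter alone decides how each item is rendered.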

There's another completely different way to do this: Keep the
attributes and sub-entities as two separate sub-objects within each
entity object. Then when adding a new attribute we never have to
splice a new element into the middle of a list if we're just adding
a new attribute or entity at the end of a list of such. But I don't
see the advantage of this. We'll need two edit marks per object now
anyway, and SETF of CDR works just fine in all cases, and we're
doing massive in-place destructive-modification anyway, so there's
no real reason to keep things separate.

But it complicates looking for a given edit mark among the global
list of edit marks if there are two marks for each in-edit object, one
edit mark for new attributes and another for new sub-entities (and
inline string text). So we come to another design decision, whether
to have two separate marks
  (:PMOREA ptrCellCdrSetfAttributes ptrTopObject)
  (:PMOREE ptrCellCdrSetfEntities ptrTopObject)
or one two-fork mark
  (:PMORE ptrCellCdrSetfAttributes ptrCellCdrSetfEntities ptrTopObject)
When we're done with either of the single marks, we just delete it
from the master list of marks. When we're done with either fork of
a two-fork mark, it's more complicated: First we RPLACA that spot
in the mark to be NIL to disable further edit of that place. Then
we check if the *other* fork is already NIL, in which case we can
now delete the entire mark from the master list of marks.
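
The close-one-fork logic for the two-fork mark could be sketched like
this. *OPEN-MARKS* and CLOSE-FORK are hypothetical names, and the
mark layout (tag attributes-cell entities-cell top-object) follows
the :PMORE example just above.

```lisp
(defvar *open-marks* '()
  "Master list of two-fork marks.")

(defun close-fork (mark which)
  "Disable one fork of MARK; WHICH is :ATTRIBUTES or :ENTITIES.
Once both forks are NIL, remove MARK from the master list."
  (ecase which
    (:attributes (setf (second mark) nil))
    (:entities   (setf (third mark) nil)))
  (when (and (null (second mark)) (null (third mark)))
    (setf *open-marks* (delete mark *open-marks*)))
  mark)
```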

KMP, did you think of this question when you were implementing your stuff?


P.S. The more I work out the optimal design, the more I think I'm
not going to bother using what somebody else wrote, it's easier to
write it from scratch myself. Will somebody quickly show me to an
existing package/module that does what I want before I get started
*really* re-inventing yet another wheel?

From: David Lichteblau
Subject: Re: Machine-editable DOM/sexpr model specifying HTML output to be generated?
Date: Mon, 11 Aug 2008 08:07:52 +0000
Message-ID: <slrng9vsqm.ga7.usenet-2008@radon.home.lichteblau.com>

On 2008-08-10, Robert Maas, http://tinyurl.com/uh3t <··················@spamgourmet.com.remove> wrote:
> Now something important I've been wanting to ask in this article:
> In XML there's a sort of equivalence between attributes and entities:

When you write "entities", do you mean "elements"?

>  <gorilla> <mood>angry</mood>
>             <name>Koko</name>
>             <speak>My name is a secret. Try to guess.</speak>
>  </gorilla>
>
>  <gorilla mood="angry" name="Koko">
>            <speak>My name is a secret. Try to guess.</speak>
>  </gorilla>
>
> Is that correct?

No.

> The general question is whether it's absolutely necessary to
> distinguish between attributes and entities.

In general, yes.

But of course, if you limit yourself to a specific DTD that happens to
be unambiguous in this regard, you can distinguish between the two cases
automatically.

(I won't go into the details of sexp representations for HTML/XML,
because I dislike all of them.  My recommendation is to keep your HTML
out of the .lisp files entirely.)

> P.S. The more I work out the optimal design, the more I think I'm
> not going to bother using what somebody else wrote, it's easier to
> write it from scratch myself. Will somebody quickly show me to an
> existing package/module that does what I want before I get started
> *really* re-inventing yet another wheel?

Design for *what*?  What is the use case?

I agree that sending out HTML to the client while still building it is
usually a bad idea.  Except for those few requests where you know that
you will be streaming a large document, always build the document in
memory first, then write it to a stream when done.

But what are those insertion pointers for?  When you error out while
building your HTML, throw it away entirely and send an error message
instead.

If you insist on insertion pointers, I'm with KMP.  Build on top of any
existing library for a data model (DOM or otherwise), and keep the
pointers outside of that data structure.

That also answers your question about the libraries suitable for this
task: All of them, because your fancy builder with insertion pointers
would be layered strictly on top of them.

d.

From: Robert Maas, http://tinyurl.com/uh3t
Subject: Re: Machine-editable DOM/sexpr model specifying HTML output to be generated?
Date: Sun, 31 Aug 2008 22:22:58 +0000
Message-ID: <rem-2008aug31-004@yahoo.com>

> From: David Lichteblau <···········@lichteblau.com>
> When you write "entities", do you mean "elements"?

Ouch, I think you're correct, sorry for my guessing at jargon.

> > The general question is whether it's absolutely necessary to
> > distinguish between attributes and entities [sic].
> In general, yes.

OK, then I'll have to keep them clearly separate when I'm building
them into my DOM of SGML/XML, and then generate the appropriate
syntax <foo>text</foo> or foo="bar" per element or attribute
respectively. Thus embedded inside an element called bar, I'd generate
  <bar><foo>text</foo></bar>
or
  <bar foo="text"></bar>
respectively.

> But of course, if you limit yourself to a specific DTD that
> happens to be unambiguous in this regard, you can distinguish
> between the two cases automatically.

Since my proposed mutable-DOM would be totally general purpose, not
just multi-DTD but SGML/XML ambiguous, I'll distinguish elements
from attributes from the very start. I'll also restrict the
rendering from DOM to SGML/XML to the common subset, i.e. never use
XML explicitly-self-closing tags such as <foo/> or <foo />. The
only thing I'll need to special case would be HTML
implicitly-self-closing tags such as <br>. I'm not sure at this
point how I want to deal with such special cases. If the only such
cases I ever generate are <br> and <hr>, I think a hardwired list
of such exceptions would be good enough. Another idea is to
explicitly express these differently in the DOM. Thus maybe:
  (:P "Some text here." "Some more text here." :BR
      "After line break, more text." "Last part of text.")
instead of what I would have done the more uniform way:
  (:P "Some text here." "Some more text here." (:BR)
      "After line break, more text." "Last part of text.")
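A small sketch of the hardwired-exception idea, written so that it accepts either notation above (a bare :BR or the uniform (:BR)). The names *VOID-TAGS* and RENDER-HTML are hypothetical, attributes are left out to keep it short, and the tag list is just the two cases mentioned in the text.

```lisp
(defparameter *void-tags* '(:br :hr)
  "Tags rendered without a closing tag (assumed list, per the text).")

(defun render-html (node stream)
  "Write NODE to STREAM; NODE is a string, a bare keyword, or (tag . children)."
  (cond ((stringp node)
         (write-string node stream))
        ((keywordp node)
         ;; A bare :BR is treated exactly like (:BR).
         (render-html (list node) stream))
        (t
         (let ((name (string-downcase (symbol-name (first node)))))
           (format stream "<~A>" name)
           ;; Void tags get neither children nor a closing tag.
           (unless (member (first node) *void-tags*)
             (dolist (child (rest node))
               (render-html child stream))
             (format stream "</~A>" name))))))
```

Since both spellings render the same way, the choice between them only affects how the DOM reads in debugging output.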

> (I won't go into the details of sexp representations for
> HTML/XML, because I dislike all of them.  My recommendation is to
> keep your HTML out of the .lisp files entirely.)

Nobody would ever see the s-expression representation except
during debugging. The DOM would exist only in the core image, as
pointy structures of nested lists (CONS trees), never in printed
s-expr form except for debugging. Or maybe I might save a partially
complete DOM to a disk file, in which case it'd show as
s-expression notation there, but that sorta kills all the edit
marks within it, requiring them to be found again upon re-load,
unless I save both the DOM and the list of edit marks as a single
structure with *print-circle* turned on, maybe that's the way to
go if I ever need to save a "draft version" of a SGML/XML-DOM to
disk. Whether I ever have a literal DOM in source code is another
question. Because literals are immutable, I'd need to do a deep
copy from the literal to the mutable object before I could start
editing it, which might be more trouble than it's worth. For such a
constant value starting point for editing, it might be better to
write an EVALuable s-expression which simply nests constructors for
the starting-point DOM, whereby I have a giant LET* with local
variables pointing to each piece as it's built and to each edit
point within such a piece, and at the end the LET* returns the
entire DOM+EDITLIST structure. Then I just pass that to EVAL
whenever I want to create a new instance of that
fixed-starting-point mutable-DOM. I'm not sure whether that would
be better or worse than simply hand-coding in bottom-up DEFUNs all
the pieces and edit marks. Bottom-up building in LET*, or bottom-up
building in DEFUNs, can't decide:
  (defun make-standard-dom ()
    (make-html-dom (list (make-standard-head-dom) (make-standard-body-dom))))
  (defun make-standard-head-dom ()
    (make-head-dom (list (make-standard-title-dom) (make-standard-author-dom))))
or
  (defparameter g*standard-dom-let
    '(let* ((title (make-standard-title-dom))
            (author (make-standard-author-dom))
            (head (make-head-dom (list title author)))
            (body (make-standard-body-dom))
            (html (make-html-dom (list head body))))
       html))
  (defun make-standard-dom ()
    (eval g*standard-dom-let))
I'm ignoring edit marks in that toy example.
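For comparison, the deep-copy route mentioned above can be sketched with the standard COPY-TREE. The names *ABORT-TEMPLATE* and MAKE-FRESH-DOM are hypothetical, and the template is the failsafe stub from earlier in the thread. Note that COPY-TREE copies only the conses, so the strings stay shared with the literal, and any edit marks would have to be located in the copy afterwards.

```lisp
(defparameter *abort-template*
  '(:html (:head (:title "Abort"))
          (:body "Some unknown fatal error happened."))
  "Literal template; never modified directly.")

(defun make-fresh-dom ()
  "Return a deep copy of the template whose conses are safe to mutate."
  (copy-tree *abort-template*))

(defun demo-fresh-dom ()
  "Append to the copy's BODY and return the copy; the template is untouched."
  (let ((dom (make-fresh-dom)))
    ;; (third dom) is the (:BODY ...) list inside the copy.
    (setf (cdr (last (third dom))) (list "More text."))
    dom))
```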

> > P.S. The more I work out the optimal design, the more I think I'm
> > not going to bother using what somebody else wrote, it's easier to
> > write it from scratch myself. Will somebody quickly show me to an
> > existing package/module that does what I want before I get started
> > *really* re-inventing yet another wheel?
> Design for *what*?  What is the use case?

I have some existing CGI applications which spew out HTML text at
random times during execution. This is a pain because I have to
deal with the optimum point to issue the HTTP header which tells
whether the output will be text/plain (optimum for issuing bug
reports) or text/html (necessary for good output). The HTTP header
can be emitted only once, and there are several places it might be
emitted, so whenever it *might* be emitted I need to check a global
flag to see if it's already emitted and not do it again. This means
output from bug reports will be in two different formats depending
on whether the text/html header has already been emitted (because
some ordinary output has already spewed out before the bug was
detected) or the bug is caught before the first HTML output (in
which case the text/plain header is emitted at the point of the bug
report). It's a crock. I want to write some new CGI applications in
the near future, and I want to do it **right** this time around,
holding *all* SGML output in a DOM until I know whether there's a
bug or not. If there's a bug, ignore the DOM and instead emit
text/plain header and the detailed bug report and a general idea
how much of the DOM had been generated prior to the occurrence of
the bug. If there's no bug, and the very bottom of the program is
reached, *then* emit text/html header and convert the entire DOM to
HTML at once and immediately close standard output before anything
can go wrong.
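The flow described above, where exactly one header is chosen after the fact, might be sketched as follows. RUN-CGI and RENDER-SIMPLE are hypothetical names, the DOM is a bare (tag . children) list without attributes, and the header is terminated with plain newlines on the assumption that the web server accepts that.

```lisp
(defun render-simple (node stream)
  "Write NODE (a string or (tag . children)) to STREAM as markup."
  (if (stringp node)
      (write-string node stream)
      (let ((name (string-downcase (symbol-name (first node)))))
        (format stream "<~A>" name)
        (dolist (child (rest node))
          (render-simple child stream))
        (format stream "</~A>" name))))

(defun run-cgi (build-function &optional (stream *standard-output*))
  "Let BUILD-FUNCTION fill a DOM in place, then emit one header and one body."
  (let ((dom (list :html)))
    (multiple-value-bind (content-type body)
        (handler-case
            (progn
              (funcall build-function dom)
              ;; Render into a string first, so a rendering error
              ;; still ends up in the text/plain branch.
              (values "text/html"
                      (with-output-to-string (s) (render-simple dom s))))
          (error (condition)
            (values "text/plain"
                    (format nil "Bug report: ~A~%DOM built so far: ~S~%"
                            condition dom))))
      ;; Nothing has been written before this point.
      (format stream "Content-Type: ~A~%~%" content-type)
      (write-string body stream)
      content-type)))
```

Because nothing is written until the outcome is known, no global "header already emitted" flag is needed.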

> I agree that sending out HTML to the client while still building
> it is usually a bad idea.  Except for those few requests where you
> know that you will be streaming a large document, always build the
> document in memory first, then write it to a stream when done.

OK then you agree that DOM for this application is a good idea, and
the only reason my early applications didn't use it was that at
first I was just learning how to do CGI and it's best to learn just
one thing at a time, then later I was "into" the spew-out way of
doing it, with my existing library of HTML-spew functions I could
call as needed, and it just didn't occur to me to re-do everything
as DOM instead of just continuing to use my old (2001.Jan)
HTML-spew library. But with any new idea, there's a first time it's
conceived, which was in the near past, and a first time it's used,
which is currently planned for the near future, as soon as I work
out the design (or find an existing package somebody else wrote
which does what I want).

> But what are those insertion pointers for?

As I build the DOM, I want at all times for the current DOM to be
globally valid, yet I want at all times to have the potential to
add more to it later. When a bug occurs, I might want to display
the DOM as it stands in partially-complete form, as an aid to
debugging. For example, maybe something else already went wrong
earlier, but I didn't trap it as a bug. I can fix two bugs in one
cycle if I can see the trapped bug and also inspect the earlier DOM
to notice things that don't look right. I might generate the
earlier DOM in HTML, or in s-expression, depending on what I feel
would help debugging best. I might generate the error page in
text/html with the error report first in <pre>, then three copies
of the DOM, one <pre> of s-expr, one <pre> of HTML, and one
actually rendered HTML. Having the mutable DOM in its partially
complete state sitting there at any moment I might want to see it,
gives me lots of options for display or discard of that DOM, and
gives me the option of changing the debug-display-mode settings
from call to call as I change my mind whether to see the DOM or not
and what mode(s) to use to show the DOM.

So at the start I have a DOM with just the head and body, with just
a couple items in the head and nothing at all in the body. Then I
start computing various new parts to add to head or body, and the
edit marks are to show me where to put the next item in any of
those places, so that they appear *after* anything previously
added, and I don't have to scan from the start of each object down
to the last CONS cell each time to find where to insert the next
item at the end. Even worse, I don't have to travel several levels
deep in nested DOM every time I want to add one new piece of text
inside a table data inside a table row inside a frame inside a body
inside the html. The edit mark takes me instantly to the CONS cell
whose CDR needs to be patched to append one more item at that point
deep inside the overall DOM. It's like TCONC (from BBN/UCI lisp) as
alternative to NCONC, extended to deep structures instead of just
one level of list.
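A minimal sketch of such a TCONC-style edit mark, assuming a mark is simply a one-element holder whose CAR points at the current last cons of some list inside the DOM. MAKE-TAIL-MARK, APPEND-AT-MARK and DEMO-TAIL-MARKS are hypothetical names.

```lisp
(defun make-tail-mark (list)
  "Return an edit mark: a holder pointing at the last cons of LIST."
  (list (last list)))

(defun append-at-mark (mark item)
  "Append ITEM after the marked cons in constant time and advance the mark."
  (let ((new-cell (list item)))
    (setf (cdr (first mark)) new-cell   ; patch the CDR of the old tail
          (first mark) new-cell)        ; the new cell is now the tail
    item))

(defun demo-tail-marks ()
  "Grow two nested lists through their marks without rescanning the DOM."
  (let* ((paragraph (list :p "First line."))
         (body (list :body paragraph))
         (dom (list :html (list :head) body))
         (p-mark (make-tail-mark paragraph))
         (body-mark (make-tail-mark body)))
    (append-at-mark p-mark "Second line.")
    (append-at-mark body-mark (list :p "Another paragraph."))
    (append-at-mark p-mark "Third line.")
    dom))
```

Each append touches only the marked cell, no matter how deeply the paragraph is nested, and the marks live outside the DOM itself.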

> When you error out while building your HTML, throw it away
> entirely and send an error message instead.

Maybe, maybe not, depending on circumstances of the error.

> If you insist on insertion pointers, I'm with KMP.  Build on top
> of any existing library for a data model (DOM or otherwise), and
> keep the pointers outside of that data structure.

I've come to that conclusion too, partly because of KMP's feedback.
Unfortunately nobody has shown me where to find any existing library
for mutable DOM for SGML/XML, so I'll probably have to build my own
library from scratch. But now that I understand what's needed
better, and have posted some articles describing my tentative
design, the rest of the design and the implementation ought to be
less than a week of work. My first application, as I develop it,
will probably be my WAP (mobile InterNet on cellphone) "hello CGI
plus 3 steps" <http://www.rawbw.com/~rem/WAP/HelloPlus/wStep3.html>
All the current examples there use the spew-on-the-fly methodology.
I plan to re-implement the Common Lisp example to use mutable DOM,
and then maybe translate that design to one or more of the other
languages, with no garbage collection whatsoever in C and C++,
because CGI applications are short-lived where memory never gets
exhausted anyway so reclaiming it isn't necessary.

> That also answers your question about the libraries suitable for this
> task: All of them, because your fancy builder with insertion pointers
> would be layered strictly on top of them.

Yeah. I have no idea where to find "all of them", and no good idea
where to find any of them. I don't even know if even one
mutable-DOM library exists. The libraries I saw mentioned a few
months ago seemed to involve one-pass bottom-up building, with no
provision for showing an intact-yet-incomplete overall structure
when an abort happens. Like if you're nine levels deep inside
function calls:
  Enter Make-HTML arg
   Enter LIST
    Enter Make-HEAD arg
     ...
    Exit Make-HEAD
    Enter Make-BODY
     Enter LIST
      Enter Make-SECTION
       Enter LIST
        Enter Make-P
         Enter LIST
          Enter Make-text
          Exit Make-text
          Enter Make-IMG
           ERROR from Make-IMG
very few of the side branches have been completed, and they are
located in other stack frames which are easily accessible to the
interactive debugger but essentially inaccessible to the CGI
framework.

From: Kenny
Subject: Re: Machine-editable DOM/sexpr model specifying HTML output to be generated?
Date: Mon, 11 Aug 2008 01:45:02 +0000
Message-ID: <489f99a1$0$20936$607ed4bc@cv.net>

Robert Maas, http://tinyurl.com/uh3t wrote:
> I'm currently planning work on a project for a CGI/CMUCL
> application, where I would like to build a data structure
> representing the parse tree of HTML output, i.e. a DOM
> (Document-Object Model). But rather than build all the pieces
> bottom-up and put them together at the last moment, I want to build
> an editable DOM, which starts out with a generic ABORT message
> which is later replaced by either a more specific ABORT message or
> a specific user-error message or normal valid output or a mix of
> normal output and a user-error message. I'll have a toplevel CATCH
> and/or CATCH-ALL-ERRORS around my entire application, and if
> anybody ever throws to that CATCH then the DOM will be quickly
> patched to include something expressing the parameter to the THROW
> or ERROR, and the entire contents of the DOM after that edit
> will be sent out immediately.
> 
> The purpose is to allow abort (deliberate, or due to program bug,
> it shouldn't matter which) at any time in the middle of a
> HTTP/CGI/CMUCL transaction to be handled by a single unified
> mechanism, thereby establishing failsafe operation in all cases.
> And to unify the code, normal successful program termination would
> share the same code, somewhat analogous to the way the Unix shell
> treats return code of 0 to indicate successful return from
> application but no matter whether there was an error or not there's
> exactly the same graceful return from the application to the Unix
> shell.
> 
> Before I go on, does my purpose (second paragraph) sound reasonable
> and obtainable, and does my proposed mechanism (first paragraph)
> sound like a reasonable way to achieve the goal implied by the
> purpose? Or is there a better way to think this through?
> 
> In all my CGI applications to date, I have functions that I call to
> generate HTML output to standard output live at the moment I call
> the HTML-generation functions, which are buffered by the Apache
> server until my application quits. Unfortunately this makes coding
> tricky for dealing with deciding whether to generate text/html or
> text/plain output, since the header for that must be generated,
> hence that decision committed, before my program reaches any point
> where subsequent output might be generated. I hack my way past it
> by keeping a flag as to whether the HTTP header has been generated
> yet, and on ABORT I generate the header if it wasn't yet
> generated. I would like *NOT* to continue that tradition in my next
> major CGI application, hence my thinking about the editable-DOM
> method as an alternative.
> 
> In recent years I've seen mention of various sexpr syntax for
> representing the structure of HTML files, which may be used by a
> parser or by a generator. It wasn't clear to me which of these
> open-source and other software packages allowed editing within a
> structure and which allowed *only* bottom-up building of structure
> with no further editing supported. So my purpose in posting today
> is to ask anyone with experience with any of those packages to
> please identify the name of the package you know about and tell me
> how well the package supports automated editing to build from a
> skeleton abort message towards the final DOM product just prior to
> translating to actual HTML text. I.e. how suitable is each
> HTML/DOM/sexpr package for my intended use.
> 
> Names that I recall for file-extensions on static DOM/sexpr pages
> that trigger automatic translation to HTML are .LHP (analogous to
> PHP) and .LSP (analogous to JSP and ASP). The one that most
> appealed to me was one that used (in CAR position of sexpr) keyword
> symbols for HTML tags, such as (:B "This text is bold."), and
> ordinary symbols for Lisp functions to call to generate the sub-form
> to insert at this place in the overall expression, such as
> (FORMAT "~&The current time is ~D seconds since the era began.~%"
>   (GET-UNIVERSAL-TIME))
> But since those discussions were about static overall Web pages
> that just happened to be expressed in DOM/sexpr form instead of
> SGML/XML form, and allowed small pieces to be generated by live
> Lisp code, there wasn't any discussion of the ability to build such
> a structure in a Lisp core-image by multiple edits until it's all
> finished and ready to convert to HTML, where Lisp code could
> traverse the *entire* DOM structure instead of only working with the
> tiny place where the code was embedded. Of course I suppose
> somebody could "cheat" by wrapping a DOM-editor around the entire
> document, or around a failsafe stub, like this:
> (MAIN-EDITOR '((:HTTP "text/plain")
>                (:HTML (:HEAD (:TITLE "Abort"))
>                       (:BODY "Some unknown fatal error happened."))))
> MAIN-EDITOR could then do whatever edits were required per my idea.
> But unless the DOM software had good utilities to support finding
> the desired place in the structure to do each edit, it would be a
> bear to write the code for MAIN-EDITOR.
> So I'm asking if somebody has experience with hacking one of those
> packages as summarized just above, or whether any of the packages
> actually *supports* my overall design in a non-hackish way.
> 
> In particular, one of my ideas if I needed to write the whole thing
> myself, "re-inventing the wheel", if none of the available packages
> can do the job reasonably, is that various places in the structure
> would have a continuation tag in the CDR of the last cell in the
> level of list, which would be replaced by NCONCing more statements
> at the same level followed by that continuation tag at the new end.
> Thus (:BODY . :BODYMORE)
> might later be spliced to read
>      (:BODY (:H1 "My DOM Hello World attempt")
>             . :BODYMORE)
> then later:
>      (:BODY (:H1 "My DOM Hello World attempt")
>             (:P "First line of test paragraph."
>                 . :PMORE)
>             . :BODYMORE)
> then later:
>      (:BODY (:H1 "My DOM Hello World attempt")
>             (:P "First line of test paragraph."
>                 "Testing 1 2 3 ... infinity!"
>                 . :PMORE)
>             . :BODYMORE)
> then later:
>      (:BODY (:H1 "My DOM Hello World attempt")
>             (:P "First line of test paragraph."
>                 "Testing 1 2 3 ... infinity!"
>                 . :PMORE)
>             (:P "Heres's a second paragraph, just one line, all done already.")
>             . :BODYMORE)
> then if that form as it stands there is suddenly submitted for
> conversion to HTML, all the continuation tags are immediately
> flushed first, so this is the form that gets converted to HTML:
>      (:BODY (:H1 "My DOM Hello World attempt")
>             (:P "First line of test paragraph."
>                 "Testing 1 2 3 ... infinity!"
>                 )
>             (:P "Heres's a second paragraph, just one line, all done already.")
>             )
> 
> But maybe one of the existing HTML/DOM/sexpr packages has a better
> idea for making it easy to edit a growing DOM...

You are getting very warm...

>... as the application
> computes more items to insert into it various places?

Doh! You are so cold you have icicles on your ears. You want a paradigm 
not in which procedural code builds a DOM but in which the DOM builds 
itself. It grows.

The math engine here http://www.theoryyalgebra.com/ does not build a 
solution, the solution grows by itself. The first step of the solution 
is really the problem statement, and it sires children each step of 
which is a plausible second step in the solution. It was mildly tricky 
persuading this to happen breadth first, because we want to stop when 
the shortest  solution... I digress.

In your case you would be writing declarative rules for various nodes in 
the DOM to grow their expansions, with CLOS instances for each node 
until you are ready to cull the actual XML. I believe you will find a 
starter kit in the OpenAIR project on common-lisp.net.

Note now you do not need a cursor, because you are not building the DOM. 
Each node knows where it is so knows where to grow.

As for the abort... will NIL where the XML should be suffice as "bad 
news"? If not, Cells would need hacking....nah, when you trap the error 
just sweep the tree and where you find nil look to a second "Failsafe" 
slot for instance-specific values and default that to :kaboom or something.
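The sweep suggested above could be sketched in plain CLOS along these lines. This is not the Cells API: the class DOM-NODE, its slots, and SWEEP-FAILSAFE are hypothetical names, and a node's output is assumed to live in a CONTENT slot that stays NIL until it has been computed.

```lisp
(defclass dom-node ()
  ((tag      :initarg :tag      :reader node-tag)
   (content  :initarg :content  :initform nil     :accessor node-content)
   (failsafe :initarg :failsafe :initform :kaboom :reader node-failsafe)
   (children :initarg :children :initform '()     :accessor node-children)))

(defun sweep-failsafe (node)
  "Replace NIL content anywhere under NODE with that node's failsafe value."
  (when (null (node-content node))
    (setf (node-content node) (node-failsafe node)))
  (mapc #'sweep-failsafe (node-children node))
  node)
```

After trapping an error, one call on the root leaves every unfinished node carrying either its own failsafe value or the :KABOOM default.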

hth, kenny

-- 

$$$$$: http://www.theoryyalgebra.com/
Cells: http://common-lisp.net/project/cells/
BSlog: http://smuglispweeny.blogspot.com/

From: Robert Maas, http://tinyurl.com/uh3t
Subject: Re: Machine-editable DOM/sexpr model specifying HTML output to be generated?
Date: Sun, 31 Aug 2008 22:43:11 +0000
Message-ID: <rem-2008aug31-005@yahoo.com>

> From: Kenny <·········@gmail.com>
> ... You want a paradigm not in which procedural code builds a DOM
> but in which the DOM builds itself. It grows.

Well a DOM is a static data structure (despite the fact that it's
mutable), so there's no way it can possibly build itself. Either
some agent must traverse it and notice places that are tagged for
growth, or some external program must "know" where to grow it from
a priori knowledge.

> The math engine here http://www.theoryyalgebra.com/ ...
   Free!
   Trial Version for Windows*
   Download Now!
   (built 26-Aug-2008 08:00 GMT)
   * Mac version available by
   September 15.
This must run on FreeBSD Unix, the only computer with CGI service
that I have an account on. I don't see any such version listed, so
I don't believe it would be of any use to me.

If I decide to take that approach and "re-invent the wheel" for FreeBSD:
> In your case you would be writing declarative rules for various
> nodes in the DOM to grow their expansions, with CLOS instances for
> each node until you are ready to cull the actual XML.

What does "cull" mean there? That word in that context doesn't make
sense to me. "Cull" means to kill off inferior animals to reduce
the size of the herd, so that the survivors will have more food and
living space, and improve the average of the herd by reducing the
weight of the inferior members in the average. Would a different
word make more sense?

Why are CLOS instances needed at all for this application?

> I believe you will find a starter kit in the OpenAIR project on
> common-lisp.net.

   This one uses ... XMLHttpRequest ...

That doesn't sound like it'd work in CMUCL to generate HTML 3.2 output.

I don't see any design documents on that Web site that give more
details about what it does and what it can't do.

From: Rob Warnock
Subject: Re: Machine-editable DOM/sexpr model specifying HTML output to be generated?
Date: Mon, 01 Sep 2008 07:01:58 +0000
Message-ID: <45idnXkAjvP7DibVnZ2dnUVZ_uWdnZ2d@speakeasy.net>

Robert Maas <··················@spamgourmet.com.remove> wrote:
+---------------
| From: Kenny <·········@gmail.com>
| > In your case you would be writing declarative rules for various
| > nodes in the DOM to grow their expansions, with CLOS instances for
| > each node until you are ready to cull the actual XML.
| 
| What does "cull" mean there? That word in that context doesn't make
| sense to me.
+---------------

I suspect that Kenny might have meant "reap" or "harvest". It's common
when dealing with any kind of generated code (e.g., in a compiler) to
have additional auxiliary information in the in-memory data structure
(tree or DAG or general graph, whatever) and then once you've optimized
everything to your liking make one final walk of the in-memory data
("DOM", if you will) to emit the final output format -- XML in this case.

Kenny's use of "cull" could be quite reasonably seen as a flip side
of the usual approach in which, instead of reaping the desired bits,
one culls everything *but* the desired final output format. Viewing
it either way, you're separating the wheat from the chaff (to add yet
another metaphor to the stew).

+---------------
| Why are CLOS instances needed at all for this application?
+---------------

I dunno, because it might make it easier to customize the various
optimization/walker phases???  ;-}


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607

From: Robert Maas, http://tinyurl.com/uh3t
Subject: Re: Finally making use of PowerLisp despite several **horrible** bugs it has
Date: Mon, 01 Sep 2008 00:00:00 +0000
Message-ID: <rem-2008aug31-006@yahoo.com>

> From: Kenny <·········@gmail.com>
> >> PowerLisp 2.01 68k, released in 1994, apparently the last version
> >> of PowerLisp that works on 68xxx CPUs such as in my Macintosh
> >> Performa
> > You're diving through the wrong dumpsters for equipment.
> dumpster schmumpster, someone would probably give him $10 to take
> away a nice system and a couple of 19" monitors.

There's no room for any such here. You need to have somebody come
visit me and just look around and see how grossly overcrowded it
already is in this tiny room smaller than a typical motel room, and
report back to you there's no room for even one desktop computer or
monitor. There's not even room for a laptop, except on my lap when
I'm using it and under the bed when I'm not using it.

> I just wish he lived nearby. :)

You mean you appreciate me enough that you wish you could come visit me?
Do you know anyone in the SF bay area who appreciates me?
Do you know anyone in Sunnyvale who knows "beans" about Linux?