Editor buffer CLOS update protocol.

From: Christopher R. Barry
Subject: Editor buffer CLOS update protocol.
Date: Mon, 06 Mar 2000 00:00:00 +0000
Message-ID: <87g0u37it1.fsf@2xtreme.net>

I've basically finished writing a simple text editor as a school
programming assignment. It works, but as I added various extra-credit
features to it one at a time, I found myself having to rewrite pieces
of code not directly related to the functionality being added, and
introducing a lot of annoying bugs and special cases. Every feature I
add has to have knowledge of buffer-state internals and update the
buffer object's slots itself.

I'd like to rewrite the program to use a good CLOS design, and to
create protocols for modifying the buffer so that invariants are
maintained more automatically via method dispatch rather than via
explicit coding in each function.

I don't know how to write this editor using a CLOS framework though.

If that was too vague to give a reader the gist of what I am talking
about, then I offer the following example:

Say I have a text editor buffer class:

  (defclass buffer ()
    (...
     ... zillions of slots ...
     ...
     (position-of-point)
     (current-line-number)
     (total-lines-count)))

Now say I want to have one function (or two distinct functions) that
inserts a new line into the buffer, either before the current line or
after it. If I insert a line before the current line, then the
current-line-number does not change, but position-of-point and
total-lines-count do. If I insert a line after the current one, then
all three of the example slots change. To program this in a style
similar to how I have been doing things, I would do something like:

  (defun insert-line-before-current (line buffer)
    (with-slots (...) buffer
      ...))

  (defun insert-line-after-current (line buffer)
    (with-slots (... current-line-number) buffer
      ...))

What I want (or at least what I think I want) is to have a single
point that every function that updates the current-line-number goes
through, and a single point that every function that updates the
total-lines-count goes through, and so on, so that it will be more
difficult (if not impossible) for a misprogrammed function that adds a
new feature to make a buffer's internal state inconsistent, like
inserting a line and then not updating the total-lines-count. (In that
specific example I could use an "insert-line" protocol perhaps, but I
don't know how I would write one with CLOS, if a separate
line-insertion protocol is indeed what I want.)

Thank you in advance to anyone who has the time and expertise to help me,
Christopher


From: Paolo Amoroso
Subject: Re: Editor buffer CLOS update protocol.
Date: Tue, 07 Mar 2000 00:00:00 +0000
Message-ID: <9VnFOLXI76Mg+g+EV3BgEMRWYoFl@4ax.com>

On Mon, 06 Mar 2000 23:49:40 GMT, ······@2xtreme.net (Christopher R. Barry)
wrote:

> I'd like to rewrite the program to use a good CLOS design, and to
> create protocols for modifying the buffer so that invariants are
> maintained more automatically via method dispatch rather than via

I don't know whether it's relevant, but it might help to use a
constraint-based object system such as KR, which is part of Garnet. The
user interface toolkit SLIK, implemented in CLOS, is based on a similar
approach (it uses abstract behavioral types, events and mediators). Source
code and documentation are available at:

  ftp://ftp.radonc.washington.edu:/dist/slik/

SLIK is part of the Prism radiation therapy planning system developed at
the University of Washington:

  http://www.radonc.washington.edu/medinfo/prism/

> Thank you in advance to anyone who has the time and expertise to help me,

I do have the time, but unfortunately not enough expertise :)

Paolo
-- 
EncyCMUCLopedia * Extensive collection of CMU Common Lisp documentation
http://cvs2.cons.org:8000/cmucl/doc/EncyCMUCLopedia/

From: Christopher R. Barry
Subject: Re: Editor buffer CLOS update protocol.
Date: Wed, 08 Mar 2000 00:00:00 +0000
Message-ID: <87og8q59e5.fsf@2xtreme.net>

Paolo Amoroso <·······@mclink.it> writes:

> On Mon, 06 Mar 2000 23:49:40 GMT, ······@2xtreme.net (Christopher R. Barry)
> wrote:
> 
> > I'd like to rewrite the program to use a good CLOS design, and to
> > create protocols for modifying the buffer so that invariants are
> > maintained more automatically via method dispatch rather than via
> 
> I don't know whether it's relevant, but it might help to use a
> constraint-based object system such as KR, which is part of Garnet. The
> user interface toolkit SLIK, implemented in CLOS, is based on a similar
> approach (it uses abstract behavioral types, events and mediators). Source
> code and documentation are available at:

[...]

No, the GUI isn't the problem. The buffer representation is. BTW, the
CMUCL Hemlock sources were instructive (from struct.lisp):

  ;;; The buffer object:
  ;;;
  (defstruct (buffer (:constructor internal-make-buffer)
                     (:print-function %print-hbuffer)
                     (:copier nil)
                     (:predicate bufferp))
    "A Hemlock buffer object.  See Hemlock Command Implementor's Manual for details."
    %name                     ; name of the buffer (a string)
    %region                   ; the buffer's region
    %pathname                 ; associated pathname
    modes                     ; list of buffer's mode names
    mode-objects              ; list of buffer's mode objects
    bindings                  ; buffer's command table
    point                     ; current position in buffer
    (%writable t)             ; t => can alter buffer's region
    (modified-tick -2)        ; The last time the buffer was modified.
    (unmodified-tick -1)      ; The last time the buffer was unmodified
    windows                   ; List of all windows into this buffer.
    var-values                ; the buffer's local variables
    variables                 ; string-table of local variables
    write-date                ; File-Write-Date for pathname.
    display-start             ; Window display start when switching to buf.
    %modeline-fields          ; List of modeline-field-info's.
    (delete-hook nil))        ; List of functions to call upon deletion.


If my LispM were powered up, I'd probably find the Zmacs sources most
instructive....

> I do have the time, but unfortunately not enough expertise :)

I also looked at the XEmacs 21 sources; editors have got to be one of
the most difficult classes of programs there are to implement. (If you
want something efficient and like Emacs, at least.)

Christopher

From: Paolo Amoroso
Subject: Re: Editor buffer CLOS update protocol.
Date: Wed, 08 Mar 2000 00:00:00 +0000
Message-ID: <e4jGOHtGOuGv0mRsM712cHXGvDUu@4ax.com>

On Wed, 08 Mar 2000 05:07:37 GMT, ······@2xtreme.net (Christopher R. Barry)
wrote:

> No, the GUI isn't the problem. The buffer representation is. BTW, the

I know. But although KR and SLIK's constraint system happen to have been
used for GUI development in Garnet and SLIK, I mentioned them for a
different reason. I meant to suggest that their constraint-based,
declarative approach might also be useful for maintaining buffer
invariants. KR was designed as a general purpose constraint-based
object/knowledge representation system.


Paolo

From: Jason Trenouth
Subject: Re: Editor buffer CLOS update protocol.
Date: Thu, 09 Mar 2000 00:00:00 +0000
Message-ID: <m3THOCQ+QNp=jeO58UCk08154Xf8@4ax.com>

On Wed, 08 Mar 2000 05:07:37 GMT, ······@2xtreme.net (Christopher R. Barry)
wrote:

> Paolo Amoroso <·······@mclink.it> writes:
> 
> > On Mon, 06 Mar 2000 23:49:40 GMT, ······@2xtreme.net (Christopher R. Barry)
> > wrote:
> > 
> > > I'd like to rewrite the program to use a good CLOS design, and to
> > > create protocols for modifying the buffer so that invariants are
> > > maintained more automatically via method dispatch rather than via
> 
> [...]
> 
> .. The buffer representation is [important]. BTW, the
> CMUCL Hemlock sources were instructive (from struct.lisp):
>
> ..
>
> If my LispM were powered up, I'd probably find the Zmacs sources most
> instructive....
>
> ..
>
> I also looked at the XEmacs 21 sources; editors have got to be one of
> the most difficult classes of programs there are to implement. (If you
> want something efficient and like Emacs, at least.)

Another place to look might be the LGPL'ed sources of the Dylan editor, DEUCE,
in Fun-Dev 2.0 beta: http://www.fun-o.com

Dylan is fairly CLOSish so you might get some ideas on editor data structures
and protocols.

Note that DEUCE is a 'hard sectioning' editor: i.e., its internal buffer
representation is broken into sections, and it is the sections that contain
lines. The sections are created from program source structure, e.g., one per
definition together with its accompanying comments.

A corollary of this internal representation is that DEUCE can edit 'virtual
files' which don't physically exist, but which are useful collections of
definitions that you might want to edit together: e.g. all the methods on a
generic function.
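
To make the shape of such a representation concrete, here is a rough CLOS
sketch. It is not DEUCE's actual design or API (DEUCE is written in Dylan);
the class names SECTION and SECTIONED-BUFFER, their accessors, and
MAKE-VIRTUAL-BUFFER are all invented for illustration.

```lisp
;; Hypothetical names throughout; this only illustrates "buffer -> sections -> lines".
(defclass section ()
  ((name  :initarg :name  :reader section-name)
   (lines :initarg :lines :initform '() :accessor section-lines)))

(defclass sectioned-buffer ()
  ((sections :initarg :sections :initform '() :accessor buffer-sections)))

(defgeneric total-line-count (buffer)
  (:documentation "Number of lines in BUFFER, summed over its sections."))

(defmethod total-line-count ((buffer sectioned-buffer))
  (reduce #'+ (buffer-sections buffer)
          :key (lambda (section) (length (section-lines section)))))

(defun make-virtual-buffer (predicate &rest buffers)
  "Collect the sections of BUFFERS that satisfy PREDICATE into a new buffer.
The new buffer has no file behind it, and the sections are shared, not copied."
  (make-instance 'sectioned-buffer
                 :sections (loop for buffer in buffers
                                 append (remove-if-not predicate
                                                       (buffer-sections buffer)))))
```

Because the sections are shared in this sketch, an edit made through the
virtual buffer is also visible in the buffer the section came from.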

__Jason

From: Robert Monfera
Subject: Re: Editor buffer CLOS update protocol.
Date: Tue, 07 Mar 2000 00:00:00 +0000
Message-ID: <38C517A5.6F9ED01E@fisec.com>

"Christopher R. Barry" wrote:

>   (defclass buffer ()
>     (...
>      ... zillions of slots ...
>      ...
>      (position-of-point)
>      (current-line-number)
>      (total-lines-count)))

When determining the slots of a class, I find it helpful to "normalize"
classes by capturing the _state_ of the object with the _minimum_ amount
of data, the minimum number of (conceptually meaningful) slots.  During
modeling, I'd just consider total line count a generic function.  I
don't want to manage redundant slots for the sake of faster performance
while my object model is highly volatile and limited in functionality.

Normalization also leads to insights in the form of discoveries and
sometimes unexpectedly easy implementation of cool functions.

As for the maintenance of current position, you could define a handful
of basic functions that move the cursor.  Things that are triggered by
changes to the cursor position (e.g., highlighting the matching
parenthesis :-) could be implemented as :after methods.
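
A minimal sketch of that arrangement, assuming a string-backed buffer.
MOVE-POINT-TO, the VISITED slot, and the other names are hypothetical, and
the :after method merely records the new position where a real editor would
highlight or redisplay something.

```lisp
(defclass buffer ()
  ((text    :initarg :text :initform "" :accessor buffer-text)
   (point   :initform 0 :accessor buffer-point)
   ;; Stand-in for a display hook: positions reached so far, newest first.
   (visited :initform '() :accessor buffer-visited)))

(defgeneric move-point-to (buffer position)
  (:documentation "The single primitive through which point changes."))

(defmethod move-point-to ((buffer buffer) position)
  ;; Clamp, so no caller can move point outside the text.
  (setf (buffer-point buffer)
        (max 0 (min position (length (buffer-text buffer))))))

(defmethod move-point-to :after ((buffer buffer) position)
  (declare (ignorable position))
  ;; Runs after every move, whichever command caused it.
  (push (buffer-point buffer) (buffer-visited buffer)))

;; The "handful of basic functions" are thin wrappers over the primitive.
(defun forward-char (buffer &optional (n 1))
  (move-point-to buffer (+ (buffer-point buffer) n)))

(defun backward-char (buffer &optional (n 1))
  (move-point-to buffer (- (buffer-point buffer) n)))
```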

You may segregate functions along abstraction layers (e.g., functions
related to word and s-expression handling are above character-level
functionality), aiming at low coupling between layers (separate packages
may help).

Robert

From: Christopher R. Barry
Subject: Re: Editor buffer CLOS update protocol.
Date: Tue, 07 Mar 2000 00:00:00 +0000
Message-ID: <877lff6x9r.fsf@2xtreme.net>

Robert Monfera <·······@fisec.com> writes:

> "Christopher R. Barry" wrote:
> 
> >   (defclass buffer ()
> >     (...
> >      ... zillions of slots ...
> >      ...
> >      (position-of-point)
> >      (current-line-number)
> >      (total-lines-count)))
> 
> When determining the slots of a class, I find it helpful to "normalize"
> classes by capturing the _state_ of the object with the _minimum_ amount
> of data, the minimum number of (conceptually meaningful) slots.  During
> modeling, I'd just consider total line count a generic function.  I
> don't want to manage redundant slots for the sake of faster performance
> while my object model is highly volatile and limited in
> functionality.

Consider total line count a generic function? A total-line-count
generic function would (as far as I can conceive) only have one method
that would specialize on BUFFER, so why make it generic? (I'm sure you
are giving good advice, I just think I need a more specific example to
understand you better.) Also, are you implying that the
total-lines-count should be maintained in the generic function as
state (perhaps in a closure), or are you saying that the
total-lines-count should be recomputed every time the generic function
is run? (Which seems too needlessly inefficient even for a prototype,
but maybe not?)

> As for the maintenance of current position, you could define a handful
> of basic functions that move the cursor.  Things that are triggered by
> changes to the cursor position (e.g., highlighting the matching
> parenthesis :-) could be implemented as :after methods.

So getting back to my line-insertion example; would I define a
line-insertion generic function (or a more general protocol) and
implement the buffer state updating as :AFTER methods for the
insert-line method calls? I thought of somehow doing something
analogous to this, and also of declaring :AFTER (or whatever) methods
for slot writer methods to keep the buffer state consistent, but I
couldn't figure out how to put it all together nice and clean.

> You may segregate functions along abstraction layers (e.g., functions
> related to word and s-expression handling are above character-level
> functionality), aiming at low coupling between layers (separate packages
> may help).

Low coupling is a major goal along with better encapsulation of the
buffer. I want the buffer to maintain its own invariants as much as
possible, so that when I implement new-feature-number-50, that feature
can't easily (if at all) make the buffer's state inconsistent, because
it must modify the buffer through a protocol the buffer provides for
modifying its state. (Hence the title of my post.) Every time I
implemented a new feature in the current editor codebase, I had to
rewrite lots of stuff which I don't think I should have had to, and I
had to debug lots of bugs from the buffer's state being made
inconsistent because the feature wouldn't clean up after itself right.
I have been thinking quite a lot about how to do this a better way
using CLOS, and realize now that this is probably a much more
difficult problem than I first thought and that in-depth study of my
copy of AMOP might be most productive.....

Christopher

From: Robert Monfera
Subject: Re: Editor buffer CLOS update protocol.
Date: Tue, 07 Mar 2000 00:00:00 +0000
Message-ID: <38C5D124.13654EE6@fisec.com>

"Christopher R. Barry" wrote:

> Consider total line count a generic function? A total-line-count
> generic function would (as far as I can conceive) only have one method
> that would specialize on BUFFER, so why make it generic?

The main point was that during the initial development it does not have
to be a slot, because its value can be calculated from some other slot
carrying the state.  Your function could count the lines or new line
characters in a vector, for example.

As for polymorphism, never say never.  If you ever introduce a
mini-buffer, you may create a new method returning a constant 1.  You
may also introduce a subclass of BUFFER later, e.g., for handling
extra-large files, and you'd calculate the number of lines differently.
Creating an ordinary function instead of a generic function does not
bring any benefit.  Speaking of performance, a GF with only one method
should be optimized to be just as efficient as a regular function that
possibly starts with a type (class) assertion.
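
A small sketch of both points, assuming the buffer's only state is a string
in a TEXT slot (the slot, the accessor, and the convention that an empty
buffer has one line are assumptions made for the example):

```lisp
(defclass buffer ()
  ((text :initarg :text :initform "" :accessor buffer-text)))

(defclass mini-buffer (buffer) ())

(defgeneric total-line-count (buffer)
  (:documentation "Number of lines in BUFFER, derived rather than stored."))

(defmethod total-line-count ((buffer buffer))
  ;; Recomputed with a linear scan on every call; nothing to keep in sync.
  (1+ (count #\Newline (buffer-text buffer))))

(defmethod total-line-count ((buffer mini-buffer))
  ;; A mini-buffer is one line by definition, whatever its text holds.
  1)
```

Any function that changes the text can simply set BUFFER-TEXT; it cannot
leave the count stale, because there is no count to update.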

> are you implying that the
> total-lines-count should be maintained in the generic function as
> state (perhaps in a closure),

No - maintaining state in a closure variable and in a slot are similar
in nature, to the extent that object slots and accessors can easily be
implemented with closures.  The only advantage of using a closure
variable would be data hiding, but it's not worth the effort of
basically running two sets of slots in parallel and probably you aren't
concerned about it.
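
For comparison, this is roughly what a closure-held "slot" looks like.
MAKE-LINE-COUNTER is a made-up name, and the sketch is only meant to show
the similarity to a slot with a reader and a writer, not to recommend it.

```lisp
(defun make-line-counter (&optional (count 0))
  "Return two closures sharing COUNT: a reader and a writer."
  (values (lambda () count)
          (lambda (new-count) (setf count new-count))))

;; COUNT is reachable only through the two closures, which is the
;; data-hiding advantage mentioned above.
(multiple-value-bind (get-count set-count) (make-line-counter)
  (funcall set-count 42)
  (funcall get-count))          ; => 42
```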

> or are you saying that the
> total-lines-count should be recomputed every time the generic function
> is run? (Which seems too needlessly inefficient even for a prototype,
> but maybe not?)

Yes, line-count is a good example because it's as inefficient to
recalculate as it can get.  For a largish, 64MB text file you may have
to wait a second or so to calculate it.  When building up
_functionality_ in an exploratory way, however, you aren't likely to
test with files larger than a few hundred lines.

The benefit is that you can completely redesign and rewrite GFs that
possibly have an impact on line count (e.g., character insertion or
letter size change) with no consideration whatsoever on updating the
line count slot.  In fact, you don't even have to think about all
function relations (whose number increases quadratically with the added
functionality), just the functions (whose number increases linearly as
you develop).  Faster exploration, fewer bugs.  When you add the
functionality of changing letter size, you'll only have to worry about
the buffer representation, and how the new information _in the buffer_
can be used in functions generally (including the line count function,
if necessary - maybe the line count function just invokes some parts of
the rendering functionality that takes font size into account already).
You can also quickly alter the meaning of the line count function or add
a new function, for example, to distinguish between line numbers on the
screen with line wrap on and line numbers in the file.  You avoid the
domino-effect of having to revisit a lot of functions.

It's like interpretation:  if there are delegates from 20 countries, you
may interpret speakers to one language and translate for all delegates
from that language, requiring only 20 bilingual translators rather than
190, except maybe in the EU.  The common language is the lean,
non-redundant state information carried in object slots.
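
The figures in the analogy come from counting pairs: n languages have
n(n-1)/2 distinct pairs, while a common language needs only one translator
per delegation. The two function names below are invented for the example.

```lisp
(defun direct-pairings (n)
  "Translators needed if every pair of N languages gets its own translator."
  (/ (* n (1- n)) 2))

(defun hub-pairings (n)
  "Translators needed if all N delegations go through one common language."
  n)

(direct-pairings 20)            ; => 190
(hub-pairings 20)               ; => 20
```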

This "language" may well have a lot in common with the mental model of
users of your editor, just more detailed and explicit.

Later, as your model improves and your functions are becoming more
solid, you'll be able to identify a very small set of functions that can
cause a change to the line number (for example, primitives like line
insertion, line deletion and line overflow), and you may attach an
:after method to trigger all the housekeeping necessary (e.g., by
invoking something like RECALCULATE-LINE-COUNT, which would update the
newly created slot, which in turn would trigger other functions like
page number recalculation and line count display).  So yes, it makes
sense to have a slot for the line number after the architecture becomes
stable.
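
One way that later stage might look, as a sketch only: the list-of-lines
representation, the INSERT-LINE and DELETE-LINE signatures, and the
LINE-COUNT reader are assumptions, and the follow-on triggers (page numbers,
display) are left out.

```lisp
(defclass buffer ()
  ((lines      :initform '() :accessor buffer-lines)
   ;; Cached count, added once the design is stable. It has a reader but no
   ;; public writer; only RECALCULATE-LINE-COUNT sets it.
   (line-count :initform 0 :reader line-count)))

(defgeneric recalculate-line-count (buffer))
(defgeneric insert-line (buffer line index))
(defgeneric delete-line (buffer index))

(defmethod recalculate-line-count ((buffer buffer))
  (setf (slot-value buffer 'line-count) (length (buffer-lines buffer))))

;; The primitives change the representation and nothing else.
(defmethod insert-line ((buffer buffer) line index)
  (with-accessors ((lines buffer-lines)) buffer
    (setf lines (append (subseq lines 0 index)
                        (list line)
                        (subseq lines index))))
  line)

(defmethod delete-line ((buffer buffer) index)
  (with-accessors ((lines buffer-lines)) buffer
    (setf lines (append (subseq lines 0 index)
                        (subseq lines (1+ index))))))

;; The housekeeping hangs off the primitives, so callers cannot forget it.
(defmethod insert-line :after ((buffer buffer) line index)
  (declare (ignorable line index))
  (recalculate-line-count buffer))

(defmethod delete-line :after ((buffer buffer) index)
  (declare (ignorable index))
  (recalculate-line-count buffer))
```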

> So getting back to my line-insertion example; would I define a
> line-insertion generic function (or a more general protocol) and
> implement the buffer state updating as :AFTER methods for the
> insert-line method calls?

The very core of the line insertion function is the line insertion in
the buffer itself, which you can best represent as the method itself
specialized on the buffer class, rather than an :after method.  :after
methods are useful if you have multiple levels of the class hierarchy,
for example, you have a BUFFER class ignorant about the way the buffer
is represented, and a GAP-BUFFER which is a specialization of it.
Inserting a new line or hitting an arrow key could (indirectly) invoke
the MOVE-CURSOR GF, which has a method specialized for the GAP-BUFFER
class to properly manage how a cursor move is handled in a gap editor,
and an :after method for invoking higher-level GFs like cursor
redisplay on the screen.  An :after method specialized for LISP-BUFFER
could take care of paren highlighting etc.  You can combine :before,
:after, :around and (call-next-method) to manage layers of
functionality.  You can use them on setter functions like

(defmethod (setf cursor-position) :after (new-value (buffer lisp-buffer))
   (highlight-parens buffer)
   ...)
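
The MOVE-CURSOR layering described above could be sketched like this. The
three class names come from the text; the EVENTS and GAP-START slots and the
keywords pushed onto the log are stand-ins for real gap management,
redisplay, and highlighting, and LISP-BUFFER inheriting from GAP-BUFFER is
an assumption.

```lisp
(defclass buffer ()
  ;; Log of which layer ran, newest first; replaces real side effects here.
  ((events :initform '() :accessor buffer-events)))

(defclass gap-buffer (buffer)
  ((gap-start :initform 0 :accessor gap-start)))

(defclass lisp-buffer (gap-buffer) ())

(defgeneric move-cursor (buffer position))

;; Representation layer: only GAP-BUFFER knows how a move is carried out.
(defmethod move-cursor ((buffer gap-buffer) position)
  (push :move-gap (buffer-events buffer))
  (setf (gap-start buffer) position))

;; Representation-independent layer: runs for every kind of buffer.
(defmethod move-cursor :after ((buffer buffer) position)
  (declare (ignorable position))
  (push :redisplay-cursor (buffer-events buffer)))

;; Language layer: runs only for Lisp buffers.
(defmethod move-cursor :after ((buffer lisp-buffer) position)
  (declare (ignorable position))
  (push :highlight-parens (buffer-events buffer)))
```

Under standard method combination the :after methods run least specific
first, so a LISP-BUFFER sees the gap move, then the redisplay, then the
highlighting. A plain BUFFER has no primary method in this sketch, so
calling MOVE-CURSOR on one signals an error.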

Robert