From: Ray Dillinger
Subject: Are shared-structure mutations actually important?
Date: 
Message-ID: <49207b52$0$33570$742ec2ed@news.sonic.net>
Is the ability to alter the value of one variable by 
mutating the value of a different or "alias" variable
ever actually important to our ability to solve 
problems?  I'm increasingly of the opinion that it's 
not. 

I get that saving memory was very important when lisp 
was being designed and needed to run on machines that 
people could build back then.  I get that saving memory
is still *somewhat* important today.  But, questions 
of memory size aside, and performance issues about 
copy-on-write being slower than direct mutation aside, 
is there anything we accomplish by aliased mutations 
that we couldn't accomplish with more ease and less 
confusion without them?

I'm imagining a lisp dialect with a strict copy-on-write 
rule, such that variables can only be rebound, and never 
have their values changed in place.

Eq? would be discarded as meaningless.  There'd be no 
string-set, no array-set, no rplaca/set-car!, no 
rplacd/set-cdr!, etc - or at least not as you've known 
them.  The names, if they survive, would be used for 
nondestructive operations that return newly-allocated 
values differing in the fields referenced, without 
any effect on the existing value (and other variables 
or list structure that may be bound to it).  

I have already made a "ropes" string library that works 
this way - when I change the first occurrence of " " 
after the 37489th character to ". ", I get a new 
string.  It doesn't contribute much to memory-hoggery, 
because it shares 99% or so of its low-level structure 
with the old one. It has one newly-allocated "strand" 
of 100-1000 characters containing the change, and a 
chain of a few newly-allocated tree-nodes leading to 
that strand.  Other branches of the tree nodes point
directly at existing structure shared with the original 
string.  So, if I bind the variable to the new string, 
and there's nothing that still has the value of the old 
string, then one strand and 3 or 4 tree nodes would be 
eaten by the garbage collector.  If something still has 
the value of the old string, then the new one shares 
most of its data with it and the garbage collector goes 
hungry.  From the user's POV, I have two strings with 
slightly different values, because the two variables 
are bound to different root nodes.  

Anyway, this works.  Constant-time string copy and 
logarithmic-time insertions, deletions, and unequal-
size replacement are very nice to have as string 
operations, and make some of the stuff I work with 
(where strings several million characters long 
aren't all that uncommon) work much faster. Also, 
it's very nice to be able to keep an "undo stack" 
on huge strings, just by keeping the old values of 
the string.  This is cheap, even on huge strings, 
because of the shared structure.  
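
In toy form, the sharing works like this (a minimal sketch, not 
the actual library - real ropes also rebalance and manage strand 
sizes):

(defstruct rnode left right weight)  ; WEIGHT = length of LEFT subtree

(defun rope-replace-char (r i ch)
  "Return a new rope with character I replaced by CH.  Only the
strand containing I and the nodes on the path down to it are
reallocated; every other strand and subtree is shared with R."
  (if (stringp r)
      (let ((s (copy-seq r)))          ; the one new "strand"
        (setf (char s i) ch)
        s)
      (if (< i (rnode-weight r))
          (make-rnode :left (rope-replace-char (rnode-left r) i ch)
                      :right (rnode-right r)
                      :weight (rnode-weight r))
          (make-rnode :left (rnode-left r)
                      :right (rope-replace-char (rnode-right r)
                                                (- i (rnode-weight r))
                                                ch)
                      :weight (rnode-weight r)))))

Binding the result to the same variable gives exactly the two-roots,
shared-body picture above; the old root stays valid for the undo stack.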

And now I'm looking at cons cells, and even arrays, 
and thinking.... is there a good reason why not to 
make this sort of behavior universal?  Is there anything
that would be significantly harder to do in a more 
nearly pure-functional dialect where the only mutation 
operation is setq/set! itself, and that rebinds a 
variable rather than changing its value in-place?  

                                Bear

From: ··················@gmail.com
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <44196582-aa12-415b-a328-e08cc586fd03@i18g2000prf.googlegroups.com>
On Nov 16, 2:57 pm, Ray Dillinger <····@sonic.net> wrote:
> Is the ability to alter the value of one variable by
> mutating the value of a different or "alias" variable
> ever actually important to our ability to solve
> problems?  I'm increasingly of the opinion that it's
> not.
>
> [...]
>
> And now I'm looking at cons cells, and even arrays,
> and thinking.... is there a good reason why not to
> make this sort of behavior universal?  Is there anything
> that would be significantly harder to do in a more
> nearly pure-functional dialect where the only mutation
> operation is setq/set! itself, and that rebinds a
> variable rather than changing its value in-place?  
>
>                                 Bear

It would also get rid of some of the pitfalls that lisp newbies
encounter.

However, wouldn't insertions be much more work-intensive than
they are currently?  Imagine:

(setf foo (list 1 2 3... n))
(setf (third foo) 'bar)

Unless I'm misunderstanding, you'd be copying the whole list, whereas
currently we just jump in and change the third item?

Now imagine it's a fairly lengthy list of lists.
One approach will be significantly faster than the other.

Now imagine you have code that is iterating through one list and using
it to set certain values in another list. Is it better to copy the
entire list each time or to copy it once and destructively modify?

I think the second.
From: Scott Burson
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <3437d17c-99cb-4b43-8787-a5cdaa90f678@w1g2000prk.googlegroups.com>
On Nov 16, 12:44 pm, ··················@gmail.com wrote:
> On Nov 16, 2:57 pm, Ray Dillinger <····@sonic.net> wrote:
>
> > [...]
>
> It would also get rid of some of the pitfalls that lisp newbies
> encounter.
>
> However, wouldn't insertions be much more work-intensive than
> they are currently?  Imagine:
>
> (setf foo (list 1 2 3... n))
> (setf (third foo) 'bar)
>
> Unless I'm misunderstanding, you'd be copying the whole list, whereas
> currently we just jump in and change the third item?

Well, you wouldn't have to copy the whole list, just the first three
conses.
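
Concretely (a minimal sketch, not FSet code): copy the conses in
front of the changed cell and share the tail behind it.

(defun functional-set-nth (n list new-value)
  "Return a copy of LIST with element N (0-based) replaced."
  (if (zerop n)
      (cons new-value (cdr list))
      (cons (car list)
            (functional-set-nth (1- n) (cdr list) new-value))))

;; (defvar foo (list 1 2 3 4 5))
;; (functional-set-nth 2 foo 'bar) => (1 2 BAR 4 5)
;; Only three new conses; the tail (4 5) is shared with FOO.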

But in general you're right: lists are not the ideal choice for
implementing functional data structures.  My FSet library uses
balanced binary trees; Clojure uses something called array-mapped
tries, a less widely-known but very clever structure.  Both of these
do functional update in O(log n) time.

-- Scott
From: Scott
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <e8ef5186-8a22-472a-9d71-9280e584ad2f@a26g2000prf.googlegroups.com>
On Nov 16, 4:57 pm, Scott Burson <········@gmail.com> wrote:
>
> > Unless I'm misunderstanding, you'd be copying the whole list, whereas
> > currently we just jump in and change the third item?
>
> Well, you wouldn't have to copy the whole list, just the first three
> conses.
>
> But in general you're right: lists are not the ideal choice for
> implementing functional data structures.  My FSet library uses
> balanced binary trees; Clojure uses something called array-mapped
> tries, a less widely-known but very clever structure.  Both of these
> do functional update in O(log n) time.
>
> -- Scott

Clojure's logarithm is base 32 for those data structures.  That starts
to make O(log N) look and feel a lot more like O(1)...

PLT's "galore" library also has immutable sets, maps, and bags.
Apparently including "ordered sets" which sounds a lot like a list or
vector to me.  I don't know their order of complexity though.
From: Scott Burson
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <18ad51b3-6f7c-4c51-9a3e-62538b79caf3@r37g2000prr.googlegroups.com>
On Nov 16, 5:50 pm, Scott <·······@gmail.com> wrote:
>
> Clojure's logarithm is base 32 for those data structures.  That starts
> to make O(log N) look and feel a lot more like O(1)...

O(log n) should always feel a lot like O(1) :)

The issue is not the base of the logarithm, which translates into a
constant factor anyway.  O(log n) is O(log n) for any base logarithm.

Rather, the question is, how long do accesses to a collection of a
given size actually take?  I.e., what is the asymptotic constant
factor, and how does the data structure behave for small collections
(which may not follow the asymptotic curve very well)?

Rich says that AMTs perform better than binary trees.  I haven't tried
the experiment, but I have no reason to doubt his claim.  The AMT is a
very clever design.

But FSet also has a small trick up its sleeve.  Instead of using
binary nodes all the way to the bottom, like most binary tree
packages, FSet uses bounded-length vectors at the leaves.  This makes
the tree 2 to 3 levels shallower, cutting space usage roughly in half
and improving access performance, particularly to small collections.

> PLT's "galore" library also has immutable sets, maps, and bags.
> Apparently including "ordered sets" which sounds a lot like a list or
> vector to me.  I don't know their order of complexity though.

Nice to see functional collections catching on!

-- Scott
From: Scott
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <a4d2cdca-6d23-4ece-8bcd-f478a0775415@f37g2000pri.googlegroups.com>
On Nov 16, 7:53 pm, Scott Burson <········@gmail.com> wrote:
>
> O(log n) should always feel a lot like O(1) :)
>

Sure, you can say it's all the same, but the constant in front
matters in deciding whether a million-item set takes 20 times as long
or 4 times as long (for n = 10^6, log2 n is about 20 while log32 n
is about 4)...
From: Slobodan Blazeski
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <4d4e4f30-7265-4c6a-b414-0436da02e213@x16g2000prn.googlegroups.com>
On Nov 17, 10:29 pm, Scott <·······@gmail.com> wrote:
> On Nov 16, 7:53 pm, Scott Burson <········@gmail.com> wrote:
>
>
>
> > O(log n) should always feel a lot like O(1) :)
>
> Sure, you can say it's all the same, but the constant in front
> matters in deciding whether a million-item set takes 20 times as long
> or 4 times as long...

Or 1,000,000 times longer, compared with an algorithm that has worse
complexity but a smaller constant.  Ignoring the constant has bitten
me in real life more than once, so I'm really suspicious of all those
wonder-promising algorithms that only work under circumstances that
have nothing to do with the reality my programs live in.
From: Jens Axel Soegaard
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <492136f5$0$15888$edfadb0f@dtext01.news.tele.dk>
Scott skrev:
> [...]
> 
> Clojure's logarithm is base 32 for those data structures.  That starts
> to make O(log N) look and feel a lot more like O(1)...
> 
> PLT's "galore" library also has immutable sets, maps, and bags.
> Apparently including "ordered sets" which sounds a lot like a list or
> vector to me.  

Unless the elements have an order, it is not possible to represent
the set as a search tree. That is, if there is an order, you get
logarithmic times; without one, linear.

> I don't know their order of complexity though.

 From the documentation of version 2.2 (not present in the 4.0
documentation in the same form):

----------------------------------------------------------------------
TIME COMPLEXITIES
----------------------------------------------------------------------

The following table shows the worst-case time complexity. The O( ) has
been omitted due to space considerations.

                   RB-Set    RB-Bag  Leftist-Heaps  List-Set  List-Bag
find-min            1         1            1           1         1
delete-min          1         1            1           1         1
get, member?      log n      log n         n           n         n
insert            log n      log n       log n         n         n
union              n+m        n+m       log(n+m)      n+m       n+m
elements            n          n           n           1         n
delete            log n      log n       log n         n         n
delete-all        log n      log n       log n         n         n
size                n          n           n           n         n

bag,  list->set  n log(n)                           n log(n)
heap, list->bag              n log(n)                          n log(n)
set,  list->heap                           n

The operations: empty?, empty, singleton, and select are all O(1).
From: Scott
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <295ccd0d-8e27-48df-8fdb-f852f6a15f55@a29g2000pra.googlegroups.com>
On Nov 17, 2:18 am, Jens Axel Soegaard
<·························@soegaard.net> wrote:
>
> Unless the elements have an order, it is not possible to represent
> the set as a search tree. That is, if there is an order, you get
> logarithmic times; without one, linear.
>

I misunderstood the intention.  I (wrongly) assumed that an ordered
set maintained the order in which elements were inserted, not that
they were ordered by value.  My bad.  I take it from your logarithmic
comment that you don't use hashing for any of the implementations.

Clojure apparently has non-ordered immutable lists/vectors with
O(log32 n) accessors and reasonably efficient creation of new lists that
contain insertions, replacements, and omissions sharing much of the
"parent" data.  It also has the same properties for hash-maps.  He
hashes the keys, but the keys and values are stored in a shallow and
wide tree.  I don't remember if he implemented hash-sets as a special
case.
From: Jens Axel Soegaard
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <4921e638$0$15884$edfadb0f@dtext01.news.tele.dk>
Scott skrev:
> On Nov 17, 2:18 am, Jens Axel Soegaard
> <·························@soegaard.net> wrote:
>> Unless the elements have an order, it is not possible to represent
>> the set as a search tree. That is, if there is an order, you get
>> logarithmic times; without one, linear.
>>
> 
> I misunderstood the intention.  I (wrongly) assumed that an ordered
> set maintained the order that you inserted them in, not that they were
> ordered by the value.  

Okay. The naming was chosen because sets with unordered elements
are also supported (with linear times, of course).

> My bad.  I take it from your logarithmic
> comment that you don't use hashing for any of the implementations.

That's correct. Red-black trees from Okasaki's book are the basis
of the set implementation. In the newer versions of Galore, Carl
Eastlund and Stevie Strickland added tables and hashed sets (see
<http://planet.plt-scheme.org/package-source/soegaard/galore.plt/4/0/doc.txt>).

> Clojure apparently has non-ordered immutable lists/vectors with
> O(log32 n) accessors and reasonably efficient creation of new lists that
> contain insertions, replacements, and omissions sharing much of the
> "parent" data.  It also has the same properties for hash-maps.  He
> hashes the keys, but the keys and values are stored in a shallow and
> wide tree.  I don't remember if he implemented hash-sets as a special
> case.

-- 
Jens Axel Søgaard
From: Rich Hickey
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <c1587099-c6d1-4f80-ba83-6d26d6eeb3d4@h23g2000prf.googlegroups.com>
On Nov 16, 2:57 pm, Ray Dillinger <····@sonic.net> wrote:
> Is the ability to alter the value of one variable by
> mutating the value of a different or "alias" variable
> ever actually important to our ability to solve
> problems?  I'm increasingly of the opinion that it's
> not.
>
> [...]
>
> And now I'm looking at cons cells, and even arrays,
> and thinking.... is there a good reason why not to
> make this sort of behavior universal?  Is there anything
> that would be significantly harder to do in a more
> nearly pure-functional dialect where the only mutation
> operation is setq/set! itself, and that rebinds a
> variable rather than changing its value in-place?
>
>                                 Bear

No need to imagine. Clojure is such a dialect, working exactly as
you've described. Nothing is significantly harder, and many things are
cleaner, as persistent hash tables and vectors can be significantly
lispier than their destructive counterparts.

Rich
From: Andrew Reilly
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <6obq8aF2s40jU1@mid.individual.net>
On Sun, 16 Nov 2008 13:09:46 -0800, Rich Hickey wrote:

>> And now I'm looking at cons cells, and even arrays, and thinking.... is
>> there a good reason why not to make this sort of behavior universal? 
>> Is there anything that would be significantly harder to do in a more
>> nearly pure-functional dialect where the only mutation operation is
>> setq/set! itself, and that rebinds a variable rather than changing its
>> value in-place?
>>
>>                                 Bear
> 
> No need to imagine. Clojure is such a dialect, working exactly as you've
> described. Nothing is significantly harder, and many things are cleaner,
> as persistent hash tables and vectors can be significantly lispier than
> their destructive counterparts.

As another data point (sort of): the PLT dialect of scheme removed set-
car! and set-cdr! from the language when version 4 was introduced.  As a 
fairly new schemer, this made me re-architect my project, because I had 
been operating fairly imperatively to that point, but I think that the 
code is better for the change.

However, I don't see how you can get away with immutable vectors or 
(related) structs, at least as long as you want to be able to support a 
large chunk of "traditional" algorithms.  In my own case, my code is 
munging a graph of nodes that can have loops and other fairly hairy 
connection structures, and I can't figure out how to do that without 
mutation.  Maybe there's a good way that I haven't discovered yet?  How 
does Clojure manage it?

Cheers,

-- 
Andrew
From: Scott Burson
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <ccdf89c4-0604-4850-85bb-274cf8a02b7a@v16g2000prc.googlegroups.com>
On Nov 16, 3:55 pm, Andrew Reilly <···············@areilly.bpc-
users.org> wrote:
> In my own case, my code is
> munging a graph of nodes that can have loops and other fairly hairy
> connection structures, and I can't figure out how to do that without
> mutation.  Maybe there's a good way that I haven't discovered yet?  How
> does Clojure manage it?

I can't speak for Clojure, but AFAIK, a good implementation of
functional graphs still eludes us.

Here's something I posted on it previously:

http://groups.google.com/group/clojure/browse_frm/thread/c7c9aff51027b3d3/ccfa0752f303157e

-- Scott
From: Rich Hickey
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <c0d4ea77-aa5b-42cf-a7ba-343523ffc6da@h23g2000prf.googlegroups.com>
On Nov 16, 6:55 pm, Andrew Reilly <···············@areilly.bpc-
users.org> wrote:
> [...]
>
> As another data point (sort of): the PLT dialect of scheme removed set-
> car! and set-cdr! from the language when version 4 was introduced.  As a
> fairly new schemer, this made me re-architect my project, because I had
> been operating fairly imperatively to that point, but I think that the
> code is better for the change.
>
> However, I don't see how you can get away with immutable vectors or
> (related) structs, at least as long as you want to be able to support a
> large chunk of "traditional" algorithms.  In my own case, my code is
> munging a graph of nodes that can have loops and other fairly hairy
> connection structures, and I can't figure out how to do that without
> mutation.  Maybe there's a good way that I haven't discovered yet?  How
> does Clojure manage it?
>

It is possible to represent things like graphs as immutable sets of
immutable edges for instance, or maps of edges to edge-related data.
Clojure has fast persistent maps and sets, making it relatively
efficient to produce a 'changed' graph as a function of the prior edge
set, sharing substantial structure. Doing so will give you benefits
similar to those described by Bear for his immutable strings
represented as ropes, and many others - speculative edits, undo,
concurrency etc. This is the preferred strategy.
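
In skeletal Lisp, with plain lists standing in for the persistent
sets (Clojure's, or FSet's, would make each step O(log n) rather
than O(n)), the idea looks like:

(defun add-edge (edges from to)
  (adjoin (cons from to) edges :test #'equal))

(defun remove-edge (edges from to)
  (remove (cons from to) edges :test #'equal))

(defun neighbors (edges node)
  (loop for (from . to) in edges
        when (eql from node) collect to))

;; Cycles are no problem, since edges are just pairs of node names:
;; (let* ((g1 (add-edge (add-edge '() 'a 'b) 'b 'a))
;;        (g2 (remove-edge g1 'b 'a)))   ; g1 is untouched
;;   (list (neighbors g1 'b) (neighbors g2 'b)))  => ((A) NIL)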

It is true that there are traditional algorithms, including some best-
of-breed, that require mutation.  If you need mutation, there are two
ways to go in Clojure.

If you only need mutability within a computation that otherwise
presents a functional view to the outside world, you have full and
very fast direct access to Java primitive arrays etc. E.g. Clojure's
sort function, while non-destructive of its argument, uses mutation
internally.

If you require a data structure subject to ongoing mutation, Clojure
has mutable reference cells, a la ML, but with concurrency semantics.
In particular, there are transactional refs, which would support safe
and consistent mutation of arbitrary subsets of nodes from concurrent
threads. (Real, running on separate cores, threads). Or if you are
doing simulations, you might want autonomous state, supported by
Clojure's asynchronous agents, again, multi-core ready.

Clojure emphasizes a functional approach, but recognizes the
occasional need for state and provides facilities for state that
support sane concurrency without user-level locking.

Rich
From: Ray Dillinger
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <49210dad$0$33590$742ec2ed@news.sonic.net>
Andrew Reilly wrote:

> On Sun, 16 Nov 2008 13:09:46 -0800, Rich Hickey wrote:
 
>> No need to imagine. Clojure is such a dialect, working exactly as you've
>> described. Nothing is significantly harder, and many things are cleaner,
>> as persistent hash tables and vectors can be significantly lispier than
>> their destructive counterparts.
 
> As another data point (sort of): the PLT dialect of scheme removed set-
> car! and set-cdr! from the language when version 4 was introduced.  As a
> fairly new schemer, this made me re-architect my project, because I had
> been operating fairly imperatively to that point, but I think that the
> code is better for the change.

Interesting.  I did not know that about PLT, although I remember 
its authors arguing in favor of immutable cons cells during the 
R6RS process. 

I knew Clojure was more functional with its data structures, but 
didn't realize how much.  With its limited tail-call optimization 
and dependency on the JVM, I hadn't gotten up enough interest to 
try it yet.  As someone whose road to lisp started with scheme, 
I nearly always still express loops with tail calls and rely on 
tail-call optimization to do the right thing. If I know it doesn't 
or can't, then I get paranoid and write bad code.
 
> However, I don't see how you can get away with immutable vectors or
> (related) structs, at least as long as you want to be able to support a
> large chunk of "traditional" algorithms.  

I had been imagining the use of an "excessively clever" 
implementation of arrays, much like ropes are an "excessively 
clever" way to implement strings.  It's technically a tree, 
but it's very fast, lookups don't depend on comparisons, and 
it can be logarithmic in any power of two you set it for.  At 
each node, starting with the root, you shift-and-bitmask the 
array index to see which branch you follow.  If you use a 
table of 32 or 64 pointers at each branch, arrays that 
fit in main memory are balanced trees hardly ever more than 
2 or 3 nodes deep.  Access is quite fast. 
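
In code, the lookup and the path-copying update come out roughly
like this (a sketch: every node, interior or leaf, is a plain
32-element simple-vector, and DEPTH counts the interior levels
above the leaf strand):

(defun trie-ref (root index depth)
  "Shift-and-mask 5 bits of INDEX per interior level, then index
the leaf strand with the low 5 bits."
  (let ((node root))
    (dotimes (i depth)
      (setf node (svref node (ldb (byte 5 (* 5 (- depth i))) index))))
    (svref node (ldb (byte 5 0) index))))

(defun trie-set (root index depth new-value)
  "Functional update: reallocate only the path from the root down
to the updated strand; all other subtrees are shared with ROOT."
  (let ((copy (copy-seq root)))
    (if (zerop depth)
        (setf (svref copy (ldb (byte 5 0) index)) new-value)
        (let ((slot (ldb (byte 5 (* 5 depth)) index)))
          (setf (svref copy slot)
                (trie-set (svref root slot) index (1- depth) new-value))))
    copy))

With depth 2 that addresses 2^15 cells in three hops, which is the
"hardly ever more than 2 or 3 nodes deep" behavior.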

Updating, however, is expensive if you do it one cell at a 
time because you'd need to allocate the 2 or 3 nodes, plus 
a new strand, each time - and eventually garbage collect the 
old nodes & strand.  

Of course if you have a refcounting garbage collector and 
the refcounts in all the nodes and the strand are 1, then 
you can go ahead and do O(1) updates in place because it 
can't affect other variables.  So even though it's not the 
fastest collector, a refcounting collector can save you 
time in updating pure-functional structures.

But that case is the exception rather than the rule, so 
you save up your updates to do in a batch. To accomplish 
this you "front" each array value with a small hash table 
(32 or 64 nodes) that contains the indices and new values 
of updated cells not yet expressed in the structure.  This 
adds 20% or so to the cost of each lookup because you must
check the hash table before going into the tree, but it 
saves 75% or more on update time, because typically when 
you update it, you get an updated hash table only, 
pointing at the same tree root.  That saves a bunch of 
time that you'd otherwise spend allocating and collecting 
tree structure on each and every update, and when it gets 
full (or you have a hash collision) you just do all your 
tree updates in one chunk.  Since most algorithms access 
arrays sequentially, you usually get good use of it because 
all the mutations will be in just one or two strands. 
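
A sketch of the fronting scheme (names mine; a simple alist stands
in for the small hash table, and TRIE-REF/TRIE-SET are the sketch
above).  Most updates allocate only a struct and one alist cell;
the tree is only rebuilt when the buffer fills:

(defstruct fronted-array trie depth (pending '()))

(defun fa-ref (fa index)
  "Check the pending-update buffer before descending into the tree."
  (let ((hit (assoc index (fronted-array-pending fa))))
    (if hit
        (cdr hit)
        (trie-ref (fronted-array-trie fa) index
                  (fronted-array-depth fa)))))

(defun fa-set (fa index value)
  "Functional update that batches the tree rebuilding."
  (let ((pending (acons index value (fronted-array-pending fa)))
        (depth (fronted-array-depth fa)))
    (if (< (length pending) 32)
        (make-fronted-array :trie (fronted-array-trie fa)
                            :depth depth :pending pending)
        ;; Buffer full: flush all pending updates into the tree at
        ;; once, oldest first so the newest value for an index wins.
        (make-fronted-array
         :depth depth
         :pending '()
         :trie (reduce (lambda (trie entry)
                         (trie-set trie (car entry) depth (cdr entry)))
                       (reverse pending)
                       :initial-value (fronted-array-trie fa))))))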

> In my own case, my code is 
> munging a graph of nodes that can have loops and other fairly hairy
> connection structures, and I can't figure out how to do that without
> mutation.  Maybe there's a good way that I haven't discovered yet?  How
> does Clojure manage it?


Mmmm.  Graph theory.  I read about ropes and implemented 
them, then got my own brainstorm about arrays kinda based 
on that.  I seem to recall that a lot of the scholarly 
material where I studied graph theory represented graphs 
as state transition tables, which are inherently array-like 
and would probably fall to the same approach outlined for 
arrays above, with some twiddling.  The sticking point I 
haven't thought very hard about yet is that with graph theory, 
updating the node set on the fly alters the usual indexing 
scheme.  So some mildly hairy code, or some "excessively 
clever" twist on the indexing scheme would be required to 
keep it straight - but I think it's doable.  

                                Bear
From: Rich Hickey
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <8dfc9a88-8343-487e-a83c-970c82c62c13@v5g2000prm.googlegroups.com>
On Nov 17, 1:21 am, Ray Dillinger <····@sonic.net> wrote:

> I knew Clojure was more functional with its data structures, but
> didn't realize how much.  With its limited tail-call optimization
> and dependency on the JVM, I hadn't gotten up enough interest to
> try it yet.  As someone whose road to lisp started with scheme,
> I nearly always still express loops with tail calls and rely on
> tail-call optimization to do the right thing. If I know it doesn't
> or can't, then I get paranoid and write bad code.
>

I agree TCO would be nice, and the JVM folks know it is much in demand
as a JVM facility. When it is, Clojure will have it.

Until then, two features mitigate that somewhat. First, the native
looping construct, recur, looks like a tail call. Second, the
sequences support laziness, so you can use lazy-cons and not blow the
stack. Here's both in action:

(defn filter
  "Returns a lazy seq of the items in coll for which
  (pred item) returns true."
  [pred coll]
    (when (seq coll)
      (if (pred (first coll))
        (lazy-cons (first coll) (filter pred (rest coll)))
        (recur pred (rest coll)))))

> > However, I don't see how you can get away with immutable vectors or
> > (related) structs, at least as long as you want to be able to support a
> > large chunk of "traditional" algorithms.
>
> I had been imagining the use of an "excessively clever"
> implementation of arrays, much like ropes are an "excessively
> clever" way to implement strings.  It's technically a tree,
> but it's very fast, lookups don't depend on comparisons, and
> it can be logarithmic in any power of two you set it for.  At
> each node, starting with the root, you shift-and-bitmask the
> array index to see which branch you follow.  If you use a
> table of 32 or 64 pointers at each branch, arrays that
> fit in main memory are balanced trees hardly ever more than
> 2 or 3 nodes deep.  Access is quite fast.
>

Again, no need to imagine. Clojure has fast persistent vectors and
hash tables implemented using bit-partitioned tries, with sparse 32-
way branching.

> Updating, however, is expensive if you do it one cell at a
> time because you'd need to allocate the 2 or 3 nodes, plus
> a new strand, each time - and eventually garbage collect the
> old nodes & strand.
>
> Of course if you have a refcounting garbage collector and
> the refcounts in all the nodes and the strand are 1, then
> you can go ahead and do O(1) updates in place because it
> can't affect other variables.  So even though it's not the
> fastest collector, a refcounting collector can save you
> time in updating pure-functional structures.
>
> But that case is the exception rather than the rule, so
> you save up your updates to do in a batch. To accomplish
> this you "front" each array value with a small hash table
> (32 or 64 nodes) that contains the indices and new values
> of updated cells not yet expressed in the structure.  This
> adds 20% or so to the cost of each lookup because you must
> check the hash table before going into the tree, but it
> saves 75% or more on update time, because typically when
> you update it, you get an updated hash table only,
> pointing at the same tree root.  That saves a bunch of
> time that you'd otherwise spend allocating and collecting
> tree structure on each and every update, and when it gets
> full (or you have a hash collision) you just do all your
> tree updates in one chunk.  Since most algorithms access
> arrays sequentially, you usually get good use of it because
> all the mutations will be in just one or two strands.
>

I think you'll find that such schemes fall down in the presence of
multiple threads, and it's best to use the fastest truly immutable
data structures you can find.

Rich
From: Matthew D Swank
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <oEuUk.2252$jV1.2125@newsfe13.iad>
On Mon, 17 Nov 2008 05:04:25 -0800, Rich Hickey wrote:


> 
> First [clojure has], the native looping construct, recur, looks like a
> tail call.
> Second, the sequences support laziness, so you can use lazy-cons and
> not blow the stack.
> Here's both in action:
> 
> (defn filter
>   "Returns a lazy seq of the items in coll for which (pred item) returns
>   true."
>   [pred coll]
>     (when (seq coll)
>       (if (pred (first coll))
>         (lazy-cons (first coll) (filter pred (rest coll))) 
>         (recur pred (rest coll)))))
> 

An implementation of the tail call primitive, though not the laziness, 
came to mind:
(defun filter (arg1 arg2)
  (macrolet ((recur (pred list)
               `(progn
                  (setf arg1 ,pred arg2 ,list)
                  (go recur))))
    (block nil
      (tagbody
       recur
         (return
           (let ((pred arg1)
                 (list arg2))
             (when list
               (if (funcall pred (car list))
                   (cons (car list) (filter pred (cdr list)))
                   (recur pred (cdr list))))))))))

Obviously it's only for lists, and could use some macro-love to paper 
over the idiom. 

Matt
-- 
The camel died quite suddenly on the second day, and Selena fretted
sullenly and, buffing her already impeccable nails -- not for the first
time since the journey begain -- pondered snidely if this would dissolve
into a vignette of minor inconveniences like all the other holidays spent
with Basil.
	-- Winning sentence, 1983 Bulwer-Lytton bad fiction contest.
From: Jens Axel Soegaard
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <4921378e$0$15888$edfadb0f@dtext01.news.tele.dk>
Ray Dillinger wrote:

>> As another data point (sort of): the PLT dialect of scheme removed set-
>> car! and set-cdr! from the language when version 4 was introduced.  As a
>> fairly new schemer, this made me re-architect my project, because I had
>> been operating fairly imperatively to that point, but I think that the
>> code is better for the change.
> 
> Interesting.  I did not know that about PLT, although I remember 
> its authors arguing in favor of immutable cons cells during the 
> R6RS process. 

You might find the following interesting:

http://blog.plt-scheme.org/2007/11/getting-rid-of-set-car-and-set-cdr.html


-- 
Jens Axel Søgaard
From: Scott Burson
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <300524a8-319a-4c40-b8a8-6fca5149031f@z6g2000pre.googlegroups.com>
On Nov 16, 11:57 am, Ray Dillinger <····@sonic.net> wrote:
> And now I'm looking at cons cells, and even arrays,
> and thinking.... is there a good reason why not to
> make this sort of behavior universal?

A similar line of thinking led me to develop FSet, a functional set-
theoretic collections library for Common Lisp:

http://common-lisp.net/project/fset/

Along with functional sequences, it also has functional sets, maps,
and bags.  Have a look.

-- Scott
From: Tim Bradshaw
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <395b73be-b161-4491-8534-43b476924447@h23g2000prf.googlegroups.com>
On Nov 16, 7:57 pm, Ray Dillinger <····@sonic.net> wrote:

>
> I get that saving memory was very important when lisp
> was being designed and needed to run on machines that
> people could build back then.  I get that saving memory
> is still *somewhat* important today.  But, questions
> of memory size aside, and performance issues about
> copy-on-write being slower than direct mutation aside,
> is there anything we accomplish by aliased mutations
> that we couldn't accomplish with more ease and less
> confusion without them?

Yes. If you are representing something *in the real world* which has
mutable state (and hence a good notion of identity), then it may well
be natural to represent it *in your program* by something which also
has mutable state.
From: Ray Dillinger
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <49219e95$0$33528$742ec2ed@news.sonic.net>
Tim Bradshaw wrote:

> On Nov 16, 7:57 pm, Ray Dillinger <····@sonic.net> wrote:
> 
>> But, questions
>> of memory size aside, and performance issues about
>> copy-on-write being slower than direct mutation aside,
>> is there anything we accomplish by aliased mutations
>> that we couldn't accomplish with more ease and less
>> confusion without them?
> 
> Yes. If you are representing something *in the real world* which has
> mutable state (and hence a good notion of identity), then it may well
> be natural to represent it *in your program* by something which also
> has mutable state.

Right... mutation is fine.  It's *aliased* mutation I'm thinking is 
a problem.  I have no problem changing one field of a record, if 
there's nothing else whose value will visibly change as a result. 

Given modfield as a copy constructor, if we say 

(setq recordname (modfield recordname fieldname newvalue))

or in Scheme 

(set! recordname (modfield recordname fieldname newvalue))

that's what we're doing; representing mutable state by mutating 
the (binding of the) variable.  The only reason we use a copy 
constructor is so that nothing else's value will change.  But 
under the hood, if the runtime knows that this is the *only* 
reference to recordname, it definitely should optimize by just 
changing the one field in place - because nothing else's value 
will change anyway. 
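
For defstruct-style records, MODFIELD (a made-up operator - here
taking the slot accessor's name) can be sketched as copy-then-set;
the SETF is invisible because the fresh copy has no aliases yet:

(defstruct point x y)

(defmacro modfield (record accessor new-value)
  "Return a copy of RECORD whose ACCESSOR slot is NEW-VALUE."
  (let ((copy (gensym "COPY")))
    `(let ((,copy (copy-structure ,record)))
       (setf (,accessor ,copy) ,new-value)
       ,copy)))

;; (setq p (modfield p point-x 42)) rebinds P without disturbing
;; anything else that shares the old struct.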

Hmmm.  Aliased mutation may be useful for one part of a program 
communicating with another.  Locks, critical sections, etc? But 
those all have their stylized use patterns which are captured in 
calls rather than explicit mutable data.  

                                Bear

                                
From: Tim Bradshaw
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <f215bda5-9f48-4754-96d7-53a7589d5944@q30g2000prq.googlegroups.com>
On Nov 17, 4:40 pm, Ray Dillinger <····@sonic.net> wrote:

> Right... mutation is fine.  It's *aliased* mutation I'm thinking is
> a problem.  I have no problem changing one field of a record, if
> there's nothing else whose value will visibly change as a result.
>

Again, the question to ask is: does this model anything useful in the
real world?  I can think of plenty of cases where yes, it does.  To
give a slightly nerdy example (which, however, represents stuff I
worry about all the time in my day job): I want to apply a patch to
system x.  But before I can do that I need to know that system x
happens to be *the identical machine* to system y and system z, so if
I patch x, I also patch y and z.  This can matter.  There are really
plenty of examples of this kind of thing going on, and (as you can
imagine in the patching case) it can be somewhat annoying and
surprising in the real world.

Of course, I realise that the standard CS answer to this is just to
ignore all those annoying bugs in the real world and wander off into
irrelevancy.  That's probably why I don't talk to programmers much
nowadays.
From: Alex Mizrahi
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <4921dae9$0$90274$14726298@news.sunsite.dk>
 RD> Hmmm.  Aliased mutation may be useful for one part of a program
 RD> communicating with another.

not only..
there are plenty of problems which can be modelled with a directed
graph of arbitrary structure -- nodes can refer to each other freely.
if you attempt to modify some node, that will be "aliased mutation"
of the kind you want to prohibit. this prohibition would make working
with such structures rather painful. 
From: Matthias Blume
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <m13ahq2slj.fsf@hana.uchicago.edu>
Ray Dillinger <····@sonic.net> writes:

> Is the ability to alter the value of one variable by 
> mutating the value of a different or "alias" variable
> ever actually important to our ability to solve 
> problems?  I'm increasingly of the opinion that it's 
> not. 

I agree that most of the time it is not important, and the advantages of
purity outweigh the disadvantages.

However, notice that there are some well-known algorithms that
inherently rely on what you call aliased mutation.  Union-find with path
compression is probably the best known, but there are others.
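
For reference, here is a sketch of union-find with path compression;
the SETF inside UF-FIND is exactly the aliased mutation in question,
since every variable holding the parent vector sees the compressed
paths:

(defun uf-make (n)
  (let ((parent (make-array n)))
    (dotimes (i n parent)
      (setf (aref parent i) i))))

(defun uf-find (parent x)
  (if (= (aref parent x) x)
      x
      ;; Compress: repoint X at the root, in place, for all aliases.
      (setf (aref parent x) (uf-find parent (aref parent x)))))

(defun uf-union (parent x y)   ; no rank heuristic, to keep it short
  (setf (aref parent (uf-find parent x)) (uf-find parent y)))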

Pippenger proved that there are certain problems for which the best
"pure lisp" algorithm is worse by a factor of log n (IIRC) than the
best imperative algorithm ("Pure versus Impure Lisp", ACM TOPLAS
19(2), 1997).

And, by the way, "aliased mutation" is really the only motivation for
having mutation in the first place.  If all your mutation is
non-aliased, you can eliminate it quite easily and get completely pure
code.

Of course, if the phrase "important to our ability to solve problems"
refers to whether something can or cannot be done at all, then the
answer is "no".  Any problem for which there is an algorithm there is
also a pure algorithm.

Matthias
From: Ray Dillinger
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <4921dd61$0$95576$742ec2ed@news.sonic.net>
Matthias Blume wrote:

> Pippenger proved that there are certain problems for which the best
> "pure lisp" algorithm is worse by a factor of log n (IIRC) than the best
> imperative algorithm.

Okay.  That's the information I was looking for.  Thank you. 

So yes, there *are* instances where it's necessary to make 
better solutions to problems, and somebody's proved it. Now 
I need to go find his paper and understand exactly why.
 
> Of course, if the phrase "important to our ability to solve problems"
> refers to whether something can or cannot be done at all, then the
> answer is "no".  Any problem for which there is an algorithm there is
> also a pure algorithm.

Phht.  If that was what I meant by "important ..." then nothing 
beyond Turing machines and production grammars is important.  
They're both fully general models of computation, right?  That's 
true, but not "interesting" when the objective is making it easy 
for human beings to solve problems.

I have always been annoyed at language designers who make something
impossible, or even hard, on the grounds that some uses of it are 
bad ideas.  In an "ideal" language, IMO, all those dangerous low-
level facilities are there, but the high-level facilities are 
well-enough developed that there's hardly ever any good reason to 
use them.  And I think a good language should also make the use of 
dangerous facilities easy to spot and filter. Maybe PLT did the 
right thing with mcons and friends. 

I suppose part of the issue is cultural.  Programmers have to 
understand the importance of NOT doing certain things in code that's 
intended to be used in a threading environment or a multiprocessing 
environment, for example - even if the language allows them to do 
those things.  The language needs to provide ways for them to solve 
their problems without doing those things, but programmers have to 
understand the importance of it.  If the people maintaining libraries 
of code out there were to sort it and vet it for use under various
constraints, the libraries would be a hell of a lot more useful.  

But it goes back to the tools, too. The tools need to support 
people building code in those environments by proving, on demand, 
that a new library or code being checked in is "clean" of those 
things.  And it needs to support "use it anyway" overrides by 
engineers who know better than it can prove. 

If I have a point, it's that safety should be easy - not that 
safety should be mandatory.

                                Bear
From: George Neuner
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <dtu3i49sq2d6v7l24vnshlkjmvegocttsq@4ax.com>
On Mon, 17 Nov 2008 11:02:48 -0600, Matthias Blume
<·····@hana.uchicago.edu> wrote:

>Ray Dillinger <····@sonic.net> writes:
>
>> Is the ability to alter the value of one variable by 
>> mutating the value of a different or "alias" variable
>> ever actually important to our ability to solve 
>> problems?  I'm increasingly of the opinion that it's 
>> not. 
>
>[...]
>
>However, notice that there are some well-known algorithms that
>inherently rely on what you call aliased mutation.  Union-find with path
>compression is probably the best known, but there are others.
>
>[...]
>
>Matthias

Not to mention that aliased mutation is at the heart of every mutual
exclusion algorithm.  Disallowing it would make handling exclusive-use
resources impossible.

George
From: Rob Warnock
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <a-KdnQTvt_0SLb_UnZ2dnUVZ_sTinZ2d@speakeasy.net>
George Neuner  <········@comcast.net> wrote:
+---------------
| Not to mention that aliased mutation is at the heart of every mutual
| exclusion algorithm.  Disallowing it would make handling exclusive use
| resources impossible.
+---------------

I'm sorry, I don't understand this. For example, Dekker's
Algorithm <http://en.wikipedia.org/wiki/Dekker%27s_algorithm>
does not use aliased mutation. [Mutation, yes, but not aliased.]
So AFAICT aliased mutation is certainly *not* at the heart of
*every* mutual exclusion algorithm...
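
Transliterated (a sketch only - running it for real needs threads
and memory barriers, which portable CL doesn't provide):

(defvar *wants-to-enter* (vector nil nil))   ; one flag per process
(defvar *turn* 0)

(defun dekker-enter (me)
  "Each process writes only its own flag slot plus the shared *TURN*:
mutation, but no structure reached through two different variables."
  (let ((other (- 1 me)))
    (setf (svref *wants-to-enter* me) t)
    (loop while (svref *wants-to-enter* other)
          do (when (= *turn* other)
               (setf (svref *wants-to-enter* me) nil)
               (loop until (= *turn* me))    ; busy-wait for our turn
               (setf (svref *wants-to-enter* me) t)))))

(defun dekker-exit (me)
  (setf *turn* (- 1 me))
  (setf (svref *wants-to-enter* me) nil))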


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: George Neuner
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <n406i4duh2gh1sc9c3lelptgvtdckmg7c0@4ax.com>
On Tue, 18 Nov 2008 06:09:51 -0600, ····@rpw3.org (Rob Warnock) wrote:

>George Neuner  <········@comcast.net> wrote:
>+---------------
>| Not to mention that aliased mutation is at the heart of every mutual
>| exclusion algorithm.  Disallowing it would make handling exclusive use
>| resources impossible.
>+---------------
>
>I'm sorry, I don't understand this. For example, Dekker's
>Algorithm <http://en.wikipedia.org/wiki/Dekker%27s_algorithm>
>does not use aliased mutation. [Mutation, yes, but not aliased.]
>So AFAICT aliased mutation is certainly *not* at the heart of
>*every* mutual exclusion algorithm...

I probably worded that too strongly.  

It's a matter of perspective.  Programmers tend to think of "shared
non-local objects" in terms of referencing the same identifier, but
that really is an illusion.  Without peeking under the hood, there is
no way to know whether the compiler really referenced the same
identifier or whether it was aliased into the scopes of the code that
used it.  If two functions were compiled using separate invocations of
the compiler they cannot have shared identifiers (they can after
linking share a named object but not the identifiers which name it).

I suppose it's a distinction that makes no difference.
YMMV.


I'd completely forgotten about Dekker until you mentioned him.  All of
my books teach Peterson's and Lamport's algorithms ... Dekker is just
a footnote.  Did Algol 58 or 60 even have tasking (beyond coroutines)?
I know Algol68 and PL/I did have multitasking, but they were after
Dekker's algorithm was published.

George
From: Nick Keighley
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <526075c3-67d1-43d9-8232-49167c8cf50d@t2g2000yqm.googlegroups.com>
On 18 Nov, 20:43, George Neuner <········@comcast.net> wrote:

> Did Algol 58 or 60 even have tasking (beyond coroutines)?

no. Well, Algol 60 didn't, so I'm guessing Algol 58 didn't either.

> I know Algol68 and PL/I did have multitasking, but they were after
> Dekker's algorithm was published.

--
Nick Keighley
From: Alex Mizrahi
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <49215b29$0$90262$14726298@news.sunsite.dk>
 RD> Is the ability to alter the value of one variable by
 RD> mutating the value of a different or "alias" variable
 RD> ever actually important to our ability to solve
 RD> problems?

sort of, it allows expressing algorithms that work with mutable state
in a more natural way. of course, you can always convert them
to immutable-state algorithms, but why bother? i think the mutable-state
ones are easier to understand, and it's easier to analyze their
performance characteristics.
From: Kaz Kylheku
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <20081117131136.346@gmail.com>
On 2008-11-16, Ray Dillinger <····@sonic.net> wrote:
>
> Is the ability to alter the value of one variable by 
> mutating the value of a different or "alias" variable
> ever actually important to our ability to solve 
> problems?

Well, it's important to our ability to code solutions to problems on an
ordinary computer, especially when we are using off-the-shelf algorithms.

If you pick some random algorithm out of, say, Knuth, chances are that it calls
for some destructive manipulation of cells in some dynamic set data structure
or whatever.

You could implement it using some copy-on-write scheme, but you're arguably
changing the algorithm.  

In general we have to avoid that kind of thinking: ``because we have discovered
a wonderful Y which apparently does everything that X does, we should dispense
with X''.

That leads us into idiocies like: ``we have static typing, and so we don't need
type information at run time''; or ``we have functional programming, so we must
never assign to a storage location'';  or ``we have objects with methods, so we
don't need anonymous functions (or vice versa)''.

> I'm increasingly of the opinion that it's not. 

Right. But we don't just want the ability to solve problems; we want to be
able to solve them in different ways with different tradeoffs. Merely having
substantially different ways to express a solution is valuable, even in the
absence of performance considerations.

Likewise, someone could say that we don't need a system where only variables
are mutable, and all structures are copy-on-write, if we simply allow
structures to be mutated. :)

> I get that saving memory was very important when lisp 
> was being designed and needed to run on machines that 
> people could build back then.  I get that saving memory
> is still *somewhat* important today.

In some cases, when you save memory, you also save time. I.e. sometimes there
is a tradeoff between time and space, but sometimes when you use more space, you
chew up more time too; there may still be a performance win, but it comes from
something else.

It may be cheaper in terms of time, not only in terms of memory, to poke a
value into some data structure than to simulate the same thing with a
copy-on-write approach and a more complicated version of that data structure.

To obtain a performance benefit, you must take advantage of the special
properties of copy-on-write; namely that you obtain simultaneous use of the
original unmodified data structure and the modified one.
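A minimal sketch of that property (untested; REPLACE-NTH is my own toy
name, not a standard function): a copy-on-write list update copies only
the cells in front of the change and shares the rest with the original,
so both versions remain usable at once:

    (defun replace-nth (n new list)
      ;; Copies the first N cons cells, then shares the remaining
      ;; spine with the original list.
      (if (zerop n)
          (cons new (cdr list))
          (cons (car list) (replace-nth (1- n) new (cdr list)))))

    (defvar *old* (list 'a 'b 'c 'd 'e))
    (defvar *new* (replace-nth 1 'x *old*))
    ;; *old* => (A B C D E)   *new* => (A X C D E)
    ;; (cddr *old*) and (cddr *new*) are EQ: shared structure.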

> [...]
> I'm imagining a lisp dialect with a strict copy-on-write 
> rule, such that variables can only be rebound, and never 
> have their values changed in place.

If the only thing you are going to do after lazily copying the data structure
is to assign the reference back to the variable, and you're using only one
processor, why bother?

But suppose that you need to be able to refer to the old version of the data
structure after the virtual update takes place.

One such situation occurs in a multiprocessor, when you have lots of concurrent
reads occurring on some data set which also has to be updated.  Assume that it's
a complex update which changes the data structure in many places that have to
be consistent with each other.

If the update is done on the data structure in place, then readers have to be
locked out over the course of the update. While the update is occurring,
processing related to that data structure becomes serialized.

If the update takes place only by changing the value of the variable which
holds a reference to the structure, your update can proceed in parallel with
reads; no read-write lock is required.

Only a regular lock is needed, which is only used to serialize the writers.

Readers which are scanning the structure at the time that an update begins
simply complete their read operation on the old version of the structure, which
remains stable. 

The update proceeds in parallel, computing a new version of the structure which
is then installed in place of the old by overwriting the variable. As soon as
this happens, any new readers will pick up the new structure.  The old version
of the structure becomes garbage when all readers are finished with it.

If you simply modify the structure in place, you cannot have readers accessing
it concurrently with updates. You must lock out readers while making the
update.
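In CL the whole pattern fits in a few lines (an untested sketch; it
assumes the bordeaux-threads portability library under its usual BT
nickname, and that rebinding a special variable is a single atomic
pointer store, which holds on common implementations):

    (defvar *table* '())                    ; root reference: an immutable alist
    (defvar *writer-lock* (bt:make-lock))   ; serializes writers only

    (defun lookup (key)
      ;; Readers take no lock: they grab the current root and walk a
      ;; structure that is never mutated in place.
      (cdr (assoc key *table*)))

    (defun store (key value)
      ;; Writers build a complete new version, then install it by
      ;; overwriting the variable.  Readers mid-scan keep the old
      ;; version, which becomes garbage when the last of them is done.
      (bt:with-lock-held (*writer-lock*)
        (setf *table* (acons key value
                             (remove key *table* :key #'car)))))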

This approach also helps in the face of recursion, of course. A reader could be
processing the structure, but may also need to make updates to it.  It may be
desirable to be able to make those updates without seeing their effect.

This is also how we update functions in a Lisp image. We don't modify the
machine code of a function, but replace the function binding with a new one.
If the old function is being executed, thanks to recursion or multithreading,
it's not a problem.

This idea is now being rediscovered by some poor fools who don't know Lisp.
E.g. RCU in the Linux kernel.  Papers have been written to explain the simple
idea in complicated terms.

See, you can apply this in the absence of garbage collection. You have some
structure which is being picked up by readers, and also updated. The structure
is referenced only via its root pointer, and via the read and update
operations. When no read and update operations are in progress, the structure
is in a ``quiescent state'': only the root reference tracks it.

Thus, when an update occurs, the old version of the data structure no longer
has any root reference, and becomes garbage when readers are done with it.

You don't actually have to do GC to know when the structure is garbage; you
only have to know that all of the readers have reached a quiescent state with
respect to that version of the structure.  There are various ways to know that
this has happened. For instance in ``epoch-based reclamation'', threads
regularly report that they are entering a new epoch. I.e. ``I have done all my
processing in this epoch, and am entering a new epoch''.  The epochs cycle
through some global counter. Whenever an epoch is empty of all threads, you can
run a bunch of callbacks dynamically registered to that epoch; the callbacks
reclaim the objects which were abandoned during that epoch.
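A deliberately simplified sketch of the bookkeeping (untested, and with
no synchronization on the counter or the tables, which a real version
would need; it also assumes at least one thread has announced):

    (defvar *global-epoch* 0)
    (defvar *announced* (make-hash-table))  ; thread -> last announced epoch
    (defvar *retired* (make-hash-table))    ; epoch -> reclamation callbacks

    (defun announce (thread)
      ;; Called periodically by each thread: "all my work begun in
      ;; earlier epochs is finished".
      (setf (gethash thread *announced*) *global-epoch*))

    (defun retire (callback)
      ;; Register a cleanup for an object abandoned in the current epoch.
      (push callback (gethash *global-epoch* *retired* '())))

    (defun reclaim ()
      ;; Any epoch older than every thread's announcement is empty of
      ;; threads, so its callbacks can safely run.
      (let ((oldest (loop for e being the hash-values of *announced*
                          minimize e)))
        (loop for e being the hash-keys of *retired*
              when (< e oldest)
                do (mapc #'funcall (gethash e *retired*))
                   (remhash e *retired*))))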
From: Ray Dillinger
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <49222c3d$0$95494$742ec2ed@news.sonic.net>
Kaz Kylheku wrote:

> In general we have to avoid that kind of thinking: ``because we have
> discovered a wonderful Y which apparently does everything that X does, we
> should dispense with X''.

Don't worry; I'm unlikely to make that mistake.  I guess my conclusion 
is that aliased mutation is something people ought not run into by 
accident while using the language in a "normal" way, and that there 
should be a subset of the language not including it that's highly 
useful and all you need to know to do ordinary programming.  

When people decide they need to do it, for reasons they understand, 
then of course it should be simply a matter of looking up the right 
library, reading the doc, and using it - same as a library supporting 
any other specialized need.

The question came up mainly because I'm implementing the language 
interface to threads on a machine with 4 CPUs right now.  Things would 
be simpler if the system could prove more about the immutability of 
data structures, and I think my conclusion is that it's worth it to 
make immutable structures the default.

                                Bear
From: Pascal Costanza
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <6ogl1eF3ib7pU1@mid.individual.net>
Ray Dillinger wrote:
> Is the ability to alter the value of one variable by 
> mutating the value of a different or "alias" variable
> ever actually important to our ability to solve 
> problems?  I'm increasingly of the opinion that it's 
> not. 

You bet it is.

Side effects are necessary for properties that change over time. (A 
silly example: the wage of a person.) And with them comes the necessity of 
dealing with object identity.

If you want to have a unified language in which most data is represented 
in the same way, it's inevitable that your underlying data structure 
supports side effects in one way or the other. Say you want to support 
first-class lexical environments, to avoid having to include explicit OOP 
or struct mechanisms, and you want to represent such environments as 
association lists, because everything in your language happens to be 
represented as lists and cons cells; then you need support for side 
effects in cons cells. QED.
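
To make that concrete (a toy sketch of mine, untested): if an
environment is an alist, then assignment to a variable in it is
exactly a cons-cell mutation:

    ;; A toy first-class environment represented as an association list.
    (defvar *env* (list (cons 'x 1) (cons 'y 2)))

    (defun env-set (name value env)
      ;; Assignment is literally SETF of a cons cell's CDR; every
      ;; closure sharing ENV sees the new value through the same cell.
      (let ((binding (assoc name env)))
        (unless binding (error "unbound variable ~S" name))
        (setf (cdr binding) value)))

    (env-set 'x 42 *env*)   ; *env* is now ((X . 42) (Y . 2))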

If you don't want side effects in cons cells and lists, you need to 
create a variety of data structures, some of which support side effects, 
some of which don't.

Even if you only expose side effects via some form of transaction 
mechanisms, like in Clojure, somewhere deep down you need plain ol' side 
effects to implement that. If you don't expose the plain ol' side 
effects in your 'proper' language, then you divide your world into those 
who are allowed to use plain ol' side effects (the implementor of the 
language) and those who are not allowed to do so (the users of the 
language). A language that is supposed to be usable for all kinds of 
programming, including systems programming, from the ground up shouldn't 
make such a separation. [1] What may be better is a way to treat side 
effects as part of a reflective interface of your language 
implementation (as first-class continuations and environments are in 
3-Lisp) that is shielded away from user programs in 'regular', 
'base-level' programming, but I am not aware of a good design yet that 
goes in that direction, so that's pure speculation.


Pascal


[1] I'm not saying that Clojure is a bad idea, I actually think it's 
quite impressive and one of the better developments in the Lisp world in 
recent times. I'm only saying that you have to be aware of this trade-off.

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
From: Ray Dillinger
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <49236f16$0$95568$742ec2ed@news.sonic.net>
Pascal Costanza wrote:

> Ray Dillinger wrote:
>> Is the ability to alter the value of one variable by
>> mutating the value of a different or "alias" variable
>> ever actually important to our ability to solve
>> problems?  I'm increasingly of the opinion that it's
>> not.
> 
> You bet it is.
> 
> Side effects are necessary for properties that change over time. (A
> silly example: the wage of a person.) And with it comes the necessity of
> dealing with object identity.
> 
> If you want to have a unified language in which most data is represented
> in the same way, it's inevitable that your underlying data structure
> supports side effects in one way or the other. 

<snip>
Ergh.  Did you even read my post?  I was not questioning the value 
of mutation itself.  I was questioning the value of the ability 
for mutation to change the value of some *DIFFERENT* variable 
besides the one being mutated.  This seems like a "chatterbot" 
response where, because I accidentally used some set of keywords, 
I'm getting a canned answer to a question which is sort of 
related to the one I asked, but not the one I asked.

You're making a lot of arguments in favor of a very broad concept 
(side effects) when I asked about only one very narrow slice of it 
(side effects that change visible values other than the one on 
which the side effect was requested). 

> Even if you only expose side effects via some form of transaction
> mechanisms, like in Clojure, somewhere deep down you need plain ol' side
> effects to implement that. If you don't expose the plain ol' side
> effects in your 'proper' language, then you divide your world into those
> who are allowed to use plain ol' side effects (the implementor of the
> language) and those who are not allowed to do so (the users of the
> language). 

And yet again, people think I'm talking about banning side effects 
in user code.  I'm not, honest.  I regret that my question has 
attracted attention from the 'functional vs. non-functional' language
wars and the attendant assumptions that all involved are extremists 
who want or oppose a complete ban on something or other.  I don't. 
If my phrasing gave the impression that I do, then it was a 
mistake on my part and I apologize for it. 

What I'm considering is making a dialect in which the need for 
mutation-without-rebinding is minimized in the main language and 
served by libraries intended for use in specialized applications.  
This would permit the system to note when the libraries that do 
such things are and are not linked and make optimizations accordingly. 
I thought it would be beneficial because very different sets of
optimizations give the most benefit in those two different scenarios. 

                                Bear
From: Rob Warnock
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <VvSdnWP_ltGNEL7UnZ2dnUVZ_jCdnZ2d@speakeasy.net>
Ray Dillinger <····@sonic.net> wrote:
+---------------
| Pascal Costanza wrote:
| > You bet it is.
| > Side effects are necessary for properties that change over time.
| > (A silly example: the wage of a person.) And with it comes the
| > necessity of dealing with object identity.
...
| Ergh.  Did you even read my post?  I was not questioning the value 
| of mutation itself.  I was questioning the value of the ability 
| for mutation to change the value of some *DIFFERENT* variable 
| besides the one being mutated.
+---------------

In Lisp, variables per se are never mutated:

1. A variable *binding* may be mutated to refer to another object
   [e.g., as with a simple SETQ].

2. The *object* a variable is currently bound to may be mutated
   [e.g. as with (SETF accessor)], provided it is a mutable object.
   [Some aren't. E.g., numbers & characters are immutable in CL.]

So it seems that in fact your fundamental complaint is really that
Lisp allows a mutable *object* to be simultaneously bound to two
different variables. Sorry, that's life. You can't prevent that
without preventing function calls!![1]

I suspect from your continuing upset at people's responses in this
thread that what you might really be looking for is not ordinary
applicative Lisp at all, but rather some flavor of "linear logic"
instead, such as the "linear Lisp" Henry Baker has written about at
great length; see:

    http://home.pipeline.com/~hbaker1/ForthStack.html
    Linear Logic and Permutation Stacks--The Forth Shall Be First
    ...
    Linear Lisp is a "linear" style of Lisp--i.e., every bound name
    is referenced exactly once. Thus, each function parameter occurs
    just once, as do the names introduced via other binding constructs--
    e.g., let, let*, etc.
    ...

    http://home.pipeline.com/~hbaker1/Use1Var.html
    'Use-Once' Variables and Linear Objects -- Storage Management,
    Reflection and Multi-Threading

    http://home.pipeline.com/~hbaker1/LQsort.html
    A 'Linear Logic' Quicksort

and several others. See his archive page:

    http://home.pipeline.com/~hbaker1/

+---------------
| ...because I accidentally used some set of keywords 
| I'm getting a canned answer to a question which is sort of 
| related to the one I asked, but not the one I asked.
+---------------

I suspect that's because you asked "the wrong question".  ;-}
Or, equivalently, you asked the question you were really interested
in using words that confused your readers; see above. [But I
could be wrong about that...]


-Rob

[1] Since *every* function call binds the formal parameters in
    the called routine to the *values* of the actual parameters
    in the calling routine, any time any actual parameter in a
    calling routine happens to be a variable bound to a mutable
    object then there *will* be a simultaneous binding of the
    mutable object to two different variables as soon as the
    function call happens!! Mutation of the object in the called
    routine *will* be visible in the calling routine after the
    called routine returns.
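
    A two-line CL demonstration of the footnote (untested, but this
    is the standard behavior):

        (defun clobber (x) (setf (car x) 'gotcha))

        (let ((v (list 1 2 3)))
          (clobber v)   ; V and X briefly denote the same cons
          v)            ; => (GOTCHA 2 3)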

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Ray Dillinger
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <49239a52$0$95532$742ec2ed@news.sonic.net>
Rob Warnock wrote:

> In Lisp, variables per se are never mutated:
 
> 1. A variable *binding* may be mutated to refer to another object
>    [e.g., as with a simple SETQ].

> 2. The *object* a variable is currently bound to may be mutated
>    [e.g. as with (SETF accessor)], provided it is a mutable object.
>    [Some aren't. E.g., numbers & characters are immutable in CL.]

Okay, that's a good distinction.  My idea of "mutating variables" 
included both of these.  But in fact I find that WRT the question
I thought I was asking, mutating bindings is clearly harmless and 
mutating objects whose reference count is 1 is also probably 
harmless.  

Mutating other objects is sometimes necessary but often gives 
rise to confusion via aliases, often causes concurrency difficulties 
in multiprocessing environments with locking requirements, inhibits 
a bunch of compiler optimizations, and enables some different ones. 
So I'm rapidly concluding that the base language ought to be designed 
to be as useful and simple as possible without it, and that it ought 
to be supported through libraries.  

I've also concluded that I don't need a character datatype distinct 
from strings of length 1, and that strings ought to be immutable 
objects too.  ('Arrays of characters' and 'Arrays of bytes' are both
different ideas, and should be supported too). 

> So it seems that in fact your fundamental complaint is really that
> Lisp allows a mutable *object* to be simultaneously bound to two
> different variables. Sorry, that's life. You can't prevent that
> without preventing function calls!![1]

> [1] Since *every* function call binds the formal parameters in
>     the called routine to the *values* of the actual parameters
>     in the calling routine, any time any actual parameter in a
>     calling routine happens to be a variable bound to a mutable
>     object then there *will* be a simultaneous binding of the
>     mutable object to two different variables as soon as the
>     function call happens!! Mutation of the object in the called
>     routine *will* be visible in the calling routine after the
>     called routine returns.

Also an interesting point.  My function calls work differently 
(caller passes expr+environment; callee does evaluation or does 
other stuff with the information) and it takes some thought to 
determine whether this is (or needs to be) true.  After a 
little thought, I think it is.  Regardless of whether the caller 
or the called function does the evaluation, evaluating the 
same expression (which will happen in ordinary functions that 
lambda returns, but not with all functions) in the same 
environment winds up referring to the same object.  And if the 
object can be mutated in place, then you have mutation of 
a multiply-bound object. 

And with that, you lose compiler optimizations to aliasing 
problems again, unless you can statically solve all of them.  
Darn. 

It's looking like I really do need to have distinguished 
classes of mutable and immutable objects, and the programmer 
really does need to be aware of them, in order to get the 
compilation benefits of relatively tiny "critical sections" 
and fast locking in multiprocessing environments.  Darn. 

                                Bear
From: George Neuner
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <ijj8i4pccqbdlldm588h3uo98g654hteut@4ax.com>
On Tue, 18 Nov 2008 20:46:33 -0800, Ray Dillinger <····@sonic.net>
wrote:

>Rob Warnock wrote:
>
>> In Lisp, variables per se are never mutated:
> 
>> 1. A variable *binding* may be mutated to refer to another object
>>    [e.g., as with a simple SETQ].
>
>> 2. The *object* a variable is currently bound to may be mutated
>>    [e.g. as with (SETF accessor)], provided it is a mutable object.
>>    [Some aren't. E.g., numbers & characters are immutable in CL.]
>
>Okay, that's a good distinction.  My idea of "mutating variables" 
>included both of these.  But in fact I find that WRT to the question
>I thought I was asking, mutating bindings is clearly harmless and 
>mutating objects whose reference count is 1 is also probably 
>harmless.  
>
>Mutating other objects is sometimes necessary but often gives 
>rise to confusion via aliases, often causes concurrency difficulties 
>in multiprocessing environments with locking requirements, inhibits 
>a bunch of compiler optimizations, and enables some different ones. 
>So I'm rapidly concluding that the base language ought to be designed 
>to be as useful and simple as possible without it, and that it ought 
>to be supported through libraries. 

So what you're really miffed about is mutation with structure sharing
rather than (or maybe along with) mutation with aliasing.  

I agree that mutation and structure sharing together can be trouble.
Aliasing itself is somewhat unavoidable.
 

>I've also concluded that I don't need a character datatype distinct 
>from strings of length 1, and that strings ought to be immutable 
>objects too.  ('Arrays of characters' and 'Arrays of bytes' are both
>different ideas, and should be supported too). 
>
>> So it seems that in fact your fundamental complaint is really that
>> Lisp allows a mutable *object* to be simultaneously bound to two
>> different variables. Sorry, that's life. You can't prevent that
>> without preventing function calls!![1]
>
>> [1] Since *every* function call binds the formal parameters in
>>     the called routine to the *values* of the actual parameters
>>     in the calling routine, any time any actual parameter in a
>>     calling routine happens to be a variable bound to a mutable
>>     object then there *will* be a simultaneous binding of the
>>     mutable object to two different variables as soon as the
>>     function call happens!! Mutation of the object in the called
>>     routine *will* be visible in the calling routine after the
>>     called routine returns.
>
>Also an interesting point.  My function calls work differently 
>(caller passes expr+environment; callee does evaluation or does
>other stuff with the information) and it takes some thought to 
>determine whether this is (or needs to be) true.  After a 
>little thought, I think it is.  Regardless  of whether caller 
>or called function is doing the evaluation, an evaluation of the 
>same expression (which will happen in ordinary functions that 
>lambda returns, but not with all functions) with the same 
>environment winds up referring to the same object. And if the 
>object can be mutated in place, then you have the mutation on 
>a multiply-bound object. 
>
>And with that, you lose compiler optimizations to aliasing 
>problems again, unless you can statically solve all of them.  
>Darn. 
>
>It's looking like I really do need to have distinguished 
>classes of mutable and immutable objects, and the programmer 
>really does need to be aware of them, in order to get the 
>compilation benefits of relatively tiny "critical sections" 
>and fast locking in multiprocessing environments.  Darn. 
>
>                                Bear
>

Aliasing is statically resolvable in single-threaded code, but not in
parallel code.  I see Rob mentioned Linear Lisp - it might help if
most of your language can be expressed in a similar single-use form.
Static single use (SSU) is nearly impossible to achieve in a language
that allows mutation, but static single assignment (SSA) can
approximate SSU to varying degrees.  Using SSA can help a lot with
analysis, but handling general continuations (if you care to) in SSA
can be a real b*tch.
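
In Lisp terms, SSA is essentially rewriting mutation as rebinding.
A toy example (the X0/X1/X2 names are just for illustration):

    ;; With mutation, X is assigned three times:
    (let ((x 1))
      (setq x (+ x 1))
      (setq x (* x 2))
      x)                  ; => 4

    ;; The SSA rendering binds each name exactly once:
    (let* ((x0 1)
           (x1 (+ x0 1))
           (x2 (* x1 2)))
      x2)                 ; => 4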

It isn't very Lispy, but you could use syntax to distinguish between
"safe" and "unsafe" code and forbid using mutation in safe code.  That
would at least help contain the problem.

George
From: Ray Dillinger
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <49248ad6$0$95518$742ec2ed@news.sonic.net>
George Neuner wrote:

> On Tue, 18 Nov 2008 20:46:33 -0800, Ray Dillinger <····@sonic.net>

> So what you're really miffed about is mutation with structure sharing
> rather than (or maybe along with) mutation with aliasing.

Yes, I think so. 

(equal? (and multiprocessing cache-coherency-requirements mutation 
             (or aliasing structure-sharing)) 
        trouble) 
==> #t

I don't want to give up multiprocessing.  I can't seem to address 
cache coherency requirements without writing hardware-dependent 
microcode, and I'd like the implementation to be portable.  I 
don't want to give up mutation, because it's somewhat useful.  
So I was looking at aliasing & structure sharing to see if it 
could reasonably be left out of the core language. 
 
> Aliasing is statically resolvable in single threaded code, but not in
> parallel code.  

That is an insight with which I should have started, but instead it 
has cost me much head sweat and frustration over a period of weeks. 
At this point I can tell you from hard experience that it is true.

> It isn't very Lispy, but you could use syntax to distinguish between
> "safe" and "unsafe" code and forbid using mutation in safe code.  That
> would at least help contain the problem.

I intend something a lot like that with some kind of declarations - 
but it's not implemented yet, or even really designed.  Still working 
at a lower level so far. 

                                Bear
From: George Neuner
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <kg1bi45brmlim9r5r2hir1eh8j7g5ksie2@4ax.com>
On Wed, 19 Nov 2008 13:52:44 -0800, Ray Dillinger <····@sonic.net>
wrote:

>George Neuner wrote:
>
>> On Tue, 18 Nov 2008 20:46:33 -0800, Ray Dillinger <····@sonic.net>
>
>> So what you're really miffed about is mutation with structure sharing
>> rather than (or maybe along with) mutation with aliasing.
>
>Yes, I think so. 
>
>(equal? (and multiprocessing cache-coherency-requirements mutation 
>             (or aliasing structure-sharing)) 
>        trouble) 
>==> #t
>
>I don't want to give up multiprocessing.  I can't seem to address 
>cache coherency requirements without writing hardware-dependent 
>microcode, and I'd like the implementation to be portable.  I 
>don't want to give up mutation, because it's somewhat useful.  
>So I was looking at aliasing & structure sharing to see if it 
>could reasonably be left out of the core language. 

Can you say specifically what trouble you're having?  Or expecting?  

I can't say that I worry much about coherence trouble in a shared
memory system.  The specifics of the update and snoop protocols affect
bandwidth and performance, but regardless, if you let the cache do its
job, you should not have any additional consistency issues beyond what
you'd expect from MP.

If you're deliberately messing with cache bypass or preloading, that
is a different matter.

George
From: Slobodan Blazeski
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <97993c46-d400-4a51-ab7b-80b621e9ee62@i20g2000prf.googlegroups.com>
On Nov 19, 4:50 am, ····@rpw3.org (Rob Warnock) wrote:
> Ray Dillinger <····@sonic.net> wrote:
>
> [...]
> I suspect from your continuing upset at people's responses in this
> thread that what you might really be looking for is not ordinary
> applicative Lisp at all, but rather some flavor of "linear logic"
> instead, such as the "linear Lisp" Henry Baker has written about at
> great length ... and several others. See his archive page:
>
>    http://home.pipeline.com/~hbaker1/
>

Could linear Lisp support continuations?
bobi
From: Rob Warnock
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <MMydnYnva8cdcL7UnZ2dnUVZ_tHinZ2d@speakeasy.net>
Slobodan Blazeski  <·················@gmail.com> wrote:
+---------------
| ····@rpw3.org (Rob Warnock) wrote:
| > ... such as the "linear Lisp" Henry Baker has written about at
| > great length, see: ...[examples]...
| > and several others. See his archive page:
| >   http://home.pipeline.com/~hbaker1/
| 
| Could linear lisp support continuations?
+---------------

I'm not an expert in either linear logic or linear Lisp, but
from a brief look at Baker's papers, I'd have to say probably
"yes, sort of", but not usefully in the ways Scheme programmers
would normally expect. Consider this excerpt [with apologies if
I've in any way distorted the meaning by eliding]:

    http://home.pipeline.com/~hbaker1/Use1Var.html
    'Use-Once' Variables and Linear Objects -- Storage Management,
    Reflection and Multi-Threading
    Henry G. Baker (1994)
    ...
    A 'use-once' variable must be dynamically referenced exactly
    once within its scope. Unreferenced use-once variables must be
    explicitly killed, and multiply-referenced use-once variables
    must be explicitly copied; this duplication and deletion is
    subject to the constraint that some linear datatypes do not
    support duplication and deletion methods.
    ...
    For example, use-once variables allow for the safe/controlled
    use of reified language implementation objects like single-use
    continuations.
    ...
    Continuations can be linear (enabling an efficient implementation,
    unlike [Lincoln92]), since they normally return only once ("one-shot"
    [Haynes87]), even when doing co-routining and time-sharing [Wand80].
    (Prolog-like backtracking [Haynes87] requires expensive copying--if
    allowed--or (non-linear) sharing of continuations.) Of course, a
    linear call/cc can have no implicit continuation--the only way to
    return from a linear call/cc is to invoke the passed continuation,
    so a function which does not escape via a passed continuation must
    return it for its caller's use. Escaping via a continuation kills
    an implicit continuation thereby causing intermediate linear variables
    to be finalized a la Common Lisp unwind-protect.
    ...

So, yes, you can have linear continuations, but they won't behave
like default (non-linear) Scheme multi-use continuations.
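
You can approximate the one-shot discipline even in CL, which may make
the distinction concrete (an untested sketch; CALL/1CC is my own name,
and unlike Baker's linear call/cc it still permits a normal return):

    (defun call/1cc (f)
      ;; BLOCK/RETURN-FROM already restricts the escape to the dynamic
      ;; extent of the call; the USED flag adds "at most once".
      (let ((used nil))
        (block escape
          (funcall f (lambda (value)
                       (when used
                         (error "one-shot continuation invoked twice"))
                       (setf used t)
                       (return-from escape value))))))

    (call/1cc (lambda (k) (+ 1 (funcall k 42))))   ; => 42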


-Rob

p.s. In the presence of linear variables, non-linear (Scheme-style)
continuations also violate the "one-use" guarantee of linear variables,
since invoking a continuation captured between the definition of a
linear variable and its single use (single reference) would cause
the variable to be referenced again. Oops.

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Pascal J. Bourguignon
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <7czljwovrf.fsf@pbourguignon.anevia.com>
····@rpw3.org (Rob Warnock) writes:
> 2. The *object* a variable is currently bound to may be mutated
>    [e.g. as with (SETF accessor)], provided it is a mutable object.
>    [Some aren't. E.g., numbers & characters are immutable in CL.]
>
> So it seems that in fact your fundamental complaint is really that
> Lisp allows a mutable *object* to be simultaneously bound to two
> different variables. Sorry, that's life. You can't prevent that
> without preventing function calls!![1]

You can, with a reversible computer.  Then you don't copy, you swap,
and while the argument is used by a function, it's not accessible to
the caller.

-- 
__Pascal Bourguignon__
From: Kaz Kylheku
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <20081119121433.299@gmail.com>
["Followup-To:" header set to comp.lang.lisp.]
On 2008-11-19, Rob Warnock <····@rpw3.org> wrote:
> Ray Dillinger <····@sonic.net> wrote:
> +---------------
>| Pascal Costanza wrote:
>| > You bet it is.
>| > Side effects are necessary for properties that change over time.
>| > (A silly example: the wage of a person.) And with it comes the
>| > necessity of dealing with object identity.
> ...
>| Ergh.  Did you even read my post?  I was not questioning the value 
>| of mutation itself.  I was questioning the value of the ability 
>| for mutation to change the value of some *DIFFERENT* variable 
>| besides the one being mutated.
> +---------------
>
> In Lisp, variables per se are never mutated:

A variable is by definition a named quantity which can be mutated.

Constants are never mutated.

> 1. A variable *binding* may be mutated to refer to another object
>    [e.g., as with a simple SETQ].

Mutating a binding means modifying the environment so that a name which
refers to a memory location now refers to a different memory location.

This isn't what happens when we store a new value in a variable.  The binding
stays the same.

> 2. The *object* a variable is currently bound to may be mutated

A variable isn't bound to a value. A variable is a binding between a symbol and
a memory location, according to some environment. The variable stores the
object by means of this bound memory location.
From: Rob Warnock
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <uMOdnafqcfejfLnUnZ2dnUVZ_jidnZ2d@speakeasy.net>
Kaz Kylheku  <········@gmail.com> wrote:
+---------------
| ["Followup-To:" header set to comp.lang.lisp.]
| Rob Warnock <····@rpw3.org> wrote:
| > In Lisp, variables per se are never mutated:
| 
| A variable is by definition a named quantity which can be mutated.
+---------------

Whose "definition"? ;-}  ;-}  Seriously, it varies. I was trying to
make a distinction between C-like languages and Lisp-family languages,
but even within the latter it varies [and note that Ray Dillinger tends
to be more of a Schemer than a Lisper, so that matters].

CLHS says this (which is not as precise about the semantics as it
might be IMHO):

    variable n. a binding in the ``variable'' namespace.
                See Section 3.1.2.1.1 (Symbols as Forms).

    binding n. an association between a name and that which the name
               denotes. ``A lexical binding is a lexical association
               between a name and its value.'' When the term binding is
               qualified by the name of a namespace, such as ``variable''
               or ``function,'' it restricts the binding to the indicated
               namespace, as in: ``let establishes variable bindings.''
               or ``let establishes bindings of variables.''

    mutate v.  [Not defined]

    3.1.1 Introduction to Environments
    A binding is an association between a name and that which the
    name denotes. Bindings are established in a lexical environment
    or a dynamic environment by particular special operators.
    ...
    A single name can simultaneously have more than one associated
    binding per environment, but can have only one associated binding
    per namespace.

    3.1.2.1.1 Symbols as Forms
    ...
    A variable can store one object. The main operations on a
    variable are to read[1] and to write[1] its value.
    ...
    Non-constant variables can be assigned by using setq or bound[3]
    by using let.

    assign v.t. (a variable) to change the value of the variable in
	       a binding that has already been established.
	       See the special operator setq.

    write v.t. 1. (a binding or slot or component) to change the value
	       of the binding or slot.

Note that some confusion already appears between "assigning...the value
of the variable" and "changing the value of the binding".

Scheme (well, R5RS) makes it a bit more clear (maybe):

    3.1 Variables, syntactic keywords, and regions
    An identifier may name a type of syntax, or it may name a
    location where a value can be stored. ... An identifier that names
    a location is called a variable and is said to be bound to that
    location. The set of all visible bindings in effect at some point
    in a program is known as the environment in effect at that point.
    The value stored in the location to which a variable is bound is
    called the variable's value. By abuse of terminology, the variable 
    is sometimes said to name the value or to be bound to the value.
    This is not quite accurate, but confusion rarely results from this
    practice.

Except in this thread, apparently!  ;-}

I was trying to *not* abuse the terminology, but seem to have failed.
Let me try again following the Scheme terminology more closely. The
main point I was trying to make when I said "In Lisp, variables per se
are never mutated" was that *which* location a variable is bound to is
never changed during the life of that variable [modulo copying GC issues!].

And when I said "A variable *binding* may be mutated to refer to another
object" I simply meant that the value stored *in* a binding's location
*can* be changed.

NOTA BENE: In neither Scheme nor Lisp can the programmer get any
"handle" on the location to which a variable is bound other than
via the variable itself. That is, the "address" of the location is
always implicit and inaccessible [which is why copying GC works].

+---------------
| > 1. A variable *binding* may be mutated to refer to another object
| >    [e.g., as with a simple SETQ].
| 
| Mutating a binding means modifying the environment so that a name which
| refers to a memory location now refers to a different memory location.
+---------------

I disagree. You're talking about *extending* the environment, perhaps,
creating a *new* binding that *shadows* a previous binding, but that
does *not* mutate the previous binding, which still refers to the
same (previous) location. Bindings themselves are immutable once
created.

+---------------
| This isn't what happens when we store a new value in a variable.
| The binding stays the same.
+---------------

(*sigh*) I guess it depends on what you mean by the "binding"
and by "mutation of a binding". I was using "binding" to mean
the association between a variable (an identifier in some visible
variable environment) and the location to which it is bound.
By "mutation of the binding" I -- imprecisely, confusingly,
perhaps -- meant changing the contents of the location to which
the variable is bound. I probably should have said "mutation of
the binding's value".

+---------------
| > 2. The *object* a variable is currently bound to may be mutated
| 
| A variable isn't bound to a value. A variable is a binding between
| a symbol and a memory location, according to some environment.
| The variable stores the object by means of this bound memory location.
+---------------

Exactly, with the caveat that the object isn't necessarily *in* the
bound memory location, but is only *denoted* by it! Which is the only
way that two different bindings could denote "the same" object.

Note: Yes, this "denotation" smells suspiciously like what is called
a "pointer" in other languages, and *would* be a pointer except that
there is no way within Scheme/Lisp to get a handle on either it *or*
the bound location in which it resides.

See if you can agree with this version of my previous points #1 & #2:

1. Once established, a variable binding [the association between an
   identifier and a memory location, according to some environment]
   cannot be changed [though it can be shadowed, by extending that
   environment].

2. Assigning to a variable (or equivalently, to a binding) changes
   which object is denoted by the memory location of the binding.

And now we should add a third:

3. "The same" object may be denoted simultaneously by multiple
   distinct variable bindings.

And perhaps even a fourth, for implementers only:  ;-}

4. Immutable objects may be freely copied and/or represented as
   "immediate" denotations provided that there is no way within
   the language to tell that such copying has been done; i.e.,
   that such copies are in all respects "the same" as the objects
   copied from.
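
A compact (untested) illustration of points 1-3:

    (let ((x (list 1 2)))
      (let ((x 'shadow))       ; point 1: a NEW binding shadows X;
        x)                     ;          the outer binding is untouched
      (setq x (list 3 4))      ; point 2: same binding, new object denoted
      (let ((y x))             ; point 3: two bindings, one object
        (setf (car y) 99))
      x)                       ; => (99 4), visible through X as well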


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Paul Wallich
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <gg16jn$mbd$1@reader1.panix.com>
Ray Dillinger wrote:
> Pascal Costanza wrote:
> 
>> Ray Dillinger wrote:
>>> Is the ability to alter the value of one variable by
>>> mutating the value of a different or "alias" variable
>>> ever actually important to our ability to solve
>>> problems?  I'm increasingly of the opinion that it's
>>> not.
>> You bet it is.
>>
>> Side effects are necessary for properties that change over time. (A
>> silly example: the wage of a person.) And with it comes the necessity of
>> dealing with object identity.
>>
>> If you want to have a unified language in which most data is represented
>> in the same way, it's inevitable that your underlying data structure
>> supports side effects in one way or the other. 
> 
> <snip>
> Ergh.  Did you even read my post?  I was not questioning the value 
> of mutation itself.  I was questioning the value of the ability 
> for mutation to change the value of some *DIFFERENT* variable 
> besides the one being mutated.  This seems like a "chatterbot" 
> response where, because I accidentally used some set of keywords 
> I'm getting a canned answer to a question which is sort of 
> related to the one I asked, but not the one I asked.
> 
> You're making a lot of arguments in favor of a very broad concept 
> (side effects) that I only asked about one very narrow slice of 
> (side effects that change visible values other than the one on 
> which the side effect was requested). 

What do you mean by "different" here?  Are you complaining about 
anything that shares structure, or only about things that share 
structure under the hood (e.g. where the compiler has decided to re-use 
or coalesce stuff in ways that are legal but not obvious at first 
glance)? It seems to me that any structure that's visibly shared, i.e. 
visible to a simple-minded code walker, would be fine according to you, 
but I'm not sure.

paul
From: Pascal Costanza
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <6oi97kF3qq53U1@mid.individual.net>
Ray Dillinger wrote:
> Pascal Costanza wrote:
> 
>> Ray Dillinger wrote:
>>> Is the ability to alter the value of one variable by
>>> mutating the value of a different or "alias" variable
>>> ever actually important to our ability to solve
>>> problems?  I'm increasingly of the opinion that it's
>>> not.
>> You bet it is.
>>
>> Side effects are necessary for properties that change over time. (A
>> silly example: the wage of a person.) And with it comes the necessity of
>> dealing with object identity.
>>
>> If you want to have a unified language in which most data is represented
>> in the same way, it's inevitable that your underlying data structure
>> supports side effects in one way or the other. 
> 
> <snip>
> Ergh.  Did you even read my post?

Yes.

> I was not questioning the value 
> of mutation itself.  I was questioning the value of the ability 
> for mutation to change the value of some *DIFFERENT* variable 
> besides the one being mutated.  This seems like a "chatterbot" 
> response where, because I accidentally used some set of keywords 
> I'm getting a canned answer to a question which is sort of 
> related to the one I asked, but not the one I asked.
> 
> You're making a lot of arguments in favor of a very broad concept 
> (side effects) that I only asked about one very narrow slice of 
> (side effects that change visible values other than the one on 
> which the side effect was requested). 

I meant the narrow slice you were talking about.

> 
>> Even if you only expose side effects via some form of transaction
>> mechanisms, like in Clojure, somewhere deep down you need plain ol' side
>> effects to implement that. If you don't expose the plain ol' side
>> effects in your 'proper' language, then you divide your world into those
>> who are allowed to use plain ol' side effects (the implementor of the
>> language) and those who are not allowed to do so (the users of the
>> language). 
> 
> And yet again, people think I'm talking about banning side effects 
> in user code.  I'm not, honest.  I regret that my question has 
> attracted attention from the 'functional vs. non-functional' language
> wars and the attendant assumptions that all involved are extremists 
> who want or oppose a complete ban something or other.  I don't. 
> If my phrasing gave the impression that I do, then it was a 
> mistake on my part and I apologize for it. 

True, there is too much use of side effects, and more functional data 
structures are worth investigating.

> What I'm considering is making a dialect in which the need for 
> mutation-without-rebinding is minimized in the main language and 
> served by libraries intended for use in specialized applications.  
> This would permit the system to note when the libraries that do 
> such things are and are not linked and make optimizations accordingly. 
> I thought it would be beneficial because very different sets of
> optimizations give the most benefit in those two different scenarios. 

That wasn't clear to me from your previous posting. Yes, that sounds 
interesting.


Pascal

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
From: Pascal J. Bourguignon
Subject: Re: Are shared-structure mutations actually important?
Date: 
Message-ID: <7c4p24qboe.fsf@pbourguignon.anevia.com>
Pascal Costanza <··@p-cos.net> writes:

> Ray Dillinger wrote:
>> Is the ability to alter the value of one variable by mutating the
>> value of a different or "alias" variable
>> ever actually important to our ability to solve problems?  I'm
>> increasingly of the opinion that it's not. 
>
> You bet it is.
>
> Side effects are necessary for properties that change over time. (A
> silly example: the wage of a person.) And with it comes the necessity
> of dealing with object identity.

If something changes over time, perhaps it would be useful to keep a
history of the changes.  If I get a raise for December, I don't want
to pay taxes on (* 12 december-pay).  (Cf. Google's "debugging in
time", and McCarthy's Elephant 2000, where programs can refer to the
past.)

-- 
__Pascal Bourguignon__