FSet/CL 1.0 released!

From: Scott Burson
Subject: FSet/CL 1.0 released!
Date: Mon, 28 May 2007 17:12:47 +0000
Message-ID: <1180372367.078041.206610@i13g2000prf.googlegroups.com>

I am pleased to announce the first release of FSet/CL, a functional
set-theoretic collections library for Common Lisp.

"Functional" means that update operations return a new collection
rather than modifying the existing one in place. "Set-theoretic"
means, among other things, that like sets etc. in mathematics, these
collections can be arbitrarily nested. Compound types like maps keyed
by sets of sequences _just work_ with no additional programmer effort.

FSet is implemented using heterogeneous weight-balanced binary trees.
FSet/CL, the Common Lisp implementation, uses a single, global,
extensible ordering relation (implemented as a CLOS generic function),
so that FSet collections can contain any combination of types.

Why functional collections?  The home page (see below) has a long
essay about this, but in a nutshell, the stylistic benefits of
functional programming are well known to many of us; and even in mixed
functional/imperative code, as Lisp programs tend to be, the more that
can be done functionally, the better.  Besides, collections (sets,
maps, etc.) are mathematical values like numbers; numbers are
functional in Lisp -- why shouldn't collections be?

Some examples to give the flavor:

* (setq s1 (set 1 2 3))     ; shadowed version of `set', not `cl:set'
#{ 1 2 3 }
* (setq s2 (with s1 'foo))
#{ 1 2 3 FOO }
* s1
#{ 1 2 3 }
* (setq m1 (map (s1 "without foo") (s2 "with foo")))   ; `map' also
shadowed
#{| (#{ 1 2 3 } "without foo") (#{ 1 2 3 FOO } "with foo") |}
* (lookup m1 (set 2 3 1))
"without foo"
* (lookup m1 (set 3 'foo 1 2))
"with foo"

For more information:

FSet home page: http://www.sympoiesis.com/FSet/
FSet on Common-Lisp.net: http://common-lisp.net/project/fset/

FSet is ASDF-Installable and has been tested on most popular CL
implementations.  Please give a try and let me know (a) if you have
any trouble with it, and (b) what you think of the design.

-- Scott L. Burson

Re: FSet/CL 1.0 released! Pascal Costanza
- Re: FSet/CL 1.0 released! Scott Burson
Re: FSet/CL 1.0 released! Geoff Wozniak
Re: FSet/CL 1.0 released! szergling
Re: FSet/CL 1.0 released! Scott Burson
Re: FSet/CL 1.0 released! Scott Burson

From: Pascal Costanza
Subject: Re: FSet/CL 1.0 released!
Date: Mon, 28 May 2007 17:58:35 +0000
Message-ID: <5c0jicF2t5k0nU1@mid.individual.net>

Scott Burson wrote:

> FSet is ASDF-Installable and has been tested on most popular CL
> implementations.  Please give a try and let me know (a) if you have
> any trouble with it, and (b) what you think of the design.

This looks interesting. However, there are two issues that I can think 
of after superficially browsing through the design document:

- I don't think that the @ macro is a good idea. If you really want to 
use a variable both in a variable and a function position, it is 
straightforward to just define a local macro, like this:

(let ((v ...))
   (macrolet ((v (&rest args) `(funcall v ,@args)))
     ...))

This can be abstracted away in a utility macro:

(defmacro let1 (bindings &body body)
   `(let ,bindings
      (macrolet ,(loop for (var) in bindings
                       collect `(,var (&rest args)
                                 `(funcall ,',var ,@args)))
        ,@body)))

[Untested.]


- Christophe Rhodes recently came up with a design for supporting 
generic sequences in Common Lisp. It might be interesting to see how 
your design fits with his ideas. See 
http://www.doc.gold.ac.uk/~mas01cr/papers/ilc2007/sequences-20070301.pdf


Cheers,
Pascal

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/

From: Scott Burson
Subject: Re: FSet/CL 1.0 released!
Date: Tue, 29 May 2007 16:01:14 +0000
Message-ID: <1180454474.613022.167090@q19g2000prn.googlegroups.com>

Oh geez, I wrote a long reply yesterday that seems to have vanished
into the bit bucket.  How odd.  Oh well, in a nutshell...

On May 28, 10:58 am, Pascal Costanza <····@p-cos.net> wrote:
> - I don't think that the @ macro is a good idea.

Well, it's a minor feature, and of course no one is obliged to use it,
but I think some people may like it.

It's not clear to me that your `let1' macro is any more elegant.

> - Christophe Rhodes recently came up with a design for supporting
> generic sequences in Common Lisp. It might be interesting to see how
> your design fits with his ideas.

Thanks for the pointer.  There is an awful lot of commonality between
Rhodes' proposal and what I have done for FSet; but the two systems
don't quite dovetail perfectly.  [I'll have to reconstruct the
explanation from my lost post later -- no time now.]

-- Scott

From: Geoff Wozniak
Subject: Re: FSet/CL 1.0 released!
Date: Tue, 29 May 2007 03:09:08 +0000
Message-ID: <1180408148.585040.13650@q69g2000hsb.googlegroups.com>

On May 28, 1:12 pm, Scott Burson <········@gmail.com> wrote:
> I am pleased to announce the first release of FSet/CL, a functional
> set-theoretic collections library for Common Lisp.
>

Cool!  I look forward to checking this out.  Maybe I'll try to use it
in the work I presented at ILC'07.

Geoff

From: szergling
Subject: Re: FSet/CL 1.0 released!
Date: Tue, 29 May 2007 10:52:03 +0000
Message-ID: <1180435923.748429.318560@i38g2000prf.googlegroups.com>

> I am pleased to announce the first release of FSet/CL, a functional
> set-theoretic collections library for Common Lisp.
>
> "Functional" means that update operations return a

That reminded me of this Google summoer of code project:

Purely Functional Data Structures
http://code.google.com/soc/lispnyc/appinfo.html?csaid=F098323F679401F6

Just thought some of you might be interested.

From: Scott Burson
Subject: Re: FSet/CL 1.0 released!
Date: Fri, 15 Jun 2007 04:55:02 +0000
Message-ID: <1181883302.527614.179710@g37g2000prf.googlegroups.com>

[Second posting attempt]

On Jun 8, 8:00 am, Joerg Hoehle <······@users.sourceforge.net> wrote:
> > * (setq m1 (map (s1 "without foo") (s2 "with foo")))   ; `map' also
> > shadowed
> > #{| (#{ 1 2 3 } "without foo") (#{ 1 2 3 FOO } "with foo") |}
>
> [and from the web page:]
>
> >(let ((x 1)) (set x 2)) returns the set {1 2}
>
> The former example shows that 'map' is a macro with its own evaluation
> rules, while 'set' has functional semantics.  Why such a distinction?

Actually, they're both a little more complicated than that.  They're
both macros with syntax, but the syntax of `map' is more complex (and
more obviously so) because a map contains pairs.  I thought it would
be clearer, particularly for `map' forms with many pairs, to write:

(map (x y) (foo mumble) (1 17) ...)

than

(map x y foo mumble 1 17 ...)

but certainly, the extra nesting level in the first version does
require an understanding of the syntax, where one might guess the
meaning of the second (though one first has to realize that this isn't
`cl:map' anyway).

But there's still more to it than this.  Both `set' and `map' (and the
other constructor macros too, for that matter) take "splice
arguments", marked as sublists whose car is the symbol `$'.  These
indicate that a value is to become, not an element of the constructed
collection, but a subcollection.  So, for instance:

(set ($ s1) 3 17 'foo)

is equivalent to

(union s1 (set 3 17 'foo))

So in general, you'll have to know the syntax of these macros in order
to read code containing them.

I can see, though, that I should explain all this better in the
introductory material and design document, and maybe revise the CLiki
examples.  Sorry it wasn't all clearer.

I assure you, anyway, that the functional interface is not forgotten
by any means -- in fact, the constructor macros were something of an
afterthought, though I think they're nice for some uses.  Normally,
programmatic construction of maps would be done, as you suggest, by
repeated calls to `with'.  Or, if you have an alist, you can simply do

  (convert 'map alist)

> It looks like
> (map ($ m1) ($ m2)) is a consise operation for merging two maps.
> How do you merge N maps or a sequence of maps?

Oh!  I didn't read your whole message before writing the above.  So,
you've seen the splice syntax.  In fact, `map' takes any number of
such subforms -- or, you can use the functional `map-merge' operation.

> Do you have something like reduce to operate on sequences?
> ("reduce" #'"functional-map" sequence-of-maps
>   :from-end t/nil ; aka foldl or foldr in functional languages
>   :initial-value (empty-map))

There's a `fold' method that works on all these classes, though you
raise a good point -- I don't have a way to do `from-end'/`foldr' on
seqs.  (For the other classes, you're not supposed to care what order
they're in.)

> >In Lisp it's actually quite cheap to just convert the collection to a
> >list and cdr-iterate over that in the familiar way.
>
> What about having more than 1000 elements?
> Such conversion are not cheap.

But consider that you're probably going to do something with each of
those 1000 elements.  Converting the collection to a list is usually
going to be cheap by comparison, with modern GCs (and in particular,
it still takes linear time).  It's only when you are usually going to
exit the iteration prematurely -- without actually looking at many of
the elements -- that this conversion is a big loss (and can hurt not
just the time but the time complexity order of your algorithm).

> I can't understand why you wouldn't
> provide an enumerator ("impure", what the heck?)

Well, despite the above argument, I probably will provide stateful
enumerators eventually.  It dawned on me recently that I could
allocate a single array per collection to be iterated over (since the
tree depth is bounded by the behavior of the balancing algorithms) and
use it as a stack.  This is probably the best way as it entails only
one allocated object per collection.

-- Scott

From: Scott Burson
Subject: Re: FSet/CL 1.0 released!
Date: Wed, 20 Jun 2007 18:20:51 +0000
Message-ID: <1182363651.775588.277740@x35g2000prf.googlegroups.com>

On Jun 8, 8:00 am, Joerg Hoehle <······@users.sourceforge.net> wrote:
> > >In Lisp it's actually quite cheap to just convert the collection to a
> >list and cdr-iterate over that in the familiar way.
>
> What about having more than 1000 elements?
> Such conversion are not cheap.  I can't understand why you wouldn't
> provide an enumerator ("impure", what the heck?)
> Why not, in Lisp, have a function that can return a closure which
> iterates over any FSet type?

Okay, I've done exactly as you suggested, and I think the results
might surprise you.  I tested both the list and the closure strategies
on small sets (I chose size 4) and large ones (size 200).  For a loop
that does nothing but repeatedly iterate over a set -- an artificial
benchmark, obviously -- the two approaches take about the same amount
of time (very close on the small sets; on the large sets, the closure
approach is about 10% _slower_).  Consing is about the same on the
small sets, but on the large sets the closure approach conses less by
a factor of 8 (a factor that would continue to grow, of course, on
even larger sets).

(All tests were done in SBCL x86/Darwin on a Core 2 Duo.)

So there's no direct performance gain from using the closure
strategy.  Why not?  Well, there is some overhead introduced by the
closure invocation.  When I did the same iteration without using a
closure -- by directly invoking the internal routines that operate on
the iterator -- I saw speedups of 44% on the small sets and 25% on the
large sets.  (Why the closure strategy should perform relatively
better on the small sets is an interesting question for which I do not
have an answer.)  But I don't want clients writing calls directly to
those internal routines -- they're not generic, and they're compiled
speed-3 safety-0 so an erroneous call could endanger the Lisp session.

(Does that seem like too much overhead for the closure invocation?  I
was a little surprised too, but keep in mind this is a loop with a
null body.)

So, what's the message from all this?  First, that consing and
traversing lists is just wicked fast in Lisp, at least when nothing
else is going on.  But there's still the large reduction in consing
provided by the closure strategy on large sets (a factor of 8 at size
200).  In real code, I am aware that this is going to make much more
of a difference than it does in this synthetic benchmark situation.
When objects are being consed some of which will have substantial
lifetimes, also consing a large number of ephemeral objects adds GC
load by causing the longer-lifetime objects to be copied more times
and promoted more frequently.  Depending on the application and the
design of the GC, this effect can be quite substantial.  So, I am
inclined to agree that the effort of creating these iterators has been
worthwhile, though perhaps it was not as urgent as you thought.

I haven't released the new code yet, but I will do so soon.  You're
certainly welcome to peruse it to see if I missed any optimization
opportunities.

-- Scott