According to the CLHS in re SORT:
"[the sort] Predicate should return true if and only if the first
argument is strictly less than the second (in some appropriate sense).
If the first argument is greater than or equal to the second (in the
appropriate sense), then the predicate should return false."
I am trying to sort a list of the nodes in DAG where kids know their
parents. My only requirement on the sort result is that no node precede
any of its ancestors (its parent or its parent's parent or etc). Two
siblings, for example, can appear in any order.
My predicate correctly returns T if arg1 is an ascendant of arg2.
Otherwise it returns nil. So nil means either that arg2 is an ascendant
of arg1, or that neither is an ascendant of the other. My reading of the
above CLHS excerpt tells me SORT is fine with that.
My nodes ain't getting sorted correctly. In at least one case, a kid
comes out before an ascendant (its parent, as it happens).
I observe that the two never get compared directly. I realize that would
not always be necessary, but printing out all the comparisons I find
only two out of some two or three dozen that test out to be True. So
that is not a lot of information by which to sort quite a few things
(have not counted them up, but around twenty).
I am about to start whittling the thing down to where I have a literal
list and some simpler code to play with so I can easily test in other
Lisps, but I thought I would drop by here to find out if I am missing
something subtle about sorting.
kenny
Kenny Tilton wrote:
> I am trying to sort a list of the nodes in DAG where kids know their
> parents. My only requirement on the sort result is that no node precede
> any of its ancestors (its parent or its parent's parent or etc). Two
> siblings, for example, can appear in any order.
This sound awfully lot like a topological sort. A depth-first
traversal where you cons the nodes on to list when you leave
the node should do the trick.
Unless... Does "where kids know their parents" imply that a
parents don't know the kids?
--
Jens Axel S�gaard
Jens Axel S�gaard wrote:
> Kenny Tilton wrote:
>
>> I am trying to sort a list of the nodes in DAG where kids know their
>> parents. My only requirement on the sort result is that no node
>> precede any of its ancestors (its parent or its parent's parent or
>> etc). Two siblings, for example, can appear in any order.
>
>
> This sound awfully lot like a topological sort. A depth-first
> traversal where you cons the nodes on to list when you leave
> the node should do the trick.
I am reaching the sort in a roundabout way (which seems to be the
problem--see below), so I kind of have to deal with the collection as
presented to me at the time when it is presented to me (and this is
actually a sort within a larger sort, where the predicate has actually
first determined a tie on an opcode and is now trying to do a tiebreak
so parents go first).
>
> Unless... Does "where kids know their parents" imply that a
> parents don't know the kids?
>
No, so I could traverse to compare the two, but my current method (using
my-ascendant-p to check that) is OK, I think the problem is:
Looking at the population being sorted, I do not have the whole tree.
(not every node goes through this operation I need sorted, and I queue
up the operations and defer them until after the sort as they come up,
so I only have part of the tree).
My guess is that, if I had the whole tree, SORT would end up with enough
info to get things right. But let me confirm that before hypothesizing
further. It still does not make sense, since if SORT had bothered to
hand me the parent and child for comparison I would have given it the
right answer. So... how does SORT know it can stop? I still smell a bug.
kenny
Kenny Tilton wrote:
> Looking at the population being sorted, I do not have the whole tree.
> (not every node goes through this operation I need sorted, and I queue
> up the operations and defer them until after the sort as they come up,
> so I only have part of the tree).
That does sounds a bit tricky. If the tree is small, then you could
afford to build a tree where the parents know their kids, and then
go through that.
--
Jens Axel S�gaard
Kenny Tilton wrote:
> My predicate correctly returns T if arg1 is an ascendant of arg2.
> Otherwise it returns nil. So nil means either that arg2 is an ascendant
> of arg1, or that neither is an ascendant of the other. My reading of the
> above CLHS excerpt tells me SORT is fine with that.
I think this should work, unless "ascendant" means "direct ascendent".
What does you comparison predicate return when a grandparent and a kid
is compared?
--
Jens Axel S�gaard
Jens Axel S�gaard wrote:
> Kenny Tilton wrote:
>
>> My predicate correctly returns T if arg1 is an ascendant of arg2.
>> Otherwise it returns nil. So nil means either that arg2 is an
>> ascendant of arg1, or that neither is an ascendant of the other. My
>> reading of the above CLHS excerpt tells me SORT is fine with that.
>
>
> I think this should work, unless "ascendant" means "direct ascendent".
>
> What does you comparison predicate return when a grandparent and a kid
> is compared?
>
No cases are offered to the predicate, but I think I would get it right:
(defun fm-ascendant-p (older younger)
(cond
((null (fm-parent younger)) nil)
((eq older (fm-parent younger)) t)
(t (fm-ascendant-p older (fm-parent younger)))))
I certainly do not see any errors in the actual comparisons (I print
them all out and the result).
Meanwhile, I am not so sure I am not giving the whole tree to SORT.
Lemme work on this some more and report back with better info.
thx, kenny
So, looking at the hyperspec,
If the key and predicate always return, then the sorting operation
will always terminate, producing a sequence containing the same
elements as sequence (that is, the result is a permutation of
sequence). This is guaranteed even if the predicate does not really
consistently represent a total order (in which case the elements
will be scrambled in some unpredictable way, but no element will be
lost). If the key consistently returns meaningful keys, and the
predicate does reflect some total ordering criterion on those keys,
then the elements of the sorted-sequence will be properly sorted
according to that ordering.
What you would *like* the hyperpsec to say (for your purposes) is that
if the predicate represents a partial order, that the returned
sequence will respect that partial order --- in other words, that you
can use sort to do topological sort. Unfortunately, it doesn't say
that. According to the spec, if the predicate I give is a partial
order, it looks like just returning the sequence in a random order is
conforming behavior. Your predicate is a partial order, so you can
portably expect nothing except that you get back a new sequence
containing the same elements.
I agree with Jens Axel Søgaard's first suggestion in that I believe
what you want is a topological sort. The good news is it's O(n), not
O(n log n). The bad news is that you probably get to right it
yourself. The good news is it's not really hard.
(Side note: I think this is the "right" specification for sort, in
that the optimal algorithms for sort with a full ordering do not
automatically respect partial orderings (as you're observing). It
would be amusing to formalize this, if I had more time.)
Cheers,
rif
rif <···@mit.edu> wrote:
+---------------
| I agree with Jens Axel Søgaard's first suggestion in
| that I believe what you want is a topological sort.
+---------------
Ditto.
+---------------
| The good news is it's O(n), not O(n log n). The bad news is that you
| probably get to right it yourself. The good news is it's not really hard.
+---------------
And the best news is that if you pick up your copy of Knuth TAoCP
Vol.1 Ch.2 "Data Structures" and look for "Algorithm T", you'll
find a nice reference implementation of topological sort.
Note: I've been using it on & off for *years*, coded up in whatever
language of the moment I was required to use. And then I don't use
it for a while, and the next time I need it I find that I've forgetten
the magic trick that makes it all "just work" and have to go back
to Knuth to learn the tricky bit again... (*sigh*)
-Rob
-----
Rob Warnock <····@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607
rif wrote:
> So, looking at the hyperspec,
>
> If the key and predicate always return, then the sorting operation
> will always terminate, producing a sequence containing the same
> elements as sequence (that is, the result is a permutation of
> sequence). This is guaranteed even if the predicate does not really
> consistently represent a total order (in which case the elements
> will be scrambled in some unpredictable way, but no element will be
> lost). If the key consistently returns meaningful keys, and the
> predicate does reflect some total ordering criterion on those keys,
> then the elements of the sorted-sequence will be properly sorted
> according to that ordering.
>
> What you would *like* the hyperpsec to say (for your purposes) is that
> if the predicate represents a partial order....
aha. what is a partial vs total order? Wait, Google is my friend:
> Total order
> From Wikipedia, the free encyclopedia
>
> In mathematics, a total order, linear order or simple order on a set X is
>any binary relation on X that is antisymmetric, transitive, and total.
>This means that if we denote one such relation by ≤ then the following
>statements hold for all a, b and c in X:
>
> if a ≤ b and b ≤ a then a = b (antisymmetry)
> if a ≤ b and b ≤ c then a ≤ c (transitivity)
> a ≤ b or b ≤ a (totality)
OK, I am missing transitivity. And I can fix that by imposing a tougher
sort than the application requires (traversal order).
Thanks! You just saved me half a day and an embarrassing bug report. :)
kenny
On Mon, 27 Feb 2006 00:10:29 GMT, Kenny Tilton
<·············@nyc.rr.com> wrote:
>OK, I am missing transitivity. And I can fix that by imposing a tougher
>sort than the application requires (traversal order).
If you have to traverse a big set of nodes to find the traversal order
of a small subset, and you would rather just sort that small subset,
you might want to have each node remember its depth, which would be
defined as being one greater than the depth of its deepest parent.
Then you could simply sort by depth, with no need for traversal.
> aha. what is a partial vs total order? Wait, Google is my friend:
>
> > Total order
> > From Wikipedia, the free encyclopedia
> > In mathematics, a total order, linear order or simple order on a set
> > X is
> >any binary relation on X that is antisymmetric, transitive, and total.
> >This means that if we denote one such relation by � then the following
> >statements hold for all a, b and c in X:
> > if a � b and b � a then a = b (antisymmetry)
> > if a � b and b � c then a � c (transitivity)
> > a � b or b � a (totality)
>
> OK, I am missing transitivity. And I can fix that by imposing a
> tougher sort than the application requires (traversal order).
>
> Thanks! You just saved me half a day and an embarrassing bug report. :)
>
> kenny
Unless I misunderstood, you have transitivity. If A is an ancestor of
B and B is an ancestor of C, then A is an ancestor of C. Partial
orderings are in general transitive (basically, partial orderings are
what you get from DAGs, with trees being a special case of DAGs).
What partial orderings are misssing is totality. If A has two
children, B and C, they are not necessarily comparable in a partial
ordering.
You can make them comparable by adding an ID, as has been discussed,
and that may or may not do what you want by default, but The Right
Thing is just to use a topological sort rather than artifically
imposing a total ordering so that you can use a regular sort.
rif
rif wrote:
> What you would *like* the hyperpsec to say (for your purposes) is that
> if the predicate represents a partial order, that the returned
> sequence will respect that partial order --- in other words, that you
> can use sort to do topological sort. Unfortunately, it doesn't say
> that. According to the spec, if the predicate I give is a partial
> order, it looks like just returning the sequence in a random order is
> conforming behavior. Your predicate is a partial order, so you can
> portably expect nothing except that you get back a new sequence
> containing the same elements.
Ah!
To get rid of the "partialness" one could give each node an
id-number, and when the two doesn't lie on the same path to
a common root, compare the id-numbers. The original simplicity
of just using sort seems to be lost though.
--
Jens Axel Søgaard
Jens Axel Søgaard wrote:
> rif wrote:
>
>> What you would *like* the hyperpsec to say (for your purposes) is that
>> if the predicate represents a partial order, that the returned
>> sequence will respect that partial order --- in other words, that you
>> can use sort to do topological sort. Unfortunately, it doesn't say
>> that. According to the spec, if the predicate I give is a partial
>> order, it looks like just returning the sequence in a random order is
>> conforming behavior. Your predicate is a partial order, so you can
>> portably expect nothing except that you get back a new sequence
>> containing the same elements.
>
>
> Ah!
>
> To get rid of the "partialness" one could give each node an
> id-number, and when the two doesn't lie on the same path to
> a common root, compare the id-numbers. The original simplicity
> of just using sort seems to be lost though.
>
Well, there is an opcode that gives me my first sort decision, but many
occurrences sharing each opcode and the opcode that says (effectively)
"instantiate-mirror-instance" requires that parents be created before
children, so SORT is still handy, I just have to restore the tree
ordering lost as the entries were collected.
thx for the input.
kenny