RE: EQUAL on circular lists

From: T. Kostelijk 43897
Subject: RE: EQUAL on circular lists
Date: Thu, 13 Jun 1991 14:54:55 +0000
Message-ID: <2867@prles2.prl.philips.nl>

Subject: EQUAL on circular lists

> ········@apolloway.prl.philips.nl's message of 5 Jun 91 14:08:48 GMT

> QUESTION:
> 
> Does anyone have an extended version of EQUAL which can determine in 
> finite time whether two trees of conses have the same printed
> (possibly infinite) representation?
>

The reactions I received confirmed me that not only novice Lispers read rn, 
and that it is useful for exchanging information about common problems.

A short answer to reactions received on the above question, that might be
of general use.

Frank Yellin (uunet!stanford.edu!lucid.com!karoshi!fy) asked 
for a more formal specification. 


MORE FORMAL DEFINITION OF EQUAL-CIRCLE:

Consider the infinite set S of functions denoted as C{A|D}*R,
and without loss of generality, only consider (possibly circular) lists and symbols.
Any two objects a and b are defined EQUAL-CIRCLE iff the following holds:
  for every function f in S for which (f a) is 
    a. a symbol,  1. (f b) is a symbol and
                  2. (eql (f a) (f b)),
    b. a cons,    1. (f b) is a cons,
    c. not defined (an error), (f b) must be not defined.

Notes: 1. Ofcourse I know that only a few functions of S are usually implemented,
          but that is irrelevant for formal statements.
       2. The definition could be extended easily to other atoms than symbols.
       3. (to Frank) So now obviously the lists (#1=(a . #1) b) and (#1=(a . #1) c)
          are not EQUAL-CIRCLE, take f = CADR


DETERMINISTIC FINITE STATE AUTOMATA:

Frank also confirmed my own impression that the problem seems pretty
much the same as the problem of determining whether two deterministic
finite state automata are equivalent.
This problem has a worst case complexity of sizeof(a) * sizeof(b),
where 'size' is the number of cons cells and atoms in the object.

Perhaps someone has a proof of the FSA statement?


THE BEST ALGORTIHM:

The algorithm I like most was send in by Bernhard Pfahringer
(········@ai-vie.uucp), from Vienna, by using the trick to make conses 
EQ temporarily, thus preventing EQUAL-CIRCLE from going into endless loops.
It has the following definition:

(defun EQUAL-CIRCLE (cons1 cons2)
  "Equal-circle determines equivalence between possibly circular trees"
  (or (eql cons1 cons2)
      (and (consp cons1)
	   (consp cons2)
	   (let ((f2 (first cons2))
		 (r2 (rest cons2)))
	     (unwind-protect
		  (progn (setf (first cons2) (first cons1) ; trick
			       (rest cons2) (rest cons1))      
			 (and (equal-circle (first cons1) f2)
			      (equal-circle (rest cons1) r2)))
	       (setf (first cons2) f2                      ; untrick
		     (rest cons2) r2))))))

It works quite efficiently for usual cases and is not hard to understand.     
The runtime can be reduced up to 50% by leaving the UNWIND-PROTECT out,
but then the process should not be interrupted for data consistency.

I want to thank for all reactions, especially from Bernhard and Frank.


FROM:

Ton Kostelijk, 
email adress: ········@apolloway.prl.philips.nl

Re: EQUAL on circular lists, yet again. Frank Yellin
Re: EQUAL on circular lists, yet again. Don Cohen

From: Frank Yellin
Subject: Re: EQUAL on circular lists, yet again.
Date: Mon, 17 Jun 1991 14:07:12 +0000
Message-ID: <FY.91Jun17160712@hardwick.lucid.com>

Ton Kostelijk posts the following algorithm from Bernhard Pfahringer.
> (defun equal-circle (cons1 cons2)
>   (or (eql cons1 cons2)
>       (and (consp cons1)
>            (consp cons2)
>            (let ((f2 (first cons2))
>                  (r2 (rest cons2)))
>              (unwind-protect
>                  (progn (setf (first cons2) (first cons1)
>                              (rest cons2) (rest cons1))
>                         (and (equal-circle (first cons1) f2)
>                              (and (equal-circle (first cons1) f2)
>                                  (equal-circle (rest cons1) r2))))
>               (setf (first cons2) f2 (rest cons2) r2))))))

I received mail from Jim Boyce (the originator of the idea for my original
algorithm) this morning.  He sent me an example for which the above
algorithm goes into an infinite loop, even though it should return 't.

Imagine four cons cells, which we'll call a, b, c, and d, in which the car
and cdr of each of the four cells are themselves one of the four cons cells.

Trivially, any two should be equal-circle to each other, since c {a|d}* r
applied to any of the four cells gives you another of the cells.

Here's what a, b, c, and d look like:

           a       b       c       d
        (c . b) (a . c) (a . c) (d . d)


Now try (equal-circle a d)
                                a          b          c          d

(equal-circle a d)           (c . b)    (a . c)    (a . c)    (d . d)
  BASH d=d.d                 (c . b)    (a . c)    (a . c)    (c . b)
  (equal-circle c d)         (c . b)    (a . c)    (a . c)    (c . b)
    BASH d=c.b               (c . b)    (a . c)    (a . c)    (a . c)
    (equal-circle a c)       (c . b)    (a . c)    (a . c)    (a . c)  **
      BASH c=a.c             (c . b)    (a . c)    (c . b)    (a . c)
      (equal-circle c a) => t
      (equal-circle b c)     (c . b)    (a . c)    (c . b)    (a . c)
        BASH c=c.b           (c . b)    (a . c)    (a . c)    (a . c)
        (equal-circle a c)   (c . b)    (a . c)    (a . c)    (a . c)  **

The two lines marked ** are identical, but one is a subproblem of the
other.  So the result is infinite recursion.

Here's the example in "standard" lisp notation:
   (equal-circle '#1=(#3=(#1# . #3#) . #2=(#1# . #3#))   '#4=(#4# . #4#))
Try it and watch your stack overflow.

The moral:  rplaca and rplacd are dangerous toys, even when they go by
other names.  


-- Frank Yellin
   ··@lucid.com

From: Don Cohen
Subject: Re: EQUAL on circular lists, yet again.
Date: Wed, 19 Jun 1991 20:09:49 +0000
Message-ID: <18310@venera.isi.edu>

It seems to me that the solution has to postulate a set of
equivalences.  Whenever you come across a non-list you can
potentially determine that the structures are not equivalent.
Otherwise you simply have to postulate that they are, and
check their cars and cdrs.  In order to avoid infinite
recursion, of course, any time that you come across a pair
that was already postulated equivalent, you don't recur.
The transitivity of equivalence can be used to reduce the
amount of checking.  The <angle brackets> below indicate
code to be filled in.

(defun equal-circle (x y &aux <equivalence-data>)
  (labels
   ((presume-equivalent (x y)
	<merge the equivalence classes of x and y>)
    (presumed-equivalent (x y)
	<determine whether the equivalence classes of x and y are the same>)
    (test-equivalent x y)
      (if (or (eq x y) (presumed-equivalent x y)) t
	(if (not (and (consp x) (consp y))) (return-from equal-circle nil)
	  (progn (presume-equivalent x y)
		 (test-equivalent (car x) (car y))
		 (test-equivalent (cdr x) (cdr y))))))
   (test-equivalent x y)))

There are pretty efficient ways of finding and merging equivalence classes.