Implementation limits when accessing circular lists

From: Adam Warner
Subject: Implementation limits when accessing circular lists
Date: Sun, 29 Jun 2003 04:32:26 +0000
Message-ID: <pan.2003.06.29.04.32.24.369038@consulting.net.nz>

Hi all,

Today is a great day. I grokked internal list structures, conses and
dotted lists as I came to understand why association lists were the
preferred form for lists of name/value pairs. Until this BFO (Blinding
Flash of the Obvious) the list (("name1" . "value1") ("name2" . "value2"))
felt more verbose than (("name1" "value1") ("name2" "value2")). Of course
the second list is actually a larger structure, as can be seen by
explicitly writing out the conses:

    (("name1" . ("value1" . nil)) ("name2" . ("value2" . nil)))

After these BFOs circular lists just fell into place:

(setf *print-circle* t)
(defparameter c (list "adam"))
(setf (cdr c) c)

So now I can take however many cdrs of c as I like and never reach the end
of the list. A few seconds later I test how far implementations can cope
with this. First CLISP:

(nth 100000000 c)

*** - nth: 100000000 is not a nonnegative fixnum and therefore not a valid index

Then CMUCL:

* (nth 1000000000 c)


Type-error in kernel::object-not-type-error-handler:
   1000000000 is not of type (mod 536870911)

Restarts:
  0: [abort] Return to Top-Level.

Debug  (type H for help)

(nth 2 1000000000 #1=("adam" . #1#))[:external]
Source: 
; File: target:code/list.lisp
(defun nth (n list)
  "Returns the nth object in a list where the car is the zero-th element."
  (car (nthcdr n list)))

And a similar result with SBCL. Do people concur that is not OK for these
implementations to choke? The HyperSpec states that nth can be supplied
with a "non-negative integer", not just a non-negative fixnum. Would
people approve of adding this as a conformance test?

While it is not presently efficient to access such high value cdrs of a
circular list, circular lists are a nice abstraction and I don't like to
see code break for no better reason than a fixnum becomes a bignum.

Regards,
Adam

Re: Implementation limits when accessing circular lists Nick Levine
Re: Implementation limits when accessing circular lists Paul F. Dietz
- Re: Implementation limits when accessing circular lists ·············@attbi.com
- Re: Implementation limits when accessing circular lists Adam Warner
  - Re: Implementation limits when accessing circular lists Paul F. Dietz
    - Re: Implementation limits when accessing circular lists Tim Daly, Jr.
      - Re: Implementation limits when accessing circular lists Paul F. Dietz
      - Re: Implementation limits when accessing circular lists Rob Warnock
        Re: Implementation limits when accessing circular lists Tim Daly, Jr.
  - Re: Implementation limits when accessing circular lists Rob Warnock
    - Re: Implementation limits when accessing circular lists Christophe Rhodes
      - Re: Implementation limits when accessing circular lists Rob Warnock
- Re: Implementation limits when accessing circular lists Adam Warner
  - Re: Implementation limits when accessing circular lists Kent M Pitman
Re: Implementation limits when accessing circular lists Adam Warner

From: Nick Levine
Subject: Re: Implementation limits when accessing circular lists
Date: Sun, 29 Jun 2003 10:18:20 +0000
Message-ID: <8732fc48.0306290218.7cadda02@posting.google.com>

It fails in LispWorks (Windows, version 4.2.7) too:


CL-USER 8 > (nth most-positive-fixnum x)
NIL

CL-USER 9 > (nth (1+ most-positive-fixnum) x)

Error: 8388608 is not a valid index for #1=(NIL . #1#).
  1 (abort) Return to level 0.
  2 Return to top loop level 0.

Type :b for backtrace, :c <option number> to proceed,  or :? for other options

CL-USER 10 : 1 > 


- nick

From: Paul F. Dietz
Subject: Re: Implementation limits when accessing circular lists
Date: Sun, 29 Jun 2003 13:11:32 +0000
Message-ID: <_rmcnUZELOYRfmOjXTWJhg@dls.net>

Adam Warner wrote:

> And a similar result with SBCL. Do people concur that is not OK for these
> implementations to choke? The HyperSpec states that nth can be supplied
> with a "non-negative integer", not just a non-negative fixnum. Would
> people approve of adding this as a conformance test?

Yes, that's a bug.  Standardized functions that take lists are not normally
required to handle circular lists, but NTH explicitly mentions that it should
be able to.

I'm not going to add a test for it to ansi-tests, though, since it would take
too long to run.

	Paul

From: ·············@attbi.com
Subject: Re: Implementation limits when accessing circular lists
Date: Sun, 29 Jun 2003 16:15:17 +0000
Message-ID: <el1c97xn.fsf@attbi.com>

"Paul F. Dietz" <·····@dls.net> writes:

> Yes, that's a bug.  Standardized functions that take lists are not normally
> required to handle circular lists, but NTH explicitly mentions that it should
> be able to.
>
> I'm not going to add a test for it to ansi-tests, though, since it would take
> too long to run.

You have your own opinion of what is `too long', but taking
most-positive-fixnum CDRs should be pretty damn quick.  Especially if
the list in question is all in the cache.

From: Adam Warner
Subject: Re: Implementation limits when accessing circular lists
Date: Sun, 29 Jun 2003 14:31:23 +0000
Message-ID: <pan.2003.06.29.14.31.17.220443@consulting.net.nz>

Hi Paul F. Dietz,

> Adam Warner wrote:
> 
>> And a similar result with SBCL. Do people concur that is not OK for these
>> implementations to choke? The HyperSpec states that nth can be supplied
>> with a "non-negative integer", not just a non-negative fixnum. Would
>> people approve of adding this as a conformance test?
> 
> Yes, that's a bug.  Standardized functions that take lists are not normally
> required to handle circular lists, but NTH explicitly mentions that it should
> be able to.
> 
> I'm not going to add a test for it to ansi-tests, though, since it would take
> too long to run.

Thanks for the reply Paul. At the moment it's an instantaneous failure on
CLISP, CMUCL and SBCL.

(nth (1+ most-positive-fixnum) '#1=(t . #1#))

*** - nth: 16777216 is not a nonnegative fixnum and therefore not a valid index

And even if it wasn't an instantaneous failure we're talking about
approximate a second when running interpreted on modest 32-bit hardware.
$ cat /proc/cpuinfo | grep 'MHz'
cpu MHz         : 530.046

CLISP:
(time (nth most-positive-fixnum '#1=(t . #1#)))

Real time: 0.394791 sec.
Run time: 0.39 sec.
Space: 0 Bytes

I wouldn't recommend testing this on 64-bit implementations though :-)

With CMUCL/SBCL the test also fails with a fixnum:

* (nth most-positive-fixnum '#1=(t . #1#))

Type-error in kernel::object-not-type-error-handler:
   536870911 is not of type (mod 536870911)

Restarts:
  0: [abort] Return to Top-Level.

Debug  (type H for help)

(nth 2 536870911 (t t t t t ...))[:external]
Source: (defun nth (n list)
          "Returns the nth object in a list where the car is the zero-th element."
          (car (nthcdr n list)))

I wonder if it's widespread that tests for a fixnum miss the most-positive-fixnum!

Regards,
Adam

From: Paul F. Dietz
Subject: Re: Implementation limits when accessing circular lists
Date: Sun, 29 Jun 2003 14:56:43 +0000
Message-ID: <swmdnaSydouuYWOjXTWJhQ@dls.net>

Adam Warner wrote:

> *** - nth: 16777216 is not a nonnegative fixnum and therefore not a valid index
> 
> And even if it wasn't an instantaneous failure we're talking about
> approximate a second when running interpreted on modest 32-bit hardware.
> $ cat /proc/cpuinfo | grep 'MHz'
> cpu MHz         : 530.046

It's not going to take only 1 second if you're searching for the 536 millionth
element, though, using the naive algorithm.

It *is* possible to implement this in time proportional to the number
of cons cells in the circular list, rather than in the integer n passed
to NTH (or NTHCDR)  (idea: detect circularity using the standard two pointer
trick and use MOD to reduce n to a manageable size.)  I've suggested that
to the cmucl and sbcl implementors.

 > I wonder if it's widespread that tests for a fixnum miss the most-positive-fixnum!

Good question; I have MOST-POSITIVE-FIXNUM (and integer near it) in the 'universe'
of values that ansi-tests passes to various functions.

	Paul

From: Tim Daly, Jr.
Subject: Re: Implementation limits when accessing circular lists
Date: Tue, 01 Jul 2003 02:19:36 +0000
Message-ID: <87n0fzdm4n.fsf@tenkan.org>

"Paul F. Dietz" <·····@dls.net> writes:

> It *is* possible to implement this in time proportional to the number
> of cons cells in the circular list, rather than in the integer n passed
> to NTH (or NTHCDR)  (idea: detect circularity using the standard two pointer
> trick and use MOD to reduce n to a manageable size.)  I've suggested that
> to the cmucl and sbcl implementors.
> 

Could someone please clue me in on the "standard two pointer trick"?

Thanks,
Tim

--

From: Paul F. Dietz
Subject: Re: Implementation limits when accessing circular lists
Date: Tue, 01 Jul 2003 02:38:23 +0000
Message-ID: <CQOdnYjCvoC9b52iXTWJjg@dls.net>

Tim Daly, Jr. wrote:

> Could someone please clue me in on the "standard two pointer trick"?

It's used to detect if a list is circular.  Two pointers are marched
down the list, one twice as fast as the other, and if they ever become
equal after time 0 the list is circular (and vice versa.)

(defun circular-list-p (x &aux (y x))
   (loop
     (when (or (atom y) (atom (cdr y))) (return nil))
     (setf y (cddr y)
           x (cdr x))
     (when (eq x y) (return t))))

	Paul

From: Rob Warnock
Subject: Re: Implementation limits when accessing circular lists
Date: Tue, 01 Jul 2003 02:43:32 +0000
Message-ID: <LsicnURPfrLJaJ2iXTWc-w@speakeasy.net>

Tim Daly, Jr. <···@tenkan.org> wrote:
+---------------
| "Paul F. Dietz" <·····@dls.net> writes:
| > (idea: detect circularity using the standard two pointer trick
| > and use MOD to reduce n to a manageable size.)
| 
| Could someone please clue me in on the "standard two pointer trick"?
+---------------

Do a web search for:

	tortoise hare circular

or perhaps better:

	tortoise hare "circular list"

Also see CLHS "Function LIST-LENGTH". As shown there, a common approach
is to have the hare move at twice the speed of the tortoise, but many
other winning strategies are possible [iff your termination tests are
correct!]. E.g., make the tortoise move at half the speed of the hare.
[It is useful to ask why/how this might be different than the previous,
and whether or not one might be "better" than the other in some sense.]


-Rob

-----
Rob Warnock, PP-ASEL-IA		<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607

From: Tim Daly, Jr.
Subject: Re: Implementation limits when accessing circular lists
Date: Tue, 01 Jul 2003 21:24:42 +0000
Message-ID: <87el1andnp.fsf@tenkan.org>

Thanks guys!

-Tim

--

From: Rob Warnock
Subject: Re: Implementation limits when accessing circular lists
Date: Mon, 30 Jun 2003 08:59:07 +0000
Message-ID: <S92cnZzM-rVGZmKjXTWc-w@speakeasy.net>

Adam Warner  <······@consulting.net.nz> wrote:
+---------------
| Paul F. Dietz:
| > I'm not going to add a test for it to ansi-tests, though, since it
| > would take too long to run.
| 
| Thanks for the reply Paul. At the moment it's an instantaneous failure on
| CLISP, CMUCL and SBCL. ...
...
| And even if it wasn't an instantaneous failure we're talking about
| approximate a second when running interpreted on modest 32-bit hardware.
| $ cat /proc/cpuinfo | grep 'MHz'
| cpu MHz         : 530.046
| 
| CLISP:
| (time (nth most-positive-fixnum '#1=(t . #1#)))
| 
| Real time: 0.394791 sec.
| Run time: 0.39 sec.
| Space: 0 Bytes
+---------------

On CMUCL-18e, using (1- most-positive-fixnum), it's *far* faster than that!!

	cmu> (time (nth (1- most-positive-fixnum) '#1=(t . #1#)))
	; Compiling LAMBDA NIL: 
	; Compiling Top-Level Form: 

	; Evaluation took:
	;   0.0 seconds of real time
	;   5.0e-6 seconds of user run time
	;   7.0e-6 seconds of system run time
	;   3,212 CPU cycles
	;   0 page faults and
	;   64 bytes consed.
	; 
	T
	cmu> 

Now, granted, I have a 1.4 GHz Athlon ("1600+"), but it's not *that*
much faster than Adam's machine. Hmmm... Let's try it on a lowly 133 MHz
Pentium II:

	cmu> (time (nth (1- most-positive-fixnum) '#1=(t . #1#)))
	; Compiling LAMBDA NIL: 
	; Compiling Top-Level Form: 

	; Evaluation took:
	;   0.0 seconds of real time
	;   6.6e-5 seconds of user run time
	;   10.0e-7 seconds of system run time
	;   922 CPU cycles
	;   0 page faults and
	;   64 bytes consed.
	; 
	T
	cmu> 

Now that's *really* weird! Is CMUCL's *compiler* somehow detecting that
the circularity implies a constant result?

Answer: No! Looks like some artifact with the TIME macro instead.
When you run it this way you get *much* more reasonable results:

	cmu> (defun foo (x)
	       (declare (fixnum x) (optimize (speed 3)))
	       (nth x '#1=(t . #1#)))

	FOO
	cmu> (compile 'foo)
	; Compiling LAMBDA (X): 
	; Compiling Top-Level Form: 

	FOO
	NIL
	NIL
	cmu> (time (foo (1- most-positive-fixnum)))
	; Compiling LAMBDA NIL: 
	; Compiling Top-Level Form: 

	; Evaluation took:
	;   1.59 seconds of real time
	;   1.572572 seconds of user run time
	;   0.013583 seconds of system run time
	;   2,234,478,100 CPU cycles
	;   0 page faults and
	;   64 bytes consed.
	; 
	T
	cmu> 

on the 1.4 GHz Athlon, and:

	cmu> (time (foo (1- most-positive-fixnum)))
	; Compiling LAMBDA NIL: 
	; Compiling Top-Level Form: 

	; Evaluation took:
	;   25.19 seconds of real time
	;   24.542253 seconds of user run time
	;   0.047189 seconds of system run time
	;   3,317,991,740 CPU cycles
	;   0 page faults and
	;   64 bytes consed.
	; 
	T
	cmu> 

on the 133 MHz Pentium.


-Rob

-----
Rob Warnock, PP-ASEL-IA		<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607

From: Christophe Rhodes
Subject: Re: Implementation limits when accessing circular lists
Date: Mon, 30 Jun 2003 09:12:25 +0000
Message-ID: <sqd6gwszd2.fsf@lambda.jcn.srcf.net>

····@rpw3.org (Rob Warnock) writes:

> 	cmu> (time (nth (1- most-positive-fixnum) '#1=(t . #1#)))
> 	;   0.0 seconds of real time
> Now that's *really* weird! Is CMUCL's *compiler* somehow detecting that
> the circularity implies a constant result?

No, the circularity has nothing to do with it.  It's detecting that
both (1- most-positive-fixnum) and (quote <anything>) are constants,
and CL:NTH is a known function with defined semantics, so it's
constant-folding the call at compile-time.

> Answer: No! Looks like some artifact with the TIME macro instead.
> When you run it this way you get *much* more reasonable results:
>
> 	cmu> (defun foo (x)
> 	       (declare (fixnum x) (optimize (speed 3)))
> 	       (nth x '#1=(t . #1#)))

X here isn't a constant, so the compiler can't constant-fold the call
to NTH.

Christophe
-- 
http://www-jcsu.jesus.cam.ac.uk/~csr21/       +44 1223 510 299/+44 7729 383 757
(set-pprint-dispatch 'number (lambda (s o) (declare (special b)) (format s b)))
(defvar b "~&Just another Lisp hacker~%")    (pprint #36rJesusCollegeCambridge)

From: Rob Warnock
Subject: Re: Implementation limits when accessing circular lists
Date: Tue, 01 Jul 2003 02:25:49 +0000
Message-ID: <J8mcnYgwmo6wbJ2iXTWc-w@speakeasy.net>

Christophe Rhodes  <·····@cam.ac.uk> wrote:
+---------------
| ····@rpw3.org (Rob Warnock) writes:
| >     cmu> (time (nth (1- most-positive-fixnum) '#1=(t . #1#)))
| >     ;   0.0 seconds of real time
| > Now that's *really* weird! Is CMUCL's *compiler* somehow detecting
| > that the circularity implies a constant result?
| 
| No, the circularity has nothing to do with it.  It's detecting that
| both (1- most-positive-fixnum) and (quote <anything>) are constants,
| and CL:NTH is a known function with defined semantics, so it's
| constant-folding the call at compile-time.
+---------------

*DOH!!*

Thanks, that also explains the long silence (much longer on the
slower machine!) that I saw between the "Compiling LAMBDA NIL:"
and "Compiling Top-Level Form:" messages...  ;-}  ;-}


-Rob

-----
Rob Warnock, PP-ASEL-IA         <····@rpw3.org>
627 26th Avenue                 <URL:http://rpw3.org/>
San Mateo, CA 94403             (650)572-2607

From: Adam Warner
Subject: Re: Implementation limits when accessing circular lists
Date: Sun, 29 Jun 2003 14:35:15 +0000
Message-ID: <pan.2003.06.29.14.34.38.592177@consulting.net.nz>

Hi Paul F. Dietz,

> Adam Warner wrote:
> 
>> And a similar result with SBCL. Do people concur that is not OK for these
>> implementations to choke? The HyperSpec states that nth can be supplied
>> with a "non-negative integer", not just a non-negative fixnum. Would
>> people approve of adding this as a conformance test?
> 
> Yes, that's a bug.  Standardized functions that take lists are not normally
> required to handle circular lists, but NTH explicitly mentions that it should
> be able to.
> 
> I'm not going to add a test for it to ansi-tests, though, since it would take
> too long to run.

Thanks for the reply Paul. At the moment it's an instantaneous failure on
CLISP, CMUCL and SBCL.

(nth (1+ most-positive-fixnum) '#1=(t . #1#))

*** - nth: 16777216 is not a nonnegative fixnum and therefore not a valid index

And even if it wasn't an instantaneous failure we're talking about
approximately a second when running interpreted on modest 32-bit hardware.
$ cat /proc/cpuinfo | grep 'MHz'
cpu MHz         : 530.046

CLISP:
(time (nth most-positive-fixnum '#1=(t . #1#)))

Real time: 0.394791 sec.
Run time: 0.39 sec.
Space: 0 Bytes

I wouldn't recommend testing this on 64-bit implementations though :-)

With CMUCL/SBCL the test also fails with a fixnum:

* (nth most-positive-fixnum '#1=(t . #1#))

Type-error in kernel::object-not-type-error-handler:
   536870911 is not of type (mod 536870911)

Restarts:
  0: [abort] Return to Top-Level.

Debug  (type H for help)

(nth 2 536870911 (t t t t t ...))[:external]
Source: (defun nth (n list)
          "Returns the nth object in a list where the car is the zero-th element."
          (car (nthcdr n list)))

I wonder if it's widespread that tests for a fixnum miss the most-positive-fixnum!

Regards,
Adam

From: Kent M Pitman
Subject: Re: Implementation limits when accessing circular lists
Date: Mon, 30 Jun 2003 06:41:17 +0000
Message-ID: <sfw4r28gj8y.fsf@shell01.TheWorld.com>

Adam Warner <······@consulting.net.nz> writes:

> Thanks for the reply Paul. At the moment it's an instantaneous failure on
> CLISP, CMUCL and SBCL.
> 
> (nth (1+ most-positive-fixnum) '#1=(t . #1#))
> 
> *** - nth: 16777216 is not a nonnegative fixnum and therefore not a valid index

Perhaps it should instead quickly determine the cycle length using the cute
algorithm integer-length uses and and then use modular arithmetic to figure
out what integer element to access...

> I wouldn't recommend testing this on 64-bit implementations though :-)

In practice, the likelihood of an actual list of distinct elements that 
long is quite small, though yes, the theoretical problem couuld be large.

From: Adam Warner
Subject: Re: Implementation limits when accessing circular lists
Date: Mon, 30 Jun 2003 08:21:16 +0000
Message-ID: <pan.2003.06.30.08.21.14.852677@consulting.net.nz>

Impressively, Alexey Dejneka has reported to me that this is fixed in the
CVS version of SBCL (0.8.1.11). There has also been a short efficiency
discussion on the development mailing list:
<http://news.gmane.org/thread.php?group=gmane.lisp.steel-bank.devel>

Regards,
Adam