Hello readers.
(This is a follow up on a previous tasting question. A subdiscussion
on that thread focused on Ken Tilton's method of growing a list by its
tail. I figured that may be a topic for the common lisp cookbook.).
I would appreciate comments regarding the (in)correctness of the
recipe. I suspect there is more than one way of accomplishing the
task, but I do not plan to submit an exhaustive survey of the methods
at this time. So, if you could (please) limit comments to those of
type:
- this explantion is wrong because ... , go back and study your lisp
- be aware of the possible side-effects ...
- your explanation is not clear because ...
- and similar.
In other words, unless something is terribly wrong with the proposed
method, I will submit it to the cookbook maintainer, after having
considered your comments.
Thank you. And now here is the recipe: (Keep in mind that it is in
emacs muse format -- in particular *foo* will come out italicized foo
once the file is converted to html)
* The Common Lisp Cookbook - Grow list by its tail by using a pointer
** Contents
- Background [[#background]]
- Example [[#example]]
#background
** Background
Sometimes we need to grow a list by appending elements to its tail.
Usually this is done by pushing elements to a front of a list and then
reversing the list.
In some cases, this is not a feasible solution, as for example when
the first element of the list is already linked to some other list.
The following figure shows a list that has some elements (shown with
"+" signs) that are car's of their own list. How shall we grow these
lists?
<example>
-+---+----+--------
\ \ \
\ \ \
\ \
\
</example>
In list notation, this structure is of the form
<example>
(list 'a '('b1 'b2 'b3 'b4 'b5) 'c 'd 'e '('f1 'f2 'f3) 'g ...)
</example>
How do we grow lists b, f, ...? This is discussed next
#example
** Example
We build the sublists by using a pointer *tail* to the list's last
element. The list is grown by adding an element to *tail*, and then
re-pointing *tail* to this element.
As an example, consider growing the list *our-list*
<example>
(setf our-list (list 'b1))
</example>
We will first initialize *tail*
<example>
> (setf tail our-list)
(b1)
</example>
*tail* and *our-list* now point to a cons cell with contents (b1 .
nil)
<example>
our-list ---+
|
v
tail --- > ( b1 . nil )
</example>
To augment *our-list*, we use **rplacd** which replaces the cdr of a
cons cell
<example>
> (rplacd tail (list 'b2))
(b1 b2)
</example>
at which point we have the following situation
<example>
our-list -+
|
v
tail ---> ( b1 . ( b2 . nil ) )
</example>
Moving *tail* to the last element
<example>
> (setf tail (cdr tail))
(b2)
</example>
we have the following picture:
<example>
our-list -> ( b1 . ( b2 . nil ) )
^
|
tail ---+
</example>
We are now ready to append the next element since *tail* points to the
last element again.
Summarizing, we appended an element to the list by modifying *tail*,
and
then linked *tail* to that element.
The whole procedure can be implemented in one nested statement
<example>
> (setf tail (cdr (rplacd tail (list 'b3))))
(b3)
> our-list
(b1 b2 b3)
</example>
End of recipe.
Thanks,
Mirko
From: Ken Tilton
Subject: Re: grow list by tail, pointer example recipe -- please comment
Date:
Message-ID: <480578ee$0$15166$607ed4bc@cv.net>
·············@gmail.com wrote:
> Hello readers.
>
> (This is a follow up on a previous tasting question. A subdiscussion
> on that thread focused on Ken Tilton's method of growing a list by its
> tail. I figured that may be a topic for the common lisp cookbook.).
>
> I would appreciate comments regarding the (in)correctness of the
> recipe. I suspect there is more than one way of accomplishing the
> task, but I do not plan to submit an exhaustive survey of the methods
> at this time. So, if you could (please) limit comments to those of
> type:
>
> - this explantion is wrong because ... , go back and study your lisp
> - be aware of the possible side-effects ...
> - your explanation is not clear because ...
> - and similar.
> In other words, unless something is terribly wrong with the proposed
> method, I will submit it to the cookbook maintainer, after having
> considered your comments.
>
> Thank you. And now here is the recipe: (Keep in mind that it is in
> emacs muse format -- in particular *foo* will come out italicized foo
> once the file is converted to html)
>
> * The Common Lisp Cookbook - Grow list by its tail by using a pointer
>
>
> ** Contents
>
> - Background [[#background]]
> - Example [[#example]]
>
> #background
> ** Background
>
> Sometimes we need to grow a list by appending elements to its tail.
> Usually this is done by pushing elements to a front of a list and then
> reversing the list.
>
>
> In some cases, this is not a feasible solution, as for example when
> the first element of the list is already linked to some other list.
> The following figure shows a list that has some elements (shown with
> "+" signs) that are car's of their own list. How shall we grow these
> lists?
>
>
> <example>
> -+---+----+--------
> \ \ \
> \ \ \
> \ \
> \
> </example>
>
> In list notation, this structure is of the form
> <example>
> (list 'a '('b1 'b2 'b3 'b4 'b5) 'c 'd 'e '('f1 'f2 'f3) 'g ...)
> </example>
> How do we grow lists b, f, ...?
(let ((x (list 1 2 3 (list 'b 'c))))
(let ((y (nthcdr 3 x)))
(push 'a (car y)))
x)
-> (1 2 3 (a b c))
hth, kt
--
http://smuglispweeny.blogspot.com/
http://www.theoryyalgebra.com/
"I've never read the rulebook. My job is to catch the ball."
-- Catcher Josh Bard after making a great catch on a foul ball
and then sliding into the dugout, which by the rules allowed the
runners to advance one base costing his pitcher a possible shutout
because there was a runner on third base.
"My sig is longer than most of my articles."
-- Kenny Tilton
On Apr 16, 2:54 am, ·············@gmail.com wrote:
> Hello readers.
>
> (This is a follow up on a previous tasting question. A subdiscussion
> on that thread focused on Ken Tilton's method of growing a list by its
> tail. I figured that may be a topic for the common lisp cookbook.).
>
> I would appreciate comments regarding the (in)correctness of the
> recipe. I suspect there is more than one way of accomplishing the
> task, but I do not plan to submit an exhaustive survey of the methods
> at this time. So, if you could (please) limit comments to those of
> type:
>
> - this explantion is wrong because ... , go back and study your lisp
> - be aware of the possible side-effects ...
> - your explanation is not clear because ...
> - and similar.
> In other words, unless something is terribly wrong with the proposed
> method, I will submit it to the cookbook maintainer, after having
> considered your comments.
>
You might want to look at my collecting macro (http://www.tfeb.org/
lisp/hax.html#COLLECTING)
On Apr 16, 6:05 am, Tim Bradshaw <··········@tfeb.org> wrote:
> On Apr 16, 2:54 am, ·············@gmail.com wrote:
>
>
>
> > Hello readers.
>
> > (This is a follow up on a previous tasting question. A subdiscussion
> > on that thread focused on Ken Tilton's method of growing a list by its
> > tail. I figured that may be a topic for the common lisp cookbook.).
>
> > I would appreciate comments regarding the (in)correctness of the
> > recipe. I suspect there is more than one way of accomplishing the
> > task, but I do not plan to submit an exhaustive survey of the methods
> > at this time. So, if you could (please) limit comments to those of
> > type:
>
> > - this explantion is wrong because ... , go back and study your lisp
> > - be aware of the possible side-effects ...
> > - your explanation is not clear because ...
> > - and similar.
> > In other words, unless something is terribly wrong with the proposed
> > method, I will submit it to the cookbook maintainer, after having
> > considered your comments.
>
> You might want to look at my collecting macro (http://www.tfeb.org/
> lisp/hax.html#COLLECTING)
Thanks Tim. I will add it in a section : further reading
Mirko
·············@gmail.com writes:
> Hello readers.
>
> (This is a follow up on a previous tasting question. A subdiscussion
> on that thread focused on Ken Tilton's method of growing a list by its
> tail. I figured that may be a topic for the common lisp cookbook.).
>
> I would appreciate comments regarding the (in)correctness of the
> recipe. I suspect there is more than one way of accomplishing the
> task, but I do not plan to submit an exhaustive survey of the methods
> at this time. So, if you could (please) limit comments to those of
> type:
>
> - this explantion is wrong because ... , go back and study your lisp
> - be aware of the possible side-effects ...
> - your explanation is not clear because ...
> - and similar.
> In other words, unless something is terribly wrong with the proposed
> method, I will submit it to the cookbook maintainer, after having
> considered your comments.
>
> Thank you. And now here is the recipe: (Keep in mind that it is in
> emacs muse format -- in particular *foo* will come out italicized foo
> once the file is converted to html)
>
> * The Common Lisp Cookbook - Grow list by its tail by using a pointer
>
>
> ** Contents
>
> - Background [[#background]]
> - Example [[#example]]
>
> #background
> ** Background
>
> Sometimes we need to grow a list by appending elements to its tail.
> Usually this is done by pushing elements to a front of a list and then
> reversing the list.
- your explanation is not clear because "Usually this is
done by ..." prefixs an alternative, something you can
usually do instead.
I suggest giving an example of the standard idiom.
>
> In some cases, this is not a feasible solution, as for example when
> the first element of the list is already linked to some other list.
- your explanation is not clear because it requires the
reader to realise that you don't mean first element as in
(first list). You are talking about the first cons.
Even then one can push tail elements onto a stack and
finish with (append existing-list (nreverse
extra-elements)).
> The following figure shows a list that has some elements (shown with
> "+" signs) that are car's of their own list. How shall we grow these
> lists?
>
>
> <example>
> -+---+----+--------
> \ \ \
> \ \ \
> \ \
> \
> </example>
>
> In list notation, this structure is of the form
> <example>
> (list 'a '('b1 'b2 'b3 'b4 'b5) 'c 'd 'e '('f1 'f2 'f3) 'g ...)
> </example>
> How do we grow lists b, f, ...? This is discussed next
>
> #example
> ** Example
>
> We build the sublists by using a pointer *tail* to the list's last
> element. The list is grown by adding an element to *tail*, and then
> re-pointing *tail* to this element.
>
This is confusing. If we are growing it from scratch we
don't paint ourselves into this corner, while if it is
passed to us, how do we even get to the tail. Well, with
LAST of course, but if you don't mention LAST at this point
you throw off your readers.
I'm not seeing how this goes beyond
CL-USER> (defvar *list* (list 'a 'b 'c)) => *LIST*
CL-USER> (setf (cdr (last *list*)) (list 'd 'e)) => (D E)
CL-USER> *list* => (A B C D E)
Alan Crowe
Edinburgh
Scotland
On Apr 16, 11:54 am, Alan Crowe <····@cawtech.freeserve.co.uk> wrote:
> ·············@gmail.com writes:
> > Hello readers.
>
> > (This is a follow up on a previous tasting question. A subdiscussion
> > on that thread focused on Ken Tilton's method of growing a list by its
> > tail. I figured that may be a topic for the common lisp cookbook.).
>
> > I would appreciate comments regarding the (in)correctness of the
> > recipe. I suspect there is more than one way of accomplishing the
> > task, but I do not plan to submit an exhaustive survey of the methods
> > at this time. So, if you could (please) limit comments to those of
> > type:
>
> > - this explantion is wrong because ... , go back and study your lisp
> > - be aware of the possible side-effects ...
> > - your explanation is not clear because ...
> > - and similar.
> > In other words, unless something is terribly wrong with the proposed
> > method, I will submit it to the cookbook maintainer, after having
> > considered your comments.
>
> > Thank you. And now here is the recipe: (Keep in mind that it is in
> > emacs muse format -- in particular *foo* will come out italicized foo
> > once the file is converted to html)
>
> > * The Common Lisp Cookbook - Grow list by its tail by using a pointer
>
> > ** Contents
>
> > - Background [[#background]]
> > - Example [[#example]]
>
> > #background
> > ** Background
>
> > Sometimes we need to grow a list by appending elements to its tail.
> > Usually this is done by pushing elements to a front of a list and then
> > reversing the list.
>
> - your explanation is not clear because "Usually this is
> done by ..." prefixs an alternative, something you can
> usually do instead.
>
> I suggest giving an example of the standard idiom.
>
>
>
> > In some cases, this is not a feasible solution, as for example when
> > the first element of the list is already linked to some other list.
>
> - your explanation is not clear because it requires the
> reader to realise that you don't mean first element as in
> (first list). You are talking about the first cons.
>
> Even then one can push tail elements onto a stack and
> finish with (append existing-list (nreverse
> extra-elements)).
>
>
>
> > The following figure shows a list that has some elements (shown with
> > "+" signs) that are car's of their own list. How shall we grow these
> > lists?
>
> > <example>
> > -+---+----+--------
> > \ \ \
> > \ \ \
> > \ \
> > \
> > </example>
>
> > In list notation, this structure is of the form
> > <example>
> > (list 'a '('b1 'b2 'b3 'b4 'b5) 'c 'd 'e '('f1 'f2 'f3) 'g ...)
> > </example>
> > How do we grow lists b, f, ...? This is discussed next
>
> > #example
> > ** Example
>
> > We build the sublists by using a pointer *tail* to the list's last
> > element. The list is grown by adding an element to *tail*, and then
> > re-pointing *tail* to this element.
>
> This is confusing. If we are growing it from scratch we
> don't paint ourselves into this corner, while if it is
> passed to us, how do we even get to the tail. Well, with
> LAST of course, but if you don't mention LAST at this point
> you throw off your readers.
Point taken
>
> I'm not seeing how this goes beyond
>
> CL-USER> (defvar *list* (list 'a 'b 'c)) => *LIST*
>
> CL-USER> (setf (cdr (last *list*)) (list 'd 'e)) => (D E)
>
> CL-USER> *list* => (A B C D E)
>
> Alan Crowe
> Edinburgh
> Scotland
Indeed, one could use (last *list).
I think the point of the algorithm (as it was introduced to me by Ken
Tilton - I am not trying to throw blame on him, but how my
understanding of the issue came up), and which I did not point out,
was efficiency.
Thanks,
Mirko
On Apr 16, 11:54 am, Alan Crowe <····@cawtech.freeserve.co.uk> wrote:
> ·············@gmail.com writes:
stuff deleted ...
> - your explanation is not clear because it requires the
> reader to realise that you don't mean first element as in
> (first list). You are talking about the first cons.
>
this comments falls under "go back and study your lisp" category. I
still fail that test.
Thanks.
Mirko
P� Wed, 16 Apr 2008 17:54:49 +0200, skrev Alan Crowe
<····@cawtech.freeserve.co.uk>:
>
> This is confusing. If we are growing it from scratch we
> don't paint ourselves into this corner, while if it is
> passed to us, how do we even get to the tail. Well, with
> LAST of course, but if you don't mention LAST at this point
> you throw off your readers.
>
> I'm not seeing how this goes beyond
>
> CL-USER> (defvar *list* (list 'a 'b 'c)) => *LIST*
>
> CL-USER> (setf (cdr (last *list*)) (list 'd 'e)) => (D E)
>
> CL-USER> *list* => (A B C D E)
>
> Alan Crowe
> Edinburgh
> Scotland
what's wrong with nconc?
(setf *list* (nconc *list* (list 'D 'E)))
--------------
John Thingstad
From: Robert Maas, http://tinyurl.com/uh3t
Subject: Re: grow list by tail, pointer example recipe -- please comment
Date:
Message-ID: <rem-2008apr17-001@yahoo.com>
Your language indicates that you have a gross mis-understanding of
linked lists. Whether you really do have such a gross
mis-understanding, or you understand fine but can't write legible
English to express your understanding, is beyond my ability to read
your mind. (Later in your article you seem to start to understand,
then you screw up again. I can't tell whether your mind is glazing
over, not understanding then understanding then not understanding,
or whether it's just your sloppy English.)
> From: ·············@gmail.com
> Sometimes we need to grow a list by appending elements to its tail.
No, we do *not* append elements, we append a list of elements.
If we are adding elements one at a time, then each time we append a
*list* of one element.
A linked list, what we call a "list" in Lisp, consists of a set of
cells, where each cell has two pointers, the CAR which points to an
element, and the CDR which points to the next cell in the linked
list.
> Usually this is done by pushing elements to a front of a list and
> then reversing the list.
Note carefully: Pushing an element to the front of a list does
*not* put that element ahead of the list. It puts a *cell* ahead of
the list, where the CAR of that cell points to the element, and the
CDR points to the first cell of the old list.
> In some cases, this is not a feasible solution, as for example
> when the first element of the list is already linked to some
> other list.
This is where you go grossly astray. If an *element* is the CAR of
a *cell* in one list, nothing stops that same *element* from being
the CAR of *another* cell that is in *another* list. The method of
attaching new cells to the front of that second list doesn't in any
way conflict with *other* cells which have those *same* elements in
their CARs within the first list.
(setq list1 (list 1 3 5 7 9))
(setq list2 (list))
(push (car list1) list2) ;list2 = (1)
(push 2 list2) ;list2 = (2 1)
(push (cadr list1) list2) ;list2 = (3 2 1)
(push 4 list2) ;list2 = (4 3 2 1)
(push (caddr list1) list2) ;list2 = (5 4 3 2 1)
(setq result (reverse list2));result = (1 2 3 4 5)
Through all that work, list1 remains: (1 3 5 7 9)
None of the cells in list1 are cells in list2, nor vice versa,
but three *elements* are shared between list1 and either list2 or result.
*All* five elements, but no cells, are shared between list2 and result.
> We build the sublists by using a pointer *tail* to the list's last
> element.
No, that's totally worthless. You need to maintain a pointer to the
last *cell* of the list you want to grow, the cell whose CAR points
to the last element. Having just a pointer to the last element does
no good because there's no back-link from the element to the cell.
(And think of it: If there actually were a back pointer, then a single
element could never be simultaneously in two lists, because the
back pointer could point back to just one cell in one of the lists,
not to the other, so the back-pointer rule would be violated for
the *other* list the element was in.)
> The list is grown by adding an element to *tail*, and then
> re-pointing *tail* to this element.
No no a thousand times no!
A list is grown by attaching a **cell** to *tail*, such that the
CAR of that **cell** is the desired new element, then re-pointing
*tail* to that new **cell**.
*tail*------------------------------\
|
V
+---+---+ +---+---+ +---+---+ +---+---+
| V | >---->| V | >---->| V | >---->| V |NIL|
+-|-+---+ +-|-+---+ +-|-+---+ +-|-+---+
V V V V
"Arnold" "Billy" "Charlie" "David"
(setq newelem "Edgar")
(setq newcell (cons newelem NIL))
*tail*----------------------------\ newcell
| |
V V
+---+---+ +---+---+ +---+---+ +---+---+ +---+---+
| V | >---->| V | >---->| V | >---->| V |NIL| | V |NIL|
+-|-+---+ +-|-+---+ +-|-+---+ +-|-+---+ +-|-+---+
V V V V V
"Arnold" "Billy" "Charlie" "David" "Edgar"
(setf (cdr *tail*) newcell) i.e. (rplacd *tail* newcell)
*tail*----------------------------\ newcell
| |
V V
+---+---+ +---+---+ +---+---+ +---+---+ +---+---+
| V | >---->| V | >---->| V | >---->| V | >---->| V |NIL|
+-|-+---+ +-|-+---+ +-|-+---+ +-|-+---+ +-|-+---+
V V V V V
"Arnold" "Billy" "Charlie" "David" "Edgar"
(setq *tail* newcell)
newcell
*tail*----------------------------------------\|
|/
V
+---+---+ +---+---+ +---+---+ +---+---+ +---+---+
| V | >---->| V | >---->| V | >---->| V | >---->| V |NIL|
+-|-+---+ +-|-+---+ +-|-+---+ +-|-+---+ +-|-+---+
V V V V V
"Alpha" "Baker" "Charlie" "David" "Edgar"
> (setf our-list (list 'b1))
> (setf tail our-list)
> *tail* and *our-list* now point to a cons cell with contents (b1 . nil)
Yes, now you have it correct. You really need to correct your earlier wording.
(snipped example of appending one new CONS cell whose CAR points to
one new element)
> We are now ready to append the next element since *tail* points
> to the last element again.
No, you do *not* append an *element*, you append a *list*, in this
case a brand-new CONS cell whose CAR is the new element and whose
CDR is NIL. You're back to using wrong language.
(Also, NCONC rather than APPEND is the correct jargon here, but I'm
omitting that distinction for this discussion. NCONC is basicaly a
destructive in-place version of APPEND. I suppose a more
appropriate name would be NAPPEND, analagous to NREVERSE.)
Try doing what you *say*
(setq tail (setq list (list 'a)))
(nconc tail 'b) i.e. (rplacd tail 'b) ;Doesn't do what you want!
(setq tail (cdr tail))
(nconc tail 'c) i.e. (rplacd tail 'c) ;Signals error, or crashes
(setq tail (cdr tail)) ;You never get to here
and watch the disaster that befalls you.
> Summarizing, we appended an element to the list by modifying *tail*,
> and then linked *tail* to that element.
No, no, again no!! You do *not* append an element. You append
(actually NCONC or RPLACD) a CONS cell, a *list* of one element.
You really do need to get your English correct before you offer
your recipe to a cookbook.
By the way, your recipe can *not* be implemented as a function,
because it requires *two* separate toplevel pointers.
In C++, you could perhaps pass two reference variables to a function,
but Lisp doesn't have reference variables in that sense.
In Lisp you'd have to implement your recipe as a macro, which
expands into code that references *two* different higher-scope
places, one to hold the front of the list, and one to hold the
last cell of the list, if you want to handle the very first step
in building a list:
(setq list (list)) ;Empty list
(setq tail list) ;Tail does *not* point to a CONS cell, because the
; list doesn't yet have any CONS cells.
(macrosplice list tail newelement)
expands into:
(let ((newcell (cons newelement nil)))
(if (null list) (setq tail (setq list newcell))
(progn (rplacd tail newcell) (setq tail newcell))))
Now if you want to *manually* build the single-element list, and
only later call your recipe to tack on additional elements, then
you don't need to do anything except pass tail to the function and
assign tail to the return value of the function.
(defun tackon (tail newelement)
(let ((newcell (cons newelement nil)))
(rplacd tail newcell)
newcell)))
;Usage: (setq tail (tackon tail newelement))
So users need to be careful to use tackon the same way that they
might use SORT or NREVERSE etc., where they must assign the return
value back to a variable.
Now if you just want an inline recipe, i.e. *two* lines of code to
accomplish *one* step, do this the first time:
(setq tail (setq list (cons firstelement nil)))
and then do this each subsequent time:
(rplacd tail (cons newelement nil))
(setq tail (cdr tail))
But note how you must carefully distinguish putting the first
element into a list from putting each subsequent element into the
same list.
Now back to square one!!!
There's a *major* design issue you haven't even begun to address in
your original specification of the problem to be solved:
[1] Do your use cases work like this:
- You start to build a list.
- You add elements one at a time.
- You are done building that list.
- *now* that list is available for use by other observers.
[2] Or do your use cases work like this:
- You start to build a list.
- The empty list is *already visible* to other observers.
- You add elements one at a time.
- After each element is added, the partial list is *immediately*
visible to other observers.
- You are done building that list.
- The final version of the list which was visible just before you
were done is *still* visible now, and is EQ.
[3] Or do your use cases work like this:
- You start to build a list. It can't yet be seen by other observers.
- You add the first element to the list.
- You publish the location of that list. It can now be seen by other observers.
- You add additional elements to the list, and each time the
resultant longer list is immediately seen by other observers.
- You are done building that list.
- The final version of the list which was visible just before you
were done is *still* visible now, and is EQ.
In case [1], you might consider TCONC instead. An "open" list being
built, not yet finished with the building, is represented by a
*single* CONS cell, whose CAR points to the first cell of the list
and whose CDR points to the last cell of the list, except for an
empty list both CAR and CDR are NIL.
So you build a list like this:
(setq tc (make-empty-tconc))
(tconc-grow tc firstelement)
(tconc-grow tc nextelement)
(tconc-grow tc nextelement)
(tconc-grow tc nextelement)
(setq $list$ (close-tconc tc)) ;published variable $list$ finally has a value
BBN/UCI Lisp had TCONC available, and I posted a modernized
implementation of TCONC a few days ago.
The logic of opening a list for building, building as many elements
as you want into it, then closing the building-process and
returning the finished list, is analagous to the way file I/O is
done (you open a file, write to it, close it, and only then can you
read any of it) or the way string-output-streams are done (you call
make-string-output-stream, you write to it as many times as you
want, then you call get-output-stream-string to see the final
result as a string).
In case [3], the code you described badly, which I cleared up,
would suffice.
In case [2], there's no way in general to accomplish those use
cases in Common Lisp. It might have been possible on the Lisp
Machine by using forward references (like locatives?) to allow a
reference to NIL to be changed out from under all observers to
suddenly become a CONS cell instead. I never used a Lisp machine,
so I'm not sure it really was possible.
Note that you compared your recipe to the recpie where you push
elements onto a list then nreverse it at the very end before
returning. Note that no portion of the final list is accessible by
anyone whatsoever until *after* that final nreverse is performed.
In that sense, TCONC satisfies all use cases that the old recipe
could satisfy, namely [1].
So which is faster, push+nreverse, or makeEmptyTconc+tconcGrow+closeTconc ?
push+nreverse is fastest as it runs, but then eats a gulp of CPU
time at the very end when it must traverse the whole list to
reverse all the CDR pointers.
makeEmptyTconc+tconcGrow+closeTconc, or the inline equivalent, is
slower as it runs because it must test for first case every time
through the loop (or totally mess up the calling code), but doesn't
have to traverse the list at the end.
I suspect the amortized cost is about the same either way.
But if the list occupies lots of virtual address space on a paged
system, TCONC visits each cell just twice in close succession, so
it doesn't generate any excess page faults, whereas NREVERSE
revisits the whole list after it may no longer be in the cache and
may require swapping all the cells back in. So in a paged system,
TCONC is faster than PUSH+NREVERSE.
·················@SpamGourmet.Com (Robert Maas, http://tinyurl.com/uh3t) writes:
> So which is faster, push+nreverse, or makeEmptyTconc+tconcGrow+closeTconc ?
> push+nreverse is fastest as it runs, but then eats a gulp of CPU
> time at the very end when it must traverse the whole list to
> reverse all the CDR pointers.
> makeEmptyTconc+tconcGrow+closeTconc, or the inline equivalent, is
> slower as it runs because it must test for first case every time
> through the loop (or totally mess up the calling code), but doesn't
> have to traverse the list at the end.
> I suspect the amortized cost is about the same either way.
It really depends on the processor, and on the size of the list, and
it doesn't really matter.
There's a paper showing that push+reverse is fastest, but it was on
some old processor. Nowadays, you have to be careful about the cache
size. While your list stays within L1-cache, or perhaps L2-cache,
either algorithm would be, externally, as fast as the other. On the
other hand, if the list is bigger than the cache, then processing it
sequentially only once would be faster. On the other hand, if it's
not too much bigger than the cache, the start of reversing it will
still process in the cache, and only the last part will need to be
fetched again. And of course, if the building of each element is
costly, then it doesn't matter, your list is already flushed out.
> But if the list occupies lots of virtual address space on a paged
> system, TCONC visits each cell just twice in close succession, so
> it doesn't generate any excess page faults, whereas NREVERSE
> revisits the whole list after it may no longer be in the cache and
> may require swapping all the cells back in. So in a paged system,
> TCONC is faster than PUSH+NREVERSE.
Well cache lines are like pages. But if your list must be swapped
out, forget it, go buy more RAM.
--
__Pascal Bourguignon__
On Apr 17, 4:46 am, ·················@SpamGourmet.Com (Robert Maas,
http://tinyurl.com/uh3t) wrote:
> Your language indicates that you have a gross mis-understanding of
> linked lists. Whether you really do have such a gross
> mis-understanding, or you understand fine but can't write legible
> English to express your understanding, is beyond my ability to read
> your mind. (Later in your article you seem to start to understand,
> then you screw up again. I can't tell whether your mind is glazing
> over, not understanding then understanding then not understanding,
> or whether it's just your sloppy English.)
>
> > From: ·············@gmail.com
> > Sometimes we need to grow a list by appending elements to its tail.
>
> No, we do *not* append elements, we append a list of elements.
> If we are adding elements one at a time, then each time we append a
> *list* of one element.
>
> A linked list, what we call a "list" in Lisp, consists of a set of
> cells, where each cell has two pointers, the CAR which points to an
> element, and the CDR which points to the next cell in the linked
> list.
>
> > Usually this is done by pushing elements to a front of a list and
> > then reversing the list.
>
> Note carefully: Pushing an element to the front of a list does
> *not* put that element ahead of the list. It puts a *cell* ahead of
> the list, where the CAR of that cell points to the element, and the
> CDR points to the first cell of the old list.
>
> > In some cases, this is not a feasible solution, as for example
> > when the first element of the list is already linked to some
> > other list.
>
> This is where you go grossly astray. If an *element* is the CAR of
> a *cell* in one list, nothing stops that same *element* from being
> the CAR of *another* cell that is in *another* list. The method of
> attaching new cells to the front of that second list doesn't in any
> way conflict with *other* cells which have those *same* elements in
> their CARs within the first list.
> (setq list1 (list 1 3 5 7 9))
> (setq list2 (list))
> (push (car list1) list2) ;list2 = (1)
> (push 2 list2) ;list2 = (2 1)
> (push (cadr list1) list2) ;list2 = (3 2 1)
> (push 4 list2) ;list2 = (4 3 2 1)
> (push (caddr list1) list2) ;list2 = (5 4 3 2 1)
> (setq result (reverse list2));result = (1 2 3 4 5)
> Through all that work, list1 remains: (1 3 5 7 9)
> None of the cells in list1 are cells in list2, nor vice versa,
> but three *elements* are shared between list1 and either list2 or result.
> *All* five elements, but no cells, are shared between list2 and result.
>
> > We build the sublists by using a pointer *tail* to the list's last
> > element.
>
> No, that's totally worthless. You need to maintain a pointer to the
> last *cell* of the list you want to grow, the cell whose CAR points
> to the last element. Having just a pointer to the last element does
> no good because there's no back-link from the element to the cell.
> (And think of it: If there actually were a back pointer, then a single
> element could never be simultaneously in two lists, because the
> back pointer could point back to just one cell in one of the lists,
> not to the other, so the back-pointer rule would be violated for
> the *other* list the element was in.)
>
> > The list is grown by adding an element to *tail*, and then
> > re-pointing *tail* to this element.
>
> No no a thousand times no!
> A list is grown by attaching a **cell** to *tail*, such that the
> CAR of that **cell** is the desired new element, then re-pointing
> *tail* to that new **cell**.
>
> *tail*------------------------------\
> |
> V
> +---+---+ +---+---+ +---+---+ +---+---+
> | V | >---->| V | >---->| V | >---->| V |NIL|
> +-|-+---+ +-|-+---+ +-|-+---+ +-|-+---+
> V V V V
> "Arnold" "Billy" "Charlie" "David"
>
> (setq newelem "Edgar")
> (setq newcell (cons newelem NIL))
>
> *tail*----------------------------\ newcell
> | |
> V V
> +---+---+ +---+---+ +---+---+ +---+---+ +---+---+
> | V | >---->| V | >---->| V | >---->| V |NIL| | V |NIL|
> +-|-+---+ +-|-+---+ +-|-+---+ +-|-+---+ +-|-+---+
> V V V V V
> "Arnold" "Billy" "Charlie" "David" "Edgar"
>
> (setf (cdr *tail*) newcell) i.e. (rplacd *tail* newcell)
>
> *tail*----------------------------\ newcell
> | |
> V V
> +---+---+ +---+---+ +---+---+ +---+---+ +---+---+
> | V | >---->| V | >---->| V | >---->| V | >---->| V |NIL|
> +-|-+---+ +-|-+---+ +-|-+---+ +-|-+---+ +-|-+---+
> V V V V V
> "Arnold" "Billy" "Charlie" "David" "Edgar"
>
> (setq *tail* newcell)
> newcell
> *tail*----------------------------------------\|
> |/
> V
> +---+---+ +---+---+ +---+---+ +---+---+ +---+---+
> | V | >---->| V | >---->| V | >---->| V | >---->| V |NIL|
> +-|-+---+ +-|-+---+ +-|-+---+ +-|-+---+ +-|-+---+
> V V V V V
> "Alpha" "Baker" "Charlie" "David" "Edgar"
>
> > (setf our-list (list 'b1))
> > (setf tail our-list)
> > *tail* and *our-list* now point to a cons cell with contents (b1 . nil)
>
> Yes, now you have it correct. You really need to correct your earlier wording.
>
> (snipped example of appending one new CONS cell whose CAR points to
> one new element)
>
> > We are now ready to append the next element since *tail* points
> > to the last element again.
>
> No, you do *not* append an *element*, you append a *list*, in this
> case a brand-new CONS cell whose CAR is the new element and whose
> CDR is NIL. You're back to using wrong language.
>
> (Also, NCONC rather than APPEND is the correct jargon here, but I'm
> omitting that distinction for this discussion. NCONC is basicaly a
> destructive in-place version of APPEND. I suppose a more
> appropriate name would be NAPPEND, analagous to NREVERSE.)
>
> Try doing what you *say*
> (setq tail (setq list (list 'a)))
> (nconc tail 'b) i.e. (rplacd tail 'b) ;Doesn't do what you want!
> (setq tail (cdr tail))
> (nconc tail 'c) i.e. (rplacd tail 'c) ;Signals error, or crashes
> (setq tail (cdr tail)) ;You never get to here
> and watch the disaster that befalls you.
>
> > Summarizing, we appended an element to the list by modifying *tail*,
> > and then linked *tail* to that element.
>
> No, no, again no!! You do *not* append an element. You append
> (actually NCONC or RPLACD) a CONS cell, a *list* of one element.
>
> You really do need to get your English correct before you offer
> your recipe to a cookbook.
>
> By the way, your recipe can *not* be implemented as a function,
> because it requires *two* separate toplevel pointers.
> In C++, you could perhaps pass two reference variables to a function,
> but Lisp doesn't have reference variables in that sense.
> In Lisp you'd have to implement your recipe as a macro, which
> expands into code that references *two* different higher-scope
> places, one to hold the front of the list, and one to hold the
> last cell of the list, if you want to handle the very first step
> in building a list:
> (setq list (list)) ;Empty list
> (setq tail list) ;Tail does *not* point to a CONS cell, because the
> ; list doesn't yet have any CONS cells.
> (macrosplice list tail newelement)
> expands into:
> (let ((newcell (cons newelement nil)))
> (if (null list) (setq tail (setq list newcell))
> (progn (rplacd tail newcell) (setq tail newcell))))
> Now if you want to *manually* build the single-element list, and
> only later call your recipe to tack on additional elements, then
> you don't need to do anything except pass tail to the function and
> assign tail to the return value of the function.
> (defun tackon (tail newelement)
> (let ((newcell (cons newelement nil)))
> (rplacd tail newcell)
> newcell)))
> ;Usage: (setq tail (tackon tail newelement))
> So users need to be careful to use tackon the same way that they
> might use SORT or NREVERSE etc., where they must assign the return
> value back to a variable.
>
> Now if you just want an inline recipe, i.e. *two* lines of code to
> accomplish *one* step, do this the first time:
> (setq tail (setq list (cons firstelement nil)))
> and then do this each subsequent time:
> (rplacd tail (cons newelement nil))
> (setq tail (cdr tail))
> But note how you must carefully distinguish putting the first
> element into a list from putting each subsequent element into the
> same list.
>
> Now back to square one!!!
>
> There's a *major* design issue you haven't even begun to address in
> your original specification of the problem to be solved:
>
> [1] Do your use cases work like this:
> - You start to build a list.
> - You add elements one at a time.
> - You are done building that list.
> - *now* that list is available for use by other observers.
>
> [2] Or do your use cases work like this:
> - You start to build a list.
> - The empty list is *already visible* to other observers.
> - You add elements one at a time.
> - After each element is added, the partial list is *immediately*
> visible to other observers.
> - You are done building that list.
> - The final version of the list which was visible just before you
> were done is *still* visible now, and is EQ.
>
> [3] Or do your use cases work like this:
> - You start to build a list. It can't yet be seen by other observers.
> - You add the first element to the list.
> - You publish the location of that list. It can now be seen by other observers.
> - You add additional elements to the list, and each time the
> resultant longer list is immediately seen by other observers.
> - You are done building that list.
> - The final version of the list which was visible just before you
> were done is *still* visible now, and is EQ.
>
> In case [1], you might consider TCONC instead. An "open" list being
> built, not yet...
>
> read more »
Robert,
Thanks for the comment. My understanding of the issue improved
somewhat from a couple of weeks ago (see a previous recipe
submission). I'll read your post carefully - I was away for a few
days - and hopefully it will improve some more.
And, parenthetically, the whole point of my post is to weed out
mistakes and misconceptions, before adding it to the lisp recipe
collection. I am starting to think that the major section will be the
"Acknowledgment" one.
Thanks again,
Mirko
On Apr 17, 4:46 am, ·················@SpamGourmet.Com (Robert Maas,
http://tinyurl.com/uh3t) wrote:
> Your language indicates that you have a gross mis-understanding of
> linked lists. Whether you really do have such a gross
> mis-understanding, or you understand fine but can't write legible
> English to express your understanding, is beyond my ability to read
> your mind. (Later in your article you seem to start to understand,
> then you screw up again. I can't tell whether your mind is glazing
> over, not understanding then understanding then not understanding,
> or whether it's just your sloppy English.)
Robert,
you were probably right that I did not quite get how lists are stored,
as you show an example in the following figure...
>
> *tail*------------------------------\
> |
> V
> +---+---+ +---+---+ +---+---+ +---+---+
> | V | >---->| V | >---->| V | >---->| V |NIL|
> +-|-+---+ +-|-+---+ +-|-+---+ +-|-+---+
> V V V V
> "Arnold" "Billy" "Charlie" "David"
>
> (setq newelem "Edgar")
> (setq newcell (cons newelem NIL))
>
> *tail*----------------------------\ newcell
> | |
> V V
> +---+---+ +---+---+ +---+---+ +---+---+ +---+---+
> | V | >---->| V | >---->| V | >---->| V |NIL| | V |NIL|
> +-|-+---+ +-|-+---+ +-|-+---+ +-|-+---+ +-|-+---+
> V V V V V
> "Arnold" "Billy" "Charlie" "David" "Edgar"
I think one reason were the diagrams in Peter Seibel's Practical
Common Lisp (see figures in Chapter 13).
Was Peter off base there, or are these alternate, and for practical
purposes equivalent pictures?
I have redone my write up, and I'll post it for further dismemberment.
Mirko
From: Robert Maas, http://tinyurl.com/uh3t
Subject: Re: grow list by tail, pointer example recipe -- please comment
Date:
Message-ID: <rem-2008jun05-003@yahoo.com>
> From: ·············@gmail.com
> you were probably right that I did not quite get how lists are stored,
OK, I couldn't be sure if you didn't understand the facts or just
didn't express your understanding correctly. Either way, I'm glad
my comments were helpful.
> I think one reason were the diagrams in Peter Seibel's Practical
> Common Lisp (see figures in Chapter 13).
> Was Peter off base there, or are these alternate, and for
> practical purposes equivalent pictures?
I haven't seen the book so I have no way to comment on how he did it.
Note that in some cases it's an implementation detail whether a
value is stored directly within either half of a CONS pair or
whether what's stored there is a pointer to the value elsewhere in
the heap. In other cases, the value absolutely *must* be stored
elsewhere, because it's possible for multiple places to hold
pointers to it, in order to share storage hence share side effects
if any that occur. There is no case where there's a *requirment* to
store the value directly within half of a CONS pair. So if you
*always* show a pointer to a separate value, you're more likely to
be right than if you show values within half-CONS-pair.
The most common case of showing value within half-CONS-pair instead
of pointer to separate place in heap, is when authors draw a
diagonal line through a half-CONS-pair to indicate that it points
to NIL, which is absolutely misleading (wrong, if you want to be
nitpicky), because NIL absolutely must be a separate object,
because it contains a property list, which can be modified by
various parties, and those properties are accessible by all, thus
modifications are shared side effects, proving that it's a separate
object that is shared, with pointers from various places to it.
(They don't need to be *regular* pointers, with the machine address
of the NIL symbol. They might be a special token such as all zero
bits which is treated by the system as if a pointer to NIL. But
still, in effect when you take CAR or CDR as appropriate of such a
CONS cell you will have a copy of a pointer to the NIL symbol, not
the entire NIL symbol somehow crammed into a register.)
Now if the author makes it clear that diagonal slash through half a
CONS pair means that half contains pointer to NIL, but since such
pointers appear all over the place it'd turn the diagrams into a
horrible tangle if *every* such pointer were drawn as a trace
extending from that place to NIL located somewhere else on the
page, so the reader knew from the start that NIL isn't really
stored inside the half-CONS-pair, that would be acceptable to me.
But it really has to be made clear, on the page where such diagrams
are first introduced, not hidden in an appendix that hardly anyone
ever reads.
Anyway, that's my opinion, and some facts as I understand them.
On Apr 17, 4:46 am, ·················@SpamGourmet.Com (Robert Maas,
http://tinyurl.com/uh3t) wrote:
> Your language indicates that you have a gross mis-understanding of
> linked lists. Whether you really do have such a gross
> mis-understanding, or you understand fine but can't write legible
> English to express your understanding, is beyond my ability to read
> your mind. (Later in your article you seem to start to understand,
> then you screw up again. I can't tell whether your mind is glazing
> over, not understanding then understanding then not understanding,
> or whether it's just your sloppy English.)
OK Robert, is this better?
(again, thanks for the exhaustive critique and examples)
Proposed addition to the CL recipes starts here - it is in emacs' muse
mode that will be converted to HTML and all the * are part of markup
and NOT variable names.
* The Common Lisp Cookbook - Grow list by its tail by using a pointer
** Contents
- Preliminaries [[#preliminaries]]
- Background [[#background]]
- Recipe [[#recipe]]
- Other resources [[#other-resources]]
#preliminaries
** Preliminaries
The procedure described in this note involves list surgery, and we
will have to look at the internals of list representation in lisp.
For a good introduction to these, see Ch. 6 of Touretzky's Common
Lisp: A gentle introduction to symbolic computation (available here:
http://www.cs.cmu.edu/~dst/LispBook/index.html).
Briefly, lists are stored as a series of cons cells. The car of
each cell points to the stored element, and the cdr to the next cons
cell.
<example>
CL-USER> (setq *list1* (list 1 3 5 7 9))
(1 3 5 7 9)
CL-USER> (sdraw *list1*)
[*|*]--->[*|*]--->[*|*]--->[*|*]--->[*|*]--->NIL
| | | | |
v v v v v
1 3 5 7 9
</example>
There are some variations on the storage theme. For example, small
integers may be stored in the car, instead of having a pointer to
them. Consult a wizard for such fine details.
** Problem background
Usually, cons cells are added to a list in one of two ways: using the
push command to pre-pend a cons-cell with an element to the front, and
then reversing the list, shown here
<example>
(defvar *our-list* nil)
(push 'a *our-list*)
...
(nreverse *our-list*)
</example>
or by pointing the list's last cons cell's cdr the new cons cell:
<example>
(defvar *our-list* (list 'a 'b 'c))
(setf (cdr (last *our-list*)) (list 'next-element))
...
</example>
The second approach has the drawback of repeated use of the *last*
command. As the list grows longer, it becomes more expensive to use
it as it has to traverse the list to reach the last cons cell.
<comment>
Maybe throw this out.
One can also use append or nconc, but these are still inefficient,
because they need to traverse the list to find its last cons cell.
(This discussion assumes that our code will be compiled, not
interpreted, and thus as fast as the built-in lisp functions).
</comment>
As for the first approach, it may not be feasible in some cases. If
our data structure is a tree, one would have to insert a new cons cell
between the trunk and the list's head cons, and then at the end of all
the insertions, reverse the linkage branch linkage, and re-attach the
last cons cell of the branch to the trunk.
The following figure shows the layout of such a tree. It's trunk, or
spine, has some cons cells whose car's point to the head cons cells of
the branch lists. How shall we grow these branches?
<example>
-+---+----+--------
\ \ \
\ \ \
\ \
\
</example>
In list notation, this structure is of the form
<example>
(list 'a (list 'b1 'b2 'b3 'b4 'b5) 'c 'd 'e (list 'f1 'f2 'f3)
'g ...)
</example>
Growing the lists b, f, ... is the topic of this recipe.
#recipe
** Recipe
This recipe teaches how to append a cons cell to a list. It cannot be
easily implemented as a "canned" routine. It does not address various
other issues, such as initialization. It is intended to be used
inlined in the code and also to demonstrate the use of pointers.
We grow the sublists by using a pointer *tail* to the list's last
cons cell. For each additional element, we first create a cons cell
that points to it. This cons cell is then appended to the end of
the list using *tail*, and then tail is moved to the last cons cell.
*** Example with detailed explanation
Here is the whole procedure in greater detail. As an example,
consider growing the list *our-list*
<example>
(setf our-list (list 'b1))
</example>
We will first initialize *tail*
<example>
> (setf tail our-list)
(b1)
</example>
*tail* and *our-list* now point to a cons cell with contents (b1 .
nil)
<example>
tail ---+
|
v
our-list --- > [*|*]--->NIL
|
v
B1
</example>
To augment *our-list* with b2 we first create a cons cell to point to
b2
with *(list 'b2)*:
<example>
tail ---+ +--- new-cell
| |
v v
our-list --- > [*|*]--->NIL [*|*]--->NIL
| |
v v
B1 B2
</example>
We then use *(rplacd tail new-cell)* to create a link between our-
list's
last cell and the new cell
<example>
tail ---+
|
v
our-list --- > [*|*]---> [*|*]--->NIL
| |
v v
B1 B2
</example>
We finally re-point *tail* to the last cell
<example>
> (setf tail (cdr tail))
(b2)
</example>
we have the following picture:
<example>
tail ---+
|
v
our-list --- > [*|*]---> [*|*]--->NIL
| |
v v
B1 B2
</example>
We are now ready to append the next cons cell since *tail* points to
the
last cons cell again.
Summarizing, we appended a cons cell to the list by modifying *tail*,
and
then linked *tail* to that cons cell.
The whole procedure can be implemented in one nested statement
<example>
> (setf tail (cdr (rplacd tail (list 'b3))))
(b3)
> our-list
(b1 b2 b3)
</example>
This expression uses the fact that rplacd returns the element whose
cdr it just modified.
#other-resources
** Further reading
Tim Bradshaw offers a macro to accomplish the same goal. For a
discussion of its application and download link visit
[[http://www.tfeb.org/lisp/hax.html#COLLECTING]].
Several tools can be used to visualize list storage within lisp's
repl. The package sdraw.lisp described in Touretzky's book is
available here http://www.cs.cmu.edu/~dst/Lisp/sdraw/sdraw.generic.
Pascal Bourgignon' software is available here
(http://darcs.informatimago.com/lisp/common-lisp/cons-to-ascii.lisp)
From: Robert Maas, http://tinyurl.com/uh3t
Subject: Re: grow list by tail, pointer example recipe -- please comment
Date:
Message-ID: <rem-2008jun05-004@yahoo.com>
> From: ·············@gmail.com
> OK Robert, is this better?
> Proposed addition to the CL recipes starts here - it is in emacs'
> muse mode that will be converted to HTML and all the * are part of
> markup and NOT variable names.
I'm not familiar with muse mode, so I don't know if I'll be able to
read your marked-up-text carefully enough to detect mistakes.
> The procedure described in this note involves list surgery,
Hmm, that's a cute term to denote directly modifying the internal
workings of data structures that already have been built and which
might have multiple viewers (thus side-effects of such
modifications would be shared to all such viewers). I hope that
somewhere previously you defined what you mean by "surgery".
Hey, looky what I found in WikiPedia! Sushruta, around 600 BC,
wrote a text on surgery, in particular describing rhinoplasty. His
technique, used to reconstruct noses that were amputated as a
punishment for crimes, is practiced almost unchanged in technique
to this day. Excerpted from: <http://en.wikipedia.org/wiki/Surgery#History>
Hmm, I wonder if anesthesia is analagous to freezing all other
processes by declaring a "critical section" of code when making a
set of related modifications in shared data?
> and we will have to look at the internals of list representation
> in lisp.
Note that we're looking *only* at an abstract or idealized
representation of the internals. We're not looking directly at the
internals, such as how pointers are represented (absolute address,
relative address, address divided by four, tag bits in addition to
the address, etc.) and how the heap is laid out in RAM.
Some earlier versions of Lisp had ways to obtain the machine
address that a pointer represents (converted to a boxed integer
value), and given a (boxed) integer could do the inverse of
manufacturing a pointer with that address. For example, I seem to
recall in MacLisp there was MAKNUM that converted a pointer into a
boxed integer, and MUNKAM or somesuch to convert back. I'm pretty
sure ANSI-CL doesn't provide any such facility, so the true innerds
are pretty much invisible to users and application programmers.
> Briefly, lists are stored as a series of cons cells.
That's not quite said well IMO. Maybe "chain" would be better than
"series" in that sentence. But the next sentence clarifies, so it's
not a fatal misstatement.
> The car of each cell points to the stored element, and the cdr to
> the next cons cell.
Yes, but note the possibility that some implementations may store
some values directly in the half-CONS-cell, rather than point to
value elsewhere. You might include a footnote to this effect,
saying that this is possible only when (1) the value requires less
storage than half a CONS pair, so that the value and some tag bits
can share that half-CONS-pair, and when (2) the semantics of
ANSI-CL prohibit modification (surgery) of the value in-place such
that side effects could be shared, and when (3) the semantics of
ANSI-CL allow violation of the EQ rule, that a value is always EQ
to itself. IIRC this possibility applies *only* to a limited range
of integers (typically integers within some small interval around
zero) and some characters (typically just US-ASCII, or standard
characters), and again it's an option if the implementation would
do this optimization or not. In Steele's CLtL1, this discussion is
at the bottom of page 77 and continuing to the top of page 78. The
corresponding text from the HyperSpec is here:
<http://www.isr.ist.utl.pt/library/docs/HyperSpec/Body/f_eq.htm>
An implementation is permitted to make ``copies'' of characters and
numbers at any time. The effect is that Common Lisp makes no guarantee
that eq is true even when both its arguments are ``the same thing'' if
that thing is a character or number.
If you're writing an online file, it'd probably be good to link to that.
> CL-USER> (setq *list1* (list 1 3 5 7 9))
> (1 3 5 7 9)
> CL-USER> (sdraw *list1*)
> [*|*]--->[*|*]--->[*|*]--->[*|*]--->[*|*]--->NIL
> | | | | |
> v v v v v
> 1 3 5 7 9
Hey, that's a nifty function. Do you have the source openly
available? Do you have a CGI demo whereby somebody can enter some
commands to build a list structure and then request such a diagram
to be displayed?
One thing is *not* nifty about it: It always shows the next CONS
cell consecutively to the right, which could mislead the novice to
believe these lists are in consecutive memory locations. I think
(strong opinion!!) that *before* you draw the very first such
left-to-right linked list, you draw how memory might *really* be
laid out, for example:
__________________________<<__________________________
/ \
->[*|*]->NIL [*|*] |
| | | |
V V | |
9 1 | |
/ -------------->[*|*]--\ |
/-----<<-------- / ________ | | |
| / / \ V | |
\->[*|*]------>>----/ ->[*|*] | 5 | ^
| | | \___________/ ^
V V \ |
3 7 ------->>----------/
(all the rest of the diagram represents memory which is part of
other unrelated objects)
(the traces wander all around to avoid crassing)
(if we were willing to have lines cross, we could draw more direct
links to show pointers going directly from one place to another:)
[*|*]->NIL [*|*]
| <- |//
V --- //V
9 --- // 1
-//- --------->[*|*]
// -<<<< >------- __----- |
V/ --->>>><<<---- <- V
[*|*]--- [*|*] 5
| |
V V
3 7
(but it would be very hard to follow the arrows that way)
Actually, for newbies, before you draw *any* arrow diagrams, you should
IMO show an actual hypothetical memory dump, like this:
0 1 2 3 4 5 6 7 8 9 A B C D E F
0 32 04 NIL 38 74
1
2
3 9 1
4 7E 7A
5
6
7 A4 4E AA 02 5
8
9
A 3 7
Then describe how the top of the list is at location 08, where
first part (CAR) points to location 38 where the number 1 is found,
and second part (CDR) points to location 74 where the next CONS
pair is found. (Continue to describe in detail how the two-digit
numbers interpreted as memory addresses allow finding the next CONS
cell or the current number. Then repeat exactly the same picture
but with arrows overlaying it to show links from CDR parts to next
CONS pair. Then erase the hexadecimal numbers, replace them by
[*|*] notation, to show just the CONS pairs and links from one
place to another. Then redraw the arrows to avoid crossing. Then
rearrange the memory cells to show them in a nice linear diagram
the way your sdraw function draws them. That way the newbie really
understands that really in RAM these linked lists might jump all
over memory, and *that* is why you really need the CDR pointer to
tell you where to find the next cell in the chain. (If they were
always in consecutive memory locations, you wouldn't need a pointer
to find the next element. But that layout of memory would be an
ARRAY, not a linked list.)
> There are some variations on the storage theme. For example,
> small integers may be stored in the car, instead of having a
> pointer to them.
Actually they can be stored in the CDR too, at the end of a dotted
list. But in this context, you're talking about only proper lists,
so that wouldn't occur. I suggest a footnote here to say that,
namely that they can be stored in the CDR just the same, but then
it wouldn't be a proper list so it's not relevant to this
discussion here.
(Splitting my reply here. More another day.)
From: Robert Maas, http://tinyurl.com/uh3t
Subject: Re: grow list by tail, pointer example recipe -- please comment
Date:
Message-ID: <rem-2008jun05-005@yahoo.com>
(Part 2 of my reply. This is where you start perhaps going wrong:)
> From: ·············@gmail.com
> Usually, cons cells are added to a list in one of two ways:
Not correct! There are lots of ways to add cons cells to a list,
not just the two you describe.
> using the push command to pre-pend a cons-cell with an element to
> the front, and then reversing the list,
PUSH isn't a command, it's a macro operator.
A command would be something like BACKTRACE or L (from the
debugger, the first prints a backtrace of all forms currently
mid-execution, the second lists all bindings within the current
stack frame).
Anyway, pushing in normal sequence which builds a reverse list
which needs un-reversing at the end is one way. Another way is to
push in reversed sequence at the start so the list is already in
correct sequence and you don't need to un-reverse it at the end.
> or by pointing the list's last cons cell's cdr the new cons cell:
> (defvar *our-list* (list 'a 'b 'c))
> (setf (cdr (last *our-list*)) (list 'next-element))
The description is generic, could be either the way your example
shows, or a somewhat different way that is significantly more
time-efficient at only slight extra cost in storage.
> The second approach has the drawback of repeated use of the
> *last* command.
Only your particular implementation of it has that disadvantage.
The TCONC method doesn't require calling LAST at all, assuming
you start from an empty list and add one new element at a time.
> As the list grows longer, it becomes more expensive to use it as
> it has to traverse the list to reach the last cons cell.
Yes, if you do it your inefficient way, instead of the TCONC way.
Note there's yet another completely different way to build a list
from scratch: Instead of always pushing new elements at the start,
or always NCONCing new elements at the end, you can splice new
elements anywhere you want in the middle of the list. All you need
is a pointer to the CONS cell just before where you want to insert
the new element, and the new element itself of course.
(setq preptr howeverYouFindThatConsCell)
(setf (cdr preptr) (cons newElement (cdr preptr)))
There are a whole bunch of different applications for inserting a
new element somewhere in the middle of a list, which vary only by
how you find the CONS cell immediately before the insertion point:
- Insertion sort: Scan from front of list until the NEXT element
after the current one exceeds the new element.
- Substitution rules: Scan once down a list and everywhere you see
a particular type of element you insert a new element after it.
- Shuffle by random insert: SImilar to insertion sort except you
just pick a random place to insert.
> As for the first approach, it may not be feasible in some cases.
> If our data structure is a tree, one would have to ...
This whole paragraph+paragraph+diagram is inappropriate (off topic)
because we're dealing specifically with a proper linked list, not
with some other structure such as dotted list or tree or array or
hashtable or relational database or internet etc.
> This recipe teaches how to append a cons cell to a list. It
> cannot be easily implemented as a "canned" routine. It does not
> address various other issues, such as initialization. It is
> intended to be used inlined in the code and also to demonstrate the
> use of pointers.
As a set of macros, it *could* be "canned", and you could address
the initialization just fine.
macro: (initMyWay headPtrName tailPtrName)
sets both pointers to NIL
macro: (concMyWay headPtrName tailPtrName newElement)
Sets local temporary variable newCell to (cons newElement NIL), then
If pointers are NIL, sets both pointers to newCell;
Otherwise it does (setf (cdr tailPtrName) newCell) (setf tailPtrName newCell)
> We grow the sublists by using a pointer *tail* to the list's last
> cons cell. For each additional element, we first create a cons cell
> that points to it. This cons cell is then appended to the end of
> the list using *tail*, and then tail is moved to the last cons cell.
That's a nice description of the "Otherwise" clause of the concMyWay macro.
I suggest you first show how to do the code inline, as a code
pattern, as you do already, then later show how to encapsulate it
all within a pair of macros.
tail ---+ +--- new-cell
| |
v v
our-list --- > [*|*]--->NIL [*|*]--->NIL
| |
v v
B1 B2
Given that you have your nifty diagram-drawing function, I accept
that very-terse notation (compatible with your diagram-drawing
function) as an alternative to the usual boxed notation.
> We then use *(rplacd tail new-cell)* to create a link between
> our- list's last cell and the new cell
> tail ---+
> |
> v
> our-list --- > [*|*]---> [*|*]--->NIL
> | |
> v v
> B1 B2
I suggest you *keep* the new-cell pointer in this diagram. I.e. show:
tail ---+ +--- new-cell
| |
v v
our-list --- > [*|*]---> [*|*]--->NIL
| |
v v
B1 B2
> We finally re-point *tail* to the last cell
> (setf tail (cdr tail))
I suggest you just do (setq tail new-cell)
> tail ---+
> |
> v
> our-list --- > [*|*]---> [*|*]--->NIL
> | |
> v v
> B1 B2
I suggest you continue to keep the new-cell pointer even now.
It gets crowded with two pointers to the same CONS cell,
so at least one of the two pointers must come in at an angle:
> tail ---+ new-cell
> | /
> vL
> our-list --- > [*|*]---> [*|*]--->NIL
> | |
> v v
> B1 B2
In fact you might have used that diagonal trace in the previous
diagram when there wasn't yet any crowding, just to make that and
this diagram consistent.
At this point you might mention that we are done with new-cell and
can discard that pointer. If it's a local (LET) variable, then it
goes away automatically when we exit the LET form, so it might be a
good idea to put all that code inside a LET form:
(let ((new-cell (cons 'B2 NIL)))
(rplacd tail new-cell)
(setq tail new-cell)
)
> Summarizing, we appended a cons cell to the list by modifying
> *tail*, and then linked *tail* to that cons cell.
Typo: "linked" should read "linking".
But the wording is confusing because you don't modify *tail*,
you modify the cell it points to. Maybe re-word this way:
Summarizing, we appended a cons cell to the list by modifying
the cell that *tail* pointed to, and then advancing *tail* to
point to that new cons cell.
> The whole procedure can be implemented in one nested statement
> (setf tail (cdr (rplacd tail (list 'b3))))
> This expression uses the fact that rplacd returns the element
> whose cdr it just modified.
OK, I see why you used CDR to advance tail to the new cell instead
of just using new-cell again, to avoid having to make up a
temporary name for that new cell in the first place, so that you
could do the whole thing in one nested statment without needing
LET. I'm tempted to say that it's better to *first* write the code
using the temporary variable, the way I did it, ending with the LET
statement, and *then* notice that RPLACD returns the CONS cell that
was modified, which allows an alternate expression that doesn't
need a temporary variable and allows nesting. So you refactor the
original LET code to do the non-LET nesting.
I personally like writing the first draft of code where each single
step assigns a new named temporary variable, so that it's easy to
talk about that value by the name of its variable, and then after
the algorithm is working we consider whether it'd be worth the
effort to refactor the code to minimise use of temporary variables
and maximize the use of nesting expressions. People who hate five or
ten close parentheses in a row might choose to leave the code the
original way. People who are comfortable with all those consecutive
close parens might choose to refactor the code for maximum nesting.
But the reader of your tutorial ought to see that it can be done
either way and it's the option of the programmer which to use in
the production code. Neither way is *wrong*, and both are about
equally efficient.
More detail: Very first draft uses SETQ instead of LET, so that we
can debug each single line of code by itself. Then we refactor to
use LET. Then we either leave it that way or refactor again to do
it your fully-nested no-LET way.
Note: TCONC holds the values of headPtrName and tailPtrName in a
single CONS cell. Then you don't need a macro, an ordinary function
will do, but then (at least the very first time) you need to assign
the result of the TCONC call.
There's a sorta philosophic difference to worry about:
You need a special case for first item added to the list.
There are (at least) three ways to deal with that special case:
- The first time you add to the list, you use different code from
the later times. That's how you did it in your example.
- *Before* the first time, you do an initialization step. Then
*every* time you add to the list, even the first time, the code
is uniform, but you need a test in the uniform code to detect
whether this is the first time. This is how TCONC works.
- Before you start, you use special code to make a list with an
element you don't really need. Then no special case is needed in
the regular code at all, but at the very end you need to do CDR to
get past that extra element you had at the start.
The code for that third way is like this:
Init: (setq myList (setq myTail (cons :DUMMY NIL))
Step (with LET): (let ((newCell (cons newElement NIL)))
(rplacd myTail newCell)
(setq myTail newCell)
)
Step (sans LET): (setf myTail (cdr (rplacd myTail (cons newElement NIL))))
Step (with SETF): (setf myTail (setf (cdr myTail) (cons newElement NIL)))
Finalization: (setq myList (cdr myList))
On Jun 5, 3:02 pm, ··················@spamgourmet.com.invalid (Robert
Maas, http://tinyurl.com/uh3t) wrote:
> (Part 2 of my reply. This is where you start perhaps going wrong:)
>
> > From: ·············@gmail.com
> > Usually, cons cells are added to a list in one of two ways:
>
> Not correct! There are lots of ways to add cons cells to a list,
> not just the two you describe.
>
> > using the push command to pre-pend a cons-cell with an element to
> > the front, and then reversing the list,
>
> PUSH isn't a command, it's a macro operator.
> A command would be something like BACKTRACE or L (from the
> debugger, the first prints a backtrace of all forms currently
> mid-execution, the second lists all bindings within the current
> stack frame).
>
> Anyway, pushing in normal sequence which builds a reverse list
> which needs un-reversing at the end is one way. Another way is to
> push in reversed sequence at the start so the list is already in
> correct sequence and you don't need to un-reverse it at the end.
>
> > or by pointing the list's last cons cell's cdr the new cons cell:
> > (defvar *our-list* (list 'a 'b 'c))
> > (setf (cdr (last *our-list*)) (list 'next-element))
>
> The description is generic, could be either the way your example
> shows, or a somewhat different way that is significantly more
> time-efficient at only slight extra cost in storage.
>
> > The second approach has the drawback of repeated use of the
> > *last* command.
>
> Only your particular implementation of it has that disadvantage.
> The TCONC method doesn't require calling LAST at all, assuming
> you start from an empty list and add one new element at a time.
>
> > As the list grows longer, it becomes more expensive to use it as
> > it has to traverse the list to reach the last cons cell.
>
> Yes, if you do it your inefficient way, instead of the TCONC way.
>
> Note there's yet another completely different way to build a list
> from scratch: Instead of always pushing new elements at the start,
> or always NCONCing new elements at the end, you can splice new
> elements anywhere you want in the middle of the list. All you need
> is a pointer to the CONS cell just before where you want to insert
> the new element, and the new element itself of course.
>
> (setq preptr howeverYouFindThatConsCell)
> (setf (cdr preptr) (cons newElement (cdr preptr)))
>
> There are a whole bunch of different applications for inserting a
> new element somewhere in the middle of a list, which vary only by
> how you find the CONS cell immediately before the insertion point:
> - Insertion sort: Scan from front of list until the NEXT element
> after the current one exceeds the new element.
> - Substitution rules: Scan once down a list and everywhere you see
> a particular type of element you insert a new element after it.
> - Shuffle by random insert: SImilar to insertion sort except you
> just pick a random place to insert.
>
> > As for the first approach, it may not be feasible in some cases.
> > If our data structure is a tree, one would have to ...
>
> This whole paragraph+paragraph+diagram is inappropriate (off topic)
> because we're dealing specifically with a proper linked list, not
> with some other structure such as dotted list or tree or array or
> hashtable or relational database or internet etc.
>
> > This recipe teaches how to append a cons cell to a list. It
> > cannot be easily implemented as a "canned" routine. It does not
> > address various other issues, such as initialization. It is
> > intended to be used inlined in the code and also to demonstrate the
> > use of pointers.
>
> As a set of macros, it *could* be "canned", and you could address
> the initialization just fine.
> macro: (initMyWay headPtrName tailPtrName)
> sets both pointers to NIL
> macro: (concMyWay headPtrName tailPtrName newElement)
> Sets local temporary variable newCell to (cons newElement NIL), then
> If pointers are NIL, sets both pointers to newCell;
> Otherwise it does (setf (cdr tailPtrName) newCell) (setf tailPtrName newCell)
>
> > We grow the sublists by using a pointer *tail* to the list's last
> > cons cell. For each additional element, we first create a cons cell
> > that points to it. This cons cell is then appended to the end of
> > the list using *tail*, and then tail is moved to the last cons cell.
>
> That's a nice description of the "Otherwise" clause of the concMyWay macro.
>
> I suggest you first show how to do the code inline, as a code
> pattern, as you do already, then later show how to encapsulate it
> all within a pair of macros.
>
> tail ---+ +--- new-cell
> | |
> v v
> our-list --- > [*|*]--->NIL [*|*]--->NIL
> | |
> v v
> B1 B2
>
> Given that you have your nifty diagram-drawing function, I accept
> that very-terse notation (compatible with your diagram-drawing
> function) as an alternative to the usual boxed notation.
>
> > We then use *(rplacd tail new-cell)* to create a link between
> > our- list's last cell and the new cell
> > tail ---+
> > |
> > v
> > our-list --- > [*|*]---> [*|*]--->NIL
> > | |
> > v v
> > B1 B2
>
> I suggest you *keep* the new-cell pointer in this diagram. I.e. show:
>
> tail ---+ +--- new-cell
> | |
> v v
> our-list --- > [*|*]---> [*|*]--->NIL
> | |
> v v
> B1 B2
>
> > We finally re-point *tail* to the last cell
> > (setf tail (cdr tail))
>
> I suggest you just do (setq tail new-cell)
>
> > tail ---+
> > |
> > v
> > our-list --- > [*|*]---> [*|*]--->NIL
> > | |
> > v v
> > B1 B2
>
> I suggest you continue to keep the new-cell pointer even now.
> It gets crowded with two pointers to the same CONS cell,
> so at least one of the two pointers must come in at an angle:
>
> > tail ---+ new-cell
> > | /
> > vL
> > our-list --- > [*|*]---> [*|*]--->NIL
> > | |
> > v v
> > B1 B2
>
> In fact you might have used that diagonal trace in the previous
> diagram when there wasn't yet any crowding, just to make that and
> this diagram consistent.
>
> At this point you might mention that we are done with new-cell and
> can discard that pointer. If it's a local (LET) variable, then it
> goes away automatically when we exit the LET form, so it might be a
> good idea to put all that code inside a LET form:
> (let ((new-cell (cons 'B2 NIL)))
> (rplacd tail new-cell)
> (setq tail new-cell)
> )
>
> > Summarizing, we appended a cons cell to the list by modifying
> > *tail*, and then linked *tail* to that cons cell.
>
> Typo: "linked" should read "linking".
> But the wording is confusing because you don't modify *tail*,
> you modify the cell it points to. Maybe re-word this way:
>
> Summarizing, we appended a cons cell to the list by modifying
> the cell that *tail* pointed to, and then advancing *tail* to
> point to that new cons cell.
>
> > The whole procedure can be implemented in one nested statement
> > (setf tail (cdr (rplacd tail (list 'b3))))
> > This expression uses the fact that rplacd returns the element
> > whose cdr it just modified.
>
> OK, I see why you used CDR to advance tail to the new cell instead
> of just using new-cell again, to avoid having to make up a
> temporary name for that new cell in the first place, so that you
> could do the whole thing in one nested statment without needing
> LET. I'm tempted to say that it's better to *first* write the code
> using the temporary variable, the way I did it, ending with the LET
> statement, and *then* notice that RPLACD returns the CONS cell that
> was modified, which allows an alternate expression that doesn't
> need a temporary variable and allows nesting. So you refactor the
> original LET code to do the non-LET nesting.
>
> I personally like writing the first draft of code where each single
> step assigns a new named temporary variable, so that it's easy to
> talk about that value by the name of its variable, and then after
> the algorithm is working we consider whether it'd be worth the
> effort to refactor the code to minimise use of temporary variables
> and maximize the use of nesting expressions. People who hate five or
> ten close parentheses in a row might choose to leave the code the
> original way. People who are comfortable with all those consecutive
> close parens might choose to refactor the code for maximum nesting.
> But the reader of your tutorial ought to see that it can be done
> either way and it's the option of the programmer which to use in
> the production code. Neither way is *wrong*, and both are about
> equally efficient.
>
> More detail: Very first draft uses SETQ instead of LET, so that we
> can debug each single line of code by itself. Then we refactor to
> use LET. Then we either leave it that way or refactor again to do
> it your fully-nested no-LET way.
>
> Note: TCONC holds the values of headPtrName and tailPtrName in a
> single CONS cell. Then you don't need a macro, an ordinary function
> will do, but then (at least the very first time) you need to assign
> the result of the TCONC call.
>
> There's a sorta philosophic difference to worry about:
> You need a special case for first item added to the list.
> There are (at least) three ways to deal with that special case:
> - The first time you add to the list, you use different code from
> the later times. That's how you did it in your example.
> - *Before* the first time, you do an initialization step. Then
> *every* time you add to the list, even the first time, the code
> is uniform, but you need a test in the uniform code to detect
> whether this is the first time. This is how TCONC works.
> - Before you start, you use special code to make a list with an
> element you don't really need. Then no special case is needed in
> the regular code at all, but at the very end you need to do CDR to
> get past that extra element...
>
> read more »
Oh god, will this ever end? :-)
I will remove the cute stuff (like list-surgery) and correct other
terminology. I agree that when I was writing on the various ways of
growing lists I felt I was on very shaky ground. I think that going
into the lisp lists memory layout would be too far outside this recipe
topic. I will look into incorporating your suggestions regarding the
diagrams, and I am glad you did not find any horrible
misconceptions.
As for the drawing software, it is from Touretzky's web site:
http://www.cs.cmu.edu/~dst/Lisp/sdraw/sdraw.generic
Pascal Bourgignon also has a drawing tool:
http://darcs.informatimago.com/lisp/common-lisp/cons-to-ascii.lisp
See the thread "List diagrams -- Siebel and Touretzky draw them
differently" where he showed some examples.
Mirko