From: Ryan C. Tarpine
Subject: Function reusing variables
Date: 
Message-ID: <Pine.LNX.4.58-035.0402212329050.3192@unix44.andrew.cmu.edu>
I've been doing several of the exercises in Touretzky's Gentle Intro to
Symbolic Computation.  For one, I need to count the number of each type of
DNA base in a (single or double) strand.

Strands are represented either as '(A T A C G) or as doubles
'((A T) (T A) (A T) (C  G) (G C)).

I wrote the following function to do this:
(defun count-bases (strand)
  (let ((answer '((a 0) (t 0) (g 0) (c 0))))
    (dolist (base strand answer)
      (cond ((atom base)
             (incf (second (assoc base answer))))
            (t (mapcar
                (lambda (b) (incf (second (assoc b answer))))
                base))))))

Due to something obvious I'm missing, each call to count-bases seems to
reuse the same answer list so the result counts increase every time.

I just started learning Lisp a few days ago, so I don't know what I'm
doing wrong.  What's going on here?

Thanks,
Ryan

--
Spelling is a lossed art.

From: Matthew Danish
Subject: Re: Function reusing variables
Date: 
Message-ID: <20040222073736.GU8667@mapcar.org>
On Sun, Feb 22, 2004 at 12:53:10AM -0500, Ryan C. Tarpine wrote:
> I've been doing several of the exercises in Touretzky's Gentle Intro to
> Symbolic Computation.  For one, I need to count the number of each type of
> DNA base in a (single or double) strand.
> 
> Strands are represented either as '(A T A C G) or as doubles
> '((A T) (T A) (A T) (C  G) (G C)).
> 
> I wrote the following function to do this:
> (defun count-bases (strand)
>   (let ((answer '((a 0) (t 0) (g 0) (c 0))))
>     (dolist (base strand answer)
>       (cond ((atom base)
>              (incf (second (assoc base answer))))
>             (t (mapcar
>                 (lambda (b) (incf (second (assoc b answer))))
>                 base))))))
> 
> Due to something obvious I'm missing, each call to count-bases seems to
> reuse the same answer list so the result counts increase every time.

The reason is that the list you are using to count up values is a quoted
literal.  The compiler is allowed (required?) to use the actual object
((A 0) (T 0) (G 0) (C 0)) that was originally created by READ when you
run your program.  So basically, you keep modifying the same object
everytime you run the program (what you observed).  Modifying a quoted
literal is considered to be undefined behavior, this is one of the
possible effects of doing so.  Your problems would be easily solved by
creating a fresh list each time the function is called, so you would
write (list (list 'a 0) (list 'b 0) (list 'c 0) (list 'd 0)).
Alternatively, you can just represent the counts as ordinary lexical
variables and do some error-checking of the strand at the same time
(it'll be more efficient likely, since the lexical variables can be put
into registers).  If there were lots of potential keys to work with,
then you'd be better off with a hash-table.  I advise using a local
function that case-switches on the bases, also, so you don't repeat that
code twice (and you can feed it to MAPCAR too).

-- 
; Matthew Danish <·······@andrew.cmu.edu>
; OpenPGP public key: C24B6010 on keyring.debian.org
; Signed or encrypted mail welcome.
; "There is no dark side of the moon really; matter of fact, it's all dark."
From: Harald Hanche-Olsen
Subject: Re: Function reusing variables
Date: 
Message-ID: <pco65dz1h6q.fsf@thoth.math.ntnu.no>
+ Matthew Danish <·······@andrew.cmu.edu>:

| On Sun, Feb 22, 2004 at 12:53:10AM -0500, Ryan C. Tarpine wrote:
| > (defun count-bases (strand)
| >   (let ((answer '((a 0) (t 0) (g 0) (c 0))))
| > [...]
| > Due to something obvious I'm missing, each call to count-bases seems to
| > reuse the same answer list so the result counts increase every time.
| 
| The reason is that the list you are using to count up values is a quoted
| literal.  The compiler is allowed (required?) to use the actual object
| ((A 0) (T 0) (G 0) (C 0)) that was originally created by READ when you
| run your program.  [... excellent advice follows ...]

I found myself wondering if one couldn't use backquote instead.  So I
read a bit the HyperSpec, which seemed to confirm this notion.  Then I
tried the following experiment (in cmucl):

CL-USER>
(defun foo ()
  (let ((a `(0)))
    (incf (car a))))
FOO
CL-USER>
(foo)
1
CL-USER>
(foo)
2

Oops, not at all what I had expected.  And yet, I read the HyperSpec
as saying that `(0) should be interpreted as (append (list '0) nil),
or something operationally equivalent to that (i.e., (list 0)).

Am I wrong, or is cmucl wrong?

And if I am right, would this still be bad style?
I suppose (mapcar (lambda (b) (list b 0)) '(a t g c))
or even (loop for b in '(a t g c) collect (list b 0))
might convey the intended result more effectively to the reader.

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- Debating gives most of us much more psychological satisfaction
  than thinking does: but it deprives us of whatever chance there is
  of getting closer to the truth.  -- C.P. Snow
From: Kalle Olavi Niemitalo
Subject: Re: Function reusing variables
Date: 
Message-ID: <87fzd3ie3g.fsf@Astalo.kon.iki.fi>
Matthew Danish <·······@andrew.cmu.edu> writes:

> Your problems would be easily solved by creating a fresh
> list each time the function is called, so you would write
> (list (list 'a 0) (list 'b 0) (list 'c 0) (list 'd 0)).

Or (copy-tree '((a 0) (t 0) (g 0) (c 0))).
From: Harald Hanche-Olsen
Subject: Re: Function reusing variables
Date: 
Message-ID: <pcovflzz5mx.fsf@thoth.math.ntnu.no>
+ Matthew Danish <·······@andrew.cmu.edu>:

| On Sun, Feb 22, 2004 at 12:53:10AM -0500, Ryan C. Tarpine wrote:
| > (defun count-bases (strand)
| >   (let ((answer '((a 0) (t 0) (g 0) (c 0))))
| > [...]
| > Due to something obvious I'm missing, each call to count-bases seems to
| > reuse the same answer list so the result counts increase every time.
| 
| The reason is that the list you are using to count up values is a quoted
| literal.  The compiler is allowed (required?) to use the actual object
| ((A 0) (T 0) (G 0) (C 0)) that was originally created by READ when you
| run your program.  [... excellent advice follows ...]

I found myself wondering if one couldn't use backquote instead.  So I
read a bit the HyperSpec, which seemed to confirm this notion.  Then I
tried the following experiment (in cmucl):

CL-USER>
(defun foo ()
  (let ((a `(0)))
    (incf (car a))))
FOO
CL-USER>
(foo)
1
CL-USER>
(foo)
2

Oops, not at all what I had expected.  And yet, I read the HyperSpec
as saying that `(0) should be interpreted as (append (list '0) nil),
or something operationally equivalent to that (i.e., (list 0)).

Am I wrong, or is cmucl wrong?

And if I am right, would this still be bad style?  Maybe so.  I
suppose using copy-tree, as Kalle Olavi Niemitalo suggested, is
clearer.  But I am still curious about the above issue.

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- Debating gives most of us much more psychological satisfaction
  than thinking does: but it deprives us of whatever chance there is
  of getting closer to the truth.  -- C.P. Snow
From: Lars Brinkhoff
Subject: Re: Function reusing variables
Date: 
Message-ID: <85y8qvjohe.fsf@junk.nocrew.org>
Harald Hanche-Olsen <······@math.ntnu.no> writes:
> I read the HyperSpec as saying that `(0) should be interpreted as
> (append (list '0) nil), or something operationally equivalent to
> that (i.e., (list 0)).

Or anything EQUAL to that, like '(0).

-- 
Lars Brinkhoff,         Services for Unix, Linux, GCC, HTTP
Brinkhoff Consulting    http://www.brinkhoff.se/
From: Harald Hanche-Olsen
Subject: Re: Function reusing variables
Date: 
Message-ID: <pcoishzz23v.fsf@thoth.math.ntnu.no>
+ Lars Brinkhoff <·········@nocrew.org>:

| Harald Hanche-Olsen <······@math.ntnu.no> writes:
| > I read the HyperSpec as saying that `(0) should be interpreted as
| > (append (list '0) nil), or something operationally equivalent to
| > that (i.e., (list 0)).
| 
| Or anything EQUAL to that, like '(0).

Hmm.  Yes, the HyperSpec says that, but then it goes on

  provided that the side-effect behavior of the substitute form F2 is
  also consistent with the description given above.

I guess this is what had me confused.  Had I but noticed the very next
sentence:

  The constructed copy of the template might or might not share list
  structure with the template itself.

then I would have realized that your are right, and the meaning of the
previous sentence is different than I had taken it.  Presumably, it is
only about the side-effects from evaluating the form, not about how
its value is affected by the side-effects from other parts of the
program.

Just goes to show that the HyperSpec needs to be read with great care.

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- Debating gives most of us much more psychological satisfaction
  than thinking does: but it deprives us of whatever chance there is
  of getting closer to the truth.  -- C.P. Snow
From: Ryan C. Tarpine
Subject: Re: Function reusing variables
Date: 
Message-ID: <Pine.LNX.4.58-035.0402221309160.4768@unix44.andrew.cmu.edu>
On Sun, 22 Feb 2004, Matthew Danish wrote:

> On Sun, Feb 22, 2004 at 12:53:10AM -0500, Ryan C. Tarpine wrote:
>
> The reason is that the list you are using to count up values is a quoted
> literal.  The compiler is allowed (required?) to use the actual object
> ((A 0) (T 0) (G 0) (C 0)) that was originally created by READ when you
> run your program.  So basically, you keep modifying the same object
> everytime you run the program (what you observed).  Modifying a quoted
> literal is considered to be undefined behavior, this is one of the
> possible effects of doing so.  Your problems would be easily solved by
> creating a fresh list each time the function is called, so you would
> write (list (list 'a 0) (list 'b 0) (list 'c 0) (list 'd 0)).
> ...snip useful advice...

Thanks! That's just what I needed to know.  If this wasn't just a little
exercise I would try out your other suggestions as well.  I don't know
enough of the language yet to do much.  I've seen LABELS for making local
function defs but I don't know how to do hashtables.

Ryan
From: Alan Crowe
Subject: Re: Function reusing variables
Date: 
Message-ID: <86ptc48flw.fsf@cawtech.freeserve.co.uk>
Matthew Danish advised: 
> Alternatively, you can just represent the counts as
> ordinary lexical variables and do some error-checking of
> the strand at the same time (it'll be more efficient
> likely, since the lexical variables can be put into
> registers).  If there were lots of potential keys to work
> with, then you'd be better off with a hash-table.

I wonder what the list thinks of using dynamic variables,
thus:

(shadow 't); liberate t to stand for Thymine
(defconstant true cl:t); Don't want to write cl:t
;; for true

(defun count-bases (strand)
  (let ((a 0)(c 0)(g 0)(t 0))
    (declare (special a c g t))
    (dolist (base strand)
	    (cond ((atom base)
		   (incf (symbol-value base)))
		  (true
		   (mapcar
		    (lambda (b)
		      (incf (symbol-value b)))
		    base))))
    (list a c g t)))

(count-bases '((a t)(c g) t t c))
=> (1 2 1 3)

Apologies to Ryan C. Tarpine, because shadowing t is an
advanced (insane?) technique, shocking to those wanting a
gentle introduction to Lisp. On the other hand, every-one
has a nasty shock when they transcribe an equation out of a
book, full of x,y,t,u, and v only to find that cl:t is a
pervasive constant, and not available as a variable name.

Alan Crowe
Edinburgh
Scotland
From: Pascal Costanza
Subject: Re: Function reusing variables
Date: 
Message-ID: <c1i75a$1tt$1@newsreader2.netcologne.de>
Alan Crowe wrote:

> Matthew Danish advised: 
> 
>>Alternatively, you can just represent the counts as
>>ordinary lexical variables and do some error-checking of
>>the strand at the same time (it'll be more efficient
>>likely, since the lexical variables can be put into
>>registers).  If there were lots of potential keys to work
>>with, then you'd be better off with a hash-table.
> 
> 
> I wonder what the list thinks of using dynamic variables,
> thus:
[...]

I don't think that's a good idea. The symbols you declare specially 
become global entities although you are only using them strictly 
locally. It makes sense to declare variables special even when they are 
not global for example when you want to keep intermediate results but 
restrict them to code that knows about those variables. In your code, 
the bindings of variables only exist during base counting are discarded 
afterwards. However, the symbols still exist then.

Of course, there is nothing seriously wrong here. The use of 
SYMBOL-VALUE in your code is cute, but you can get a similar effect by 
using property lists via GETF or by accessing slots in classes via 
SLOT-VALUE. This also has the advantage that you don't need to shadow T.


Pascal

> (shadow 't); liberate t to stand for Thymine
> (defconstant true cl:t); Don't want to write cl:t
> ;; for true
> 
> (defun count-bases (strand)
>   (let ((a 0)(c 0)(g 0)(t 0))
>     (declare (special a c g t))
>     (dolist (base strand)
> 	    (cond ((atom base)
> 		   (incf (symbol-value base)))
> 		  (true
> 		   (mapcar
> 		    (lambda (b)
> 		      (incf (symbol-value b)))
> 		    base))))
>     (list a c g t)))
> 
> (count-bases '((a t)(c g) t t c))
> => (1 2 1 3)
> 
> Apologies to Ryan C. Tarpine, because shadowing t is an
> advanced (insane?) technique, shocking to those wanting a
> gentle introduction to Lisp. On the other hand, every-one
> has a nasty shock when they transcribe an equation out of a
> book, full of x,y,t,u, and v only to find that cl:t is a
> pervasive constant, and not available as a variable name.
> 
> Alan Crowe
> Edinburgh
> Scotland

-- 
Tyler: "How's that working out for you?"
Jack: "Great."
Tyler: "Keep it up, then."