CL idioms

From: Bruce Butterfield
Subject: CL idioms
Date: Thu, 13 Jan 2005 17:12:35 +0000
Message-ID: <1105636355.685582.314270@c13g2000cwb.googlegroups.com>

I'm teaching myself Common Lisp and I am trying to do something I know
how to do in Scheme but I would like to know the True Path. Simply, I
want to read a file of alphabetically-sorted words into a vector so I
can do binary searches. It's simple enough to build a list (sorry about
the lack of indentation; don't know how to force spaces in the Google
Groups text box):

(defun build-dict (filename)
(let ((dict '()))
(with-open-file (str filename :direction :input)
(do ((line (read-line str nil 'eof)
(read-line str nil 'eof)))
((eql line 'eof))
(push line dict)))
(reverse dict)))

In Scheme I would probably do something like (list->vector (reverse
dict)) but is there a way to build the vector directly without knowing
the size in advance? Or is there a more "CL" way to do this?

Re: CL idioms Anne & Jim Bushnell
- Re: CL idioms Bruce Butterfield
  - Re: CL idioms Matthew Danish
    - Re: CL idioms Pascal Bourguignon
      - Re: CL idioms Steven E. Harris
        Re: CL idioms Svein Ove Aas
        Re: CL idioms Pascal Bourguignon
        Re: CL idioms Steven E. Harris
        Re: CL idioms Pascal Bourguignon
      - Re: CL idioms Svein Ove Aas
Re: CL idioms Wade Humeniuk
Re: CL idioms Pascal Costanza
Re: CL idioms Svein Ove Aas
- Re: CL idioms Zach Beane
- Re: CL idioms Peter Seibel
- Re: CL idioms Svein Ove Aas
  - Re: CL idioms David Sletten
    - Re: CL idioms Svein Ove Aas
      - Re: CL idioms David Sletten

From: Anne & Jim Bushnell
Subject: Re: CL idioms
Date: Thu, 13 Jan 2005 17:32:25 +0000
Message-ID: <o9udneAYs_s6LXvcRVn-hw@comcast.com>

"Bruce Butterfield" <···@entricom.com> wrote in message
·····························@c13g2000cwb.googlegroups.com...
> I'm teaching myself Common Lisp and I am trying to do something I know
> how to do in Scheme but I would like to know the True Path. Simply, I
> want to read a file of alphabetically-sorted words into a vector so I
> can do binary searches. It's simple enough to build a list (sorry about
> the lack of indentation; don't know how to force spaces in the Google
> Groups text box):
>
> (defun build-dict (filename)
> (let ((dict '()))
> (with-open-file (str filename :direction :input)
> (do ((line (read-line str nil 'eof)
> (read-line str nil 'eof)))
> ((eql line 'eof))
> (push line dict)))
> (reverse dict)))
>
> In Scheme I would probably do something like (list->vector (reverse
> dict)) but is there a way to build the vector directly without knowing
> the size in advance? Or is there a more "CL" way to do this?

(coerce (nreverse dict) 'vector) will do this - nreverse instead of reverse
is ok because you are building the list from scratch.

Jim Bushnell

From: Bruce Butterfield
Subject: Re: CL idioms
Date: Thu, 13 Jan 2005 17:47:38 +0000
Message-ID: <1105638458.952024.296090@f14g2000cwb.googlegroups.com>

I thought that 'coerce' was one of those 'big-hammer' functions like
eval to be avoided whenever possible. Not in CL?

From: Matthew Danish
Subject: Re: CL idioms
Date: Thu, 13 Jan 2005 19:17:56 +0000
Message-ID: <87sm55nnob.fsf@mapcar.org>

"Bruce Butterfield" <···@entricom.com> writes:
> I thought that 'coerce' was one of those 'big-hammer' functions like
> eval to be avoided whenever possible. Not in CL?

Depends how you use it.  (coerce '(lambda (x) x) 'function) is a big
hammer, because it is EVAL (or COMPILE).  (coerce '(1 2 3) 'vector)
isn't so bad, because it's just converting a list to a vector; and any
decent CL compiler will recognize this statically.  You will learn how
to write this kind of optimization for your own functions too, at some
point: it's called a compiler-macro (and CL implementations usually
have even stronger user-definable optimization macros, like Python's
DEFTRANSFORM).

As far as your original question goes, though, NREVERSEing a list
isn't so bad, nor is LOOP...COLLECTING it; you can also experiment
using VECTOR-PUSH-EXTEND on an adjustable array with fill-pointer.
Relative efficiency may depend on your implementation.

-- 
;; Matthew Danish -- user: mrd domain: cmu.edu
;; OpenPGP public key: C24B6010 on keyring.debian.org

From: Pascal Bourguignon
Subject: Re: CL idioms
Date: Fri, 14 Jan 2005 14:46:31 +0000
Message-ID: <87u0pk9igo.fsf@thalassa.informatimago.com>

Matthew Danish <··········@cmu.edu> writes:
> As far as your original question goes, though, NREVERSEing a list
> isn't so bad, nor is LOOP...COLLECTING it; you can also experiment
> using VECTOR-PUSH-EXTEND on an adjustable array with fill-pointer.
> Relative efficiency may depend on your implementation.

push+nreverse has exactly the same space and time complexity than
appending to a list keeping a pointer to the last cons.
     

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
You're always typing.
Well, let's see you ignore my
sitting on your hands.

From: Steven E. Harris
Subject: Re: CL idioms
Date: Fri, 14 Jan 2005 17:41:19 +0000
Message-ID: <jk4is5zucw0.fsf@W003275.na.alarismed.com>

Pascal Bourguignon <····@mouse-potato.com> writes:

> push+nreverse has exactly the same space and time complexity than
> appending to a list keeping a pointer to the last cons.

I can see how this is so for each element added, but doesn't that
final NREVERSE, being O(N), have to be taken into account here?

-- 
Steven E. Harris

From: Svein Ove Aas
Subject: Re: CL idioms
Date: Fri, 14 Jan 2005 18:25:16 +0000
Message-ID: <cs92pr$mj4$1@services.kq.no>

start quoting Steven E. Harris :

> Pascal Bourguignon <····@mouse-potato.com> writes:
> 
>> push+nreverse has exactly the same space and time complexity than
>> appending to a list keeping a pointer to the last cons.
> 
> I can see how this is so for each element added, but doesn't that
> final NREVERSE, being O(N), have to be taken into account here?
> 
2*O(N) is stilll O(N), as you know.

He only referred to algorithmic complexity; you could repeat the nreverse
call 999 times, still get the same result, and still have O(N) complexity.

The important bit is that nreversing the list after constructing it uses the
same number of memory accesses as constructing it as we go would; however,
it's a 2-pass algorithm, and will therefore exhibit worse cache locality
than the effectively identical 1-pass algorithm.

From: Pascal Bourguignon
Subject: Re: CL idioms
Date: Fri, 14 Jan 2005 18:53:03 +0000
Message-ID: <878y6valm8.fsf@thalassa.informatimago.com>

"Steven E. Harris" <···@panix.com> writes:

> Pascal Bourguignon <····@mouse-potato.com> writes:
> 
> > push+nreverse has exactly the same space and time complexity than
> > appending to a list keeping a pointer to the last cons.
> 
> I can see how this is so for each element added, but doesn't that
> final NREVERSE, being O(N), have to be taken into account here?

O(N)+O(N)=O(N)

The constants change, there's 4n assignments instead of 2n
assignments, but the additional assignments with NREVERSE can be done
in registers so the only difference out of the processor will be with
respect to the cache size (and the order of the cells in the resulting
list).

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
I need a new toy.
Tail of black dog keeps good time.
Pounce! Good dog! Good dog!

From: Steven E. Harris
Subject: Re: CL idioms
Date: Fri, 14 Jan 2005 21:28:57 +0000
Message-ID: <jk41xcnu2cm.fsf@W003275.na.alarismed.com>

Pascal Bourguignon <····@mouse-potato.com> writes:

> The constants change, there's 4n assignments instead of 2n
> assignments,

Yes, that's what I was really bickering with. I asked the question
poorly.

> but the additional assignments with NREVERSE can be done
> in registers so the only difference out of the processor will be with
> respect to the cache size

Funny, but wasn't Svein Ove Aas just arguing the converse in a
parallel message?� Or, if not the exact same point about register use,
that the one-pass algorithm was likely to be more efficient? I'm not
taking sides.

Footnotes: 
� <············@services.kq.no>

-- 
Steven E. Harris

From: Pascal Bourguignon
Subject: Re: CL idioms
Date: Sat, 15 Jan 2005 00:30:47 +0000
Message-ID: <87651z8rew.fsf@thalassa.informatimago.com>

"Steven E. Harris" <···@panix.com> writes:

> Pascal Bourguignon <····@mouse-potato.com> writes:
> 
> > The constants change, there's 4n assignments instead of 2n
> > assignments,
> 
> Yes, that's what I was really bickering with. I asked the question
> poorly.
> 
> > but the additional assignments with NREVERSE can be done
> > in registers so the only difference out of the processor will be with
> > respect to the cache size
> 
> Funny, but wasn't Svein Ove Aas just arguing the converse in a
> parallel message?� Or, if not the exact same point about register use,
> that the one-pass algorithm was likely to be more efficient? I'm not
> taking sides.
> 
> 
> Footnotes: 
> � <············@services.kq.no>

Yes. We're both right. What I say is that we should get the same
number of memory access.  What he say is that the _order_ of these
accesses will favor the one-pass one.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
The mighty hunter
Returns with gifts of plump birds,
Your foot just squashed one.

From: Svein Ove Aas
Subject: Re: CL idioms
Date: Fri, 14 Jan 2005 15:24:33 +0000
Message-ID: <cs8o70$dha$1@services.kq.no>

start quoting Pascal Bourguignon :

> Matthew Danish <··········@cmu.edu> writes:
>> As far as your original question goes, though, NREVERSEing a list
>> isn't so bad, nor is LOOP...COLLECTING it; you can also experiment
>> using VECTOR-PUSH-EXTEND on an adjustable array with fill-pointer.
>> Relative efficiency may depend on your implementation.
> 
> push+nreverse has exactly the same space and time complexity than
> appending to a list keeping a pointer to the last cons.
>      
I'd be more worried about constant factors, and those are only identical so
long as you don't cross any cache lines. Once you do, you'll find that
keeping a pointer means fewer cache misses, because you don't need to walk
the list twice. (Which would potentially mean fetching the memory twice.)

That said, the difference is *usually* hard to spot; there are a lot of
other ways to optimize that are more likely to give good results.

From: Wade Humeniuk
Subject: Re: CL idioms
Date: Thu, 13 Jan 2005 17:31:50 +0000
Message-ID: <agyFd.99500$KO5.50592@clgrps13>

Bruce Butterfield wrote:

> In Scheme I would probably do something like (list->vector (reverse
> dict)) but is there a way to build the vector directly without knowing
> the size in advance? Or is there a more "CL" way to do this?
> 

By using an adjustable array with a fill-pointer.

(defun build-dict (filename)
   (let ((dict (make-array 64 :adjustable t :fill-pointer 0)))
     (with-open-file (str filename :direction :input)
       (do ((line (read-line str nil 'eof)
                  (read-line str nil 'eof)))
           ((eql line 'eof))
         (vector-push-extend line dict)))
     dict))

Wade

From: Pascal Costanza
Subject: Re: CL idioms
Date: Thu, 13 Jan 2005 17:25:46 +0000
Message-ID: <cs6auo$85i$1@snic.vub.ac.be>

Bruce Butterfield wrote:
> I'm teaching myself Common Lisp and I am trying to do something I know
> how to do in Scheme but I would like to know the True Path. Simply, I
> want to read a file of alphabetically-sorted words into a vector so I
> can do binary searches. It's simple enough to build a list (sorry about
> the lack of indentation; don't know how to force spaces in the Google
> Groups text box):
> 
> (defun build-dict (filename)
> (let ((dict '()))
> (with-open-file (str filename :direction :input)
> (do ((line (read-line str nil 'eof)
> (read-line str nil 'eof)))
> ((eql line 'eof))
> (push line dict)))
> (reverse dict)))
> 
> In Scheme I would probably do something like (list->vector (reverse
> dict)) but is there a way to build the vector directly without knowing
> the size in advance? Or is there a more "CL" way to do this?

What about this:

(with-open-file (str filename :direction :input)
   (loop for line = (read-line str nil 'eof)
         until (eql line 'eof)
         collect line into dict
         finally (return (apply #'vector dict))))

or:

(with-open-file (str filename :direction :input)
   (loop for line = (read-line str nil 'eof)
         until (eql line 'eof)
         count t into line-count
         collect line into dict
         finally (return
                   (make-array line-count :initial-contents dict))))


[untested code]


Pascal

From: Svein Ove Aas
Subject: Re: CL idioms
Date: Thu, 13 Jan 2005 17:37:25 +0000
Message-ID: <cs6bk7$rds$1@services.kq.no>

start quoting Bruce Butterfield :

> I'm teaching myself Common Lisp and I am trying to do something I know
> how to do in Scheme but I would like to know the True Path. Simply, I
> want to read a file of alphabetically-sorted words into a vector so I
> can do binary searches. It's simple enough to build a list (sorry about
> the lack of indentation; don't know how to force spaces in the Google
> Groups text box):

Have you tried prepending the lines with semicolons?

> (defun build-dict (filename)
>   (let ((dict '()))
>     (with-open-file (str filename :direction :input)
>       (do ((line (read-line str nil 'eof)
>       (read-line str nil 'eof)))
>      ((eql line 'eof))
>  (push line dict)))
>     (reverse dict)))


> 
> In Scheme I would probably do something like (list->vector (reverse
> dict)) but is there a way to build the vector directly without knowing
> the size in advance? Or is there a more "CL" way to do this?

You can turn a list into a vector with the "coerce" function.

You could also build a vector directly if you use :adjustable t
and :fill-pointer t as arguments to make-array; use vector-push-extend
instead of push. I wouldn't make any bets on it being faster than coerce,
but if you know how large the vector should be it would be.


Here's my version; iterator->list is from my personal utility collection, so
wouldn't normally be visible.

(defun build-dict (filename)
  (with-open-file (s filename :direction :input)
    (flet ((next ()
      (read-line s :eof-error-p nil :eof-value 'eof))
    (iterator->list (iterator)
      (loop for cell = (funcall iterator) then (funcall iterator)
     while cell
     collect cell)))

      (coerce (iterator->list #'next) 'vector))))

From: Zach Beane
Subject: Re: CL idioms
Date: Thu, 13 Jan 2005 17:47:23 +0000
Message-ID: <m3mzvdcjbo.fsf@unnamed.xach.com>

Svein Ove Aas <·········@aas.no> writes:

> You could also build a vector directly if you use :adjustable t
> and :fill-pointer t as arguments to make-array; use vector-push-extend
> instead of push.

If using vector-push-extend, it would probably be better use 0 as the
fill-pointer argument.

The following is something I wrote recently to load words of a
specified length from a dictionary file:

   (defun load-words (file length)
     (let ((words (make-array 25000 :adjustable t :fill-pointer 0)))
       (with-open-file (stream file :direction :input)
         (loop for word = (read-line stream nil stream)
               until (eq word stream)
               when (= (length word) length)
               do (vector-push-extend word words)))
       words))

Zach

From: Peter Seibel
Subject: Re: CL idioms
Date: Thu, 13 Jan 2005 17:49:14 +0000
Message-ID: <m3brbt9q4b.fsf@javamonkey.com>

Svein Ove Aas <·········@aas.no> writes:

> start quoting Bruce Butterfield :
>
>> I'm teaching myself Common Lisp and I am trying to do something I know
>> how to do in Scheme but I would like to know the True Path. Simply, I
>> want to read a file of alphabetically-sorted words into a vector so I
>> can do binary searches. It's simple enough to build a list (sorry about
>> the lack of indentation; don't know how to force spaces in the Google
>> Groups text box):

Heh. I wonder how the folks over on comp.lang.python are digging that
feature. >:-)

> You could also build a vector directly if you use :adjustable t and
> :fill-pointer t as arguments to make-array; use vector-push-extend
> instead of push.

You'll probably want :fill-pointer 0, not t, unless the initial
dimension of the array is 0.

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp

From: Svein Ove Aas
Subject: Re: CL idioms
Date: Thu, 13 Jan 2005 17:55:30 +0000
Message-ID: <cs6cm4$4pv$2@services.kq.no>

start quoting Svein Ove Aas :

> Here's my version; iterator->list is from my personal utility collection,
> so wouldn't normally be visible.
> 
> (defun build-dict (filename)
>   (with-open-file (s filename :direction :input)
>     (flet ((next ()
>       (read-line s :eof-error-p nil :eof-value 'eof))
>     (iterator->list (iterator)
>       (loop for cell = (funcall iterator) then (funcall iterator)
>      while cell
>      collect cell)))
> 
>       (coerce (iterator->list #'next) 'vector))))


:eof-value should of course be nil, not 'eof. Doh.

From: David Sletten
Subject: Re: CL idioms
Date: Thu, 13 Jan 2005 18:31:35 +0000
Message-ID: <b8zFd.66028$gd.58439@twister.socal.rr.com>

Svein Ove Aas wrote:

> start quoting Svein Ove Aas :
> 
> 
>>Here's my version; iterator->list is from my personal utility collection,
>>so wouldn't normally be visible.
>>
>>(defun build-dict (filename)
>>  (with-open-file (s filename :direction :input)
>>    (flet ((next ()
>>      (read-line s :eof-error-p nil :eof-value 'eof))
>>    (iterator->list (iterator)
>>      (loop for cell = (funcall iterator) then (funcall iterator)
>>     while cell
>>     collect cell)))
>>
>>      (coerce (iterator->list #'next) 'vector))))
> 
> 
> 
> :eof-value should of course be nil, not 'eof. Doh.
READ-LINE doesn't take keyword args either:
(read-line s nil nil)

From: Svein Ove Aas
Subject: Re: CL idioms
Date: Thu, 13 Jan 2005 19:17:54 +0000
Message-ID: <cs6hgj$7tu$1@services.kq.no>

start quoting David Sletten :

> Svein Ove Aas wrote:
> 
>> start quoting Svein Ove Aas :
>> 
>> 
>>>Here's my version; iterator->list is from my personal utility collection,
>>>so wouldn't normally be visible.
>>>
>>>(defun build-dict (filename)
>>>  (with-open-file (s filename :direction :input)
>>>    (flet ((next ()
>>>      (read-line s :eof-error-p nil :eof-value 'eof))
>>>    (iterator->list (iterator)
>>>      (loop for cell = (funcall iterator) then (funcall iterator)
>>>     while cell
>>>     collect cell)))
>>>
>>>      (coerce (iterator->list #'next) 'vector))))
>> 
>> 
>> 
>> :eof-value should of course be nil, not 'eof. Doh.
> READ-LINE doesn't take keyword args either:
> (read-line s nil nil)

Let's just call it "homework traps", shall we?
Yeah, that's it.

From: David Sletten
Subject: Re: CL idioms
Date: Thu, 13 Jan 2005 19:24:21 +0000
Message-ID: <FVzFd.59667$nP1.32274@twister.socal.rr.com>

Svein Ove Aas wrote:


> 
> Let's just call it "homework traps", shall we?
> Yeah, that's it.
> 
OK, sorry--I guess I missed that meeting. :)