I'm teaching myself Common Lisp and I am trying to do something I know
how to do in Scheme but I would like to know the True Path. Simply, I
want to read a file of alphabetically-sorted words into a vector so I
can do binary searches. It's simple enough to build a list (sorry about
the lack of indentation; don't know how to force spaces in the Google
Groups text box):
(defun build-dict (filename)
(let ((dict '()))
(with-open-file (str filename :direction :input)
(do ((line (read-line str nil 'eof)
(read-line str nil 'eof)))
((eql line 'eof))
(push line dict)))
(reverse dict)))
In Scheme I would probably do something like (list->vector (reverse
dict)) but is there a way to build the vector directly without knowing
the size in advance? Or is there a more "CL" way to do this?
"Bruce Butterfield" <···@entricom.com> wrote in message
·····························@c13g2000cwb.googlegroups.com...
> I'm teaching myself Common Lisp and I am trying to do something I know
> how to do in Scheme but I would like to know the True Path. Simply, I
> want to read a file of alphabetically-sorted words into a vector so I
> can do binary searches. It's simple enough to build a list (sorry about
> the lack of indentation; don't know how to force spaces in the Google
> Groups text box):
>
> (defun build-dict (filename)
> (let ((dict '()))
> (with-open-file (str filename :direction :input)
> (do ((line (read-line str nil 'eof)
> (read-line str nil 'eof)))
> ((eql line 'eof))
> (push line dict)))
> (reverse dict)))
>
> In Scheme I would probably do something like (list->vector (reverse
> dict)) but is there a way to build the vector directly without knowing
> the size in advance? Or is there a more "CL" way to do this?
(coerce (nreverse dict) 'vector) will do this - nreverse instead of reverse
is ok because you are building the list from scratch.
Jim Bushnell
"Bruce Butterfield" <···@entricom.com> writes:
> I thought that 'coerce' was one of those 'big-hammer' functions like
> eval to be avoided whenever possible. Not in CL?
Depends how you use it. (coerce '(lambda (x) x) 'function) is a big
hammer, because it is EVAL (or COMPILE). (coerce '(1 2 3) 'vector)
isn't so bad, because it's just converting a list to a vector; and any
decent CL compiler will recognize this statically. You will learn how
to write this kind of optimization for your own functions too, at some
point: it's called a compiler-macro (and CL implementations usually
have even stronger user-definable optimization macros, like Python's
DEFTRANSFORM).
As far as your original question goes, though, NREVERSEing a list
isn't so bad, nor is LOOP...COLLECTING it; you can also experiment
using VECTOR-PUSH-EXTEND on an adjustable array with fill-pointer.
Relative efficiency may depend on your implementation.
--
;; Matthew Danish -- user: mrd domain: cmu.edu
;; OpenPGP public key: C24B6010 on keyring.debian.org
Matthew Danish <··········@cmu.edu> writes:
> As far as your original question goes, though, NREVERSEing a list
> isn't so bad, nor is LOOP...COLLECTING it; you can also experiment
> using VECTOR-PUSH-EXTEND on an adjustable array with fill-pointer.
> Relative efficiency may depend on your implementation.
push+nreverse has exactly the same space and time complexity than
appending to a list keeping a pointer to the last cons.
--
__Pascal Bourguignon__ http://www.informatimago.com/
You're always typing.
Well, let's see you ignore my
sitting on your hands.
Pascal Bourguignon <····@mouse-potato.com> writes:
> push+nreverse has exactly the same space and time complexity than
> appending to a list keeping a pointer to the last cons.
I can see how this is so for each element added, but doesn't that
final NREVERSE, being O(N), have to be taken into account here?
--
Steven E. Harris
start quoting Steven E. Harris :
> Pascal Bourguignon <····@mouse-potato.com> writes:
>
>> push+nreverse has exactly the same space and time complexity than
>> appending to a list keeping a pointer to the last cons.
>
> I can see how this is so for each element added, but doesn't that
> final NREVERSE, being O(N), have to be taken into account here?
>
2*O(N) is stilll O(N), as you know.
He only referred to algorithmic complexity; you could repeat the nreverse
call 999 times, still get the same result, and still have O(N) complexity.
The important bit is that nreversing the list after constructing it uses the
same number of memory accesses as constructing it as we go would; however,
it's a 2-pass algorithm, and will therefore exhibit worse cache locality
than the effectively identical 1-pass algorithm.
"Steven E. Harris" <···@panix.com> writes:
> Pascal Bourguignon <····@mouse-potato.com> writes:
>
> > push+nreverse has exactly the same space and time complexity than
> > appending to a list keeping a pointer to the last cons.
>
> I can see how this is so for each element added, but doesn't that
> final NREVERSE, being O(N), have to be taken into account here?
O(N)+O(N)=O(N)
The constants change, there's 4n assignments instead of 2n
assignments, but the additional assignments with NREVERSE can be done
in registers so the only difference out of the processor will be with
respect to the cache size (and the order of the cells in the resulting
list).
--
__Pascal Bourguignon__ http://www.informatimago.com/
I need a new toy.
Tail of black dog keeps good time.
Pounce! Good dog! Good dog!
Pascal Bourguignon <····@mouse-potato.com> writes:
> The constants change, there's 4n assignments instead of 2n
> assignments,
Yes, that's what I was really bickering with. I asked the question
poorly.
> but the additional assignments with NREVERSE can be done
> in registers so the only difference out of the processor will be with
> respect to the cache size
Funny, but wasn't Svein Ove Aas just arguing the converse in a
parallel message?� Or, if not the exact same point about register use,
that the one-pass algorithm was likely to be more efficient? I'm not
taking sides.
Footnotes:
� <············@services.kq.no>
--
Steven E. Harris
"Steven E. Harris" <···@panix.com> writes:
> Pascal Bourguignon <····@mouse-potato.com> writes:
>
> > The constants change, there's 4n assignments instead of 2n
> > assignments,
>
> Yes, that's what I was really bickering with. I asked the question
> poorly.
>
> > but the additional assignments with NREVERSE can be done
> > in registers so the only difference out of the processor will be with
> > respect to the cache size
>
> Funny, but wasn't Svein Ove Aas just arguing the converse in a
> parallel message?� Or, if not the exact same point about register use,
> that the one-pass algorithm was likely to be more efficient? I'm not
> taking sides.
>
>
> Footnotes:
> � <············@services.kq.no>
Yes. We're both right. What I say is that we should get the same
number of memory access. What he say is that the _order_ of these
accesses will favor the one-pass one.
--
__Pascal Bourguignon__ http://www.informatimago.com/
The mighty hunter
Returns with gifts of plump birds,
Your foot just squashed one.
start quoting Pascal Bourguignon :
> Matthew Danish <··········@cmu.edu> writes:
>> As far as your original question goes, though, NREVERSEing a list
>> isn't so bad, nor is LOOP...COLLECTING it; you can also experiment
>> using VECTOR-PUSH-EXTEND on an adjustable array with fill-pointer.
>> Relative efficiency may depend on your implementation.
>
> push+nreverse has exactly the same space and time complexity than
> appending to a list keeping a pointer to the last cons.
>
I'd be more worried about constant factors, and those are only identical so
long as you don't cross any cache lines. Once you do, you'll find that
keeping a pointer means fewer cache misses, because you don't need to walk
the list twice. (Which would potentially mean fetching the memory twice.)
That said, the difference is *usually* hard to spot; there are a lot of
other ways to optimize that are more likely to give good results.
Bruce Butterfield wrote:
> In Scheme I would probably do something like (list->vector (reverse
> dict)) but is there a way to build the vector directly without knowing
> the size in advance? Or is there a more "CL" way to do this?
>
By using an adjustable array with a fill-pointer.
(defun build-dict (filename)
(let ((dict (make-array 64 :adjustable t :fill-pointer 0)))
(with-open-file (str filename :direction :input)
(do ((line (read-line str nil 'eof)
(read-line str nil 'eof)))
((eql line 'eof))
(vector-push-extend line dict)))
dict))
Wade
Bruce Butterfield wrote:
> I'm teaching myself Common Lisp and I am trying to do something I know
> how to do in Scheme but I would like to know the True Path. Simply, I
> want to read a file of alphabetically-sorted words into a vector so I
> can do binary searches. It's simple enough to build a list (sorry about
> the lack of indentation; don't know how to force spaces in the Google
> Groups text box):
>
> (defun build-dict (filename)
> (let ((dict '()))
> (with-open-file (str filename :direction :input)
> (do ((line (read-line str nil 'eof)
> (read-line str nil 'eof)))
> ((eql line 'eof))
> (push line dict)))
> (reverse dict)))
>
> In Scheme I would probably do something like (list->vector (reverse
> dict)) but is there a way to build the vector directly without knowing
> the size in advance? Or is there a more "CL" way to do this?
What about this:
(with-open-file (str filename :direction :input)
(loop for line = (read-line str nil 'eof)
until (eql line 'eof)
collect line into dict
finally (return (apply #'vector dict))))
or:
(with-open-file (str filename :direction :input)
(loop for line = (read-line str nil 'eof)
until (eql line 'eof)
count t into line-count
collect line into dict
finally (return
(make-array line-count :initial-contents dict))))
[untested code]
Pascal
start quoting Bruce Butterfield :
> I'm teaching myself Common Lisp and I am trying to do something I know
> how to do in Scheme but I would like to know the True Path. Simply, I
> want to read a file of alphabetically-sorted words into a vector so I
> can do binary searches. It's simple enough to build a list (sorry about
> the lack of indentation; don't know how to force spaces in the Google
> Groups text box):
Have you tried prepending the lines with semicolons?
> (defun build-dict (filename)
> (let ((dict '()))
> (with-open-file (str filename :direction :input)
> (do ((line (read-line str nil 'eof)
> (read-line str nil 'eof)))
> ((eql line 'eof))
> (push line dict)))
> (reverse dict)))
>
> In Scheme I would probably do something like (list->vector (reverse
> dict)) but is there a way to build the vector directly without knowing
> the size in advance? Or is there a more "CL" way to do this?
You can turn a list into a vector with the "coerce" function.
You could also build a vector directly if you use :adjustable t
and :fill-pointer t as arguments to make-array; use vector-push-extend
instead of push. I wouldn't make any bets on it being faster than coerce,
but if you know how large the vector should be it would be.
Here's my version; iterator->list is from my personal utility collection, so
wouldn't normally be visible.
(defun build-dict (filename)
(with-open-file (s filename :direction :input)
(flet ((next ()
(read-line s :eof-error-p nil :eof-value 'eof))
(iterator->list (iterator)
(loop for cell = (funcall iterator) then (funcall iterator)
while cell
collect cell)))
(coerce (iterator->list #'next) 'vector))))
Svein Ove Aas <·········@aas.no> writes:
> You could also build a vector directly if you use :adjustable t
> and :fill-pointer t as arguments to make-array; use vector-push-extend
> instead of push.
If using vector-push-extend, it would probably be better use 0 as the
fill-pointer argument.
The following is something I wrote recently to load words of a
specified length from a dictionary file:
(defun load-words (file length)
(let ((words (make-array 25000 :adjustable t :fill-pointer 0)))
(with-open-file (stream file :direction :input)
(loop for word = (read-line stream nil stream)
until (eq word stream)
when (= (length word) length)
do (vector-push-extend word words)))
words))
Zach
Svein Ove Aas <·········@aas.no> writes:
> start quoting Bruce Butterfield :
>
>> I'm teaching myself Common Lisp and I am trying to do something I know
>> how to do in Scheme but I would like to know the True Path. Simply, I
>> want to read a file of alphabetically-sorted words into a vector so I
>> can do binary searches. It's simple enough to build a list (sorry about
>> the lack of indentation; don't know how to force spaces in the Google
>> Groups text box):
Heh. I wonder how the folks over on comp.lang.python are digging that
feature. >:-)
> You could also build a vector directly if you use :adjustable t and
> :fill-pointer t as arguments to make-array; use vector-push-extend
> instead of push.
You'll probably want :fill-pointer 0, not t, unless the initial
dimension of the array is 0.
-Peter
--
Peter Seibel ·····@javamonkey.com
Lisp is the red pill. -- John Fraser, comp.lang.lisp
start quoting Svein Ove Aas :
> Here's my version; iterator->list is from my personal utility collection,
> so wouldn't normally be visible.
>
> (defun build-dict (filename)
> (with-open-file (s filename :direction :input)
> (flet ((next ()
> (read-line s :eof-error-p nil :eof-value 'eof))
> (iterator->list (iterator)
> (loop for cell = (funcall iterator) then (funcall iterator)
> while cell
> collect cell)))
>
> (coerce (iterator->list #'next) 'vector))))
:eof-value should of course be nil, not 'eof. Doh.