string -> list

From: Todd McLaughlin
Subject: string -> list
Date: Thu, 01 Oct 1998 00:00:00 +0000
Message-ID: <6uv03e$hpa$2@samba.rahul.net>

I'm reading in a single word from the user with read-line.  How do I 
convert it to a list?  I've been playing with make-symbol without much
luck.  Thanks!

Todd

From: Marc Dzaebel
Subject: Re: string -> list
Date: Thu, 01 Oct 1998 00:00:00 +0000
Message-ID: <361351F7.4577D0E3@rose.de>

Todd McLaughlin wrote:
> I'm reading in a single word from the user with read-line.  How do I
> convert it to a list?

(loop for c across "abc" collect c) -> (#\a #\b #\c)

Is this what you need or do you want to separate words in the string?

Marc Dzaebel    (http://members.xoom.com/dzaebel)
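
If the goal is the second reading of the question, separating words rather than characters, a minimal sketch could look like the following. The name SPLIT-WORDS is hypothetical (it is not a standard function and does not appear in the thread), and only the space character is treated as a separator.

```lisp
;; Hypothetical helper, not from the thread: split LINE on space characters.
(defun split-words (line)
  "Return a list of the space-separated words in LINE, as fresh strings."
  (let ((words '())
        (start 0))
    (loop
      ;; Skip any run of spaces to find where the next word begins.
      (let ((begin (position-if (lambda (c) (char/= c #\Space))
                                line :start start)))
        (when (null begin)
          (return (nreverse words)))
        ;; The word ends at the next space, or at the end of the string.
        (let ((end (or (position #\Space line :start begin)
                       (length line))))
          (push (subseq line begin end) words)
          (setf start end))))))

(split-words "hello lisp world")   ; => ("hello" "lisp" "world")
```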

From: Rainer Joswig
Subject: Re: string -> list
Date: Thu, 01 Oct 1998 00:00:00 +0000
Message-ID: <joswig-0110981214490001@pbg3.lavielle.com>

In article <·················@rose.de>, Marc Dzaebel <····@rose.de> wrote:

> Todd McLaughlin wrote:
> > I'm reading in a single word from the user with read-line.  How do I
> > convert it to a list?
> 
> (loop for c across "abc" collect c) -> (#\a #\b #\c)

(coerce "abc" 'list)

-- 
http://www.lavielle.com/~joswig
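
As a small illustration of the conversions discussed so far, the sketch below wraps COERCE in both directions and also shows one guess at what the original poster may have been after with make-symbol: a one-element list holding a symbol. All three function names are hypothetical, and the use of INTERN with STRING-UPCASE is an assumption about the intent, not something stated in the thread.

```lisp
;; Hypothetical helper names; only COERCE, INTERN and STRING-UPCASE are standard.
(defun word->chars (word)
  "Return a fresh list of the characters in the string WORD."
  (coerce word 'list))

(defun chars->word (chars)
  "Return a string built from the list of characters CHARS."
  (coerce chars 'string))

(defun word->symbol-list (word)
  "Return a one-element list holding a symbol named by WORD, upcased.
The symbol is interned in the current package, unlike MAKE-SYMBOL's result."
  (list (intern (string-upcase word))))

(word->chars "abc")                 ; => (#\a #\b #\c)
(chars->word (word->chars "abc"))   ; => "abc"
```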

From: Marc Dzaebel
Subject: Re: string -> list
Date: Thu, 01 Oct 1998 00:00:00 +0000
Message-ID: <36138A5A.F38E209@rose.de>

Rainer Joswig wrote:
> (coerce "abc" 'list)

Your solution is the better one. However, it's 2.5 times slower
than (loop for c across "abc" collect c).

From: Rainer Joswig
Subject: Re: string -> list
Date: Thu, 01 Oct 1998 00:00:00 +0000
Message-ID: <joswig-0110981639160001@pbg3.lavielle.com>

In article <················@rose.de>, Marc Dzaebel <····@rose.de> wrote:

> > (coerce "abc" 'list)
> 
> Your solution is the better one. However, it's 2.5 times slower
> than (loop for c across "abc" collect c).

Not on my Mac:


(defparameter *string* (make-string 200000 :initial-element #\a))

(defun test1 ()
  (time (coerce *string* 'list)))

(COERCE *STRING* 'LIST) took 353 milliseconds (0.353 seconds) to run.
Of that, 298 milliseconds (0.298 seconds) was spent in GC.
 1,600,000 bytes of memory allocated.

(defun test2 ()
  (time (loop for c across *string* collect c)))

(LOOP FOR C ACROSS *STRING* COLLECT C) took 360 milliseconds (0.360 seconds) to run.
Of that, 1 milliseconds (0.001 seconds) were spent in The Cooperative Multitasking Experience.
300 milliseconds (0.300 seconds) was spent in GC.
 1,600,008 bytes of memory allocated.


And not on my Lisp Machine:

Command:  (progn (test1) nil)
Evaluation of (COERCE *STRING* 'LIST) took 1.493364 seconds of elapsed time including:
  0.031 seconds processing sequence breaks,
  0.324 seconds in the storage system (including 0.000 seconds waiting for pages):
    0.000 seconds processing 0 page faults including 0 fetches,
    0.323 seconds in creating and destroying pages, and
    0.001 seconds in miscellaneous storage system tasks.
The garbage collector has flipped; so no consing was measured.
NIL

Command:  (progn (test2) nil)
Evaluation of (LOOP FOR C ACROSS *STRING* ...) took 24.110104 seconds of elapsed time including:
  0.833 seconds processing sequence breaks,
  6.820 seconds in the storage system (including 0.000 seconds waiting for pages):
    0.000 seconds processing 0 page faults including 0 fetches,
    0.531 seconds in creating and destroying pages, and
    6.288 seconds in miscellaneous storage system tasks.
The garbage collector has flipped; so no consing was measured.



Greetings,

Rainer Joswig

-- 
http://www.lavielle.com/~joswig

From: Marc Dzaebel
Subject: Re: string -> list
Date: Sun, 04 Oct 1998 00:00:00 +0000
Message-ID: <3617616A.165864EF@rose.de>

Rainer Joswig wrote:
> > Your solution is the better one. However, it's 2.5 times slower
> > than (loop for c across "abc" collect c).
> Not on my Mac:
> And not on my Lisp Machine:

Hi Rainer,

The problem was to coerce "abc" rather than a string
with 200000 elements. Additionally you should compile
your functions because this will be the most important
scenario. However, in your example loop-across was
actually slower even on ACL5.0, which shows how
carefully speed measurements have to be interpreted.

The following times were measured on ACL5.0/Linux/200/PC:

(defun test1 ()
  (time (dotimes (n 100000) (coerce "abc" 'list))))
(compile 'test1)
(test1)
; cpu time (non-gc) 190 msec user, 0 msec system (ACL5.0/Linux/200)

(defun test2 ()
  (time (dotimes (n 100000) (loop for c across "abc" collect c))))
(compile 'test2)
(test2)
; cpu time (non-gc) 470 msec user, 0 msec system (ACL5.0/Linux/200)

Marc

From: Rainer Joswig
Subject: Re: string -> list
Date: Sun, 04 Oct 1998 00:00:00 +0000
Message-ID: <joswig-0410981507390001@194.163.195.67>

In article <·················@rose.de>, Marc Dzaebel <····@rose.de> wrote:

> Rainer Joswig wrote:
> > > Your solution is the better one. However, it's 2.5 times slower
> > > than (loop for c across "abc" collect c).
> > Not on my Mac:
> > And not on my Lisp Machine:
> 
> Hi Rainer,
> 
> The problem was to coerce "abc" rather than a string
> with 200000 elements.

If you measure speed, you have to take several effects
into account. The function may have a constant
factor (large for coerce in MCL) and a factor that
depends on the data. Memory speed and various caches
also come into play.

You also should avoid using constant test data
(unless this is part of your application).
String handling is also difficult to compare,
since implementations might provide different
string capabilities (unicode, ...) which might need to call
expensive OS functionality.
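
A rough sketch of a measurement along these lines follows: the test strings are built at run time rather than written as literals, several lengths are tried, and the number of repetitions is scaled so each length processes a similar number of characters. The names MAKE-TEST-STRING, RUN-TIME-MS and COMPARE-CONVERSIONS, and the default lengths, are hypothetical choices for illustration. The numbers it returns will differ between implementations and machines, and in implementations that do not compile by default the functions should be compiled first, e.g. with (compile 'compare-conversions).

```lisp
;; Hypothetical benchmark helpers; not taken from any post in the thread.
(defun make-test-string (length)
  "Build a fresh string of LENGTH lowercase letters at run time."
  (let ((alphabet "abcdefghijklmnopqrstuvwxyz")
        (string (make-string length)))
    (dotimes (i length string)
      (setf (char string i) (char alphabet (mod i 26))))))

(defun run-time-ms (function string repetitions)
  "Call FUNCTION on STRING REPETITIONS times; return run time in milliseconds."
  (let ((start (get-internal-run-time)))
    (dotimes (i repetitions)
      (funcall function string))
    (round (* 1000 (- (get-internal-run-time) start))
           internal-time-units-per-second)))

(defun compare-conversions (&key (lengths '(3 100 10000))
                                 (total-chars 1000000))
  "Return a list of (LENGTH COERCE-MS LOOP-MS) rows, one per string length."
  (loop for length in lengths
        for string = (make-test-string length)
        ;; Fewer repetitions for longer strings, so the work is comparable.
        for repetitions = (max 1 (floor total-chars length))
        collect (list length
                      (run-time-ms (lambda (s) (coerce s 'list))
                                   string repetitions)
                      (run-time-ms (lambda (s) (loop for c across s collect c))
                                   string repetitions))))
```

Run time as reported by GET-INTERNAL-RUN-TIME may or may not include garbage collection, depending on the implementation, so TIME remains the better tool when the GC share matters.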

> Additionally you should compile
> your functions because this will be the most important
> scenario.

MCL usually compiles everything by default - even when you
execute code snippets via the editor.
I did compile it for Genera, though.

> (defun test1()
>  (time(dotimes(n 100000)(coerce"abc"'list))))
> (compile'test1)
> (test1)
> ; cpu time (non-gc) 190 msec user, 0 msec system (ACL5.0/Linux/200)

MCL on my machine is much slower on this. Around 5 seconds.
CCL::COERCE-TO-LIST is a bit slower (300msec).

> (defun test2 ()
>  (time(dotimes(n 100000)(loop for c across "abc" collect c))))
> (compile'test2)
> (test2)
> ; cpu time (non-gc) 470 msec user, 0 msec system (ACL5.0/Linux/200)

MCL on my machine is faster on this: 300 msec (unoptimized and incl. EGC).

-- 
http://www.lavielle.com/~joswig

From: Rainer Joswig
Subject: Re: string -> list
Date: Mon, 05 Oct 1998 00:00:00 +0000
Message-ID: <joswig-0510980021000001@194.163.195.67>

In article <·······················@194.163.195.67>, ······@lavielle.com
(Rainer Joswig) wrote:

> > (defun test1()
> >  (time(dotimes(n 100000)(coerce"abc"'list))))
> > (compile'test1)
> > (test1)
> > ; cpu time (non-gc) 190 msec user, 0 msec system (ACL5.0/Linux/200)
> 
> MCL on my machine is much slower on this. Around 5 seconds.

Actually a problem of the current MCL. Fixed by Digitool.

-- 
http://www.lavielle.com/~joswig

From: Erik Naggum
Subject: Re: string -> list
Date: Sun, 04 Oct 1998 00:00:00 +0000
Message-ID: <3116511818742521@naggum.no>

* Marc Dzaebel <····@rose.de>
| Following times have been measured on ACL5.0/Linux/200/PC
| 
| (defun test1()
|  (time(dotimes(n 100000)(coerce"abc"'list))))
| (compile'test1)
| (test1)
| ; cpu time (non-gc) 190 msec user, 0 msec system (ACL5.0/Linux/200)
| 
| (defun test2 ()
|  (time(dotimes(n 100000)(loop for c across "abc" collect c))))
| (compile'test2)
| (test2)
| ; cpu time (non-gc) 470 msec user, 0 msec system (ACL5.0/Linux/200)

  I don't believe your 190 ms figure.

  the following timings are from ACL 5.0 running under Linux 2.0.35 on an
  old 133MHz Intel Pentium II, best figures from three consecutive runs:

(defun test1 ()
  (time
   (dotimes (n 100000)
     (coerce "abc" 'list))))
; cpu time (non-gc) 990 msec user, 0 msec system
; cpu time (gc)     180 msec user, 0 msec system
; cpu time (total)  1,170 msec user, 0 msec system
; real time  1,168 msec
; space allocation:
;  300,000 cons cells, 0 symbols, 0 other bytes, 0 static bytes

(defun test2 ()
  (time
   (dotimes (n 100000)
     (loop for c across "abc" collect c))))
; cpu time (non-gc) 520 msec user, 0 msec system
; cpu time (gc)     80 msec user, 0 msec system
; cpu time (total)  600 msec user, 0 msec system
; real time  604 msec
; space allocation:
;  400,000 cons cells, 0 symbols, 0 other bytes, 0 static bytes

(defun test3 ()
  (time
   (dotimes (n 100000)
     (map 'list #'identity "abc"))))
; cpu time (non-gc) 430 msec user, 0 msec system
; cpu time (gc)     50 msec user, 0 msec system
; cpu time (total)  480 msec user, 0 msec system
; real time  476 msec
; space allocation:
;  300,000 cons cells, 0 symbols, 0 other bytes, 0 static bytes

(defun test4 ()
  (time
   (dotimes (n 100000)
     (do ((i 0 (1+ i))
          (l (length "abc"))
          (list ()))
         ((eq i l) (nreverse list))
       (declare (fixnum i l))
       (push (schar "abc" i) list)))))
; cpu time (non-gc) 270 msec user, 0 msec system
; cpu time (gc)     100 msec user, 0 msec system
; cpu time (total)  370 msec user, 0 msec system
; real time  369 msec
; space allocation:
;  300,000 cons cells, 0 symbols, 0 other bytes, 0 static bytes

  note: the extra cons cell allocated in TEST2 comes from the way LOOP
  collects into a list.
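
  One common way to collect front to back is to start from a dummy header cell and keep a pointer to the last cell; the sketch below assumes that technique. Whether ACL's LOOP expands into exactly this is an assumption, and COLLECT-CHARS-WITH-HEADER is a hypothetical name. For "abc" it allocates four conses, three of which end up in the result, which would match 400,000 cells for 100,000 iterations.

```lisp
;; Sketch of tail-pointer collection with a dummy header cell (an assumed
;; technique; not necessarily what any particular LOOP implementation does).
(defun collect-chars-with-header (string)
  "Return a fresh list of the characters of STRING, built front to back."
  (let* ((head (list nil))   ; the one extra cons; never part of the result
         (tail head))
    (dotimes (i (length string) (cdr head))
      ;; Attach a new cell after TAIL, then advance TAIL to that cell.
      (setf (cdr tail) (list (char string i))
            tail (cdr tail)))))

(collect-chars-with-header "abc")   ; => (#\a #\b #\c)
```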

#:Erik

From: Marc Dzaebel
Subject: Re: string -> list
Date: Mon, 05 Oct 1998 00:00:00 +0000
Message-ID: <361876DA.63C0D7D2@rose.de>

Erik Naggum wrote:
>   I don't believe your 190 ms figure.

Correct, I interchanged the times. Loop is the faster
method for "abc"->list (190 msec) and coerce the better
style (470 msec). Thank you for this remark. However,
Rainer Joswig gave some valid warnings about Unicode,
constants, caches, memory, etc. I wonder whether there
is a FAQ on how to measure Lisp speed.

Marc

From: Kent M Pitman
Subject: Re: string -> list
Date: Thu, 01 Oct 1998 00:00:00 +0000
Message-ID: <sfw90j046m7.fsf@world.std.com>

Marc Dzaebel <····@rose.de> writes:

> Rainer Joswig wrote:
> > (coerce "abc" 'list)
> 
> Your solution is the better one. However, it's 2.5 times slower
> than (loop for c across "abc" collect c).

Surely that's a property of a particular implementation at a
particular point in time.  The language spec makes no claim on this
subject.

Also, speed is not everything.  COERCE probably compiles in most
implementations to a simple function call, conserving space.  What 
some of us intended in the design of the language was that people
would use abstract operators like COERCE and the compiler would
permit you to use OPTIMIZE qualities (speed, safety, compilation-speed,
debug) and the INLINE or NOTINLINE declarations to control your needs
in a given program.  Speed is not always a paramount concern; it
matters in critical loops, but often in small, one-time situations
space matters more.  LOOP, because it works as a macro, has really
a lot fewer ways to effectively consolidate inline code back into
notinline code (unless it literally recognizes your action as the
COERCE idiom and opts to call the native function, which I think goes
a little beyond the call of duty for an already-quite-complicated
macro).
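
A minimal sketch of that approach, under the assumption that the conversion is wrapped in a small named function: the abstract operator stays in the code, and declarations state what the program needs. STRING->LIST and FIRST-CHAR-OF-WORD are hypothetical names, and how much any of these declarations changes the generated code is up to the implementation.

```lisp
;; Hypothetical wrapper; the declarations are requests, not guarantees.
(declaim (inline string->list))
(defun string->list (string)
  "Return a fresh list of the characters in STRING."
  (declare (type string string)
           (optimize (speed 3) (safety 1)))
  (coerce string 'list))

(defun first-char-of-word (word)
  "Return the first character of WORD by way of its character list."
  ;; A cold path: ask for an ordinary function call here, trading speed
  ;; for space, even though STRING->LIST is proclaimed INLINE globally.
  (declare (notinline string->list))
  (first (string->list word)))
```

If an implementation later improves COERCE, callers written this way would benefit without being rewritten.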

Often I declare information to the compiler in hope of optimizations
that I know some implementation, at some point in time, won't make.  I
think it's proper coding style to have a language that
lets you say what you mean, rather than one that in an obscure way
allows you to coerce your code into doing what you want by a sequence
of obscure constructions that don't really say what you mean.  Using
LOOP doesn't say (linguistically) "I'm trying to achieve a certain
degree of speed"; it says, "I'm going to tell you in excruciating
detail how a certain algorithm works".  Sometimes that's necessary,
even when there's a packaged operator that does the same, because
sometimes the speed issue is inescapable.  But (IMO) it should not be
a general programming style, and a suggestion to use it in place of a
plainly correct named operator should always be accompanied by some
sort of preface or caveat that observes that if speed doesn't matter,
LOOP is the wrong way to go.

If you do otherwise, you proliferate a style in which improvements to
the implementation cannot hope to manifest themselves as improvements
to program execution without substantial rewrite.  And, frankly,
C already has that market cornered.

In general, smart people waste far too much of their very valuable
time on earth optimizing things that don't matter.  And worse, in so
doing, they make their code hard to read and maintain, guaranteeing a
later loss of even more precious time trying to decipher and rewrite
and re-optimize the code later.  I prefer to code clearly, and
optimize lazily.  Often worth avoiding algorithmic complexity on the
first pass, though not always even that, but rarely worth fussing the
constant factors until you know it matters.

So, personally, I'd send a bug report to your vendor if you really
need COERCE to be a certain speed.  And otherwise, I'd use the LOOP
idiom and COERCE interchangeably, with the caveat that linguistically
the use of LOOP is more confining.

JMO, FWIW.