From: Ed Symanzik
Subject: split-string: request for comments
Date: 
Message-ID: <pan.2003.09.30.20.05.21.970268@msu.edu>
I'm working on teaching myself lisp and am working on a basic string 
splitter.  I wondered what you all thought of this code. It doesn't 
feel right to me, but I can't put my finger on why.

(defun split-string (s &key (delimiter " ")
                     (mark-missing-token nil)
                     (missing-token nil))
  (let* ((end (length s))
         (in-token (not (find (schar s (- end 1)) delimiter)))
         (result nil))
    (do ((pos (- end 1) (- pos 1)))                     ; work right to left
      ((< pos 0) result)
      (cond ((find (schar s pos) delimiter)
             (cond (in-token
                     (push (subseq s (+ pos 1) end) result)
                     (setq in-token nil))
                   (mark-missing-token
                     (push missing-token result))))
            ((not in-token) (setq in-token t)
                            (setq end (+ pos 1)))))
      (if (find (schar s 0) delimiter)                  ;finish start of string
        (if mark-missing-token
          (push missing-token result)
          result)
        (push (subseq s 0 end) result))))


[2]> (split-string "mary had a little bat ")
("mary" "had" "a" "little" "bat")

[3]> (split-string " -it's wings were long,  and brown" :delimiter " -,")
("it's" "wings" "were" "long" "and" "brown")

[4]> (split-string "root:*:0:0::/root:/bin/sh" 
       :delimiter ":" :mark-missing-token t :missing-token "")
("root" "*" "0" "0" "" "/root" "/bin/sh")
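
One likely source of the unease is that the right-to-left scan needs the
start of the string handled separately after the loop, and
(schar s (- end 1)) has no valid index to read when s is empty.  The
sketch below is one possible left-to-right arrangement, not the poster's
code: split-string-ltr is a hypothetical name, it keeps the same keywords,
and it uses position-if so that non-simple strings are accepted too.

```lisp
;; SPLIT-STRING-LTR is a hypothetical name; the keywords mirror the
;; original SPLIT-STRING so the examples above can be reused.
(defun split-string-ltr (s &key (delimiter " ")
                                (mark-missing-token nil)
                                (missing-token nil))
  "Split S on any character in DELIMITER, scanning left to right."
  (let ((tokens '())
        (start 0))
    (loop
      ;; P is the next delimiter at or after START, or NIL if none is left.
      (let ((p (position-if (lambda (c) (find c delimiter)) s :start start)))
        (cond ((< start (or p (length s)))       ; non-empty field
               (push (subseq s start p) tokens))
              (mark-missing-token                ; empty field
               (push missing-token tokens)))
        (if p
            (setf start (1+ p))
            (return (nreverse tokens)))))))
```

Under these assumptions it should return the same lists as examples [2]
to [4] above, and NIL for the empty string when :mark-missing-token is
not given.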

From: Christophe Rhodes
Subject: Re: split-string: request for comments
Date: 
Message-ID: <sqwubqxb8c.fsf@lambda.jcn.srcf.net>
"Ed Symanzik" <···@msu.edu> writes:

> I'm working on teaching myself lisp and am working on a basic string 
> splitter.  I wondered what you all thought of this code. It doesn't 
> feel right to me, but I can't put my finger on why.

There was a long discussion about this kind of thing here about two
years ago.  Google for "partition" and "split-sequence".

The executive summary as applied to your routine:
  * why are you limiting it to strings?  (and simple-strings, at that;
    why am I not allowed to split an adjustable string?);
  * it looks a bit like other sequence functions; you might want
    to split from the end, or a fixed count, or with an arbitrary
    function;
  * quite often, splitting a sequence is a precursor to other
    processing.  Instead of consing intermediary values, why not
    generate indices suitable to pass as :start and :end parameters?
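
The last two points can be combined in a small sketch.  token-bounds
below is a hypothetical helper, not the split-sequence or partition code
from the earlier discussion: it takes an arbitrary delimiter predicate,
accepts any sequence, and returns (start . end) pairs instead of copying
subsequences.

```lisp
;; Hypothetical helper, not taken from the thread or from any library.
(defun token-bounds (delimiter-p sequence &key (start 0) end)
  "Return a list of (START . END) pairs, one per maximal run of elements
of SEQUENCE that do not satisfy DELIMITER-P.  No subsequences are copied."
  (let ((end (or end (length sequence)))
        (bounds '()))
    (loop
      ;; Find the first element of the next token, if any.
      (let ((token-start (position-if-not delimiter-p sequence
                                          :start start :end end)))
        (when (null token-start)
          (return (nreverse bounds)))
        ;; The token runs up to the next delimiter, or to END.
        (let ((token-end (or (position-if delimiter-p sequence
                                          :start token-start :end end)
                             end)))
          (push (cons token-start token-end) bounds)
          (setf start token-end))))))

;; Example: parse the fields in place, without consing substrings.
(let ((line "10:20:30"))
  (mapcar (lambda (b) (parse-integer line :start (car b) :end (cdr b)))
          (token-bounds (lambda (c) (char= c #\:)) line)))
;; => (10 20 30)
```

Because it relies only on position-if and position-if-not, the same
function should work on lists, adjustable strings, and other vectors.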

> [3]> (split-string " -it's wings were long,  and brown" :delimiter " -,")
> ("it's" "wings" "were" "long" "and" "brown")

<http://www.angryflower.com/itsits.gif> (and, for context,
<http://www.angryflower.com/bobsqu.gif>).

Christophe
-- 
http://www-jcsu.jesus.cam.ac.uk/~csr21/       +44 1223 510 299/+44 7729 383 757
(set-pprint-dispatch 'number (lambda (s o) (declare (special b)) (format s b)))
(defvar b "~&Just another Lisp hacker~%")    (pprint #36rJesusCollegeCambridge)

From: Jonathan Spingarn
Subject: Re: split-string: request for comments
Date: 
Message-ID: <559e2239.0310021046.1c331c91@posting.google.com>
Christophe Rhodes <·····@cam.ac.uk> wrote in message news:<··············@lambda.jcn.srcf.net>...
> "Ed Symanzik" <···@msu.edu> writes:
> 
> > I'm working on teaching myself lisp and am working on a basic string 
> > splitter.  I wondered what you all thought of this code. It doesn't 
> > feel right to me, but I can't put my finger on why.
> 
Admittedly, this is not so much an answer to your question as it is
an advertisement for the Chio string processing library that I posted
recently. The macro "with-test-split" would handle your example
like this:

 (let ((str "it's wings were long, and brown"))
   (ch:with-test-split A (#~"[ ,]+" nil str :tags (:map))
     (ch:mref A 0 :s)))
 returns: ("it's" "wings" "were" "long" "and" "brown")

This may be a more powerful tool than you want; you only
need the Chio library if you are doing a lot of work with
strings.  Chio can split on substrings matching any regular
expression.  You can also tear apart each of the fields and
establish bindings any way you want.  For instance, suppose you
have telephone numbers separated by alphabetic characters.  You
can separate out the area codes (the first three digits) like this:

(let ((str "321-758-2294bbb444-576-9121cc222-123-4589"))
  (ch:with-test-split A (#~"\a+"
                         (:& (:&0 #~"\d+") #~"-" (:&1 #~"[\d-]+"))
                         str :tags (:map))
    (cons (ch:mref A 0 :i)      ; :i says to read area code as a fixnum
          (ch:mref A 1 :s))))   ; :s says to read telephone number as a string
returns:
((321 . "758-2294") (444 . "576-9121") (222 . "123-4589"))

More with-test-split examples at 
http://www.toiling-in-obscurity.net/chio/examples/

  - Jonathan Spingarn

From: Steve Long
Subject: Re: split-string: request for comments
Date: 
Message-ID: <BB9FC4EA.4C9C%sal6741@hotmail.com>
On 09/30/2003 1:05 PM, in article ······························@msu.edu,
"Ed Symanzik" <···@msu.edu> wrote:

> I'm working on teaching myself lisp and am working on a basic string
> splitter.  I wondered what you all thought of this code. It doesn't
> feel right to me, but I can't put my finger on why.

Another variation, a bit simpler...

(defun string-parse
    (string &optional (token #\space))
  "
string-parse

FUNCTION

---------------------------------------------------------------------

DESCRIPTION:

  Parse a string into a list of strings using a token
  character.

SYNTAX:

  string-parse STR [TKN]

PARAMETERS:

  STR  ::= <string>

  TKN  ::= <character>  Default is #\\space.

EVALUATES-TO:

  A list of the substrings of STR that lie between occurrences
  of TKN.  If TKN does not occur in STR, a list containing the
  input string is returned.
"
  (labels ((cons-list (p str list-out)
             (if p (cons (subseq str 0 p) list-out) list-out)))
    (do* ((str      string                (subseq str (1+ p)))
          (p        (position token str)  (position token str))
          (list-out (cons-list p str nil) (cons-list p str list-out)))
        ((null p) (nreverse (cons str list-out))))))


...steve
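
Each step of the do* loop in string-parse copies the remainder of the
string with subseq before searching it again.  The sketch below is one
possible variation that tracks a start index instead; string-parse-by-index
is a hypothetical name, and it assumes the same results are wanted,
including an empty string for each pair of adjacent token characters.

```lisp
;; Hypothetical variant of STRING-PARSE; only the tokens are copied,
;; never the unprocessed remainder of the string.
(defun string-parse-by-index (string &optional (token #\Space))
  "Split STRING at each occurrence of the character TOKEN."
  (loop with start = 0
        ;; P is the next TOKEN at or after START, or NIL at the last field.
        for p = (position token string :start start)
        ;; An END of NIL makes SUBSEQ copy through the end of STRING.
        collect (subseq string start p)
        while p
        do (setf start (1+ p))))

(string-parse-by-index "a b c")   ; => ("a" "b" "c")
(string-parse-by-index "a  b")    ; => ("a" "" "b")
```

As with string-parse, a string that contains no token character comes
back as a one-element list.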