How to split a string (or arbitrary sequence) at each occurrence of a value.

From: Daniel Pittman
Subject: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Fri, 12 Oct 2001 08:35:25 +0000
Message-ID: <873d4pjlwy.fsf@inanna.rimspace.net>

I am looking for the simplest way to split a string into four strings
based on a character -- to parse an IP address string, specifically.

What is the best, easiest, fastest, etc., way to split a string into
substrings based on a character position?  In Emacs Lisp I would just:

(let ((address "210.23.138.16"))
  (split-string address "\\."))  ; second arg is regexp to split on.

Now, I don't actually need regexp functionality here; a literal '.' is
enough for me.

This strikes me as the sort of idiom that would be common enough for
Common Lisp[1] to feature it as part of the standard.

I would also be interested to know if y'all can suggest a general way to
do this for generalized sequences as well as for strings, but that's not
what I need to do right now.


Oh, and am I making a really silly mistake storing an IP address in a
slot of ":type (vector (integer 0 255) 4)"?

        Daniel

Footnotes: 
[1]  CLISP 2.27, specifically, with the HyperSpec as reference.

-- 
Money won't buy happiness, but it will pay the salaries of
a large research staff to study the problem.
        -- Bill Vaughan

From: Wade Humeniuk
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Fri, 12 Oct 2001 14:33:25 +0000
Message-ID: <9q6upu$tpe$1@news3.cadvision.com>

I usually do this type of thing with

(defun read-delimited-string (string &optional (delimiter #\.))
  "Returns a read list of delimited values from a string"
  (read-from-string
   (concatenate 'string "("
                (substitute #\space delimiter string)
                ")")))

CL-USER 3 > (read-delimited-string "210.23.138.16")
(210 23 138 16)
15

CL-USER 4 >

Wade

"Daniel Pittman" <······@rimspace.net> wrote in message
···················@inanna.rimspace.net...
> I am looking for the simplest way to split a string into four strings
> based on a character -- to parse an IP address string, specifically.
>
> What is the best, easiest, fastest, etc, way to split a string into
> substrings based on a character position.  In Emacs Lisp I would just:
>
> (let ((address "210.23.138.16"))
>   (split-string address "\\."))  ; second arg is regexp to split on.
>
> Now, I don't actually need regexp functionality here; a literal '.' is
> enough for me.
>

From: Russell Senior
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Fri, 12 Oct 2001 20:26:30 +0000
Message-ID: <86het4iozt.fsf@coulee.tdb.com>

>>>>> "Wade" == Wade Humeniuk <········@cadvision.com> writes:

Wade> I usually do this type of thing with 

Wade> (defun read-delimited-string (string &optional (delimiter #\.))
Wade>   "Returns a read list of delimited values from a string"
Wade>   (read-from-string
Wade>    (concatenate 'string "("
Wade>                 (substitute #\space delimiter string)
Wade>                 ")")))
Wade> 
Wade> CL-USER 3 > (read-delimited-string "210.23.138.16")
Wade> (210 23 138 16)
Wade> 15

This, of course, won't work the way you want if the delimited values
also contain spaces.  

I've been using a split-sequence function that was discussed here on
comp.lang.lisp back in Sept 1998, which works reasonably well.  The
problem above, though, raises the question of how one might handle
quoting of delimiters.  It hasn't been a problem for me, as usually
things are arranged so that it won't be, but in the general case it
could.
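
A minimal sketch of a splitter for arbitrary sequences, built only on the
standard POSITION and SUBSEQ. The name SPLIT-AT and its argument order are
assumptions made for illustration; this is not the function discussed on
comp.lang.lisp in 1998, and it does not handle quoted delimiters.

```lisp
;; SPLIT-AT is a hypothetical helper, written for this example only.
(defun split-at (delimiter sequence &key (test #'eql))
  "Return a list of the subsequences of SEQUENCE that lie between
occurrences of DELIMITER.  Empty pieces are kept."
  (loop with len = (length sequence)
        ;; Each piece starts just past the previous delimiter.
        for start = 0 then (1+ end)
        ;; With no further delimiter, the piece runs to the end.
        for end = (or (position delimiter sequence :start start :test test)
                      len)
        collect (subseq sequence start end)
        while (< end len)))
```

Because POSITION and SUBSEQ accept any sequence, the same function should
work on lists and vectors as well as strings; for example,
(split-at #\. "210.23.138.16") gives ("210" "23" "138" "16").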


-- 
Russell Senior         ``The two chiefs turned to each other.        
·······@aracnet.com      Bellison uncorked a flood of horrible       
                         profanity, which, translated meant, `This is
                         extremely unusual.' ''

From: Wade Humeniuk
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Fri, 12 Oct 2001 23:09:23 +0000
Message-ID: <9q7t1b$k5o$1@news3.cadvision.com>

> This, of course, won't work the way you want if the delimited values
> also contain spaces.
>

Of course not, but it does not have to in the case of dotted IP addresses.
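
For the dotted-quad case specifically, the reader trick can be guarded so
that only digits and dots ever reach READ-FROM-STRING. The following is a
sketch under that assumption; READ-DOTTED-QUAD is a hypothetical name, not
part of any library mentioned in this thread.

```lisp
;; READ-DOTTED-QUAD is a hypothetical name used for illustration.
(defun read-dotted-quad (string)
  "Parse STRING such as \"210.23.138.16\" into a list of four integers.
Signals an error for anything that is not four octets."
  ;; Reject everything except digits and dots before the reader sees it.
  (unless (every (lambda (char) (or (digit-char-p char) (char= char #\.)))
                 string)
    (error "Not a dotted quad: ~S" string))
  (let* ((*read-eval* nil)              ; never evaluate #. forms
         (*read-base* 10)               ; read the digits as decimal
         (octets (read-from-string
                  (concatenate 'string "("
                               (substitute #\space #\. string)
                               ")"))))
    ;; Check the shape of the result: four values, each in 0..255.
    (unless (and (= (length octets) 4)
                 (every (lambda (octet) (<= 0 octet 255)) octets))
      (error "Not a dotted quad: ~S" string))
    octets))
```

The up-front check is what keeps the approach safe here: spaces, parens and
reader macros are rejected before the string is wrapped in parentheses.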

This raises the issue of writing a generalized parser/reader for any
conceivable situation versus writing a special-purpose reader for each
specific case.  The time needed to implement a generalized solution (like
regular expressions) outweighs the time to implement a hundred of the
specific readers.  Lazy man's way out.

Here is a snippet of a parsing/reading problem from the LWW port for Aserve.
Delimiters are slightly more complex.

;;;
;;; DATE-TO-UNIVERSAL-TIME
;;; This is a contribution of Wade Humeniuk <········@cadvision.com>
;;; It reimplements the original function without MATCH-REGEXP
;;; which is not fully implemented in ACL-COMPAT
;;;

(defvar *net.aserve-package* (find-package :net.aserve))

(defun date-to-universal-time (date)
  ;; convert  a date string to lisp's universal time
  ;; we accept all 3 possible date formats

  ;; check preferred type first (rfc1123 (formerly rfc822)):
  ;;    Sun, 06 Nov 1994 08:49:37 GMT
  ;; now second best format (but used by Netscape sadly):
  ;;    Sunday, 06-Nov-94 08:49:37 GMT
  ;; finally the third format, from unix's asctime
  ;;    Sun Nov  6 08:49:37 1994

  (let ((date (copy-seq date))
        (*read-eval* nil)
        (*package* *net.aserve-package*))
    (loop for char across date
          for i = 0 then (1+ i)
          when (or (char= #\, char)
                   (char= #\- char)
                   (char= #\: char))
          do (setf (elt date i) #\space))
    (setf date (concatenate 'string "(" date ")"))

    (destructuring-bind (day-of-week day month year hour minute second
                                     &optional timezone)
        (read-from-string date)
      (declare (ignore day-of-week timezone))
      (when (symbolp day) ;; probably third format, swap values
        (let ((real-day month)
              (real-month day)
              (real-hour year)
              (real-minute hour)
              (real-second minute)
              (real-year second))
          (setf day real-day
                month real-month
                year real-year
                hour real-hour
                minute real-minute
                second real-second)))
      (setf month (ecase month
                    (jan 1)
                    (feb 2)
                    (mar 3)
                    (apr 4)
                    (may 5)
                    (jun 6)
                    (jul 7)
                    (aug 8)
                    (sep 9)
                    (oct 10)
                    (nov 11)
                    (dec 12)))
      (cond
       ((and (> year 70) (< year 100)) (incf year 1900))
       ((<= year 70) (incf year 2000)))
      (encode-universal-time second minute hour day month year))))

#| The original code
(defun date-to-universal-time (date)
  ;; convert  a date string to lisp's universal time
  ;; we accept all 3 possible date formats

  (flet ((cvt (str start-end)
           (let ((res 0))
             (do ((i (car start-end) (1+ i))
                  (end (cdr start-end)))
                 ((>= i end) res)
               (setq res
                     (+ (* 10 res)
                        (- (char-code (schar str i)) #.(char-code #\0))))))))
    ;; check preferred type first (rfc1123 (formerly rfc822)):
    ;;   Sun, 06 Nov 1994 08:49:37 GMT
    (multiple-value-bind (ok whole
     day
     month
     year
     hour
     minute
     second)
 (match-regexp
  "[A-Za-z]+, \\([0-9]+\\) \\([A-Za-z]+\\) \\([0-9]+\\) \\([0-9]+\\):\\([0-9]+\\):\\([0-9]+\\) GMT"
  date
  :return :index)
      (declare (ignore whole))
      (if* ok
  then (return-from date-to-universal-time
  (encode-universal-time
   (cvt date second)
   (cvt date minute)
   (cvt date hour)
   (cvt date day)
   (compute-month date (car month))
   (cvt date year)
   0))))

    ;; now second best format (but used by Netscape sadly):
    ;;  Sunday, 06-Nov-94 08:49:37 GMT
    ;;
    (multiple-value-bind (ok whole
     day
     month
     year
     hour
     minute
     second)
 (match-regexp

  "[A-Za-z]+, \\([0-9]+\\)-\\([A-Za-z]+\\)-\\([0-9]+\\) \\([0-9]+\\):\\([0-9]+\\):\\([0-9]+\\) GMT"
  date
  :return :index)

      (declare (ignore whole))

      (if* ok
  then (return-from date-to-universal-time
  (encode-universal-time
   (cvt date second)
   (cvt date minute)
   (cvt date hour)
   (cvt date day)
   (compute-month date (car month))
   (cvt date year) ; cl does right thing with 2 digit dates
   0))))


    ;; finally the third format, from unix's asctime
    ;;     Sun Nov  6 08:49:37 1994
    (multiple-value-bind (ok whole
     month
     day
     hour
     minute
     second
     year
     )
 (match-regexp

  "[A-Za-z]+ \\([A-Za-z]+\\) +\\([0-9]+\\) \\([0-9]+\\):\\([0-9]+\\):\\([0-9]+\\) \\([0-9]+\\)"
  date
  :return :index)

      (declare (ignore whole))

      (if* ok
  then (return-from date-to-universal-time
  (encode-universal-time
   (cvt date second)
   (cvt date minute)
   (cvt date hour)
   (cvt date day)
   (compute-month date (car month))
   (cvt date year)
   0))))


    ))
|#


Wade

From: Jochen Schmidt
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Sat, 13 Oct 2001 02:32:50 +0000
Message-ID: <9q891o$sfs$1@rznews2.rrze.uni-erlangen.de>

Wade Humeniuk wrote:

>> This, of course, won't work the way you want if the delimited values
>> also contain spaces.
>>
> 
> Of course not, but it does not have to in the case of dotted IP addresses.
> 
> This raises the issue of a generalized parser/reader for any conceivable
> situation or writing a special purpose reader for specific cases.  The
> time needed to implement a generalized solution (like regular expressions)
> outweighs the time to implement a 100 of the specific readers.  Lazy man's
> way out.
> 
> Here is a snippet of a parsing/reading problem from the LWW port for
> Aserve. Delimiters are slightly more complex.

[example snipped]

This parsing routine was replaced a while ago by a function using the
META parser. The problem with using the READER for this was that some
browsers (Netscape) put a semicolon and some further characters after the
date, and if you wrap that string in parens, the closing paren ends up after
the semicolon and is therefore commented out.

The actual code in portableaserve is like this (which is a quick hack 
written with META and not really nice...)

(eval-when (:compile-toplevel :load-toplevel :execute)
  (meta:enable-meta-syntax)
(deftype alpha-char () '(and character (satisfies alpha-char-p)))
(deftype digit-char () '(and character (satisfies digit-char-p)))
)

(defun date-to-universal-time (date)
  ;; convert  a date string to lisp's universal time
  ;; we accept all 3 possible date formats
  
  ;; check preferred type first (rfc1123 (formerly rfc822)):
  ;;    Sun, 06 Nov 1994 08:49:37 GMT
  ;; now second best format (but used by Netscape sadly):
  ;;    Sunday, 06-Nov-94 08:49:37 GMT
  ;; finally the third format, from unix's asctime
  ;;    Sun Nov  6 08:49:37 1994

  (let (last-result)
    (meta:with-string-meta (buffer date)
           (labels ((make-result ()
                        (make-array 0 
                                    :element-type 'base-char 
                                    :fill-pointer 0 :adjustable t))
               (skip-day-of-week (&aux c)
                                 (meta:match [$[@(alpha-char c)] 
                                               !(skip-delimiters)]))
               (skip-delimiters ()
                                (meta:match $[{#\: #\, #\space #\-}]))
               (word (&aux (old-index meta::index) c
                           (result (make-result)))
                     (or (meta:match [!(skip-delimiters) @(alpha-char c) 
                                       !(vector-push-extend c result)
                                      $[@(alpha-char c) 
                                         !(vector-push-extend c result)]
                                      !(setf last-result result)])
                         (progn (setf meta::index old-index) nil)))
               (integer (&aux (old-index meta::index) c
                              (result (make-result)))
                        (or (meta:match [!(skip-delimiters) @(digit-char c) 
                                          !(vector-push-extend c result)
                                         $[@(digit-char c) 
                                            !(vector-push-extend c result)]
                                         !(setf last-result 
                                                (parse-integer result))])
                            (progn (setf meta::index old-index) nil)))
               (date (&aux day month year hours minutes seconds)
                     (and (meta:match [!(skip-day-of-week)
                                       {[!(word) !(setf month last-result)
                                         !(integer) !(setf day last-result)]
                                        [!(integer) !(setf day last-result)
                                         !(word) !(setf month
                                                        last-result)]} 
                                       !(integer) !(setf year last-result) 
                                       !(integer) !(setf hours last-result) 
                                       !(integer) !(setf minutes 
                                                         last-result) 
                                       !(integer) !(setf seconds 
                                                         last-result)])
                         ; (values seconds minutes hours day month)
                          (encode-universal-time seconds minutes hours day
                                                 (net.aserve::compute-month 
                                                  (coerce month 'simple-string)
                                                  0)
                                                 year
                                                 0))))
              (date)))))

 
ciao,
Jochen

--
http://www.dataheaven.de

From: Wade Humeniuk
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Sat, 13 Oct 2001 04:29:06 +0000
Message-ID: <9q8foq$qkb$1@news3.cadvision.com>

> This parsing routine got replaced a while ago through a function using the
> META Parser. The problem with using the READER for that stuff was that some
> Browsers (Netscape) had a semicolon and some further characters behind the
> date and if you wrap that string in parens, the closing paren is behind the
> semicolon and therefore commented out.
>

Would it have worked to substitute #\space for the #\; first?  Same
kind of routine, but discarding the extra vars in destructuring-bind?
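
One way the question above could be tried out, as a sketch only: treat the
semicolon as one more delimiter and let &rest absorb whatever follows the
date. SPLIT-DATE-FIELDS and DATE-NUMBERS are hypothetical names, the sample
header text is made up, and only the first two date formats are covered.

```lisp
;; Both functions are hypothetical helpers written for this example.
(defun split-date-fields (date)
  "Return the fields of DATE as read by the Lisp reader, after replacing
the delimiters , - : and ; with spaces."
  (let ((*read-eval* nil))              ; never evaluate #. forms
    (read-from-string
     (concatenate 'string "("
                  (substitute-if #\space
                                 (lambda (char) (find char ",-:;"))
                                 date)
                  ")"))))

(defun date-numbers (date)
  "Return (day year hour minute second) for the first two date formats."
  (destructuring-bind (day-of-week day month year hour minute second
                       &rest ignored)   ; timezone and trailing junk
      (split-date-fields date)
    (declare (ignore day-of-week month ignored))
    (list day year hour minute second)))
```

With the input "Sunday, 06-Nov-94 08:49:37 GMT; length=4096" this returns
(6 94 8 49 37). It would still fail if the text after the semicolon held
parens, quotes or other reader syntax, so cutting the string at
(position #\; date) before reading is probably the more robust variant.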

> The actual code in portableaserve is like this (which is a quick hack
> written with META and not really nice...)
>

Wow, I would not have thought a macro like meta existed.

Wade

From: Pierre R. Mai
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Fri, 12 Oct 2001 22:19:52 +0000
Message-ID: <87g08oh56f.fsf@orion.bln.pmsf.de>

Russell Senior <·······@aracnet.com> writes:

> I've been using a split-sequence function that was discussed here on
> comp.lang.lisp back in Sept 1998, which works reasonably well.  The
> problem above, though, raises the question of how one might handle
> quoting of delimiters.  It hasn't been a problem for me, as usually
> things are arranged so that it won't be, but in the general case it
> could.

Once you throw escaping, or similar things into the equation, IMHO the
time has come to write a lexer/parser.  This is often only slightly
more complex than calling split-sequence/partition/what-have-you, but
offers you much more flexibility, and IMHO clarity.
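
To illustrate how small such a lexer can be, here is a sketch of a
splitter with a single escape character, written as a two-state loop. The
function name, the keyword arguments and the choice of backslash as the
escape are all assumptions made for this example.

```lisp
;; SPLIT-WITH-ESCAPES is a hypothetical name used for illustration.
(defun split-with-escapes (string &key (delimiter #\.) (escape #\\))
  "Split STRING at each unescaped DELIMITER.  ESCAPE makes the following
character literal, so an escaped delimiter stays inside its field."
  (let ((fields '())
        (field (make-string-output-stream))
        (escaped nil))
    (loop for char across string
          do (cond (escaped               ; state: right after an escape
                    (write-char char field)
                    (setf escaped nil))
                   ((char= char escape)   ; switch to the escape state
                    (setf escaped t))
                   ((char= char delimiter) ; an unescaped delimiter ends a field
                    (push (get-output-stream-string field) fields))
                   (t
                    (write-char char field))))
    ;; The last field has no delimiter after it.
    (push (get-output-stream-string field) fields)
    (nreverse fields)))
```

A trailing escape character is silently dropped in this sketch; a real
lexer would have to decide whether that case is an error.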

Regs, Pierre.

-- 
Pierre R. Mai <····@acm.org>                    http://www.pmsf.de/pmai/
 The most likely way for the world to be destroyed, most experts agree,
 is by accident. That's where we come in; we're computer professionals.
 We cause accidents.                           -- Nathaniel Borenstein

From: Erik Haugan
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Fri, 12 Oct 2001 09:28:07 +0000
Message-ID: <87lmihdx7c.fsf@kometknut.neitileu.no>

* Daniel Pittman <······@rimspace.net>
> What is the best, easiest, fastest, etc, way to split a string into
> substrings based on a character position.  In Emacs Lisp I would just:

This may not be fast (I don't know), but it's straightforward and readable.

(defun split (string &optional (delimiter #\Space))
  (with-input-from-string (*standard-input* string)
    (let ((*standard-output* (make-string-output-stream)))
      (nconc (loop for char = (read-char nil nil nil)
                   while char
                   if (char= char delimiter)
                     collect (get-output-stream-string *standard-output*)
                   else
                     do (write-char char))
             (list (get-output-stream-string *standard-output*))))))

Erik

From: Erik Haugan
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Fri, 12 Oct 2001 13:01:41 +0000
Message-ID: <87elo9dnbe.fsf@kometknut.neitileu.no>

Sorry for replying to my own article, however, I made such an inelegant
twist in the code I posted that I feel I have to correct it:

(defun string-split (string &optional (delimiter #\Space))
  (with-input-from-string (*standard-input* string)
    (let ((*standard-output* (make-string-output-stream)))
      (loop for char = (read-char nil nil nil)
            if (or (null char)
                   (char= char delimiter))
              collect (get-output-stream-string *standard-output*)
            else
              do (write-char char)
            while char))))

Erik

From: Christophe Rhodes
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Fri, 12 Oct 2001 09:40:41 +0000
Message-ID: <sqk7y1qjqe.fsf@cam.ac.uk>

Daniel Pittman <······@rimspace.net> writes:

> I am looking for the simplest way to split a string into four strings
> based on a character -- to parse an IP address string, specifically.
>
> [snip] 
>
> This strikes me as the sort of idiom that would be common enough for
> Common Lisp[1] to feature it as part of the standard.
> 
> I would also be interested to know if y'all can suggest a general way to
> do this for generalized sequences as well as for strings, but that's not
> what I need to do right now.

See <URL:http://ww.telent.net/cliki/PARTITION>. 

> Oh, and am I making a really silly mistake storing an IP address in a
> slot of ":type (vector (integer 0 255) 4)"?

No, that's less stupid than a lot of other representations :-)

Cheers,

Christophe
-- 
Jesus College, Cambridge, CB5 8BL                           +44 1223 510 299
http://www-jcsu.jesus.cam.ac.uk/~csr21/                  (defun pling-dollar 
(str schar arg) (first (last +))) (make-dispatch-macro-character #\! t)
(set-dispatch-macro-character #\! #\$ #'pling-dollar)

From: Christophe Rhodes
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Fri, 12 Oct 2001 09:43:03 +0000
Message-ID: <sqg08pqjmg.fsf@cam.ac.uk>

[ superseded to clarify ]

Daniel Pittman <······@rimspace.net> writes:

> I am looking for the simplest way to split a string into four strings
> based on a character -- to parse an IP address string, specifically.
>
> [snip] 
>
> This strikes me as the sort of idiom that would be common enough for
> Common Lisp[1] to feature it as part of the standard.
> 
> I would also be interested to know if y'all can suggest a general way to
> do this for generalized sequences as well as for strings, but that's not
> what I need to do right now.

See <URL:http://ww.telent.net/cliki/PARTITION>, wherein a
community-discussed function is described in roughly
specification-level detail, with links to a reference implementation.

> Oh, and am I making a really silly mistake storing an IP address in a
> slot of ":type (vector (integer 0 255) 4)"?

No, that's less stupid than a lot of other representations :-)

Cheers,

Christophe
-- 
Jesus College, Cambridge, CB5 8BL                           +44 1223 510 299
http://www-jcsu.jesus.cam.ac.uk/~csr21/                  (defun pling-dollar 
(str schar arg) (first (last +))) (make-dispatch-macro-character #\! t)
(set-dispatch-macro-character #\! #\$ #'pling-dollar)

From: Marco Antoniotti
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Fri, 12 Oct 2001 13:49:03 +0000
Message-ID: <y6citdlarzk.fsf@octagon.mrl.nyu.edu>

Christophe Rhodes <·····@cam.ac.uk> writes:

> [ superseded to clarify ]
> 
> Daniel Pittman <······@rimspace.net> writes:
> 
> > I am looking for the simplest way to split a string into four strings
> > based on a character -- to parse an IP address string, specifically.
> >
> > [snip] 
> >
> > This strikes me as the sort of idiom that would be common enough for
> > Common Lisp[1] to feature it as part of the standard.
> > 
> > I would also be interested to know if y'all can suggest a general way to
> > do this for generalized sequences as well as for strings, but that's not
> > what I need to do right now.
> 
> See <URL:http://ww.telent.net/cliki/PARTITION>, wherein a
> community-discussed function is described in roughly
> specification-level detail, with links to a reference
> implementation.

I am sorry to be sooo nagging (again) on such a stupid matter. But......

The name PARTITION is inappropriate.  SPLIT-SEQUENCE is much more
descriptive of what the function does.

Cheers

-- 
Marco Antoniotti ========================================================
NYU Courant Bioinformatics Group        tel. +1 - 212 - 998 3488
719 Broadway 12th Floor                 fax  +1 - 212 - 995 4122
New York, NY 10003, USA                 http://bioinformatics.cat.nyu.edu
                    "Hello New York! We'll do what we can!"
                           Bill Murray in `Ghostbusters'.

From: Tim Moore
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Fri, 12 Oct 2001 14:57:41 +0000
Message-ID: <9q70d5$3q4$0@216.39.145.192>

In article <···············@octagon.mrl.nyu.edu>, "Marco Antoniotti"
<·······@cs.nyu.edu> wrote:


> Christophe Rhodes <·····@cam.ac.uk> writes:

>> See <URL:http://ww.telent.net/cliki/PARTITION>, wherein a
>> community-discussed function is described in roughly
>> specification-level detail, with links to a reference implementation.
> I am sorry to be sooo nagging (again) on such a stupid matter. But......
>  The name PARTITION is inappropriate.  SPLIT-SEQUENCE is much more
> descriptive of what the function does.  Cheers
> 
Get over it!

Tim

Add smileys as necessary

From: Erik Naggum
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Fri, 12 Oct 2001 23:15:29 +0000
Message-ID: <3211917326833236@naggum.net>

* Christophe Rhodes
| See <URL:http://ww.telent.net/cliki/PARTITION>, wherein a
| community-discussed function is described in roughly specification-level
| detail, with links to a reference implementation.

* Marco Antoniotti
| I am sorry to be sooo nagging (again) on such a stupid matter. But......
| The name PARTITION is inappropriate.  SPLIT-SEQUENCE is much more
| descriptive of what the function does.

* Tim Moore
| Get over it!

  But "partition" is such a _fantastically_ bad name, especially to people
  who know a bit of mathematical terminology.  Effectively using up that
  name forever for something so totally unrelated to the mathematical
  concept is hostile.  It is like defining a programming language where
  "sin" and "tan" are operations on (in) a massage parlor just because the
  designers are more familiar with them than with mathematics.  "Partition"
  is a good name for a string-related function when the _only_ thing you
  think about is strings, or sequences at best.  At the very least, it
  should be called partition-sequence, but even this sounds wrong to me.

  I tend to use :start and :end arguments to various functions instead of
  splitting one string into several, and make sure that functions I write
  accept :start and :end arguments, and that they work with all sequences
  and useful element types, not only strings and characters.
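
Applied to the IP address from the start of the thread, the :start/:end
style might look like the following sketch. PARSE-DOTTED-QUAD is a
hypothetical name; the element type (unsigned-byte 8) is the same set of
values as the (integer 0 255) in the original question.

```lisp
;; PARSE-DOTTED-QUAD is a hypothetical name used for illustration.
(defun parse-dotted-quad (string &key (start 0) (end (length string)))
  "Parse the dotted quad in STRING between START and END into a vector of
four octets, without creating intermediate substrings."
  (let ((octets (make-array 4 :element-type '(unsigned-byte 8)))
        (field-start start))
    (dotimes (i 4 octets)
      ;; The first three fields end at a dot, the last one at END.
      (let* ((field-end (if (< i 3)
                            (or (position #\. string
                                          :start field-start :end end)
                                (error "Too few fields in ~S" string))
                            end))
             (octet (parse-integer string
                                   :start field-start :end field-end)))
        (unless (<= 0 octet 255)
          (error "Octet ~D out of range in ~S" octet string))
        (setf (aref octets i) octet
              field-start (1+ field-end))))))
```

Since the function takes its own :start and :end, it can parse an address
embedded in a longer string, for example
(parse-dotted-quad "host=10.0.0.1;" :start 5 :end 13).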

///

From: Kenny Tilton
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Sat, 13 Oct 2001 02:11:00 +0000
Message-ID: <3BC7A2FB.98EFAC69@nyc.rr.com>

Erik Naggum wrote:
> 
> * Christophe Rhodes
> | See <URL:http://ww.telent.net/cliki/PARTITION>, wherein a
> | community-discussed function is described in roughly specification-level
> | detail, with links to a reference implementation.
> 
> * Marco Antoniotti
> | I am sorry to be sooo nagging (again) on such a stupid matter. But......
> | The name PARTITION is inappropriate.  SPLIT-SEQUENCE is much more
> | descriptive of what the function does.
> 
> * Tim Moore
> | Get over it!
> 
>   But "partition" is such a _fantastically_ bad name, especially to people
>   who know a bit of mathematical terminology.

hmmm. my dictionary says partition means to divide into parts. if
partition means something else to mathematicians, that's fine, natural
language is like that, but it's a bit harsh to moan about someone using
a word correctly just because someone else took liberties with it. 

besides, in a custody fight between mathematics and sequences over the
symbol-function of 'partition, well this is Lisp, I think sequences win.
we could solomon-like split the baby in half and not let anyone use
'partition, but consider this: the only sequence function I see listed
in CLTL2 which does not take a generic name (such as 'position) all for
itself is the trivial case of 'copy-seq.

i think the math literates amongst us gots to remember whose house they
are in when reading Lisp. (y'all can grok (+ 2 2) right?). seems to me
unadorned function names go to sequences, and it is the guest domains
that need to tack on tie-break syllables.

kenny
clinisys

>   Effectively using up that
>   name forever for something so totally unrelated to the mathematical
>   concept is hostile.  It is like defining a programming language where
>   "sin" and "tan" are operations on (in) massage parlor just because the
>   designers are more familiar with them than with mathematics.  "Partition"
>   is a good name for a string-related function when the _only_ thing you
>   think about is strings, or sequences at best.  At the very least, it
>   should be called partition-sequence, but even this sounds wrong to me.
> 
>   I tend to use :start and :end arguments to various functions instead of
>   splitting one string into several, and make sure that functions I write
>   accept :start and :end arguments, and that they work with all sequences
>   and useful element types, not only strings and characters.
> 
> ///

From: Bulent Murtezaoglu
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Sat, 13 Oct 2001 06:37:36 +0000
Message-ID: <878zeg11ue.fsf@nkapi.internal>

>>>>> "KT" == Kenny Tilton <·······@nyc.rr.com> writes:
[...]
    KT> hmmm. my dictionary says partition means to divide into
    KT> parts. if partition means something else to mathematicians,
    KT> that's fine, natural language is like that, but it's a bit
    KT> harsh to moan about someone using a word correctly just
    KT> because someone else took liberties with it. [...]

Unfortunately it also means something to computer scientists, possibly 
the same thing it means to mathematicians (what an equivalence relation 
does to a set) so the overlap is not just with some remote mathematical 
lingo.  When you say partition, a CS type would think of sets, not strings. 
I therefore don't think Erik was being unduly harsh.  

Of course I was too distracted/lazy to say any of this and even read cll 
when what became partition was being discussed, so I should probably shut 
up now.

cheers,

BM

From: Kenny Tilton
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Sat, 13 Oct 2001 08:49:52 +0000
Message-ID: <3BC80077.E1B17877@nyc.rr.com>

Bulent Murtezaoglu wrote:

>  When you say partition, a CS type would think of sets, not strings.
> I therefore don't think Erik was being unduly harsh.

<h> actually i was mimicking the teenager usage of "harsh", which usage
is highly exaggerated as with most teenspeak. and i was thinking of the
general case of objecting to someone using a word correctly, wasn't
thinking about EN's post at all at that point tho I can see why one
would construe it that way.

actually we used "partition" in our code recently in the sense you
described, in a partial DB replication scheme: a DB instance viewed as
partitioning the set of all DB instances according to whether the key
instance had a direct or indirect owning relationship of any given
instance.

that said, turning from my dictionary to my thesaurus I discover split
and partition listed together under "allocation". :(

do i hear you all saying that the objection is that this string
manipulation we are discussing takes an ordered sequence and chops it up
by finding certain delimiters and then crucially considering the order
when dividing up the string, ie, every element _between_ two delimiters
ends up in the same partition, whereas in partitioning order does not
matter, each set member gets tested individually with the predicate? if
so, ok, i get that distinction.
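
To make that distinction concrete, here is a minimal sketch. The names
PARTITION-BY and SPLIT-AT are made up for illustration and come from
neither the CLiki code nor any library. The first function classifies
each element on its own, so order plays no role in deciding membership;
the second cuts at delimiters, so the piece an element lands in depends
entirely on its position:

```lisp
;; Set-style partition: every element is tested individually.
;; Returns two values: the elements satisfying PREDICATE, and the rest.
(defun partition-by (predicate sequence)
  (values (remove-if-not predicate sequence)
          (remove-if predicate sequence)))

;; Order-dependent split: cut SEQUENCE at each element EQL to DELIMITER.
;; Everything between two delimiters ends up in the same piece.
(defun split-at (delimiter sequence)
  (loop with start = 0
        for pos = (position delimiter sequence :start start)
        collect (subseq sequence start pos)
        while pos
        do (setf start (1+ pos))))
```

With these definitions, (partition-by #'evenp '(1 2 3 4 5)) returns the
two values (2 4) and (1 3 5), while (split-at 0 '(1 2 0 3 0 4)) returns
((1 2) (3) (4)).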

sadly, i just looked up "split" and though the definition sounded as if
order was a factor in the partitioning denoted by "split", the two
examples given were "split into groups" and "split up the money". :(

interesting, what synonym for partition implies order matters? i guess
"subseq" kinda hits the problem over the head (just checked, that was
omitted from the list of sequence functions I saw in CLTL2) so with that
precedent something like 'split-sequence or 'splitseq would indeed be
preferable.

kenny
clinisys


From: Erik Naggum
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Sat, 13 Oct 2001 09:27:43 +0000
Message-ID: <3211954062966885@naggum.net>

* Kenny Tilton
| hmmm. my dictionary says partition means to divide into parts. if
| partition means something else to mathematicians, that's fine, natural
| language is like that, but it's a bit harsh to moan about someone using
| a word correctly just because someone else took liberties with it. 

  To repeat myself from the article you responded to, since a teenager's
  attention span is so short:

  At the very least, it should be called partition-sequence, but even this
  sounds wrong to me.
  
  The more general a name, the more general the functionality it should
  provide in order to defend usurping the general name.  If it only works
  on sequences and only uses _one_ meaning of a word at the exclusion of
  another, make it more specific.  I posted the first version of the code
  that got discussed and transmogrified and then renamed into "partition"
  without any discussion here.  It was called "split-sequence" as I recall.
  The code that they base "partition" on was initially called just "split"
  and renamed "partition".  Bad move.

  Common Lisp does not have a simple way to import a symbol from a package
  under another name.  This means the connection to a badly chosen name is
  broken if you choose to rename it.  This is all the more reason to be a
  little careful when you name things very generally.  "split" was horrible
  in that sense, too.  I notice in passing that Franz Inc's "aserve" has
  split-on-character, split-into-words, and split-string functions which
  all seem overly specific, but which are at least properly named.
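
  The renaming point can be sketched as follows. This is only an
  illustration under stated assumptions: the PARTITION below is a
  stand-in with a made-up body, not the CLiki reference implementation.
  Installing the same function under a second name is easy, but the
  result is a second, unrelated symbol, not the original symbol
  imported under a new name:

```lisp
;; Stand-in for a library function with a contested name (hypothetical body).
(defun partition (delimiter sequence)
  (loop with start = 0
        for pos = (position delimiter sequence :start start)
        collect (subseq sequence start pos)
        while pos
        do (setf start (1+ pos))))

;; Make the current definition callable under a second name.  This copies
;; the function object at this moment; a later redefinition of PARTITION
;; is not reflected in SPLIT-SEQUENCE.
(setf (fdefinition 'split-sequence) #'partition)
```

  After this, (split-sequence #\. "a.b") returns ("a" "b"), yet
  (eq 'split-sequence 'partition) is false, so anything keyed on the
  symbol itself (documentation, tools that look names up, later
  redefinitions) no longer connects the two.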

///

From: Russell Senior
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Sat, 13 Oct 2001 10:54:13 +0000
Message-ID: <86r8s7dd4a.fsf@coulee.tdb.com>

>>>>> "Erik" == Erik Naggum <····@naggum.net> writes:

Erik> [...] I posted the first version of the code that got discussed
Erik> and transmogrified and then renamed into "partition" without any
Erik> discussion here.  It was called "split-sequence" as I recall.

I think I might have been the one to call it split-sequence.  This
function was discussed on this newsgroup in September 1998, initially
in a thread titled "I don't understand Lisp".  During a discussion of
regular expressions (I think it was) Erik posted a function with a
slightly different interface and purpose called delimited-substrings,
and I followed up with one (pretty horrifying, but functioning) called
split-sequence, which I had adapted/generalized from one I'd found
called split-string.  Over the next few days it was substantially
revised/rewritten several times on the newsgroup by various authors.
At the end of that thread, it was still being called split-sequence,
which I continue to like and still use.

It appears this is what resurfaced in a still mutating form about a
year ago, called variously split and partition.

When the Christophe Rhodes "split-sequence/partition" thread started
back in June/July, I wasn't paying very much attention and so I didn't
participate.

BTW, one useful feature that got lost along the way seems to be the
ability to provide a value for empty subsequences.
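
A minimal sketch of what such a feature could look like. The function
name SPLIT-WITH-DEFAULT and the :EMPTY keyword are invented for this
example; the interface of the 1998 versions is not reproduced here.

```lisp
;; Split SEQUENCE at each element EQL to DELIMITER.  When :EMPTY is
;; supplied, every zero-length piece is replaced by that value instead
;; of being returned as an empty subsequence.
(defun split-with-default (delimiter sequence
                           &key (empty nil empty-supplied-p))
  (loop with start = 0
        for pos = (position delimiter sequence :start start)
        for piece = (subseq sequence start pos)
        collect (if (and empty-supplied-p (zerop (length piece)))
                    empty
                    piece)
        while pos
        do (setf start (1+ pos))))
```

For example, (split-with-default #\, "a,,b" :empty :missing) returns
("a" :MISSING "b"), whereas without :EMPTY the middle element is "".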

For what it's worth, I still like the name split-sequence.

-- 
Russell Senior         ``The two chiefs turned to each other.        
·······@aracnet.com      Bellison uncorked a flood of horrible       
                         profanity, which, translated meant, `This is
                         extremely unusual.' ''

From: Erik Naggum
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Sat, 13 Oct 2001 11:40:35 +0000
Message-ID: <3211962033165365@naggum.net>

* Russell Senior <·······@aracnet.com>
| I think I might have been the one to call it split-sequence.

  Yes.  Thank you for the correction and clarification.

| When the Christophe Rhodes "split-sequence/partition" thread started back
| in June/July, I wasn't paying very much attention and so I didn't
| participate.

  It looked to me like nobody really liked "partition" and the consensus
  was clearly on "split-sequence".  The name "partition" was just handed to
  us as something to accept despite the strong opposition.  However, I have
  not found the discussion behind this comment in "partition.lisp":

;;; * naming the function PARTITION rather than SPLIT.

  I wonder how this change was chosen.  Where can I find the discussion?

///

From: Rob Warnock
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Tue, 30 Oct 2001 03:11:36 +0000
Message-ID: <9rl5p8$466ss$1@fido.engr.sgi.com>

Erik Naggum  <····@naggum.net> wrote:
+---------------
| ;;; * naming the function PARTITION rather than SPLIT.
|   I wonder how this change was chosen.  Where can I find the discussion?
+---------------

I only saved a few of those, but here are some snippets from near the
end of the thread ("Subject: Re: (final?) PARTITION specification").
Hopefully the Message-IDs will help you find them:

	Date: 05 Jul 2001 10:23:25 -0400
	From: Marco Antoniotti <·······@cs.nyu.edu>
	Message-ID: <···············@octagon.mrl.nyu.edu>
	...
	I would use SPLIT-SEQUENCE and SPLIT-SEQUENCE-IF.  In this way
	it is clear that these functions work on any sequence.
      ====
	Date: 09 Jul 2001 23:46:56 +0100
	From: Christophe Rhodes <·····@cam.ac.uk>
	Message-ID: <··············@lambda.jesus.cam.ac.uk>
	...
	I remain unconvinced by the legion clamouring for a name change from
	partition, to be honest. I think that anything I choose will either
	clash with something else or be hideously ugly (or both, of course);
	so I'm going to stick to my guns and go with PARTITION. Sorry if that
	makes the code or the specification unuseable by anyone.
      ====
	Date: 10 Jul 2001 14:01:07 +0200
	Subject: Re: (final?) PARTITION specification
	Message-ID: <··············@orion.bln.pmsf.de>
	...
	In any case, while I'm the original proponent of sticking
	to PARTITION, I'd like to add that I could also live with
	SPLIT-SEQUENCE or maybe SPLIT-SEQ, if it mattered.

The general sense I got was that a *lot* of people were initially for
SPLIT, but then someone mentioned a conflict with the series package,
so most shifted to SPLIT-SEQUENCE, with decreasing support for PARTITION
as time wore on... except for Christophe. [Apologies if I've severely
mis-stated anything.]


-Rob

-----
Rob Warnock, 30-3-510		<····@sgi.com>
SGI Network Engineering		<http://www.meer.net/~rpw3/>
1600 Amphitheatre Pkwy.		Phone: 650-933-1673
Mountain View, CA  94043	PP-ASEL-IA

[Note: ·········@sgi.com and ········@sgi.com aren't for humans ]

From: Christophe Rhodes
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Sat, 13 Oct 2001 09:45:23 +0000
Message-ID: <sq669jrhzg.fsf@cam.ac.uk>

Erik Naggum <····@naggum.net> writes:

> * Christophe Rhodes
> | See <URL:http://ww.telent.net/cliki/PARTITION>, wherein a
> | community-discussed function is described in roughly specification-level
> | detail, with links to a reference implementation.
> 
> * Marco Antoniotti
> | I am sorry to be sooo nagging (again) on such a stupid matter. But......
> | The name PARTITION is inappropriate.  SPLIT-SEQUENCE is much more
> | descriptive of what the function does.
> 
> * Tim Moore
> | Get over it!
> 
>   But "partition" is such a _fantastically_ bad name, especially to people
>   who know a bit of mathematical terminology.  

I can't help but be slightly irritated by this, I'm afraid, as I noted
at the time the conspicuous absence of certain people (not just Erik)
in the debate about the splitting function and its naming, at times
when I thought they might well have something to contribute.

Nevertheless, the question is probably more "so what are we going to
do about it?" Well, that's a good question... my personal attitude at
this point right now is "why bother?"

No doubt my idealism will resurface at some point,

Christophe
-- 
Jesus College, Cambridge, CB5 8BL                           +44 1223 510 299
http://www-jcsu.jesus.cam.ac.uk/~csr21/                  (defun pling-dollar 
(str schar arg) (first (last +))) (make-dispatch-macro-character #\! t)
(set-dispatch-macro-character #\! #\$ #'pling-dollar)

From: Erik Naggum
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Sat, 13 Oct 2001 11:16:32 +0000
Message-ID: <3211960591334243@naggum.net>

* Christophe Rhodes <·····@cam.ac.uk>
| I can't help but be slightly irritated by this, I'm afraid, as I noted
| at the time the conspicuous absence of certain people (not just Erik)
| in the debate about the splitting function and its naming, at times
| when I thought they might well have something to contribute.

  Where did this debate occur?  I have just stuffed a private archive of a
  _lot_ of news into a huge database, and cannot find any discussion of the
  name "partition" in this forum.  If you go away and make up your own
  community and you do something stupid and somebody complains about it, it
  is fairly bad taste to blame the people _you_ left behind for not taking
  part in your discussion.  This is one of the reasons I do not think those
  mini-communities are doing any good.  You need a large number of people
  to weed out the silly ideas that look good to everyone in a small group.

| Nevertheless, the question is probably more "so what are we going to do
| about it?" Well, that's a good question... my personal attitude at this
| point right now is "why bother?"

  Yeah, why use something that is so badly named?  So, who cares?

  As I have indicated, I think splitting strings and creating huge amounts
  of garbage during parsing is bad software design.  The incessant copying
  of characters that plagues most parsers is _the_ source of bad performance.

///

From: Christophe Rhodes
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Sat, 13 Oct 2001 13:31:06 +0000
Message-ID: <sqelo7hdk5.fsf@cam.ac.uk>

Erik Naggum <····@naggum.net> writes:

> * Christophe Rhodes <·····@cam.ac.uk>
> | I can't help but be slightly irritated by this, I'm afraid, as I noted
> | at the time the conspicuous absence of certain people (not just Erik)
> | in the debate about the splitting function and its naming, at times
> | when I thought they might well have something to contribute.
> 
>   Where did this debate occur?  I have just stuffed a private archive of a
>   _lot_ of news into a huge database, and cannot find any discussion of the
>   name "partition" in this forum.  

A quick google gets me 

<URL:http://groups.google.com/groups?q=group:comp.lang.lisp+partition+split-sequence&hl=en&rnum=3&selm=y6clmm3lajm.fsf%40octagon.mrl.nyu.edu>

for instance; there's a thread of 35 articles, according to google.

>   If you go away and make up your own
>   community and you do something stupid and somebody complains about it, it
>   is fairly bad taste to blame the people _you_ left behind for not taking
>   part in your discussion.  

Granted; I thought that comp.lang.lisp was the closest we had to a
lisp community these days. If there's somewhere else that I should
have been writing, let me know, please!

>   This is one of the reasons I do not think those
>   mini-communities are doing any good.  You need a large number of people
>   to weed out the silly ideas that look good to everyone in a small group.

True. 

> | Nevertheless, the question is probably more "so what are we going to do
> | about it?" Well, that's a good question... my personal attitude at this
> | point right now is "why bother?"
> 
>   Yeah, why use something that is so badly named?  So, who cares?

Now, if on reading the thread one article of which is cited above, you
observe that I ignored the Kassandras telling me that "partition" is a
bad name, you might be more justified. Since I sincerely doubt (though
do disabuse me if this isn't true) that any vendor has yet adopted
partition, the cost to the community, wherever it resides, in changing
the specification to the extent of the name (to split-sequence,
cleave, or any of the other names discussed in that thread; see for
example
<URL:http://groups.google.com/groups?q=g:thl1983237200d&hl=en&selm=87ofr2hqdq.fsf%40palomba.bananos.org>)
is minimal[*]. Given this, it's not a problem.

>   As I have indicated, I think splitting strings and creating huge amounts
>   of garbage during parsing is bad software design.  The incessant copying
>   of characters that plagues most parsers is _the_ source of bad performance.

And this is another matter entirely. Nevertheless, given the frequency
of requests in this forum for "a string splitting function" it might
be useful to have something that was designed rather than 5 ad-hoc
security-flawed answers for each occasion.

Cheers,

Christophe

[*] It might cause me to persuade Dan Barlow to implement topic
synonyms for CLiki.
-- 
Jesus College, Cambridge, CB5 8BL                           +44 1223 510 299
http://www-jcsu.jesus.cam.ac.uk/~csr21/                  (defun pling-dollar 
(str schar arg) (first (last +))) (make-dispatch-macro-character #\! t)
(set-dispatch-macro-character #\! #\$ #'pling-dollar)

From: Erik Naggum
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Sat, 13 Oct 2001 17:02:24 +0000
Message-ID: <3211981343483341@naggum.net>

* Christophe Rhodes
| A quick google gets me
|
| <URL:http://groups.google.com/groups?q=group:comp.lang.lisp+partition+split-sequence&hl=en&rnum=3&selm=y6clmm3lajm.fsf%40octagon.mrl.nyu.edu>
|
| for instance; there's a thread of 35 articles, according to google.

No, this is not discussing the transition from "split" _to_ "partition".
Specifically, no article even attempts to explain how "partition" was
chosen and why it is a good name. The overwhelmingly negative response
to that name when you _did_ publish it was just ignored, as you admit. I
wonder how you can complain about people not raising their concerns when
you just walked away when they did.

| Since I sincerely doubt (though do disabuse me if this isn't true) that
| any vendor has yet adopted partition, the cost to the community, wherever
| it resides, in changing the specification to the extent of the name
| ([...]) is minimal[*]. Given this, it's not a problem.

It is and remains a problem if it is not _actually_ done.

* Erik Naggum
| As I have indicated, I think splitting strings and creating huge amounts
| of garbage during parsing is bad software design. The incessant copying
| of characters that plagues most parsers is _the_ source of bad performance.

* Christophe Rhodes
| And this is another matter entirely.

Well, some of us think that if people ask for tail recursion, they should
be told about other iteration constructs. It is downright sad that
Common Lisp, such a great language for its ability to maintain identity
of objects and therefore inherently "object-oriented" before anyone
invented that term, has succumbed to the very primitive properties of C
and Unix tools where copying characters around all the time is _not_ seen
as pretty damn stupid, which it is. Strings are fairly expensive objects
in Common Lisp -- they actually are in any language -- but copying text
is a more expensive operation in Common Lisp than in languages that do it
so often they have super-optimized copying functions. This is even more
true when the Common Lisp system uses Unicode internally and talks to a
world that still uses 7- or 8-bit-encoded character sets.

| Nevertheless, given the frequency of requests in this forum for "a string
| splitting function" it might be useful to have something that was
| designed rather than 5 ad-hoc security-flawed answers for each occasion.

Giving people what they want when they express their desire in the form
of an implementation of a solution they could not write on their own is
never going to help them. People who ask such questions need to be told
that they have to present their problem and not the solution they have
chosen in their _ignorance_ of the solution space.

But from where did "security-flawed" enter the picture? I sense another
matter entirely. :)

///
--
The United Nations before and after the leadership of Kofi Annan are two
very different organizations. The "before" United Nations did not deserve
much credit and certainly not a Nobel peace prize. The "after" United
Nations equally certainly does. I applaud the Nobel committee's choice.

From: Christophe Rhodes
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Sat, 13 Oct 2001 20:00:44 +0000
Message-ID: <sqwv1z48er.fsf@cam.ac.uk>

Erik Naggum <····@naggum.net> writes:

> * Christophe Rhodes
> | A quick google gets me 
> | 
> | <URL:http://groups.google.com/groups?q=group:comp.lang.lisp+partition+split-sequence&hl=en&rnum=3&selm=y6clmm3lajm.fsf%40octagon.mrl.nyu.edu>
> | 
> | for instance; there's a thread of 35 articles, according to google.
> 
>   No, this is not discussing the transition from "split" _to_ "partition".
>   Specifically, no article even attempts to explain how "partition" was
>   chosen and why it is a good name.  

Hmm. I'm afraid in that case I can't refer you to the recent thread.
However, and without wishing to hide behind other people as I confess
to liking the name, please see

<URL:http://groups.google.com/groups?hl=en&rnum=5&selm=878zrlp1cr.fsf%40orion.bln.pmsf.de>

which happened to be my starting point. Call it historical accident,
if you will; I certainly didn't mean to hide this. On rereading the
2001 thread, I agree that this wasn't made plain.

>   The overwhelmingly negative response
>   to that name when you _did_ publish it was just ignored, as you admit.  I
>   wonder how you can complain about people not raising their concerns when
>   you just walked away when they did.

Some liked it, maybe most didn't; there was no consensus that I could
see on a preferred alternative; and there was agreement (or at least
nem con) when it was said:

Pierre Mai:
> In any case I agree that as long
> as the name is not unduly ugly, this is probably the least problem
> hindering acceptance.

It would appear that this isn't true. It would have been nice to know
that there was strong feeling from several sources.

> | Since I sincerely doubt (though do disabuse me if this isn't true) that
> | any vendor has yet adopted partition, the cost to the community, wherever
> | it resides, in changing the specification to the extent of the name
> | ([...]) is minimal[*]. Given this, it's not a problem.
> 
>   It is and remains a problem if it is not _actually_ done.

A community-based standard is obviously only as strong as the
community's use for it. However, my impression was in fact that few
people were at all interested, few were paying attention and few
actually cared. I'm unfairly maligning people, but this was my
_impression_.

There is no defined mechanism for finalizing or changing
specifications of this kind; also, the best-formatted and
most-easily-accessible version of the specification is on a
world-writeable web page. Maybe we need a more formal structure for
these things (à la SRFI)? Discussion period, followed by vote? I don't
know. Since I seem still to be at the focus, I'm willing, I suppose,
to tally votes, or something.

I mean, I don't know. Fortunately, should the consensus be for a name
change, we've recently discussed here ways of deprecating
interfaces...  :-)

>   But from where did "security-flawed" enter the picture?  I sense another
>   matter entirely.  :)

Oh, that was a reference to the read-based solutions to this problem
that appear with appalling regularity. :)
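
To illustrate the concern: READ-FROM-STRING runs the full Lisp reader,
and with *READ-EVAL* at its default true value the "#." syntax
evaluates arbitrary code embedded in the input. A minimal sketch of a
safer alternative for one address field follows; the name PARSE-OCTET
is made up for this example.

```lisp
;; Parse one decimal field of a dotted-quad address without calling the
;; Lisp reader.  PARSE-INTEGER signals an error on anything that is not
;; an integer (it does tolerate surrounding whitespace and a sign), and
;; the range check rejects values outside 0-255.
(defun parse-octet (string &key (start 0) (end nil))
  (let ((n (parse-integer string :start start :end end)))
    (unless (<= 0 n 255)
      (error "~S is not in the range 0-255" n))
    n))
```

By contrast, (read-from-string "#.(+ 1 2)") returns 3, which shows the
reader evaluating code from the input, while (parse-octet "#.(+ 1 2)")
signals an error.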

Christophe

PS: I would like to say, since I've invoked Pierre's posts twice, just
in case that it isn't clear: I am speaking for myself only in this
message. There is no cabal.
-- 
Jesus College, Cambridge, CB5 8BL                           +44 1223 510 299
http://www-jcsu.jesus.cam.ac.uk/~csr21/                  (defun pling-dollar 
(str schar arg) (first (last +))) (make-dispatch-macro-character #\! t)
(set-dispatch-macro-character #\! #\$ #'pling-dollar)

From: Christophe Rhodes
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Fri, 12 Oct 2001 14:43:14 +0000
Message-ID: <sqbsjcrkal.fsf@cam.ac.uk>

Marco Antoniotti <·······@cs.nyu.edu> writes:

> Christophe Rhodes <·····@cam.ac.uk> writes:
> 
> > [ superseded to clarify ]
> > 
> > Daniel Pittman <······@rimspace.net> writes:
> > 
> > > I am looking for the simplest way to split a string into four strings
> > > based on a character -- to parse an IP address string, specifically.
> > >
> > > [snip] 
> > >
> > > This strikes me as the sort of idiom that would be common enough for
> > > Common Lisp[1] to feature it as part of the standard.
> > > 
> > > I would also be interested to know if y'all can suggest a general way to
> > > do this for generalized sequences as well as for strings, but that's not
> > > what I need to do right now.
> > 
> > See <URL:http://ww.telent.net/cliki/PARTITION>, wherein a
> > community-discussed function is described in roughly
> > specification-level detail, with links to a reference
> > implementation.
> 
> I am sorry to be sooo nagging (again) on such a stupid matter. But......
> 
> The name PARTITION is inappropriate.  SPLIT-SEQUENCE is much more
> descriptive of what the function does.

I suppose this depends if you're a physicist or a set theorist; to a
physicist (me, for example) partition has connotation of putting
partitions into something, to divide it up; 

I freely give permission to vendors to include the partition code in
their Lisps; if vendors think that it will help, they are free to call
it 'SPLIT-SEQUENCE' if they like, or 'SPLIT', or whatever. Not that I
generally believe in appeals to the market to determine correctness,
but in this case it's my way of dodging the issue. I *like* the name
'PARTITION', so that's what I call it; others are free to do
otherwise, though as a matter of unifying the community I would rather
hope that they didn't. Ultimately, I accept the possibility that I
will be in a minority of one.

Anyone else want to volunteer ideas for utility functions that
everyone writes? Imagine that CL had a 'PARTITION' in the language;
what would people lament the absence of to comp.lang.lisp once a week?

Cheers,

Christophe 
-- 
Jesus College, Cambridge, CB5 8BL                           +44 1223 510 299
http://www-jcsu.jesus.cam.ac.uk/~csr21/                  (defun pling-dollar 
(str schar arg) (first (last +))) (make-dispatch-macro-character #\! t)
(set-dispatch-macro-character #\! #\$ #'pling-dollar)

From: Dr. Edmund Weitz
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Fri, 12 Oct 2001 09:24:17 +0000
Message-ID: <m3ofndgqim.fsf@bird.agharta.de>

Daniel Pittman <······@rimspace.net> writes:

> I am looking for the simplest way to split a string into four strings
> based on a character -- to parse an IP address string, specifically.

I needed something similar last week and came up with this solution:

(defun split (sequence &key
                       (test #'(lambda (x) (eql x #\Space))))
  "Returns a list of sub-sequences of SEQUENCE where each
element that satisfies TEST is treated as a separator."
  (let (result)
    (do* ((old-pos (position-if-not test sequence)
                   (when old-pos
                     (position-if-not test sequence
                                      :start old-pos))))
         ((null old-pos) (nreverse result))
     (let ((new-pos
              (position-if test sequence
                           :start old-pos)))
       (if new-pos
           (setf result (cons
                         (subseq sequence old-pos new-pos)
                         result)
                 old-pos (1+ new-pos))
         (setf result (cons
                       (subseq sequence old-pos)
                       result)
               old-pos nil))))))

Note that this might not be very fast; I didn't need it to be. Also note
that I'm rather new to CL, so others here will definitely have better
solutions.

Best regards,
Edi.

From: Shannon Spires
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Fri, 12 Oct 2001 21:52:04 +0000
Message-ID: <svspire-09C52A.15520412102001@news.sandia.gov>

In article <··············@inanna.rimspace.net>, Daniel Pittman 
<······@rimspace.net> wrote:

> Oh, and am I making a really silly mistake storing an IP address in a
> slot of ":type (vector (integer 0 255) 4)"?

I usually store them as 32-bit integers. It's simple that way, and 
my TCP/IP stack routines use integers internally anyway. Provided you 
have good conversion routines to and from dotted notation for human I/O, 
it works well.
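
A minimal sketch of such conversion routines, assuming IPv4 dotted-quad
input. The names DOTTED-TO-INTEGER and INTEGER-TO-DOTTED are made up
for this example and are not taken from any particular TCP/IP stack.

```lisp
;; Convert "a.b.c.d" to a 32-bit integer, most significant octet first.
;; Signals an error for the wrong number of fields or an out-of-range field.
(defun dotted-to-integer (string)
  (let ((result 0)
        (start 0))
    (dotimes (i 4 result)
      (let ((dot (position #\. string :start start)))
        ;; The first three fields must end at a dot; the last must not.
        (when (if (< i 3) (null dot) dot)
          (error "Malformed IPv4 address: ~S" string))
        (let ((octet (parse-integer string :start start :end dot)))
          (unless (<= 0 octet 255)
            (error "Octet ~D out of range in ~S" octet string))
          (setf result (+ (ash result 8) octet))
          (when dot
            (setf start (1+ dot))))))))

;; Convert a 32-bit integer back to dotted notation for human I/O.
(defun integer-to-dotted (address)
  (format nil "~D.~D.~D.~D"
          (ldb (byte 8 24) address)
          (ldb (byte 8 16) address)
          (ldb (byte 8 8) address)
          (ldb (byte 8 0) address)))
```

For example, (dotted-to-integer "210.23.138.16") returns #xD2178A10,
and INTEGER-TO-DOTTED maps that value back to "210.23.138.16".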

Shannon Spires
·······@nmia.com

From: Thomas F. Burdick
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Mon, 15 Oct 2001 00:44:38 +0000
Message-ID: <xcvvghhpw95.fsf@conquest.OCF.Berkeley.EDU>

Daniel Pittman <······@rimspace.net> writes:

> I am looking for the simplest way to split a string into four strings
> based on a character -- to parse an IP address string, specifically.
> 
> What is the best, easiest, fastest, etc, way to split a string into
> substrings based on a character position.  In Emacs Lisp I would just:
> 
> (let ((address "210.23.138.16"))
>   (split-string address "\\."))  ; second arg is regexp to split on.
> 
> Now, I don't actually need regexp functionality here; a literal '.' is
> enough for me.
> 
> This strikes me as the sort of idiom that would be common enough for
> Common Lisp[1] to feature it as part of the standard.

Except that, as I'm sure you've seen by now, it's a source of
contention as to how exactly this should be done.  I'd had a
SPLIT-VECTOR function that I used to use:

  [57]> (split-vector " la dee dah " #\space :start 1)
  ("la" "dee" "dah" "")

But I didn't like all the pointless consing.  So I'd been doing it by
hand with LOOP and reusing the string.  Then, I felt stupid for using
an idiom that I could turn into a more concise macro.  So I came up
with DO-VECTOR-SPLIT.  I don't really like the name, but it does act
like a DO-... macro.

(defun call-splitting-vector (vector splitter fn
                                     &key (start 0)
                                          (end (length vector))
                                          (test #'eql))
  (when (< start 0)
    (error ":START should be <= 0, not ~S" start))
  (when (> end (length vector))
    (error ":END is out of bounds"))
  (loop with begin = start
        for i from start below end
        when (funcall test splitter (aref vector i))
        do (funcall fn begin i)
        (setf begin (1+ i))
        finally (funcall fn begin i)))

(defmacro do-vector-split ((start end (vector splitter)
                                  &rest keys &key &allow-other-keys)
                           &body forms)
  (when (null start) (setf start (gensym)))
  (when (null end) (setf end (gensym)))
  `(call-splitting-vector ,vector ,splitter
                          #'(lambda (,start ,end) ,@forms)
                          ,@keys))

You can use DO-VECTOR-SPLIT to collect a list of substrings:

  [58]> (let ((result ())
              (string " la dee dah "))
          (do-vector-split (s e (string #\space)
                              :start 1)
            (push (subseq string s e) result))
          (nreverse result))
  ("la" "dee" "dah" "")

Of course, you can always define SPLIT-VECTOR in terms of
CALL-SPLITTING-VECTOR:

  (defun split-vector (vector splitter &rest keys &key &allow-other-keys)
    (let ((result ()))
      (apply #'call-splitting-vector
             vector splitter #'(lambda (s e)
                                 (push (subseq vector s e) result))
             keys)
      (nreverse result)))

But you'll probably only very rarely need it, because you're probably
splitting the string as an intermediate step to some other end
(stuffing numbers into a vector that represents an IP address, for
example), so you may as well avoid consing up new strings and a new
list just to throw them away.
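
A minimal sketch of that last point, written as a standalone function
rather than in terms of the macro above. The name PARSE-IP-ADDRESS is
made up for this example. It walks the string once and hands index
pairs to PARSE-INTEGER, so no substrings and no intermediate list are
created:

```lisp
;; Parse "a.b.c.d" straight into a 4-element octet vector, using
;; :START/:END bounds instead of SUBSEQ.
(defun parse-ip-address (string)
  (let ((octets (make-array 4 :element-type '(unsigned-byte 8)))
        (count 0)
        (begin 0))
    (flet ((store (start end)
             (when (>= count 4)
               (error "Too many fields in ~S" string))
             (let ((n (parse-integer string :start start :end end)))
               (unless (<= 0 n 255)
                 (error "Field ~D out of range in ~S" n string))
               (setf (aref octets count) n)
               (incf count))))
      ;; Each dot closes the field that started at BEGIN.
      (loop for i from 0 below (length string)
            when (char= (char string i) #\.)
              do (store begin i)
                 (setf begin (1+ i)))
      ;; The final field runs to the end of the string.
      (store begin (length string)))
    (unless (= count 4)
      (error "Too few fields in ~S" string))
    octets))
```

For example, (parse-ip-address "210.23.138.16") returns a vector
holding 210, 23, 138 and 16.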

From: G. W. Puckett
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Mon, 15 Oct 2001 14:29:53 +0000
Message-ID: <b32pu7pdli6.fsf@w4pphx2t.us.nortel.com>

···@conquest.OCF.Berkeley.EDU (Thomas F. Burdick) writes:
>   (when (< start 0)
>     (error ":START should be <= 0, not ~S" start))

Is this error message incorrect?


-- 
I. M. Puckett                              replace "sendnospam" with "puckett"

From: Thomas F. Burdick
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Date: Mon, 15 Oct 2001 19:40:24 +0000
Message-ID: <xcvzo6sadzr.fsf@tornado.OCF.Berkeley.EDU>

··········@nortelnetworks.com (G. W. Puckett) writes:

> ···@conquest.OCF.Berkeley.EDU (Thomas F. Burdick) writes:
> >   (when (< start 0)
> >     (error ":START should be <= 0, not ~S" start))
> 
> Is this error message incorrect?

Oops, I've even gotten that error before, but since I wrote the
message, I knew what I meant:

  (when (< start 0)
    (error ":START should be >= 0, not ~S" start))