From: Dr. Mirko Luedde
Subject: newbie: emacs 'split-string'
Date: 
Message-ID: <38DB24DC.9E9DECB3@Computer.Org>
Hi folks, 

I have a problem with the emacs lisp 'split-string' function.

Let's say I want to split the string "abc" at every match of the
regular expression "a+".  The expression (split-string "bab" "a+")
evaluates to ("b" "b"), which is fine.

On the other hand, (split-string "abc" "a+") evaluates to ("bc")
instead of ("" "bc").  This means there is no leading empty
string, despite the leading match. So in a sense, 'split-string' does
not really 'split'. We lose information about the matches at the
beginning (and at the end).

How can I obtain the desired behavior?  Surrounding the string with a
leading and trailing character does not work, since that character
could interfere with the regexp match. E.g., if we choose "a" as the
leading/trailing character (some definite character must be chosen),
evaluating (split-string "aabca" "a+") still yields ("bc").
Note that this example shows that 'split-string' does
MATCH the regexp at the beginning of a string. It just does not insert
a "" into the result list.

I'm not claiming there is anything wrong with 'split-string', I simply
need a different function.

Any help is appreciated. 

Cheers, Mirko. 

-- 
Dr. Mirko Luedde
············@Computer.Org
From: Tom Breton
Subject: Re: newbie: emacs 'split-string'
Date: 
Message-ID: <m3hfdwnqn3.fsf@world.std.com>
"Dr. Mirko Luedde" <············@Computer.Org> writes:

> Hi folks, 
> 
> I have a problem with the emacs lisp 'split-string' function.
> 
> Let's say I want to split the string "abc" at every match of the
> regular expression "a+".  The expression (split-string "bab" "a+")
> evaluates to ("b" "b"), which is fine.
> 
> On the other hand, (split-string "abc" "a+") evaluates to ("bc")
> instead of ("" "bc").  This means there is no leading empty
> string, despite the leading match. So in a sense, 'split-string' does
> not really 'split'. We lose information about the matches at the
> beginning (and at the end).
> 
> In which way can I obtain the desired behavior?  

Add the leading and trailing empty strings by hand,
conditionally.  Pseudocode:

        (setq parts (split-string string regex))

        (when (string-match (concat "^" regex) string)
          ;; add a leading "" to parts
          )

        (when (string-match (concat regex "$") string)
          ;; add a trailing "" to parts
          )

You'll want to build the regular expressions a little more carefully
in case they already begin/end with bol/eol, but I'm sure you get the
idea.
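
For anyone who would rather not build anchored regexps at all (note
that "^" and "$" match at line boundaries, not only at the ends of the
string), here is a minimal sketch of a different technique: skip
'split-string' and loop over 'string-match' directly, collecting every
piece between matches, empty or not. The function name
'my-split-string-keep-ends' is made up for illustration, and the sketch
assumes the regexp never matches the empty string.

```elisp
;; Hypothetical helper; the name is not from any library.
(defun my-split-string-keep-ends (string regex)
  "Split STRING at each match of REGEX, keeping empty strings.
A match at the very start or end of STRING yields a \"\" element.
Assumes REGEX never matches the empty string."
  (let ((start 0)
        (parts nil))
    ;; Stop when there is no further match, or on an empty match
    ;; (which would otherwise loop forever).
    (while (and (string-match regex string start)
                (> (match-end 0) (match-beginning 0)))
      ;; Text between the previous match and this one, possibly "".
      (push (substring string start (match-beginning 0)) parts)
      (setq start (match-end 0)))
    ;; Whatever follows the last match, possibly "".
    (push (substring string start) parts)
    (nreverse parts)))
```

With this sketch, (my-split-string-keep-ends "abc" "a+") gives
("" "bc"), and (my-split-string-keep-ends "aabca" "a+") gives
("" "bc" ""), while "bab" still splits into ("b" "b").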

> 
> I'm not claiming there is anything wrong with 'split-string', I simply
> need a different function.

-- 
Tom Breton, http://world.std.com/~tob
Not using "gh" since 1997. http://world.std.com/~tob/ugh-free.html
Rethink some Lisp features, http://world.std.com/~tob/rethink-lisp/index.html