I had a need just now to replace one substring in my string with a different
substring, of a different length (for quoting parameters to SQL queries,
incidentally). It appears that accomplishing this isn't straightforward in
CL. Below is what I came up with. Does anyone have a better way than this?
You'd figure CL would have something that does this, but I couldn't find
anything. Maybe it's time to read the standard cover-to-cover.
Chris Capel
(defun replace-sequence (array old new)
(do ((new-array (make-array (length array)
:adjustable t
:element-type (array-element-type array)))
(new-array-position 0 (+ new-array-position (length old)))
(array-position (search old array)
(search old array :start2 (1+ array-position)))
(prev-array-position 0 array-position)
(difference (if (> 0 #1=(- (length new) (length old))) nil #1#)))
((not array-position) (progn (replace new-array array
:start1 new-array-position
:start2 prev-array-position)
new-array))
(replace new-array array
:start1 new-array-position
:start2 prev-array-position :end2 array-position)
(incf new-array-position (- array-position prev-array-position))
(when difference
(adjust-array new-array (+ (length new-array) difference)))
(replace new-array new :start1 new-array-position)))
Chris Capel <······@iba.nktech.net> writes:
>
> I had a need just now to replace one substring in my string with a different
> substring, of a different length (for quoting parameters to SQL queries,
> incidentally). It appears that accomplishing this isn't straightforward in
> CL. Below is what I came up with. Does anyone have a better way than this?
> You'd figure CL would have something that does this, but I couldn't find
> anything. Maybe it's time to read the standard cover-to-cover.
If I need to do something like that, I usually end up using a string
stream and writing various substrings to it, splicing in the substituted
values.
The advantage is that the string maintenance is nice and clean.
One drawback is that the setup overhead and consing is not trivial, so
it is often worthwhile to first do a search on the string to see if the
substring to be replaced even occurs. It adds an additional pass
through the string, but if cases where no substitution is needed, it can
provide a big speedup. That makes it perhaps useful in a general
utility than in special purpose code where you know or expect to be
doing the substitution.
Herewith is my entry:
(defun string-substitute (string old new)
(if (search old string)
(with-output-to-string (out)
(loop with old-len = (length old)
for start = 0 then (+ end old-len)
as end = (search old string :start2 start)
do (write-string (subseq string start end) out)
(if end
(write-string new out)
(loop-finish))))
string))
--
Thomas A. Russ, USC/Information Sciences Institute
Thomas A. Russ wrote:
> Chris Capel <······@iba.nktech.net> writes:
>
>>
>> I had a need just now to replace one substring in my string with a
>> different substring, of a different length (for quoting parameters to SQL
>> queries, incidentally). It appears that accomplishing this isn't
>> straightforward in CL. Below is what I came up with. Does anyone have a
>> better way than this? You'd figure CL would have something that does
>> this, but I couldn't find anything. Maybe it's time to read the standard
>> cover-to-cover.
>
> If I need to do something like that, I usually end up using a string
> stream and writing various substrings to it, splicing in the substituted
> values.
>
> The advantage is that the string maintenance is nice and clean.
Very elegant, I agree. I'm of the opinion, however, that you write it once,
and forget about it. So do it well, make it fast, and you're set
forevermore.
Chris Capel
Chris Capel <······@iba.nktech.net> writes:
> Thomas A. Russ wrote:
>
> > Chris Capel <······@iba.nktech.net> writes:
> >
> >>
> >> I had a need just now to replace one substring in my string with a
> >> different substring, of a different length (for quoting parameters to SQL
> >> queries, incidentally). It appears that accomplishing this isn't
> >> straightforward in CL. Below is what I came up with. Does anyone have a
> >> better way than this? You'd figure CL would have something that does
> >> this, but I couldn't find anything. Maybe it's time to read the standard
> >> cover-to-cover.
> >
> > If I need to do something like that, I usually end up using a string
> > stream and writing various substrings to it, splicing in the substituted
> > values.
> >
> > The advantage is that the string maintenance is nice and clean.
>
> Very elegant, I agree. I'm of the opinion, however, that you write it once,
> and forget about it. So do it well, make it fast, and you're set
> forevermore.
Thomas Russ' is both efficient (assuming an efficient w-o-t-string),
and, very importantly, it scales nicely to large strings. Here's what
I get on sbcl: On small strings with small replacements, your is about
30% faster; on large strings with large replacements, it's many orders
of magnitude slower. When you're evaluating stuff for library use,
don't make the mistake of only considering best-case behavior!
Jason Kantz wrote:
> You could be better off using a library like CL-PPCRE.
> http://www.weitz.de/cl-ppcre/#regex-replace
Wow. That's about seven times as fast as mine.
Wow.
Chris Capel
Chris Capel wrote:
> I had a need just now to replace one substring in my string with a different
> substring, of a different length (for quoting parameters to SQL queries,
> incidentally). It appears that accomplishing this isn't straightforward in
> CL. Below is what I came up with. Does anyone have a better way than this?
> You'd figure CL would have something that does this, but I couldn't find
> anything. Maybe it's time to read the standard cover-to-cover.
>
(defun replace-sequence (array old new &key (test #'eql))
(let ((new-array (make-array (length array)
:adjustable t
:fill-pointer 0
:element-type (array-element-type array))))
(loop with index = 0
while (< index (length array))
if (and (funcall test (elt array index) (elt old 0))
(search old array :start2 index :end2 (+ index (length old)) :test test))
do
(loop for new-index from 0 below (length new)
do (vector-push-extend (elt new new-index) new-array))
(incf index (length old))
else do
(vector-push-extend (elt array index) new-array)
(incf index))
new-array))
CL-USER 9 > (replace-sequence "test spot the dog" "dog" "the dogs"
:test #'char=)
"test spot the the dogs"
CL-USER 10 > (replace-sequence "test spot the dog" "the" "many"
:test #'char=)
"test spot many dog"
CL-USER 11 >
It is a little confusing if you want this function to work on all
sequences (in which case it needs some work or change the name)
or just arrays.
Wade
Wade Humeniuk wrote:
> Chris Capel wrote:
>
>> I had a need just now to replace one substring in my string with a
>> different substring, of a different length (for quoting parameters to SQL
>> queries, incidentally). It appears that accomplishing this isn't
>> straightforward in CL. Below is what I came up with. Does anyone have a
>> better way than this? You'd figure CL would have something that does
>> this, but I couldn't find anything. Maybe it's time to read the standard
>> cover-to-cover.
>>
>
> (defun replace-sequence (array old new &key (test #'eql))
> (let ((new-array (make-array (length array)
> :adjustable t
> :fill-pointer 0
> :element-type (array-element-type
> :array))))
> (loop with index = 0
> while (< index (length array))
> if (and (funcall test (elt array index) (elt old 0))
> (search old array :start2 index :end2 (+ index (length
> old)) :test test))
> do
> (loop for new-index from 0 below (length new)
> do (vector-push-extend (elt new new-index) new-array))
> (incf index (length old))
> else do
> (vector-push-extend (elt array index) new-array)
> (incf index))
> new-array))
Hmm. That's pretty fast--faster than my original code--but not as fast as
this:
(defun replace-sequence (array old new)
(let ((hits (list (- 0 (length old))))
(n-hits 0))
(let (hit (pos 0))
(while (setf hit (search old array :start2 pos))
(push hit hits)
(incf n-hits)
(setf pos (1+ hit)))
(setf hits (nreverse hits)))
(let* ((difference (- (length new) (length old)))
(new-array (make-array (+ (length array)
(* difference n-hits))
:element-type (array-element-type array)))
(new-array-position 0))
(docdr (x hits)
(let ((start (+ (car x) (length old)))
(next (cadr x)))
(replace new-array array
:start1 new-array-position
:start2 start :end2 next)
(when next
(incf new-array-position (- next start))
(replace new-array new
:start1 new-array-position)
(incf new-array-position (length new)))))
new-array)))
Yours is a bit easier to read, because of the fill-pointers, though. (I
could probably use loop, but I don't know if that would help much, and then
I'd have to learn it.) It'd be nice if I could have some sort of
fill-pointer that let me append sequences to my sequence at the fill
pointer position. Then I could get rid of some counters and do something
nicer than REPLACE.
Oh, and in case you want to test this, you'll probably need these:
(defmacro while (eval-form &body body)
(with-gensyms (ret)
`(do ((,ret))
((not ,eval-form) ,ret)
(setf ,ret
(progn
,@body)))))
(defmacro docdr ((var list) &body body)
`(do ((,var ,list (cdr ,var)))
((null ,var))
,@body))
Worth the price of avoiding loop, I say. Speaking of which, I could probably
include collecting in the first part of my code above, so that it becomes
(loop with hit = -1 then (search old array :start2 (1+ hit))
while hit
do collect hit into hits
(incf n-hits))
My first loop! Thanks to Peter Seibel's book for a quick reference. It's an
improvement of two lines. Also, considerably faster than my WHILE. I wonder
where the speed boost is coming from. But ... I shudder at the sheer
loopiness of it. (*groan*) The second section of my function could probably
be replaced with some sort of loop equivalent. Anyone care to take a shot?
I wonder if there's a way to make COLLECT available separately from any sort
of iteration facilities. It'd require some masterful macrology, I do
believe. But I seem to remember a package simulating pointers to setf'able
places that would help out nicely in the exercise.
> It is a little confusing if you want this function to work on all
> sequences (in which case it needs some work or change the name)
> or just arrays.
It'd probably need to be able to create a blank sequence of the same format
as the input (nil in the case of a list). It'd probably be best to provide
two sections of the body that worked only on vectors and lists,
respectively. Unless you can figure out how this abstraction applies to any
array. Is there anything besides lists and vectors to worry about?
Chris Capel
Chris Capel <······@iba.nktech.net> writes:
> (defun replace-sequence (array old new)
> (let ((hits (list (- 0 (length old))))
> (n-hits 0))
> (let (hit (pos 0))
> (while (setf hit (search old array :start2 pos))
> (push hit hits)
> (incf n-hits)
> (setf pos (1+ hit)))
> (setf hits (nreverse hits)))
Hmmm. This is interesting. You seem to allow for overlapping
replacements, since each match position is incremented by 1 instead
of the length of OLD. Do you get what you expect from the
following call:
(replace-sequence "aaaaWWWWaaaWWWaa" "aa" "bb")
BTW, when I tried the above, I got an error....
> (let* ((difference (- (length new) (length old)))
> (new-array (make-array (+ (length array)
> (* difference n-hits))
> :element-type (array-element-type array)))
> (new-array-position 0))
> (docdr (x hits)
> (let ((start (+ (car x) (length old)))
> (next (cadr x)))
> (replace new-array array
> :start1 new-array-position
> :start2 start :end2 next)
> (when next
> (incf new-array-position (- next start))
> (replace new-array new
> :start1 new-array-position)
> (incf new-array-position (length new)))))
> new-array)))
--
Thomas A. Russ, USC/Information Sciences Institute
Chris Capel wrote:
>
> It'd probably need to be able to create a blank sequence of the same format
> as the input (nil in the case of a list). It'd probably be best to provide
> two sections of the body that worked only on vectors and lists,
> respectively. Unless you can figure out how this abstraction applies to any
> array. Is there anything besides lists and vectors to worry about?
>
Like Frank's version, a generalized sequence version:
(defun replace-sequence (seq old new &key (start 0) (end nil) (test #'eql))
(if (< start (or end (length seq)))
(let ((position (search old seq :start2 start :end2 end)))
(if position
(concatenate (type-of seq)
(subseq seq start position)
new
(replace-sequence seq
old new
:start (+ position (length old))
:end end
:test test))
(subseq seq start)))
(subseq seq start)))
Wade
There are a few bugs in the code.
A better version is:
(defun replace-sequence (array old new &key (test #'eql))
(assert (> (length old) 0))
(let ((new-array (make-array (length array)
:adjustable t
:fill-pointer 0
:element-type (array-element-type array))))
(loop with index = 0
while (< index (length array))
if (search old array :start2 index
:end2 (min (length array) (+ index (length old)))
:test test)
do
(loop for new-index from 0 below (length new)
do (vector-push-extend (elt new new-index) new-array))
(incf index (length old))
else do
(vector-push-extend (elt array index) new-array)
(incf index))
new-array))
Wade
Perhaps something like:
(defun replace-string (string old new)
(let ((end-of-first (search old string)))
(if (not end-of-first) string ;; nothing to replace
(concatenate 'string
(subseq string 0 end-of-first)
new
(subseq string (+ end-of-first (length old)))))))
This _is_ consy, and not general purpose (we are assuming old occurs
at most once in string), but that's what I'd do first.
cheers,
BM
Frank Buss wrote:
> A function, which works with every sequence (I hope):
>
> (defun replace-sequence (seq old new &key (start 0))
> (let ((seq-len (length seq))
> (old-len (length old))
> (new-len (length new)))
> (if (or (< (- seq-len start) old-len) (= old-len 0))
> (copy-seq seq)
> (if (search seq old :start1 start :end1 (+ start old-len))
> (replace-sequence
> (concatenate (type-of seq)
^^^^^^^^^^^^^
This doesn't seem to be portable. In LispWorks:
(type-of "foo") => SIMPLE-TEXT-STRING
But in CLISP and SBCL:
(type-of "foo") => (SIMPLE-BASE-STRING 3)
So your code works as long as the NEW sequence is the same length as the
OLD:
(replace-sequence "Is this not pung?" "this" "that") => "Is that not pung?"
But if the lengths are different you violate the type specifier:
(replace-sequence "abc" "b" "123") =>
*** - sequence type forces length 3, but result has length 5
I think you need something like:
(concatenate (typecase seq (string 'string) (vector 'vector)) ...)
David Sletten
David Sletten <·····@slytobias.com> wrote:
> But if the lengths are different you violate the type specifier:
> (replace-sequence "abc" "b" "123") =>
> *** - sequence type forces length 3, but result has length 5
>
> I think you need something like:
> (concatenate (typecase seq (string 'string) (vector 'vector)) ...)
but if I create my own sequence type, I have to add it to this typecase
list. How can I write a concatenate, which works for all sequence types,
and which perhaps creates a (SIMPLE-BASE-STRING 5) in CLISP, if the inputs
are (SIMPLE-BASE-STRING 2) and (SIMPLE-BASE-STRING 3)?
--
Frank Bu�, ··@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
In article <···········@newsreader2.netcologne.de>,
Frank Buss <··@frank-buss.de> wrote:
> David Sletten <·····@slytobias.com> wrote:
>
> > But if the lengths are different you violate the type specifier:
> > (replace-sequence "abc" "b" "123") =>
> > *** - sequence type forces length 3, but result has length 5
> >
> > I think you need something like:
> > (concatenate (typecase seq (string 'string) (vector 'vector)) ...)
>
> but if I create my own sequence type, I have to add it to this typecase
> list. How can I write a concatenate, which works for all sequence types,
> and which perhaps creates a (SIMPLE-BASE-STRING 5) in CLISP, if the inputs
> are (SIMPLE-BASE-STRING 2) and (SIMPLE-BASE-STRING 3)?
CL doesn't provide a way for you to create new sequence types, so it
naturally doesn't provide a way to extend the sequence functions, either.
--
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
Chris Capel wrote:
> You'd figure CL would have something that does this,
> but I couldn't find anything.
Well, I guess it doesn't. How did something like this not make it into the
standard? Did people just not use strings back then? Every other language I
know of has something to do this, that works at least for strings, built
in. Was the ANSI committee simply too pressed for time?
Chris Capel
Chris Capel wrote:
> Chris Capel wrote:
>
> > You'd figure CL would have something that does this,
> > but I couldn't find anything.
>
> Well, I guess it doesn't. How did something like this not make it
> into the standard? Did people just not use strings back then? Every
> other language I know of has something to do this, ...
What languages?
Aside from scripting languages that have some form of regular
expression parsing (eg, Perl, Python), I don't know of any [compiled]
language that can replace sections of a string of different lengths. It
is just easier in some languages to find the string you want replaced,
and easier to concatenate your new string with the previous characters
that you want to keep.
Jeff M.
--
http://www.retrobyte.org
··············@gmail.com
Chris Capel wrote:
> I had a need just now to replace one substring in my string with a
> different substring, of a different length
If you want something that's quick to just "drop in and use", might I
suggest using my regular expression package (url below). It allows very
quick and simple replacing of text in a string:
(re:replace-all "Some random string"
#/\s(\a+)$/
(format nil "~D" (length $1)))
==> "Some random 6"
--
http://www.retrobyte.org
··············@gmail.com
Chris Capel wrote:
> I had a need just now to replace one substring in my string with a different
> substring, of a different length (for quoting parameters to SQL queries,
> incidentally). It appears that accomplishing this isn't straightforward in
> CL. Below is what I came up with. Does anyone have a better way than this?
> You'd figure CL would have something that does this, but I couldn't find
> anything. Maybe it's time to read the standard cover-to-cover.
>
> Chris Capel
>
> (defun replace-sequence (array old new)
> (do ((new-array (make-array (length array)
> :adjustable t
> :element-type (array-element-type array)))
> (new-array-position 0 (+ new-array-position (length old)))
> (array-position (search old array)
> (search old array :start2 (1+ array-position)))
> (prev-array-position 0 array-position)
> (difference (if (> 0 #1=(- (length new) (length old))) nil #1#)))
> ((not array-position) (progn (replace new-array array
> :start1 new-array-position
> :start2 prev-array-position)
> new-array))
> (replace new-array array
> :start1 new-array-position
> :start2 prev-array-position :end2 array-position)
> (incf new-array-position (- array-position prev-array-position))
> (when difference
> (adjust-array new-array (+ (length new-array) difference)))
> (replace new-array new :start1 new-array-position)))
>
May I suggest that whichever implementation you choose, you follow the
convention of parameter order in SUBST and SUBSTITUTE? Namely,
(defun replace-sequence (new old seq) ...)
David Sletten