From: Robert Uhl
Subject: URI Escape/Unescape Library?
Date:
Message-ID: <m38y224t1j.fsf@4dv.net>
I've been digging around on CLiki and Google trying to find a decent URI
escape/unescape library, and have so far been unsuccessful. Someone
_has_ to have handled this before.
Specifically, I'm looking for something to replace '+' or %20 with ' '
and so on.
--
Robert Uhl <http://public.xdi.org/=ruhl>
Si Kristo ay muling nabuhay! Buhay na tunay magpakailanman!
Robert Uhl <······@SPAM4dv.net> writes:
> I've been digging around on CLiki and Google trying to find a decent URI
> escape/unescape library, and have so far been unsuccessful. Someone
> _has_ to have handled this before.
>
> Specifically, I'm looking for something to replace '+' or %20 with ' '
> and so on.
Check my COM.INFORMATIMAGO.COMMON-LISP.HTML package.
It contains a couple of QUERY-* functions that may interest you.
http://www.informatimago.com/develop/lisp/index.html
--
__Pascal Bourguignon__ http://www.informatimago.com/
You never feed me.
Perhaps I'll sleep on your face.
That will sure show you.
On Wed, 25 May 2005 20:31:20 -0600, Robert Uhl <······@SPAM4dv.net> wrote:
> I've been digging around on CLiki and Google trying to find a decent
> URI escape/unescape library, and have so far been unsuccessful.
> Someone _has_ to have handled this before.
>
> Specifically, I'm looking for something to replace '+' or %20 with ' '
> and so on.
That's so trivial that it's probably not worth a separate library.
Most web-related libs will contain something like this - see for
example (shameless self-plug) TBNL.
Cheers,
Edi.
--
Lisp is not dead, it just smells funny.
Real email: (replace (subseq ·········@agharta.de" 5) "edi")
Robert Uhl <······@SPAM4dv.net> writes:
> I've been digging around on CLiki and Google trying to find a decent URI
> escape/unescape library, and have so far been unsuccessful. Someone
> _has_ to have handled this before.
>
> Specifically, I'm looking for something to replace '+' or %20 with ' '
> and so on.
(asdf-install:install :araneida)
(araneida:urlstring-escape "foo bar spaß") => "foo%20bar%20spa%DF"
(araneida:urlstring-unescape "foo+bar+spa%DF") => "foo bar spaß"
--
/|_ .-----------------------.
,' .\ / | Free Mumia Abu-Jamal! |
,--' _,' | Abolish the racist |
/ / | death penalty! |
( -. | `-----------------------'
| ) |
(`-. '--.)
`. )----'
"Robert Uhl" <······@SPAM4dv.net> wrote in message
···················@4dv.net...
> I've been digging around on CLiki and Google trying to find a decent URI
> escape/unescape library, and have so far been unsuccessful. Someone
> _has_ to have handled this before.
>
> Specifically, I'm looking for something to replace '+' or %20 with ' '
> and so on.
> ...
I do not encode these things because the browser (I think :)) does this for
me, but of course I decode it with the following code snippet:
(defun uncode (text)
  "decode posted values: + sign is replaced by space and %nn by its ASCII
char"
  (let ((ret "")
        (len (length text)))
    (loop with i = 0
          with c do
            (when (>= i len) (return))
            (setf c (elt text i))
            (cond
              ((equal c #\+) (setf c #\Space))
              ((equal c #\%)
               (setf c (code-char (read-from-string (concatenate 'string "#x"
                         (subseq text (1+ i) (+ i 3))))))
               (incf i 2)))
            (setf ret (concatenate 'string ret (string c)))
            (incf i))
    ret))
Andreas
"Andreas Thiele" <······@nospam.com> writes:
> (defun uncode (text)
> "decode posted values: + sign is replaced by space and %nn by its ASCII
> char"
> (let ((ret "")
> (len (length text)))
> [...]
> (setf ret (concatenate 'string ret (string c)))
> (incf i))
> ret))
This kind of thing is often done using with-output-to-string, or
possibly by vector-push-extending an (adjustable) string with a fill
pointer.
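As a rough sketch of those two approaches (the function names are made up for illustration and do not come from any library mentioned in this thread), the character-by-character accumulation might look like this:

```lisp
;; Illustrative only: COLLECT-WITH-STREAM and COLLECT-WITH-VECTOR are
;; hypothetical names. Both copy TEXT, turning + into a space.

(defun collect-with-stream (text)
  "Accumulate the result by writing to a string output stream."
  (with-output-to-string (out)
    (loop for c across text
          do (write-char (if (char= c #\+) #\Space c) out))))

(defun collect-with-vector (text)
  "Accumulate the result in an adjustable string with a fill pointer."
  (let ((ret (make-array (length text)
                         :element-type 'character
                         :adjustable t
                         :fill-pointer 0)))
    (loop for c across text
          do (vector-push-extend (if (char= c #\+) #\Space c) ret))
    ret))
```

Either way, the result is built in place instead of calling CONCATENATE once per character, which allocates a fresh string and copies everything accumulated so far each time.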
"Lars Brinkhoff" <·········@nocrew.org> wrote in message
···················@junk.nocrew.org...
> "Andreas Thiele" <······@nospam.com> writes:
> > (defun uncode (text)
> > "decode posted values: + sign is replaced by space and %nn by its ASCII
> > char"
> > (let ((ret "")
> > (len (length text)))
> > [...]
> > (setf ret (concatenate 'string ret (string c)))
> > (incf i))
> > ret))
>
> This kind of thing is often done using with-output-to-string, or
> possibly by vector-push-extending an (adjustable) string with a fill
> pointer.
Thanks for the hint. What are the advantages? Are those faster? Which of
those will be the fastest?
Andreas
"Andreas Thiele" <······@nospam.com> writes:
> "Lars Brinkhoff" <·········@nocrew.org> wrote in message
> ···················@junk.nocrew.org...
>> "Andreas Thiele" <······@nospam.com> writes:
>> > (defun uncode (text)
>> > "decode posted values: + sign is replaced by space and %nn by its ASCII
>> > char"
>> > (let ((ret "")
>> > (len (length text)))
>> > [...]
>> > (setf ret (concatenate 'string ret (string c)))
>> > (incf i))
>> > ret))
>> This kind of thing is often done using with-output-to-string, or
>> possibly by vector-push(-extending) an (adjustable) string with a
>> fill pointer.
> Thanks for the hint. What are the advantages? Are those faster? Which of
> those will be the fastest?
They are probably faster, but if speed is very important to you, you
should profile your code.
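For a first impression before reaching for a real profiler, a crude timing comparison can be put together from standard functions. This is only a sketch: the helper names and the input size are arbitrary assumptions, and the numbers will vary by implementation.

```lisp
;; Hypothetical helpers for a rough comparison; not a substitute for a profiler.

(defun copy-by-concatenate (text)
  "Copy TEXT one character at a time with repeated CONCATENATE."
  (let ((ret ""))
    (loop for c across text
          do (setf ret (concatenate 'string ret (string c))))
    ret))

(defun copy-by-stream (text)
  "Copy TEXT one character at a time through a string output stream."
  (with-output-to-string (out)
    (loop for c across text do (write-char c out))))

(defun elapsed-ticks (fn arg)
  "Call FN on ARG and return the elapsed internal real time units."
  (let ((start (get-internal-real-time)))
    (funcall fn arg)
    (- (get-internal-real-time) start)))

;; Arbitrary input size; divide by INTERNAL-TIME-UNITS-PER-SECOND for seconds.
(let ((input (make-string 5000 :initial-element #\a)))
  (format t "concatenate:   ~D ticks~%"
          (elapsed-ticks #'copy-by-concatenate input))
  (format t "string stream: ~D ticks~%"
          (elapsed-ticks #'copy-by-stream input)))
```

The standard TIME macro gives similar information interactively, for example (time (copy-by-stream input)).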
From: Robert Uhl
Subject: Re: URI Escape/Unescape Library?
Date:
Message-ID: <m3is102xqb.fsf@4dv.net>
Lars Brinkhoff <·········@nocrew.org> writes:
>
> > (defun uncode (text)
> > "decode posted values: + sign is replaced by space and %nn by its ASCII
> > char"
> > (let ((ret "")
> > (len (length text)))
> > [...]
> > (setf ret (concatenate 'string ret (string c)))
> > (incf i))
> > ret))
>
> This kind of thing is often done using with-output-to-string, or
> possibly by vector-push-extending an (adjustable) string with a fill
> pointer.
So something like this, then? I also replaced the loop and length with
DOTIMES. The behaviour appears to be the same.
(defun unencode (text)
  "decode URL-escaped values: + is replaced with space; %n with the
appropriate ASCII char"
  (with-output-to-string (s)
    (dotimes (i (length text))
      (let ((c (elt text i)))
        (cond ((equal c #\+) (setf c #\Space))
              ((equal c #\%)
               (setf c (code-char
                        (read-from-string
                         (concatenate 'string "#x"
                                      (subseq text (1+ i) (+ i 3))))))
               (incf i 2)))
        (write-char c s)))
    s))
Thanks Andreas for the original snippet.
--
Robert Uhl <http://public.xdi.org/=ruhl>
Christos ecrorexti! Were ecrorexti!
Robert Uhl <······@SPAM4dv.net> writes:
> Lars Brinkhoff <·········@nocrew.org> writes:
>> This kind of thing is often done using with-output-to-string
> So something like this, then?
Yes, but...
> I also replaced the loop and length with DOTIMES. The behaviour
> appears to be the same.
...I believe you shouldn't rely on being able to modify the loop
variable (i below). CLHS says:
It is implementation-dependent whether dotimes establishes a new
binding of var on each iteration or whether it establishes a
binding for var once at the beginning and then assigns it on any
subsequent iterations.
You could use with-input-from-string to read characters from text.
> (defun unencode (text)
> "decode URL-escaped values: + is replaced with space; %n with the
> appropriate ASCII char"
> (with-output-to-string (s)
> (dotimes (i (length text))
> (let ((c (elt text i)))
> (cond ((equal c #\+) (setf c #\Space))
> ((equal c #\%)
> (setf c (code-char
> (read-from-string
> (concatenate 'string "#x"
> (subseq text (1+ i) (+ i 3))))))
> (incf i 2)))
> (write-char c s)))
> s))
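The with-input-from-string suggestion could look roughly like the following. It is a sketch under assumptions: UNENCODE-FROM-STREAM is a made-up name, and PARSE-INTEGER is swapped in for the READ-FROM-STRING call in the quoted code.

```lisp
;; A sketch only; UNENCODE-FROM-STREAM is a hypothetical name.
(defun unencode-from-stream (text)
  "Decode TEXT by reading characters from it: + becomes space, %nn a char."
  (with-output-to-string (out)
    (with-input-from-string (in text)
      (loop for c = (read-char in nil nil)   ; NIL at end of input
            while c
            do (write-char
                (case c
                  (#\+ #\Space)
                  (#\% (let ((hex (make-string 2)))
                         ;; READ-CHAR signals END-OF-FILE on a truncated escape.
                         (setf (char hex 0) (read-char in)
                               (char hex 1) (read-char in))
                         (code-char (parse-integer hex :radix 16))))
                  (t c))
                out)))))
```

Because the stream keeps track of the position, there is no loop variable to modify, so the DOTIMES question does not arise.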
From: Robert Uhl
Subject: Re: URI Escape/Unescape Library?
Date:
Message-ID: <m31x7n2k9s.fsf@4dv.net>
Lars Brinkhoff <·········@nocrew.org> writes:
>
> > I also replaced the loop and length with DOTIMES. The behaviour
> > appears to be the same.
>
> ...I believe you shouldn't rely on being able to modify the loop
> variable (i below). CLHS says:
>
> It is implementation-dependent whether dotimes establishes a new
> binding of var on each iteration or whether it establishes a
> binding for var once at the beginning and then assigns it on any
> subsequent iterations.
Doh! That's what I get for just writing a bit of test code instead of
reading the spec. Gotta get out of that habit.
> You could use with-input-from-string to read characters from text.
Nah--it doesn't really make sense conceptually to me to be reading
characters instead of iterating over the string. It's probably not all
that efficient, either, not that _that_ really matters. Here's the
(probable) final version.
(defun unencode (text)
  "decode URL-escaped values: + is replaced with space; %nn with the
appropriate ASCII char"
  (declare (string text))
  (with-output-to-string (output)
    (let ((len (length text)))
      (loop with i = 0
            with c do
              (when (>= i len) (return))
              (setf c (elt text i))
              (cond ((equal c #\+) (setf c #\Space))
                    ((equal c #\%)
                     (setf c (code-char
                              (read-from-string
                               (concatenate 'string "#x"
                                            (subseq text (1+ i) (+ i 3))))))
                     (incf i 2)))
              (write-char c output)
              (incf i)))   ; advance to the next character
    output))
--
Robert Uhl <http://public.xdi.org/=ruhl>
Kristus is riesen! Weerelk, Hi is riesen!
On 9126 day of my life Robert Uhl wrote:
> So something like this, then? I also replaced the loop and length with
> DOTIMES. The behaviour appears to be the same.
>
> (defun unencode (text)
> "decode URL-escaped values: + is replaced with space; %n with the
> appropriate ASCII char"
> (with-output-to-string (s)
> (dotimes (i (length text))
> (let ((c (elt text i)))
> (cond ((equal c #\+) (setf c #\Space))
> ((equal c #\%)
> (setf c (code-char
> (read-from-string
> (concatenate 'string "#x"
> (subseq text (1+ i) (+ i 3))))))
> (incf i 2)))
> (write-char c s)))
> s))
Your version is not safe. What if TEXT is "bla-bla-bla%3"?
--
Ivan Boldyrev
XML -- new language of ML family.
From: Robert Uhl
Subject: Re: URI Escape/Unescape Library?
Date:
Message-ID: <m364wz2kfp.fsf@4dv.net>
Ivan Boldyrev <···············@cgitftp.uiggm.nsc.ru> writes:
>
> Your version is not safe. What if TEXT is "bla-bla-bla%3"?
I believe that's an invalid encoding, and thus it's appropriate for
there to be an error. I may eventually get around to throwing an
appropriate one, but for now I'll let the bad access do the trick.
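For reference, signalling a dedicated error could be sketched as below. All names here (INVALID-ESCAPE, HEX-PAIR-VALUE, UNENCODE-STRICT) are hypothetical, and DIGIT-CHAR-P is used in place of READ-FROM-STRING so that the Lisp reader is never run on request data.

```lisp
;; One possible shape, not code from any library named in this thread.
(define-condition invalid-escape (error)
  ((text :initarg :text :reader invalid-escape-text)
   (index :initarg :index :reader invalid-escape-index))
  (:report (lambda (condition stream)
             (format stream "Invalid %-escape at index ~D in ~S"
                     (invalid-escape-index condition)
                     (invalid-escape-text condition)))))

(defun hex-pair-value (text start)
  "Return the value of the two hex digits at START in TEXT, or NIL."
  (when (<= (+ start 2) (length text))
    (let ((hi (digit-char-p (char text start) 16))
          (lo (digit-char-p (char text (1+ start)) 16)))
      (and hi lo (+ (* 16 hi) lo)))))

(defun unencode-strict (text)
  "Decode TEXT; signal INVALID-ESCAPE on a truncated or non-hex %nn."
  (with-output-to-string (out)
    (let ((i 0)
          (len (length text)))
      (loop while (< i len)
            do (let ((c (char text i)))
                 (case c
                   (#\+ (write-char #\Space out))
                   (#\% (let ((code (hex-pair-value text (1+ i))))
                          (unless code
                            (error 'invalid-escape :text text :index i))
                          (write-char (code-char code) out)
                          (incf i 2)))   ; skip the two hex digits
                   (t (write-char c out)))
                 (incf i))))))
```

With this, "bla-bla-bla%3" signals INVALID-ESCAPE pointing at the % instead of failing with a bounds error inside SUBSEQ, and a caller can handle it with HANDLER-CASE.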
--
Robert Uhl <http://public.xdi.org/=ruhl>
Let the heavens be glad as is meet, and let the earth rejoice; and let
the whole world, visible and invisible, keep festival, for Christ, the
eternal gladness, is risen.
Robert Uhl <······@SPAM4dv.net> wrote:
> I've been digging around on CLiki and Google trying to find a decent URI
> escape/unescape library, and have so far been unsuccessful. Someone
> _has_ to have handled this before.
>
> Specifically, I'm looking for something to replace '+' or %20 with ' '
> and so on.
I doubt you are going to find a specific library devoted to this, since
it is rather simple and probably part of any library dealing with
HTTP/HTML.
My lisp-cgi-utils package[0] contains the functions url-encode-string
and url-decode-string to deal with this.
Regards,
Alex.
[0] http://www.thangorodrim.de/software/lisp-cgi-utils/index.html
--
"Opportunity is missed by most people because it is dressed in overalls and
looks like work." -- Thomas A. Edison