From: Robert Uhl
Subject: URI Escape/Unescape Library?
Date: 
Message-ID: <m38y224t1j.fsf@4dv.net>
I've been digging around on CLiki and Google trying to find a decent URI
escape/unescape library, and have so far been unsuccessful.  Someone
_has_ to have handled this before.

Specifically, I'm looking for something to replace '+' or %20 with ' '
and so on.

-- 
Robert Uhl <http://public.xdi.org/=ruhl>
Si Kristo ay muling nabuhay!  Buhay na tunay magpakailanman!

From: Pascal Bourguignon
Subject: Re: URI Escape/Unescape Library?
Date: 
Message-ID: <87ll62ipse.fsf@thalassa.informatimago.com>
Robert Uhl <······@SPAM4dv.net> writes:

> I've been digging around on CLiki and Google trying to find a decent URI
> escape/unescape library, and have so far been unsuccessful.  Someone
> _has_ to have handled this before.
>
> Specifically, I'm looking for something to replace '+' or %20 with ' '
> and so on.

Check my COM.INFORMATIMAGO.COMMON-LISP.HTML package.
It contains a couple of QUERY-* functions that may interest you.
http://www.informatimago.com/develop/lisp/index.html

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
You never feed me.
Perhaps I'll sleep on your face.
That will sure show you.
From: Edi Weitz
Subject: Re: URI Escape/Unescape Library?
Date: 
Message-ID: <uy8a2s7dl.fsf@agharta.de>
On Wed, 25 May 2005 20:31:20 -0600, Robert Uhl <······@SPAM4dv.net> wrote:

> I've been digging around on CLiki and Google trying to find a decent
> URI escape/unescape library, and have so far been unsuccessful.
> Someone _has_ to have handled this before.
>
> Specifically, I'm looking for something to replace '+' or %20 with '
> ' and so on.

That's so trivial that it's probably not worth a separate library.
Most web-related libs will contain something like this - see for
example (shameless self-plug) TBNL.

Cheers,
Edi.

-- 

Lisp is not dead, it just smells funny.

Real email: (replace (subseq ·········@agharta.de" 5) "edi")
From: Thomas F. Burdick
Subject: Re: URI Escape/Unescape Library?
Date: 
Message-ID: <xcvzmuh6fex.fsf@conquest.OCF.Berkeley.EDU>
Robert Uhl <······@SPAM4dv.net> writes:

> I've been digging around on CLiki and Google trying to find a decent URI
> escape/unescape library, and have so far been unsuccessful.  Someone
> _has_ to have handled this before.
> 
> Specifically, I'm looking for something to replace '+' or %20 with ' '
> and so on.

(asdf-install:install :araneida)
(araneida:urlstring-escape "foo bar spaß") => "foo%20bar%20spa%DF"
(araneida:urlstring-unescape "foo+bar+spa%DF") => "foo bar spaß"

-- 
           /|_     .-----------------------.                        
         ,'  .\  / | Free Mumia Abu-Jamal! |
     ,--'    _,'   | Abolish the racist    |
    /       /      | death penalty!        |
   (   -.  |       `-----------------------'
   |     ) |                               
  (`-.  '--.)                              
   `. )----'                               
From: Andreas Thiele
Subject: Re: URI Escape/Unescape Library?
Date: 
Message-ID: <d7469g$n02$01$1@news.t-online.com>
"Robert Uhl" <······@SPAM4dv.net> wrote in message
···················@4dv.net...
> I've been digging around on CLiki and Google trying to find a decent URI
> escape/unescape library, and have so far been unsuccessful.  Someone
> _has_ to have handled this before.
>
> Specifically, I'm looking for something to replace '+' or %20 with ' '
> and so on.
> ...

I do not encode these things because the browser (I think :)) does this for
me, but of course I decode it with the following code snippet:

(defun uncode (text)
  "decode posted values: + sign is replaced by space and %nn by its ASCII
char"
  (let ((ret "")
        (len (length text)))
    (loop with i = 0
          with c do
      (when (>= i len) (return))
      (setf c (elt text i))
      (cond
        ((equal c #\+) (setf c #\Space))
        ((equal c #\%)
         (setf c (code-char (read-from-string
                             (concatenate 'string "#x"
                                          (subseq text (1+ i) (+ i 3))))))
         (incf i 2)))
      (setf ret (concatenate 'string ret (string c)))
      (incf i))
    ret))
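
For the encoding direction, which the snippet above leaves to the browser, a
minimal sketch could look like the following. The name URL-ENCODE is
hypothetical (it is not taken from any library mentioned in this thread), and
the code assumes ASCII-compatible character codes and input characters that
fit in a single octet.

```lisp
;; Hypothetical helper for illustration; not part of any library named here.
(defun url-encode (text)
  "Encode TEXT for a query string: ASCII letters, digits and -_.~ pass
through, space becomes +, any other character below code 256 becomes %NN."
  (with-output-to-string (out)
    (loop for c across text
          for code = (char-code c)
          do (cond ((and (< code 128)
                         (or (alphanumericp c) (find c "-_.~")))
                    (write-char c out))
                   ((char= c #\Space)
                    (write-char #\+ out))
                   ((< code 256)
                    ;; Two hex digits, zero padded.
                    (format out "%~2,'0X" code))
                   (t
                    ;; Assumption: wider characters are out of scope here.
                    (error "Character ~S does not fit in one octet." c))))))
```

Something like (url-encode "foo bar") should then round-trip through the
decoder above.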

Andreas
From: Lars Brinkhoff
Subject: Re: URI Escape/Unescape Library?
Date: 
Message-ID: <85psvddso3.fsf@junk.nocrew.org>
"Andreas Thiele" <······@nospam.com> writes:
> (defun uncode (text)
>   "decode posted values: + sign is replaced by space and %nn by its ASCII
> char"
>   (let ((ret "")
>         (len (length text)))
>       [...]
>       (setf ret (concatenate 'string ret (string c)))
>       (incf i))
>     ret))

This kind of thing is often done using with-output-to-string, or
possibly by vector-push-extending an (adjustable) string with a fill
pointer.
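
As a rough sketch of the two approaches, here is the simplest part of the job
(turning + into a space) written both ways. Both function names are
hypothetical, made up for this illustration.

```lisp
;; Both names are hypothetical, chosen only for this illustration.
(defun plus-to-space/stream (text)
  "Replace + with space, collecting the output through a string stream."
  (with-output-to-string (out)
    (loop for c across text
          do (write-char (if (char= c #\+) #\Space c) out))))

(defun plus-to-space/vector (text)
  "Same result, built by pushing onto an adjustable string with a fill pointer."
  (let ((out (make-array (length text)
                         :element-type 'character
                         :adjustable t
                         :fill-pointer 0)))
    (loop for c across text
          do (vector-push-extend (if (char= c #\+) #\Space c) out))
    out))
```

Either way the result is built in one pass, instead of allocating a fresh
string for every character as repeated CONCATENATE does.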
From: Andreas Thiele
Subject: Re: URI Escape/Unescape Library?
Date: 
Message-ID: <d7c02p$pg1$02$1@news.t-online.com>
"Lars Brinkhoff" <·········@nocrew.org> wrote in message
···················@junk.nocrew.org...
> "Andreas Thiele" <······@nospam.com> writes:
> > (defun uncode (text)
> >   "decode posted values: + sign is replaced by space and %nn by its ASCII
> > char"
> >   (let ((ret "")
> >         (len (length text)))
> >       [...]
> >       (setf ret (concatenate 'string ret (string c)))
> >       (incf i))
> >     ret))
>
> This kind of thing is often done using with-output-to-string, or
> possibly by vector-push-extending an (adjustable) string with a fill
> pointer.

Thanks for the hint. What are the advantages? Are those faster? Which of
those will be the fastest?

Andreas
From: Lars Brinkhoff
Subject: Re: URI Escape/Unescape Library?
Date: 
Message-ID: <85zmudask2.fsf@junk.nocrew.org>
"Andreas Thiele" <······@nospam.com> writes:
> "Lars Brinkhoff" <·········@nocrew.org> wrote in message
> ···················@junk.nocrew.org...
>> "Andreas Thiele" <······@nospam.com> writes:
>> > (defun uncode (text)
>> >   "decode posted values: + sign is replaced by space and %nn by its ASCII
>> > char"
>> >   (let ((ret "")
>> >         (len (length text)))
>> >       [...]
>> >       (setf ret (concatenate 'string ret (string c)))
>> >       (incf i))
>> >     ret))
>> This kind of thing is often done using with-output-to-string, or
>> possibly by vector-push(-extending) an (adjustable) string with a
>> fill pointer.
> Thanks for the hint. What are the advantages? Are those faster? Which of
> those will be the fastest?

They are probably faster, but if speed is very important to you, you
should profile your code.
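
One quick way to check, short of a real profiler, is the standard TIME macro.
The two functions below are hypothetical stand-ins that only isolate the
string-building strategy; actual numbers depend entirely on the
implementation and the input size.

```lisp
;; Hypothetical stand-ins that isolate the two string-building strategies.
(defun build-by-concatenate (n)
  "Build a string of N stars with repeated CONCATENATE, as in the snippet."
  (let ((ret ""))
    (dotimes (i n ret)
      (declare (ignorable i))
      (setf ret (concatenate 'string ret "*")))))

(defun build-by-stream (n)
  "Build the same string through WITH-OUTPUT-TO-STRING."
  (with-output-to-string (out)
    (dotimes (i n)
      (declare (ignorable i))
      (write-char #\* out))))

;; At the REPL, compare for example:
;;   (time (build-by-concatenate 20000))
;;   (time (build-by-stream 20000))
```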
From: Robert Uhl
Subject: Re: URI Escape/Unescape Library?
Date: 
Message-ID: <m3is102xqb.fsf@4dv.net>
Lars Brinkhoff <·········@nocrew.org> writes:
>
> > (defun uncode (text)
> >   "decode posted values: + sign is replaced by space and %nn by its ASCII
> > char"
> >   (let ((ret "")
> >         (len (length text)))
> >       [...]
> >       (setf ret (concatenate 'string ret (string c)))
> >       (incf i))
> >     ret))
>
> This kind of thing is often done using with-output-to-string, or
> possibly by vector-push-extending an (adjustable) string with a fill
> pointer.

So something like this, then?  I also replaced the loop and length with
DOTIMES.  The behaviour appears to be the same.

(defun unencode (text)
  "decode URL-escaped values: + is replaced with space; %n with the
appropriate ASCII char"
  (with-output-to-string (s)
    (dotimes (i (length text))
      (let ((c (elt text i)))
	(cond ((equal c #\+)  (setf c #\Space))
	      ((equal c #\%)
	       (setf c (code-char
			(read-from-string
			 (concatenate 'string "#x"
				      (subseq text (1+ i) (+ i 3))))))
	       (incf i 2)))
	(write-char c s)))
    s))

Thanks Andreas for the original snippet.

-- 
Robert Uhl <http://public.xdi.org/=ruhl>
Christos ecrorexti!  Were ecrorexti!
From: Lars Brinkhoff
Subject: Re: URI Escape/Unescape Library?
Date: 
Message-ID: <85fyw3bwrg.fsf@junk.nocrew.org>
Robert Uhl <······@SPAM4dv.net> writes:
> Lars Brinkhoff <·········@nocrew.org> writes:
>> This kind of thing is often done using with-output-to-string
> So something like this, then?

Yes, but...

> I also replaced the loop and length with DOTIMES.  The behaviour
> appears to be the same.

...I believe you shouldn't rely on being able to modify the loop
variable (i below).  CLHS says:

    It is implementation-dependent whether dotimes establishes a new
    binding of var on each iteration or whether it establishes a
    binding for var once at the beginning and then assigns it on any
    subsequent iterations.

You could use with-input-from-string to read characters from text.
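
A minimal sketch of that idea follows. The name UNENCODE-FROM-STREAM is
hypothetical, and the sketch swaps READ-FROM-STRING for PARSE-INTEGER with
:RADIX 16, which is a change from the code quoted below. It assumes
ASCII-compatible character codes.

```lisp
;; Hypothetical name; a sketch, not a drop-in replacement for UNENCODE.
(defun unencode-from-stream (text)
  "Decode + and %NN by reading TEXT one character at a time."
  (with-input-from-string (in text)
    (with-output-to-string (out)
      (loop for c = (read-char in nil nil)
            while c
            do (write-char
                (case c
                  (#\+ #\Space)
                  (#\% (let ((hex (make-string 2)))
                         ;; READ-SEQUENCE returns the index of the first
                         ;; element it did not fill; less than 2 means the
                         ;; escape was cut short.
                         (unless (= 2 (read-sequence hex in))
                           (error "Truncated escape in ~S" text))
                         (code-char (parse-integer hex :radix 16))))
                  (t c))
                out)))))
```

Since the stream keeps track of the position, there is no index variable left
to modify.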

> (defun unencode (text)
>   "decode URL-escaped values: + is replaced with space; %n with the
> appropriate ASCII char"
>   (with-output-to-string (s)
>     (dotimes (i (length text))
>       (let ((c (elt text i)))
> 	(cond ((equal c #\+)  (setf c #\Space))
> 	      ((equal c #\%)
> 	       (setf c (code-char
> 			(read-from-string
> 			 (concatenate 'string "#x"
> 				      (subseq text (1+ i) (+ i 3))))))
> 	       (incf i 2)))
> 	(write-char c s)))
>     s))
From: Robert Uhl
Subject: Re: URI Escape/Unescape Library?
Date: 
Message-ID: <m31x7n2k9s.fsf@4dv.net>
Lars Brinkhoff <·········@nocrew.org> writes:
>
> > I also replaced the loop and length with DOTIMES.  The behaviour
> > appears to be the same.
>
> ...I believe you shouldn't rely on being able to modify the loop
> variable (i below).  CLHS says:
>
>     It is implementation-dependent whether dotimes establishes a new
>     binding of var on each iteration or whether it establishes a
>     binding for var once at the beginning and then assigns it on any
>     subsequent iterations.

Doh!  That's what I get for just writing a bit of test code instead of
reading the spec.  Gotta get out of that habit.

> You could use with-input-from-string to read characters from text.

Nah--it doesn't really make sense conceptually to me to be reading
characters instead of iterating over the string.  It's probably not all
that efficient, either, not that _that_ really matters.  Here's the
(probable) final version.

(defun unencode (text)
  "decode URL-escaped values: + is replaced with space; %nn with the
appropriate ASCII char"
  (declare (string text))
  (with-output-to-string (output)
    (let ((len (length text)))
      (loop with i = 0
	    with c do
	    (when (>= i len) (return))
	    (setf c (elt text i))
	    (cond ((equal c #\+)  (setf c #\Space))
		  ((equal c #\%)
		   (setf c (code-char
			    (read-from-string
			     (concatenate 'string "#x"
					  (subseq text (1+ i) (+ i 3))))))
		   (incf i 2)))
	    (write-char c output)
	    (incf i)))
    output))

-- 
Robert Uhl <http://public.xdi.org/=ruhl>
Kristus is riesen!  Weerelk, Hi is riesen!
From: Ivan Boldyrev
Subject: Re: URI Escape/Unescape Library?
Date: 
Message-ID: <8grtm2-kbq.ln1@ibhome.cgitftp.uiggm.nsc.ru>
On 9126 day of my life Robert Uhl wrote:
> So something like this, then?  I also replaced the loop and length with
> DOTIMES.  The behaviour appears to be the same.
>
> (defun unencode (text)
>   "decode URL-escaped values: + is replaced with space; %n with the
> appropriate ASCII char"
>   (with-output-to-string (s)
>     (dotimes (i (length text))
>       (let ((c (elt text i)))
> 	(cond ((equal c #\+)  (setf c #\Space))
> 	      ((equal c #\%)
> 	       (setf c (code-char
> 			(read-from-string
> 			 (concatenate 'string "#x"
> 				      (subseq text (1+ i) (+ i 3))))))
> 	       (incf i 2)))
> 	(write-char c s)))
>     s))

Your version is not safe.  What if TEXT is "bla-bla-bla%3"?

-- 
Ivan Boldyrev

                                       XML -- new language of ML family.
From: Robert Uhl
Subject: Re: URI Escape/Unescape Library?
Date: 
Message-ID: <m364wz2kfp.fsf@4dv.net>
Ivan Boldyrev <···············@cgitftp.uiggm.nsc.ru> writes:
>
> Your version is not safe.  What if TEXT is "bla-bla-bla%3"?

I believe that's an invalid encoding, and thus it's appropriate for
there to be an error.  I may eventually get around to throwing an
appropriate one, but for now I'll let the bad access do the trick.
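
For what it's worth, signalling a dedicated error might look roughly like
this. The condition MALFORMED-ESCAPE, its readers, and the function name
UNENCODE-CHECKED are all hypothetical, and the sketch validates the hex
digits with DIGIT-CHAR-P rather than going through the reader. It assumes
ASCII-compatible character codes.

```lisp
;; Hypothetical condition and function names, for illustration only.
(define-condition malformed-escape (error)
  ((text  :initarg :text  :reader malformed-escape-text)
   (index :initarg :index :reader malformed-escape-index))
  (:report (lambda (condition stream)
             (format stream "Malformed escape at index ~D of ~S."
                     (malformed-escape-index condition)
                     (malformed-escape-text condition)))))

(defun unencode-checked (text)
  "Decode + and %NN; signal MALFORMED-ESCAPE on a truncated or non-hex escape."
  (with-output-to-string (out)
    (let ((len (length text))
          (i 0))
      (loop while (< i len)
            do (let ((c (char text i)))
                 (cond ((char= c #\+) (write-char #\Space out))
                       ((char= c #\%)
                        ;; HI and LO are digit weights, or NIL when the
                        ;; escape is cut short or is not hexadecimal.
                        (let* ((hi (and (< (+ i 2) len)
                                        (digit-char-p (char text (+ i 1)) 16)))
                               (lo (and hi
                                        (digit-char-p (char text (+ i 2)) 16))))
                          (unless lo
                            (error 'malformed-escape :text text :index i))
                          (write-char (code-char (+ (* 16 hi) lo)) out)
                          (incf i 2)))
                       (t (write-char c out)))
                 (incf i))))))
```

A caller can then wrap the call in HANDLER-CASE and decide whether to reject
the request or fall back to the raw text.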

-- 
Robert Uhl <http://public.xdi.org/=ruhl>
Let the heavens be glad as is meet, and let the earth rejoice; and let
the whole world, visible and invisible, keep festival, for Christ, the
eternal gladness, is risen.
From: Alexander Schreiber
Subject: Re: URI Escape/Unescape Library?
Date: 
Message-ID: <slrnd9brc9.er1.als@mordor.angband.thangorodrim.de>
Robert Uhl <······@SPAM4dv.net> wrote:
> I've been digging around on CLiki and Google trying to find a decent URI
> escape/unescape library, and have so far been unsuccessful.  Someone
> _has_ to have handled this before.
>
> Specifically, I'm looking for something to replace '+' or %20 with ' '
> and so on.

I doubt you are going to find a specific library devoted to this, since
it is rather simple and probably part of any library dealing with
HTTP/HTML.

My lisp-cgi-utils package[0] contains the functions url-encode-string
and url-decode-string to deal with this.

Regards,
      Alex.
[0] http://www.thangorodrim.de/software/lisp-cgi-utils/index.html
-- 
"Opportunity is missed by most people because it is dressed in overalls and
 looks like work."                                      -- Thomas A. Edison