Hi all,
I am trying to convert an HTML page that includes accented characters
into Unicode. Is there a way to do this in Common Lisp?
Here is a small snippet:
<br>sat&eacute;
This snippet prints the word saté.
Thanks,
Deech
deech <············@gmail.com> writes:
> Hi all,
> I am trying to convert an HTML page that includes accented characters
> into Unicode. Is there a way to do this in Common Lisp?
>
> Here is a small snippet:
> <br>sat&eacute;
>
> This snippet prints the word saté.
>
> Thanks,
> Deech
CL-USER 1 >
#<The CL-WHO package, 58/128 internal, 28/64 external>
CL-WHO 2 > (with-html-output (*standard-output*)
             (esc "é"))
&#xE9;
"&#xE9;"
-jens
I'm looking to convert HTML to Unicode. CL-WHO seems to convert
Unicode to HTML.
Thanks for your quick response,
Deech
Jens Teich wrote:
> deech <············@gmail.com> writes:
>
> > Hi all,
> > I am trying to convert an HTML page that includes accented characters
> > into Unicode. Is there a way to do this in Common Lisp?
> >
> > Here is a small snippet:
> > <br>sat&eacute;
> >
> > This snippet prints the word saté.
> >
> > Thanks,
> > Deech
>
> CL-USER 1 >
> #<The CL-WHO package, 58/128 internal, 28/64 external>
>
> CL-WHO 2 > (with-html-output (*standard-output*)
>              (esc "é"))
> &#xE9;
> "&#xE9;"
>
> -jens
On Thu, 30 Oct 2008 12:33:51 -0700, deech wrote:
> Hi all,
> I am trying to convert an HTML page that includes accented characters into
> Unicode. Is there a way to do this in Common Lisp?
Yes. Unless you need to verify the correctness of the input or need some
output format other than HTML, a simple algorithm that replaces strings
using a table (e.g. "&eacute;" -> "é") should suffice.
Search for the terms "replace string" in the c.l.l archives (e.g. using
Google Groups).
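A minimal sketch of such a table-driven replacement, in portable Common
Lisp. The names *entity-table*, parse-whole-integer, entity-code and
decode-entities are hypothetical, the table is only a small illustrative
subset of the named entities, and this is not how the html-entities
library is implemented. Numeric references such as &#233; and &#xE9; are
handled as well, and anything unrecognized is copied through unchanged.

```lisp
;; Illustrative subset only; a real table would cover all named entities.
(defparameter *entity-table*
  '(("eacute" . 233)   ; e with acute accent
    ("egrave" . 232)   ; e with grave accent
    ("agrave" . 224)   ; a with grave accent
    ("uuml"   . 252)   ; u with diaeresis
    ("amp"    . 38)
    ("lt"     . 60)
    ("gt"     . 62)
    ("quot"   . 34))
  "Alist mapping entity names (case-sensitive) to character codes.")

(defun parse-whole-integer (string start radix)
  "Parse STRING from START as an integer in RADIX; NIL unless all of it parses."
  (multiple-value-bind (value end)
      (parse-integer string :start start :radix radix :junk-allowed t)
    (and value (= end (length string)) value)))

(defun entity-code (name)
  "Return the character code for NAME (the text between & and ;), or NIL."
  (cond ((and (> (length name) 2)
              (char= (char name 0) #\#)
              (char-equal (char name 1) #\x))
         (parse-whole-integer name 2 16))      ; hexadecimal, e.g. #xE9
        ((and (> (length name) 1)
              (char= (char name 0) #\#))
         (parse-whole-integer name 1 10))      ; decimal, e.g. #233
        (t (cdr (assoc name *entity-table* :test #'string=)))))

(defun decode-entities (string)
  "Return a copy of STRING with recognized entity references replaced."
  (with-output-to-string (out)
    (let ((pos 0)
          (len (length string)))
      (loop while (< pos len) do
        (let* ((char (char string pos))
               ;; Look for the closing semicolon only a short distance ahead.
               (semi (and (char= char #\&)
                          (position #\; string
                                    :start (1+ pos)
                                    :end (min len (+ pos 10)))))
               (code (and semi
                          (entity-code (subseq string (1+ pos) semi))))
               (decoded (and code
                             (< -1 code char-code-limit)
                             (code-char code))))
          (cond (decoded
                 (write-char decoded out)
                 (setf pos (1+ semi)))
                (t
                 ;; Not a recognized entity: copy the character as-is.
                 (write-char char out)
                 (incf pos))))))))

(decode-entities "<br>sat&eacute;")   ; => "<br>saté"
```

The table stores character codes rather than literal characters, so the
source file itself stays plain ASCII outside of comments and does not
depend on the encoding your Lisp uses when reading it.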
HTH,
Tamas
Running html-entities:decode HTML does the trick. Thanks!
-deech
On Oct 30, 4:19 pm, Tamas K Papp <······@gmail.com> wrote:
> On Thu, 30 Oct 2008 12:33:51 -0700, deech wrote:
> > Hi all,
> > I am trying to convert an HTML page that includes accented characters into
> > Unicode. Is there a way to do this in Common Lisp?
>
> Yes. Unless you need to verify the correctness of the input or need some
> output format other than HTML, a simple algorithm that replaces strings
> using a table (e.g. "&eacute;" -> "é") should suffice.
>
> Search for the terms "replace string" in the c.l.l archives (e.g. using
> Google Groups).
>
> HTH,
>
> Tamas
From: Harald Hanche-Olsen
Subject: Re: Convert html diacritics to unicode
Date:
Message-ID: <pcobpx1wz6m.fsf@math.ntnu.no>
+ deech <············@gmail.com>:
> I am trying to convert an HTML page that includes accented characters
> into Unicode. Is there a way to do this in Common Lisp?
>
> Here is a small snippet:
> <br>sat&eacute;
>
> This snippet prints the word saté.
Look at http://www.cliki.net/html-entities for example.
There may be more options at http://www.cliki.net/web
(where I found that one).
--
* Harald Hanche-Olsen <URL:http://www.math.ntnu.no/~hanche/>
- It is undesirable to believe a proposition
when there is no ground whatsoever for supposing it is true.
-- Bertrand Russell