Hola!
I am playing with a small web application of mine. I got SBCL + CLSQL +
TBNL + MySQL + Apache + mod_lisp working, so I was actually quite
happy. Until I thought I should test with Unicode data in the database.
First I needed to upgrade to MySQL 4.1, and after an hour or so it
worked: I can properly see Unicode characters both in Emacs (SLIME)
and gnome-terminal by setting the encoding to utf-8 in all places.
However, SBCL reports the following error as soon as it gets Unicode
data in a CLSQL-call. Before I did the call, I told MySQL to be nice,
from inside SBCL:
* (clsql:execute-command "SET NAMES utf8")
* (clsql:execute-command "SET CHARACTER SET utf8")
And here is the query that fails (other queries work at this point).
Btw, the data is the word "правда" (pravda = truth), which is one
of the few words I can type in cyrillic:
* (clsql:query "SELECT t.name FROM tag_versions t")
debugger invoked on a SB-INT:STREAM-ENCODING-ERROR in thread
#<THREAD "initial thread" {B2AA0D1}>:
encoding error on stream #<SB-SYS:FD-STREAM for "standard output"
{B2AA339}>
(:EXTERNAL-FORMAT :LATIN-1):
the character with code 1087 cannot be encoded.
Type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL.
restarts (invokable by number or by possibly-abbreviated name):
0: [OUTPUT-NOTHING] Skip output of this character.
1: [ABORT ] Exit debugger, returning to top level.
(SB-INT:STREAM-ENCODING-ERROR
#<SB-SYS:FD-STREAM for "standard output" {B2AA339}>
1087)
0] 1
Now, the error is quite "good" in my opinion; I can partly understand
what is going wrong (it seems to be trying to encode a character code as
high as 1087 as latin-1, which will never work).
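To make that concrete, here is a minimal sketch in plain Common Lisp. It assumes a Unicode-enabled Lisp in which code-char accepts 1087; the character name is implementation-dependent, so no particular spelling is promised here.

```lisp
;; 1087 is the character code from the error message above.
(let ((ch (code-char 1087)))
  (list (char-code ch)            ; 1087 again
        (< (char-code ch) 256)    ; NIL: Latin-1 only covers codes 0..255
        (char-name ch)))          ; implementation-dependent name of the character
```

Since Latin-1 is a one-byte encoding covering exactly the codes 0 to 255, any stream whose external format is :LATIN-1 has no way to write this character.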
But I have a hard time figuring out where to begin debugging this. I
got all the above components (uffi and all too) installed by reading a
lot of blog posts etc and it is probably a miracle that it works in the
first place :), and my Lisp experience is mostly with Emacs Lisp, so...
I scanned the CLSQL package for ":external-format" and only found it in
one place.
Any pointers on where to look further or what to tweak? Assuming I get
this step to work I want, of course, to have TBNL let the data through
aaaaall the way to the browser, and I don't expect that to happen by
itself either.
Okay, some version information might be in order:
SBCL 0.9.13
MySQL 4.1
My asdf:registry says the rest:
(setf asdf:*central-registry*
'(*default-pathname-defaults*
(concatenate 'string *lisp-dirs* "cl-base64-3.3.1/")
(concatenate 'string *lisp-dirs* "cl-ppcre-1.2.14/")
(concatenate 'string *lisp-dirs* "cl-who-0.6.0/")
(concatenate 'string *lisp-dirs* "uffi-1.5.16/")
(concatenate 'string *lisp-dirs* "clsql-3.6.6/")
(concatenate 'string *lisp-dirs* "rt-20040621/")
(concatenate 'string *lisp-dirs* "kmrcl-1.85/")
(concatenate 'string *lisp-dirs* "md5-1.8.5/")
(concatenate 'string *lisp-dirs* "tbnl-0.9.10/")
(concatenate 'string *lisp-dirs* "rfc2388/")
(concatenate 'string *lisp-dirs* "url-rewrite/")))
Thanks!
/Mathias
> See in uffi src/strings.lisp
> uffi does not support non-ascii (or non-:latin-1?) encodings for sbcl
> (or for any implementation?)
So you're saying that I can never get this combination to work? I mean,
UFFI is used in CLSQL IIRC.
/Mathias
Mathias Dahl wrote:
> So you're saying that I can never get this combination to work? I mean,
> UFFI is used in CLSQL IIRC.
Ask the UFFI authors when they plan to implement encoding support.
--
WBR, Yaroslav Kavenchuk.
On Fri, 15 Sep 2006 00:38:57 -0700, kavenchuk wrote:
> Mathias:
>
>> So you're saying that I can never get this combination to work? I mean,
>> UFFI is used in CLSQL IIRC.
>
> Ask the UFFI authors when they plan to implement encoding support.
How utterly strange: i use UTF-8 encoded data from my postgresql database
via clsql and SBCL since sbcl introduced experimental unicode support.
Never was a problem ...
HTH Ralf Mattes ;-)
+ Ralf Mattes <··@mh-freiburg.de>:
| How utterly strange: i use UTF-8 encoded data from my postgresql database
| via clsql and SBCL since sbcl introduced experimental unicode support.
| Never was a problem ...
I don't think clsql uses uffi to access a postgresql database. Mysql,
however, is a different story.
--
* Harald Hanche-Olsen <URL:http://www.math.ntnu.no/~hanche/>
- It is undesirable to believe a proposition
when there is no ground whatsoever for supposing it is true.
-- Bertrand Russell
Harald Hanche-Olsen wrote:
> I don't think clsql uses uffi to access a postgresql database. Mysql,
> however, is a different story.
:) Oops, yes, uffi does support utf-8 encoding for c-strings on sbcl, but
only that encoding.
--
WBR, Yaroslav Kavenchuk.
On Fri, 15 Sep 2006 18:46:09 +0200, Harald Hanche-Olsen wrote:
> + Ralf Mattes <··@mh-freiburg.de>:
>
> | How utterly strange: i use UTF-8 encoded data from my postgresql database
> | via clsql and SBCL since sbcl introduced experimental unicode support.
> | Never was a problem ...
>
> I don't think clsql uses uffi to access a postgresql database.
Just for the record: it does.
You probably think of the postgres socket interface ...
Cheers, RalfD
> Mysql,
> however, is a different story.
+ Ralf Mattes <··@mh-freiburg.de>:
| On Fri, 15 Sep 2006 18:46:09 +0200, Harald Hanche-Olsen wrote:
|
|> I don't think clsql uses uffi to access a postgresql database.
|
| Just for the record: it does.
| You probably think of the postgres socket interface ...
Um, you're right, I probably do. It's the one I am using after all.
--
* Harald Hanche-Olsen <URL:http://www.math.ntnu.no/~hanche/>
I had a similar problem. SBCL parses your environment's 'LANG' and
sets its standard-output accordingly. In your shell, what do you get
when you do a 'locale' or 'echo $LANG'? When I added
export LANG=en_US.UTF-8
to my .bashrc (and sourced it), sbcl started setting standard-output
to utf-8, which fixed the standard-output problem.
BTW, I believe most lispers just create one or two directories that
store asdf links back to the real asdf files and push these dirs onto
their central-registry. That way, when using asdf, your start-up times
will be a little quicker, since asdf only has to search one or two
directories instead of 11 (if that matters to you).
Lou
Mathias Dahl wrote:
> debugger invoked on a SB-INT:STREAM-ENCODING-ERROR in thread
> #<THREAD "initial thread" {B2AA0D1}>:
> encoding error on stream #<SB-SYS:FD-STREAM for "standard output"
> {B2AA339}>
> (:EXTERNAL-FORMAT :LATIN-1):
> the character with code 1087 cannot be encoded.
>
> Type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL.
>
> restarts (invokable by number or by possibly-abbreviated name):
> 0: [OUTPUT-NOTHING] Skip output of this character.
> 1: [ABORT ] Exit debugger, returning to top level.
>
> (SB-INT:STREAM-ENCODING-ERROR
> #<SB-SYS:FD-STREAM for "standard output" {B2AA339}>
> 1087)
+ "Mathias Dahl" <············@gmail.com>:
| And here is the query that fails (other queries work at this point).
| Btw, the data is the word "правда" (pravda = truth), which is one
| of the few words I can type in cyrillic:
|
| * (clsql:query "SELECT t.name FROM tag_versions t")
|
| debugger invoked on a SB-INT:STREAM-ENCODING-ERROR in thread
| #<THREAD "initial thread" {B2AA0D1}>:
| encoding error on stream #<SB-SYS:FD-STREAM for "standard output"
| {B2AA339}>
| (:EXTERNAL-FORMAT :LATIN-1):
| the character with code 1087 cannot be encoded.
So clearly, your standard output stream is latin-1 encoded, and all
the clsql/mysql stuff is just a red herring. If anything, your test
shows that the communication to and from the database probably does
work as intended. It's the communication link between sbcl and
emacs/slime that is the problem.
This is easily tested: Just try to evaluate "правда" from the slime
prompt. Or to test just one direction or the other, try running
(char-code #\д) or (code-char #o434).
What is usually recommended (I think) is setting
slime-net-coding-system to 'utf-8-unix before starting slime.
I do the following instead, which also works:
(setf slime-lisp-implementations
'((sbcl ("sbcl") :coding-system utf-8-unix)
(cmucl ("cmucl") :coding-system iso-latin-1-unix)))
--
* Harald Hanche-Olsen <URL:http://www.math.ntnu.no/~hanche/>
Harald Hanche-Olsen <······@math.ntnu.no> writes:
> This is easily tested: Just try to evaluate "правда" from the slime
> prompt.
Be careful. It's not so easy. Inputting and outputting whole utf-8
strings doesn't tell much about utf-8 support by the tested
software. At most, it tells you that it's 8-bit transparent.
(aref "правда" 0) is a better test.
In "clisp -Eterminal utf-8":
[213]> (aref "правда" 0)
#\CYRILLIC_SMALL_LETTER_PE
In "sbcl --noinform" (version 0.9.12):
* (aref "правда" 0)
#\LATIN_CAPITAL_LETTER_ETH
*
Both accept and display the utf-8 strings, but in the first you have
unicode characters, while in the second you only have 8-bit
characters.
> Or to test just one direction or the other, try running
> (char-code #\д) or (code-char #o434).
* (char-code #\д)
debugger invoked on a READER-ERROR: READER-ERROR on #<SYNONYM-STREAM :SYMBOL SB-SYS:*STDIN* {90CCFB1}>:
unrecognized character name: "д"
* (map 'vector (function identity) "д")
#(#\LATIN_CAPITAL_LETTER_ETH #\ACUTE_ACCENT)
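The two characters in that last result are what one would expect when the UTF-8 bytes of "д" are read as Latin-1. A small SBCL-specific sketch of the same effect, using sb-ext:string-to-octets and sb-ext:octets-to-string (the string is built with code-char so the example does not depend on how the reader decodes its input):

```lisp
;; SBCL-specific sketch: encode "д" as UTF-8, then decode the same bytes as Latin-1.
(let* ((de (string (code-char #x434)))   ; "д", CYRILLIC SMALL LETTER DE
       (octets (sb-ext:string-to-octets de :external-format :utf-8))
       (misread (sb-ext:octets-to-string octets :external-format :latin-1)))
  (list (coerce octets 'list)            ; the two bytes #xD0 and #xB4
        (length misread)                 ; 2: one character per byte
        (map 'list #'char-code misread)))
```

One Cyrillic character goes in and two Latin-1 characters come out, which is the "8-bit transparent" behaviour described above: the bytes pass through unharmed, but Lisp never sees a Unicode character.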
> What is usually recommended (I think) is setting
> slime-net-coding-system to 'utf-8-unix before starting slime.
>
> I do the following instead, which also works:
>
> (setf slime-lisp-implementations
> '((sbcl ("sbcl") :coding-system utf-8-unix)
> (cmucl ("cmucl") :coding-system iso-latin-1-unix)))
--
__Pascal Bourguignon__ http://www.informatimago.com/
There is no worse tyranny than to force a man to pay for what he does not
want merely because you think it would be good for him. -- Robert Heinlein
+ Pascal Bourguignon <···@informatimago.com>:
| Harald Hanche-Olsen <······@math.ntnu.no> writes:
|> This is easily tested: Just try to evaluate "правда" from the slime
|> prompt.
|
| Be careful. It's not so easy. Inputting and outputting whole utf-8
| strings doesn't tell much about utf-8 support by the tested
| software. At most, it tells that it's 8-bit transparent.
Good point, though for slime <-> lisp communications (which is the
context here) I think my test will still reveal any problem, since you
have to work pretty hard to make slime and the backend have different
notions of the encoding on the channel - so long as the slime
developers have done that part right. But the tests you propose will
certainly reveal any problems in that area that my simple test might
overlook.
--
* Harald Hanche-Olsen <URL:http://www.math.ntnu.no/~hanche/>
> the clsql/mysql stuff is just a red herring. If anything, your test
> shows that the communication to and from the database probably does
> work as intended. It's the communication link between sbcl and
> emacs/slime that is the problem.
Sorry that I mixed things together. I get this error in "pure" SBCL,
running from the shell (gnome-terminal set to use utf-8):
* (clsql:query "SELECT t.name FROM tag_versions t")
debugger invoked on a SB-INT:STREAM-ENCODING-ERROR in thread
#<THREAD "initial thread" {B2AA0D1}>:
encoding error on stream #<SB-SYS:FD-STREAM for "standard output"
{B2AA339}>
(:EXTERNAL-FORMAT :LATIN-1):
the character with code 65533 cannot be encoded.
In this case, neither Emacs nor SLIME is involved.
> This is easily tested: Just try to evaluate "правда" from the slime
> prompt.
That works:
* "правда"
"правда"
> Or to test just one direction or the other, try running
> (char-code #\д) or (code-char #o434).
Here I get the same result as Pascal did:
* (char-code #\д)
debugger invoked on a READER-ERROR in thread
#<THREAD "initial thread" {B2AA0D1}>:
READER-ERROR on #<SYNONYM-STREAM :SYMBOL SB-SYS:*STDIN* {90CDF61}>:
unrecognized character name: "д"
and:
* (code-char #o434)
#\LATIN_CAPITAL_LETTER_G_WITH_CIRCUMFLEX
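That result is not an encoding problem, by the way: #o434 is read as octal, so it names code 284 rather than the Cyrillic letter. The hexadecimal form #x434 is presumably what was meant. A quick check in standard Common Lisp (assuming a Lisp with Unicode characters):

```lisp
;; #o is the octal reader prefix, #x the hexadecimal one.
(list #o434                            ; 284, LATIN CAPITAL LETTER G WITH CIRCUMFLEX
      #x434                            ; 1076, CYRILLIC SMALL LETTER DE
      (char-code (code-char #x434)))   ; 1076 again, round-tripped through a character
```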
> I do the following instead, which also works:
>
> (setf slime-lisp-implementations
> '((sbcl ("sbcl") :coding-system utf-8-unix)
> (cmucl ("cmucl") :coding-system iso-latin-1-unix)))
For me, doing that makes SLIME not start. I never get the REPL, instead
I get stuck in the inferior lisp buffer. It even fails if I use
iso-latin-1-unix for SBCL.
If we forget Emacs and SLIME for the moment, can you see any other
things that I can do to make the call to MySQL return the desired
results?
/Mathias
+ "Mathias Dahl" <············@gmail.com>:
| Sorry that I mixed things together. I get this error in "pure" SBCL,
| running from the shell (gnome-terminal set to use utf-8):
|
| * (clsql:query "SELECT t.name FROM tag_versions t")
|
| debugger invoked on a SB-INT:STREAM-ENCODING-ERROR in thread
| #<THREAD "initial thread" {B2AA0D1}>:
| encoding error on stream #<SB-SYS:FD-STREAM for "standard output"
| {B2AA339}>
| (:EXTERNAL-FORMAT :LATIN-1):
| the character with code 65533 cannot be encoded.
|
| In this case, neither Emacs nor SLIME is involved.
No, but the error message still seems to mumble about communications
with the terminal, not with the database. Try
(stream-external-format *standard-output*)
(stream-external-format *standard-input*)
to check that you have the right encoding. (Which won't work under
slime, BTW.)
If you get :ascii rather than :utf-8, you may try setting LC_CTYPE to
some UTF-8 locale before starting sbcl.
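To see both sides of that at once, something like the following could be pasted into a plain SBCL session started from the shell. It is only a sketch: sb-ext:posix-getenv is SBCL-specific, and which of these variables SBCL actually consults is an assumption here (under POSIX rules LC_ALL overrides LC_CTYPE, which overrides LANG).

```lisp
;; SBCL-specific: print the locale variables next to the resulting stream encodings.
(dolist (var '("LC_ALL" "LC_CTYPE" "LANG"))
  (format t "~A=~A~%" var (or (sb-ext:posix-getenv var) "(unset)")))
(format t "stdout: ~S~%stdin:  ~S~%"
        (stream-external-format *standard-output*)
        (stream-external-format *standard-input*))
```

If the streams report anything other than :UTF-8 while the terminal itself is set to UTF-8, the locale variables are the first thing to fix.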
|> I do the following instead, which also works:
|>
|> (setf slime-lisp-implementations
|> '((sbcl ("sbcl") :coding-system utf-8-unix)
|> (cmucl ("cmucl") :coding-system iso-latin-1-unix)))
|
| For me, doing that makes SLIME not start. I never get the REPL,
| instead I get stuck in the inferior lisp buffer. It even fails if I
| use iso-latin-1-unix for SBCL.
Strange. Works for me, though I admit I have not studied this
carefully. I just tweak my environment until it works. 8-)
The slime mailing list may be a better place to ask about this bit.
| If we forget Emacs and SLIME for the moment, can you see any other
| things that I can do to make the call to MySQL return the desired
| results?
Not me personally, no. There is also a clsql mailing list.
--
* Harald Hanche-Olsen <URL:http://www.math.ntnu.no/~hanche/>
> No, but the error message still seems to mumble about communications
> with the terminal, not with the database. Try
>
> (stream-external-format *standard-output*)
> (stream-external-format *standard-input*)
>
> to check that you have the right encoding. (Which won't work under
> slime, BTW.)
>
> If you get :ascii rather than :utf-8, you may try setting LC_CTYPE to
> some UTF-8 locale before starting sbcl.
Yay! That's it!
When I ran this:
(stream-external-format *standard-output*)
I got :LATIN-1 as a response.
So, I exited SBCL and did:
export LC_CTYPE=en_US.UTF-8
Started SBCL again and checked the external-format. Got :UTF-8.
Promising!
Ran my database setup stuff and executed a query that should return
UTF-8 encoded data.
It worked! "правда" in all its glory! :)
Many thanks!
Now let's see if the rest of the "stuff" (TBNL and more) will let this
through all the way to the browser... :)
/Mathias
> Now let's see if the rest of the "stuff" (TBNL and more) will let this
> through all the way to the browser... :)
Aaaaand.... It works too! Wee! All systems go (etc etc)!
Now I can concentrate on the app instead. That is, as soon as I get this
running under SLIME (the way I previously tried to set the encoding in
SLIME did not seem to work)...
/Mathias
Harald Hanche-Olsen wrote:
> + "Mathias Dahl" <············@gmail.com>:
>
> | Sorry that I mixed things together. I get this error in "pure" SBCL,
> | running from the shell (gnome-terminal set to use
> |> (setf slime-lisp-implementations
> |> '((sbcl ("sbcl") :coding-system utf-8-unix)
> |> (cmucl ("cmucl") :coding-system iso-latin-1-unix)))
> |
> | For me, doing that makes SLIME not start. I never get the REPL,
> | instead I get stuck in the inferior lisp buffer. It even fails if I
> | use iso-latin-1-unix for SBCL.
>
> Strange. Works for me, though I admit I have not studied this
> carefully. I just tweak my environment until it works. 8-)
Seems I was using a very old SLIME version (1.2). After downloading
2.0, the following works perfectly:
(setq slime-net-coding-system 'utf-8-unix)
After doing that I see the Unicode chars in Emacs too. Yay! :)
I found that variable in the manual, here:
http://common-lisp.net/project/slime/doc/html/slime_42.html#SEC42
On Thu, 14 Sep 2006 10:00:49 +0200 Harald Hanche-Olsen wrote:
> What is usually recommended (I think) is setting
> slime-net-coding-system to 'utf-8-unix before starting slime.
>
> I do the following instead, which also works:
>
> (setf slime-lisp-implementations
> '((sbcl ("sbcl") :coding-system utf-8-unix)
> (cmucl ("cmucl") :coding-system iso-latin-1-unix)))
Additionally, I had to tell swank to use utf-8 for the
communication (from ~/.sbclrc):
(asdf:operate 'asdf:load-op 'swank)
(swank:create-swank-server
4005 :spawn #'swank::simple-announce-function t :utf-8-unix)
David