From: Jonathon McKitrick
Subject: Can't get Unicode on SBCL to work
Date: 
Message-ID: <1147928824.705239.279880@j55g2000cwa.googlegroups.com>
I'm sure this is simple, but believe it or not, I can't find the word
'unicode' anywhere in the manual.

I'm having issues outputting char-codes for certain symbols, and I
think unicode is what I need.  Is this on by default, or is there
something I have to do to 'activate' unicode support?

From: Thomas F. Burdick
Subject: Re: Can't get Unicode on SBCL to work
Date: 
Message-ID: <xcvr72rblx0.fsf@conquest.OCF.Berkeley.EDU>
"Jonathon McKitrick" <···········@bigfoot.com> writes:

> I'm sure this is simple, but believe it or not, I can't find the word
> 'unicode' anywhere in the manual.
> 
> I'm having issues outputting char-codes for certain symbols, and I
> think unicode is what I need.  Is this on by default, or is there
> something I have to do to 'activate' unicode support?

It's on by default.  You can double-check by looking for :SB-UNICODE
on *FEATURES*.  Without a more specific question, that's all I can say.
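
A minimal sketch of that check, assuming SBCL. UNICODE-BUILD-P is a hypothetical helper name, not something SBCL provides. The last two forms are only extra things worth looking at, since the character repertoire and the stream encoding are separate questions:

```lisp
;; UNICODE-BUILD-P is a hypothetical helper, not part of SBCL.
(defun unicode-build-p ()
  "Return T if this image was built with Unicode character support."
  (and (member :sb-unicode *features*) t))

(unicode-build-p)

;; Standard CL: the exclusive upper bound on character codes in this image.
char-code-limit

;; Standard CL: the encoding used when characters are written to this stream.
;; The value is implementation-dependent and may differ under SLIME.
(stream-external-format *standard-output*)
```

A Unicode build can still write the wrong bytes if a stream's external format does not match what the consumer expects.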
From: Zach Beane
Subject: Re: Can't get Unicode on SBCL to work
Date: 
Message-ID: <m3u07nfvre.fsf@unnamed.xach.com>
"Jonathon McKitrick" <···········@bigfoot.com> writes:

> I'm sure this is simple, but believe it or not, I can't find the word
> 'unicode' anywhere in the manual.
> 
> I'm having issues outputting char-codes for certain symbols, and I
> think unicode is what I need.  Is this on by default, or is there
> something I have to do to 'activate' unicode support?

In the shell, what does "locale" print?

Zach
From: R. Mattes
Subject: Re: Can't get Unicode on SBCL to work
Date: 
Message-ID: <pan.2006.05.18.10.43.59.305173@mh-freiburg.de>
On Wed, 17 May 2006 22:07:04 -0700, Jonathon McKitrick wrote:

> I'm sure this is simple, but believe it or not, I can't find the word
> 'unicode' anywhere in the manual.
> 
> I'm having issues outputting char-codes for certain symbols,

What symbols?

> and I
> think unicode is what I need.  

You probably need a character _encoding_ (such as UTF-8 or a similar
encoding).

> Is this on by default, or is there
> something I have to do to 'activate' unicode support?

What version of SBCL? What's your SBCL's feature list? Mine says:

 CL-USER> *features*
(:ASDF :CLC-OS-DEBIAN
       :COMMON-LISP-CONTROLLER
       :ANSI-CL
       :COMMON-LISP
       :SBCL
       :UNIX
       :SB-DOC
       :SB-TEST
       :SB-PACKAGE-LOCKS
       :SB-UNICODE             ; <-------- Look for this
       :SB-SOURCE-LOCATIONS
       :IEEE-FLOATING-POINT
       :PPC
       :ELF
       :LINUX
       :STACK-ALLOCATABLE-CLOSURES
       :OS-PROVIDES-DLOPEN
       :OS-PROVIDES-DLADDR
       :OS-PROVIDES-PUTWC)

Can you elaborate on your problem? What exactly isn't working?

 HTH Ralf Mattes
From: Jonathon McKitrick
Subject: Re: Can't get Unicode on SBCL to work
Date: 
Message-ID: <1147953998.165111.98140@g10g2000cwb.googlegroups.com>
> Can you elaborate on your problem? What exactly isn't working?

I found sb-unicode on *features*, of course.  And 'locale' on my
machine says 'en_US.UTF-8' for each item.

I have discovered two problem areas.

First, I need to explain some data flow.

A CSV file of data is read into the database by an SQL script.  This
file has international characters, mostly just Spanish or Portuguese
names.  A CL-SQL function parses the table and re-inserts the data
where necessary into another table.  This is the first problem area:
SBCL chokes on the international characters coming back from the DB
queries.  I have to go through the CSV file first, changing them to
simple vowels, before the import will work, even though the SQL side
has no problem with them.

Later, when I want to generate something as simple as a copyright or
trademark symbol with cl-pdf, each of the characters appears with an
'A' with a caret before it.  So

(format nil "~c" #\COPYRIGHT_SIGN)

produces, roughly,

··@

where the caret should be over the A and the @ is a copyright sign.  I
really need to learn UTF-8 input on Firefox.  ;-)

Interestingly,

(code-char 169)

gives #\COPYRIGHT_SIGN on SBCL, but ··@ (the actual copyright sign, not
@) on ACL.
From: Marcin 'Qrczak' Kowalczyk
Subject: Re: Can't get Unicode on SBCL to work
Date: 
Message-ID: <87wtcjcufa.fsf@qrnik.zagroda>
"Jonathon McKitrick" <···········@bigfoot.com> writes:

> Later, when I want to generate something as simple as a copyright or
> trademark symbol with cl-pdf, each of the characters appears with an
> 'A' with a caret before it.  So
>
> (format nil "~c" #\COPYRIGHT_SIGN)
>
> produces, roughly,
>
> ··@

This means that UTF-8 encoded text is reinterpreted as if it were
ISO-8859-1.
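
The effect can be reproduced directly in SBCL. The following is only a sketch: MISREAD-UTF-8-AS-LATIN-1 is a hypothetical helper written for illustration, and it says nothing about where in the cl-pdf pipeline the reinterpretation actually happens.

```lisp
;; Sketch for SBCL, using the SB-EXT octet conversion functions.
;; MISREAD-UTF-8-AS-LATIN-1 is a hypothetical helper, not part of SBCL or cl-pdf.
(defun misread-utf-8-as-latin-1 (string)
  "Encode STRING as UTF-8, then decode those octets as ISO-8859-1."
  (sb-ext:octets-to-string
   (sb-ext:string-to-octets string :external-format :utf-8)
   :external-format :latin-1))

;; U+00A9 COPYRIGHT SIGN takes two octets in UTF-8.
(sb-ext:string-to-octets (string (code-char 169)) :external-format :utf-8)
;; => #(194 169)

;; Read back as ISO-8859-1, each octet becomes its own character:
;; code 194 is A WITH CIRCUMFLEX, code 169 is the copyright sign.
(map 'list #'char-code (misread-utf-8-as-latin-1 (string (code-char 169))))
;; => (194 169)
```

That two-character result is the "A with a caret, then the copyright sign" described above, so the place to look is wherever UTF-8 octets are handed to something that expects one byte per character.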

-- 
   __("<         Marcin Kowalczyk
   \__/       ······@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/
From: Jonathon McKitrick
Subject: Re: Can't get Unicode on SBCL to work
Date: 
Message-ID: <1147960952.170124.234150@j73g2000cwa.googlegroups.com>
Marcin 'Qrczak' Kowalczyk wrote:
> > (format nil "~c" #\COPYRIGHT_SIGN)
> >
> > produces, roughly,
> >
> > ··@
>
> This means that UTF-8 encoded text is reinterpreted as if it were
> ISO-8859-1.

Is this a simple fix?
From: Pascal Bourguignon
Subject: Re: Can't get Unicode on SBCL to work
Date: 
Message-ID: <87lkszwoma.fsf@thalassa.informatimago.com>
"Jonathon McKitrick" <···········@bigfoot.com> writes:

> I'm sure this is simple, but believe it or not, I can't find the word
> 'unicode' anywhere in the manual.
>
> I'm having issues outputting char-codes for certain symbols, and I
> think unicode is what I need.  Is this on by default, or is there
> something I have to do to 'activate' unicode support?

It should be activated by default.  Do you have a recent version of SBCL?
However, it has some difficulties with Korean, for example:

% /usr/local/bin/sbcl
This is SBCL 0.9.12, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses.  See the CREDITS and COPYING files in the
distribution for more information.
;; Reading ASDF packages from /home/pjb/asdf-central-registry.data...
; loading system definition from
; /usr/local/languages/sbcl/lib/sbcl/sb-bsd-sockets/sb-bsd-sockets.asd into
; #<PACKAGE "ASDF0">
; registering #<SYSTEM SB-BSD-SOCKETS {AB17591}> as SB-BSD-SOCKETS
; registering #<SYSTEM SB-BSD-SOCKETS-TESTS {AE89119}> as SB-BSD-SOCKETS-TESTS
; loading system definition from
; /usr/local/languages/sbcl/lib/sbcl/sb-posix/sb-posix.asd into
; #<PACKAGE "ASDF0">
; registering #<SYSTEM SB-POSIX {A6ED601}> as SB-POSIX
; registering #<SYSTEM SB-POSIX-TESTS {A873E81}> as SB-POSIX-TESTS
~/.sbclrc loaded
* (defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
  (loop :for i :from בכוכ :to номер :by 단계 :collect i))


|ιοτα|
* (ιοτα :номер 10 :단계 2  :בכוכ 2)

(2 4 6 8 10)
* (print '(ιοτα :номер 10 :단계 2  :בכוכ 2))

(|ιοτα| :|номер| 10 :|ˋ�ʳ�| 2 :|בכוכ| 2) 
(|ιοτα| :|номер| 10 :|ˋ�ʳ�| 2 :|בכוכ| 2)
* *features*

(:COM.INFORMATIMAGO.PJB :ASDF :ANSI-CL :COMMON-LISP :SBCL :UNIX :SB-DOC :SB-TEST :SB-PACKAGE-LOCKS :SB-UNICODE :SB-SOURCE-LOCATIONS :IEEE-FLOATING-POINT :X86 :ELF :LINUX :GENCGC :STACK-GROWS-DOWNWARD-NOT-UPWARD :C-STACK-IS-CONTROL-STACK :STACK-ALLOCATABLE-CLOSURES :ALIEN-CALLBACKS :LINKAGE-TABLE :OS-PROVIDES-DLOPEN :OS-PROVIDES-DLADDR :OS-PROVIDES-PUTWC)
* (sb-ext:string-to-octets  "단계" :external-format :utf-8)

#(195 171 194 139 194 168 195 170 194 179 194 132)

Here is what clisp gives:

[73]> (ext:convert-string-to-bytes "단계" charset:utf-8)
#(235 139 168 234 179 132)
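
Comparing the two octet vectors suggests where the difference comes from. This is a sketch, assuming SBCL; DECODE-UTF-8-CODES is a hypothetical helper. Decoding SBCL's twelve octets as UTF-8 gives six characters whose codes are exactly the six bytes CLISP printed:

```lisp
;; DECODE-UTF-8-CODES is a hypothetical helper, not part of SBCL.
(defun decode-utf-8-codes (octets)
  "Decode OCTETS as UTF-8 and return the list of resulting character codes."
  (map 'list #'char-code
       (sb-ext:octets-to-string
        (coerce octets '(vector (unsigned-byte 8)))
        :external-format :utf-8)))

;; The octets SBCL returned above:
(decode-utf-8-codes '(195 171 194 139 194 168 195 170 194 179 194 132))
;; => (235 139 168 234 179 132)

;; Building the two Hangul characters from their code points (U+B2E8, U+ACC4)
;; and encoding them gives the same six octets CLISP reported.
(sb-ext:string-to-octets
 (coerce (list (code-char #xB2E8) (code-char #xACC4)) 'string)
 :external-format :utf-8)
;; => #(235 139 168 234 179 132)
```

So the string SBCL encoded appears to have held six one-byte characters rather than two Hangul characters. That is consistent with the input having been read one byte per character (as with ISO-8859-1) before string-to-octets was called, rather than with a fault in the UTF-8 encoder itself.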

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

"You can tell the Lisp programmers.  They have pockets full of punch
 cards with close parentheses on them." --> http://tinyurl.com/8ubpf