From: Matthew Economou
Subject: Parsing CL: Unicode support?
Date: 
Message-ID: <w4ozox0ouct.fsf@nemesis.irtnog.org>
Hello,

Is there a Common Lisp implementation that supports non-ASCII
character sets (i.e. it uses Unicode or some other multi-byte
representation internally for CHARACTER)?  Is there a standard way for
READ to parse code that is in a non-English character set?  I've been
reading through Section 2 (Syntax) of the CLHS, and while it doesn't
seem to mandate the use of ASCII, it only specifies READ-and-family's
behavior on the 7-bit ASCII character set.

I just glanced through Section 13 (Characters), and section 13.1.2.1
(Character Scripts) seems to imply some kind of ISO support in the
form of this script subtype.  But it seems almost purposefully vague.
Can anyone shed some light on the subject?

-- 
"low ping bastard: n. anybody getting more frags than the person running their
client on the server." - Steve Caskey

From: Valeriy E. Ushakov
Subject: Re: Parsing CL: Unicode support?
Date: 
Message-ID: <7vheoc$7jk$1@news.ptc.spbu.ru>
Matthew Economou <········@irtnog.org> wrote:

> Is there a Common Lisp implementation that supports non-ASCII
> character sets (i.e. it uses Unicode or some other multi-byte
> representation internally for CHARACTER)?

The latest release of CLISP comes with Unicode support.  Its BASE-CHAR
type is Unicode.
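
A rough way to check a claim like this in whatever Lisp you have at
hand is to ask the implementation about its own character types.  The
sketch below uses only standard operators; the function name
CHARACTER-SUPPORT-SUMMARY and the choice of code 955 as a probe are
illustrative assumptions, not part of CLISP or any other
implementation.

```lisp
;; Hypothetical helper, not part of any implementation's API.
(defun character-support-summary ()
  "Return a plist describing this implementation's character types."
  (list :implementation (lisp-implementation-type)
        ;; Exclusive upper bound on character codes.
        :char-code-limit char-code-limit
        ;; True when every CHARACTER is also a BASE-CHAR.
        :base-char-covers-character
        (values (subtypep 'character 'base-char))
        ;; True when code 955 names a character here.  The guard keeps
        ;; CODE-CHAR from being called with an out-of-range code.
        :has-code-955 (and (< 955 char-code-limit)
                           (characterp (code-char 955)))))
```

Which character, if any, sits at code 955 is implementation-dependent,
since the standard does not fix the character encoding; in a
Unicode-based implementation it would be a Greek letter.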

SY, Uwe
-- 
···@ptc.spbu.ru                         |       Zu Grunde kommen
http://www.ptc.spbu.ru/~uwe/            |       Ist zu Grunde gehen

From: Paolo Amoroso
Subject: Re: Parsing CL: Unicode support?
Date: 
Message-ID: <381e4e40.4720830@news.mclink.it>
On 31 Oct 1999 02:35:46 -0500, Matthew Economou <········@irtnog.org>
wrote:

> Is there a Common Lisp implementation that supports non-ASCII
> character sets (i.e. it uses Unicode or some other multi-byte
> representation internally for CHARACTER)?  Is there a standard way for

I think that a version of Allegro and the latest one of CLISP support
Unicode.


Paolo
-- 
EncyCMUCLopedia * Extensive collection of CMU Common Lisp documentation
http://cvs2.cons.org:8000/cmucl/doc/EncyCMUCLopedia/

From: Jason Trenouth
Subject: Re: Parsing CL: Unicode support?
Date: 
Message-ID: <Y44dOKtKOse9cTwWVWRfTGet8V0=@4ax.com>
On Sun, 31 Oct 1999 14:22:08 GMT, ·······@mclink.it (Paolo Amoroso) wrote:

> On 31 Oct 1999 02:35:46 -0500, Matthew Economou <········@irtnog.org>
> wrote:
> 
> > Is there a Common Lisp implementation that supports non-ASCII
> > character sets (i.e. it uses Unicode or some other multi-byte
> > representation internally for CHARACTER)?  Is there a standard way for
> 
> I think that a version of Allegro and the latest one of CLISP support
> Unicode.

Harlequin LispWorks has had Unicode support on all platforms for several
years: http://www.harlequin.com

__Jason

From: Pekka P. Pirinen
Subject: Re: Parsing CL: Unicode support?
Date: 
Message-ID: <ixln8idkfz.fsf@gaspode.cam.harlequin.co.uk>
Matthew Economou <········@irtnog.org> writes:
> Is there a standard way for
> READ to parse code that is in a non-English character set?

Actually, there's no trick to it, you just call READ -- it's just that
the result isn't standardized.  Presumably, the implementation has
reasonable syntax definitions for the additional characters.  If not,
you can change the syntax types using SET-SYNTAX-FROM-CHAR.  If the
token syntax is screwed (the "constituent traits" (ANS 2.1.4.2) are
not what you want them to be), you're in trouble, because there's no
standard way of changing it.
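
As a minimal sketch of the SET-SYNTAX-FROM-CHAR route: the helper
below (READ-WITH-SYNTAX is a made-up name, not a standard function)
copies the standard readtable, gives one character the syntax type of
another, and then reads from a string.  The example uses #\! only
because every implementation has it; for an additional character you
would pass that character instead, assuming your implementation can
represent it.

```lisp
;; Hypothetical helper: read one object from STRING with CHAR given
;; the syntax type of FROM-CHAR.
(defun read-with-syntax (string char from-char)
  ;; Work on a private copy of the standard readtable so the global
  ;; readtable is left untouched.
  (let ((*readtable* (copy-readtable nil)))
    ;; The target readtable defaults to *READTABLE*; the source
    ;; readtable defaults to the standard readtable.
    (set-syntax-from-char char from-char)
    ;; VALUES drops the second return value (the end position).
    (values (read-from-string string))))

;; Make #\! behave like the quote macro character #\'.
(read-with-syntax "!foo" #\! #\')   ; => (QUOTE FOO)
```

Note that this copies only the syntax type (and the macro function, if
there is one).  The constituent traits are not copied, which is the
limitation described above.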

> I just glanced through Section 13 (Characters), and section 13.1.2.1
> (Character Scripts) seems to imply some kind of ISO support in the
> form of this script subtype.  But it seems almost purposefully vague.

It _is_ purposefully vague.  The committee didn't feel it could
standardize this area, so they just defined the terms.  This section
says CL CHARACTERs represent ISO characters, and there's no particular
script or coding you need to support, as long as you have all the
standard characters.  Section 13.1.10 says you should document your
scripts and their syntactic properties.
-- 
Pekka P. Pirinen
Adaptive Memory Management Group, Harlequin Limited
"If you don't look after knowledge, it goes away."
  - Terry Pratchett, The Carpet People