From: Alexandre Oberlin
Subject: iso8859-1
Date: 
Message-ID: <3A000518.825306AB@netcourrier.com>
Hi all

I just downloaded Allegro 6.0 for Linux.
I'd like to read from files and manipulate strings containing iso8859-1
characters.
I tried (setf  *default-external-format* :iso8859-1) but lisp goes on
printing character codes instead of neat accentuated characters.
Is there a simple solution to this or am I asking too much?

Thank you

From: Charles A. Cox
Subject: Re: iso8859-1
Date: 
Message-ID: <8tpsul$p7c$1@news2.franz.com>
"Alexandre Oberlin" <········@netcourrier.com> wrote in message
······················@netcourrier.com...
> I'd like to read from files and manipulate strings containing iso8859-1
> characters.
> I tried (setf  *default-external-format* :iso8859-1) but lisp goes on
> printing character codes instead of neat accentuated characters.
> Is there a simple solution to this or am I asking too much?

  Thanks for your question.  First, I'd like to caution you against
using *default-external-format*.  Its use is being deprecated as Allegro
CL now determines the default external-format from the locale object
bound to *locale*.  If, however, you already know the character encoding
for the file you wish to open, you can explicitly specify the stream's
external-format with the :external-format argument to
open/with-open-file/etc.  You can also change a stream's external-format
after it has been created using (setf stream-external-format).  For more
information, please look at ACL 6.0's online documentation's iacl.htm,
available over the internet as follows:

 http://www.franz.com/support/documentation/6.0/doc/iacl.htm
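
The two options described above might look roughly like the following. This is a minimal sketch, not taken from the documentation: the function names are made up for illustration, :latin1 is assumed to be an accepted external-format name, and (setf stream-external-format) is an Allegro CL extension, so the second function is not portable Common Lisp.

```lisp
;; Sketch only. READ-FIRST-LINE-LATIN1 and READ-FIRST-LINE-SWITCHING are
;; hypothetical names; :latin1 is an assumed external-format name.

(defun read-first-line-latin1 (path)
  "Return the first line of PATH, decoded as ISO 8859-1."
  ;; The external format is fixed when the stream is opened.
  (with-open-file (in path :direction :input :external-format :latin1)
    (read-line in nil nil)))

(defun read-first-line-switching (path)
  "Open PATH with the default format, then switch the stream to ISO 8859-1."
  (with-open-file (in path :direction :input)
    ;; Allegro CL extension: change the format of an already-open stream.
    (setf (stream-external-format in) :latin1)
    (read-line in nil nil)))
```

Either function would be called as (read-first-line-latin1 "dummy.dic"), where the file name is a placeholder.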

  As for how characters are being printed, I'm unclear what you mean by
"printing character codes instead of neat accentuated characters".  I'm
guessing that you are running lisp as a subprocess of emacs, and that
the emacs window is displaying, for example,
#\latin_small_letter_c_with_cedilla as `\347' instead of as a letter.
If this is the case for you, you might try setting the buffer coding
system for that emacs process to be "iso-8859-1".  [In emacs 20.X, I
have found you can do this by switching to the emacs window that is
running lisp and, from the emacs menu bar, specify "Mule" -> "Set Coding
System" -> "Buffer Process" and enter "iso-8859-1" (without quotes) at
each of the subsequent minibuffer prompts.]
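
To relate the `\347' shown by emacs to a character, it may help to note that 347 is an octal byte value. The sketch below assumes a Lisp whose character codes 0-255 coincide with ISO 8859-1; the function name is made up for illustration.

```lisp
;; Sketch only. DESCRIBE-CODE is a hypothetical helper, and the :char entry
;; is meaningful only if codes 0-255 follow ISO 8859-1 (an assumption).
(defun describe-code (code)
  "Return CODE in decimal, hex and octal, plus the character with that code."
  (list :decimal code
        :hex (format nil "~X" code)
        :octal (format nil "~O" code)
        :char (code-char code)))

;; #o347 is read as an octal integer: decimal 231, hex E7.
(describe-code #o347)
```

In ISO 8859-1, code 231 (hex E7) is the small letter c with cedilla, so `\347' is that single byte shown as an escape rather than as a glyph.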

  If you have further questions, please feel free to follow up to
·····@franz.com.

    Charley

From: Alexandre Oberlin
Subject: Re: iso8859-1
Date: 
Message-ID: <3A0165A6.36DBEBA7@netcourrier.com>
"Charles A. Cox" wrote:
> 
 
>   Thanks for your question.  First, I'd like to caution you against
> using *default-external-format*.  Its use is being deprecated as Allegro
> CL now determines the default external-format from the locale object
> bound to *locale*.  If, however, you already know the character encoding
> for the file you wish to open, you can explicitly specify the stream's
> external-format with the :external-format argument to
> open/with-open-file/etc.  You can also change a stream's external-format
> after it has been created using (setf stream-external-format).  For more
> information, please look at ACL 6.0's online documentation's iacl.htm,
> available over the internet as follows:
> 
>  http://www.franz.com/support/documentation/6.0/doc/iacl.htm

Thank you very much for your answer.
I had actually read the file iacl.htm, and the first thing I did was to
pass :latin1 as the :external-format argument to with-open-file.

>   As for how characters are being printed, I'm unclear what you mean by
> "printing character codes instead of neat accentuated characters".  I'm
> guessing that you are running lisp as a subprocess of emacs, and that
> the emacs window is displaying, for example,
> #\latin_small_letter_c_with_cedilla as `\347' instead of as a letter.
Exactly

> If this is the case for you, you might try setting the buffer coding
> system for that emacs process to be "iso-8859-1".  [In emacs 20.X, I
> have found you can do this by switching to the emacs window that is
> running lisp and, from the emacs menu bar, specify "Mule" -> "Set Coding
> System" -> "Buffer Process" and enter "iso-8859-1" (without quotes) at
> each of the subsequent minibuffer prompts.]
> 
It is not just a display problem. The calculations on strings are wrong
too.

I've found a simple workaround, but of course it adds overhead. I will
try to describe the problem in detail with a simple example.
Let's have a file "dummy.dic" beginning with the word "éclair" on one
line.
(defun read-first-line (dic)
  (with-open-file (dic-stream dic :direction :input)
    (let* ((rawstring (read-line dic-stream nil nil))
           (rawlist (coerce rawstring 'list)))
      (show2 rawstring)
      (show2 rawlist)
      rawstring)))
cl-user(31): (setq *rawstring* (read-first-line "dummy.dic"))
rawstring = "éclair"
rawlist = (#\latin_small_letter_e_with_acute #\c #\l #\a #\i #\r)
"éclair"
Note that using
(with-open-file (dic-stream dic :external-format :latin1 :direction :input)
yields exactly the same results.

But now if I type (coerce "éclair" 'list) into lisp, I get
(#\%^a #\latin_small_letter_e_with_acute #\c #\l #\a #\i #\r)
There is an extra #\%^a character before the accentuated character and
the string displays correctly. 
Of course we have:
cl-user(27): (string-equal *toto* "éclair")
nil

What I did to be able to work with strings read from a file was to
manually add #\%^a before any character that satisfies alpha-char-p but
not standard-char-p.
This of course is not particularly elegant ;-) Is there a better
solution?
So it seems that 2 bytes are used to correctly represent 8-bit extended
characters, or am I wrong?
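
The mismatch can be reproduced without a file or a terminal by building both strings from character codes. This is only a sketch under one assumption that is not confirmed anywhere above: that #\%^a denotes the character with code 129 (#x81). The variable names are made up for illustration.

```lisp
;; Sketch only. ASSUMPTION: #\%^a is the character with code 129 (#x81).
;; *FROM-FILE* and *TYPED* are hypothetical names.

;; The string as read from the file: e-acute (code #xE9), then "clair".
(defparameter *from-file*
  (coerce (list (code-char #xE9) #\c #\l #\a #\i #\r) 'string))

;; The string as it arrives when typed: one extra leading character.
(defparameter *typed*
  (coerce (list (code-char #x81) (code-char #xE9) #\c #\l #\a #\i #\r)
          'string))

(length *from-file*)                ; 6
(length *typed*)                    ; 7
(string-equal *from-file* *typed*)  ; NIL, the lengths already differ
(mismatch *from-file* *typed*)      ; 0, they differ at the first position

;; Removing the extra character makes the two strings compare equal.
(string-equal *from-file* (remove (code-char #x81) *typed*)) ; T
```

If that reading is right, the typed string is the one carrying an extra character, so the file contents themselves would be read correctly as one character per byte.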

Thank you!

From: Alexandre Oberlin
Subject: Re: iso8859-1
Date: 
Message-ID: <3A016BE5.4BFEBEAA@netcourrier.com>
> "Charles A. Cox" wrote:

 
> > If this is the case for you, you might try setting the buffer coding
> > system for that emacs process to be "iso-8859-1".  [In emacs 20.X, I
> > have found you can do this by switching to the emacs window that is
> > running lisp and, from the emacs menu bar, specify "Mule" -> "Set Coding
> > System" -> "Buffer Process" and enter "iso-8859-1" (without quotes) at
> > each of the subsequent minibuffer prompts.]
> >
> It is not just a display problem. The calculations on strings are wrong
> too.

Well, I did what you explained in emacs and it did the trick. Now
I get
cl-user(27): (string-equal *toto* "éclair")
t
Thanks a lot!