Hi all
I just downloaded Allegro 6.0 for Linux.
I'd like to read from files and manipulate strings containing iso8859-1
characters.
I tried (setf *default-external-format* :iso8859-1) but lisp goes on
printing character codes instead of neat accentuated characters.
Is there a simple solution to this or am I asking too much?
Thank you
"Alexandre Oberlin" <········@netcourrier.com> wrote in message
······················@netcourrier.com...
> I'd like to read from files and manipulate strings containing iso8859-1
> characters.
> I tried (setf *default-external-format* :iso8859-1) but lisp goes on
> printing character codes instead of neat accentuated characters.
> Is there a simple solution to this or am I asking too much?
Thanks for your question. First, I'd like to caution you against
using *default-external-format*. Its use is being deprecated as Allegro
CL now determines the default external-format from the locale object
bound to *locale*. If, however, you already know the character encoding
for the file you wish to open, you can explicitly specify the stream's
external-format with the :external-format argument to
open/with-open-file/etc. You can also change a stream's external-format
after it has been created using (setf stream-external-format). For more
information, please look at ACL 6.0's online documentation's iacl.htm,
available over the internet as follows:
http://www.franz.com/support/documentation/6.0/doc/iacl.htm
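As a minimal sketch of the explicit approach described above: the helper name read-first-line-latin1 is made up for illustration, and :latin1 is assumed to be an external-format name your Lisp accepts (other implementations may spell it differently).

```lisp
;; Sketch only: READ-FIRST-LINE-LATIN1 is a hypothetical helper.
;; ASSUMPTION: :latin1 is an accepted external-format name in the
;; Lisp being used; the spelling varies between implementations.
(defun read-first-line-latin1 (path)
  "Return the first line of PATH decoded as ISO 8859-1, or NIL if the file is empty."
  (with-open-file (in path :direction :input :external-format :latin1)
    (read-line in nil nil)))
```

Naming the external-format at open time keeps the decoding independent of whatever default is in effect for the session.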
As for how characters are being printed, I'm unclear what you mean by
"printing character codes instead of neat accentuated characters". I'm
guessing that you are running lisp as a subprocess of emacs, and that
the emacs window is displaying, for example,
#\latin_small_letter_c_with_cedilla as `\347' instead of as a letter.
If this is the case for you, you might try setting the buffer coding
system for that emacs process to be "iso-8859-1". [In emacs 20.X, I
have found you can do this by switching to the emacs window that is
running lisp and, from the emacs menu bar, specify "Mule" -> "Set Coding
System" -> "Buffer Process" and enter "iso-8859-1" (without quotes) at
each of the subsequent minibuffer prompts.]
If you have further questions, please feel free to follow up to
·····@franz.com.
Charley
"Charles A. Cox" wrote:
>
> Thanks for your question. First, I'd like to caution you against
> using *default-external-format*. Its use is being deprecated as Allegro
> CL now determines the default external-format from the locale object
> bound to *locale*. If, however, you already know the character encoding
> for the file you wish to open, you can explicitly specify the stream's
> external-format with the :external-format argument to
> open/with-open-file/etc. You can also change a stream's external-format
> after it has been created using (setf stream-external-format). For more
> information, please look at ACL 6.0's online documentation's iacl.htm,
> available over the internet as follows:
>
> http://www.franz.com/support/documentation/6.0/doc/iacl.htm
Thank you very much for your answer.
I had actually read the file iacl.htm, and the first thing I did was to
pass :external-format :latin1 to with-open-file.
> As for how characters are being printed, I'm unclear what you mean by
> "printing character codes instead of neat accentuated characters". I'm
> guessing that you are running lisp as a subprocess of emacs, and that
> the emacs window is displaying, for example,
> #\latin_small_letter_c_with_cedilla as `\347' instead of as a letter.
Exactly
> If this is the case for you, you might try setting the buffer coding
> system for that emacs process to be "iso-8859-1". [In emacs 20.X, I
> have found you can do this by switching to the emacs window that is
> running lisp and, from the emacs menu bar, specify "Mule" -> "Set Coding
> System" -> "Buffer Process" and enter "iso-8859-1" (without quotes) at
> each of the subsequent minibuffer prompts.]
>
It is not just a display problem. The calculations on strings are wrong
too.
I've found a simple workaround, but of course it causes overhead. I will
try to describe in detail the problem, which should not be too
difficult.
Let's take a simple example: a file "dummy.dic" whose first line is the
word "éclair".
(defun read-first-line (dic)
  (with-open-file (dic-stream dic :direction :input)
    (let* ((rawstring (read-line dic-stream nil nil))
           (rawlist (coerce rawstring 'list)))
      (show2 rawstring)
      (show2 rawlist)
      rawstring)))
cl-user(31): (setq *rawstring* (read-first-line "dummy.dic"))
rawstring = "éclair"
rawlist = (#\latin_small_letter_e_with_acute #\c #\l #\a #\i #\r)
"éclair"
Note that using
(with-open-file (dic-stream dic :external-format :latin1 :direction :input)
yields exactly the same results.
But now if I type (coerce "éclair" 'list) into lisp, I get
(#\%^a #\latin_small_letter_e_with_acute #\c #\l #\a #\i #\r)
There is an extra #\%^a character before the accentuated character and
the string displays correctly.
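To see exactly where two such strings differ, it can help to compare character codes rather than printed forms. A minimal sketch using only standard Common Lisp functions (the helper names string-codes and first-difference are made up for illustration):

```lisp
;; Hypothetical helpers for illustration; only standard
;; Common Lisp functions (MAP, CHAR-CODE, MISMATCH) are used.
(defun string-codes (string)
  "Return the list of character codes in STRING."
  (map 'list #'char-code string))

(defun first-difference (a b)
  "Return the index where strings A and B first differ, or NIL if they are equal."
  (mismatch a b :test #'char=))
```

Applied to the string read from the file and to the string typed at the prompt, first-difference would be expected to return 0 here, since the extra character comes first in the typed string.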
Of course we have:
cl-user(27): (string-equal *toto* "éclair")
nil
To be able to work with strings read from a file, I manually add #\%^a
before any character that satisfies alpha-char-p but not
standard-char-p.
This of course is not particularly elegant ;-) Is there a better
solution?
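The workaround described above might look roughly like this. It is only a sketch: add-lead-char and its lead parameter are illustrative names, and since the thread does not say which character code #\%^a stands for, the caller passes that character in.

```lisp
;; Sketch of the workaround: prefix every character that is
;; ALPHA-CHAR-P but not STANDARD-CHAR-P with a given lead character.
;; ADD-LEAD-CHAR and LEAD are hypothetical names, not from the thread.
(defun add-lead-char (string lead)
  "Return a copy of STRING with LEAD inserted before each non-standard alphabetic character."
  (with-output-to-string (out)
    (loop for ch across string
          do (when (and (alpha-char-p ch)
                        (not (standard-char-p ch)))
               (write-char lead out))
             (write-char ch out))))
```

Every string read from a file has to pass through such a function before it is compared, which is the overhead mentioned above.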
So it seems that 2 bytes are used to correctly represent 8-bit extended
characters, or am I wrong?
Thank you!
> "Charles A. Cox" wrote:
> > If this is the case for you, you might try setting the buffer coding
> > system for that emacs process to be "iso-8859-1". [In emacs 20.X, I
> > have found you can do this by switching to the emacs window that is
> > running lisp and, from the emacs menu bar, specify "Mule" -> "Set Coding
> > System" -> "Buffer Process" and enter "iso-8859-1" (without quotes) at
> > each of the subsequent minibuffer prompts.]
> >
> It is not just a display problem. The calculations on strings are wrong
> too.
Well, I did what you explained in emacs and it actually did the trick. Now
I get
cl-user(27): (string-equal *toto* "éclair")
t
Thanks a lot!