From: Mathias  Dahl
Subject: How to change external-format in SBCL (c-string encoding error)
Date: 
Message-ID: <1182097358.390732.118300@n2g2000hse.googlegroups.com>
Hi!

I have a problem with how to redefine the external format used in SBCL
after starting it up. At least this is what I think I have a problem
with...

When I run this code:

 (DIRECTORY "/big/dat/gfx/mp3/artist/T*/*.mp3")

I get this error:

c-string encoding error (:external-format :ASCII):
  the character with code 195 cannot be encoded.
   [Condition of type SB-INT:C-STRING-ENCODING-ERROR]

Restarts:
  0: [ABORT-REQUEST] Abort handling SLIME request.
  1: [TERMINATE-THREAD] Terminate this thread (#<THREAD "repl-thread" {BE70D91}>)

Backtrace:
  0: (SB-INT:C-STRING-ENCODING-ERROR :ASCII 195)
  1: (SB-IMPL::OUTPUT-TO-C-STRING/ASCII "/big/dat/gfx/mp3/artist/Tommy Körberg")
  2: (SB-UNIX:UNIX-STAT "/big/dat/gfx/mp3/artist/Tommy Körberg")
  3: (SB-IMPL::%ENUMERATE-DIRECTORIES "/big/dat/gfx/mp3/artist/" (#<SB-IMPL::PATTERN ("T" :MULTI-CHAR-WILD)>) #P"/big/dat/gfx/mp3/artist/T*/*.mp3" T T ((837 . 1360279) (837 . 1343489) (837 . 1146908) (837 . 1146881) (837 . 2)) #<CLOSURE (LAMBDA (SB-IMPL::MATCH)) {BF9C1DD}>)
  4: ((LABELS SB-IMPL::DO-DIRECTORY) #P"/big/dat/gfx/mp3/artist/T*/*.mp3")
  5: (DIRECTORY "/big/dat/gfx/mp3/artist/T*/*.mp3")
  6: (SB-INT:SIMPLE-EVAL-IN-LEXENV (DIRECTORY "/big/dat/gfx/mp3/artist/T*/*.mp3") #<NULL-LEXENV>)
  7: (SWANK::EVAL-REGION "(directory \"/big/dat/gfx/mp3/artist/T*/*.mp3\")

The reason, of course, is that the file name above contains non-ASCII
characters.
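
An aside on where the 195 in the error comes from: in UTF-8, "ö"
(U+00F6) is the two-byte sequence 0xC3 0xB6, and 0xC3 is 195 in
decimal. A plausible reading, not confirmed anywhere in this thread, is
that SBCL read the file name one byte per character under its
ASCII-like default and then could not write character 195 back out as
ASCII. A minimal shell sketch to check the byte values (the variable
name is just for illustration):

```shell
# The octal escapes \303\266 are the two bytes of "ö" (U+00F6) in UTF-8,
# spelled out explicitly so the result does not depend on how this
# snippet itself is encoded.
# od -An -tu1 prints each byte as an unsigned decimal, without offsets;
# the unquoted $(...) lets the shell collapse od's padding to single spaces.
o_umlaut_bytes=$(echo $(printf '\303\266' | od -An -tu1))
echo "$o_umlaut_bytes"
```

The first number printed is the character code that the error message
complains about.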

Now, this only happens when my program is started as a system service
(I am running SBCL 1.0 under Mandriva GNU/Linux) but not if I start it
"manually" from the command line. I think I know why I get this
behavior but not how to solve the problem:

I am using UTF-8 as the default encoding in my system and when I start
SBCL from the command line it correctly inherits this:

* (sb-impl::default-external-format)

:UTF-8

However, when SBCL is started as a system service I get this instead:

* (sb-impl::default-external-format)

:ANSI_X3.4-1968

I have tried to change the default format like this, but I have no
idea if I am on the right track:

* (setf sb-impl::*default-external-format* :UTF-8)

It does not solve my problem, however, so it seems there are other
internal variables in SBCL that determine the encoding to use in the
call to (SB-UNIX:UNIX-STAT "/big/dat/gfx/mp3/artist/Tommy Körberg").

Any ideas on how to tackle this?

/Mathias

PS. It might be possible to tweak the script that defines the service
and change the default encoding there, but I am more interested in
learning how to solve this from inside SBCL.

PPS. I have tried to search the documentation for encoding and formats
etc, to no avail. Is there any place where I could have found the
answer myself?

From: Richard M Kreuter
Subject: Re: How to change external-format in SBCL (c-string encoding error)
Date: 
Message-ID: <87myyybifi.fsf@tan-ru.localdomain>
Mathias  Dahl <············@gmail.com> writes:

> I get this error:
>
> c-string encoding error (:external-format :ASCII):
>   the character with code 195 cannot be encoded.
>    [Condition of type SB-INT:C-STRING-ENCODING-ERROR]
<snip>
> Now, this only happens when my program is started as a system service
> (I am running SBCL 1.0 under Mandriva GNU/Linux) but not if I start it
> "manually" from the command line. I think I know why I get this
> behavior but not how to solve the problem:

SBCL determines the default external format by examining the process
environment, see src/code/octets.lisp in the SBCL sources.  Probably
your system is configured with a default locale of "C", but your
personal login profile sets some of the locale variables to something
more appropriate for you.
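
To see which locale setting actually wins in a given environment, the
usual POSIX precedence can be sketched in a few lines of shell. This
only mirrors the rule for the environment variables; it is not how SBCL
itself is implemented, and the function name effective_ctype is made up
for the example:

```shell
effective_ctype() {
    # POSIX precedence for the character-type locale category:
    # LC_ALL beats LC_CTYPE, which beats LANG. Unset or empty values are
    # skipped; with none of them set, the locale in practice falls back
    # to "C" (also known as "POSIX").
    if [ -n "${LC_ALL:-}" ]; then
        echo "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        echo "$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        echo "$LANG"
    else
        echo "C"
    fi
}

# Report what the current process environment selects.
echo "effective LC_CTYPE: $(effective_ctype)"
```

Running this once from a login shell and once from the service's start
script should show whether the two environments really differ.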

> Any ideas on how to tackle this?

Set up appropriate variables for locale in the process environment in
which you invoke the SBCL instance, just like you would for any other
Unix program.
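
As a rough sketch of what that can look like in a start script: the
helper below runs a command with LC_CTYPE forced to a UTF-8 locale,
without touching the calling shell's own environment. The helper name
with_utf8_ctype is hypothetical, and en_US.UTF-8 is only an example
value; use whichever UTF-8 locale "locale -a" lists on your system.

```shell
with_utf8_ctype() {
    # Run "$@" in a subshell so the caller's environment stays untouched.
    # LC_ALL is removed for the child because it would otherwise
    # override LC_CTYPE.
    (
        unset LC_ALL
        exec env LC_CTYPE=en_US.UTF-8 "$@"
    )
}

# Demonstration with a child shell standing in for the real program.
with_utf8_ctype /bin/sh -c 'echo "LC_CTYPE seen by child: $LC_CTYPE"'
```

In a service script the child would be the SBCL invocation instead of
/bin/sh, with whatever arguments the service already uses.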

> PS. It might be possible to tweak the script that defines the service
> and change the default encoding there, but I am more interested in
> learning how to solve this from inside SBCL.

I don't think there's a supported interface for changing these
settings once SBCL is running.

> PPS. I have tried to search the documentation for encoding and formats
> etc, to no avail. Is there any place where I could have found the
> answer myself?

Not really.  It's sort of assumed that the process environment that
SBCL finds itself in accurately describes the state of the operating
system, file systems, etc.

--
RmK
From: Mathias  Dahl
Subject: Re: How to change external-format in SBCL (c-string encoding error)
Date: 
Message-ID: <1182106924.966464.114120@g4g2000hsf.googlegroups.com>
> > PS. It might be possible to tweak the script that defines the service
> > and change the default encoding there, but I am more interested in
> > learning how to solve this from inside SBCL.
>
> I don't think there's a supported interface for changing these
> settings once SBCL is running.

Okay, fair enough. I used what Pascal suggested and changed LC_CTYPE
in the script that starts the service, and it works well. I guess I
thought it would be harder to do than those two lines, and if SBCL
does not export its internals for this, so be it.

Thanks both of you!

/Mathias - One step closer to building web apps using only Lisp,
yay! :)
From: Vassil Nikolov
Subject: Re: How to change external-format in SBCL (c-string encoding error)
Date: 
Message-ID: <kasl8qmjj2.fsf@localhost.localdomain>
On Sun, 17 Jun 2007 19:02:04 -0000, Mathias  Dahl <············@gmail.com> said:
| ... I used what Pascal suggested and changed LC_CTYPE
| in the script that starts the service, and it works well. I guess I
| thought it would be harder to do than those two lines, and if SBCL
| does not export its internals for this, so be it.

  By the way, rather than a matter of exposing an interface, this may
  be a matter of locale taking effect very early (and permanently for
  the lisp process), so by the time user code gets to run, it is too
  late to change anything in the area that depends on locale.

  (Note: it is quite usual for processes started from /etc/rc* to have
  a rather different environment from user processes, and it is then
  quite usual for the former to make very explicit arrangements to
  ensure that their environments contain appropriate values.)
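
  A quick way to get a feel for this from a login shell is to start a
  child with an empty environment, which is roughly (not exactly) what
  a boot-time script sees before it sets anything itself. A minimal
  sketch, with bare_lang as an illustrative variable name:

```shell
# env -i starts the command with an empty environment, so none of the
# locale variables from the login profile are inherited by the child.
bare_lang=$(env -i /bin/sh -c 'echo "${LANG:-unset}"')
echo "LANG in a bare environment: $bare_lang"
echo "LANG in this shell:         ${LANG:-unset}"
```

  The actual environment of a service depends on the init system and
  the distribution, so this is only an approximation.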

  ---Vassil.


-- 
The truly good code is the obviously correct code.
From: Mathias  Dahl
Subject: Re: How to change external-format in SBCL (c-string encoding error)
Date: 
Message-ID: <1182121513.061946.35070@k79g2000hse.googlegroups.com>
>   By the way, rather than a matter of exposing an interface, this may
>   be a matter of locale taking effect very early (and permanently for
>   the lisp process), so by the time user code gets to run, it is too
>   late to change anything in the area that depends on locale.
>
>   (Note: it is quite usual for processes started from /etc/rc* to have
>   a rather different environment from user processes, and it is then
>   quite usual for the former to make very explicit arrangements to
>   ensure that their environments contain appropriate values.)

Interesting points!

/Mathias
From: Pascal Bourguignon
Subject: Re: How to change external-format in SBCL (c-string encoding error)
Date: 
Message-ID: <87k5u2gf06.fsf@thalassa.lan.informatimago.com>
Vassil Nikolov <···············@pobox.com> writes:

> On Sun, 17 Jun 2007 19:02:04 -0000, Mathias  Dahl <············@gmail.com> said:
> | ... I used what Pascal suggested and changed LC_CTYPE
> | in the script that starts the service, and it works well. I guess I
> | thought it would be harder to do than those two lines, and if SBCL
> | does not export its internals for this, so be it.
>
>   By the way, rather than a matter of exposing an interface, this may
>   be a matter of locale taking effect very early (and permanently for
>   the lisp process), so by the time user code gets to run, it is too
>   late to change anything in the area that depends on locale.

This doesn't prevent clisp, for example, from taking into account
changes to its various encoding variables much later.

But we can't complain, sbcl is only at 1.0.6, while clisp is at 2.41
and allegro cl is at 8.0: you cannot expect the same level of
sophistication from sbcl...


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

NOTE: The most fundamental particles in this product are held
together by a "gluing" force about which little is currently known
and whose adhesive power can therefore not be permanently
guaranteed.
From: Vassil Nikolov
Subject: Re: How to change external-format in SBCL (c-string encoding error)
Date: 
Message-ID: <kair9k7e60.fsf@localhost.localdomain>
On Mon, 18 Jun 2007 04:42:17 +0200, Pascal Bourguignon <···@informatimago.com> said:

| Vassil Nikolov <···············@pobox.com> writes:
|| On Sun, 17 Jun 2007 19:02:04 -0000, Mathias  Dahl <············@gmail.com> said:
|| | ... I used what Pascal suggested and changed LC_CTYPE
|| | in the script that starts the service, and it works well. I guess I
|| | thought it would be harder to do than those two lines, and if SBCL
|| | does not export its internals for this, so be it.
|| 
|| By the way, rather than a matter of exposing an interface, this may
|| be a matter of locale taking effect very early (and permanently for
|| the lisp process), so by the time user code gets to run, it is too
|| late to change anything in the area that depends on locale.

| This doesn't prevent clisp, for example, from taking into account
| changes to its various encoding variables much later.

  In my followup, I was only pointing out the distinction between the
  need to expose an interface and the need to make
  process-initialization-time decisions reversible (and that it was
  the latter which was at stake, rather than the former).

  As to whether or not the latter is desirable (i.e., what the
  tradeoffs are), Richard Kreuter has already written about that, of
  course.

| But we can't complain, sbcl is only at 1.0.6, while clisp is at 2.41
| and allegro cl is at 8.0: you cannot expect the same level of
| sophistication from sbcl...

  In terms of expectation, since SBCL is not a "green field"
  implementation, it probably should not be exempt from expectations
  of sophistication; in terms of whether it meets such expectations, I
  believe that in fact it does, since e.g. in terms of achieving high
  execution speed for large classes of programs it is more
  sophisticated than CLisp.  (It should be needless to say, but let me
  assert that that does not speak ill of CLisp.)

  I am not sure that comparing SBCL (or CLisp, for that matter) to
  Allegro CL would be a fair comparison.  (That is an intuitive
  opinion, though, and if anyone supplies arguments to the opposite,
  it may not stand.)

  ---Vassil.


From: Richard M Kreuter
Subject: Re: How to change external-format in SBCL (c-string encoding error)
Date: 
Message-ID: <87bqfeartq.fsf@tan-ru.localdomain>
Mathias  Dahl <············@gmail.com> writes:

> [I]f SBCL does not export its internals for this, so be it.

Well, although in principle there's not much to exposing the default
external format and making it mutable, there are various reasons why
it's undesirable to do so:

* The encodings used by the terminal, child processes, and other
  programs that might do I/O with Lisp aren't really under Lisp's
  control, and so letting the user change how an already-running Lisp
  encodes and decodes I/O doesn't actually affect what programs and
  hardware outside of Lisp are able to produce or accept.  The
  terminal, in particular, will have already been set up by the time
  you can run any Lisp code.

* It would almost certainly be a lose to let Lisp encode filenames
  differently from all other programs on the system, so there's a
  disincentive to letting the user second-guess SBCL's attempt to use
  environment variables for this purpose.

* Data that was pulled into Lisp before changing the default external
  format might not be usable after changing the default external
  format.  For example, a pathname constructed before the change might
  cause problems after the change, in case characters in some string
  in the pathname encode differently after the change.
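
The filename points can be made concrete with a small shell experiment.
It assumes a typical Linux filesystem, where names are stored as raw
bytes; other systems may reject or normalize such names. The variable
names are illustrative only:

```shell
dir=$(mktemp -d)

# The same name, "Körberg", encoded two ways: ISO-8859-1 uses the single
# byte \366 for "ö", UTF-8 uses the two bytes \303\266.
name_latin1=$(printf 'K\366rberg')
name_utf8=$(printf 'K\303\266rberg')

# Create one empty file under each byte sequence.
: > "$dir/$name_latin1"
: > "$dir/$name_utf8"

# Count the regular files that now exist in the directory.
file_count=$(find "$dir" -type f | wc -l | tr -d ' ')
echo "files created: $file_count"

rm -rf "$dir"
```

If the count is 2, the filesystem treats the two encodings as two
different files, so a program that encodes names differently from the
rest of the system would look for, or create, the wrong one.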

--
RmK
From: Pascal Bourguignon
Subject: Re: How to change external-format in SBCL (c-string encoding error)
Date: 
Message-ID: <87odjeh78b.fsf@thalassa.lan.informatimago.com>
Mathias  Dahl <············@gmail.com> writes:
> [...]
> I have tried to change the default format like this, but I have no
> idea if I am on the right track:
>
> * (setf sb-impl::*default-external-format* :UTF-8)
>
> It does not solve my problem, however, so it seems there are other
> internal variables in SBCL that determine the encoding to use in the
> call to (SB-UNIX:UNIX-STAT "/big/dat/gfx/mp3/artist/Tommy Körberg").
>
> Any ideas on how to tackle this?

LC_CTYPE=en_US.UTF-8 
export LC_CTYPE
sbcl ...



-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
