From: ·············@gmail.com
Subject: internationalization of lisp programs
Date: 
Message-ID: <1120621352.080548.18440@g47g2000cwa.googlegroups.com>
Hello,

I'm scoping out the possibility of internationalizing a
large CL program (namely Maxima, http://maxima.sf.net/).
For the moment this effort is just exploratory.

The target languages would be Spanish and Portuguese first,
maybe Russian after that, dunno about anything else yet.
There are font issues, but for Spanish and Portuguese,
ISO-8859-1 is enough, if I'm not mistaken. Let's assume ISO-8859-1.

There are at least two separate issues that I can see.
One is to translate quoted strings
e.g. (format t "Hey, how goes it") --> (format t "Oye, como va")
I guess all format's could be replaced by stuff like
(format t msg-string-666) with msg-string-666 being assigned
the locale-appropriate message. That seems terribly painful.

Or maybe format could try to look up strings in a hash table
or something. How is this kind of translation typically
approached in the Lisp world?

The other issue I can see is that it would be really nice
to replace some symbols with locale-appropriate equivalents.
E.g. it would be really nice if the symbols $true and $false,
which print as true and false, would print as, say,
verdadero and falso. Is there some way to get a symbol to
print differently than how it appears in the code?

Also it would be really nice if the user types in, say,
guardar and the function $save gets called.
I guess this could be handled by a lookup table,
although ideally I'd like $save to disappear entirely --
function body stays the same but the name is changed.

Maxima is implemented in Lisp and Lisp is really close to
the surface; Lisp symbols show up in Maxima and vice versa.
I'm guessing that the easiest way to implement i18n will
be to exploit this close relationship.

Thanks for any light you can shed on this problem.

Robert Dodier

From: Sam Steingold
Subject: Re: internationalization of lisp programs
Date: 
Message-ID: <u7jg4doew.fsf@gnu.org>
> *  <·············@tznvy.pbz> [2005-07-05 20:42:32 -0700]:
>
> Or maybe format could try to look up strings in a hash table
> or something. How is this kind of translation typically
> approached in the Lisp world?

<http://www.podval.org/~sds/clisp/impnotes/i18n-mod.html>

(also <http://www.podval.org/~sds/clisp/impnotes/i18n.html>)

-- 
Sam Steingold (http://www.podval.org/~sds) running w2k
<http://www.iris.org.il> <http://www.memri.org/>
<http://pmw.org.il/> <http://www.palestinefacts.org/> <http://www.dhimmi.com/>
Stupidity, like virtue, is its own reward.
From: ·············@gmail.com
Subject: Re: internationalization of lisp programs
Date: 
Message-ID: <1120683679.489691.251610@o13g2000cwo.googlegroups.com>
Sam, thanks a lot for your reply.

gettext looks promising. How hard would it be to port gettext
to other Lisps? (Is gettext written in C or in Lisp?)

Do you have some ideas about translating symbol names?
Maybe there is some way to establish an alias in the symbol table.
I'm just guessing here.

Incidentally I tried (setf (symbol-name 'foo) "bar") but that didn't
work as expected. Maybe someone knows how to effect a name change.

Thanks again for your help.

Robert Dodier
From: Pete Kazmier
Subject: Re: internationalization of lisp programs
Date: 
Message-ID: <87slyrk52i.fsf@coco.kazmier.com>
·············@gmail.com writes:

> Incidentally I tried (setf (symbol-name 'foo) "bar") but that didn't
> work as expected. Maybe someone knows how to effect a name change.

According to the hyperspec:

,----
| The name of a symbol is a string used to identify the symbol. Every
| symbol has a name, and the consequences are undefined if that name is
| altered.
`----
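
Since renaming is off the table, one option is to leave the symbol alone
and add an alias plus a table of display names. A minimal sketch, not
Maxima's actual code: SAVE-SESSION, GUARDAR, *DISPLAY-NAMES* and
DISPLAY-NAME are all made-up names for illustration.

```lisp
;; Hypothetical names throughout; SAVE-SESSION stands in for whatever
;; function sits behind $save.
(defun save-session (x)
  "Stand-in for the real save function."
  (list :saved x))

;; Alias: a second symbol that shares the same function object.
(setf (fdefinition 'guardar) (fdefinition 'save-session))

;; Display names: map a symbol to the string that should be printed for it.
(defvar *display-names* (make-hash-table :test #'eq))

(setf (gethash '$true *display-names*) "verdadero"
      (gethash '$false *display-names*) "falso")

(defun display-name (symbol)
  "Return the localized print name of SYMBOL, or its own name downcased."
  (or (gethash symbol *display-names*)
      (string-downcase (symbol-name symbol))))
```

With this, (funcall 'guardar 1) calls the same function as SAVE-SESSION,
and (display-name '$true) returns "verdadero". Note that the alias copies
the function object once, so it does not follow a later redefinition of
SAVE-SESSION.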
From: Peter Mechlenborg
Subject: Re: internationalization of lisp programs
Date: 
Message-ID: <dagogs$f7u$1@news.net.uni-c.dk>
·············@gmail.com wrote:
 > Hello,
 >
 > I'm scoping out the possibility of internationalizing a
 > large CL program (namely Maxima, http://maxima.sf.net/).
 > For the moment this effort is just exploratory.
 >
 > The target languages would be Spanish and Portuguese first,
 > maybe Russian after that, dunno about anything else yet.
 > There are font issues, but for Spanish and Portuguese,
 > ISO-8859-1 is enough, if I'm not mistaken. Let's assume ISO-8859-1.
 >
 > There are at least two separate issues that I can see.
 > One is to translate quoted strings
 > e.g. (format t "Hey, how goes it") --> (format t "Oye, como va")
 > I guess all format's could be replaced by stuff like
 > (format t msg-string-666) with msg-string-666 being assigned
 > the locale-appropriate message. That seems terribly painful.
 >
 > Or maybe format could try to look up strings in a hash table
 > or something. How is this kind of translation typically
 > approached in the Lisp world?

Hi

I don't know what the normal approach is in Lisp or any other language,
but I came up with a simple idea some time ago.

Instead of using format, make a small wrapper macro (perhaps called
international-format) that at macroexpansion time inserts the string in
a table, and emits code that, at run time, uses this table to print the
string in the correct language.

Some points to be made:

   - The actual code is very similar to normal code; the strings get
     defined where they are used.

   - After your program is compiled, you can inspect the table and
     get a list of all the strings you need to translate.

   - If you want to print dynamic strings in the same style as format,
     you probably have to be a bit smarter. For strings that are
     generated at runtime (like numbers to strings), I don't think
     there is an easy solution; I hope you don't need to do this :-).
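
The wrapper-macro idea above could look roughly like this. It is only a
sketch under assumptions: *MESSAGES*, *LANGUAGE*, TRANSLATE and
ADD-TRANSLATION are invented names, and the table layout (source string
mapped to an alist of translations) is just one possible choice.

```lisp
;; All names here are hypothetical, chosen for illustration.
(eval-when (:compile-toplevel :load-toplevel :execute)
  ;; The table must exist when the macro is expanded, hence EVAL-WHEN.
  (defvar *messages* (make-hash-table :test #'equal)
    "Maps each source string to an alist of (language . translation)."))

(defvar *language* :en
  "Language used for output at run time.")

(defun translate (string)
  "Return the translation of STRING for *LANGUAGE*, or STRING itself."
  (or (cdr (assoc *language* (gethash string *messages*)))
      string))

(defun add-translation (string language translation)
  "Record TRANSLATION as the LANGUAGE version of STRING."
  (push (cons language translation) (gethash string *messages*)))

(defmacro international-format (stream control-string &rest args)
  ;; Register literal strings at macroexpansion time, so the table ends
  ;; up listing every message that needs translating.
  (when (stringp control-string)
    (unless (nth-value 1 (gethash control-string *messages*))
      (setf (gethash control-string *messages*) '())))
  ;; The lookup itself happens at run time.
  `(format ,stream (translate ,control-string) ,@args))
```

After (add-translation "Hey, how goes it" :es "Oye, como va"), binding
*LANGUAGE* to :es makes (international-format t "Hey, how goes it") print
the Spanish text; an untranslated string falls back to the original. One
caveat: with COMPILE-FILE the registration happens in the compiling image,
so the list of strings has to be dumped from there.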


Happy coding and have fun.

   --  Peter Mechlenborg
From: Peter Mechlenborg
Subject: Re: internationalization of lisp programs
Date: 
Message-ID: <dagp6j$f7u$2@news.net.uni-c.dk>
Sam was quicker than I was, and I can see that I certainly wasn't the first
to get this idea :-).

Have fun.

   --  Peter Mechlenborg
From: nabla
Subject: Re: internationalization of lisp programs
Date: 
Message-ID: <a7680$42cc06ca$540a2bbf$9035@news.chello.pl>
·············@gmail.com napisał(a):
> Hello,
> 
> I'm scoping out the possibility of internationalizing a
> large CL program (namely Maxima, http://maxima.sf.net/).
> For the moment this effort is just exploratory.
> 
> The target languages would be Spanish and Portuguese first,
> maybe Russian after that, dunno about anything else yet.
> There are font issues, but for Spanish and Portuguese,
> ISO-8859-1 is enough, if I'm not mistaken. Let's assume ISO-8859-1.
> 
> There are at least two separate issues that I can see.
> One is to translate quoted strings
> e.g. (format t "Hey, how goes it") --> (format t "Oye, como va")
> I guess all format's could be replaced by stuff like
> (format t msg-string-666) with msg-string-666 being assigned
> the locale-appropriate message. That seems terribly painful.

Look at this:
http://common-lisp.net/project/cl-l10n/



-- 
Pozdrawiam,
Rafał Strzaliński (nabla)
From: drewc
Subject: Re: internationalization of lisp programs
Date: 
Message-ID: <Ri3ze.1879272$Xk.1176235@pd7tw3no>
nabla wrote:
> ·············@gmail.com napisał(a):

>> I'm scoping out the possibility of internationalizing a
>> large CL program (namely Maxima, http://maxima.sf.net/).
>> For the moment this effort is just exploratory.

> Look at this:
> http://common-lisp.net/project/cl-l10n/

I can second that. I'm using cl-l10n for a project right now and so far
it has worked very well. It includes FORMAT directives for dates and numbers.

There is also cl-icu:

http://common-lisp.net/project/bese/cl-icu.html

"cl-icu is a common lisp interface to IBM's ICU library. Its goal is to
provide a convenient, lispy interface to ICU's i18n features. In
particular cl-icu is focused on:

     * Formatting and parsing date/times, numbers and currency amounts.
     * Message formatting and parsing.
     * Resource bundles."

Apparently only tested on openMCL, but worth a look if cl-l10n does not 
meet your needs.


-- 
Drew Crampsie
drewc at tech dot coop
"Never mind the bollocks -- here's the sexp's tools."
	-- Karl A. Krueger on comp.lang.lisp
From: Marco Baringer
Subject: Re: internationalization of lisp programs
Date: 
Message-ID: <m2irznc51t.fsf@soma.local>
drewc <·····@rift.com> writes:

> There is also cl-icu :
>
> http://common-lisp.net/project/bese/cl-icu.html
>
> "cl-icu is a common lisp interface to IBM's ICU library. Its goal is
> to provide a convenient, lispy interface to ICU's i18n features. In
> particular cl-icu is focused on:
>
>      * Formatting and parsing date/times, numbers and currency amounts.
>      * Message formatting and parsing.
>      * Resource bundles.
>
> Apparently only tested on openMCL, but worth a look if cl-l10n does
> not meet your needs.

personally i believe ICU is the way to go (you will eventually need
the features it provides). however i feel required to mention that i
haven't updated that library since icu 2.6 (or maybe 2.8) since the
project i used it for ended (successfully).

the other option, if you don't like ffi, is to write a lisp parser for
icu's data. while this doesn't deal with boundary analysis or parsing
it will at least let you get printing of numbers, dates, currencies
and collation and case mapping right.

-- 
-Marco
Ring the bells that still can ring.
Forget the perfect offering.
There is a crack in everything.
That's how the light gets in.
	-Leonard Cohen
From: Alberto Riva
Subject: Re: internationalization of lisp programs
Date: 
Message-ID: <EoCdnY7QpfNXBlHfRVn-3g@rcn.net>
·············@gmail.com wrote:
> Hello,
> 
> I'm scoping out the possibility of internationalizing a
> large CL program
[...]
> I guess all format's could be replaced by stuff like
> (format t msg-string-666) with msg-string-666 being assigned
> the locale-appropriate message. That seems terribly painful.

I think you also need to take into account the fact that if you use 
arguments in your formats, the order in which they must appear might be 
different in different languages. So, either you "internationalize" the 
argument list to format as well, or you need to use ~n@* directives to 
make sure you are referring to the correct argument.
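
To illustrate the second option: the ~n@* directive jumps to the n-th
argument (counting from zero), so a translated control string can use the
same argument list in a different order. DEBT-MESSAGE and both control
strings below are made up for the example.

```lisp
;; Hypothetical helper: the argument order is fixed by the caller,
;; only the control string varies per language.
(defun debt-message (control-string debtor amount creditor)
  (format nil control-string debtor amount creditor))

;; English: arguments are consumed in the order given.
(debt-message "~a owes ~a to ~a" "Ana" 5 "Luis")
;; => "Ana owes 5 to Luis"

;; Spanish: creditor first, then debtor, then amount, each selected
;; by absolute position with ~n@*.
(debt-message "~2@*A ~a, ~0@*~a le debe ~1@*~a" "Ana" 5 "Luis")
;; => "A Luis, Ana le debe 5"
```

This keeps every call site identical across languages, at the price of
translators having to know the argument positions.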

Alberto