
Re: [I18n]xterm and XIM



Kaixo!

On Sat, Jun 09, 2001 at 01:33:51AM +0900, Tomohiro KUBOTA wrote:

> > About iconv codeset names variants;
> > Of course we are all aware of the problem. Therefore we have been
> > trying to come up with the solutions, such as codeset name aliasing,
> > codeset name normalization, developing standard naming convention
> > guideline, etc, in X, glibc, and standards.
> 
> It is a good effort.  It will improve the portability of iconv().

I don't know about other implementations, but with the GNU libc implementation
you don't need to care about that; the libc handles it (charset names are
case-insensitive and there is a wide array of aliases; the full list of
aliases can be seen (and modified) in /usr/lib/gconv/gconv-modules).

Anyone using a supported charset but referring to it with a name not yet
covered there truly deserves all the bad things that can happen to him.
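
To make the aliasing point concrete, here is a minimal sketch in C that simply asks iconv_open() whether it accepts a given pair of charset names. The helper name charset_pair_supported is my own invention for illustration, and which spellings are accepted depends on the libc in use; the behaviour described above is that of GNU libc.

```c
#include <iconv.h>

/* Hypothetical helper (not part of any library): returns 1 if
 * iconv_open() accepts this pair of charset names, 0 otherwise. */
static int charset_pair_supported(const char *to, const char *from)
{
    iconv_t cd = iconv_open(to, from);

    if (cd == (iconv_t)-1)
        return 0;           /* name unknown or conversion unavailable */

    iconv_close(cd);
    return 1;
}
```

With GNU libc I would expect charset_pair_supported("utf8", "us-ascii") and charset_pair_supported("UTF-8", "ASCII") to give the same answer, since the names are normalized and resolved through the alias table before a converter is looked up.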
 
> I understand this point.  I wish iconv() were very portable from the
> beginning.

The problem is more its availability. Not all systems have an iconv()
function.

> nl_langinfo(CODESET) is an XPG5 function but in reality many systems
> lack it.  How about an iconv() guideline?

In GNU libc, nl_langinfo(CODESET) returns what is stored in the locale
definition file; that value is fixed when the file is created, through
a command-line parameter. That is, the exact name is completely system
dependent :) (Well, GNU libc comes with a set of encoding files which are
all in uppercase and named following the charset naming used for email/HTTP;
the LI18NUX guidelines follow that too, so, yes, there is some hope.
I suppose all recent GNU/Linux systems follow that too, as the names of the
charset description files somewhat force that behaviour.)
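
For reference, querying the charset name this way takes only a couple of calls. The following is a small sketch, assuming a system that provides <langinfo.h>; the wrapper name current_codeset is mine, and the exact string returned is system dependent, as said above.

```c
#include <langinfo.h>
#include <locale.h>

/* Hypothetical wrapper: select the locale named by the environment
 * (LC_ALL, LC_CTYPE, LANG) and return the charset name the C library
 * associates with it. If setlocale() fails, the "C" locale stays active. */
static const char *current_codeset(void)
{
    setlocale(LC_CTYPE, "");
    return nl_langinfo(CODESET);
}
```

The returned pointer refers to storage owned by the library and may be invalidated by later setlocale() or nl_langinfo() calls, so copy the string if you need to keep it.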

The problem is for charset naming in LANG-like variables, and for
the directory name (eg: /usr/share/locale/fr_FR.UTF-8).

Note that when nl_langinfo(CODESET) is used that is irrelevant, but some
old programs (among them XFree86) don't use nl_langinfo(CODESET) yet
and instead use the LANG or similar variable to get the charset name.
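
As an illustration of why the spelling in LANG matters for such programs, here is a rough sketch of extracting the charset part from a locale name of the usual form language[_territory][.codeset][@modifier]. This is not XFree86's actual code; the function name codeset_from_locale_name is made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the codeset part of a locale name such as
 * "fr_FR.UTF-8" or "de_DE.ISO-8859-15@euro" into out.
 * Returns 1 on success, 0 if the name has no codeset part or if the
 * output buffer is too small. */
static int codeset_from_locale_name(const char *locale_name,
                                    char *out, size_t out_size)
{
    const char *dot = strchr(locale_name, '.');
    const char *start;
    size_t len;

    if (dot == NULL)
        return 0;                      /* e.g. "C" or "fr_FR" */

    start = dot + 1;
    len = strcspn(start, "@");         /* stop before any @modifier */
    if (len == 0 || len >= out_size)
        return 0;

    memcpy(out, start, len);
    out[len] = '\0';
    return 1;
}
```

Whatever string comes out is exactly what the user or distribution typed, so a program working this way sees "Big5", "UTF-8" or "utf8" verbatim and has to cope with the variants itself.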

For the Linux-Mandrake distribution I made directory names and variable values
follow the same guidelines as in LI18NUX (since before those guidelines
were written); the only exception being "zh_TW.Big5" for historical
and practical reasons.
 

-- 
Ki ça vos våye bén,
Pablo Saratxaga

http://www.srtxg.easynet.be/		PGP Key available, key ID: 0x8F0E4975
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/