
Re: [li18nux2000:62] Comments on locale name guideline



Comments on:

  Locale name guideline [Public Review Draft 2001-05-31]
  http://www.li18nux.org/docs/text/locale-name-20010531.txt

I am writing these comments because we said at a li18nux meeting
that we wanted to be as compatible as possible with
the ISO standard on POSIX locale names, ISO/IEC 15897.
The ISO/IEC 15897 naming rules are implemented in glibc.

1. The following fields should be added:

             SOURCE
             VERSION

   SOURCE: There may be different sources for a locale, and thus
   different specifications, so there needs to be a way to specify
   a specific source. E.g. there may be a Unicode specification of
   sorting of Danish, a Danish Standards specification, and an
   IBM specification, which may differ from each other to a smaller
   or larger degree.
   
   VERSION: Locales are updated from time to time, and to run an
   application it may be necessary to refer to a specific version
   of the locale, e.g. to get a specific sorting or to avoid a specific error.
   The versioning system should be of the Dewey type, as used for
   binary libraries, e.g. libc-2.2.1.so
   
   These rules are specified in ISO/IEC 15897 and they are implemented
   in glibc.
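
   As an illustration only, Dewey-style versions can be compared by
   splitting them on "." and comparing the numeric parts in order. The
   function name below is hypothetical and the sketch assumes that
   every part is a plain decimal number; it is not taken from
   ISO/IEC 15897 or glibc.

```python
def parse_dewey_version(version):
    """Turn a Dewey-style version such as '2.2.1' into a tuple of ints.

    Assumption: every dot-separated part is a non-negative decimal number.
    """
    return tuple(int(part) for part in version.split("."))


# Tuples compare element by element, so 2.2.10 sorts after 2.2.9,
# which a plain string comparison would get wrong.
newer = parse_dewey_version("2.2.10") > parse_dewey_version("2.2.9")
print(newer)
```

   Comparing numeric tuples rather than strings is what makes a
   request for "at least version 2.2.1" of a locale well defined.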

2. Locale names character repertoire:

   The following characters should be added to the character repertoire:

   + : ( ) / *

   These are permitted by the ISO/IEC 15897 CODESET naming rules, and
   they are used in some IANA registered charsets. Note that '_'
   is both a special and a delimiter, as "_" can be used in charset names.

3. Change ALPHABETS to LETTERS

   LETTERS is the right name here, or alternatively ALPHABETICS.

4. Add '+' and ',' to the DELIMITERS

   These are delimiters in ISO/IEC 15897 locale syntax.

5. Change or add the following syntax for locales:

   LANGUAGE_TERRITORY+MODIFIER1+MODIFIER2,SOURCE_VERSION.CODESET

   This is the format for locale names in the ISO standard (implemented in glibc).
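
   To make the field structure concrete, here is a rough sketch of
   how such a name could be split into its parts. It is not the glibc
   parser. The function name, the modifier names "mod1" and "mod2" and
   the source "DS" are made up for the example. Because VERSION and
   CODESET may both contain ".", the sketch assumes that VERSION
   consists of digits and dots only and that CODESET starts with a
   letter; a real implementation would follow the rules in
   ISO/IEC 15897.

```python
import re

# Assumptions (not from the standard): VERSION is dotted digits,
# CODESET starts with a letter, other fields are alphanumeric.
LOCALE_NAME_RE = re.compile(
    r"""
    ^(?P<language>[A-Za-z]+)
    (?:_(?P<territory>[A-Za-z0-9]+))?
    (?P<modifiers>(?:\+[A-Za-z0-9]+)*)
    (?:,(?P<source>[A-Za-z0-9]+)_(?P<version>\d+(?:\.\d+)*))?
    (?:\.(?P<codeset>[A-Za-z][A-Za-z0-9:()/_.*+-]*))?
    $
    """,
    re.VERBOSE,
)


def parse_locale_name(name):
    """Split LANGUAGE_TERRITORY+MOD1+MOD2,SOURCE_VERSION.CODESET into fields."""
    match = LOCALE_NAME_RE.match(name)
    if match is None:
        raise ValueError("not a recognised locale name: %r" % name)
    fields = match.groupdict()
    # "+mod1+mod2" -> ["mod1", "mod2"]; an empty string -> []
    fields["modifiers"] = fields["modifiers"].split("+")[1:]
    return fields


print(parse_locale_name("da_DK+mod1+mod2,DS_2.2.1.ISO-8859-1"))
```

   Fields that are absent from the name come back as None (or an
   empty list for the modifiers), so a short name such as
   "da_DK.ISO-8859-1" goes through the same function.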

6. In the LANGUAGE description, 3-letter codes, please add after "ISO 639-2"
   the words "in the terminology version".

   There are two versions of ISO 639-2, a terminology version and a bibliography
   version. The terminology version is oriented more towards the name of the language
   in the language itself, whereas the bibliography version is oriented more towards
   the name of the language in English. For example, "deu" is the terminology code
   for German, whereas the bibliography code is "ger". ISO 639-2 itself says that it
   will use the principle of names in the language itself in the future, and
   Internet RFC 3066 and the new draft of ISO 15897 also use this.

7. For the CODESET repertoire, please add the specials : ( ) / _ . *

   These are used in IANA charsets and allowed in the syntax in ISO 15897, and
   ISO 15897 has a number of registered charmaps containing these characters.

   I think you should then remove the lines in the CODESET specification
   (STRING1-STRING2-STRING3), as this form is not general enough.
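
   A small check shows why the narrower form is not enough. The base
   repertoire used here (letters, digits and "-") is my assumption
   about the draft, and the variable names are only for this sketch;
   the three charset names are IANA registered names.

```python
import re

# Assumed base repertoire of the draft: ASCII letters, digits and '-'.
BASE_CODESET_RE = re.compile(r"[A-Za-z0-9-]+")
# The same repertoire extended with the specials : ( ) / _ . *
EXTENDED_CODESET_RE = re.compile(r"[A-Za-z0-9:()/_.*-]+")

iana_names = ["ISO_8859-1:1987", "ISO_646.irv:1991", "NF_Z_62-010_(1973)"]

for name in iana_names:
    base_ok = BASE_CODESET_RE.fullmatch(name) is not None
    extended_ok = EXTENDED_CODESET_RE.fullmatch(name) is not None
    print(name, "base:", base_ok, "extended:", extended_ok)
```

   Each of these names contains at least one of "_", ":", "." or
   parentheses, so none of them fits a letters-digits-hyphen pattern.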

8. In MODIFIER, you should remove the line with "euro", as this is not a good example.
   The "euro" modifier normally depends on special coding in the
   application to decide whether it should be used, and as that
   does not remove the internationalization code from the program,
   it is a bad example of i18n. This is also better done within one
   locale, as in the glibc implementation of the new style locale.
   
Kind regards
Keld
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/