
Re: Status summary of various GNU/Linux/UTF-8 efforts




Thanks, Markus, for that summary. A few additions and remarks:

> UTF-8 patch for less (locale independent):
> 
>   Robert Brady <rwb197@ecs.soton.ac.uk> has done that
> 
> UTF-8 patch for Linux kernel tty driver and stty:
> 
>   Bruno Haible <haible@ilog.fr> has done that

I wouldn't say "done" until the patch is accepted by the maintainer into
the mainline source. It's important to push for that.

> Addition of UTF-8 support to numerous other GNU tools:
> 
>   Bruno Haible <haible@ilog.fr> and
>   Robert Brady <rwb197@ecs.soton.ac.uk>
>   have been looking into that, but I guess it might be prudent to
>   wait with this until glibc 2.2 ... are completed.

No need to wait, you can start hacking now. Here is an interim solution:
I set my environment variables to

  $ export LANG=de_DE.UTF-8
  $ export LD_PRELOAD=/usr/local/lib/libutf8_plug.so

(where libutf8_plug.so comes from
ftp://ftp.ilog.fr/pub/Users/haible/utf8/libutf8-0.2.tar.gz). Large parts of
the ISO C Amendment 1 functions (i.e. all that are present in glibc-2.1)
work with UTF-8 now.
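
As a rough sketch of what "start hacking" can look like, a tool can rely on
the standard multibyte functions instead of counting bytes. The helper names
count_chars and init_locale are made up for this illustration, and calling
mbstowcs with a NULL destination to get only the length is assumed to be
supported (POSIX specifies it; plain ISO C does not promise it).

```c
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: adopt the locale from LANG / LC_ALL / LC_CTYPE.
   Call once at program start. Returns 1 on success, 0 on failure. */
int init_locale(void)
{
    return setlocale(LC_CTYPE, "") != NULL;
}

/* Hypothetical helper: number of characters (not bytes) in s under the
   current LC_CTYPE locale, or -1 if s is not valid in that encoding. */
long count_chars(const char *s)
{
    /* NULL destination: nothing is stored, only the length is computed
       (assumed POSIX behaviour). */
    size_t n = mbstowcs(NULL, s, 0);
    return (n == (size_t)-1) ? -1L : (long)n;
}
```

With a UTF-8 locale in effect, count_chars should report one character for a
multi-byte sequence; without init_locale the program stays in the "C" locale
and only ASCII input is safe to assume.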

>   ... as someone with an Japanese-only view has messed up ...

The same thing happened to the `vim' editor. Its multibyte support assumes
2-byte characters where the first byte is >= 0x80 and the second < 0x80.
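
To make the mismatch concrete, here is a minimal sketch comparing a fixed
2-byte rule with the length rule UTF-8 actually needs. Both function names
are invented for this example, and the UTF-8 rule is limited to sequences of
up to 4 bytes (the older definition of UTF-8 allowed up to 6).

```c
/* Hypothetical model of a fixed-width assumption: every byte >= 0x80
   starts a character that is exactly 2 bytes long. */
int dbcs_seq_len(unsigned char b)
{
    return (b >= 0x80) ? 2 : 1;
}

/* Length in bytes of the UTF-8 sequence introduced by lead byte b,
   or 0 if b cannot start a sequence (continuation or invalid byte). */
int utf8_seq_len(unsigned char b)
{
    if (b < 0x80)
        return 1;               /* ASCII */
    if (b >= 0xC2 && b <= 0xDF)
        return 2;               /* U+0080 .. U+07FF */
    if (b >= 0xE0 && b <= 0xEF)
        return 3;               /* U+0800 .. U+FFFF */
    if (b >= 0xF0 && b <= 0xF4)
        return 4;               /* U+10000 .. U+10FFFF */
    return 0;                   /* 0x80..0xC1 and 0xF5..0xFF */
}
```

For the lead byte 0xE2 (as in the three-byte encoding E2 82 AC) the 2-byte
rule consumes two bytes and then treats the leftover 0xAC as the start of a
new character, so cursor movement and display width go wrong.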

> Addition of a UTF-8 locale to Xlib:
> 
>   This would be urgently necessary, but nobody is working on that yet.

I am working on it; I did half of it last weekend.

>   Much of the UTF-8 functionality currently found in xterm (e.g., the
>   keysym->UCS conversion) really should move into Xlib

Isn't it already in Xlib? See file xc/lib/X11/imConv.c, by Mark Leisher.

> Extension of the Linux frame buffer console with the same UTF-8
> functionality that is available or on the wishlist for xterm:
> 
>   No volunteers yet.

Before doing an extra wishlist, look at the basic functionality. We need:
  - Fixed command-line editing: my n_tty patch.
  - In the keyboard driver, it should be possible to map any key to any
    Unicode character, plus possibly CapsLock treatment. For this,
    the keyboard mapping data structures need to be made 32-bit. They
    are 16-bit now. Major work.
  - Cut&Paste. There is a patch by Edmund Thomas Grimley Evans and
    Stanislav Voronyi, but it's too complicated (and keyboard.c is not
    prepared for it) to go into the kernel now.
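
The keyboard mapping point above can be illustrated with a small sketch. It
assumes, purely for illustration, that a 16-bit entry is split into an 8-bit
type and an 8-bit value; the 32-bit layout and all names here are
hypothetical and are not taken from the kernel sources.

```c
#include <stdint.h>

/* Assumed 16-bit entry: high byte = entry type, low byte = value. */
typedef uint16_t keyent16;

static inline keyent16 ke16(uint8_t type, uint32_t val)
{
    /* Only the low 8 bits of val fit; the rest is silently lost. */
    return (keyent16)(((unsigned)type << 8) | (val & 0xFFu));
}

static inline uint32_t ke16_val(keyent16 e)
{
    return e & 0xFFu;
}

/* Hypothetical 32-bit entry: high byte = type, low 24 bits = value,
   which is enough for any code point up to U+10FFFF. */
typedef uint32_t keyent32;

static inline keyent32 ke32(uint8_t type, uint32_t val)
{
    return ((uint32_t)type << 24) | (val & 0xFFFFFFu);
}

static inline uint32_t ke32_val(keyent32 e)
{
    return e & 0xFFFFFFu;
}
```

Storing U+20AC in the assumed 16-bit layout leaves only 0xAC, while the
32-bit layout round-trips it. Widening the entry type is the easy part; every
table and every user of those tables has to change with it, which is why this
is major work.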

Bruno
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/