GNU Emacs Unicode support
I finally found the time to upgrade Emacs to 20.6, which made it
possible to try Miyashita Hisashi's "Mule-UCS" package. Since it
integrates better with GNU Emacs (in particular, it's much easier to
read/write mail in UTF-8), I've stopped using my homegrown converter
and now use his package instead.
I've extended "Mule-UCS" to cover the complete BMP. My modifications
are available at "http://www.cs.ust.hk/~otfried/Mule".
In short, the modified package:
* defines a new encoding "utf-8", which can be used like any other
Emacs coding system. It is indicated by the letter "u" in the
mode line.
* covers the complete BMP.
* uses two Unicode fonts to render the whole BMP range, one for
half-width characters, one for full-width characters. (I normally
use 9x18 and 18x18ja from Markus' collection.)
* adheres to Markus' definition of "wcwidth()" to select between
the two fonts, so that Emacs should work fine on a UTF-8-aware
terminal emulator.
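The font selection in the last two points can be sketched roughly as
follows. This is only an illustration in Python, not the package's own
code: it approximates the width rule with the East Asian Width property
from Python's unicodedata module instead of Markus' actual wcwidth()
table, and the names approx_width and font_for are hypothetical.

```python
import unicodedata

# Font names as mentioned in the post.
HALF_WIDTH_FONT = "9x18"
FULL_WIDTH_FONT = "18x18ja"


def approx_width(ch):
    """Return 2 for East Asian Wide/Fullwidth characters, else 1.

    Assumption: this stands in for wcwidth(). It ignores combining
    and control characters, which a real wcwidth() treats specially.
    """
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def font_for(ch):
    """Pick the font that would render a single BMP character."""
    return FULL_WIDTH_FONT if approx_width(ch) == 2 else HALF_WIDTH_FONT


# Latin letter, CJK ideograph, fullwidth Latin letter.
fonts = [font_for(ch) for ch in ("A", "\u4e2d", "\uff21")]
```

Because the width decides the font, every character occupies the same
number of columns in Emacs as it would on the terminal, provided both
sides use the same width table.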
There are two exceptions to the last point. First, I had to make an
arbitrary decision for the user-defined (private-use) range
U+E000..U+F8FF. Second, I find the behaviour of "wcwidth()" for the
conjoining Johab range irrational. It doesn't make any sense to make
the leading consonants full-width while vowels and finals are
half-width. I don't want to have to split my glyphs over two fonts
just to see the Johab elements on software that doesn't support
conjoining. (On a conjoining renderer the result of conjoining should
of course be full-width.)
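The split is easy to see on a single syllable written as conjoining
jamo. The sketch below is in Python and again uses the East Asian Width
property from unicodedata as a stand-in for wcwidth() (an assumption;
the real table may differ in other details). The helper name is_wide is
hypothetical.

```python
import unicodedata

# Leading consonant, vowel and final of one syllable:
# U+1112 HANGUL CHOSEONG HIEUH
# U+1161 HANGUL JUNGSEONG A
# U+11AB HANGUL JONGSEONG NIEUN
jamo = ["\u1112", "\u1161", "\u11ab"]


def is_wide(ch):
    """True if the character is East Asian Wide or Fullwidth."""
    return unicodedata.east_asian_width(ch) in ("W", "F")


# Width per element: only the leading consonant comes out full-width.
widths = [("U+%04X" % ord(ch), 2 if is_wide(ch) else 1) for ch in jamo]

# What a conjoining renderer would effectively display:
# NFC composes the three jamo into one precomposed syllable.
composed = unicodedata.normalize("NFC", "".join(jamo))
composed_is_wide = is_wide(composed)
```

With a non-conjoining display the three elements are drawn side by
side, so under this rule the first would come from the full-width font
and the other two from the half-width font.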
Otfried
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/