
GNU Emacs Unicode support




Finally I found the time to upgrade Emacs to 20.6, which made it
possible to try Miyashita Hisashi's "Mule-UCS" package.  Since it
integrates better with GNU Emacs (in particular, it's much easier to
read/write mail in UTF-8), I've stopped using my homegrown converter
and use his package now.

I've extended "Mule-UCS" to cover the complete BMP.  My modifications
are available at "http://www.cs.ust.hk/~otfried/Mule".

In short, the modified package

 * defines a new encoding "utf-8", which can be used like any other
   Emacs coding system.  It is indicated by the letter "u" in the
   mode line.  

 * covers the complete BMP.  

 * uses two Unicode fonts to render the whole BMP range, one for
   half-width characters, one for full-width characters. (I'm normally
   using 9x18 and 18x18ja from Markus' collection.)

 * adheres to Markus' definition of ``wcwidth()'' to select between
   the two fonts, so that Emacs should work fine on a UTF-8 aware
   terminal emulator.

There are two exceptions to the last point. First, I had to make an
arbitrary decision for the user-defined (private use) range
U+e000..U+f8ff.

Second, I find the behaviour of "wcwidth" for the conjoining Johab
range irrational.  It doesn't make any sense to make the leading
consonant full-width, with vowels and finals being half-width. I don't
want to have to split my glyphs over two fonts to see the Johab
elements on software that doesn't support conjoining.  (On a
conjoining renderer the result of conjoining should of course be
full-width.)
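
To make the font selection and the two exceptions concrete, here is a
rough Python sketch.  It is not the Mule-UCS code and it does not use
Markus' wcwidth() tables: as a stand-in it asks Python's unicodedata
module for the East Asian Width property.  The function name pick_font
and its two parameters are hypothetical.  The message does not say
which width was chosen for the user-defined range, so that choice is
left as a parameter, and I am assuming that the conjoining range means
U+1100..U+11ff.

```python
import unicodedata

# Font names as mentioned in the message; any half/full-width pair would do.
HALF_WIDTH_FONT = "9x18"
FULL_WIDTH_FONT = "18x18ja"


def is_private_use(ch):
    """True for the user-defined range U+E000..U+F8FF."""
    return 0xE000 <= ord(ch) <= 0xF8FF


def is_conjoining_jamo(ch):
    """True for U+1100..U+11FF (assumed to be the conjoining range meant)."""
    return 0x1100 <= ord(ch) <= 0x11FF


def pick_font(ch, private_use_wide=False, jamo_wide=None):
    """Return the font name to use for a single BMP character.

    private_use_wide: arbitrary policy for U+E000..U+F8FF (hypothetical).
    jamo_wide: None keeps the width table's answer for conjoining jamo;
               True/False forces the whole range into one font.
    """
    if is_private_use(ch):
        return FULL_WIDTH_FONT if private_use_wide else HALF_WIDTH_FONT
    if jamo_wide is not None and is_conjoining_jamo(ch):
        return FULL_WIDTH_FONT if jamo_wide else HALF_WIDTH_FONT
    # Stand-in for wcwidth(): treat Wide and Fullwidth as two columns.
    wide = unicodedata.east_asian_width(ch) in ("W", "F")
    return FULL_WIDTH_FONT if wide else HALF_WIDTH_FONT
```

Passing jamo_wide=True or jamo_wide=False keeps leading consonants,
vowels and finals in a single font, which is the point of the second
exception.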

Otfried


-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/