
Re: mined editor handles combining characters



towo@xxxxxxxxxxxx wrote on 2000-07-11 22:48 UTC:
> It seems there are a few double-width combining characters (listed below); 

There is no such thing as a narrow or double-width combining character.
Width is a property of the base character exclusively. A combining
character always has the width of its base character.

> Is there something like a double-width space where I could place them on 
> in separated display mode?

You should place them onto U+3000 IDEOGRAPHIC SPACE, because wcwidth(0x3000) == 2.

> Or should it also work to combine them with a sequence of two spaces?

No. If you place a combining character onto a single-width character
such as U+0020, then the terminal emulator must pick the combining
character also from the single-width font. Think of combining characters
as characters that do not have a half-width/full-width property
associated with them. Only the base character preceding them determines
how widely they are displayed. In your separated-display mode, you
should prefix each combining character with U+0020 if
wcwidth(base_character) == 1 and with U+3000 if wcwidth(base_character)
== 2. wcwidth() will not provide you with any width information for the
combining character itself, because wcwidth(combining_char) == 0.
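
A minimal sketch of that rule in C, in case it helps. The helper names
carrier_for_width() and separate_mark() are made up for illustration and
are not part of any library. The sketch assumes the caller has already
obtained base_width from wcwidth(base_character) under a suitable locale,
and it treats anything other than 2 as narrow:

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: choose the space that carries a combining mark
 * in separated-display mode. base_width follows the wcwidth()
 * convention for the base character: 1 = narrow, 2 = wide. */
static wchar_t carrier_for_width(int base_width)
{
    return (base_width == 2) ? (wchar_t)0x3000   /* IDEOGRAPHIC SPACE */
                             : (wchar_t)0x0020;  /* SPACE */
}

/* Hypothetical helper: write the carrier followed by the combining
 * mark into out[0..1] and return the number of characters written. */
static size_t separate_mark(int base_width, wchar_t mark, wchar_t out[2])
{
    assert(out != NULL);
    out[0] = carrier_for_width(base_width);
    out[1] = mark;
    return 2;
}
```

For example, separate_mark(2, 0x3099, out) yields U+3000 U+3099, while
separate_mark(1, 0x0302, out) yields U+0020 U+0302. The width of the mark
itself is never consulted.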

> I couldn't test them anyway as they don't seem 
> to be contained in the current X fonts (18x18ja/ko).
> 
> 302A;IDEOGRAPHIC LEVEL TONE MARK;Mn;218;NSM;;;;;N;;;;;
> 302B;IDEOGRAPHIC RISING TONE MARK;Mn;228;NSM;;;;;N;;;;;
> 302C;IDEOGRAPHIC DEPARTING TONE MARK;Mn;232;NSM;;;;;N;;;;;
> 302D;IDEOGRAPHIC ENTERING TONE MARK;Mn;222;NSM;;;;;N;;;;;
> 302E;HANGUL SINGLE DOT TONE MARK;Mn;224;NSM;;;;;N;;;;;
> 302F;HANGUL DOUBLE DOT TONE MARK;Mn;224;NSM;;;;;N;;;;;
> 3099;COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK;Mn;8;NSM;;;;;N;NON-SPACING KATAKANA-HIRAGANA VOICED SOUND MARK;;;;
> 309A;COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK;Mn;8;NSM;;;;;N;NON-SPACING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK;;;;

U+3099 and U+309A are present in both 18x18ja and 9x18. So the fonts
certainly have everything you need to test this realistically for
Japanese. Even if they weren't present, you could also place U+0302 etc.
onto wide characters, as some of the Latin combining accents are also
available in 18x18ja, and there should be no distinction at all between
a narrow and a wide combining character.

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/