Re: wcwidth and glibc 2.2
Markus Kuhn writes:
> Shouldn't GB18030 be upwards compatible with GBK and hence also use the
> CJK variant of wcwidth()?
The wcwidth of GB18030 cannot be compatible with GB2312 and UTF-8
simultaneously. Given the structure of the encoding, I assumed
compatibility with UTF-8 is more important. Does anyone have a copy of
the GB18030 standard?
> > But your wcwidth_cjk() function needs more modifications. It differs
> > from the EUC-JP wcwidth in more than 200 values.
>
> Thanks for your findings! This needs more investigation to distinguish
> the following three cases:
>
> a) http://www.unicode.org/Public/3.1-Update/EastAsianWidth-4d3.beta.txt
> needs to be modified
>
> b) glibc EUC-JP wcwidth() needs to be modified
>
> c) the rules according to which I derived my wcwidth[_cjk]() from
> the unicode.org data need to be modified
It's (c). CJK wcwidth must implement legacy behaviour. You cannot
assume that the unicode.org material will give the right answer here.
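
To make the comparison above concrete, here is a rough sketch of how one
could list the code points on which two width rules disagree. It is not
glibc's wcwidth() and not Markus's wcwidth_cjk(); the function names are
made up for illustration, the widths come from whatever Unicode data ships
with the Python interpreter, and control characters are not special-cased.
Treating East Asian "Ambiguous" characters as wide is only one possible
CJK rule, and by the argument above it will not necessarily match a given
legacy locale.

```python
import unicodedata

# Hypothetical helper names; neither function is a real libc interface.

def width_unicode(ch: str) -> int:
    """Width from Unicode data alone: Ambiguous characters count as narrow."""
    # Combining marks and format characters occupy no column.
    if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # Wide (W) and Fullwidth (F) characters occupy two columns.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

def width_ambiguous_wide(ch: str) -> int:
    """One candidate CJK rule: Ambiguous (A) characters also count as wide."""
    if width_unicode(ch) == 0:
        return 0
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F", "A") else 1

def disagreements(code_points):
    """Return the code points on which the two rules give different widths."""
    return [cp for cp in code_points
            if width_unicode(chr(cp)) != width_ambiguous_wide(chr(cp))]

if __name__ == "__main__":
    # Print the Latin-1 code points where the two rules differ.
    for cp in disagreements(range(0x80, 0x100)):
        print(f"U+{cp:04X}", unicodedata.name(chr(cp), "<unnamed>"))
```

Swapping one of the two functions for a table dumped from an actual
EUC-JP locale would give the kind of difference list mentioned above.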
Bruno
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/