Re: mk_wcwidth
>You do realize that people in CJK locales expect some characters to be
>double width that people in European/American locales expect to be single
>width.
Doublewidth roman letters live in the Unicode Halfwidth and
Fullwidth Forms block (U+FF00-U+FFEF); the fullwidth ASCII
variants occupy U+FF01-U+FF5E. So when converting from a legacy
encoding that assumes the ASCII range is all doublewidth, you
map to (ascii + 0xFEE0). With Unicode you can even mix
doublewidth and singlewidth "ascii" in a single document. Many
of the roman letters effectively became "kanji" in doublewidth
form (for example, doublewidth capital letter H can mean
pornography) and carry a different meaning than their
single-width brethren.
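
To make the (ascii + 0xFEE0) mapping concrete, here is a minimal
sketch in Python. The function name to_fullwidth is made up for
illustration. Mapping the space to U+3000 is my addition: U+FF00
itself is unassigned, so the plain offset does not work for U+0020.

```python
# Offset between printable ASCII (U+0021..U+007E) and the
# fullwidth ASCII variants (U+FF01..U+FF5E).
FULLWIDTH_OFFSET = 0xFEE0


def to_fullwidth(text: str) -> str:
    """Map printable ASCII to its fullwidth form (illustrative helper)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x21 <= cp <= 0x7E:
            # Printable ASCII: shift into the fullwidth forms.
            out.append(chr(cp + FULLWIDTH_OFFSET))
        elif cp == 0x20:
            # Assumption: use IDEOGRAPHIC SPACE, since U+FF00 is unassigned.
            out.append("\u3000")
        else:
            # Everything else is passed through unchanged.
            out.append(ch)
    return "".join(out)
```

For example, to_fullwidth("H") gives U+FF28 (fullwidth capital H),
which is a different character from U+0048 and can sit next to it
in the same document.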
So a Unicode char-cell width function should behave identically
in all locales.
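
As a rough sketch of what "identical in all locales" could look
like: the width depends only on the code point, looked up by
binary search in a sorted range table. The table below is
abbreviated and illustrative, not a complete list of wide ranges,
and the names WIDE_RANGES, cell_width and string_width are
hypothetical. Combining and control characters are ignored here
and would need their own handling.

```python
from bisect import bisect_right

# Abbreviated, illustrative table of double-width ranges
# (inclusive, sorted by start). Not a complete list.
WIDE_RANGES = [
    (0x1100, 0x115F),  # Hangul Jamo initial consonants
    (0x2E80, 0xA4CF),  # CJK radicals through Yi (approximate span)
    (0xAC00, 0xD7A3),  # Hangul syllables
    (0xF900, 0xFAFF),  # CJK compatibility ideographs
    (0xFF01, 0xFF60),  # fullwidth forms
    (0xFFE0, 0xFFE6),  # fullwidth signs
]
_STARTS = [lo for lo, _ in WIDE_RANGES]


def cell_width(ch: str) -> int:
    """Return 2 for code points in WIDE_RANGES, else 1. No locale is consulted."""
    cp = ord(ch)
    # Binary search for the last range whose start is <= cp.
    i = bisect_right(_STARTS, cp) - 1
    if i >= 0 and cp <= WIDE_RANGES[i][1]:
        return 2
    return 1


def string_width(text: str) -> int:
    """Sum of per-character cell widths."""
    return sum(cell_width(ch) for ch in text)
```

With this, "H" (U+0048) takes one cell and fullwidth "H" (U+FF28)
takes two, whatever the locale is set to.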
(I don't know of any Unicode support for fullwidth Greek or Cyrillic,
but should such a thing be needed, there is room north of the BMP.)
> > i was imagining perhaps the difference between O(2 log n) and O(log n)
> > would still be worthwhile :)
> O(2 log n) = O(log n).
Yes, and O(1 billion years + log n) = O(log n) too :)
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/