> -----Original Message-----
> From: Tomohiro KUBOTA [mailto:tkubota@xxxxxxxxxxx]
...
> At Fri, 2 Feb 2001 14:51:52 +0000 (GMT),
> Robert Brady <robert@xxxxxxxxxx> wrote:
>
> > Is there any advantage over Xterm supporting ISO-2022,
> > over w3m supporting UTF-8?
Note that ISO/IEC 2022 *itself* is just a code switching scheme...
The few applications that do use 2022 can only switch between
a handful of encodings, not all that are registered "for use
with ISO/IEC 2022".
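To make the "code switching" point concrete, here is a minimal
sketch using Python's standard "iso2022_jp" codec (my choice for
illustration; it is not what Xterm or w3m use internally). It shows
the escape sequences that switch between ASCII and JIS X 0208, and
that a character outside that codec's few encodings (a Hangul
syllable here) cannot be represented at all.

```python
ESC = b"\x1b"
SWITCH_TO_JISX0208 = ESC + b"$B"   # designate JIS X 0208 to G0
SWITCH_TO_ASCII = ESC + b"(B"      # designate ASCII to G0

# "abc" followed by two kanji (U+65E5, U+672C).
sample = "abc\u65e5\u672c"
encoded = sample.encode("iso2022_jp")

# The stream is 7-bit only; the character set is changed in-band
# by the escape sequences above.
has_switch_in = SWITCH_TO_JISX0208 in encoded
ends_in_ascii = encoded.endswith(SWITCH_TO_ASCII)
is_seven_bit = all(byte < 0x80 for byte in encoded)

# This codec switches between a small fixed set of encodings only.
# A Hangul syllable (U+D55C) is outside that set and is rejected.
try:
    "\ud55c".encode("iso2022_jp")
    hangul_supported = True
except UnicodeEncodeError:
    hangul_supported = False

print(encoded, has_switch_in, ends_in_ascii, is_seven_bit, hangul_supported)
```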
> (1) ISO-2022 is free from 'mojibake', as I wrote.
> 'mojibake' - broken characters caused by encoding mismatch.
I'm not sure what you mean. Most of the "broken characters"
in 10646/Unicode come from East Asian legacy encodings, and
there are even more in Unicode 3.1 because of recent additions
of Han compatibility characters going into 10646-2.
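If 'mojibake' is meant in the sense of the quoted definition (bytes
decoded under the wrong encoding), the effect can be sketched as
below. The choice of EUC-JP as the real encoding and Latin-1 as the
wrong guess is only an assumption for illustration.

```python
# Three kanji (U+65E5, U+672C, U+8A9E) encoded as EUC-JP.
original = "\u65e5\u672c\u8a9e"
euc_bytes = original.encode("euc_jp")

# A receiver that wrongly assumes Latin-1 gets one accented Latin
# character per byte instead of the intended kanji.
garbled = euc_bytes.decode("latin-1")

# Decoding with the encoding actually used recovers the text.
recovered = euc_bytes.decode("euc_jp")

print(repr(garbled), recovered == original)
```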
> (2) ISO-2022 is free from Unicode's overreaching CJK han unification.
> (bad legacy from mess 16bit Unicode.)
Originally Unicode was intended only for characters in active use
these days. But the scope has been extended to cover also historic
characters. That is why about 43 000 new Han characters are
going into Unicode/10646, still with the same unification principles
as for the BMP. Most are collected from various dictionaries;
only a small number are compatibility Han (Kanji) characters
(insisted on by Japan).
It is important to note that the Han unification was suggested
and developed, and is still maintained, by representatives from
East Asia: Japan, PRC, HKSAR, Taiwan, RoK, and lately DPRK. It's NOT
a US invention! It's not even a Unicode invention; Unicode
and SC2/WG2 take input from experts all over the world. And
the Han unification was NOT done because of the (now historic)
limitation to 16 bit fixed-width (even though it did fit then;
it does not anymore).
> (3) ISO-2022 code space is wider than Unicode.
Well, depends on how you count! Unicode has a large number
of characters encoded ONLY in 10646/Unicode and no other encoding
(registered for use with 2022). And there are on the order of
1 million unused code points. Nobody expects those to be filled
any time soon.
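A rough back-of-the-envelope check of the "1 million" figure,
assuming the Unicode code space of 17 planes (U+0000..U+10FFFF) and
an assigned-character count of roughly 94 000 for Unicode 3.1. The
latter number is my approximation, not an exact statistic.

```python
import sys

PLANES = 17
CODE_POINTS_PER_PLANE = 0x10000  # 65 536 code points per plane
total_code_points = PLANES * CODE_POINTS_PER_PLANE

# Assumption: approximate number of characters assigned as of
# Unicode 3.1; used only for an order-of-magnitude estimate.
APPROX_ASSIGNED = 94_000

# Everything not assigned to a character. This also counts
# surrogate and private-use code points, so it is a rough figure.
unused = total_code_points - APPROX_ASSIGNED

# Python's own upper bound for code points matches the same space.
matches_python_limit = (sys.maxunicode + 1 == total_code_points)

print(total_code_points, unused, matches_python_limit)
```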
> (4) ISO-2022 support is easier for the current implementation of w3m
> than Unicode. (I don't know about the internal of w3m; this is
> what the w3m developer said.)
Really?? I have a hard time believing that.
/kent k