Re: Unicode, character ambiguities
Followup to: <200201100556.g0A5usb18767@xxxxxxxxxxxxxxxx>
By author: Tomohiro KUBOTA <tkubota@xxxxxxxxxxx>
In newsgroup: linux.utf8
>
> Yes, unified. The most famous example is U+76F4.
> I'd like to show an image but I cannot find....
>
> Images are not available at:
> http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=76f4
>
Here is a view from someone who is very much a beginning student of
Japanese (so forgive me if I'm completely off the mark)...
It seems to me that unification, while probably a good idea overall,
has been a bit heavy-handed, probably in no small part due to the
pressure in Unicode 1.x to fit into 16 bits. That ceiling has since
been broken, as it turned out to be pretty much impossible to keep to.
My wife's name is Suzi (Susan). Since it happens to transliterate
pretty poorly into Japanese, she has chosen to use the name Suzuran
("lily of the valley") in Japanese rather than spelling her name in
Katakana. "Suzuran" is U+9234 U+862D (鈴蘭); however, I could
personally not have told that the reference glyph for U+9234 was the same
character. I actually found a "compatibility form", U+F9B1 (鈴)
which looks a lot more like what I thought the character should be,
but that one is apparently only supposed to be used for Korean.
Interestingly, at least on my system U+9234 is displayed with the
Japanese glyph rather than the reference glyph.
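
For anyone who wants to check how these code points relate, here is a
minimal sketch using Python's standard unicodedata module. The helper
names (describe, canonical_target) are made up for illustration. It
looks up the character names and shows that the compatibility form
U+F9B1 has a canonical decomposition to U+9234, so it does not survive
normalization as a separate character.

```python
import unicodedata

SUZU = "\u9234"         # unified ideograph, first character of "Suzuran"
RAN = "\u862d"          # unified ideograph, second character of "Suzuran"
SUZU_COMPAT = "\uf9b1"  # compatibility ideograph mentioned above


def describe(ch):
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return "U+%04X %s" % (ord(ch), unicodedata.name(ch, "<unnamed>"))


def canonical_target(ch):
    """Return the NFC-normalized form of a single character."""
    return unicodedata.normalize("NFC", ch)


if __name__ == "__main__":
    for ch in (SUZU, RAN, SUZU_COMPAT):
        decomp = unicodedata.decomposition(ch) or "(none)"
        print(describe(ch), "| decomposition:", decomp)
    # True if normalization folds the compatibility form into U+9234
    print(canonical_target(SUZU_COMPAT) == SUZU)
```

In other words, text that passes through a normalizing tool cannot rely
on U+F9B1 to keep a distinct glyph; which shape of U+9234 you actually
see is up to the font and, where available, language tagging.
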
-hpa (ペーテル アンビーン)
--
<hpa@xxxxxxxxxxxxx> at work, <hpa@xxxxxxxxx> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <amsp@xxxxxxxxx>
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/