Re: Unicode, character ambiguities
Followup to: <20020109071127.GA19291@xxxxxxxx>
By author: Glenn Maynard <g_lutf8@xxxxxxxx>
In newsgroup: linux.utf8
>
> Hmm. Round-trip between Unicode and JIS and EUC-JP is guaranteed,
> apparently ... so I can't see any real reason for this belief. You need
> to know the language in some cases, but before you needed to know the
> encoding. That's an improvement. (If the "selectors" which were just
> mentioned allow selecting languages for single characters, then there'll
> be even less need to be able to change the language mid-sentence.)
>
<LANGUAGE TAG><TAG 'z'><TAG 'h'>
<LANGUAGE TAG><TAG 'j'><TAG 'a'>
<LANGUAGE TAG><TAG 'k'><TAG 'o'>
Language tags like these (U+E0001 LANGUAGE TAG followed by tag
characters) have been in Unicode for a while; the Unicode specification
is not very precise, as far as I gather, but everyone does seem to agree
on how they should be handled.
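
For illustration only, here is a rough sketch of how such a tag sequence
could be built and stripped again. It assumes the usual layout in which
each tag character sits at 0xE0000 plus the ASCII code of the letter it
mirrors; the helper names language_tag and strip_tags are made up for
this example and do not come from any library.

```python
LANGUAGE_TAG = 0xE0001  # U+E0001 LANGUAGE TAG
TAG_BASE = 0xE0000      # assumption: tag characters mirror ASCII at this offset


def language_tag(code: str) -> str:
    """Return LANGUAGE TAG followed by one tag character per ASCII letter."""
    if not code or not all(0x20 <= ord(c) <= 0x7E for c in code):
        raise ValueError("language code must be non-empty printable ASCII")
    return chr(LANGUAGE_TAG) + "".join(chr(TAG_BASE + ord(c)) for c in code)


def strip_tags(text: str) -> str:
    """Drop every character in U+E0000..U+E007F, keeping the visible text."""
    return "".join(c for c in text if not 0xE0000 <= ord(c) <= 0xE007F)


# Hypothetical usage: mark a single ideograph as Chinese.
tagged = language_tag("zh") + "\u76f4"
plain = strip_tags(tagged)
```

A process that does not care about language can run something like
strip_tags and carry on with the plain text, which is more or less why
the tags are harmless to software that ignores them.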
-hpa
--
<hpa@xxxxxxxxxxxxx> at work, <hpa@xxxxxxxxx> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <amsp@xxxxxxxxx>
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/