
Re: Unicode, character ambiguities



Followup to:  <20020109071127.GA19291@xxxxxxxx>
By author:    Glenn Maynard <g_lutf8@xxxxxxxx>
In newsgroup: linux.utf8
> 
> Hmm.  Round-trip between Unicode and JIS and EUC-JP is guaranteed,
> apparently ... so I can't see any real reason for this belief.  You need
> to know the language in some cases, but before you needed to know the
> encoding.  That's an improvement.  (If the "selectors" which were just
> mentioned allow selecting languages for single characters, then there'll
> be even less need to be able to change the language mid-sentence.)
> 

<LANGUAGE TAG><TAG 'z'><TAG 'h'>
<LANGUAGE TAG><TAG 'j'><TAG 'a'>
<LANGUAGE TAG><TAG 'k'><TAG 'o'>

These Plane 14 language tags have been in Unicode for a while. The
specification is not very precise about them, as far as I gather, but
everyone does seem to agree on how they should be handled.
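
For illustration, here is a rough Python sketch of how such a tag sequence
could be built: U+E0001 LANGUAGE TAG, followed by one tag character per
ASCII character of the language code, each at the ASCII value plus
0xE0000. The function names are made up for this example, and the sketch
does not check that the code is a registered language code.

```python
LANGUAGE_TAG = "\U000E0001"   # U+E0001 LANGUAGE TAG
TAG_BASE = 0xE0000            # tag characters mirror ASCII at this offset


def encode_language_tag(code: str) -> str:
    """Return LANGUAGE TAG followed by one tag character per ASCII character.

    Hypothetical helper: it only checks for printable ASCII, not for a
    valid language code.
    """
    if not code or not all(0x20 <= ord(ch) <= 0x7E for ch in code):
        raise ValueError("language code must be non-empty printable ASCII")
    return LANGUAGE_TAG + "".join(chr(TAG_BASE + ord(ch)) for ch in code)


def decode_language_tag(tagged: str) -> str:
    """Inverse of encode_language_tag: recover the ASCII language code."""
    if not tagged.startswith(LANGUAGE_TAG):
        raise ValueError("missing U+E0001 LANGUAGE TAG")
    return "".join(chr(ord(ch) - TAG_BASE) for ch in tagged[1:])


tagged = encode_language_tag("zh")
print([f"U+{ord(ch):04X}" for ch in tagged])
# prints ['U+E0001', 'U+E007A', 'U+E0068']
```

The tag goes in front of the text it applies to; the text itself is
unchanged, so software that does not understand tags can skip them.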

	-hpa
-- 
<hpa@xxxxxxxxxxxxx> at work, <hpa@xxxxxxxxx> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt	<amsp@xxxxxxxxx>
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/