Re: UTF-8 as the single common encoding everywhere



Followup to:  <CMM.0.90.4.991864985.fdc@xxxxxxxxxxxxxxxxxxxxxx>
By author:    Frank da Cruz <fdc@xxxxxxxxxxxx>
In newsgroup: linux.utf8
> 
> The UTF-8 which allows non-shortest sequences to be read versus the one that
> does not.  The UTF-8 which emits non-shortest sequences versus one that does
> not.  The UTF-8 which "decodes" surrogates versus the one that treats them as
> if they were regular UCS-2 characters.  The UTF-8 that is limited to 6-byte
> sequences versus the unlimited one.  There's a lot of talk on the Unico[dr]e
> lists about this recently, and new proposals for modifications to UTF-8
> surface all the time (the current hot topic is UTF-8S, proposed by Oracle).
> 

There is only one UTF-8.  The rest are bugs.
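
To make the quoted cases concrete, here is a minimal sketch of what a
strict decoder is expected to do with them.  It assumes Python 3 and its
built-in "utf-8" codec, neither of which comes from this thread; the names
MALFORMED and is_strict_utf8 are hypothetical and used only for
illustration.

```python
# Byte strings corresponding to the "variants" listed in the quoted text.
# A strict UTF-8 decoder should reject every one of them.
MALFORMED = {
    "non-shortest form of '/'": b"\xc0\xaf",
    "non-shortest form of NUL": b"\xc0\x80",
    "lone surrogate U+D800": b"\xed\xa0\x80",
    "surrogate pair as two 3-byte sequences": b"\xed\xa0\x80\xed\xb0\x80",
    "5-byte sequence": b"\xf8\x88\x80\x80\x80",
}


def is_strict_utf8(data: bytes) -> bool:
    """Return True if Python's strict "utf-8" codec accepts the bytes."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    for label, raw in MALFORMED.items():
        print(f"{label}: accepted={is_strict_utf8(raw)}")
    # A character outside the BMP is encoded as one 4-byte sequence,
    # not as a pair of encoded surrogates.
    print("U+10000 as 4 bytes:", is_strict_utf8(b"\xf0\x90\x80\x80"))
```

A decoder that accepts any of the entries in MALFORMED falls into the
"bugs" category above; the shortest encoding of each character is the only
one that should pass.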

	-hpa
-- 
<hpa@xxxxxxxxxxxxx> at work, <hpa@xxxxxxxxx> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/