
Re: Fw: Document Action: UTF-16, an encoding of ISO 10646 to Informational



Followup to:  <200001241125.MAA28060@xxxxxxxxxxxxxx>
By author:    Bruno Haible <haible@xxxxxxx>
In newsgroup: linux.utf8
>
> Alexander Voropay writes:
> 
> >  JFYI. New RFC for "UTF-16".
> 
> Ouch.
> 
> > ( The archive of <ietf-charsets@xxxxxxxx> :
> > http://lists.w3.org/Archives/Public/ietf-charsets/ )
> 
> Thanks for the URL. Summary of the discussions on that list: everyone is
> aware of the serious drawbacks of UTF-16 vs. UTF-8, and UTF-8 remains the
> recommended Unicode encoding.
> 
> Note that BOMs have been assigned a "maybe" status in this new RFC:
> "the character 0xFEFF in the first position of a stream MAY be interpreted
> as a zero-width non-breaking space, and is not always a byte-order mark."
> 

Which is of course silly.  0xFEFF at the beginning is *ALWAYS* a
ZWNB space; 0xFFFE at the beginning is *ALWAYS* an indication that your
stream is horked, quite possibly because you got the byte order wrong.
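
To make the distinction concrete, here is a minimal sketch of the check,
assuming the reader has already picked a byte order to try. The function
name check_first_unit and its return strings are made up for illustration;
they are not from the RFC or any library.

```python
def check_first_unit(data: bytes, assumed_order: str = "big") -> str:
    """Classify the first 16-bit unit of a stream read with an assumed byte order.

    assumed_order is "big" or "little" (hypothetical parameter name).
    """
    if len(data) < 2:
        return "too short"
    # Interpret the first two bytes as one 16-bit code unit.
    unit = int.from_bytes(data[:2], assumed_order)
    if unit == 0xFEFF:
        # ZWNB space: the stream reads correctly in the assumed order.
        return "ok"
    if unit == 0xFFFE:
        # 0xFFFE is not a valid character, so the bytes are most likely swapped.
        return "swapped"
    return "no marker"


if __name__ == "__main__":
    sample = b"\xff\xfeA\x00"  # "A" in little-endian order, preceded by FF FE
    print(check_first_unit(sample, "big"))
    print(check_first_unit(sample, "little"))
```

A reader that sees "swapped" can retry with the other byte order; a stream
with no marker at all has to get its byte order from somewhere else.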

       -hpa

-- 
<hpa@xxxxxxxxxxxxx> at work, <hpa@xxxxxxxxx> in private!
"Unix gives you enough rope to shoot yourself in the foot."
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/