RE: Fw: Document Action: UTF-16, an encoding of ISO 10646 to Informational
H. Peter Anvin wrote:
> > Note that BOMs have been assigned a "maybe" status in this new RFC:
> > "the character 0xFEFF in the first position of a stream MAY be
> > interpreted as a zero-width non-breaking space, and is not always a
> > byte-order mark."
> >
>
> Which is of course silly. 0xFEFF at the beginning is *ALWAYS* a
> ZWNB space; 0xFFFE at the beginning is *ALWAYS* an indication that your
> stream is horked; quite possibly because you got the byte order wrong.
That's not what the community thinks nor what ISO 10646 says:
From Annex H (The use of "signatures" to identify UCS):
"An application receiving data may either use these signatures to identify
the coded representation form, or may ignore them and treat FEFF as the ZERO
WIDTH NO-BREAK SPACE character."
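
To make the two options in that sentence concrete, here is a minimal Python sketch of a receiver that can go either way. The function name decode_utf16, the honor_signature flag, and the big-endian default are assumptions made for illustration; none of them comes from ISO 10646 or the RFC.

```python
import codecs

def decode_utf16(data: bytes, honor_signature: bool = True,
                 default: str = "utf-16-be"):
    """Decode UTF-16 bytes, returning (encoding_used, text).

    honor_signature=True: use a leading FE FF / FF FE as a signature
    to pick the byte order, and drop it from the text.
    honor_signature=False: ignore signatures, decode with `default`,
    and keep a leading U+FEFF as ZERO WIDTH NO-BREAK SPACE.
    (Hypothetical helper; the default byte order is an assumption.)
    """
    if honor_signature:
        if data.startswith(codecs.BOM_UTF16_BE):   # b"\xfe\xff"
            return "utf-16-be", data[2:].decode("utf-16-be")
        if data.startswith(codecs.BOM_UTF16_LE):   # b"\xff\xfe"
            return "utf-16-le", data[2:].decode("utf-16-le")
    # No signature found, or the caller chose to ignore signatures.
    return default, data.decode(default)
```

With the same four bytes FE FF 00 41, the first mode yields the text "A" and the second yields U+FEFF followed by "A"; both are permitted readings under the Annex H wording quoted above.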
--
François Yergeau
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/