
RE: Fw: Document Action: UTF-16, an encoding of ISO 10646 to Informational



H. Peter Anvin wrote:
> > Note that BOMs have been assigned a "maybe" status in this new RFC:
> > "the character 0xFEFF in the first position of a stream MAY be
> > interpreted as a zero-width non-breaking space, and is not always a
> > byte-order mark."
> >
>
> Which is of course silly.  0xFEFF at the beginning is *ALWAYS* a
> ZWNB space; 0xFFFE at the beginning is *ALWAYS* an indication that your
> stream is horked; quite possibly because you got the byte order wrong.

That's neither what the community thinks nor what ISO 10646 says:

From Annex H (The use of "signatures" to identify UCS):
"An application receiving data may either use these signatures to identify
the coded representation form, or may ignore them and treat FEFF as the ZERO
WIDTH NO-BREAK SPACE character."
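
To make the two options in that sentence concrete, here is a rough sketch in Python of a receiver that can go either way. The function names and the honor_signature flag are made up for illustration, and falling back to big-endian when no signature is used is an assumption of this sketch, not something Annex H requires.

```python
import codecs


def sniff_utf16(data: bytes, honor_signature: bool = True):
    """Return (byte_order, payload) for a UTF-16 byte stream.

    honor_signature=True:  use a leading FE FF / FF FE as a signature
                           and strip it from the payload.
    honor_signature=False: ignore signatures; a leading FEFF stays in
                           the payload as ZERO WIDTH NO-BREAK SPACE.
    """
    if honor_signature:
        if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
            return "big", data[2:]
        if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
            return "little", data[2:]
    # Assumption of this sketch: no usable signature means big-endian.
    return "big", data


def decode_utf16(data: bytes, honor_signature: bool = True) -> str:
    """Decode UTF-16 bytes according to the chosen signature policy."""
    order, payload = sniff_utf16(data, honor_signature)
    codec = "utf-16-be" if order == "big" else "utf-16-le"
    return payload.decode(codec)
```

With the signature honored, decode_utf16(b"\xfe\xff\x00A") yields "A"; with honor_signature=False the same bytes yield U+FEFF followed by "A", which is the second reading Annex H allows.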

--
François Yergeau


-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/