Re: W3C and UTF-16
On Thu, Apr 08, 2004 at 08:35:21PM -0400, Michael B Allen wrote:
> This probably states the definitive position for text handling:
>
> http://www.w3.org/TR/1999/WD-charmod-19991129/#Encodings
>
> But even though the encoding is not clearly stated as UTF-16, the Document
> Object Model (DOM), which is basically the document tree inside a web
> browser and key to all HTML and XML processing, including JavaScript and
> XSLT processing, *requires* that the encoding be UTF-16:
>
> http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#ID-C74D1578
"The UTF-16 encoding was chosen because of its widespread industry practice."
Very funny; it was chosen because it's what Windows is stuck with.
That aside, "all" above is incorrect. You don't have to use DOM to process
HTML and XML. (Ultimately, if one *had* to use UTF-16 to process HTML, then
something along the line would be horribly wrong: a language specification can't
legitimately make any requirements about transparent implementation details.)
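
To make the non-DOM route concrete, here is a minimal sketch using a
streaming (expat) parser on UTF-8 input. Python is chosen only for
illustration, `collect_text` is a hypothetical helper name, and the sample
document is made up. Nothing in it builds a DOM tree or asks for UTF-16:

```python
import xml.parsers.expat


def collect_text(xml_bytes):
    """Gather character data with a streaming parser; no DOM tree is built."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # expat may deliver text in several pieces, so accumulate and join them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(xml_bytes, True)  # True: this is the final chunk of input
    return "".join(chunks)


# Made-up sample: no XML declaration, so the parser assumes UTF-8.
doc = "<p>na\u00efve \U0001D11E</p>".encode("utf-8")
text = collect_text(doc)

# UTF-16 code units are only one possible way to count this text.
utf16_units = len(text.encode("utf-16-le")) // 2
utf8_bytes = len(text.encode("utf-8"))
print(len(text), utf16_units, utf8_bytes)
```

The three counts differ because U+1D11E lies outside the BMP: it is one code
point, two UTF-16 code units, and four UTF-8 bytes. Which of these a program
uses internally is its own business.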
--
Glenn Maynard
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/