Re: Debian UTF-8, Mandrake UTF-8, and Apache UTF-8
"Edward H. Trager" wrote on 2004-10-11 18:06 UTC:
> You have a good point that perhaps Apache should have no encoding
> set by default, thus forcing everyone to read the documentation and
> make a decision.
The spec on

  http://www.w3.org/TR/html4/charset.html#h-5.2.2

says

  The HTTP protocol ([RFC2616], section 3.7.1) mentions ISO-8859-1 as
  a default character encoding when the "charset" parameter is absent
  from the "Content-Type" header field. In practice, this recommendation
  has proved useless because some servers don't allow a "charset"
  parameter to be sent, and others may not be configured to send
  the parameter. Therefore, user agents must not assume any default
  value for the "charset" parameter.

I guess this "assume-nothing" rule could reasonably be extended to HTTP
servers as well. With the now rapidly ongoing deployment of UTF-8,
ISO 8859-1 is not nearly as commonplace any more as it was two years ago.
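
As a rough illustration of what "assume nothing" means on the user-agent
side, here is a minimal Python sketch. The function name declared_charset
is hypothetical and only for illustration; it relies on the standard
library's email.message.Message to parse the Content-Type value. It
returns the charset only if the header states one explicitly, and None
otherwise, so the caller has to decide what to do (look for a META
declaration, sniff the bytes, ask the user) instead of silently falling
back to ISO-8859-1.

```python
from email.message import Message
from typing import Optional


def declared_charset(content_type: Optional[str]) -> Optional[str]:
    """Return the charset explicitly declared in a Content-Type value.

    Returns None when the header is missing or carries no "charset"
    parameter. No default (such as ISO-8859-1) is ever substituted.
    """
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lower-cases the value and returns None
    # if there is no charset parameter in the header.
    return msg.get_content_charset()


if __name__ == "__main__":
    for value in ("text/html; charset=UTF-8", "text/html", None):
        print(repr(value), "->", repr(declared_charset(value)))
```

The point of returning None rather than a guess is that the absence of
the parameter stays visible to the caller.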
Markus
--
Markus Kuhn, Computer Lab, Univ of Cambridge, GB
http://www.cl.cam.ac.uk/~mgk25/ | __oo_O..O_oo__
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/