
Re: Byte-order-marks considered harmful



On Mon, 29 Nov 1999, you wrote:
> [Sorry for the delay, I have been on holidays]
> 
> Kai Henningsen wrote:
... 
> > Let's be reasonable. The whole *point* of using UTF-8 is to remain ASCII- 
> > compatible. Putting in BOMs is not ASCII-compatible.
> 
> I don't understand this remark.  If the file only contains ASCII characters
> (<128) there should be no BOM, since the UTF-8 file is equal to the ASCII
> file, thus it's not really UTF-8 encoded.  When there are non-ASCII
> characters, the file is not ASCII compatible and the BOM can be used.

I think what he means is that you can treat an ASCII file as if it were a
UTF-8 file from now on: an ASCII file and a UTF-8 file without any multi-byte
characters are byte-for-byte identical. You can simply replace the ASCII
encoding with UTF-8. A BOM would break that, since its bytes are not ASCII.
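
To make that concrete, here is a small Python sketch. The sample string and
the helper name is_ascii are my own, purely for illustration; the only thing
taken as given is that the UTF-8 BOM is the three bytes EF BB BF.

```python
import codecs

# Hypothetical sample text; any pure-ASCII string behaves the same way.
sample = "Hello, world\n"

ascii_bytes = sample.encode("ascii")
utf8_bytes = sample.encode("utf-8")

# Pure ASCII text produces the same bytes under both encodings.
same_bytes = ascii_bytes == utf8_bytes


def is_ascii(data: bytes) -> bool:
    """Return True if every byte in data decodes as 7-bit ASCII."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


# Prepending the UTF-8 BOM (EF BB BF) adds three bytes above 127,
# so the result can no longer be read as plain ASCII.
with_bom = codecs.BOM_UTF8 + utf8_bytes

if __name__ == "__main__":
    print("ASCII and UTF-8 bytes identical:", same_bytes)
    print("without BOM readable as ASCII:", is_ascii(utf8_bytes))
    print("with BOM readable as ASCII:", is_ascii(with_bom))
```

So the file without a BOM still works with every ASCII-only tool, while the
same file with a BOM does not.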

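The problem is not ASCII but the ISO-8859-[1-9] encodings, especially
ISO-8859-1, which has become the standard on Linux for lack of a better
alternative. Text in those encodings uses single bytes above 127, which are
generally not valid UTF-8, so such files cannot simply be relabeled.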
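
A rough Python sketch of why the ISO-8859-1 case is the hard one. The sample
word and the helper name is_valid_utf8 are assumptions of mine, chosen only
to show one non-ASCII letter in both encodings.

```python
# Hypothetical sample: a Danish word with one non-ASCII letter.
text = "smør"

latin1_bytes = text.encode("iso-8859-1")  # 'ø' is the single byte F8
utf8_bytes = text.encode("utf-8")         # 'ø' is the two bytes C3 B8


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Reading UTF-8 bytes as ISO-8859-1 never fails, it just shows garbage,
# because every byte value maps to some ISO-8859-1 character.
misread = utf8_bytes.decode("iso-8859-1")

if __name__ == "__main__":
    print("ISO-8859-1 bytes:", latin1_bytes.hex())
    print("UTF-8 bytes:     ", utf8_bytes.hex())
    print("ISO-8859-1 file valid as UTF-8:", is_valid_utf8(latin1_bytes))
    print("UTF-8 read as ISO-8859-1:", misread)
```

Unlike the ASCII case, the two encodings disagree on the bytes as soon as
one accented letter appears, so existing ISO-8859-1 files have to be
converted rather than reinterpreted.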

-- 
Best regards
Birger Langkjer
http://members.xoom.com/langkjer
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/