RE: filetype field?
Bram Moolenaar wrote:
> What I am looking for myself is a solution
> that works on all platforms that Vim runs on, including
> Windows...
Then be prepared to support a BOM in UTF-8. Microsoft has decided that it
should be so, wise or not. If you save a text file in UTF-8 from Notepad on
Windows 2000b3, you get a BOM at the beginning. Reloading the file in
Notepad works fine, but catting the file on the command line (shell)
produces garbage.
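
To make the garbage concrete, here is a small Python sketch of what such a
file contains. The helper names (save_with_bom, shown_as_latin1) are made up
for illustration, and Latin-1 is only an assumed display encoding; I don't
know which code page the shell in question actually used.

```python
import codecs

# U+FEFF encoded as UTF-8 is the three-byte sequence EF BB BF.
assert "\ufeff".encode("utf-8") == codecs.BOM_UTF8 == b"\xef\xbb\xbf"


def save_with_bom(text: str) -> bytes:
    """Encode text the way a BOM-writing editor would (hypothetical helper).

    Python's "utf-8-sig" codec prepends the UTF-8 signature on encoding.
    """
    return text.encode("utf-8-sig")


def shown_as_latin1(data: bytes) -> str:
    """Interpret raw bytes as Latin-1, an assumed non-UTF-8 display encoding."""
    return data.decode("latin-1")


if __name__ == "__main__":
    saved = save_with_bom("hello\n")
    print(saved)                   # raw bytes, signature first
    print(shown_as_latin1(saved))  # the signature shows up as three stray characters
```

A display that does not decode UTF-8 shows those three leading bytes as
separate characters in front of the real text.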
> One issue I just thought of: Isn't it true that a UTF-8
> file can always legally start with a BOM?
Yes, the BOM is also an ordinary character (U+FEFF ZERO WIDTH NO-BREAK SPACE)
that may legally appear anywhere in a text.
> Then all UTF-8 aware applications should be able to handle it
> correctly. This mostly means they ignore the BOM,...
No, there's nothing that says that UTF-8 apps must or even should ignore a
BOM. The BOM was designed for solving (hum!) the byte-order issue in UCS-*
and UTF-16; it has only recently appeared in the UTF-8 landscape (pushed by
Microsoft, it would seem).
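
For an application that does choose to tolerate the signature, a minimal
sketch in Python could look like the following. It is one possible policy,
not something any specification mandates: it drops a single BOM at the very
start of the data and leaves U+FEFF alone everywhere else, since there it is
a legitimate character. The function name is hypothetical.

```python
import codecs


def decode_utf8_tolerating_bom(data: bytes) -> str:
    """Decode UTF-8 bytes, dropping one leading BOM if present.

    Only the first three bytes are examined; a U+FEFF later in the
    text is kept as an ordinary character.
    """
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return data.decode("utf-8")


if __name__ == "__main__":
    with_bom = codecs.BOM_UTF8 + "text".encode("utf-8")
    print(repr(decode_utf8_tolerating_bom(with_bom)))
    print(repr(decode_utf8_tolerating_bom(b"text")))
```

Files without a signature pass through unchanged, so the same reader works
for both kinds of input.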
This might change with Unicode 3.0, which is supposed to have much better
text on the BOM, but I haven't seen it yet.
--
François Yergeau
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/