
filetype field?



On Wed, 3 Nov 1999, Bram Moolenaar wrote:
 
> >  - The Unix kernel #!/bin/sh mechanism will break, because the
> >    file will not start any more with #!
> 
> Good point.  Putting the BOM in the second line would work.  But that's a bit
> strange.  It would be better to adjust the kernel to handle UTF-8 files, and
> thus ignore the BOM in this position.  Just one more place that needs to be
> UTF-8 aware, not a big deal.

If the kernel is to look for a UTF-8 BOM, it might as well look for a
general encoding marker.  That seems to be what you are using the BOM for.
There is no byte order to be marked in UTF-8 texts, is there?
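To make the problem concrete: the UTF-8 encoding of the BOM is the three bytes EF BB BF, so a script that begins with it no longer begins with "#!". The following is only a sketch in Python, not kernel code; the function name has_shebang and its skip_bom flag are made up for illustration.

```python
import codecs

def has_shebang(head: bytes, skip_bom: bool = False) -> bool:
    """Return True if the file head starts with '#!'.

    With skip_bom=True, a leading UTF-8 BOM (EF BB BF) is ignored first.
    That mimics the hypothetical kernel change under discussion.
    """
    if skip_bom and head.startswith(codecs.BOM_UTF8):
        head = head[len(codecs.BOM_UTF8):]
    return head.startswith(b"#!")

script = b"#!/bin/sh\necho hello\n"
with_bom = codecs.BOM_UTF8 + script

print(has_shebang(script))                   # plain script
print(has_shebang(with_bom))                 # BOM hides the '#!'
print(has_shebang(with_bom, skip_bom=True))  # BOM-aware check
```

The check is tied to one specific marker, which is the point of the question above: it identifies an encoding, not a byte order.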

If the kernel is to be changed, why not go to the roots and introduce a
filetype field into the inode table, similar to the permissions field,
with commands like

	$ chft "text/plain; charset=utf-8" file1.txt 
	$ chft "text/plain; charset=iso-8859-1" file2.txt 
	$ chft "image/png" file.png

and a 

	/etc/filetypes

table that associates MIME types with code numbers in a
tending-to-become-standardized way?
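Neither chft nor /etc/filetypes exists; both are proposals. As a rough sketch of what such a table might look like, assume one "code  MIME-type" pair per line with "#" comments. The layout, the code numbers, and the parse_filetypes helper below are all invented for illustration.

```python
def parse_filetypes(text: str) -> dict:
    """Parse a hypothetical /etc/filetypes table into {code: mime_type}."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        code, mime = line.split(None, 1)      # first field is the numeric code
        table[int(code)] = mime
    return table

# Made-up sample contents; the numbers carry no standard meaning.
SAMPLE = """\
# code  MIME type
1       text/plain; charset=utf-8
2       text/plain; charset=iso-8859-1
3       image/png
"""

filetypes = parse_filetypes(SAMPLE)
print(filetypes[1])
```

The inode would then store only the small integer, and tools would look up the full type string in the table, much as numeric user IDs are resolved through /etc/passwd.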

That could at least ensure that no BOMs are misplaced during 

	$ cat file1.txt file2.txt > file.txt

and might solve a lot of other problems.
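The misplaced-BOM case can be reproduced without any kernel change. The sketch below assumes two UTF-8 files that each start with a BOM and joins them byte by byte, which is all cat does; the cat function here is a stand-in for the real command.

```python
import codecs

BOM = codecs.BOM_UTF8  # b"\xef\xbb\xbf"

def cat(*parts: bytes) -> bytes:
    """Byte-wise concatenation, with no knowledge of encodings."""
    return b"".join(parts)

file1 = BOM + "first\n".encode("utf-8")
file2 = BOM + "second\n".encode("utf-8")
joined = cat(file1, file2)

# 'utf-8-sig' strips only a leading BOM, so the second one stays in the
# text as U+FEFF in the middle of the data.
text = joined.decode("utf-8-sig")
print(joined.count(BOM))
print(repr(text))
```

With the type kept in the inode instead of in the data, the file contents would carry no marker at all, so there would be nothing to end up in the wrong place.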

--
phm

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/