
Re: filename encoding (was: ISO-2022)



Bram Moolenaar <Bram@xxxxxxxxxxxxx>:

> > No. A filename is just a sequence of bytes - no conversion required
> > or desirable.
> 
> From the point of view of the kernel it's just a sequence of bytes (except for
> '/').  From the point of view of the user the bytes form characters with a
> specific meaning.  If you use the wrong character set, that meaning is lost.
> Conversion is required to keep the meaning.

The question is: where should this conversion be performed?

I suggest it should be performed in individual programs, if at all
(I'm not sure it's worth implementing).
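To make the disagreement concrete, here is a minimal Python sketch of
what "conversion in an individual program" could look like. The file
name is a made-up example, and the two encodings are simply the ones
mentioned in this thread; nothing here is taken from an existing tool.

```python
# The bytes as the kernel stores them: it attaches no character set to them.
raw_name = b"caf\xc3\xa9.txt"  # hypothetical name, written by a UTF-8 user

# The user-visible meaning depends on which character set the program assumes.
as_utf8 = raw_name.decode("utf-8")         # the intended reading
as_latin1 = raw_name.decode("iso-8859-1")  # wrong guess: the meaning is lost

# Conversion done by the program itself, e.g. before showing the name
# to a user whose terminal expects ISO-8859-1.
converted = as_utf8.encode("iso-8859-1")

print(repr(as_utf8), repr(as_latin1), converted)
```

The bytes on disk never change; only the program's interpretation does.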

> I think the problem is clear: file names can be encoded in any character set.
> We need to know the character set used to do anything with those names.  Thus
> the character set must be stored with the file system.

I disagree. What have file systems got to do with it?

For example, files under /home/tom/ might use ISO-8859-1, while files
under /home/dick/ use UTF-8, but those might both be subdirectories of
the same NFS file system mounted on /home. On the server /home might
correspond to /export/home, where /export is an ext2 fs ...

An ideal multi-encoding file browser might allow you to specify
arbitrarily complex rules for deciding what encoding is in use for
which file names. But personally, I don't think this is worth
implementing: just use UTF-8.
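
For what it's worth, the simplest form of such rules might look like
the Python sketch below: a table of path prefixes, with the longest
matching prefix winning. The table contents, the UTF-8 default and the
function names are all assumptions made up for illustration, not part
of any existing browser.

```python
# Hypothetical rule table: (path prefix, encoding used for names below it).
RULES = [
    (b"/home/tom/", "iso-8859-1"),
    (b"/home/dick/", "utf-8"),
]
DEFAULT_ENCODING = "utf-8"  # assumed fallback when no rule matches


def encoding_for(path: bytes) -> str:
    """Return the encoding of the longest rule prefix that matches path."""
    best_prefix, best_encoding = b"", DEFAULT_ENCODING
    for prefix, encoding in RULES:
        if path.startswith(prefix) and len(prefix) > len(best_prefix):
            best_prefix, best_encoding = prefix, encoding
    return best_encoding


def display_name(path: bytes) -> str:
    """Decode path for display; undecodable bytes become U+FFFD."""
    return path.decode(encoding_for(path), errors="replace")


print(display_name(b"/home/tom/caf\xe9"))       # ISO-8859-1 bytes
print(display_name(b"/home/dick/caf\xc3\xa9"))  # UTF-8 bytes
```

Note that the rules are keyed on paths, not on file systems, which is
the point of the NFS example above. Real rules would get more
complicated quickly.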

CDs are a special case, because they're read-only: the file names on
them can't be converted to UTF-8 in place.

Edmund
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/