Re: filename encoding (was: ISO-2022)
Bram Moolenaar <Bram@xxxxxxxxxxxxx>:
> > No. A filename is just a sequence of bytes - no conversion required
> > or desirable.
>
> From the point of view of the kernel it's just a sequence of bytes (except for
> '/'). From the point of view of the user the bytes form characters with a
> specific meaning. If you use the wrong character set, that meaning is lost.
> Conversion is required to keep the meaning.
The question is: where should this conversion be performed?
I suggest it should be performed in individual programs, if at all
(I'm not sure it's worth implementing).
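To make the quoted point about lost meaning concrete, here is a small Python sketch. The name "café" is only an illustrative example, not something from this thread. It shows that the same characters give different byte sequences in ISO-8859-1 and UTF-8, and that reading the bytes with the wrong character set garbles the name.

```python
# Illustrative only: the example name is an assumption, not from the thread.
name = "caf\u00e9"  # "café"

# The same characters become different byte sequences in each encoding.
raw_latin1 = name.encode("iso-8859-1")  # 4 bytes, last one is 0xE9
raw_utf8 = name.encode("utf-8")         # 5 bytes, ends with 0xC3 0xA9

# Decoding with the encoding that was used to write the name recovers it.
roundtrip = raw_utf8.decode("utf-8")

# Decoding UTF-8 bytes as ISO-8859-1 never fails, but the meaning is lost:
# the two bytes 0xC3 0xA9 are shown as two separate characters.
misread = raw_utf8.decode("iso-8859-1")

# Decoding ISO-8859-1 bytes as UTF-8 fails outright, because a lone 0xE9
# is not a valid UTF-8 sequence.
try:
    raw_latin1.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False
```

The kernel treats both byte strings as perfectly good, distinct file names; only a program that displays them needs to know which character set was meant.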
> I think the problem is clear: file names can be encoded in any character set.
> We need to know the character set used to do anything with those names. Thus
> the character set must be stored with the file system.
I disagree. What have file systems got to do with it?
For example, files under /home/tom/ might use ISO-8859-1, while files
under /home/dick/ use UTF-8, but those might both be subdirectories of
the same NFS file system mounted on /home. On the server /home might
correspond to /export/home, where /export is an ext2 fs ...
An ideal multi-encoding file browser might allow you to specify
arbitrarily complex rules for deciding what encoding is in use for
which file names. But personally, I don't think this is worth
implementing: just use UTF-8.
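For what it's worth, a very simple form of such rules could look like the Python sketch below. Everything in it is hypothetical: the RULES table, the function names and the choice of "longest matching directory prefix wins" are assumptions made for illustration, not a description of any existing file browser. It only decides how to display a name; the bytes themselves are left alone.

```python
# Hypothetical per-directory rules: (path prefix as bytes, encoding name).
# The prefixes reuse the /home/tom and /home/dick example from above.
RULES = [
    (b"/home/tom/", "iso-8859-1"),
    (b"/home/dick/", "utf-8"),
]
DEFAULT_ENCODING = "utf-8"  # the "just use UTF-8" fallback


def encoding_for(path: bytes) -> str:
    """Return the encoding of the longest matching prefix, else the default."""
    best_prefix = b""
    best_encoding = DEFAULT_ENCODING
    for prefix, encoding in RULES:
        if path.startswith(prefix) and len(prefix) > len(best_prefix):
            best_prefix, best_encoding = prefix, encoding
    return best_encoding


def display_name(path: bytes) -> str:
    """Decode a file name for display only; the bytes on disk are untouched."""
    # errors="replace" shows undecodable bytes as U+FFFD instead of raising.
    return path.decode(encoding_for(path), errors="replace")
```

With these rules, display_name(b"/home/tom/caf\xe9") and display_name(b"/home/dick/caf\xc3\xa9") both yield a name ending in "café", while a non-UTF-8 name outside both directories is shown with a replacement character. Note that the rules are keyed on path names as the user sees them, not on file systems, which is the point of the NFS example.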
CDs are a special case, because they're read-only.
Edmund
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/