Re: Linux and UTF8 filenames



Followup to:  <3.0.1.32.20020916125316.0095fb30@xxxxxxxxxxxxxxxxxx>
By author:    Martin Kochanski <unicode@xxxxxxxxxxx>
In newsgroup: linux.utf8
> 
> Linux, to me, is more of a puzzle. The kernel simply treats
> filenames as a sequence of bytes, so it will happily accept almost
> anything you throw at it. In particular, 52 EA 76 65 and 52 C3 AA 76
> 65 are both valid filenames. What I can't immediately work out is
> what the tools (such as 'ls') will do. Is it universally the case
> that the tools will assume that those byte-sequence filenames are in
> UTF8 (in which case the two examples come out as R?ve and Rêve)? Or
> do they assume a standard locale (perhaps yielding Rêve and RÃªve)?
> Or is this a switchable option that the user can set? In any case,
> how can a poor innocent server discover enough about the context in
> which it is running to know what filename it has to use so that a
> user who lists a file directory will see "Rêve" on his screen?
> 

Linux follows the locale: tools such as 'ls' interpret filename bytes
in the character encoding of the current locale.  Most systems operate
with a system default locale; occasionally a user sets a different
locale.
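
As for how a server can find out which bytes to use, here is a rough
sketch in Python (my choice of language for illustration, not anything
the tools themselves use).  It asks the C library which character
encoding the current locale specifies and encodes the name accordingly.
The helper names locale_codeset, filename_bytes and display_name are
made up for this example, and it assumes the server inherits the same
locale environment as the user who will list the directory, which is
not guaranteed for a daemon.

```python
import locale


def locale_codeset() -> str:
    """Return the character encoding named by the current locale."""
    try:
        # Adopt the locale from the environment (LANG, LC_ALL, LC_CTYPE).
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # Unknown or uninstalled locale: stay in the default "C" locale.
        pass
    return locale.getpreferredencoding(False)


def filename_bytes(name: str, codeset: str) -> bytes:
    """Encode a display name into the bytes handed to the kernel."""
    return name.encode(codeset)


def display_name(raw: bytes, codeset: str) -> str:
    """Decode filename bytes as a tool using `codeset` might show them.

    Bytes that are invalid in that encoding are shown as '?'.
    """
    return raw.decode(codeset, errors="replace").replace("\ufffd", "?")


if __name__ == "__main__":
    print("locale codeset:", locale_codeset())
    for raw in (bytes.fromhex("52EA7665"), bytes.fromhex("52C3AA7665")):
        for codeset in ("utf-8", "iso-8859-1"):
            print(raw.hex(" "), codeset, display_name(raw, codeset))
```

With the two filenames from the question, the UTF-8 reading gives R?ve
and Rêve, and the ISO-8859-1 reading gives Rêve and RÃªve.  Note that
filename_bytes raises UnicodeEncodeError when the locale's encoding
cannot represent the name (a plain "C" locale and "ê", for instance),
so a real server would need a fallback for that case.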

We're trying to push Linux systems toward using UTF-8 everywhere, but
it's a long push.

	-hpa
-- 
<hpa@xxxxxxxxxxxxx> at work, <hpa@xxxxxxxxx> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt	<amsp@xxxxxxxxx>
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/