Re: question on Linux UTF8 support
Hi Bruno,
Today at 17:24, Bruno Haible wrote:
> This will mess up users who have their LC_CTYPE set to a non-UTF-8 encoding.
> It is weird if a user, in an application, enters a new file name "Süß",
> and then in a terminal, the filename appears as "Süà " (wow, it even
> hangs my xterm!).
Oh, indeed. But what about a user deciding to change LC_CTYPE? Or even
worse, what if an administrator provides some directories for the user in
an encoding different from the one the user wants to use?
E.g. imagine a global "/Müsik" stored in ISO-8859-1, and a user who wants
to use UTF-8 or ISO-8859-5. Not only will it look weird (and possibly
even hang your xterm!), you'd also be in a mess if you tried to fix it.
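To make the mismatch concrete, here is a small Python sketch (not anything from a real system, just an illustration). The directory name is the one from the example above; the helper `show` is a hypothetical name, and decoding with `errors="replace"` is my assumption about how a tolerant program would display bytes it cannot interpret.

```python
# Hypothetical name from the example above, stored on disk as ISO-8859-1 bytes.
raw = "/Müsik".encode("iso-8859-1")  # b'/M\xfcsik'


def show(raw_name: bytes, encoding: str) -> str:
    """Decode raw file-name bytes as a program running in that charset would."""
    return raw_name.decode(encoding, errors="replace")


as_latin1 = show(raw, "iso-8859-1")    # what the administrator intended
as_cyrillic = show(raw, "iso-8859-5")  # same byte 0xFC, read as a Cyrillic letter
as_utf8 = show(raw, "utf-8")           # 0xFC is invalid in UTF-8, becomes U+FFFD

# The reverse case: a name written as UTF-8, then listed in an ISO-8859-1 terminal.
garbled = "Süß".encode("utf-8").decode("iso-8859-1")

for label, text in [
    ("ISO-8859-1:", as_latin1),
    ("ISO-8859-5:", as_cyrillic),
    ("UTF-8:", as_utf8),
    ("reverse:", garbled),
]:
    print(f"{label:12} {text!r}")
```

The bytes on disk never change; only the interpretation does, which is why no per-user setting can make the name come out right for everybody.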
My point is that the filesystem encoding should be filesystem-wide
(not per-user), because that's the only way to guarantee that it won't
break. And for the POSIX API, UTF-8 makes the most sense as a
single, backwards-compatible filesystem encoding (well, it wasn't
originally called "FSS-UTF" for no reason :), which can work for
everybody.
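A rough Python sketch of what "backwards compatible" means for file names: ASCII text encodes to the same bytes it always had, and the bytes of a multi-byte UTF-8 sequence are all 0x80 or above, so they can never be mistaken for "/" or NUL. The sample path is made up for illustration.

```python
# Made-up path mixing ASCII and non-ASCII characters.
name = "Müsik/Süß"
encoded = name.encode("utf-8")

# Plain ASCII names are byte-for-byte identical in UTF-8.
ascii_unchanged = "usr/local/bin".encode("utf-8") == b"usr/local/bin"

# Collect every byte produced by the non-ASCII characters only.
non_ascii_bytes = [
    byte
    for ch in name
    if ord(ch) > 0x7F
    for byte in ch.encode("utf-8")
]

# None of those bytes falls in the ASCII range, so none can be '/' or NUL.
all_high = all(byte >= 0x80 for byte in non_ascii_bytes)

# The only '/' byte in the encoded name is the real path separator.
slash_count = encoded.count(b"/")

print(encoded)
print("ASCII unchanged:", ascii_unchanged)
print("multi-byte bytes all >= 0x80:", all_high)
print("number of '/' bytes:", slash_count)
```

That is why code which treats file names as opaque byte strings and only splits on "/" keeps working unchanged.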
> It is just as bad as those old Motif applications which assume that
> everything is ISO-8859-1. This makes these applications useless in UTF-8
> locales.
No, it's not. UTF-8 can encode all characters, so you'd be able to
use whatever characters you wish, give or take a conversion step.
ISO-8859-1 limits you not only at the "implementation details" level,
but also in features.
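A quick Python sketch of the difference in coverage. The sample strings and the helper `can_encode` are my own illustrative choices, not taken from any application.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Illustrative names: German, Russian, Japanese.
samples = ["Süß", "Музыка", "音楽"]

# For each sample: (fits in ISO-8859-1?, fits in UTF-8?)
coverage = {
    s: (can_encode(s, "iso-8859-1"), can_encode(s, "utf-8"))
    for s in samples
}

# UTF-8 round-trips every sample without loss.
round_trips = all(s.encode("utf-8").decode("utf-8") == s for s in samples)

for s, (latin1_ok, utf8_ok) in coverage.items():
    print(f"{s!r}: ISO-8859-1={latin1_ok} UTF-8={utf8_ok}")
```

With ISO-8859-1 hard-coded, the last two names cannot be stored at all; with UTF-8, the worst case is converting to and from the user's locale.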
> In summary, I'd suggest
> - that ALL application follow LC_ALL/LC_CTYPE/LANG, like POSIX specifies,
> - that users switch to UTF-8 locale when they want.
That doesn't bring us any closer to solving the problem. It's the status
quo. I think we should at least recommend improvements, if not require
them (and nobody suggested requiring them).
Basically, my recommendation was to set LC_CTYPE to a UTF-8 locale on
all new systems.
Cheers,
Danilo
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/