Re: How to detect the encoding of a string?
On Monday, 20 June 2005, at 12:59, Mike FABIAN wrote:
> Simos Xenitellis <simos74@xxxxxxx> wrote:
>
> > Hi All,
> > The ZIP format (http://www.info-zip.org/pub/infozip/doc/) appears not
> > to specify the text encoding
> > of the filenames of the compressed files, which causes a problem with
> > unzip utilities when they try
> > to uncompress .ZIP files that include filenames in non-UTF-8 encodings.
> >
> > ZIP programs such as "unzip", "file-roller" (GNOME, at
> > http://fileroller.sourceforge.net/), and "ark" (KDE)
> > cannot guess the encoding of the filenames and automatically convert
> > them to UTF-8.
> >
> > To solve this problem, a "workaround" is to be able to detect the
> > encoding and automagically convert to UTF-8.
> >
> > Is there a library or sample program that can do such "encoding
> > detection" based on short strings of unknown encoding
> > (or choose among encodings from a smaller list than "iconv --list")?
>
> I think it is better to use the filename-encoding-conversion tool
> "convmv" to fix the encoding *after* unpacking the archive.
>
> See: http://j3e.de/linux/convmv/
>
> ("convmv" is already included in SuSE Linux).
Thanks.
Though you must agree that this does not follow the principle of "Just
works"; the GUI tool will not be able to do the work for them. Most
end-users will be in an "I am stuck and give up" situation. :(
If we cannot solve it in a graceful way, we might be able to sweep this
whole issue under the carpet if we can establish that only a very limited
number of end-users are really affected. Can we say that? People from distros,
do you have feedback on this?
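
To make the "choose from a smaller list" idea from the quoted question concrete, here is a minimal sketch in Python. It is only an illustration: the function name guess_and_decode and the candidate list are assumptions of mine, not something unzip, file-roller, ark or convmv implements. It simply returns the first candidate encoding that decodes the raw filename bytes without error.

```python
# Hypothetical helper (not part of any tool mentioned in this thread):
# try a short, ordered list of candidate encodings on raw filename bytes.

# Assumed candidate list, for illustration only. Strict encodings go first;
# cp437 maps every byte value, so it acts as a last-resort catch-all.
DEFAULT_CANDIDATES = ("utf-8", "iso8859-7", "cp437")


def guess_and_decode(raw, candidates=DEFAULT_CANDIDATES):
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    for enc in candidates:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding; try the next one.
            continue
    raise ValueError("none of the candidate encodings matched")


if __name__ == "__main__":
    # A Greek filename as a legacy ISO-8859-7 archive would store it.
    raw_name = "αρχείο.txt".encode("iso8859-7")
    text, enc = guess_and_decode(raw_name)
    print(enc, text)
```

Note that "decodes without error" is weak evidence: most single-byte encodings accept almost any byte string, so the order of the list decides the outcome, and a short filename may be guessed wrongly. A GUI tool would probably still want to show the guess and let the user override it.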
Simos
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/