Re: How to detect the encoding of a string?
On Fri, Jun 03, 2005 at 12:43:05AM +0300, Alexandros Diamantidis wrote:
>
> I encountered this problem recently, when I tried to unpack a zip file
> with Greek filenames created with WinZip. I didn't try any graphical
> decompression software, only command-line unzip, and discovered that
> while the filenames were stored in the zip file in code page 737 (CP737),
> unzip tried to map them using a CP437-to-Latin-1 translation table on
> extraction, and the result was a complete mess...
>
> I found that I could display the stored filenames correctly with the
> following command:
>
> zipnote file.zip | iconv -f cp737 -t utf-8
>
I've had similar problems with unzip when extracting files with Shift-JIS
filenames. Using ZipGenius on Windows works fine (and, with my setup, the
names actually end up in the proper encoding on NTFS). In my case unzip -l
also shows the filenames unmangled: piping its output through iconv -f sjis
converts them successfully, whereas the names of the extracted files fail
to convert. I tried looking through the source to find out what was going
on, but it was difficult to trace. fastjar seems to extract the files
without mangling the names, but it gets easily confused by some zip files.
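
For what it's worth, the same "read the stored names and convert them
yourself" approach can be sketched in Python. This is only a rough
illustration, not what unzip or fastjar do internally. It assumes Python 3's
zipfile module, which decodes names as CP437 when the entry's UTF-8 flag is
not set, so the original bytes can be recovered by encoding back to CP437.
The function name recover_names is made up for this example, and you still
have to know (or guess) the real encoding yourself:

```python
import zipfile


def recover_names(zip_source, real_encoding):
    """List member names of a zip archive, re-decoded with real_encoding.

    Assumption: names without the UTF-8 flag were decoded by zipfile as
    CP437, so encoding them back to CP437 yields the bytes that were
    actually stored in the archive.
    """
    names = []
    with zipfile.ZipFile(zip_source) as zf:
        for info in zf.infolist():
            if info.flag_bits & 0x800:
                # Bit 11 set: the name is stored as UTF-8, leave it alone.
                names.append(info.filename)
            else:
                stored_bytes = info.filename.encode("cp437")
                names.append(stored_bytes.decode(real_encoding))
    return names
```

Something like recover_names("file.zip", "cp737") would then correspond to
the zipnote | iconv pipeline quoted above, and "shift_jis" would be the
codec to try for my case. One caveat: on Windows, zipfile rewrites
backslashes in names to "/", and the byte 0x5C can be the second byte of a
Shift-JIS character, so the round trip may not be exact there.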
-Scott
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/