
Re: Unicode Filenames in Archives



SrinTuar wrote:

Using some fairly recent operating systems, such as Fedora Core 8 and Windows XP,
I seem to have no way to move a bunch of files from one to the other while preserving
the nice Unicode filenames I have.


Specifically, the files (a few thousand of them) were created on the FC8 system.

Putting them together in a zip file works fine FC8->FC8, but fails miserably
when trying to unzip on Windows.


A bit of searching shows this:
http://www.pkware.com/documents/casestudies/APPNOTE.TXT

PKWARE has apparently declared a flag bit to mean that filenames are UTF-8.
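
For anyone who wants to check an archive for that flag: APPNOTE describes it as bit 11 of the general purpose bit flag (mask 0x800). The following is only a rough sketch using Python's standard zipfile module; the helper names utf8_flag_report and make_sample_zip are made up for illustration. It reports, per entry, whether the bit is set. It relies on zipfile setting the bit itself when it writes a name that is not plain ASCII.

```python
import io
import zipfile

# Bit 11 of the general purpose bit flag: "names are UTF-8" per APPNOTE.
UTF8_NAME_FLAG = 0x800


def utf8_flag_report(zip_bytes):
    """Map each entry name to True/False: is the UTF-8 name flag set?"""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return {
            info.filename: bool(info.flag_bits & UTF8_NAME_FLAG)
            for info in zf.infolist()
        }


def make_sample_zip():
    """Build a small in-memory archive (hypothetical sample data)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("plain.txt", b"ascii name")
        zf.writestr("ελληνικά.txt", b"non-ascii name")
    return buf.getvalue()


if __name__ == "__main__":
    for name, flagged in utf8_flag_report(make_sample_zip()).items():
        print(name, "UTF-8 flag set" if flagged else "no UTF-8 flag")
```

An entry without the flag says nothing certain about its real encoding; it only means the tool that wrote it did not declare UTF-8.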

But at the same time, the developers of Info-ZIP say this:
  http://www.info-zip.org/FAQ.html

Basically, that UTF-8 support is nowhere on their radar.

Things work poorly in the opposite direction as well, for zip files created on Windows:
sometimes I can guess the original encoding and reverse the damage; other times
I cannot. Perhaps the software that made the archive has already trashed the filenames.
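
The "guess and reverse" step can be sketched roughly as follows. This is only an illustration, under the assumption that the extractor decoded the raw name bytes as CP437 (the traditional zip default) when they were really in some other encoding. The function name repair_name and the candidate list are arbitrary choices for the example, not anything from the tools mentioned above.

```python
def repair_name(garbled, wrong="cp437",
                candidates=("utf-8", "cp1253", "cp1252")):
    """Try to undo a wrong decode of a filename.

    Returns (repaired_name, encoding_that_worked), or None when the raw
    bytes cannot be recovered or no candidate encoding decodes them.
    """
    try:
        # Recover the raw bytes by reversing the assumed wrong decode.
        raw = garbled.encode(wrong)
    except UnicodeEncodeError:
        # The name contains characters the wrong codec never produces,
        # so the assumption about how it was garbled does not hold.
        return None
    for enc in candidates:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    return None


if __name__ == "__main__":
    # Simulate the damage: UTF-8 bytes read as CP437.
    garbled = "ελληνικά.txt".encode("utf-8").decode("cp437")
    print(repr(garbled), "->", repair_name(garbled))
```

UTF-8 is tried first because it rejects most byte sequences that are not really UTF-8; the single-byte code pages accept almost anything, so a "success" there is still only a guess. If the archiver replaced characters with "?" before storing the name, nothing of this kind can bring them back.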


I've also given tarballs a shot for this task, but sadly Cygwin is ASCII-only.

Because it works Linux to Linux, or at least Fedora to Fedora, and that is really good enough for me,
it's not a major issue. But I'm curious to know whether others have run into this cross-platform problem, and how they
resolved it for themselves. That is, if anyone still reads this list.


How do you go about making a basic archive containing non-ASCII filenames that you can have confidence
will unpack well on most operating systems?

If you check the list archives, you will notice a discussion a few years back.
One of the outcomes was that it's a bit messy to use ZIP with filenames in encodings other than ASCII.


I would suggest that you tar your files and compress them with gzip (or bzip2). Will these work on Windows?
Try extracting the files with 7-Zip. I would appreciate it if you could report back on this.
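
As one way to produce such a tarball for testing, here is a small sketch using Python's standard tarfile module. It writes the archive in the POSIX pax format, which records non-ASCII names as UTF-8 in extended headers. The helper names make_tar_gz and list_names and the sample filenames are invented for the example. It only shows that the names survive a write and read with the same library; whether a given Windows extractor handles them is exactly what would need to be tested.

```python
import io
import tarfile


def make_tar_gz(files):
    """Pack {name: bytes} into an in-memory .tar.gz in pax format."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz",
                      format=tarfile.PAX_FORMAT) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_names(tar_gz_bytes):
    """Return the member names stored in a .tar.gz archive."""
    with tarfile.open(fileobj=io.BytesIO(tar_gz_bytes), mode="r:gz") as tar:
        return tar.getnames()


if __name__ == "__main__":
    archive = make_tar_gz({"ελληνικά.txt": b"one", "日本語.txt": b"two"})
    print(list_names(archive))
```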


Speaking of 7-Zip, the 7z format is another option as well.

Simos


-- Linux-UTF8: i18n of Linux on all levels Archive: http://mail.nl.linux.org/linux-utf8/