Re: garbled file names on a linux/windows volume
- To: linux-utf8@xxxxxxxxxxxx
- Subject: Re: garbled file names on a linux/windows volume
- From: "Ray Chuan" <rctay89@xxxxxxxxx>
- Date: Sun, 2 Nov 2008 09:26:51 +0800
- In-reply-to: <c05211120810311549g72770647s44970127d2e4b1a3@mail.gmail.com>
- List-archive: <http://mail.nl.linux.org/linux-utf8/>
- List-help: <mailto:ecartis@nl.linux.org?Subject=help>
- List-id: <linux-utf8.nl.linux.org>
- List-owner: <mailto:ecartis-owner@nl.linux.org>
- List-post: <mailto:linux-utf8@nl.linux.org>
- List-software: Ecartis version 1.0.0
- List-subscribe: <mailto:linux-utf8-request@nl.linux.org?Subject=subscribe>
- List-unsubscribe: <mailto:linux-utf8-request@nl.linux.org?Subject=unsubscribe>
- References: <be6fef0d0810311051s6009d198l5a5c6c47cd5ebd3a@mail.gmail.com> <20081031203153.GA21134@mette> <c05211120810311549g72770647s44970127d2e4b1a3@mail.gmail.com>
- Reply-to: linux-utf8@xxxxxxxxxxxx
- Sender: linux-utf8-bounce@xxxxxxxxxxxx
thanks, that worked.
2008/10/31 Ben Wiley Sittler <bsittler@xxxxxxxxx>:
> if you need to fix a lot of these automatically from a shell script,
> you might consider something like this:
>
> python -c 'import sys, urllib; print urllib.unquote(" ".join(sys.argv[1:])).decode("utf-8").encode("iso-8859-1")' \
> '%C3%83%C2%A9' \
> '%C3%A4%C2%B8%C2%93%C3%A8%C2%BE%C2%91'
>
> é 专辑
>
> it works like "echo", but decodes the %-escaping and one of the levels
> of utf-8 encoding.
>
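The quoted one-liner is Python 2 (print statement, urllib.unquote on a byte string). A rough Python 3 sketch of the same idea might look like the following; the function name undo_double_utf8 is made up for illustration, and it assumes the input really was UTF-8 encoded twice with Latin-1 as the misread charset.

```python
import sys
from urllib.parse import unquote_to_bytes


def undo_double_utf8(escaped: str) -> str:
    """Undo %-escaping plus one extra layer of UTF-8 encoding.

    Hypothetical helper; assumes the bytes were UTF-8, misread as
    Latin-1 (ISO-8859-1), and then encoded to UTF-8 a second time.
    """
    raw = unquote_to_bytes(escaped)      # %C3%83... -> raw bytes
    outer = raw.decode("utf-8")          # strip the outer UTF-8 layer
    inner = outer.encode("iso-8859-1")   # each char back to a single byte
    return inner.decode("utf-8")         # decode the original UTF-8


if __name__ == "__main__":
    # Works like "echo": decode each argument and join with spaces.
    print(" ".join(undo_double_utf8(arg) for arg in sys.argv[1:]))
```

Unlike the Python 2 version, which writes the recovered UTF-8 bytes straight to the terminal, this one decodes them to a str and lets print handle the output encoding. If a name was not actually double-encoded, the encode or decode step will most likely raise a UnicodeError rather than return silently wrong text.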
> On Fri, Oct 31, 2008 at 1:31 PM, Andries E. Brouwer
> <Andries.Brouwer@xxxxxx> wrote:
>> On Sat, Nov 01, 2008 at 01:51:42AM +0800, Ray Chuan wrote:
>>
>>> using an edonkey client, which has a function to convert file names to
>>> url-friendly strings (aka ed2k links), i was able to see that "é"
>>> showed up as %C3%83%C2%A9, while the more complex "专辑"
>>> (专辑) would be %C3%A4%C2%B8%C2%93%C3%A8%C2%BE%C2%91.
>>
>> You converted twice to UTF-8, so have to go back once.
>>
>> (é is U+00e9 which is 11000011 10101001 in UTF-8, but if you read
>> the latter as Latin-1 and convert once more to UTF-8 you get
>> 11000011 10000011 11000010 10101001, that is, %C3%83%C2%A9 as you reported)
>>
>>
>
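The bit patterns in Andries's explanation can be reproduced with a few lines of Python 3. This is only an illustrative sketch; the helper names bits and percent are made up here and are not from the thread.

```python
def bits(data: bytes) -> str:
    """Render bytes as space-separated 8-bit binary groups."""
    return " ".join(f"{b:08b}" for b in data)


def percent(data: bytes) -> str:
    """Render bytes as %XX escapes, as in an ed2k link."""
    return "".join(f"%{b:02X}" for b in data)


# U+00E9 (e with acute accent), encoded to UTF-8 once: the correct form.
once = "\u00e9".encode("utf-8")

# Misread those bytes as Latin-1, then encode to UTF-8 a second time.
twice = once.decode("iso-8859-1").encode("utf-8")

print(bits(once))      # 11000011 10101001
print(bits(twice))     # 11000011 10000011 11000010 10101001
print(percent(twice))  # %C3%83%C2%A9
```

Each of the two original bytes is above 0x7F, so the second pass expands each one into two bytes, which is why the doubly encoded form is four bytes long.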
--
Cheers,
Ray Chuan