charset handling by mail program
This isn't really about UTF-8, but I'm guessing people here are
experienced with working in a multi-charset environment and can
perhaps help me.
I'm thinking of making mutt (an MUA - www.mutt.org) handle the charset
of an attachment as follows.
There are two configuration variables, local_charsets and
send_charsets, each set to a colon-separated list of charsets.
When a file is attached, and before you actually send the message, the
MIME parts are listed. Each text part has two charsets associated with
it in the list: the original charset and the target charset.
When a file is attached, the original charset is set to the first
local_charset that makes sense, i.e. the first one from which iconv
can convert the file without an EILSEQ error. As an exception, a file
containing bytes in the range \x80-\x9f will not be accepted as
iso-8859-X. Perhaps some other exceptions are needed here?
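The detection rule above could be sketched roughly as follows. This is
Python rather than mutt's C, and a strict decode stands in for an iconv
conversion that fails with EILSEQ. The function name
guess_original_charset is hypothetical, not anything in mutt.

```python
from typing import List, Optional


def guess_original_charset(data: bytes, local_charsets: List[str]) -> Optional[str]:
    """Return the first local charset in which `data` is well-formed.

    Hypothetical helper: a strict decode plays the role of an iconv
    conversion that must not fail with EILSEQ.
    """
    for cs in local_charsets:
        try:
            data.decode(cs)  # strict by default: raises on malformed input
        except (UnicodeDecodeError, LookupError):
            continue  # malformed in this charset, or charset unknown
        # Exception from the proposal: bytes \x80-\x9f rule out iso-8859-X.
        if cs.lower().startswith("iso-8859-") and any(0x80 <= b <= 0x9F for b in data):
            continue
        return cs
    return None  # nothing in local_charsets fits


local = "utf-8:iso-8859-1".split(":")
print(guess_original_charset("é".encode("utf-8"), local))  # utf-8
print(guess_original_charset(b"caf\xe9", local))           # iso-8859-1
```

The \x80-\x9f check matters because a converter from iso-8859-1 will
usually accept every byte value, so without it iso-8859-1 would match
almost any file.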
The target charset is then set to the first send_charset into which
the file can be converted in a reversible fashion. Failing that, the
original charset is used as the target charset.
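A minimal sketch of the target selection, again in Python and with
hypothetical names (choose_target_charset is not a mutt function).
"Reversible" is taken here to mean that converting to the candidate
charset and back yields the original text. With iconv the same check
would have to guard against transliteration or dropped characters.

```python
from typing import List


def choose_target_charset(data: bytes, original: str, send_charsets: List[str]) -> str:
    """Return the first send charset that can hold `data` reversibly.

    Hypothetical helper; falls back to `original` if none qualifies.
    """
    text = data.decode(original)
    for cs in send_charsets:
        try:
            encoded = text.encode(cs)  # strict: fails on unrepresentable chars
        except (UnicodeEncodeError, LookupError):
            continue
        if encoded.decode(cs) == text:  # round trip must give back the text
            return cs
    return original


send = "us-ascii:iso-8859-1:iso-8859-3:utf-8".split(":")
print(choose_target_charset(b"hello", "utf-8", send))                # us-ascii
print(choose_target_charset("café".encode("utf-8"), "utf-8", send))  # iso-8859-1
```

Because the list is scanned in order, putting us-ascii first means
pure ASCII files are always labelled us-ascii, and utf-8 at the end
acts as the catch-all.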
Now there are two commands available to the user before he or she
sends the message:
(edit_charset) Sets the original charset (it will be an error if the
file is malformed with respect to that charset) and causes the target
charset to be recomputed.
(change_charset) Sets the target charset. A warning will be issued if
the conversion is not reversible.
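The behaviour of the two commands might look something like this. It
is only a sketch: the TextPart class and the is_reversible helper are
invented for illustration, and Python's warnings module stands in for
whatever mutt would show the user.

```python
import warnings
from typing import List


def is_reversible(text: str, charset: str) -> bool:
    """True if `text` survives a round trip through `charset`."""
    try:
        return text.encode(charset).decode(charset) == text
    except (UnicodeError, LookupError):
        return False


class TextPart:
    """Hypothetical model of one text part in the attachment list."""

    def __init__(self, data: bytes, original: str, send_charsets: List[str]):
        self.data = data
        self.send_charsets = send_charsets
        self.edit_charset(original)

    def edit_charset(self, original: str) -> None:
        # Raises UnicodeDecodeError if the file is malformed in `original`;
        # in that case the part is left unchanged.
        text = self.data.decode(original)
        self.original = original
        # Recompute the target: first reversible send charset, else original.
        self.target = next(
            (cs for cs in self.send_charsets if is_reversible(text, cs)),
            original,
        )

    def change_charset(self, target: str) -> None:
        text = self.data.decode(self.original)
        if not is_reversible(text, target):
            warnings.warn(f"conversion to {target} is not reversible")
        self.target = target  # the user's choice is honoured regardless
```

Note that change_charset only warns, so the user can still force a
lossy target, whereas edit_charset refuses outright.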
I might configure mutt thus:
set local_charsets="utf-8:iso-8859-1"
set send_charsets="us-ascii:iso-8859-1:iso-8859-3:utf-8"
(Most people would have us-ascii at the start of send_charsets.)
Can anyone see a fundamental problem with this approach or suggest any
improvements?
(The same send_charsets would be used in deciding how to encode the
headers (RFC 2047, etc.), but I don't intend to allow any user
intervention for that.)
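For the headers, the same first-fit rule could be illustrated like
this, using Python's standard email.header module to produce the
RFC 2047 encoded-word. The function name encode_header and the utf-8
fallback are assumptions of this sketch, not part of the proposal.

```python
from email.header import Header
from typing import List


def encode_header(value: str, send_charsets: List[str]) -> str:
    """Encode a header value in the first send charset that can hold it.

    Hypothetical helper; falls back to utf-8 if no listed charset fits.
    """
    for cs in send_charsets:
        try:
            value.encode(cs)  # strict check that every character fits
        except (UnicodeEncodeError, LookupError):
            continue
        return Header(value, cs).encode()
    return Header(value, "utf-8").encode()


send = "us-ascii:iso-8859-1:iso-8859-3:utf-8".split(":")
print(encode_header("Hello", send))   # plain ASCII needs no encoded-word
print(encode_header("Grüße", send))   # an iso-8859-1 encoded-word
```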
Edmund
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/