UTF-8 tin
Hello, tin hackers, and utf-8 people. :)
I have put together a small patch against tin which partially adds UTF-8
support. At the moment it needs a UTF-8 terminal, and depends upon a
recent CVS version of libunicode (see
http://developer.gnome.org/tools/cvs.html to get it). Actually, it only
uses iconv from that, so it would be easy to port to glibc 2.1 or other
systems with iconv(3). The dependency upon a UTF-8 terminal should be
trivial to fix, though.
Features are:
* Will correctly display UTF-8 articles.
* Will correctly display articles in other character sets that
iconv knows about, if they are Content-Transfer-Encoding: 8bit.
Right now, if articles are tagged as being in US-ASCII or
ISO-8859-1 (or are untagged), it assumes they are in Windows-1252. This
is due to the vast proliferation of broken Windows news clients. I am
not sure whether this behaviour is desirable.
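To make the fallback rule above concrete, here is a minimal sketch. It is in Python rather than tin's C, and it is not the patch itself: the function name, the alias set and the use of a replacement character for undecodable bytes are all assumptions made for illustration.

```python
from typing import Optional

# Hypothetical alias set: charset labels that are treated as Windows-1252.
TREAT_AS_WINDOWS_1252 = {"us-ascii", "ascii", "iso-8859-1", "latin1"}

def article_to_text(raw: bytes, declared_charset: Optional[str]) -> str:
    """Decode an 8bit article body using the fallback rule described above."""
    label = (declared_charset or "").strip().lower()
    if not label or label in TREAT_AS_WINDOWS_1252:
        codec = "cp1252"          # untagged, US-ASCII or ISO-8859-1
    else:
        codec = label             # trust any other declared charset
    try:
        # Undecodable bytes become U+FFFD instead of raising (an assumption).
        return raw.decode(codec, errors="replace")
    except LookupError:
        # Charset the converter does not know: fall back to Windows-1252.
        return raw.decode("cp1252", errors="replace")
```

For example, the bytes 0x93 and 0x94 are C1 control characters in ISO-8859-1 but curly quotes in Windows-1252, so article_to_text(b"\x93hi\x94", "ISO-8859-1") yields the quoted form. The cost is that a genuinely ISO-8859-1 article using C1 controls would be misread, which is the trade-off in question.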
Right now, however:
* there is no support for decoding multibyte character sets.
I can't see how to do this without rewriting mm_decode. Ideas?
* base64 encoded articles aren't sent through the charset converter.
* I was observing some odd behaviour with quoted-printable articles.
* finally, there is no support for converting from raw 8bit characters
in the header to UTF-8. I am uncertain of how to do this. Have you any
thoughts?
* oh, and it doesn't convert character sets for quoted text, etc.
(anything I forgot?)
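One possible ordering for the base64 and quoted-printable cases is to undo the transfer encoding first and only then run the charset conversion. The sketch below shows that ordering in Python rather than tin's C; it does not touch mm_decode, and the function name and parameters are hypothetical.

```python
import base64
import quopri

def decode_body(raw: bytes, transfer_encoding: str, charset: str) -> str:
    """Undo Content-Transfer-Encoding, then convert the charset."""
    cte = transfer_encoding.strip().lower()
    if cte == "base64":
        payload = base64.b64decode(raw)
    elif cte == "quoted-printable":
        payload = quopri.decodestring(raw)
    else:
        # 7bit, 8bit and binary carry the octets unchanged.
        payload = raw
    # Charset conversion runs on the decoded octets in every case.
    # Undecodable bytes become U+FFFD (an assumption of this sketch).
    return payload.decode(charset, errors="replace")
```

Because the conversion always sees raw octets, multibyte sequences split across quoted-printable escapes (such as =C3=A9) are reassembled before the converter looks at them.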
It can be obtained from here :
http://www.ecs.soton.ac.uk/~rwb197/tin-utf.tar.gz
Obviously this diff is in an unsuitable state to go into tin-devel right
now, but if it were finished, and preserved the existing behaviour on
systems without iconv, etc., would something like it be OK to go into tin?
--
Robert
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/