
Re: intelligent charset recognition for irc



Dear Martin:

> Well, on the irc channels I am on, iso 646 is still used, however
> it is typed manually by people without access to a Swedish
> keyboard. You are right that this is a crude hack, but there is no
> way to specify your charset on irc, unfortunately.

This is not true; IRC can already do better than that:

	/VERSION
	*** Client: ircII 4.4M (internal version 20000126)
	/HELP SET TRANSLATION
	*** Help on translation
	Usage: SET TRANSLATION <character translation table>
	  The TRANSLATION variable defines a character translation
	  table.  By default, ircII assumes that all text processed
	  over the network is in the ISO 8859/1 map, also known as
	  Latin-1.  This is identical to standard ASCII, except that
	  it is extended with additional characters in the range
	  128-255.  Many environments by default use the Latin-1 map,
	  such as X Windows, MS Windows, AmigaDOS, and modern ANSI
	  terminals including Digital VT200, VT300, VT400 series and
	  MS-Kermit.  However, many older environments use non-standard
	  extensions to ASCII, and yet others use 7-bit national
	  replacement sets.

	  Some available settings for the TRANSLATION variable:

	  8-bit sets:
	    HP_MCS              Hewlett Packard Extended Roman 8.
	    MACINTOSH           Apple Macintosh computers and boat
	                        anchors.
	    CP437               Old IBM PC, compatibles and Atari ST.
	    CP850               New IBM PC compatibles and IBM PS/2.
	    DEC_MCS             DEC Multinational Character Set.
	                        VAX/VMS.  VT320's and other 8-bit
	                        Digital terminals use this set by
	                        default, but I recommend changing to
	                        Latin-1 in the terminal Set-Up.
	    DG_MCS              Data General Multinational Character Set.
	    NEXT                NeXT.

	  7-bit sets:
	    ASCII               ANSI ASCII, ISO Reg. 006.  For American
	                        terminals in 7-bit environments.  Default.
	    DANISH              Norwegian/Danish.
	    DUTCH               Dutch.
	    FINNISH             Finnish.
	    FRENCH              ISO French, ISO Reg. 025.
	    FRENCH_CANADIAN     French in Canada.
	    GERMAN              ISO German, ISO Reg. 021.
	    IRV                 International Reference Version, ISO
	                        Reg. 002.  For pedantic use in ISO 646
	                        environments.
	    ITALIAN             ISO Italian, ISO Reg. 015.
	    JIS                 JIS ASCII, ISO Reg. 014.  Japanese
	                        ASCII hybrid.
	    NORWEGIAN_1         ISO Norwegian, Version 1, ISO Reg. 060.
	    NORWEGIAN_2         ISO Norwegian, Version 2, ISO Reg. 061.
	    POLISH              Converts windows codepage 1250 to ISO-8859-2
	    POLISH_NOPL         Converts both cp1250 and iso8859-2 to latin
	                        equivalents
	    PORTUGUESE          ISO Portuguese, ISO Reg. 016.
	    PORTUGUESE_COM      Portuguese on Digital terminals.
	    RUSSIAN             Russian.
	    RUSSIAN_ALT         Alternative Russian.
	    RUSSIAN_WIN         Russian with Windows.
	    SPANISH             ISO Spanish, ISO Reg. 017.
	    SWEDISH             ISO Swedish, ISO Reg. 010.
	    SWEDISH_NAMES       ISO Swedish for Names, ISO Reg. 011.
	    SWEDISH_NAMES_COM   Swedish.  Digital, Hewlett Packard.
	    SWISS               Swiss.
	    UNITED_KINGDOM      ISO United Kingdom, ISO Reg. 004.
	    UNITED_KINGDOM_COM  United Kingdom on DEC and HP terminals.
	See Also:
	  DIGRAPH
	  BIND ENTER_DIGRAPH
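
To illustrate what such a translation table does, here is a minimal
sketch in Python. It is not taken from ircII's translat.c; the table
name and function name are hypothetical, and the table covers only
the six Swedish letters that replace [ \ ] { | } in the 7-bit
national set, ignoring the other positions where that set differs
from ASCII.

```python
# Hypothetical partial table: Swedish 7-bit national replacement
# characters -> Latin-1 letters.  Only the six letters are covered.
SWEDISH_TO_LATIN1 = bytes.maketrans(
    b"[\\]{|}",                     # 7-bit code positions as seen in ASCII
    b"\xc4\xd6\xc5\xe4\xf6\xe5",    # Latin-1 bytes for Ä Ö Å ä ö å
)


def swedish7_to_latin1(data: bytes) -> bytes:
    """Map Swedish 7-bit text to Latin-1 with a 256-entry lookup table."""
    return data.translate(SWEDISH_TO_LATIN1)


if __name__ == "__main__":
    # The 7-bit spelling of a word containing ä, ö and å.
    print(swedish7_to_latin1(b"r{ksm|rg}s").decode("latin-1"))
```

Note that a table like this cannot tell a literal brace from a
letter, which is why the receiving side has to know (or guess) which
set the sender used.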

In http://czyborra.com/utf/#UTF-8 I wrote:

	PGP 5.0i and IRC II-4.4 still use Latin1 as their canonical
	text encoding instead of UTF-8: {cp850,ebcdic}_to_latin1
	in pgp-5.0i/src/lib/pgp/helper/pgpCharMap.c and
	ircii-4.4/source/translat.c 

They oughta move to UTF-8, though.
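
The step from Latin-1 to UTF-8 is mechanically simple, since Latin-1
code points coincide with the first 256 Unicode code points. A
minimal sketch in Python, assuming the input really is Latin-1 (the
function name is hypothetical, and this is not code from either
program mentioned above):

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode Latin-1 bytes as UTF-8, byte by byte."""
    out = bytearray()
    for b in data:
        if b < 0x80:
            # ASCII range: one byte, unchanged.
            out.append(b)
        else:
            # U+0080..U+00FF: two bytes, 110xxxxx 10xxxxxx.
            out.append(0xC0 | (b >> 6))
            out.append(0x80 | (b & 0x3F))
    return bytes(out)


if __name__ == "__main__":
    sample = b"sm\xf6rg\xe5s"  # "smörgås" in Latin-1
    # Cross-check against Python's built-in codecs.
    assert latin1_to_utf8(sample) == sample.decode("latin-1").encode("utf-8")
```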

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/