Unicode transliteration table package: transtab
I have decided to publish and maintain my transliteration table as a
proper package from now on:
http://www.cl.cam.ac.uk/~mgk25/download/transtab.tar.gz
Greeklish, Greek polytonic->monotonic downgrading, Cyrillic and all the
non-European scripts are still missing, but the rest is already in
pretty good shape. The table comes in ISO/IEC TR 14652 format, to
allow simple inclusion into POSIX locale definition files.
If you feel like providing a (preferably Latin) transliteration for a
script that is not yet covered, let me know which one, and I'll prepare
a template file for you that you can fill in very easily. A Perl script
and a Unix Makefile for reformatting the table are also included.
Pointers to any information on transliteration are also very welcome.
ISO has, for example:

  ISO 9:1995 Information and documentation -- Transliteration of
    Cyrillic characters into Latin characters -- Slavic and non-Slavic
    languages
  ISO 233:1984 Documentation -- Transliteration of Arabic characters
    into Latin characters
  ISO 233-2:1993 Information and documentation -- Transliteration of
    Arabic characters into Latin characters -- Part 2: Arabic language
    -- Simplified transliteration
  ISO 233-3:1999 Information and documentation -- Transliteration of
    Arabic characters into Latin characters -- Part 3: Persian language
    -- Simplified transliteration (available in English only)
  ISO 259:1984 Documentation -- Transliteration of Hebrew characters
    into Latin characters
  ISO 259-2:1994 Information and documentation -- Transliteration of
    Hebrew characters into Latin characters -- Part 2: Simplified
    transliteration
  ISO 9984:1996 Information and documentation -- Transliteration of
    Georgian characters into Latin characters
  ISO 9985:1996 Information and documentation -- Transliteration of
    Armenian characters into Latin characters
  ISO 11940:1998 Information and documentation -- Transliteration of
    Thai
  ISO/TR 11941:1996 Information and documentation -- Transliteration
    of Korean script into Latin characters

but I have access to only some of these (in the form of my copy of
ISO's 1991 Information and documentation handbook).
Please email me any pointers to good documents describing the
established practices that people use in email and on typewriters to
represent unavailable characters, and let me know where my table differs
from these.
Markus
--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/