
Korean input using Xkb (was Re: switching to UTF-8)





On Thu, 2 May 2002, Pablo Saratxaga wrote:

> On Thu, May 02, 2002 at 02:14:08AM -0400, Jungshik Shin wrote:
>
> >   BTW, Xkb may work for Korean Hangul, too and we don't need
> > XIM  if we use 'three-set keyboard' instead of 'two-set keyboard'

> If it is indeed doable, it wouldn't be very practical due to the high
> amount of possible combinations (the Xkb based solution consist to

  As you correctly noticed, I was talking about using U+1100 Jamos.
Some Koreans believe that the greatest mistake of the Korean national
standards body was to insist that 11,172 precomposed syllables be encoded
in Unicode/ISO 10646, and to prevail in ISO/IEC JTC1/SC2/WG2. The 11,172
precomposed syllables are not sufficient _even for modern_ Korean, and we
need U+1100 Jamo support anyway for Middle (and future) Korean. Including
the 11,172 syllables only delayed support for U+1100 Jamos by giving (or
rather strengthening) a **false** impression to developers that Hangul
doesn't need to be treated as a complex script (as Indic and Thai scripts
are). For instance, Sun's complex text/script support plan _did_ not mention
Korean Hangul while listing various South and Southeast Asian scripts,
Hebrew, and Arabic as the targets of complex text processing.  It's
frustrating to have to debunk the myth (that Korean Hangul can be treated
just like the Japanese and Chinese writing systems and that it has nothing
in common with South and Southeast Asian scripts) time and again. At the
moment, only Microsoft fully understands the issue and offers the full
range of Hangul support. Fortunately, Pango is moving in the right
direction, and hopefully ST and ATSUI (of MacOS) will help, too.

> But it's true that hangul-only typing doesn't require any user
> interactivity at all; an on-the-spot method that just analyzes the input
> and converts it to preformed hangul syllables on the fly is enough.

  Yup. It's basically a not-so-complex automaton. For a three-set
keyboard it's very simple, while for a two-set keyboard it's a bit more
complicated. An automaton for a two-set Middle Korean keyboard would be
much more complicated than one for a three-set Middle Korean keyboard.
Of course, Hanja input does require dictionary look-up and user interaction.
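
  To illustrate how simple the three-set case can be, here is a rough
Python sketch (not taken from any existing input method). It assumes the
keyboard already emits U+1100 conjoining jamos, so leading and trailing
consonants arrive as distinct code points, and it folds each
leading + vowel (+ trailing) run into a precomposed syllable using the
arithmetic from the Unicode standard. The name compose_three_set is made
up for this example. Compound vowels typed as two keystrokes and Middle
Korean jamos are not handled.

```python
# Base code points and counts for modern Hangul (from the Unicode standard).
L_BASE, V_BASE, T_BASE, S_BASE = 0x1100, 0x1161, 0x11A7, 0xAC00
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28


def compose_three_set(jamos: str) -> str:
    """Hypothetical helper: fold L V [T] jamo runs into precomposed syllables."""
    out = []
    i, n = 0, len(jamos)
    while i < n:
        l = ord(jamos[i]) - L_BASE
        if 0 <= l < L_COUNT and i + 1 < n:
            v = ord(jamos[i + 1]) - V_BASE
            if 0 <= v < V_COUNT:
                i += 2
                t = 0  # 0 means "no trailing consonant"
                if i < n:
                    cand = ord(jamos[i]) - T_BASE
                    if 0 < cand < T_COUNT:
                        t = cand
                        i += 1
                out.append(chr(S_BASE + (l * V_COUNT + v) * T_COUNT + t))
                continue
        # Anything that does not start an L V run is passed through as-is.
        out.append(jamos[i])
        i += 1
    return "".join(out)


# HIEUH + A + trailing NIEUN, then KIYEOK + EU + trailing RIEUL
hangul = compose_three_set("\u1112\u1161\u11ab\u1100\u1173\u11af")
```

  No look-ahead beyond the next jamo is needed here, because a trailing
consonant can never be mistaken for the leading consonant of the next
syllable.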


> Now, if by "Xkb is enough to type Korean" you meant typing directly
> the single jamos without composing, yes, that's perfectly possible;

  Note that I gave two conditions under which that might be possible.
One of them is that a 'three-set' keyboard is used. A 'three-set'
keyboard distinguishes between leading consonants and trailing
consonants, while a 'two-set' keyboard doesn't.
The other is that we use U+1100 Jamos to represent Hangul.
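
  The leading/trailing distinction is visible in Unicode itself. As a
small illustration (Python's standard unicodedata module, chosen only for
convenience), the same consonant has separate code points for the leading
and trailing positions in the U+1100 block, whereas the compatibility jamo
in the U+3130 block carries no position at all:

```python
import unicodedata

# The same consonant (kiyeok) in three roles.
leading = unicodedata.name("\u1100")    # conjoining jamo, leading position
trailing = unicodedata.name("\u11a8")   # conjoining jamo, trailing position
compat = unicodedata.name("\u3131")     # compatibility jamo, no position

for cp, name in ((0x1100, leading), (0x11A8, trailing), (0x3131, compat)):
    print(f"U+{cp:04X} {name}")
```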

> but the produced output won't be in the standardized precomposed form
> for the common Korean syllables, which could be a compatibility problem
> if you exchange files written that way.

  Well, I have to put 'standard' in quotes in "standard precomposed form" :-).
It's certainly true that the precomposed form is widely used for Korean.
However, many people, including me, want to go all the way to using
U+1100 Hangul Jamos exclusively for both modern and Middle Korean once
a large enough number of programs and fonts support that.  To achieve
backward compatibility, pre- and post-processing (to convert modern
syllables into and out of NFC, i.e. precomposed, forms) can be done if
necessary.
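
  As a sketch of what such pre- and post-processing could look like,
here is the round trip in Python using the standard unicodedata module
(the choice of Python is only for illustration; any Unicode normalization
library should behave the same way for modern syllables):

```python
import unicodedata

# A modern syllable in precomposed form (U+D55C).
precomposed = "\ud55c"

# NFD breaks it into U+1100 conjoining jamos:
# U+1112 (leading HIEUH), U+1161 (vowel A), U+11AB (trailing NIEUN).
decomposed = unicodedata.normalize("NFD", precomposed)

# NFC puts the jamo sequence back into the precomposed syllable.
recomposed = unicodedata.normalize("NFC", decomposed)

print([f"U+{ord(c):04X}" for c in decomposed])
print(recomposed == precomposed)
```

  Jamo sequences with no precomposed equivalent, such as Middle Korean
syllables, are simply left as jamos by NFC.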

   Jungshik Shin

P.S. Attached is an example of Xkb definition for a 3-set Korean keyboard. It's
made by PARK Won Kyu <wkpark@xxxxxxxxxxxxxxx>.
xkb_keymap "korea3fin" {
    xkb_keycodes        { include "xfree86"             };
    xkb_types           { include "default"             };
    xkb_compatibility   { include "default"             };
    xkb_geometry        { include "pc(pc102)"           };

xkb_symbols  {

    include "en_US(pc105)+group(toggle)"

    name[Group1]= "US/ASCII";
    name[Group2]= "Korean";
    key <TLDE> {	[	grave,	asciitilde	],
			[	asterisk,	hexagram	]	};
    key <AB01> {	[	z,	Z	],
			[	Hangul_J_Mieum,	Hangul_J_Cieuc	]	};
    key <AC01> {	[	a,	A	],
			[	Hangul_J_Ieung,	Hangul_J_Dikeud	]	};
    key <AD01> {	[	q,	Q	],
			[	Hangul_J_Sios,	Hangul_J_Phieuf	]	};
    key <AE01> {	[	1,	exclam	],
			[	Hangul_J_Hieuh,	Hangul_J_SsangKiyeog	]	};
    key <AB02> {	[	x,	X	],
			[	Hangul_J_Kiyeog,	Hangul_J_PieubSios	]	};
    key <AC02> {	[	s,	S	],
			[	Hangul_J_Nieun,	Hangul_J_NieunHieuh	]	};
    key <AD02> {	[	w,	W	],
			[	Hangul_J_Rieul,	Hangul_J_Tieut	]	};
    key <AE02> {	[	2,	at	],
			[	Hangul_J_SsangSios,	Hangul_J_RieulKiyeog	]	};
    key <AB03> {	[	c,	C	],
			[	Hangul_E,	Hangul_J_Khieuq	]	};
    key <AC03> {	[	d,	D	],
			[	Hangul_I,	Hangul_J_RieulPieub	]	};
    key <AD03> {	[	e,	E	],
			[	Hangul_YEO,	Hangul_J_NieunJieuj	]	};
    key <AE03> {	[	3,	numbersign	],
			[	Hangul_J_Pieub,	Hangul_J_Jieuj	]	};
    key <AB04> {	[	v,	V	],
			[	Hangul_O,	Hangul_J_KiyeogSios	]	};
    key <AC04> {	[	f,	F	],
			[	Hangul_A,	Hangul_J_RieulMieum	]	};
    key <AD04> {	[	r,	R	],
			[	Hangul_AE,	Hangul_J_RieulHieuh	]	};
    key <AE04> {	[	4,	dollar	],
			[	Hangul_YO,	Hangul_J_RieulPhieuf	]	};
    key <AB05> {	[	b,	B	],
			[	Hangul_U,	question	]	};
    key <AC05> {	[	g,	G	],
			[	Hangul_EU,	Hangul_YAE	]	};
    key <AD05> {	[	t,	T	],
			[	Hangul_EO,	Hangul_J_RieulSios	]	};
    key <AE05> {	[	5,	percent	],
			[	Hangul_YU,	Hangul_J_RieulTieut	]	};
    key <AB06> {	[	n,	N	],
			[	Hangul_Sios,	minus	]	};
    key <AC06> {	[	h,	H	],
			[	Hangul_Nieun,	0	]	};
    key <AD06> {	[	y,	Y	],
			[	Hangul_Rieul,	5	]	};
    key <AE06> {	[	6,	asciicircum	],
			[	Hangul_YA,	equal	]	};
    key <AB07> {	[	m,	M	],
			[	Hangul_Hieuh,	quotedbl	]	};
    key <AC07> {	[	j,	J	],
			[	Hangul_Ieung,	1	]	};
    key <AD07> {	[	u,	U	],
			[	Hangul_Dikeud,	6	]	};
    key <AE07> {	[	7,	ampersand	],
			[	Hangul_YE,	leftdoublequotemark	]	};
    key <AB08> {	[	comma,	less	],
			[	comma,	comma	]	};
    key <AC08> {	[	k,	K	],
			[	Hangul_Kiyeog,	2	]	};
    key <AD08> {	[	i,	I	],
			[	Hangul_Mieum,	7	]	};
    key <AE08> {	[	8,	asterisk	],
			[	Hangul_YI,	rightdoublequotemark	]	};
    key <AB09> {	[	period,	greater	],
			[	period,	period	]	};
    key <AC09> {	[	l,	L	],
			[	Hangul_Jieuj,	3	]	};
    key <AD09> {	[	o,	O	],
			[	Hangul_Cieuc,	8	]	};
    key <AE09> {	[	9,	parenleft	],
			[	Hangul_U,	rightsinglequotemark	]	};
    key <AB10> {	[	slash,	question	],
			[	Hangul_O,	exclam	]	};
    key <AC10> {	[	semicolon,	colon	],
			[	Hangul_Pieub,	4	]	};
    key <AD10> {	[	p,	P	],
			[	Hangul_Phieuf,	9	]	};
    key <AE10> {	[	0,	parenright	],
			[	Hangul_Khieuq,	asciitilde	]	};
    key <AC11> {	[	apostrophe,	quotedbl	],
			[	Hangul_Tieut,	period	]	};
    key <AD11> {	[	bracketleft,	braceleft	],
			[	parenleft,	percent	]	};
    key <AE11> {	[	minus,	underscore	],
			[	parenright,	semicolon	]	};
    key <AD12> {	[	bracketright,	braceright	],
			[	less,	slash	]	};
    key <AE12> {	[	equal,	plus	],
			[	greater,	plus	]	};
    key <BKSL> {	[	backslash,	bar	],
			[	colon,	slash	]	};
    // End alphanumeric section

    // Begin modifier mappings

//    modifier_map Shift  { Shift_L };
//    modifier_map Lock   { Caps_Lock, ISO_Lock };
//    modifier_map Control{ Control_L };
//    modifier_map Mod3   { Mode_switch };
};
};