
Re: UTF-8 keyboard mode



Birger Langkjer wrote on 1999-09-16 22:44 UTC:
> In yudit, to write the unicode character U+1260, you first type 'b' and
> then 'e' to get the ethiopian 'be' sign, but less could interpret 'b' as
> the accelerator key unless it is in search mode. 

The encoding used by the keyboard driver and the input method used to
enter a character are two completely orthogonal (independent) issues.

My point was just that whenever the Linux console (screen) driver is
switched to the UTF-8 encoding, the keyboard should fully automatically
be switched as well. Two mechanisms seem to me most feasible here:

  a) write "stty iutf8"

  b) print ESC % G  (as defined in ISO 2022)
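
As a rough sketch of what the two mechanisms amount to on a Linux
system that has an IUTF8 termios flag (an assumption here; the helper
names with_iutf8 and switch_to_utf8 are made up for illustration):

```python
import os
import termios

# IUTF8 input flag. The fallback is the Linux value from
# <asm-generic/termbits.h>; it is only used if this Python build
# does not export termios.IUTF8 (assumption, Linux only).
IUTF8 = getattr(termios, "IUTF8", 0o40000)

UTF8_ON = b"\x1b%G"   # ESC % G: select UTF-8
UTF8_OFF = b"\x1b%@"  # ESC % @: return to the ISO 2022 default


def with_iutf8(attrs, enable=True):
    """Return a copy of a tcgetattr() list with IUTF8 set or cleared."""
    new = list(attrs)
    # attrs[0] is the input-mode flag word (iflag).
    new[0] = (new[0] | IUTF8) if enable else (new[0] & ~IUTF8)
    return new


def switch_to_utf8(in_fd, out_fd):
    """Apply method (a) to in_fd, then method (b) to out_fd."""
    attrs = termios.tcgetattr(in_fd)
    termios.tcsetattr(in_fd, termios.TCSANOW, with_iutf8(attrs))
    os.write(out_fd, UTF8_ON)
```

Calling switch_to_utf8(0, 1) from a program attached to a console
would do roughly what "stty iutf8" followed by printing ESC % G does.
Neither step touches the keymap or any input method.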

The first method has the advantage that the generic tty driver also
becomes aware of the switch, so that cooked-mode processing (kernel
line editing) can take into account that we now use UTF-8.

Neither of these should under any circumstances change the input method!
The only thing that could change is that a few key combinations that
were meaningless before now give access to new characters that are not
available without UTF-8. For instance, I already have the following on
my GB keymap:

  AltGr+E  euro
  AltGr+-  en-dash
  AltGr+_  em-dash
  AltGr+\  left quotation mark
  AltGr+/  right quotation mark
  Shift+AltGr+\  double left quotation mark
  Shift+AltGr+/  double right quotation mark

and a couple of mathematical symbols on various other combinations.
There exist many ways of extending keyboard repertoires: numeric entry,
compose key functions, level 3 shift, non-spacing keys, special
language- or domain-specific input methods that are activated by some
hotkey, etc.
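
To illustrate why such keys only become useful once UTF-8 is on, here
is a small sketch that checks the characters listed above against
ISO 8859-1. The code point assignments are my reading of the names in
the list (e.g. "left quotation mark" taken as U+2018), not something
taken from an actual keymap file:

```python
# Characters from the keymap list, by assumed Unicode code point.
EXTRA_KEYS = {
    "euro": "\u20ac",
    "en-dash": "\u2013",
    "em-dash": "\u2014",
    "left quotation mark": "\u2018",
    "right quotation mark": "\u2019",
    "double left quotation mark": "\u201c",
    "double right quotation mark": "\u201d",
}


def encodable(ch, encoding):
    """Return True if ch can be represented in the given encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


for name, ch in EXTRA_KEYS.items():
    print(
        f"{name:28} U+{ord(ch):04X}"
        f"  latin-1: {encodable(ch, 'latin-1')}"
        f"  utf-8 bytes: {ch.encode('utf-8').hex()}"
    )
```

None of these characters has an ISO 8859-1 byte, so the key
combinations had nothing to send before; in UTF-8 each one becomes a
three-byte sequence.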

If you want to activate a special input method for comfortable entry of
Ethiopian, then you have to do this completely separately from the switch
to UTF-8. In this case, you have to know how it works and how you can
continue to enter ASCII characters when you are in Ethiopian mode. A
well-designed Ethiopian mode will still allow you to enter ASCII
characters reasonably well, such that you can enter a "b" to scroll
backwards with less. Hotkeys are the most common way of switching
between various input methods (Ctrl-F1 = normal, Ctrl-F2 = Cyrillic,
Ctrl-F3 = Ethiopian, etc.).
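
The separation can be sketched as two independent layers. This is a toy
model, not how any real keyboard driver is written: the compose and
encode functions are hypothetical, and the only table entry
('b', 'e' gives U+1260) is the one from the quoted yudit example.

```python
# Toy composition table for an "Ethiopian mode" (illustrative only).
ETHIOPIAN = {("b", "e"): "\u1260"}
NORMAL = {}  # normal mode: every key passes through unchanged


def compose(keys, table):
    """Input-method layer: turn a sequence of key presses into characters.

    Any two-key sequence found in `table` is replaced; all other keys
    pass through unchanged, so plain ASCII stays reachable.
    """
    out = []
    i = 0
    while i < len(keys):
        pair = tuple(keys[i:i + 2])
        if len(pair) == 2 and pair in table:
            out.append(table[pair])
            i += 2
        else:
            out.append(keys[i])
            i += 1
    return "".join(out)


def encode(text, utf8_mode):
    """Encoding layer: serialise characters into the bytes sent to the tty."""
    codec = "utf-8" if utf8_mode else "latin-1"
    return text.encode(codec, errors="replace")


print(encode(compose("be", ETHIOPIAN), utf8_mode=True))   # Ethiopian + UTF-8
print(encode(compose("be", NORMAL), utf8_mode=True))      # normal + UTF-8
print(encode(compose("be", ETHIOPIAN), utf8_mode=False))  # Ethiopian + Latin-1
```

Switching utf8_mode never changes what compose returns, and switching
the table never changes how encode works. A hotkey would only select
which table is passed to compose.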

> > > You will spot non-UTF-8 file quickly, because they look funny in your
> > > UTF-8 terminal emulator.
> > 
> > Can you define funny??
> 
> Usually you'll see a little white '?' on a black background instead of
> the non-ASCII character. Not exactly hilarious but perhaps good for a
> little giggle :-)

This glyph is Unicode character U+FFFD, known as the "replacement
character" and commonly used to represent anything that cannot be
shown as a Unicode character, e.g. an illegal UTF-8 sequence or a
character from another character set that is not found in Unicode.
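
A quick way to see this effect outside a terminal emulator is to decode
ISO 8859-1 bytes as if they were UTF-8. The sketch below uses Python's
built-in decoder with errors="replace"; the sample word and the helper
name looks_like_utf8 are just for illustration:

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER

# "café" stored in ISO 8859-1: the final byte 0xE9 is not valid UTF-8 here.
latin1_bytes = "caf\u00e9".encode("latin-1")

# A UTF-8 decoder substitutes U+FFFD for the malformed byte.
shown = latin1_bytes.decode("utf-8", errors="replace")
print(shown)


def looks_like_utf8(data):
    """Return True if data decodes as UTF-8 without any errors."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

Pure ASCII text passes this check as well, since ASCII is a subset of
UTF-8; only files with non-ASCII bytes in another encoding stand out.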

> 10 years is too long. 

Most magnetic storage media become somewhat unreliable after 5-8 years.
That sets an upper limit on how long a PC installation can go without a
major reinstallation. UTF-8 will certainly arrive much sooner.

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/