Re: unicode_start
Andries.Brouwer@xxxxxx wrote:
> So, maybe you can save time and instead of installing something
> new, describe exactly what you do, what you would like to
> happen but what happens instead.
OK, here's the detailed and boring description...
My locale is en_GB.UTF-8. My keyboard is us_intl (in X) and
us-intl (on the console).
A) I leave X and go to console by means of CTRL-ALT-F2.
I enter the following:
cat >test.1 [enter]
é [enter]
CTRL-D
The é character is made by first typing ', then e.
I check the result by means of hexdump -C test.1, which gives:
00000000 e9 0a |..|
00000002
B) The same on the console, after running unicode_start, gives the
same result:
00000000 e9 0a |..|
00000002
I.e., the accented character is entered into the file as *one
byte* (0xE9; the 0x0A is of course the result of typing ENTER).
The only difference is that the é character is not *visible*
immediately when I type ', e.
C) Going back to X and doing the same in an xterm, hexdump -C test.1
produces:
00000000 c3 a9 0a |...|
00000003
In other words, in xterm (with UTF-8 locale) a proper UTF-8
representation of the é character is generated. Also the character
is visible when it is typed (like A, but unlike B).
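
The byte counts in cases A to C match what you get when é (U+00E9) is
written in a single-byte Latin-1 encoding versus UTF-8. A minimal Python
sketch of that comparison follows; it assumes ISO-8859-1 is the single-byte
encoding in play on the console, which is an inference from the 0xE9 byte
in the hexdump and not something the dumps prove on their own.

```python
# Assumption: the console's single-byte encoding is ISO-8859-1 (Latin-1).
E_ACUTE = "\u00e9"  # the character é


def line_bytes(encoding: str) -> bytes:
    """Return the bytes that 'é' followed by ENTER would put in the file."""
    return (E_ACUTE + "\n").encode(encoding)


latin1_line = line_bytes("iso-8859-1")  # what cases A and B wrote
utf8_line = line_bytes("utf-8")         # what case C wrote

print(latin1_line.hex(" "))  # e9 0a
print(utf8_line.hex(" "))    # c3 a9 0a
```

The first line of output corresponds to the dumps in A and B, the second to
the dump in C.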
Now I assume that the idea of 'putting the console in UTF-8 mode'
is to make the unicode console (case B) behave like the unicode
xterm (case C). But it doesn't.
For completeness, there is:
D) In an xterm, I run LANG=C xterm, creating a new xterm with the "C"
locale. This behaves exactly like case A.
The console *screen* gets properly switched by unicode_start and
unicode_stop, as can be seen by cat-ing the test file in the two
modes. The two-byte version of é is only displayed when unicode is
on, the one-byte version only when it is off.
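
The display side can be imitated in the same way. The sketch below decodes
each test file under the "wrong" interpretation to show why it does not
render as é there. It is only an illustration: it again assumes ISO-8859-1
for the non-Unicode mode, and Python's decoder stands in for whatever the
console actually does with its font and screen map.

```python
one_byte = b"\xe9\n"      # test file as written in cases A and B
two_byte = b"\xc3\xa9\n"  # test file as written in case C


def render(data: bytes, encoding: str) -> str:
    """Decode bytes the way a terminal in that mode might interpret them."""
    # errors="replace" substitutes U+FFFD for bytes that cannot be decoded.
    return data.decode(encoding, errors="replace")


# A lone 0xE9 is an incomplete sequence in UTF-8, so no é appears.
utf8_view_of_one_byte = render(one_byte, "utf-8")

# In Latin-1 each of the two bytes is taken as a separate character.
latin1_view_of_two_byte = render(two_byte, "iso-8859-1")

print(repr(utf8_view_of_one_byte))
print(repr(latin1_view_of_two_byte))
```

Under these assumptions neither file shows a plain é in the other mode,
which is consistent with what cat displays on the console.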
Regards, Jan
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/