Re: Progress on xterm with combining characters, wcwidth
Robert Brady announced his xterm with support for double-width characters
and combining characters three weeks ago. [1]
I've modified CLISP Common Lisp [2] to take advantage of this feature:
- Added two functions
(char-width character) -> integer
(string-width string) -> integer
which return the number of screen columns needed for a character or string.
- Modified the pretty printer and formatted I/O system to use
(string-width string) instead of (length string) in the right places.
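To illustrate the relationship between the two functions, here is a minimal
sketch in C, not the CLISP code. The names `char_width_toy' and
`string_width_toy' and the two hard-coded ranges are assumptions made for
the example; a real implementation would consult a full width table.

```c
#include <stddef.h>

/* Hypothetical stand-in for (char-width character).
 * Toy data: only two ranges are special-cased, everything else is width 1. */
int char_width_toy(unsigned long c)
{
    if (c >= 0x0300 && c <= 0x036F)
        return 0;               /* combining diacritical marks block */
    if (c >= 0x4E00 && c <= 0x9FFF)
        return 2;               /* CJK unified ideographs */
    return 1;
}

/* Hypothetical stand-in for (string-width string): the sum of the
 * per-character widths over n code points. */
int string_width_toy(const unsigned long *s, size_t n)
{
    int width = 0;
    for (size_t i = 0; i < n; i++)
        width += char_width_toy(s[i]);
    return width;
}
```

With these definitions a string of two ideographs has length 2 but width 4,
which is why a column-tracking printer cannot rely on (length string).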
A screenshot is available in [3].
This is all based on `wcwidth'. The pretty printer uses `wcwidth'
extensively, to keep track of the screen columns. For speed, I chose a
`wcwidth' implementation based on table lookup [4], not binary search,
and even so the pretty printer slowed down by 30%.
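For readers unfamiliar with the two strategies, the following C sketch puts
them side by side. It is not the code from [4]: the interval list is toy
data (two whole blocks treated as zero-width), the flat one-bit-per-code-point
layout is an assumption, and all names are made up for the example.

```c
#include <stddef.h>
#include <string.h>

struct interval { unsigned int first, last; };

/* Toy data (assumption): two blocks treated wholesale as zero-width.
 * Must be sorted and non-overlapping for the binary search. */
static const struct interval zero_width_toy[] = {
    { 0x0300, 0x036F },
    { 0x20D0, 0x20FF },
};
#define N_INTERVALS (sizeof zero_width_toy / sizeof zero_width_toy[0])

/* Strategy 1: binary search over the sorted interval list.
 * Small data, but several comparisons per call. */
int width_bsearch_toy(unsigned int c)
{
    size_t lo = 0, hi = N_INTERVALS;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (c > zero_width_toy[mid].last)
            lo = mid + 1;
        else if (c < zero_width_toy[mid].first)
            hi = mid;
        else
            return 0;
    }
    return 1;
}

/* Strategy 2: table lookup. One bit per code point up to U+FFFF
 * (8192 bytes), filled once from the same interval list. */
static unsigned char zero_bits_toy[0x10000 / 8];
static int table_ready_toy;

static void width_table_init_toy(void)
{
    memset(zero_bits_toy, 0, sizeof zero_bits_toy);
    for (size_t i = 0; i < N_INTERVALS; i++)
        for (unsigned int c = zero_width_toy[i].first;
             c <= zero_width_toy[i].last; c++)
            zero_bits_toy[c >> 3] |= (unsigned char)(1u << (c & 7));
    table_ready_toy = 1;
}

/* One shift, one mask, one memory access per call. */
int width_table_toy(unsigned int c)
{
    if (!table_ready_toy)
        width_table_init_toy();
    if (c > 0xFFFF)
        return 1;               /* outside the toy table: default width */
    return ((zero_bits_toy[c >> 3] >> (c & 7)) & 1) ? 0 : 1;
}
```

Both functions return the same answer for every input; the table trades
memory for a constant number of operations per character.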
Thomas Wolff wrote:
> One thing I'd need is a function to tell me which characters have which
> sort of behaviour.
Markus posted such a function, except that it should be called `isnonspacing',
not `iscombining': it covers the "Non-spacing" property of PropList.txt [5].
Note there are also combining characters with a width of 1. The first one
is U+0903. All of them are in Indic scripts. How are they supposed to be
rendered by a simple rendering engine such as xterm?
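The naming point can be made concrete with a toy classifier in C. This is
not Markus's function: the names are hypothetical, the non-spacing set is
reduced to one block, and U+0903 is the only width-1 combining character
listed, purely as an example.

```c
#include <stdbool.h>

/* Toy data (assumption): the block U+0300..U+036F stands in for the
 * full "Non-spacing" list of PropList.txt. */
bool is_nonspacing_toy(unsigned int c)
{
    return c >= 0x0300 && c <= 0x036F;
}

/* Combining characters are a larger set: they also include marks that
 * occupy a column of their own, such as U+0903. */
bool is_combining_toy(unsigned int c)
{
    return is_nonspacing_toy(c) || c == 0x0903;
}

/* Width of a combining character: 0 only if it is non-spacing. */
int combining_width_toy(unsigned int c)
{
    return is_nonspacing_toy(c) ? 0 : 1;
}
```

A test named `iscombining' that only answers "width 0?" would report false
for U+0903, which is why `isnonspacing' is the more accurate name.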
Bruno
[1] http://mail.nl.linux.org/linux-utf8/1999-11/msg00069.html
[2] ftp://cellar.goems.com/pub/clisp/clisp-1999-11-30.tgz
[3] http://clisp.cons.org/~haible/fibjap-xterm.gif
[4] ftp://ftp.ilog.fr/pub/Users/haible/utf8/libutf8-0.6.1.tar.gz
    file libutf8-0.6.1/extras/wcwidth.c
[5] ftp://ftp.unicode.org/Public/3.0-Update/PropList-3.0.0.txt
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/