
Re: wcwidth and zero width spaces




On Sat, 7 Oct 2000, Markus Kuhn wrote:

> In the -misc-fixed-* fonts, all these characters would then be
> represented as an empty space glyph, such that they remain invisible if
> treated like an overstriking combining character.
> 
> Read section 13.2 of The Unicode Standard 3.0 for the semantics and for
> application examples of these characters.
> 
> Comments and opinions?

I agree with the idea. What about other characters of class Cf? Those are:

	U+070F	SYRIAC ABBREVIATION MARK
	U+180B	MONGOLIAN FREE VARIATION SELECTOR ONE
	U+180C	MONGOLIAN FREE VARIATION SELECTOR TWO
	U+180D	MONGOLIAN FREE VARIATION SELECTOR THREE
	U+180E	MONGOLIAN VOWEL SEPARATOR

They seem to be in the same general category as ZWJ and ZWNJ. Take a look
at pages 200 and 291 of TUS 3.0:

	"U+070F SYRIAC ABBREVIATION MARK (SAM) is a user-selectable
	 zero-width formatting code..."

	"...these characters normally have no visual appearance. Their
	 sole purpose is to guide the rendering process in selecting the
	 appropriate glyphs to represent base Mongolian letters in a
	 particular context."
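
To make the suggestion concrete, here is a rough sketch of how a
wcwidth-style lookup could treat these characters as zero width. It is
only an illustration, not the actual wcwidth() code: the names
ZERO_WIDTH_FORMAT, char_width and string_width are made up for this
example, and including ZWNJ (U+200C) and ZWJ (U+200D) in the table is
my assumption about which characters the proposal covers. Every other
character is simply counted as one column, which ignores combining
marks, control characters and double-width CJK.

```python
# Hypothetical table: the five Cf characters listed above, plus ZWNJ and
# ZWJ (assumed to be part of the zero-width set under discussion).
ZERO_WIDTH_FORMAT = frozenset({
    0x070F,  # SYRIAC ABBREVIATION MARK
    0x180B,  # MONGOLIAN FREE VARIATION SELECTOR ONE
    0x180C,  # MONGOLIAN FREE VARIATION SELECTOR TWO
    0x180D,  # MONGOLIAN FREE VARIATION SELECTOR THREE
    0x180E,  # MONGOLIAN VOWEL SEPARATOR
    0x200C,  # ZERO WIDTH NON-JOINER
    0x200D,  # ZERO WIDTH JOINER
})


def char_width(ch):
    """Return 0 for the format characters in the table, else 1.

    Simplification: a real wcwidth() also handles combining marks,
    control characters and double-width characters.
    """
    return 0 if ord(ch) in ZERO_WIDTH_FORMAT else 1


def string_width(text):
    """Sum the column widths of all characters in text."""
    return sum(char_width(ch) for ch in text)
```

With such a table, string_width("a\u180bb") counts only the two
letters, so the variation selector takes up no cell on the terminal.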

There are also these characters in Cf, which I don't know how one might
choose to handle. They are, in a sense, control characters:

	U+206A	INHIBIT SYMMETRIC SWAPPING
	U+206B	ACTIVATE SYMMETRIC SWAPPING
	U+206C	INHIBIT ARABIC FORM SHAPING
	U+206D	ACTIVATE ARABIC FORM SHAPING
	U+206E	NATIONAL DIGIT SHAPES
	U+206F	NOMINAL DIGIT SHAPES
	U+FFF9	INTERLINEAR ANNOTATION ANCHOR
	U+FFFA	INTERLINEAR ANNOTATION SEPARATOR
	U+FFFB	INTERLINEAR ANNOTATION TERMINATOR

--roozbeh

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/