
Re: Substituting malformed UTF-8 sequences in a decoder



David Starner writes:

> >   Whereas a central point of Unicode is that applications
> >   know the behaviour of *all* characters, definitely.
> 
> Okay, what's the width of U+F001? U+1EFA? The first is private use, and
> the second hasn't been defined yet, but quite possibly will.

"Private use" means in practice that a vendor (for example, Apple) uses
these code points for internal purposes, and that its libraries support
them. Thus applications using this vendor's libraries transparently
know the properties of such private use characters.

About not yet defined characters: Yes, applications will behave
slightly incorrectly. "ls" will print a question mark if you use such a
character in a file name and the underlying system libraries don't yet
support it.
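
To make the "question mark" behaviour concrete, here is a minimal sketch
in Python. It is not how "ls" is implemented (that depends on the C
library's locale tables); it only illustrates the idea, using whatever
Unicode version the local Python "unicodedata" module ships. The
function name display_name and the known_private parameter are made up
for this illustration; known_private stands in for the private use code
points that a vendor's libraries would know about.

```python
import unicodedata

def display_name(name, known_private=frozenset()):
    """Replace code points the local tables cannot describe with '?'.

    known_private is a hypothetical set of private use code points
    (as integers) whose properties a vendor library is assumed to know.
    """
    out = []
    for ch in name:
        cat = unicodedata.category(ch)
        if cat in ("Cn", "Cc", "Cs"):
            # Unassigned in the local tables, a control character, or a
            # lone surrogate: nothing sensible is known about its display.
            out.append("?")
        elif cat == "Co" and ord(ch) not in known_private:
            # Private use, and no vendor table claims to know it.
            out.append("?")
        else:
            out.append(ch)
    return "".join(out)

if __name__ == "__main__":
    # U+F001 is private use; the second call pretends a vendor knows it.
    print(display_name("file\uf001.txt"))
    print(display_name("file\uf001.txt", known_private={0xF001}) == "file\uf001.txt")
```

Whether a not yet defined code point such as U+1EFA gets replaced
depends on the Unicode version behind the local tables, which is exactly
the point: older libraries show "?", newer ones show the character.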

Bruno
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/