
Re: filename and normalization (was gcc identifiers)



Followup to:  <Pine.BSI.3.91.1021204095842.545D-100000@xxxxxxxxxxxxx>
By author:    Henry Spencer <henry@xxxxxxxxxxxxx>
In newsgroup: linux.utf8
> 
> The main chance of difficulties with UTF-8 is if different programs take
> different approaches to normalization of filenames.  A standard for that
> would help, as would suitable code in libraries.
> 

I expect that Unix systems will use normalization form C.  And yes, we
need libraries for all the kinds of manipulation that one can do on
Unicode text (as opposed to "general localized text"): producing the
various normalization forms, querying character properties, and
converting between the various UTF forms.
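
As a rough illustration of the three library tasks named above, here is
a minimal sketch using Python's standard unicodedata module.  The choice
of Python and of the sample character is an assumption made for the
example; nothing here is specific to any particular Unix library.

```python
import unicodedata

# Assumed sample input: 'e' followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = "e\u0301"

# Producing a normalization form: NFC folds the pair into U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

# Querying character properties of the combining mark.
mark_category = unicodedata.category("\u0301")   # general category
mark_class = unicodedata.combining("\u0301")     # canonical combining class

# Converting to the various UTF forms.
utf8_decomposed = decomposed.encode("utf-8")
utf8_composed = composed.encode("utf-8")
utf16_composed = composed.encode("utf-16-be")

# The two spellings render identically but are different byte strings,
# so a filesystem that compares bytes treats them as different names.
same_bytes = utf8_decomposed == utf8_composed
```

The last comparison is the filename problem in miniature: both strings
display as the same letter, yet their UTF-8 encodings differ.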

That being said, I consider producing normalization form C to be the
responsibility of the user input system, and *perhaps* of editing
programs.  I don't believe we should attempt to insert normalization
everywhere, partly because doing so inevitably leads to security
holes.
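
One plausible way such a hole can arise (an assumed scenario for
illustration, not necessarily the one meant above) is a check made on
the raw name followed by a lower layer that normalizes it.  The
deny-list, the function names, and the filename below are all
hypothetical.

```python
import unicodedata

# Hypothetical deny-list, stored in NFC (precomposed U+00E9).
DENIED = {"caf\u00e9.conf"}

def is_allowed(name):
    # Access check performed on the raw name, before any normalization.
    return name not in DENIED

def storage_layer_name(name):
    # Hypothetical lower layer that silently normalizes to NFC.
    return unicodedata.normalize("NFC", name)

# The request spells the same name with 'e' + U+0301 COMBINING ACUTE.
request = "cafe\u0301.conf"

allowed = is_allowed(request)            # raw string is not in DENIED
final_name = storage_layer_name(request) # normalized after the check
bypassed = allowed and final_name in DENIED
```

Here the check and the normalization disagree about which file is being
named, so the denied file is reached anyway.  Normalizing once, at
input, avoids having two layers with different views of the same name.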

	-hpa

-- 
<hpa@xxxxxxxxxxxxx> at work, <hpa@xxxxxxxxx> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt	<amsp@xxxxxxxxx>
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/