RE: ISO9660 & UTF-8
> > - Joliet uses Unicode by default. Every character is represented in
> > UTF-16, so characters above U+FFFF can't be stored properly when
> > done the windows way.
>
> Doesn't it use surrogate pairs? I thought that was "the windows way".
If the Joliet specification calls for UTF-16 (which, by the way, is used
in many other places besides Windows; e.g. MacOS, Epoc, ...) then characters
above U+FFFF should be represented as UTF-16 surrogate pairs. If the Joliet
specification calls for UCS-2, then you can still represent characters
above U+FFFF as UTF-16 surrogate pairs, since the surrogate code units are
not assigned to any characters in UCS-2 and so do not interfere with it
(and the Joliet spec. should be updated).
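To make the surrogate-pair point concrete, here is a rough sketch in Python of how a name containing a character above U+FFFF would come out as 16-bit units. The function names are made up for illustration, and the big-endian byte order is my assumption about how Joliet stores its 16-bit units; check it against the spec before relying on it.

```python
def to_utf16_units(cp):
    """Return the UTF-16 code unit(s) for one Unicode code point."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %#x" % cp)
    if cp <= 0xFFFF:
        # BMP character: one unit, identical to its UCS-2 value.
        return [cp]
    # Supplementary character: split the 20-bit offset into two halves.
    v = cp - 0x10000
    high = 0xD800 | (v >> 10)    # high (leading) surrogate
    low = 0xDC00 | (v & 0x3FF)   # low (trailing) surrogate
    return [high, low]


def encode_name_utf16be(name):
    """Encode a file name as big-endian 16-bit units (assumed byte order)."""
    out = bytearray()
    for ch in name:
        for unit in to_utf16_units(ord(ch)):
            out += unit.to_bytes(2, "big")
    return bytes(out)
```

A reader that only knows UCS-2 would see the two surrogate units as two unassigned 16-bit values rather than as one character, but the bytes on disc are the same either way.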
(B.t.w. wasn't Joliet something prior to the latest ISO spec. on CD
file system formatting? Any up-to-date references?
http://bmrc.berkeley.edu/people/chaffee/jolspec.html#unicode
seems to specify UCS-2, and Unicode 1(!), which is really bad, for Korean
in particular, since the Hangul syllables were moved to a different range
in Unicode 2.0; I just hope I stumbled on a really out-of-date document.
I should keep track of the CD formats, but haven't. Sorry.)
Kind regards
/kent k
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/