Re: perlunitut - feedback appreciated
On Sun, 11 Nov 2001 12:57:27 -0800, in perl.unicode you wrote:
> ISO Latin-1 characters encoded as 10-FF in single bytes are not Unicode.
Hm? ISO Latin-1 characters from 00 to 7F encoded in single bytes
represent the same Unicode characters as those bytes interpreted as
UTF-8, simply because ASCII is a subset of both Latin-1 and UTF-8, and
00 to 7F is that common subset.
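
A quick way to check this claim, sketched in Python (the language and the helper name same_in_both are illustrative choices, not something from this thread): decode each single byte both as Latin-1 and as UTF-8 and keep the values where both give the same character.

```python
# Sketch: find the single-byte values that mean the same character
# in Latin-1 and in UTF-8. same_in_both is a hypothetical helper name.
def same_in_both(byte_value: int) -> bool:
    raw = bytes([byte_value])
    try:
        return raw.decode("latin-1") == raw.decode("utf-8")
    except UnicodeDecodeError:
        # A lone byte in 80..FF is not a valid UTF-8 sequence.
        return False

agreeing = [b for b in range(0x100) if same_in_both(b)]
print(hex(agreeing[0]), hex(agreeing[-1]), len(agreeing))
```

Only the ASCII range should survive the filter; every byte from 80 upward fails to decode as UTF-8 on its own.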
> There is no Unicode transformation format or other encoding that permits
> this. The code point range is actually x000010-x0000FF, and the encodings
> are
>
> 0000000010000000 0000000011111111 UTF-16 Big Endian
That first number is 0x80, not 0x10. If you meant 0x80 .. 0xFF, then I
agree with you.
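
For anyone who wants to verify the bit patterns quoted above, here is a small Python sketch (the variable names are mine, chosen for illustration): it parses the two 16-bit strings and compares the first with the UTF-16 big-endian encoding of U+0080.

```python
# Sketch: read the quoted 16-bit patterns as binary numbers.
first = int("0000000010000000", 2)
last = int("0000000011111111", 2)
print(hex(first), hex(last))

# The UTF-16BE encoding of U+0080 is the two bytes 00 80,
# which is exactly the first bit pattern.
encoded = chr(first).encode("utf-16-be")
print(encoded.hex())
```

The first pattern has only bit 7 set, which is 0x80; 0x10 would be 0000000000010000.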
Cheers,
Philip
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/