Re: Announcing Bytext
On Sun, Feb 03, 2002 at 06:15:33PM +0100, Pablo Saratxaga wrote:
> > Many of the elegant features of Unixes depend on the notion of 8 bit
> > transparency: pipe, cat, echo... the byte stream is the common denominator.
> > The functions are general purpose and thus more useful. Bytext takes this
> > elegant notion to its logical conclusion: not only can you process text
> > as bytes, you can also process bytes as text.
>
> I don't understand: how can you encode in an 8-bit space all the characters
> of the world's languages?
>
> And if it is a multi-byte encoding, then it should have about the same
> problems as UTF-8 or EUC have when faced with byte-only utilities.
It sounds to me as though any sequence of 8-bit bytes (hopefully excluding
NULs) is a valid sequence of characters. That doesn't sound particularly
useful, though.
(So what if an arbitrary byte sequence can be displayed as random-ish
characters of equally random languages?)
If it's the case that any string of bytes is valid text, then that brings up
the question of how robust it is. (Seeking and resynchronization are issues
that UTF-8 solved.) I tried to look this up, but one of the first things I saw when
paging down the Word version (after it asked me for a password but worked
anyway) was:
"Unicode is messed up beyond repair."
I promptly became disgusted and closed the window. Remarks like that have
no place whatsoever in a "standard". How can he possibly wonder why he
gets negative reactions from Unicode folks when he's making comments like
this?
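
To make the seeking/sync point concrete, here is a minimal sketch of how
UTF-8 lets a reader resynchronize from an arbitrary byte offset. It
illustrates UTF-8 only, not Bytext; the function names are made up for
this example, and it assumes the input is well-formed UTF-8.

```python
def is_continuation(byte: int) -> bool:
    # UTF-8 continuation bytes always have the bit pattern 10xxxxxx,
    # so they can never be mistaken for the start of a character.
    return (byte & 0xC0) == 0x80


def sync_forward(data: bytes, pos: int) -> int:
    """Return the first index >= pos that starts a character.

    Hypothetical helper: skips continuation bytes until it reaches a
    lead byte (or the end of the data). Assumes well-formed UTF-8.
    """
    while pos < len(data) and is_continuation(data[pos]):
        pos += 1
    return pos


# Characters of 1, 2, 3 and 1 bytes: 'a', n-with-tilde, euro sign, 'z'.
sample = "a\u00f1\u20acz".encode("utf-8")

# Offset 2 lands in the middle of the 2-byte character; skipping one
# continuation byte reaches the start of the euro sign.
start = sync_forward(sample, 2)
tail = sample[start:].decode("utf-8")
```

Because a continuation byte is recognizable on its own, a reader dropped
into the middle of a stream skips at most three bytes before finding a
character boundary. An encoding in which every byte string is valid
would need some other way to offer that guarantee.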
--
Glenn Maynard
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/