
Re: Utf-8 support in C functions on Linux



Hi,

At Tue, 18 Dec 2001 16:49:08 -0500,
Richard, Francois M <Francois.M.Richard@xxxxxxxxxxxxx> wrote:

> If strncpy() recognizes n characters encoded in utf-8, it means that when it
> reads the bytes, leading and trailing bytes are detected/understood. There
> is some utf-8 decoding operation going on.
> In this case, why can strlen() count only bytes?

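By definition, size_t is a number of bytes, not a number of characters.
strlen() returns the number of bytes before the terminating null, and the
n argument of strncpy() is likewise a byte count. Neither function decodes
UTF-8.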

(CJK people have handled multibyte characters for decades by relying on
strlen() returning the number of BYTES.)
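
To make the byte/character distinction concrete, here is a minimal sketch.
It is only an illustration, not something from the C library: the helper
names utf8_count_chars() and utf8_safe_prefix() are made up for this mail,
and both assume the input is already valid UTF-8 (no validation is done).
The first counts characters by skipping continuation bytes (10xxxxxx); the
second finds a byte length that can be handed to strncpy() or memcpy()
without cutting a character in half.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count UTF-8 characters (code points) in a
 * null-terminated string. Every byte that is NOT a continuation byte
 * (bit pattern 10xxxxxx) starts a new character.
 * Assumes s is valid UTF-8. */
size_t utf8_count_chars(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* Hypothetical helper: return the largest byte length <= max_bytes
 * that does not split a multibyte character of s.
 * Assumes s is valid UTF-8. */
size_t utf8_safe_prefix(const char *s, size_t max_bytes)
{
    size_t len;

    assert(s != NULL);
    len = strlen(s);            /* strlen() counts BYTES */
    if (len <= max_bytes)
        return len;

    len = max_bytes;
    /* If the byte just past the cut is a continuation byte, the cut
     * would land inside a character, so back up to its lead byte. */
    while (len > 0 && ((unsigned char)s[len] & 0xC0) == 0x80)
        len--;
    return len;
}
```

For a string of three kanji, each encoded in three bytes, strlen() gives 9
while utf8_count_chars() gives 3, and utf8_safe_prefix(s, 4) gives 3
because a cut at byte 4 would fall inside the second character.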

---
Tomohiro KUBOTA <kubota@xxxxxxxxxx>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"  http://www.debian.org/doc/manuals/intro-i18n/
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/