
Re: less-344 with UTF-8



Is there a recommendation anywhere on how to deal with illegal UTF-8?
For communication between a program and a library, for example, I've
been simply ignoring the bad octets, something like:

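  /* s points at n octets of input; k is an int, wc a wchar_t */
  while (n > 0 && (k = mbtowc(&wc, s, n)) != 0) {
    if (k == -1) {
      mbtowc(NULL, NULL, 0);  /* reset the conversion state after an error */
      ++s, --n;               /* drop one bad octet and try again */
      continue;
    }
    s += k, n -= k;
    do_something_with(wc);
  }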

If this were a standard way to behave, then perhaps xterm should just
ignore bad octets, too.
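
For illustration, here is a locale-independent sketch of the same "drop one octet and retry" policy. It swaps mbtowc() for a small hand-written decoder, so it does not depend on the current locale being UTF-8. The names utf8_decode_one and utf8_decode_skip are hypothetical, not part of any library, and unlike the loop above this version does not stop at a NUL octet; it processes all n octets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence from s[0..n).
   Returns its length (1..4) and stores the code point in *cp, or returns 0
   if s does not start with a well-formed sequence (bad lead octet, bad
   continuation, overlong form, surrogate, value above U+10FFFF, or
   truncated input). */
static size_t utf8_decode_one(const unsigned char *s, size_t n, uint32_t *cp)
{
    static const uint32_t min_for_len[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t len, i;
    uint32_t c;

    if (n == 0) return 0;
    if (s[0] < 0x80)                { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; }
    else return 0;                        /* stray continuation or 0xF8..0xFF */

    if (len > n) return 0;                /* sequence cut off by end of input */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;
        c = (c << 6) | (uint32_t)(s[i] & 0x3F);
    }
    if (c < min_for_len[len]) return 0;   /* overlong encoding */
    if (c >= 0xD800 && c <= 0xDFFF) return 0;  /* UTF-16 surrogate */
    if (c > 0x10FFFF) return 0;           /* beyond the Unicode range */
    *cp = c;
    return len;
}

/* Hypothetical helper: decode s[0..n) into out (capacity cap), dropping
   one octet at a time whenever decoding fails. Returns the number of code
   points written; if skipped is non-NULL, *skipped receives the number of
   octets that were dropped. */
size_t utf8_decode_skip(const char *s, size_t n,
                        uint32_t *out, size_t cap, size_t *skipped)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t count = 0, bad = 0;

    assert(s != NULL || n == 0);
    while (n > 0 && count < cap) {
        uint32_t cp;
        size_t k = utf8_decode_one(p, n, &cp);
        if (k == 0) {          /* bad octet: skip exactly one and retry */
            ++p, --n, ++bad;
            continue;
        }
        out[count++] = cp;
        p += k, n -= k;
    }
    if (skipped) *skipped = bad;
    return count;
}
```

Counting the dropped octets is a design choice of this sketch: the caller can still ignore them silently, but it can also notice that the input was damaged.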

Edmund
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/