
iconv and others



Well, this is rather a newbie question. We were trying to use glibc 2.2's
iconv to convert an input UTF-8 stream to our internal UCS4, but it turned
out that it gives us little-endian UCS4 instead of the big-endian one we
need for internal processing on our PC. Which of the following solutions
is better, more beautiful, more portable, etc.?

0. Simply reverse the bytes in each wchar after the conversion. (Is there
a glibc/glib function for this?)

1. Use another, independent iconv implementation that supports UTF-32BE.
Ship it with the program or require the user to download it before building.

2. Require glibc 2.2.x (I saw in the CVS that UTF-32 support was only added
3 weeks ago, so I don't know if any released version ships with it).

3. Forget the system's UTF-8 support. Read and convert it ourselves. (This
way we can be compliant with Unicode 3.1's stricter UTF-8.)

4. Some ultimate correct way we have missed.
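
To make option 0 concrete, here is a minimal sketch of what the byte
reversal might look like, assuming the converted text sits in a buffer of
32-bit code units. The names swap32 and swap_ucs4_buffer are made up for
illustration and are not glibc or glib functions. (glibc does ship a
bswap_32 macro in <byteswap.h>, but that header is a GNU extension, so the
hand-written version below is probably the more portable choice.)

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reverse the four bytes of one 32-bit code unit.
   It uses only shifts and masks, so the result does not depend on the
   byte order of the host. */
static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Hypothetical helper: swap every code unit of a UCS-4 buffer in place.
   count is the number of 32-bit units, not the number of bytes. */
static void swap_ucs4_buffer(uint32_t *buf, size_t count)
{
    for (size_t i = 0; i < count; i++)
        buf[i] = swap32(buf[i]);
}
```

This would be called once on the output buffer after each iconv() call,
with count set to the number of bytes written divided by 4.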

--roozbeh

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/