iconv and others
Well, this is rather a newbie question. We were trying to use glibc 2.2's
iconv to convert an input UTF-8 stream to our internal UCS4. But it turned
out that it gives us little-endian UCS4 instead of the big-endian form we
need for internal processing on our PC. Which of the following solutions
is better/more beautiful/more portable/etc?
0. Simply reverse the bytes in each wchar after the conversion. (Is there
a glibc/glib function for this?)
1. Use another, independent iconv implementation that supports UTF-32BE.
Ship it with the program or require the user to download it before building.
2. Require glibc 2.2.x (I saw in the CVS that UTF-32 support was only added
3 weeks ago, so I don't know if any version has shipped with it).
3. Forget the system's UTF-8 support. Read and convert it ourselves. (This
way we can be compliant with Unicode 3.1's stricter UTF-8.)
4. Some ultimate correct way we have missed.
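For reference, option 0 could look roughly like the sketch below. It assumes
the converted UCS4 values sit in a buffer of uint32_t; swap32 and
swap_ucs4_buffer are hypothetical names of our own, not glibc or glib
functions. (glibc does provide bswap_32 in <byteswap.h>, and htonl from
<arpa/inet.h> converts host order to big-endian, but the shifts below avoid
depending on either.)

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the four bytes of one 32-bit code unit (hypothetical helper). */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Swap every code unit of a UCS4 buffer in place (hypothetical helper). */
void swap_ucs4_buffer(uint32_t *buf, size_t count)
{
    for (size_t i = 0; i < count; i++)
        buf[i] = swap32(buf[i]);
}
```

This swaps unconditionally, so it is only correct when the buffer is known
to be in the opposite byte order from the one we want.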
--roozbeh
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/