Any byte-based string can hold UTF-8; the unit of data is simply bytes
instead of characters. If you want a class that treats strings as a
sequence of abstract characters rather than a sequence of bytes, you
could look for a library that does this or write your own. However, I
suspect the most useful way to do this in C++ would be to extend
whatever standard byte-based string class you're using with a derived
class.
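
To make that concrete, here is a rough sketch of such a derived class,
assuming std::string as the byte-based base. The name Utf8String and
the method codepoint_count() are made up for illustration; the code
assumes the contents are already valid UTF-8 and does no validation,
so take it as a starting point rather than a finished design.

```cpp
#include <cstddef>
#include <string>

// Hypothetical class for illustration; not part of the standard library.
// Assumption: the stored bytes are valid UTF-8 (nothing is checked).
// Note: std::string has no virtual destructor, so objects of this type
// should not be deleted through a std::string pointer.
class Utf8String : public std::string {
public:
    Utf8String() = default;
    Utf8String(const char* s) : std::string(s) {}
    Utf8String(const std::string& s) : std::string(s) {}

    // Count code points by counting every byte that is NOT a
    // continuation byte. Continuation bytes have the form 10xxxxxx.
    std::size_t codepoint_count() const {
        std::size_t n = 0;
        for (unsigned char c : *this) {
            if ((c & 0xC0) != 0x80) {
                ++n;
            }
        }
        return n;
    }
};

// Example: for the UTF-8 bytes of "héllo" ("h\xC3\xA9llo"),
// size() is 6 (bytes) while codepoint_count() is 5.
```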

Maybe there's something like this built into the C++ STL classes
already that I'm not aware of. As I said, I don't know much about
(modern) C++. Can someone who knows the language better provide an
answer?

It would also be easier to give you answers if we knew better what
you're trying to do with the strings, i.e. whether you just need to
store them and spit them back out, or whether you need to do
higher-level Unicode processing like line breaking, collation,
rendering, etc.

Rich
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/