Given the way Perl supports Unicode, you should hardly ever have to call a UTF-8 encoder or decoder explicitly. You just have to make sure that when a UTF-8 string enters Perl, it is tagged as a UTF-8 string and not as an octet string. How that happens depends on how the string gets into Perl. When opening files, for instance, you can tell Perl which charset to expect, or tell it to consult the LC_CTYPE locale.
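As a minimal sketch of the file case, assuming a Perl with PerlIO encoding layers (5.8 or later) and Unix line endings: the `:encoding(UTF-8)` layer on `open` does the decoding and encoding, so the program itself never calls a converter. The file name is made up for the example. For the locale-driven variant, the `open` pragma has a `:locale` form (`use open ':locale';`), which is not shown here.

```perl
use strict;
use warnings;

# Hypothetical file name, used only for this illustration.
my $path = 'example-utf8.txt';

# Writing: the I/O layer encodes the character string to UTF-8.
open(my $out, '>:encoding(UTF-8)', $path) or die "Cannot write $path: $!";
print {$out} "na\x{EF}ve \x{263A}\n";
close($out) or die "Cannot close $path: $!";

# Reading: the same layer decodes the bytes, so the string enters
# Perl as a character string rather than an octet string.
open(my $in, '<:encoding(UTF-8)', $path) or die "Cannot read $path: $!";
my $line = <$in>;
close($in);
chomp $line;
my $chars = length $line;      # counts characters

# For comparison: read the same file as raw octets.
open(my $raw, '<:raw', $path) or die "Cannot read $path: $!";
my $octets = <$raw>;
close($raw);
chomp $octets;
my $bytes = length $octets;    # counts bytes

unlink $path;

binmode(STDOUT, ':encoding(UTF-8)');
print "characters: $chars, bytes: $bytes\n";
```

The decoded line has 7 characters, while the raw read sees 10 octets, because the two non-ASCII characters take 2 and 3 bytes in UTF-8.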
Question: What is a quick way in Perl to get a regular expression that matches all Unicode characters in the range U+0100..U+10FFFF, in other words all Unicode characters above the 8-bit range U+0000..U+00FF?
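One possible approach, offered as a sketch rather than a definitive answer: a bracketed character class with `\x{...}` escapes can express a code point range directly. Two patterns are shown because the two readings differ: the range starting at U+0100 leaves out U+0080..U+00FF, whereas "non-ASCII" begins at U+0080. The variable names are made up for the example, and the strings being matched are assumed to be character strings, not undecoded octets.

```perl
use strict;
use warnings;

# Code points U+0100 .. U+10FFFF (everything above the 8-bit range).
my $above_8bit = qr/[\x{100}-\x{10FFFF}]/;

# Any non-ASCII character, i.e. anything outside U+0000 .. U+007F.
my $non_ascii = qr/[^\x00-\x7F]/;

# U+00E9 (e with acute) is non-ASCII but lies below U+0100.
my $e_acute = "\x{E9}";
# U+263A (white smiling face) falls in both ranges.
my $smiley = "\x{263A}";

my @results = (
    ($e_acute =~ $above_8bit) ? 1 : 0,
    ($e_acute =~ $non_ascii)  ? 1 : 0,
    ($smiley  =~ $above_8bit) ? 1 : 0,
    ($smiley  =~ $non_ascii)  ? 1 : 0,
    ('A'      =~ $non_ascii)  ? 1 : 0,
);
print "@results\n";
```

To match a whole string consisting only of such characters, the class can be anchored, for example `/\A[^\x00-\x7F]+\z/`.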
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/