
Re: gettext - was: Re: ASCII and Unicode Quotation Marks



This seems to be a recipe for getting what Ulrich was referring to:

$ cvs -z 9 -d :pserver:anoncvs@xxxxxxxxxxxxxxxxxx:/cvs/glibc login
{enter "anoncvs" as the password}
$ cvs -z 9 -d :pserver:anoncvs@xxxxxxxxxxxxxxxxxx:/cvs/glibc co libc/manual
$ cd libc/manual
{"make" fails because of missing libm-err-tab.pl, move-if-change,
 libm-err.texi, ... but then the following incantation does something:}
$ makeinfo --force libc.texinfo

Searching for a discussion of catgets vs gettext I eventually read, in
the section "Translation with gettext":

>    The `gettext' approach has some advantages but also some
> disadvantages.  Please see the GNU `gettext' manual for a detailed
> discussion of the pros and cons.

Is this it?

On the subject of gettext and UTF-8, I suggested something on
gnupg-i18n that people here might have an opinion on (or might tell me
to go and consult an anoncvs server about :-). I'll append my e-mail
...

Edmund
--- Begin Message ---
Sorry to reply to myself like this ...

> But I suppose we really ought to be thinking about wide characters and
> charset support, too; a Russian user might be using koi8-r or utf-8.
> The same problem affects lots of programs, not just gnupg ...

I've heard that soon gettext will automatically convert message
strings to the charset of the user's current locale. A single
non-ASCII character will become several octets when converted into
UTF-8 (which will be the most widely used charset soon). So any code
that wants to look at the third character of a translated string by
just doing ans[2], say, will break horribly.
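
To make the breakage concrete, here is a minimal sketch in C. It
assumes the translated string is valid UTF-8; the helper names
utf8_length and utf8_char_at are invented for this illustration and
are not part of gettext or libc. The Russian word used in the comments
is only an example I picked, not a string from gnupg.

```c
#include <assert.h>
#include <stddef.h>

/* Count characters in a UTF-8 string (assumed valid) by skipping
   continuation bytes, which always have the bit pattern 10xxxxxx. */
size_t utf8_length(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Return a pointer to the first byte of character number `index`
   (0-based), or NULL if the string has fewer characters than that. */
const char *utf8_char_at(const char *s, size_t index)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) {  /* start of a character */
            if (index == 0)
                return s;
            index--;
        }
    }
    return NULL;
}

/* Example: the Russian word for "no" is three characters but six
   bytes in UTF-8: "\xD0\xBD\xD0\xB5\xD1\x82".  ans[2] is the byte 0xD0,
   the first half of the second character; the third character
   actually starts at ans[4]. */
```

So utf8_char_at(ans, 2) finds the third character, whereas ans[2]
lands in the middle of the string's second character.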

Perhaps it would be useful to use a function that searches for an
exact match or an unambiguous prefix in a set of commands. For
example:

/* cmds: groups separated by ';', alternatives in a group by ',' */
int f(const char *cmd, const char *cmds, int *x);

f("a", "a,cw;bx,by;cz", &x) = 0, x = 1  /* exact match in group 1 */
f("b", "a,cw;bx,by;cz", &x) = 1, x = 2  /* unambiguous prefix in group 2 */
f("c", "a,cw;bx,by;cz", &x) = 2, x = ?  /* ambiguous prefix (1 or 3) */
f("d", "a,cw;bx,by;cz", &x) = 3, x = ?  /* no match */

The cmds string would be translated, of course, and it shouldn't
matter if charset conversion changes the number of bytes. If a system
like this were in general use in GNU software, translators would soon
learn the convention.
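
For what it's worth, here is a rough sketch of how such a function
might look in C. It is only one possible reading of the examples
above, not tested in gnupg: I assume ';' separates groups and ','
separates alternatives within a group, that groups are numbered from
1, that cmd and cmds are in the same charset, and that an empty cmd
counts as "no match". The enum names are made up for readability.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return codes, following the examples in this message. */
enum { MATCH_EXACT = 0, MATCH_PREFIX = 1, MATCH_AMBIGUOUS = 2, MATCH_NONE = 3 };

int f(const char *cmd, const char *cmds, int *x)
{
    size_t len;
    int group = 1;          /* group currently being scanned */
    int prefix_group = 0;   /* first group with a prefix match, 0 if none */
    int ambiguous = 0;      /* set if prefix matches span several groups */
    const char *p = cmds;

    assert(cmd != NULL && cmds != NULL && x != NULL);
    len = strlen(cmd);
    if (len == 0)
        return MATCH_NONE;  /* assumption: empty input never matches */

    for (;;) {
        size_t n = strcspn(p, ",;");  /* byte length of this alternative */

        if (n >= len && memcmp(p, cmd, len) == 0) {
            if (n == len) {           /* exact match wins immediately */
                *x = group;
                return MATCH_EXACT;
            }
            if (prefix_group == 0)
                prefix_group = group;
            else if (prefix_group != group)
                ambiguous = 1;
        }
        p += n;
        if (*p == '\0')
            break;
        if (*p == ';')
            group++;
        p++;                          /* skip the separator */
    }

    if (ambiguous)
        return MATCH_AMBIGUOUS;       /* *x is left untouched */
    if (prefix_group == 0)
        return MATCH_NONE;            /* *x is left untouched */
    *x = prefix_group;
    return MATCH_PREFIX;
}
```

The comparison is done on whole byte sequences, never on the n-th
byte, so it behaves the same whether a character takes one octet or
several. Two prefix matches inside the same group (like "bx" and "by")
are not treated as ambiguous, since they select the same command.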

The question of whether to use the English commands as a canonical
alternative is left to individual translator teams.

Edmund

--- End Message ---