Re: what shall we do about iconv?
(I hate following up to my own postings, but I can't help myself.)
> What should we do?
>
> (1) Live with it. Either copy stream input into a temporary file,
> convert it once to find out how long the output is, then malloc that
> much memory and convert the file again, or do something complex and
> fragile that involves detecting where the last complete character in
> the input buffer ends by running a separate iconv (with a different
> cd) on the buffer contents. Or maybe there are other work-arounds,
> too. Any ideas?
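For reference, the "convert it once to find out how long the output is, then malloc" idea from (1) might look roughly like the sketch below. It assumes the input is already in memory rather than in a temporary file, and the names measure and convert_alloc are made up for this illustration. The first pass converts into a small scratch buffer that is reused whenever iconv reports E2BIG; the second pass converts into a buffer of exactly the measured size.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Pass 1 (hypothetical helper): count the output bytes by converting
   into a scratch buffer that is reused each time it fills up.
   Returns (size_t)-1 if the input cannot be converted. */
static size_t measure(iconv_t cd, char *in, size_t inlen)
{
    char scratch[64];
    char *op;
    size_t ol, r, total = 0;

    iconv(cd, NULL, NULL, NULL, NULL);      /* start from the initial state */
    for (;;) {
        op = scratch, ol = sizeof(scratch);
        r = iconv(cd, &in, &inlen, &op, &ol);
        total += sizeof(scratch) - ol;
        if (r != (size_t)-1)
            break;                          /* all input consumed */
        if (errno != E2BIG)
            return (size_t)-1;              /* EILSEQ or EINVAL */
    }
    /* Count any closing shift sequence of a stateful encoding. */
    op = scratch, ol = sizeof(scratch);
    if (iconv(cd, NULL, NULL, &op, &ol) == (size_t)-1)
        return (size_t)-1;
    return total + (sizeof(scratch) - ol);
}

/* Pass 2 (hypothetical helper): allocate exactly the measured size and
   convert again. Returns a malloc'd buffer and sets *outlen, or NULL. */
char *convert_alloc(const char *from, const char *to,
                    const char *src, size_t len, size_t *outlen)
{
    iconv_t cd = iconv_open(to, from);
    char *in, *out = NULL, *ip, *op;
    size_t need, il, ol;

    assert(src != NULL && outlen != NULL);
    if (cd == (iconv_t)-1)
        return NULL;
    in = malloc(len ? len : 1);             /* iconv wants a writable char ** */
    if (in) {
        memcpy(in, src, len);
        need = measure(cd, in, len);
        if (need != (size_t)-1 && (out = malloc(need ? need : 1)) != NULL) {
            iconv(cd, NULL, NULL, NULL, NULL);
            ip = in, il = len;
            op = out, ol = need;
            if (iconv(cd, &ip, &il, &op, &ol) == (size_t)-1 ||
                iconv(cd, NULL, NULL, &op, &ol) == (size_t)-1) {
                free(out);
                out = NULL;
            } else {
                *outlen = need - ol;
            }
        }
        free(in);
    }
    iconv_close(cd);
    return out;
}
```

The cost is the one complained about above: every byte goes through iconv twice, and for stream input the data still has to be stored somewhere between the two passes.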
I still think that the iconv API is unsatisfactory and should ideally
be changed or replaced. However, there does seem to be a better
work-around for the case where you are only interested in whether the
data was converted exactly: you just convert the data back again and
compare. This doesn't require storing all the data and is fairly
simple to implement.
The attached program is supposed to do this, and I guess this is what
I'll do in Mutt, so if anyone thinks it won't work or knows of a
better way, please tell me.
(As in the first program, I didn't bother with iconv_close.)
Edmund
#include <errno.h>
#include <iconv.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Print errno next to E2BIG (debugging aid), then the error, and exit. */
void fatal(const char *s)
{
    int e = errno;  /* fprintf may change errno before perror reads it */

    fprintf(stderr, "%d %d\n", e, E2BIG);
    errno = e;
    perror(s);
    exit(1);
}

int main(int argc, char *argv[])
{
    iconv_t cd, cd2;
    char bufi[256], bufo[256], bufc[sizeof(bufi)];
    char *ib, *ob;      /* POSIX iconv takes char **, not const char ** */
    const char *t;
    size_t ibl, obl;
    size_t r;           /* result of iconv */
    ssize_t n;          /* result of read and write */
    int exact = 1;

    if (argc != 3) {
        fprintf(stderr, "Usage: %s FROMCODE TOCODE\n", argv[0]);
        return 1;
    }

    cd = iconv_open(argv[2], argv[1]);   /* forward: FROMCODE to TOCODE */
    cd2 = iconv_open(argv[1], argv[2]);  /* reverse: TOCODE to FROMCODE */
    if (cd == (iconv_t)-1 || cd2 == (iconv_t)-1)
        fatal("iconv_open");

    ibl = 0;
    for (;;) {
        /* Fill input buffer */
        for (; ibl < sizeof(bufi); ibl += n) {
            n = read(0, bufi + ibl, sizeof(bufi) - ibl);
            if (n == -1)
                fatal("read");
            if (!n)
                break;
        }

        /* Convert */
        ib = bufi;
        ob = bufo, obl = sizeof(bufo);
        r = iconv(cd, &ib, &ibl, &ob, &obl);
        /* EINVAL (incomplete input) and E2BIG (output full) are expected */
        if (r == (size_t)-1 && errno != EINVAL && errno != E2BIG)
            fatal("iconv");
        if (ob == bufo)
            break;
        /* A positive count means some conversions were irreversible */
        if (r != (size_t)-1 && r)
            exact = 0;

        /* Output */
        for (t = bufo; t < ob; t += n) {
            n = write(1, t, ob - t);
            if (n == -1)
                fatal("write");
        }

        if (exact) {
            /* Convert back */
            char *ib2 = bufo;
            size_t ib2l = ob - bufo;
            char *ob2 = bufc;
            size_t ob2l = sizeof(bufc);

            r = iconv(cd2, &ib2, &ib2l, &ob2, &ob2l);
            /* Must consume everything and reproduce the consumed input */
            if (r || ib2l || ob2 - bufc != ib - bufi ||
                memcmp(bufi, bufc, ob2 - bufc))
                exact = 0;
        }

        /* Save unused input */
        memmove(bufi, ib, ibl);
    }

    fprintf(stderr, exact ?
            "Conversion was reversible\n" :
            "Conversion was not reversible\n");
    return 0;
}
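
The same convert-back-and-compare check, reduced to a single in-memory buffer, might be sketched as follows. This is only an illustration, not what will go into Mutt: the name roundtrip_exact and the fixed buffer sizes are assumptions, and input that does not fit the scratch buffers is simply reported as inexact instead of being handled in chunks as the program above does.

```c
#include <assert.h>
#include <iconv.h>
#include <string.h>

/* Hypothetical helper: returns 1 if src[0..len) converts from `from`
   to `to` and back to identical bytes, 0 if it does not, and -1 if a
   converter cannot be opened. */
int roundtrip_exact(const char *from, const char *to,
                    const char *src, size_t len)
{
    char in[256], mid[1024], back[256];
    char *ip, *op;
    size_t il, ol, midlen, r;
    int exact = 0;
    iconv_t fwd, rev;

    assert(src != NULL);
    if (len > sizeof(in))
        return 0;               /* assumption: small inputs only */
    fwd = iconv_open(to, from);
    if (fwd == (iconv_t)-1)
        return -1;
    rev = iconv_open(from, to);
    if (rev == (iconv_t)-1) {
        iconv_close(fwd);
        return -1;
    }

    /* iconv wants a writable char **, so work on a private copy. */
    memcpy(in, src, len);

    /* Forward pass. Any nonzero result, including (size_t)-1, means
       the conversion failed or was irreversible. The NULL-input call
       flushes the shift state of a stateful encoding. */
    ip = in, il = len;
    op = mid, ol = sizeof(mid);
    r = iconv(fwd, &ip, &il, &op, &ol);
    if (r == 0 && il == 0 &&
        iconv(fwd, NULL, NULL, &op, &ol) != (size_t)-1) {
        midlen = sizeof(mid) - ol;

        /* Reverse pass, then compare with the original bytes. */
        ip = mid, il = midlen;
        op = back, ol = sizeof(back);
        r = iconv(rev, &ip, &il, &op, &ol);
        if (r == 0 && il == 0 &&
            sizeof(back) - ol == len &&
            memcmp(back, src, len) == 0)
            exact = 1;
    }

    iconv_close(fwd);
    iconv_close(rev);
    return exact;
}
```

For example, roundtrip_exact("UTF-8", "ASCII", "caf\xc3\xa9", 5) should report an inexact conversion, because the accented character has no ASCII equivalent, while plain ASCII text converted to UTF-8 and back should compare equal.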