[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: mk_wcwidth
On Mon, Jun 17, 2002 at 11:55:02PM -0400, Seer wrote:
> struct interval
> {
> int first;
> int last;
> int width
> };
>
> static int bisearch(wchar_t ucs, const struct interval *table, int
> max)
> {
> int min = 0;
> int mid;
>
> if (ucs < table[0].first || ucs > table[max].last)
> while (max >= min)
> {
> mid = (min + max) / 2;
> if (ucs > table[mid].last)
> min = mid + 1;
> else if (ucs < table[mid].first)
> max = mid - 1;
> else
> return table[mid].width;
> }
>
> return -1;
> }
Um. This'll return -1 unless ucs is out of range. And if you're
interested in simplification, you'd reverse the logic:
if (ucs is out of range)
return -1;
do_binary_search() { }
return -1;
You'd also move the "mid" definition within the while() loop, since it's
only actually used there.
A tip: if you're posting completely untested code, say so. (You left
out a semicolon above, too.)
> even better: using a btree instead of a bisearch; itd be
> a bit tougher to read the source: probably better to have it
> generated in that case. Ive attached an example of that style;
> it uses hand balanced trees to minimize the search to determine
> if a character belongs in a language. I think a perl script
> code generator would be needed do the tree-stringing in the width
> case, then the focus would be on tweaking the
> (range/width/frequency) data.
(Err ... how in the nineteen hells is this "simplification"?)
If you really need a speedup for specific cases, it could work, but
it's actually a tradeoff; speed one up and slow down others. (And
it's not an even trade: for every one you move up the tree, you move
two down.) Except for ASCII, that kind of tradeoff isn't very useful
in general-purpose code.
And it might not be a speedup anyway: you're implicitely using a lot
more memory in a tree than a simple array, which is likely to slow the
whole thing down.
However, this is a generic implementation, probably written for
correctness, ease of testing and maintainability; not speed. I doubt
Marcus would be interested in speed improvements if they're at the cost
of the rest.
--
Glenn Maynard
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/