Re: strstr
On Sat, Oct 06, 2001 at 07:21:03PM +0100, Markus Kuhn wrote:
> Please substantiate any claims about performance by actually making a
> realistic measurement, not a guess. Most such guesses are naive on modern
> processor architectures, which typically are RAM bound for searches, not
> CPU bound.
It comes out about the same; the decoding logic offsets the gain of doing
fewer compares. It pulls ahead of raw strstr with a simple optimization
of removing an unnecessary inner-loop conditional (about 10% on my system),
so it's not quite RAM-bound.
I hadn't looked at the decoding logic much; a big reason this ends up
faster is that the "(minor) hit of UTF-8 decoding logic" is in fact no hit
at all, since it's just a small table lookup. Full UTF-8 decoding is a
bunch of conditionals, which is about three times slower (the same memory
access with a lot more branching). I measured that with Vim's UTF-8
decoder, which looks fairly standard-issue. Full decoding isn't needed
here, though.
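
To make "just a small table lookup" concrete, here is a minimal sketch of a search that steps by whole characters using a length table indexed by the lead byte. It is not the code that was measured: the names utf8_len_by_nibble and utf8_strstr_sketch are made up for illustration, the input is assumed to be valid UTF-8 already, and the table uses the four-byte sequence limit rather than the longer forms older specs allowed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table: sequence length implied by the top four bits of a
 * lead byte. 0x0-0x7 are ASCII (1); 0x8-0xB are continuation bytes, which
 * should not appear as leads in valid input, so they map to 1 to keep the
 * scan moving; 0xC-0xD start 2-byte sequences, 0xE starts 3, 0xF starts 4. */
static const unsigned char utf8_len_by_nibble[16] = {
    1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1,
    2, 2, 3, 4
};

/* Sketch of a strstr that only compares at character boundaries.
 * Assumes both strings are valid, NUL-terminated UTF-8. */
static const char *utf8_strstr_sketch(const char *hay, const char *needle)
{
    size_t nlen;

    assert(hay != NULL && needle != NULL);
    nlen = strlen(needle);
    if (nlen == 0)
        return hay;

    while (*hay != '\0') {
        size_t step;

        /* strncmp stops at the haystack's NUL, so this never overruns. */
        if (strncmp(hay, needle, nlen) == 0)
            return hay;

        /* Skip the rest of this character with one table lookup
         * instead of decoding it. */
        step = utf8_len_by_nibble[(unsigned char)*hay >> 4];
        while (step-- > 0 && *hay != '\0')
            hay++;
    }
    return NULL;
}
```

The guard against '\0' in the stepping loop only matters for truncated input; with strings validated up front it could be dropped, which is the kind of inner-loop conditional worth measuring.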
There are also tradeoffs for "safe" behavior--should every string
function do validation logic? The strpbrk function posted recently has
a conditional in the inner loop to report errors for invalid UTF-8
sequences; it may or may not make a speed difference. Personally, I'd
keep validation in one place: validate strings when they're created, and
have thinner library routines. (It might not matter.)
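
As a rough sketch of "validate in one place", a single checker like the one below could run when a string enters the program, leaving the library routines free of per-call error checks. The name utf8_is_valid_sketch is hypothetical, and the rules it enforces (at most four bytes, no overlong forms, no surrogates, nothing above U+10FFFF) are an assumption about what "valid" should mean, not something taken from the strpbrk posting.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the NUL-terminated string s is well-formed UTF-8
 * under the assumptions stated above. */
static bool utf8_is_valid_sketch(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p != 0) {
        unsigned long cp;   /* code point being assembled */
        unsigned long min;  /* smallest value allowed for this length */
        int extra;          /* continuation bytes still expected */

        if (*p < 0x80) {                    /* ASCII */
            p++;
            continue;
        } else if ((*p & 0xE0) == 0xC0) {   /* 2-byte lead */
            cp = *p & 0x1F; extra = 1; min = 0x80;
        } else if ((*p & 0xF0) == 0xE0) {   /* 3-byte lead */
            cp = *p & 0x0F; extra = 2; min = 0x800;
        } else if ((*p & 0xF8) == 0xF0) {   /* 4-byte lead */
            cp = *p & 0x07; extra = 3; min = 0x10000;
        } else {
            return false;   /* stray continuation byte or invalid lead */
        }
        p++;

        while (extra-- > 0) {
            /* A NUL fails this test too, so truncation is caught here. */
            if ((*p & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (*p & 0x3F);
            p++;
        }

        if (cp < min)                       /* overlong encoding */
            return false;
        if (cp > 0x10FFFF)                  /* beyond the Unicode range */
            return false;
        if (cp >= 0xD800 && cp <= 0xDFFF)   /* UTF-16 surrogate */
            return false;
    }
    return true;
}
```

All the branching lives here and is paid once per string; whether that actually beats checking inside every routine is still something to measure rather than assume.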
--
Glenn Maynard
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/