I recall that about two years ago we had heated discussions here on
whether UTF-8 support should be implemented by
a) hardwired mechanisms fully optimized to make good use of UTF-8's
neat properties
b) relying entirely on ISO C's generic multi-byte functions, to make
sure that even stateful monsters like the ISO 2022 encodings
are supported equally.
Unfortunately, it seems that grep has become an excellent teaching
example of how option b) can backfire with a ridiculous performance loss
in a basic text-processing tool.
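
To make the contrast concrete, here is a minimal sketch in C of the two
approaches applied to one simple task, counting the characters in a
buffer. It is not taken from grep; the function names count_utf8 and
count_mbrtowc are made up for illustration. The first relies on the
UTF-8 property that every continuation byte has the form 10xxxxxx and
assumes well-formed input; the second goes through ISO C's mbrtowc()
and therefore works for whatever encoding the current locale selects.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Option a), hypothetical helper: count characters by counting every
 * byte that is NOT a continuation byte (10xxxxxx).
 * Assumes the buffer holds well-formed UTF-8; no validation is done. */
size_t count_utf8(const char *s, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            count++;
    }
    return count;
}

/* Option b), hypothetical helper: count characters with the generic
 * multi-byte interface. The result depends on the LC_CTYPE locale in
 * effect. Returns (size_t)-1 on an invalid or truncated sequence. */
size_t count_mbrtowc(const char *s, size_t n)
{
    mbstate_t state;
    size_t count = 0;

    memset(&state, 0, sizeof state);   /* start in the initial shift state */
    while (n > 0) {
        wchar_t wc;
        size_t len = mbrtowc(&wc, s, n, &state);

        if (len == (size_t)-1 || len == (size_t)-2)
            return (size_t)-1;         /* invalid or incomplete sequence */
        if (len == 0)
            len = 1;                   /* embedded null character: step over it */
        s += len;
        n -= len;
        count++;
    }
    return count;
}
```

For the second function to see UTF-8 input as multi-byte characters,
the program would first have to call setlocale(LC_CTYPE, "") under a
UTF-8 locale. The first function needs no locale at all and does one
mask and compare per byte, whereas the second makes a library call per
character, which is the kind of overhead at issue here.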