Re: Byte-order-marks considered harmful
Karlsson Kent - keka wrote on 1999-11-08 15:41 UTC:
> How do you propose to handle SOFT HYPHEN? It should be
> zero-width and invisible unless a linebreak is done at that point
> (in which case it should be rendered as a hyphen).
You obviously misunderstood ISO 8859 here, which admittedly used
extremely bad language in the relevant section. SOFT HYPHEN is a
completely normal character, with a glyph identical to or almost
identical to HYPHEN. A soft hyphen should be used whenever a hyphen is
inserted to make a line break when a paragraph in a formatted text file
is broken into lines. Soft hyphen characters plus the following white
space obviously have to be removed by the paragraph formatting algorithm
before reformatting the paragraph. Soft hyphens are NOT there to invisibly
mark potential hyphenation points within words. Neither ISO 8859 nor
ISO 10646 reserves an extra code point ZERO-WIDTH HYPHENATION POINT for
this. Feel free to suggest adding such a code if you need one.
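
As a rough illustration of the removal step described above, here is a
minimal Python sketch of how a paragraph formatter might undo earlier
line breaks before refilling. The function name unwrap_paragraph and the
choice to collapse all remaining white space to single spaces are
assumptions made for this example, not taken from any existing tool.

```python
import re

SOFT_HYPHEN = "\u00ad"  # U+00AD, 0xAD in the ISO 8859 character sets


def unwrap_paragraph(lines):
    """Join the lines of one formatted paragraph into a single string.

    Hypothetical helper: each soft hyphen is dropped together with the
    white space (including the line break) that follows it, so a word
    that was broken across two lines becomes whole again.
    """
    text = "\n".join(lines)
    # Remove SOFT HYPHEN plus any following white space. Using \s* also
    # drops a stray soft hyphen that has no white space after it.
    text = re.sub(SOFT_HYPHEN + r"\s*", "", text)
    # Collapse the remaining line breaks and runs of spaces.
    return " ".join(text.split())
```

A formatter would then refill the returned string to the new width and
insert fresh soft hyphens wherever it breaks a word again.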
If a soft hyphen appears anywhere other than at the end of a line, this
is probably due to a bug in the software that (re)formatted the
text and forgot to remove the soft hyphens from words that are no longer
broken after the formatting. There is absolutely no reason for
a terminal emulator or other text display software to treat SOFT HYPHEN
any differently from HYPHEN. The difference only matters to paragraph
reformatting routines, but even there it has not been widely
implemented.
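
For the display side, the treatment argued for above could look like the
following Python sketch. The name display_form is hypothetical, and
using HYPHEN-MINUS (U+002D) as the glyph stand-in is an assumption made
for this example.

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD
HYPHEN = "-"            # U+002D HYPHEN-MINUS, assumed stand-in glyph


def display_form(line):
    """Return the text a display would draw for this line.

    Hypothetical helper: SOFT HYPHEN is shown exactly like HYPHEN and
    occupies one character cell, so the line length does not change.
    """
    return line.replace(SOFT_HYPHEN, HYPHEN)
```

Only the drawn glyph changes here; the stored text keeps its soft
hyphens so that a later reformatting pass can still find them.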
Markus
--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/