Re: quotearg should quote "like this" instead of `like this' in C locale
Francois Pinard wrote on 2000-07-14 01:23 UTC:
> This sends the wrong message to users that C locale should prefer "quote"
> over `quote', which is untrue if C locale is basing itself on ASCII.
Users of the C locale should not abuse the grave accent as a quotation
mark, because it looks silly with a very significant number of fonts.
http://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html
Most implementations of the C locale are not based on ANSI X3.4-1968
(ASCII) today, but on ISO 646 IRV, ISO 8859, ISO 10646, etc.
The C standard does definitely *not* imply that the C locale is based on
ASCII or even a particular flavour of ASCII. In fact, the C standard
does not require the C locale to support for instance the characters
U+0024 ($), U+0040 (@), or U+0060 (`) at all! Let me quote ISO/IEC
9899:1999 (E) section 5.2.1, paragraph 3:
    [#3] Both the basic source and basic execution character
    sets shall have the following members: the 26 uppercase
    letters of the Latin alphabet
        A B C D E F G H I J K L M
        N O P Q R S T U V W X Y Z
    the 26 lowercase letters of the Latin alphabet
        a b c d e f g h i j k l m
        n o p q r s t u v w x y z
    the 10 decimal digits
        0 1 2 3 4 5 6 7 8 9
    the following 29 graphic characters
        ! " # % & ' ( ) * + , - . / :
        ; < = > ? [ \ ] ^ _ { | } ~
    the space character, and control characters representing
    horizontal tab, vertical tab, and form feed. The
    representation of each member of the source and execution
    basic character sets shall fit in a byte. In both the
    source and execution basic character sets, the value of each
    character after 0 in the above list of decimal digits shall
    be one greater than the value of the previous. In source
    files, there shall be some way of indicating the end of each
    line of text; this International Standard treats such an
    end-of-line indicator as if it were a single new-line
    character. In the basic execution character set, there
    shall be control characters representing alert, backspace,
    carriage return, and new line. If any other characters are
    encountered in a source file (except in an identifier, a
    character constant, a string literal, a header name, a
    comment, or a preprocessing token that is never converted to
    a token), the behavior is undefined.
That is all that C requires of the character set, and UTF-8 or CP1252
fit that requirement just as well as ASCII does.
Again, please do not claim that the C locale is in any way tied to ASCII
or even a particular incarnation or font style of it, because obviously
it really isn't.
I very much advocate that once glibc 2.2 has been rolled out and widely
deployed, we start thinking about making UTF-8 the character encoding of
the C locale. Plan 9 has been doing that successfully for almost a decade
now, and this way the fathers of C and Unix at Bell Labs have already
sent us a very clear message with regard to their preference in this
matter.
> We should not send message to users that under C locale, [quotation mark]
> directionality is better lost. Reverting to the previous behaviour is
> the correct thing to do.
Sorry, I strongly disagree. The most portable way of using quotation
marks in C's portable basic execution set is 'quote' or "quote", but
certainly not `quote'.
ANSI X3.4-1968 ASCII is horribly annoying, especially with proportional
fonts. The most annoying bit for me is the unification of the hyphen and
the minus. This might not have been a problem for monospaced typewriter
fonts, but I can no longer stand seeing hyphen glyphs, which
are fatter and half the width of a plus sign in proportional fonts,
being used as minus signs (which should look just like the horizontal
bar of a plus) or en-dashes (e.g., as dashes in unnumbered lists). The
quotation mark problem is in my opinion trivial compared to the minus
mess.
Markus
--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/