From mboxrd@z Thu Jan  1 00:00:00 1970
From: Zack Weinberg <zack@codesourcery.com>
To: dewar@gnat.com
Cc: fw@deneb.enyo.de, bosch@gnat.com, dnovillo@redhat.com, gcc@gcc.gnu.org, kenner@vlsi1.ultra.nyu.edu
Subject: Re: Ada files now checked in
Date: Sun, 07 Oct 2001 11:14:00 -0000
Message-id: <20011007111450.N9432@codesourcery.com>
References: <20011007123426.63316F28AE@nile.gnat.com>
X-SW-Source: 2001-10/msg00499.html

On Sun, Oct 07, 2001 at 08:34:26AM -0400, dewar@gnat.com wrote:
> >>I'm surprised to read that English grammar includes typography. ;-)
> 
> it most certainly includes punctuation, and that is what we are talking
> about here! A quotation (which is what this is) appears in quotation
> marks (which look roughly like "").
> 
> The character ` is indeed a grave accent in Latin-1, and of course English
> doesn't have accents at all, further clarifying my point (I used the
> phrase reverse quote, since I thought it would be more familiar to
> english readers, and furthermore, that's what is going on here, we
> are misusing a grave accent as an opening quotation mark. Indeed it
> is a bit odd if we insist on this misuse, that we do not also use
> Acute (Latin-1 code 180) as the closing quote, at least that would
> be just slightly less odd to the eye.

The issue here is that there are (historically) two competing
standards for the meaning, and therefore the appropriate glyph, of
ASCII codes 27 and 60 (hexadecimal).  One is Latin-1: 27 is APOSTROPHE
and 60 is GRAVE ACCENT.  The other standard was never officially
promulgated but it is deeply embedded in the assumptions of programs
such as M4, TeX, and GNU Info.  I suspect its origin is the way the
original VT100 terminal displayed the characters.  Anyway, in that
standard 27 is RIGHT SINGLE QUOTATION MARK and 60 is LEFT SINGLE
QUOTATION MARK.  The error messages issued by cc1 are obeying the
second standard.

Unicode matches Latin-1, and there seems to be a trend toward that
choice in font design; for instance, XFree86 4.1's fonts use the first
standard (previous versions used the second).  I haven't seen any
believable transition plan for programs like M4 and TeX, though.
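
The Unicode side of this is easy to check.  As a minimal sketch (the
helper name describe is only illustrative), Python's standard
unicodedata module reports the official names of the two disputed
code points and of ASCII 22:

```python
import unicodedata

# The two disputed code points, plus the neutral double quote.
CODES = (0x27, 0x60, 0x22)

def describe(code):
    """Return 'HEX NAME' for a code point, per the Unicode database."""
    return "%02X %s" % (code, unicodedata.name(chr(code)))

for code in CODES:
    print(describe(code))
```

The names come from the Unicode character database bundled with
Python; they say nothing about how a particular font draws the glyphs.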

It is probably best if GCC avoids the issue by using exclusively ASCII
22 (QUOTATION MARK) in error messages.  If I understand you, this is
what Ada does now.  Unlike some people, I do not expect Unicode to be
universally accepted anytime soon.  (But of course we can use U+201C
and U+201D (LEFT and RIGHT DOUBLE QUOTATION MARK) or U+2018 and U+2019
(LEFT and RIGHT SINGLE QUOTATION MARK) in translations targeted at
Unicode locales.)
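
To make the idea concrete, here is a rough sketch of choosing quote
characters by output encoding.  It is not how GCC or its message
catalogs work; the functions quote_marks and quote are hypothetical,
and treating every "utf*" codec as Unicode-capable is an assumption
made for illustration.

```python
import codecs

def quote_marks(encoding):
    """Pick (opening, closing) quotes for the given output encoding."""
    try:
        # Normalize spellings such as "UTF-8" and "utf8" to one name.
        name = codecs.lookup(encoding).name
    except LookupError:
        name = ""  # unknown encoding: fall back to plain ASCII
    if name.startswith("utf"):
        # U+201C and U+201D, LEFT and RIGHT DOUBLE QUOTATION MARK
        return "\u201c", "\u201d"
    # ASCII 22, QUOTATION MARK, on both sides
    return '"', '"'

def quote(text, encoding):
    """Wrap text in the quotes appropriate for encoding."""
    opening, closing = quote_marks(encoding)
    return opening + text + closing
```

A caller could pass the result of locale.getpreferredencoding(False)
as the encoding argument; anything that is not recognizably a Unicode
encoding falls back to ASCII 22.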

Search-and-replace patches to change all the existing error messages
should be welcome.
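
As a starting point, such a patch could be generated with something
like the sketch below.  The pattern and the function name requote are
assumptions for illustration, not an existing tool, and the output
would still need review by hand.

```python
import re

# Matches `text' where text contains no backquote, apostrophe, or
# newline, so a lone apostrophe (as in "can't") is left alone.
OLD_QUOTES = re.compile(r"`([^`'\n]*)'")

def requote(message):
    """Rewrite `foo' as "foo" in a single message string."""
    return OLD_QUOTES.sub(r'"\1"', message)

print(requote("`%s' undeclared (first use in this function)"))
```

This operates on the message text itself.  Applied to C source, the
replacement quotes would have to be written as \" inside the string
literal, and quoted text that itself contains an apostrophe is not
handled.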

zw