From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zack Weinberg
To: dewar@gnat.com
Cc: fw@deneb.enyo.de, bosch@gnat.com, dnovillo@redhat.com, gcc@gcc.gnu.org, kenner@vlsi1.ultra.nyu.edu
Subject: Re: Ada files now checked in
Date: Sun, 07 Oct 2001 11:14:00 -0000
Message-id: <20011007111450.N9432@codesourcery.com>
References: <20011007123426.63316F28AE@nile.gnat.com>
X-SW-Source: 2001-10/msg00499.html

On Sun, Oct 07, 2001 at 08:34:26AM -0400, dewar@gnat.com wrote:
> >>I'm surprised to read that English grammar includes typography. ;-)
>
> it most certainly includes punctuation, and that is what we are talking
> about here! A quotation (which is what this is) appears in quotation
> marks (which look roughly like "").
>
> The character ` is indeed a grave accent in Latin-1, and of course English
> doesn't have accents at all, further clarifying my point (I used the
> phrase reverse quote, since I thought it would be more familiar to
> english readers, and furthermore, that's what is going on here, we
> are misusing a grave accent as an opening quotation mark. Indeed it
> is a bit odd if we insist on this misuse, that we do not also use
> Acute (Latin-1 code 180) as the closing quote, at least that would
> be just slightly less odd to the eye.

The issue here is that there are (historically) two competing
standards for the meaning and therefore the appropriate glyph for
ASCII codes 27 and 60 (hexadecimal). One is Latin-1: 27 is APOSTROPHE
and 60 is GRAVE ACCENT. The other standard was never officially
promulgated but it is deeply embedded in the assumptions of programs
such as M4, TeX, and GNU Info. I suspect its origin is the way the
original VT100 terminal displayed the characters. Anyway, in that
standard 27 is RIGHT SINGLE QUOTATION MARK and 60 is LEFT SINGLE
QUOTATION MARK. The error messages issued by cc1 are obeying the
second standard.
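
To make the two readings concrete, here is a rough sketch in Python. It is only an illustration: the sample message is made up rather than copied from cc1, and the names LEGACY_QUOTES and as_legacy_reading are hypothetical. It prints the names the first standard assigns to codes 27 and 60, then shows the glyphs the second standard has in mind for the same bytes.

```python
import unicodedata

# The two ASCII code points under discussion (hexadecimal 27 and 60).
APOSTROPHE = "\x27"
GRAVE = "\x60"

# Hypothetical table for illustration: under the second, unofficial
# convention, 60 is meant as a left single quotation mark and 27 as a
# right single quotation mark.
LEGACY_QUOTES = str.maketrans({GRAVE: "\u2018", APOSTROPHE: "\u2019"})


def standard_name(ch):
    """Return the character name used by the first standard."""
    return unicodedata.name(ch)


def as_legacy_reading(message):
    """Show a message the way the second convention intends it to look."""
    return message.translate(LEGACY_QUOTES)


# Made-up message in the style under discussion, not real cc1 output.
sample = "`foo' undeclared"

print(standard_name(APOSTROPHE))   # APOSTROPHE
print(standard_name(GRAVE))        # GRAVE ACCENT
print(as_legacy_reading(sample))   # foo wrapped in U+2018 and U+2019
```

With a font that follows the first standard, the raw sample shows a grave accent paired with a vertical apostrophe; the translated string is what the second standard assumes the reader will see.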

Unicode matches Latin-1 and there seems to be a trend toward that
choice in font design, for instance XFree86 4.1's fonts use the first
standard (previous versions use the second). I haven't seen any
believable transition plan for programs like M4 and TeX, though.

It is probably best if GCC avoids the issue by using exclusively ASCII
22 (QUOTATION MARK) in error messages. If I understand you, this is
what Ada does now. Unlike some people, I do not expect Unicode to be
universally accepted anytime soon. (But of course we can use U+201C
and U+201D (LEFT and RIGHT DOUBLE QUOTATION MARK) or U+2018 and U+2019
(LEFT and RIGHT SINGLE QUOTATION MARK) in translations targeted at
Unicode locales.) Search-and-replace patches to change all the
existing error messages should be welcome.

zw