public inbox for gcc@gcc.gnu.org
* dg-error vs. i18n?
@ 2009-10-23 23:04 Dave Korn
  2009-10-23 23:08 ` Richard Guenther
  2009-10-23 23:19 ` Andrew Pinski
  0 siblings, 2 replies; 17+ messages in thread
From: Dave Korn @ 2009-10-23 23:04 UTC
  To: gcc


    Hi everyone,

  Sorry for posting a dumb question, but it's not my strongest area: now that
cygwin is handling i18n and unicode and "all that stuff", I started seeing a
whole slew of test failures, e.g.:

> FAIL: g++.dg/debug/pr22514.C  (test for errors, line 12)
> FAIL: g++.dg/debug/pr22514.C (test for excess errors)
> Excess errors:
> /gnu/gcc/releases/4.3.4-2/gcc4-4.3.4-2/src/gcc-4.3.4/gcc/testsuite/g++.dg/debug/pr22514.C:12:
> error: expected unqualified-id before ‘}’ token

  The reason appears to be that the testcase has single-quotes in the regex
pattern:

>> $ cat g++.dg/debug/pr22514.C -n
>>      1  /* { dg-do compile } */
>>      2  namespace s
>>      3  {
>>      4    template <int> struct _List_base
>>      5    {
>>      6       int _M_impl;
>>      7    };
>>      8    template<int i> struct list : _List_base<i>
>>      9    {
>>     10      using _List_base<i>::_M_impl;
>>     11    }
>>     12  }  /* { dg-error "expected unqualified-id before '\}'" } */
>>     13  s::list<1> OutputModuleListType;

... where the actual compiler outputs those fancy left- and right-facing
quotes.  It will probably go away if I set LC_ALL=C or something like that,
but is dg-error meant to be insensitive to this kind of transformation, or
would it be best if dg-error test patterns didn't include any kind of quote
chars that might get i18n'ed?

    cheers,
      DaveK
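
The mismatch described above can be reproduced outside the testsuite. The
following is only a sketch: it uses Python's `re` module as a stand-in for the
Tcl regexps that DejaGnu actually uses, and the names `matches` and
`tolerant_pattern` are made up for illustration. The diagnostic strings are
taken from the failure log above.

```python
import re

# Pattern as written in the dg-error directive: ASCII apostrophes around '}'.
pattern = r"expected unqualified-id before '\}'"

# Diagnostic as printed in a UTF-8 locale (U+2018 and U+2019 quotes).
utf8_message = "error: expected unqualified-id before \u2018}\u2019 token"
# The same diagnostic as printed with plain ASCII quotes.
ascii_message = "error: expected unqualified-id before '}' token"


def matches(pat, text):
    """Return True if the regex pat is found anywhere in text."""
    return re.search(pat, text) is not None


# Hypothetical quote-agnostic variant: '.' accepts any single character
# in the two quote positions.
tolerant_pattern = r"expected unqualified-id before .\}."

print(matches(pattern, ascii_message))
print(matches(pattern, utf8_message))
print(matches(tolerant_pattern, utf8_message))
```

Whether a single `.` also covers a multi-byte quote in the real harness depends
on how the output is decoded there, so treat the tolerant pattern as an
assumption rather than a recommendation.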

* Re: dg-error vs. i18n?
@ 2009-10-28 11:30 Ross Ridge
  2009-10-28 17:34 ` Joseph S. Myers
  0 siblings, 1 reply; 17+ messages in thread
From: Ross Ridge @ 2009-10-28 11:30 UTC
  To: gcc

Eric Blake writes:
>The correct workaround is indeed to specify a locale with specific charset 
>encodings, rather than relying on plain "C" (hopefully cygwin will 
>support "C.ASCII", if it does not already).

The correct fix is for GCC not to intentionally choose to rely on
implementation-defined behaviour when using the "C" locale.  GCC can't
portably assume any other locale exists, but can portably and easily
choose to get consistent output when using the "C" locale.

>As far as I know, the hole is intentional.  But if others would like
>me to, I am willing to pursue the action of raising a defect against
>the POSIX standard, requesting that the next version of POSIX consider
>including a standardized name for a locale with guaranteed single-byte
>encoding.

I don't see how a defect in POSIX is exposed here.  Nothing in
the standard forced GCC to output multi-byte characters when
nl_langinfo(CODESET) returns something like "utf-8".  GCC could just
as easily have chosen to output these quotes as single-byte characters
when nl_langinfo(CODESET) returns something like "windows-1252", or some
other non-ASCII single-byte characters when it returned "iso-8859-1".

					Ross Ridge
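
To make the point about the codeset concrete, here is a small sketch of a
program choosing its quote characters from the codeset name. It is not how GCC
picks its quotes; `quotes_for_codeset`, `current_codeset` and the `ascii_only`
switch are hypothetical names, and the fallback to ASCII for codesets that
cannot encode the curly quotes is just one possible policy. POSIX names the
`nl_langinfo` item `CODESET`.

```python
import locale

ASCII_QUOTES = ("'", "'")
CURLY_QUOTES = ("\u2018", "\u2019")


def quotes_for_codeset(codeset, ascii_only=False):
    """Pick (open, close) quote characters for a given codeset name."""
    if ascii_only:
        # Policy switch: always restrict output to ASCII.
        return ASCII_QUOTES
    try:
        # Use curly quotes only if the codeset can represent them.
        "".join(CURLY_QUOTES).encode(codeset)
    except (LookupError, UnicodeEncodeError):
        return ASCII_QUOTES
    return CURLY_QUOTES


def current_codeset():
    """Codeset of the current locale; nl_langinfo is not on every platform."""
    if hasattr(locale, "nl_langinfo"):
        return locale.nl_langinfo(locale.CODESET)
    return "ascii"


for name in ("utf-8", "windows-1252", "iso-8859-1"):
    print(name, quotes_for_codeset(name))
```

In this sketch "utf-8" gets multi-byte quotes, "windows-1252" gets the same
quotes as single bytes, and "iso-8859-1" falls back to ASCII because it has no
curly quotes at all. The choice is entirely the program's.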

* Re: dg-error vs. i18n?
@ 2009-10-28 18:59 Ross Ridge
  0 siblings, 0 replies; 17+ messages in thread
From: Ross Ridge @ 2009-10-28 18:59 UTC
  To: gcc

Ross Ridge wrote:
> The correct fix is for GCC not to intentionally choose to rely on
> implementation-defined behaviour when using the "C" locale.  GCC can't
> portably assume any other locale exists, but can portably and easily
> choose to get consistent output when using the "C" locale.

Joseph S. Myers writes:
>GCC is behaving properly according to the user's locale (representing 
>English-language diagnostics as best it can - remember that ASCII does not 
>allow good representation of English in all cases).  

This is an issue of style, but as far as I'm concerned using these
fancy quotes in English locales is unnecessary and unhelpful.

>The problem here is not a bug in the compiler proper, it is an issue
>with how to test the compiler portably - that is, how the testsuite can
>portably set a locale with English language and ASCII character set in
>order to test the output the compiler gives in such a locale.

It's a design flaw in GCC.  The "C" locale is the only locale that GCC
can use to reliably and portably get consistent output across all ASCII
systems and so should be the locale used to achieve consistent output.
GCC can simply choose to restrict its output to ASCII.  It's not in
any way being forced by POSIX to output non-ASCII characters, or for
that matter to treat the "C" locale as an English locale.

					Ross Ridge
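
For the testsuite side of the argument, a harness that wants the "C" locale
can pin it in the environment of the compiler process. This is a sketch only:
`c_locale_env` and the list of variables it clears are assumptions, not
anything DejaGnu or the GCC testsuite is known to do, and whether "C" then
implies ASCII output is exactly what is disputed in this thread.

```python
import os

# Locale-related variables cleared before pinning LC_ALL (illustrative list).
LOCALE_VARS = ("LANG", "LANGUAGE", "LC_ALL", "LC_CTYPE", "LC_MESSAGES")


def c_locale_env(base_env=None):
    """Return a copy of the environment with the locale forced to "C"."""
    env = dict(os.environ if base_env is None else base_env)
    for name in LOCALE_VARS:
        env.pop(name, None)
    # LC_ALL overrides the other LC_* variables and LANG.
    env["LC_ALL"] = "C"
    return env


print(c_locale_env({"LANG": "en_US.UTF-8", "PATH": "/bin"}))
```

The returned dictionary can be passed as the `env` argument of
`subprocess.run` when invoking the compiler under test.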


Thread overview: 17+ messages
2009-10-23 23:04 dg-error vs. i18n? Dave Korn
2009-10-23 23:08 ` Richard Guenther
2009-10-24  0:04   ` Joseph S. Myers
2009-10-24  2:28     ` Dave Korn
2009-10-24  5:06       ` Charles Wilson
2009-10-25 11:19     ` Dave Korn
2009-10-25 18:42       ` Joseph S. Myers
2009-10-23 23:19 ` Andrew Pinski
2009-10-24  1:28   ` Dave Korn
2009-10-24 20:05     ` Andreas Schwab
2009-10-25 11:08       ` Charles Wilson
2009-10-27 15:49         ` Eric Blake
2009-10-27 20:18           ` Dave Korn
2009-10-24 15:17   ` Andreas Schwab
2009-10-28 11:30 Ross Ridge
2009-10-28 17:34 ` Joseph S. Myers
2009-10-28 18:59 Ross Ridge
