public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] PR preprocessor/84517 allow double-underscore macros after string literals
@ 2018-02-27 14:41 Jonathan Wakely
  2018-02-27 16:50 ` Tim Song
  2018-02-27 18:11 ` Jason Merrill
  0 siblings, 2 replies; 5+ messages in thread
From: Jonathan Wakely @ 2018-02-27 14:41 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 896 bytes --]

Since the fix for PR c++/80955 any suffix on a string literal that
begins with an underscore is assumed to be a user-defined literal
suffix, not a macro. This assumption is invalid for a suffix beginning
with two underscores, because such names are reserved and can't be used
for UDLs anyway. Checking for exactly one underscore restores support
for macro expansion in cases like "File: "__FILE__ or "Date: "__DATE__
(which are formally ill-formed but accepted with a warning, as a
conforming extension).

gcc/testsuite:

        PR preprocessor/84517
        * g++.dg/cpp0x/udlit-macros.C: Expect a warning for ""__FILE__.

libcpp:

        PR preprocessor/84517
        * lex.c (is_macro_not_literal_suffix): New function.
        (lex_raw_string, lex_string): Use is_macro_not_literal_suffix to
        decide when to issue -Wliteral-suffix warnings.


Tested powerpc64le-linux, OK for trunk?

[-- Attachment #2: patch.txt --]
[-- Type: text/plain, Size: 4124 bytes --]

commit 9781a25354e07503690259d7fa95e9dd459c51b2
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Thu Feb 22 19:59:38 2018 +0000

    PR preprocessor/84517 allow double-underscore macros after string literals
    
    Since the fix for PR c++/80955 any suffix on a string literal that
    begins with an underscore is assumed to be a user-defined literal
    suffix, not a macro. This assumption is invalid for a suffix beginning
    with two underscores, because such names are reserved and can't be used
    for UDLs anyway. Checking for exactly one underscore restores support
    for macro expansion in cases like "File: "__FILE__ or "Date: "__DATE__
    (which are formally ill-formed but accepted with a warning, as a
    conforming extension).
    
    gcc/testsuite:
    
            PR preprocessor/84517
            * g++.dg/cpp0x/udlit-macros.C: Expect a warning for ""__FILE__.
    
    libcpp:
    
            PR preprocessor/84517
            * lex.c (is_macro_not_literal_suffix): New function.
            (lex_raw_string, lex_string): Use is_macro_not_literal_suffix to
            decide when to issue -Wliteral-suffix warnings.

diff --git a/gcc/testsuite/g++.dg/cpp0x/udlit-macros.C b/gcc/testsuite/g++.dg/cpp0x/udlit-macros.C
index fb518281811..7ef324b7e04 100644
--- a/gcc/testsuite/g++.dg/cpp0x/udlit-macros.C
+++ b/gcc/testsuite/g++.dg/cpp0x/udlit-macros.C
@@ -16,7 +16,7 @@ int operator""_ID(const char*, size_t) { return 0; }
 int main()
 {
   long i64 = 123;
-  char buf[100];
+  char buf[] = "xxxxxx"__FILE__;      // { dg-warning "invalid suffix on literal" }
   sprintf(buf, "%"PRId64"abc", i64);  // { dg-warning "invalid suffix on literal" }
   return strcmp(buf, "123abc")
 	 + ""_zero
diff --git a/libcpp/lex.c b/libcpp/lex.c
index 92c62517a4d..37c365a3560 100644
--- a/libcpp/lex.c
+++ b/libcpp/lex.c
@@ -1630,6 +1630,21 @@ is_macro(cpp_reader *pfile, const uchar *base)
   return !result ? false : (result->type == NT_MACRO);
 }
 
+/* Returns true if a literal suffix does not have the expected form
+   and is defined as a macro.  */
+
+static bool
+is_macro_not_literal_suffix(cpp_reader *pfile, const uchar *base)
+{
+  /* User-defined literals outside of namespace std must start with a single
+     underscore, so assume anything of that form really is a UDL suffix.
+     We don't need to worry about UDLs defined inside namespace std because
+     their names are reserved, so cannot be used as macro names in valid
+     programs.  */
+  if (base[0] == '_' && base[1] != '_')
+    return false;
+  return is_macro (pfile, base);
+}
 
 /* Lexes a raw string.  The stored string contains the spelling, including
    double quotes, delimiter string, '(' and ')', any leading
@@ -1900,10 +1915,8 @@ lex_raw_string (cpp_reader *pfile, cpp_token *token, const uchar *base,
     {
       /* If a string format macro, say from inttypes.h, is placed touching
 	 a string literal it could be parsed as a C++11 user-defined string
-	 literal thus breaking the program.
-	 Try to identify macros with is_macro. A warning is issued.
-	 The macro name should not start with '_' for this warning. */
-      if ((*cur != '_') && is_macro (pfile, cur))
+	 literal thus breaking the program.  */
+      if (is_macro_not_literal_suffix (pfile, cur))
 	{
 	  /* Raise a warning, but do not consume subsequent tokens.  */
 	  if (CPP_OPTION (pfile, warn_literal_suffix) && !pfile->state.skipping)
@@ -2031,10 +2044,8 @@ lex_string (cpp_reader *pfile, cpp_token *token, const uchar *base)
     {
       /* If a string format macro, say from inttypes.h, is placed touching
 	 a string literal it could be parsed as a C++11 user-defined string
-	 literal thus breaking the program.
-	 Try to identify macros with is_macro. A warning is issued.
-	 The macro name should not start with '_' for this warning. */
-      if ((*cur != '_') && is_macro (pfile, cur))
+	 literal thus breaking the program.  */
+      if (is_macro_not_literal_suffix (pfile, cur))
 	{
 	  /* Raise a warning, but do not consume subsequent tokens.  */
 	  if (CPP_OPTION (pfile, warn_literal_suffix) && !pfile->state.skipping)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PR preprocessor/84517 allow double-underscore macros after string literals
  2018-02-27 14:41 [PATCH] PR preprocessor/84517 allow double-underscore macros after string literals Jonathan Wakely
@ 2018-02-27 16:50 ` Tim Song
  2018-02-27 16:59   ` Jonathan Wakely
  2018-02-27 18:11 ` Jason Merrill
  1 sibling, 1 reply; 5+ messages in thread
From: Tim Song @ 2018-02-27 16:50 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: gcc-patches

On Tue, Feb 27, 2018 at 9:41 AM, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> Since the fix for PR c++/80955 any suffix on a string literal that
> begins with an underscore is assumed to be a user-defined literal
> suffix, not a macro. This assumption is invalid for a suffix beginning
> with two underscores, because such names are reserved and can't be used
> for UDLs anyway.

[lex.name]/3 reserves all identifiers containing double underscore,
but the spaceless one-token form does not actually use such an
identifier.

See the (equally reserved) _Bq example in [over.literal]/8:

double operator""_Bq(long double);                  // OK: does not
use the reserved identifier _­Bq ([lex.name])
double operator"" _Bq(long double);                 // uses the
reserved identifier _­Bq ([lex.name])

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PR preprocessor/84517 allow double-underscore macros after string literals
  2018-02-27 16:50 ` Tim Song
@ 2018-02-27 16:59   ` Jonathan Wakely
  2018-02-27 17:13     ` Jonathan Wakely
  0 siblings, 1 reply; 5+ messages in thread
From: Jonathan Wakely @ 2018-02-27 16:59 UTC (permalink / raw)
  To: Tim Song; +Cc: gcc-patches

On 27 February 2018 at 16:49, Tim Song <t.canens.cpp@gmail.com> wrote:
> On Tue, Feb 27, 2018 at 9:41 AM, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>> Since the fix for PR c++/80955 any suffix on a string literal that
>> begins with an underscore is assumed to be a user-defined literal
>> suffix, not a macro. This assumption is invalid for a suffix beginning
>> with two underscores, because such names are reserved and can't be used
>> for UDLs anyway.
>
> [lex.name]/3 reserves all identifiers containing double underscore,
> but the spaceless one-token form does not actually use such an
> identifier.
>
> See the (equally reserved) _Bq example in [over.literal]/8:
>
> double operator""_Bq(long double);                  // OK: does not
> use the reserved identifier _­Bq ([lex.name])
> double operator"" _Bq(long double);                 // uses the
> reserved identifier _­Bq ([lex.name])

I know, but GCC doesn't implement the rule accurately. I reported PR
80955 because GCC's UDL parsing meant we reject valid programs. The
fix for that bug was a bit of a hack, simply adding a special case so
that a suffix starting with an underscore is never expanded as a
macro, which makes the above examples do the right thing. But that
introduces a regression where we no longer accept ""__FILE__ (because
it starts with an underscore), where previously that was accepted as
an extension, just like "%"PRIu64 is accepted (with a warning).

After the hack for 80955 we still do the wrong thing for:

#define foo
int operator""foo();

But that can't appear in a valid program, so we get away with it. But
it's still a hack.

My patch doesn't try to change how we parse operator""_Bq it just
ensures we accept ""__FILE__ and similar cases. I think my patch means
we reject:

int operator""__X(unsigned long long);

But that also can't appear in a valid program, so again we get away with it.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PR preprocessor/84517 allow double-underscore macros after string literals
  2018-02-27 16:59   ` Jonathan Wakely
@ 2018-02-27 17:13     ` Jonathan Wakely
  0 siblings, 0 replies; 5+ messages in thread
From: Jonathan Wakely @ 2018-02-27 17:13 UTC (permalink / raw)
  To: Tim Song; +Cc: gcc-patches

On 27 February 2018 at 16:59, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> On 27 February 2018 at 16:49, Tim Song <t.canens.cpp@gmail.com> wrote:
>> On Tue, Feb 27, 2018 at 9:41 AM, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>>> Since the fix for PR c++/80955 any suffix on a string literal that
>>> begins with an underscore is assumed to be a user-defined literal
>>> suffix, not a macro. This assumption is invalid for a suffix beginning
>>> with two underscores, because such names are reserved and can't be used
>>> for UDLs anyway.
>>
>> [lex.name]/3 reserves all identifiers containing double underscore,
>> but the spaceless one-token form does not actually use such an
>> identifier.
>>
>> See the (equally reserved) _Bq example in [over.literal]/8:
>>
>> double operator""_Bq(long double);                  // OK: does not
>> use the reserved identifier _­Bq ([lex.name])
>> double operator"" _Bq(long double);                 // uses the
>> reserved identifier _­Bq ([lex.name])
>
> I know, but GCC doesn't implement the rule accurately. I reported PR
> 80955 because GCC's UDL parsing meant we reject valid programs. The
> fix for that bug was a bit of a hack, simply adding a special case so
> that a suffix starting with an underscore is never expanded as a
> macro, which makes the above examples do the right thing. But that
> introduces a regression where we no longer accept ""__FILE__ (because
> it starts with an underscore), where previously that was accepted as
> an extension, just like "%"PRIu64 is accepted (with a warning).
>
> After the hack for 80955 we still do the wrong thing for:
>
> #define foo
> int operator""foo();
>
> But that can't appear in a valid program, so we get away with it. But
> it's still a hack.
>
> My patch doesn't try to change how we parse operator""_Bq it just
> ensures we accept ""__FILE__ and similar cases. I think my patch means
> we reject:
>
> int operator""__X(unsigned long long);
>
> But that also can't appear in a valid program, so again we get away with it.

Sorry for being inaccurate, what I mean is the problem case where a
macro clashes with a UDL can't appear (because defining the macro
would be invalid):

#define __X
int operator""__X(unsigned long long);

However, that doesn't help for this case:

int operator""__FILE__(unsigned long long);

This is allowed as a UDL, and doesn't use a reserved identifier
(because it doesn't define __FILE__, the implementation does), but
would fail after my patch because __FILE__ gets incorrectly expanded
as a macro. I don't have much sympathy for anybody defining such a
UDL.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PR preprocessor/84517 allow double-underscore macros after string literals
  2018-02-27 14:41 [PATCH] PR preprocessor/84517 allow double-underscore macros after string literals Jonathan Wakely
  2018-02-27 16:50 ` Tim Song
@ 2018-02-27 18:11 ` Jason Merrill
  1 sibling, 0 replies; 5+ messages in thread
From: Jason Merrill @ 2018-02-27 18:11 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: gcc-patches

OK.

On Tue, Feb 27, 2018 at 9:41 AM, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> Since the fix for PR c++/80955 any suffix on a string literal that
> begins with an underscore is assumed to be a user-defined literal
> suffix, not a macro. This assumption is invalid for a suffix beginning
> with two underscores, because such names are reserved and can't be used
> for UDLs anyway. Checking for exactly one underscore restores support
> for macro expansion in cases like "File: "__FILE__ or "Date: "__DATE__
> (which are formally ill-formed but accepted with a warning, as a
> conforming extension).
>
> gcc/testsuite:
>
>         PR preprocessor/84517
>         * g++.dg/cpp0x/udlit-macros.C: Expect a warning for ""__FILE__.
>
> libcpp:
>
>         PR preprocessor/84517
>         * lex.c (is_macro_not_literal_suffix): New function.
>         (lex_raw_string, lex_string): Use is_macro_not_literal_suffix to
>         decide when to issue -Wliteral-suffix warnings.
>
>
> Tested powerpc64le-linux, OK for trunk?

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-02-27 18:11 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-27 14:41 [PATCH] PR preprocessor/84517 allow double-underscore macros after string literals Jonathan Wakely
2018-02-27 16:50 ` Tim Song
2018-02-27 16:59   ` Jonathan Wakely
2018-02-27 17:13     ` Jonathan Wakely
2018-02-27 18:11 ` Jason Merrill

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).