[PATCH] c++: Implement C++26 P2361R6 - Unevaluated strings [PR110342]

* [PATCH] c++: Implement C++26 P2361R6 - Unevaluated strings [PR110342]
@ 2023-08-24 13:58 Jakub Jelinek
  2023-10-20 20:12 ` Jason Merrill
  0 siblings, 1 reply; 4+ messages in thread
From: Jakub Jelinek @ 2023-08-24 13:58 UTC (permalink / raw)
  To: Jason Merrill; +Cc: gcc-patches

Hi!

The following patch implements the C++26 unevaluated-string.
As it seems to me to be just extra pedantry, it is implemented only for
-std=c++26 or -std=gnu++26 and later, and only with -pedantic/-pedantic-errors.
Nothing is done for inline asm: while the paper changes that too, it changes
it to a balanced token sequence with implementation-defined rules on what is
and isn't allowed (so, pedantically, accepting asm ("" : "+m" (x));
was accepts-invalid before C++26, but we didn't diagnose anything).
For the other spots mentioned in the paper (static_assert message,
linkage specification, deprecated/nodiscard attributes) it enforces the
requirements: no encoding prefixes, no udlit suffixes, no octal/hexadecimal
escapes (conditional escape sequences were already rejected with -pedantic).
For the deprecated operator "" identifier case I've kept things as is,
because everything seems to have been diagnosed already (a lot being implied
from the string having to be empty).
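
As a rough illustration of the escape rule described above (a standalone
sketch, not the libcpp code), the classification of "numeric" escapes can
be written as follows.  The helper name has_numeric_escape is hypothetical;
in the patch itself the check lives in convert_escape and is only issued as
a pedwarn when CPP_PEDANTIC is set.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper, not part of GCC: returns true if the spelling of a
// string literal body contains an escape that the patch treats as numeric,
// i.e. \x..., \o{...} or an octal escape starting with a digit 0-7.
constexpr bool has_numeric_escape(std::string_view body)
{
  for (std::size_t i = 0; i + 1 < body.size(); ++i)
    {
      if (body[i] != '\\')
        continue;
      const char c = body[i + 1];
      if (c == 'x' || c == 'o' || (c >= '0' && c <= '7'))
        return true;
      // Skip the escaped character so that an escaped backslash followed
      // by 'x' is not misread as a hexadecimal escape.
      ++i;
    }
  return false;
}
```

Under this sketch, simple escapes such as \n and universal character names
such as \u01FC are accepted, which matches what the new tests expect.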

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-08-24  Jakub Jelinek  <jakub@redhat.com>

	PR c++/110342
gcc/cp/
	* parser.cc: Implement C++26 P2361R6 - Unevaluated strings.
	(uneval_string_attr): New enumerator.
	(cp_parser_string_literal_common): Add UNEVAL argument.  If true,
	pass CPP_UNEVAL_STRING rather than CPP_STRING to
	cpp_interpret_string_notranslate.
	(cp_parser_string_literal, cp_parser_userdef_string_literal): Adjust
	callers of cp_parser_string_literal_common.
	(cp_parser_unevaluated_string_literal): New function.
	(cp_parser_parenthesized_expression_list): Handle uneval_string_attr.
	(cp_parser_linkage_specification): Use
	cp_parser_unevaluated_string_literal for C++26.
	(cp_parser_static_assert): Likewise.
	(cp_parser_std_attribute): Use uneval_string_attr for standard
	deprecated and nodiscard attributes.
gcc/testsuite/
	* g++.dg/cpp26/unevalstr1.C: New test.
	* g++.dg/cpp26/unevalstr2.C: New test.
	* g++.dg/cpp0x/udlit-error1.C (lol): Expect an error for C++26
	about user-defined literal in deprecated attribute.
libcpp/
	* include/cpplib.h (TTYPE_TABLE): Add CPP_UNEVAL_STRING literal
	entry.  Use C++11 instead of C++-0x in comments.
	* charset.cc (convert_escape): Add UNEVAL argument, if true,
	pedantically diagnose numeric escape sequences.
	(cpp_interpret_string_1): Formatting fix.  Adjust convert_escape
	caller.
	(cpp_interpret_string): Formatting fix.
	(cpp_interpret_string_notranslate): Pass type through to
	cpp_interpret_string if it is CPP_UNEVAL_STRING.

--- gcc/cp/parser.cc.jj	2023-08-23 11:22:28.006593913 +0200
+++ gcc/cp/parser.cc	2023-08-23 12:21:31.384232520 +0200
@@ -2267,7 +2267,8 @@ static vec<tree, va_gc> *cp_parser_paren
   (cp_parser *, int, bool, bool, bool *, location_t * = NULL,
    bool = false);
 /* Values for the second parameter of cp_parser_parenthesized_expression_list.  */
-enum { non_attr = 0, normal_attr = 1, id_attr = 2, assume_attr = 3 };
+enum { non_attr = 0, normal_attr = 1, id_attr = 2, assume_attr = 3,
+       uneval_string_attr = 4 };
 static void cp_parser_pseudo_destructor_name
   (cp_parser *, tree, tree *, tree *);
 static cp_expr cp_parser_unary_expression
@@ -4409,7 +4410,8 @@ cp_parser_identifier (cp_parser* parser)
     return error_mark_node;
 }
 
-/* Worker for cp_parser_string_literal and cp_parser_userdef_string_literal.
+/* Worker for cp_parser_string_literal, cp_parser_userdef_string_literal
+   and cp_parser_unevaluated_string_literal.
    Do not call this directly; use either of the above.
 
    Parse a sequence of adjacent string constants.  Return a
@@ -4417,7 +4419,8 @@ cp_parser_identifier (cp_parser* parser)
    constant.  If TRANSLATE is true, translate the string to the
    execution character set.  If WIDE_OK is true, a wide string is
    valid here.  If UDL_OK is true, a string literal with user-defined
-   suffix can be used in this context.
+   suffix can be used in this context.  If UNEVAL is true, diagnose
+   numeric and conditional escape sequences in it if pedantic.
 
    C++98 [lex.string] says that if a narrow string literal token is
    adjacent to a wide string literal token, the behavior is undefined.
@@ -4431,7 +4434,7 @@ cp_parser_identifier (cp_parser* parser)
 static cp_expr
 cp_parser_string_literal_common (cp_parser *parser, bool translate,
 				 bool wide_ok, bool udl_ok,
-				 bool lookup_udlit)
+				 bool lookup_udlit, bool uneval)
 {
   tree value;
   size_t count;
@@ -4584,6 +4587,8 @@ cp_parser_string_literal_common (cp_pars
       cp_parser_error (parser, "a wide string is invalid in this context");
       type = CPP_STRING;
     }
+  if (uneval)
+    type = CPP_UNEVAL_STRING;
 
   if ((translate ? cpp_interpret_string : cpp_interpret_string_notranslate)
       (parse_in, strs, count, &istr, type))
@@ -4658,7 +4663,8 @@ cp_parser_string_literal (cp_parser *par
 {
   return cp_parser_string_literal_common (parser, translate, wide_ok,
 					  /*udl_ok=*/false,
-					  /*lookup_udlit=*/false);
+					  /*lookup_udlit=*/false,
+					  /*uneval=*/false);
 }
 
 /* Parse a string literal or user defined string literal.
@@ -4673,7 +4679,21 @@ cp_parser_userdef_string_literal (cp_par
 {
   return cp_parser_string_literal_common (parser, /*translate=*/true,
 					  /*wide_ok=*/true, /*udl_ok=*/true,
-					  lookup_udlit);
+					  lookup_udlit, /*uneval=*/false);
+}
+
+/* Parse an unevaluated string literal.
+
+   unevaluated-string:
+     string-literal  */
+
+static inline cp_expr
+cp_parser_unevaluated_string_literal (cp_parser *parser)
+{
+  return cp_parser_string_literal_common (parser, /*translate=*/false,
+					  /*wide_ok=*/false, /*udl_ok=*/false,
+					  /*lookup_udlit=*/false,
+					  /*uneval=*/true);
 }
 
 /* Look up a literal operator with the name and the exact arguments.  */
@@ -8578,6 +8598,8 @@ cp_parser_parenthesized_expression_list
 	  expr = cp_lexer_consume_token (parser->lexer)->u.value;
 	else if (is_attribute_list == assume_attr)
 	  expr = cp_parser_conditional_expression (parser);
+	else if (is_attribute_list == uneval_string_attr)
+	  expr = cp_parser_unevaluated_string_literal (parser);
 	else
 	  expr
 	    = cp_parser_parenthesized_expression_list_elt (parser, cast_p,
@@ -16319,8 +16341,12 @@ cp_parser_linkage_specification (cp_pars
 
   /* Look for the string-literal.  */
   cp_token *string_token = cp_lexer_peek_token (parser->lexer);
-  tree linkage = cp_parser_string_literal (parser, /*translate=*/false,
-					   /*wide_ok=*/false);
+  tree linkage;
+  if (cxx_dialect >= cxx26)
+    linkage = cp_parser_unevaluated_string_literal (parser);
+  else
+    linkage = cp_parser_string_literal (parser, /*translate=*/false,
+					/*wide_ok=*/false);
 
   /* Transform the literal into an identifier.  If the literal is a
      wide-character string, or contains embedded NULs, then we can't
@@ -16449,8 +16475,11 @@ cp_parser_static_assert (cp_parser *pars
       cp_parser_require (parser, CPP_COMMA, RT_COMMA);
 
       /* Parse the string-literal message.  */
-      message = cp_parser_string_literal (parser, /*translate=*/false,
-					  /*wide_ok=*/true);
+      if (cxx_dialect >= cxx26)
+	message = cp_parser_unevaluated_string_literal (parser);
+      else
+	message = cp_parser_string_literal (parser, /*translate=*/false,
+					    /*wide_ok=*/true);
 
       /* A `)' completes the static assertion.  */
       if (!parens.require_close (parser))
@@ -29442,6 +29471,11 @@ cp_parser_std_attribute (cp_parser *pars
 	     && attribute_takes_identifier_p (attr_id))
       /* A GNU attribute that takes an identifier in parameter.  */
       attr_flag = id_attr;
+    else if (attr_ns == NULL_TREE
+	     && cxx_dialect >= cxx26
+	     && (is_attribute_p ("deprecated", attr_id)
+		 || is_attribute_p ("nodiscard", attr_id)))
+      attr_flag = uneval_string_attr;
 
     /* If this is a fake attribute created to handle -Wno-attributes,
        we must skip parsing the arguments.  */
--- gcc/testsuite/g++.dg/cpp26/unevalstr1.C.jj	2023-08-23 13:07:05.960665571 +0200
+++ gcc/testsuite/g++.dg/cpp26/unevalstr1.C	2023-08-23 13:09:59.782410316 +0200
@@ -0,0 +1,103 @@
+// C++26 P2361R6 - Unevaluated strings
+// { dg-do compile { target c++26 } }
+
+static_assert (true, "foo");
+static_assert (true, "foo" " " "bar");
+static_assert (true, "\u01FC\U000001FC\u{1FC}\N{LATIN CAPITAL LETTER AE WITH ACUTE}\\\'\"\?\a\b\f\n\r\t\v");
+static_assert (true, L"foo");		// { dg-error "a wide string is invalid in this context" }
+static_assert (true, u"foo");		// { dg-error "a wide string is invalid in this context" }
+static_assert (true, U"foo");		// { dg-error "a wide string is invalid in this context" }
+static_assert (true, u8"foo");		// { dg-error "a wide string is invalid in this context" }
+static_assert (true, L"fo" "o");	// { dg-error "a wide string is invalid in this context" }
+static_assert (true, u"fo" "o");	// { dg-error "a wide string is invalid in this context" }
+static_assert (true, U"fo" "o");	// { dg-error "a wide string is invalid in this context" }
+static_assert (true, u8"fo" "o");	// { dg-error "a wide string is invalid in this context" }
+static_assert (true, "fo" L"o");	// { dg-error "a wide string is invalid in this context" }
+static_assert (true, "fo" u"o");	// { dg-error "a wide string is invalid in this context" }
+static_assert (true, "fo" U"o");	// { dg-error "a wide string is invalid in this context" }
+static_assert (true, "fo" u8"o");	// { dg-error "a wide string is invalid in this context" }
+static_assert (true, "\0");		// { dg-error "numeric escape sequence in unevaluated string" }
+static_assert (true, "\17");		// { dg-error "numeric escape sequence in unevaluated string" }
+static_assert (true, "\x20");		// { dg-error "numeric escape sequence in unevaluated string" }
+static_assert (true, "\o{17}");		// { dg-error "numeric escape sequence in unevaluated string" }
+static_assert (true, "\x{20}");		// { dg-error "numeric escape sequence in unevaluated string" }
+static_assert (true, "\h");		// { dg-error "unknown escape sequence" }
+
+extern "C" "+" "+" int f0 ();
+extern "C" int f1 ();
+extern "C" { int f2 (); };
+extern L"C" int f3 ();			// { dg-error "a wide string is invalid in this context" }
+extern L"C" { int f4 (); }		// { dg-error "a wide string is invalid in this context" }
+extern u"C" int f5 ();			// { dg-error "a wide string is invalid in this context" }
+extern u"C" { int f6 (); }		// { dg-error "a wide string is invalid in this context" }
+extern U"C" int f7 ();			// { dg-error "a wide string is invalid in this context" }
+extern U"C" { int f8 (); }		// { dg-error "a wide string is invalid in this context" }
+extern u8"C" int f9 ();			// { dg-error "a wide string is invalid in this context" }
+extern u8"C" { int f10 (); }		// { dg-error "a wide string is invalid in this context" }
+extern "\x43" int f11 ();		// { dg-error "numeric escape sequence in unevaluated string" }
+extern "\x{43}" { int f12 (); }		// { dg-error "numeric escape sequence in unevaluated string" }
+extern "\103" int f13 ();		// { dg-error "numeric escape sequence in unevaluated string" }
+extern "\o{0103}" { int f14 (); }	// { dg-error "numeric escape sequence in unevaluated string" }
+
+[[deprecated ("foo")]] int g0 ();
+[[deprecated ("foo" " " "bar")]] int g1 ();
+[[deprecated ("\u01FC\U000001FC\u{1FC}\N{LATIN CAPITAL LETTER AE WITH ACUTE}\\\'\"\?\a\b\f\n\r\t\v")]] int g2 ();
+[[deprecated (L"foo")]] int g3 ();	// { dg-error "a wide string is invalid in this context" }
+[[deprecated (u"foo")]] int g4 ();	// { dg-error "a wide string is invalid in this context" }
+[[deprecated (U"foo")]] int g5 ();	// { dg-error "a wide string is invalid in this context" }
+[[deprecated (u8"foo")]] int g6 ();	// { dg-error "a wide string is invalid in this context" }
+[[deprecated (L"fo" "o")]] int g7 ();	// { dg-error "a wide string is invalid in this context" }
+[[deprecated (u"fo" "o")]] int g8 ();	// { dg-error "a wide string is invalid in this context" }
+[[deprecated (U"fo" "o")]] int g9 ();	// { dg-error "a wide string is invalid in this context" }
+[[deprecated (u8"fo" "o")]] int g10 ();	// { dg-error "a wide string is invalid in this context" }
+[[deprecated ("fo" L"o")]] int g11 ();	// { dg-error "a wide string is invalid in this context" }
+[[deprecated ("fo" u"o")]] int g12 ();	// { dg-error "a wide string is invalid in this context" }
+[[deprecated ("fo" U"o")]] int g13 ();	// { dg-error "a wide string is invalid in this context" }
+[[deprecated ("fo" u8"o")]] int g14 ();	// { dg-error "a wide string is invalid in this context" }
+[[deprecated ("\0")]] int g15 ();	// { dg-error "numeric escape sequence in unevaluated string" }
+[[deprecated ("\17")]] int g16 ();	// { dg-error "numeric escape sequence in unevaluated string" }
+[[deprecated ("\x20")]] int g17 ();	// { dg-error "numeric escape sequence in unevaluated string" }
+[[deprecated ("\o{17}")]] int g18 ();	// { dg-error "numeric escape sequence in unevaluated string" }
+[[deprecated ("\x{20}")]] int g19 ();	// { dg-error "numeric escape sequence in unevaluated string" }
+[[deprecated ("\h")]] int g20 ();	// { dg-error "unknown escape sequence" }
+
+[[nodiscard ("foo")]] int h0 ();
+[[nodiscard ("foo" " " "bar")]] int h1 ();
+[[nodiscard ("\u01FC\U000001FC\u{1FC}\N{LATIN CAPITAL LETTER AE WITH ACUTE}\\\'\"\?\a\b\f\n\r\t\v")]] int h2 ();
+[[nodiscard (L"foo")]] int h3 ();	// { dg-error "a wide string is invalid in this context" }
+[[nodiscard (u"foo")]] int h4 ();	// { dg-error "a wide string is invalid in this context" }
+[[nodiscard (U"foo")]] int h5 ();	// { dg-error "a wide string is invalid in this context" }
+[[nodiscard (u8"foo")]] int h6 ();	// { dg-error "a wide string is invalid in this context" }
+[[nodiscard (L"fo" "o")]] int h7 ();	// { dg-error "a wide string is invalid in this context" }
+[[nodiscard (u"fo" "o")]] int h8 ();	// { dg-error "a wide string is invalid in this context" }
+[[nodiscard (U"fo" "o")]] int h9 ();	// { dg-error "a wide string is invalid in this context" }
+[[nodiscard (u8"fo" "o")]] int h10 ();	// { dg-error "a wide string is invalid in this context" }
+[[nodiscard ("fo" L"o")]] int h11 ();	// { dg-error "a wide string is invalid in this context" }
+[[nodiscard ("fo" u"o")]] int h12 ();	// { dg-error "a wide string is invalid in this context" }
+[[nodiscard ("fo" U"o")]] int h13 ();	// { dg-error "a wide string is invalid in this context" }
+[[nodiscard ("fo" u8"o")]] int h14 ();	// { dg-error "a wide string is invalid in this context" }
+[[nodiscard ("\0")]] int h15 ();	// { dg-error "numeric escape sequence in unevaluated string" }
+[[nodiscard ("\17")]] int h16 ();	// { dg-error "numeric escape sequence in unevaluated string" }
+[[nodiscard ("\x20")]] int h17 ();	// { dg-error "numeric escape sequence in unevaluated string" }
+[[nodiscard ("\o{17}")]] int h18 ();	// { dg-error "numeric escape sequence in unevaluated string" }
+[[nodiscard ("\x{20}")]] int h19 ();	// { dg-error "numeric escape sequence in unevaluated string" }
+[[nodiscard ("\h")]] int h20 ();	// { dg-error "unknown escape sequence" }
+
+float operator "" _my0 (const char *);
+float operator "" "" _my1 (const char *);
+float operator L"" _my2 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator u"" _my3 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator U"" _my4 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator u8"" _my5 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator L"" "" _my6 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator u"" "" _my7 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator U"" "" _my8 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator u8"" "" _my9 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator "" L"" _my10 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator "" u"" _my11 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator "" U"" _my12 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator "" u8"" _my13 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator "\0" _my14 (const char *);	// { dg-error "expected empty string after 'operator' keyword" }
+float operator "\x00" _my15 (const char *);	// { dg-error "expected empty string after 'operator' keyword" }
+float operator "\h" _my16 (const char *);	// { dg-error "expected empty string after 'operator' keyword" }
+						// { dg-error "unknown escape sequence" "" { target *-*-* } .-1 }
--- gcc/testsuite/g++.dg/cpp26/unevalstr2.C.jj	2023-08-23 13:10:17.120185018 +0200
+++ gcc/testsuite/g++.dg/cpp26/unevalstr2.C	2023-08-23 13:20:18.152371965 +0200
@@ -0,0 +1,110 @@
+// C++26 P2361R6 - Unevaluated strings
+// { dg-do compile { target { c++11 && c++23_down } } }
+// { dg-options "-pedantic" }
+
+static_assert (true, "foo");
+static_assert (true, "foo" " " "bar");
+static_assert (true, "\u01FC\U000001FC\u{1FC}\N{LATIN CAPITAL LETTER AE WITH ACUTE}\\\'\"\?\a\b\f\n\r\t\v");
+// { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } .-1 }
+// { dg-warning "named universal character escapes are only valid in" "" { target c++20_down } .-2 }
+static_assert (true, L"foo");
+static_assert (true, u"foo");
+static_assert (true, U"foo");
+static_assert (true, u8"foo");
+static_assert (true, L"fo" "o");
+static_assert (true, u"fo" "o");
+static_assert (true, U"fo" "o");
+static_assert (true, u8"fo" "o");
+static_assert (true, "fo" L"o");
+static_assert (true, "fo" u"o");
+static_assert (true, "fo" U"o");
+static_assert (true, "fo" u8"o");
+static_assert (true, "\0");
+static_assert (true, "\17");
+static_assert (true, "\x20");
+static_assert (true, "\o{17}");		// { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } }
+static_assert (true, "\x{20}");		// { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } }
+static_assert (true, "\h");		// { dg-warning "unknown escape sequence" }
+
+extern "C" "+" "+" int f0 ();
+extern "C" int f1 ();
+extern "C" { int f2 (); };
+extern L"C" int f3 ();			// { dg-error "a wide string is invalid in this context" }
+extern L"C" { int f4 (); }		// { dg-error "a wide string is invalid in this context" }
+extern u"C" int f5 ();			// { dg-error "a wide string is invalid in this context" }
+extern u"C" { int f6 (); }		// { dg-error "a wide string is invalid in this context" }
+extern U"C" int f7 ();			// { dg-error "a wide string is invalid in this context" }
+extern U"C" { int f8 (); }		// { dg-error "a wide string is invalid in this context" }
+extern u8"C" int f9 ();			// { dg-error "a wide string is invalid in this context" }
+extern u8"C" { int f10 (); }		// { dg-error "a wide string is invalid in this context" }
+extern "\x43" int f11 ();
+extern "\x{43}" { int f12 (); }		// { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } }
+extern "\103" int f13 ();
+extern "\o{0103}" { int f14 (); }	// { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } }
+
+[[deprecated ("foo")]] int g0 ();
+[[deprecated ("foo" " " "bar")]] int g1 ();
+[[deprecated ("\u01FC\U000001FC\u{1FC}\N{LATIN CAPITAL LETTER AE WITH ACUTE}\\\'\"\?\a\b\f\n\r\t\v")]] int g2 ();
+// { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } .-1 }
+// { dg-warning "named universal character escapes are only valid in" "" { target c++20_down } .-2 }
+[[deprecated (L"foo")]] int g3 ();
+[[deprecated (u"foo")]] int g4 ();
+[[deprecated (U"foo")]] int g5 ();
+[[deprecated (u8"foo")]] int g6 ();
+[[deprecated (L"fo" "o")]] int g7 ();
+[[deprecated (u"fo" "o")]] int g8 ();
+[[deprecated (U"fo" "o")]] int g9 ();
+[[deprecated (u8"fo" "o")]] int g10 ();
+[[deprecated ("fo" L"o")]] int g11 ();
+[[deprecated ("fo" u"o")]] int g12 ();
+[[deprecated ("fo" U"o")]] int g13 ();
+[[deprecated ("fo" u8"o")]] int g14 ();
+[[deprecated ("\0")]] int g15 ();
+[[deprecated ("\17")]] int g16 ();
+[[deprecated ("\x20")]] int g17 ();
+[[deprecated ("\o{17}")]] int g18 ();	// { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } }
+[[deprecated ("\x{20}")]] int g19 ();	// { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } }
+[[deprecated ("\h")]] int g20 ();	// { dg-warning "unknown escape sequence" }
+
+[[nodiscard ("foo")]] int h0 ();
+[[nodiscard ("foo" " " "bar")]] int h1 ();
+[[nodiscard ("\u01FC\U000001FC\u{1FC}\N{LATIN CAPITAL LETTER AE WITH ACUTE}\\\'\"\?\a\b\f\n\r\t\v")]] int h2 ();
+// { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } .-1 }
+// { dg-warning "named universal character escapes are only valid in" "" { target c++20_down } .-2 }
+[[nodiscard (L"foo")]] int h3 ();
+[[nodiscard (u"foo")]] int h4 ();
+[[nodiscard (U"foo")]] int h5 ();
+[[nodiscard (u8"foo")]] int h6 ();
+[[nodiscard (L"fo" "o")]] int h7 ();
+[[nodiscard (u"fo" "o")]] int h8 ();
+[[nodiscard (U"fo" "o")]] int h9 ();
+[[nodiscard (u8"fo" "o")]] int h10 ();
+[[nodiscard ("fo" L"o")]] int h11 ();
+[[nodiscard ("fo" u"o")]] int h12 ();
+[[nodiscard ("fo" U"o")]] int h13 ();
+[[nodiscard ("fo" u8"o")]] int h14 ();
+[[nodiscard ("\0")]] int h15 ();
+[[nodiscard ("\17")]] int h16 ();
+[[nodiscard ("\x20")]] int h17 ();
+[[nodiscard ("\o{17}")]] int h18 ();	// { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } }
+[[nodiscard ("\x{20}")]] int h19 ();	// { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } }
+[[nodiscard ("\h")]] int h20 ();	// { dg-warning "unknown escape sequence" }
+
+float operator "" _my0 (const char *);
+float operator "" "" _my1 (const char *);
+float operator L"" _my2 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator u"" _my3 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator U"" _my4 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator u8"" _my5 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator L"" "" _my6 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator u"" "" _my7 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator U"" "" _my8 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator u8"" "" _my9 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator "" L"" _my10 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator "" u"" _my11 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator "" U"" _my12 (const char *);	// { dg-error "invalid encoding prefix in literal operator" }
+float operator "" u8"" _my13 (const char *);	// { dg-error "invalid encoding prefix in literal operator" "" { target c++20 } }
+float operator "\0" _my14 (const char *);	// { dg-error "expected empty string after 'operator' keyword" }
+float operator "\x00" _my15 (const char *);	// { dg-error "expected empty string after 'operator' keyword" }
+float operator "\h" _my16 (const char *);	// { dg-error "expected empty string after 'operator' keyword" }
+						// { dg-warning "unknown escape sequence" "" { target *-*-* } .-1 }
--- gcc/testsuite/g++.dg/cpp0x/udlit-error1.C.jj	2023-01-26 22:03:00.657122433 +0100
+++ gcc/testsuite/g++.dg/cpp0x/udlit-error1.C	2023-08-24 15:46:18.149708095 +0200
@@ -13,7 +13,7 @@ void operator""_x(const char *, decltype
 extern "C"_x { void g(); } // { dg-error "before user-defined string literal" }
 static_assert(true, "foo"_x); // { dg-error "string literal with user-defined suffix is invalid in this context|expected" }
 
-[[deprecated("oof"_x)]]
+[[deprecated("oof"_x)]]	// { dg-error "string literal with user-defined suffix is invalid in this context" "" { target c++26 } }
 void
 lol () // { dg-error "not a string" }
 {
--- libcpp/include/cpplib.h.jj	2023-08-22 16:12:27.709260416 +0200
+++ libcpp/include/cpplib.h	2023-08-23 11:24:56.100650548 +0200
@@ -129,17 +129,18 @@ struct _cpp_file;
   TK(UTF8STRING,	LITERAL) /* u8"string" */			\
   TK(OBJC_STRING,	LITERAL) /* @"string" - Objective-C */		\
   TK(HEADER_NAME,	LITERAL) /* <stdio.h> in #include */		\
+  TK(UNEVAL_STRING,	LITERAL) /* unevaluated "string" - C++26 */	\
 									\
-  TK(CHAR_USERDEF,	LITERAL) /* 'char'_suffix - C++-0x */		\
-  TK(WCHAR_USERDEF,	LITERAL) /* L'char'_suffix - C++-0x */		\
-  TK(CHAR16_USERDEF,	LITERAL) /* u'char'_suffix - C++-0x */		\
-  TK(CHAR32_USERDEF,	LITERAL) /* U'char'_suffix - C++-0x */		\
-  TK(UTF8CHAR_USERDEF,	LITERAL) /* u8'char'_suffix - C++-0x */		\
-  TK(STRING_USERDEF,	LITERAL) /* "string"_suffix - C++-0x */		\
-  TK(WSTRING_USERDEF,	LITERAL) /* L"string"_suffix - C++-0x */	\
-  TK(STRING16_USERDEF,	LITERAL) /* u"string"_suffix - C++-0x */	\
-  TK(STRING32_USERDEF,	LITERAL) /* U"string"_suffix - C++-0x */	\
-  TK(UTF8STRING_USERDEF,LITERAL) /* u8"string"_suffix - C++-0x */	\
+  TK(CHAR_USERDEF,	LITERAL) /* 'char'_suffix - C++11 */		\
+  TK(WCHAR_USERDEF,	LITERAL) /* L'char'_suffix - C++11 */		\
+  TK(CHAR16_USERDEF,	LITERAL) /* u'char'_suffix - C++11 */		\
+  TK(CHAR32_USERDEF,	LITERAL) /* U'char'_suffix - C++11 */		\
+  TK(UTF8CHAR_USERDEF,	LITERAL) /* u8'char'_suffix - C++11 */		\
+  TK(STRING_USERDEF,	LITERAL) /* "string"_suffix - C++11 */		\
+  TK(WSTRING_USERDEF,	LITERAL) /* L"string"_suffix - C++11 */		\
+  TK(STRING16_USERDEF,	LITERAL) /* u"string"_suffix - C++11 */		\
+  TK(STRING32_USERDEF,	LITERAL) /* U"string"_suffix - C++11 */		\
+  TK(UTF8STRING_USERDEF,LITERAL) /* u8"string"_suffix - C++11 */	\
 									\
   TK(COMMENT,		LITERAL) /* Only if output comments.  */	\
 				 /* SPELL_LITERAL happens to DTRT.  */	\
--- libcpp/charset.cc.jj	2023-07-11 13:40:40.398430000 +0200
+++ libcpp/charset.cc	2023-08-23 12:56:48.926671275 +0200
@@ -2156,7 +2156,7 @@ static const uchar *
 convert_escape (cpp_reader *pfile, const uchar *from, const uchar *limit,
 		struct _cpp_strbuf *tbuf, struct cset_converter cvt,
 		cpp_string_location_reader *loc_reader,
-		cpp_substring_ranges *ranges)
+		cpp_substring_ranges *ranges, bool uneval)
 {
   /* Values of \a \b \e \f \n \r \t \v respectively.  */
 #if HOST_CHARSET == HOST_CHARSET_ASCII
@@ -2183,12 +2183,20 @@ convert_escape (cpp_reader *pfile, const
 			  char_range, loc_reader, ranges);
 
     case 'x':
+      if (uneval && CPP_PEDANTIC (pfile))
+	cpp_error (pfile, CPP_DL_PEDWARN,
+		   "numeric escape sequence in unevaluated string: "
+		   "'\\%c'", (int) c);
       return convert_hex (pfile, from, limit, tbuf, cvt,
 			  char_range, loc_reader, ranges);
 
     case '0':  case '1':  case '2':  case '3':
     case '4':  case '5':  case '6':  case '7':
     case 'o':
+      if (uneval && CPP_PEDANTIC (pfile))
+	cpp_error (pfile, CPP_DL_PEDWARN,
+		   "numeric escape sequence in unevaluated string: "
+		   "'\\%c'", (int) c);
       return convert_oct (pfile, from, limit, tbuf, cvt,
 			  char_range, loc_reader, ranges);
 
@@ -2296,7 +2304,7 @@ converter_for_type (cpp_reader *pfile, e
 
 static bool
 cpp_interpret_string_1 (cpp_reader *pfile, const cpp_string *from, size_t count,
-			cpp_string *to,  enum cpp_ttype type,
+			cpp_string *to, enum cpp_ttype type,
 			cpp_string_location_reader *loc_readers,
 			cpp_substring_ranges *out)
 {
@@ -2427,7 +2435,7 @@ cpp_interpret_string_1 (cpp_reader *pfil
 
 	  struct _cpp_strbuf *tbuf_ptr = to ? &tbuf : NULL;
 	  p = convert_escape (pfile, p + 1, limit, tbuf_ptr, cvt,
-			      loc_reader, out);
+			      loc_reader, out, type == CPP_UNEVAL_STRING);
 	}
     }
 
@@ -2465,7 +2473,7 @@ cpp_interpret_string_1 (cpp_reader *pfil
    false for failure.  */
 bool
 cpp_interpret_string (cpp_reader *pfile, const cpp_string *from, size_t count,
-		      cpp_string *to,  enum cpp_ttype type)
+		      cpp_string *to, enum cpp_ttype type)
 {
   return cpp_interpret_string_1 (pfile, from, count, to, type, NULL, NULL);
 }
@@ -2548,7 +2556,7 @@ cpp_interpret_string_ranges (cpp_reader
 bool
 cpp_interpret_string_notranslate (cpp_reader *pfile, const cpp_string *from,
 				  size_t count,	cpp_string *to,
-				  enum cpp_ttype type ATTRIBUTE_UNUSED)
+				  enum cpp_ttype type)
 {
   struct cset_converter save_narrow_cset_desc = pfile->narrow_cset_desc;
   bool retval;
@@ -2557,7 +2565,9 @@ cpp_interpret_string_notranslate (cpp_re
   pfile->narrow_cset_desc.cd = (iconv_t) -1;
   pfile->narrow_cset_desc.width = CPP_OPTION (pfile, char_precision);
 
-  retval = cpp_interpret_string (pfile, from, count, to, CPP_STRING);
+  retval = cpp_interpret_string (pfile, from, count, to,
+				 type == CPP_UNEVAL_STRING
+				 ? CPP_UNEVAL_STRING : CPP_STRING);
 
   pfile->narrow_cset_desc = save_narrow_cset_desc;
   return retval;

	Jakub



* Re: [PATCH] c++: Implement C++26 P2361R6 - Unevaluated strings [PR110342]
  2023-08-24 13:58 [PATCH] c++: Implement C++26 P2361R6 - Unevaluated strings [PR110342] Jakub Jelinek
@ 2023-10-20 20:12 ` Jason Merrill
  2023-10-20 21:59   ` Jakub Jelinek
  0 siblings, 1 reply; 4+ messages in thread
From: Jason Merrill @ 2023-10-20 20:12 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On 8/24/23 09:58, Jakub Jelinek wrote:
> Hi!
> 
> The following patch implements C++26 unevaluated-string.
> As it seems to me just extra pedanticity, it is implemented only for
> -std=c++26 or -std=gnu++26 and later and only if -pedantic/-pedantic-errors.

Hmm, I assumed it was accepted as a DR, but apparently not.  In addition 
to making things ill-formed, it clarifies that these strings are never 
converted to the execution character set.  Do we support 
cross-compilation to EBCDIC, which was the motivating context?

Jason



* Re: [PATCH] c++: Implement C++26 P2361R6 - Unevaluated strings [PR110342]
  2023-10-20 20:12 ` Jason Merrill
@ 2023-10-20 21:59   ` Jakub Jelinek
  2023-10-21  0:28     ` Jason Merrill
  0 siblings, 1 reply; 4+ messages in thread
From: Jakub Jelinek @ 2023-10-20 21:59 UTC (permalink / raw)
  To: Jason Merrill; +Cc: gcc-patches

On Fri, Oct 20, 2023 at 04:12:48PM -0400, Jason Merrill wrote:
> On 8/24/23 09:58, Jakub Jelinek wrote:
> > The following patch implements C++26 unevaluated-string.
> > As it seems to me just extra pedanticity, it is implemented only for
> > -std=c++26 or -std=gnu++26 and later and only if -pedantic/-pedantic-errors.
> 
> Hmm, I assumed it was accepted as a DR, but apparently not.  In addition to
> making things ill-formed, it clarifies that these strings are never
> converted to the execution character set.

I believe we implement it that way.  cp_parser_unevaluated_string_literal
(and several other spots as well) passes false as the translate argument.

>  Do we support cross-compilation
> to EBCDIC, which was the motivating context?

Do you mean EBCDIC just as execution charset, or translation charset as well?
There is some support for EBCDIC translation charset, but one needs iconv
with UTF-EBCDIC support, which glibc doesn't support, so I have no way to
actually verify it.
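
To illustrate why the missing translation matters (a minimal sketch under
stated assumptions, not GCC code): if a linkage-specification string were
converted to an EBCDIC execution character set before being matched, its
bytes would no longer equal the spelling the compiler looks for.  The names
to_ebcdic_subset, translate and is_known_linkage are hypothetical, and the
mapping covers only the two characters used here (code page 037 values).

```cpp
#include <cassert>
#include <string>

// Hypothetical, deliberately tiny ASCII -> EBCDIC (code page 037) mapping;
// only 'C' and '+' are mapped, everything else is passed through unchanged.
inline unsigned char to_ebcdic_subset(char c)
{
  switch (c)
    {
    case 'C': return 0xC3;
    case '+': return 0x4E;
    default:  return static_cast<unsigned char>(c);
    }
}

// Simulates translating a literal to an EBCDIC execution character set.
inline std::string translate(const std::string &s)
{
  std::string out;
  for (char c : s)
    out.push_back(static_cast<char>(to_ebcdic_subset(c)));
  return out;
}

// Stand-in for the compiler matching a linkage-specification string.
inline bool is_known_linkage(const std::string &s)
{
  return s == "C" || s == "C++";
}
```

With these definitions, is_known_linkage("C++") holds while
is_known_linkage(translate("C++")) does not, which is the situation that
parsing with translate set to false avoids.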

	Jakub



* Re: [PATCH] c++: Implement C++26 P2361R6 - Unevaluated strings [PR110342]
  2023-10-20 21:59   ` Jakub Jelinek
@ 2023-10-21  0:28     ` Jason Merrill
  0 siblings, 0 replies; 4+ messages in thread
From: Jason Merrill @ 2023-10-21  0:28 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On 10/20/23 17:59, Jakub Jelinek wrote:
> On Fri, Oct 20, 2023 at 04:12:48PM -0400, Jason Merrill wrote:
>> On 8/24/23 09:58, Jakub Jelinek wrote:
>>> The following patch implements C++26 unevaluated-string.
>>> As it seems to me just extra pedanticity, it is implemented only for
>>> -std=c++26 or -std=gnu++26 and later and only if -pedantic/-pedantic-errors.
>>
>> Hmm, I assumed it was accepted as a DR, but apparently not.  In addition to
>> making things ill-formed, it clarifies that these strings are never
>> converted to the execution character set.
> 
> I believe we implement it that way.  cp_parser_unevaluated_string_literal
> (but several other spots as well) pass false to translate argument.

Ah, right.  The patch is OK, thanks.

Jason

