From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <jakub@redhat.com>
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124])
	by sourceware.org (Postfix) with ESMTPS id 211423858418
	for <gcc-patches@gcc.gnu.org>; Sat,  3 Sep 2022 10:54:45 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 211423858418
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
	s=mimecast20190719; t=1662202484;
	h=from:from:reply-to:reply-to:subject:subject:date:date:
	 message-id:message-id:to:to:cc:mime-version:mime-version:
	 content-type:content-type:in-reply-to:in-reply-to:  references:references;
	bh=BgJ9BF2hfsfvWkRkFGDD3ehey8InmjP9QsqJqOqVyAk=;
	b=Othjqi1Wi3Bbwj9oMfI4GJZ/PUCWM10eq09xmJ72BHSrQGIL+U+RF6y8imYK1X6JajVqjm
	ar7hTA2a1TAXlFhUB6YHliM9k9wFMJhSzByoCfmkJj22UxORTNY9/Dfh0gzZruHFCKnCJS
	hh3HoIqmAHUSvH/BlQzXqjkR/QK56m0=
Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com
 [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 us-mta-508-P8kCs5GqOH6UypmI_0zDGw-1; Sat, 03 Sep 2022 06:54:40 -0400
X-MC-Unique: P8kCs5GqOH6UypmI_0zDGw-1
Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 995A8811E80;
	Sat,  3 Sep 2022 10:54:40 +0000 (UTC)
Received: from tucnak.zalov.cz (unknown [10.39.192.41])
	by smtp.corp.redhat.com (Postfix) with ESMTPS id 346491121314;
	Sat,  3 Sep 2022 10:54:40 +0000 (UTC)
Received: from tucnak.zalov.cz (localhost [127.0.0.1])
	by tucnak.zalov.cz (8.17.1/8.17.1) with ESMTPS id 283AsbFA654485
	(version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT);
	Sat, 3 Sep 2022 12:54:37 +0200
Received: (from jakub@localhost)
	by tucnak.zalov.cz (8.17.1/8.17.1/Submit) id 283AsVLC654482;
	Sat, 3 Sep 2022 12:54:31 +0200
Date: Sat, 3 Sep 2022 12:54:31 +0200
From: Jakub Jelinek <jakub@redhat.com>
To: Jason Merrill <jason@redhat.com>, Joseph Myers <joseph@codesourcery.com>,
        gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] libcpp, v3: Named universal character escapes and
 delimited escape sequence tweaks
Message-ID: <YxMyZy6glKRqkL86@tucnak>
Reply-To: Jakub Jelinek <jakub@redhat.com>
References: <alpine.DEB.2.22.394.2208302055240.446383@digraph.polyomino.org.uk>
 <Yw5+nPD8O+JTx3uL@tucnak>
 <Yw6DA3MhofyzWnje@tucnak>
 <Yw9xsBRmTqkLMlGC@tucnak>
 <5da578e7-9c43-99ea-15c1-aefc641a0654@redhat.com>
 <Yw95MR3YN1aT2ks6@tucnak>
 <df9730f4-d796-7bf6-dd18-d0c9c5a0cf12@redhat.com>
 <YxCULjMrhvN5f7xR@tucnak>
 <37250e6c-80f9-2b93-a381-c1c9b869c04d@redhat.com>
 <YxMsnC5ei4zydz+4@tucnak>
MIME-Version: 1.0
In-Reply-To: <YxMsnC5ei4zydz+4@tucnak>
X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
X-Spam-Status: No, score=-3.6 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

On Sat, Sep 03, 2022 at 12:29:52PM +0200, Jakub Jelinek wrote:
> On Thu, Sep 01, 2022 at 03:00:28PM -0400, Jason Merrill wrote:
> > We might as well use the same flag name, and document it to mean what it
> > currently means for GCC.
> 
> Ok, following patch introduces -Wunicode (on by default).
> 
> > It looks like this is handling \N{abc}, for which "incomplete" seems like
> > the wrong description; it's complete, just wrong, and the diagnostic doesn't
> > help correct it.
> 
> And also will emit the is not a valid universal character with did you mean
> if it matches loosely, otherwise will use the not terminated with } after
> ... wording.
> 
> Ok if it passes bootstrap/regtest?

Actually, it is simpler to treat the !strict case like the strict case,
except that outside of literals it always emits a warning instead of an error.

The following version does that.  On the testcases, the only difference is
in the
int f = a\N{abc});
cases, where it now emits different diagnostics.
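
To illustrate the rule for unknown character names, here is a minimal
standalone sketch.  It is not code from the patch: the enum and function
names are hypothetical, and "strict" is assumed to mean that the name between
the braces uses only characters valid in Unicode character names (going by
the tests below, lower case letters or _ make it non-strict).  It only covers
the case where the name lookup is actually reached.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only; not part of libcpp.  */
enum ucn_action { UCN_ERROR, UCN_WARN_SEPARATE_TOKENS };

/* Mirrors the condition the patch tests when \N{...} names no known
   character: in a possible identifier context, unless delimited escape
   sequences are enabled (C++23) and the name is strict, warn under
   -Wunicode and treat the characters as separate tokens; otherwise
   emit an error.  */
enum ucn_action
unknown_name_action (bool identifier_pos, bool delimited_escape_seqs,
		     bool strict)
{
  if (identifier_pos && (!delimited_escape_seqs || !strict))
    return UCN_WARN_SEPARATE_TOKENS;
  return UCN_ERROR;
}
```

With these assumptions, a\N{abc} outside of a literal maps to the warning
both before and in C++23, a\N{NON-EXISTENT CHAR} maps to the warning before
C++23 and to the error in C++23, and anything inside a literal maps to the
error.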

2022-09-03  Jakub Jelinek  <jakub@redhat.com>

libcpp/
	* include/cpplib.h (struct cpp_options): Add cpp_warn_unicode member.
	(enum cpp_warning_reason): Add CPP_W_UNICODE.
	* init.cc (cpp_create_reader): Initialize cpp_warn_unicode.
	* charset.cc (_cpp_valid_ucn): In possible identifier contexts, don't
	handle \u{ or \N{ specially in -std=c* modes except -std=c++2{3,b}.
	In possible identifier contexts, don't emit an error and punt
	if \N isn't followed by {, or if \N{} surrounds some lower case
	letters or _.  In possible identifier contexts when not C++23, emit
	a warning instead of an error about unknown character names and
	treat them as separate tokens.  When treating \u{ or \N{ as separate
	tokens, emit warnings.
gcc/
	* doc/invoke.texi (-Wno-unicode): Document.
gcc/c-family/
	* c.opt (Winvalid-utf8): Use ObjC instead of objC.  Remove
	" in comments" from description.
	(Wunicode): New option.
gcc/testsuite/
	* c-c++-common/cpp/delimited-escape-seq-4.c: New test.
	* c-c++-common/cpp/delimited-escape-seq-5.c: New test.
	* c-c++-common/cpp/delimited-escape-seq-6.c: New test.
	* c-c++-common/cpp/delimited-escape-seq-7.c: New test.
	* c-c++-common/cpp/named-universal-char-escape-5.c: New test.
	* c-c++-common/cpp/named-universal-char-escape-6.c: New test.
	* c-c++-common/cpp/named-universal-char-escape-7.c: New test.
	* g++.dg/cpp23/named-universal-char-escape1.C: New test.
	* g++.dg/cpp23/named-universal-char-escape2.C: New test.

--- libcpp/include/cpplib.h.jj	2022-09-03 09:35:41.465984642 +0200
+++ libcpp/include/cpplib.h	2022-09-03 11:30:57.250677870 +0200
@@ -565,6 +565,10 @@ struct cpp_options
      2 if it should be a pedwarn.  */
   unsigned char cpp_warn_invalid_utf8;
 
+  /* True if libcpp should warn about invalid forms of delimited or named
+     escape sequences.  */
+  bool cpp_warn_unicode;
+
   /* True if -finput-charset= option has been used explicitly.  */
   bool cpp_input_charset_explicit;
 
@@ -675,7 +679,8 @@ enum cpp_warning_reason {
   CPP_W_CXX20_COMPAT,
   CPP_W_EXPANSION_TO_DEFINED,
   CPP_W_BIDIRECTIONAL,
-  CPP_W_INVALID_UTF8
+  CPP_W_INVALID_UTF8,
+  CPP_W_UNICODE
 };
 
 /* Callback for header lookup for HEADER, which is the name of a
--- libcpp/init.cc.jj	2022-09-01 09:47:23.729892618 +0200
+++ libcpp/init.cc	2022-09-03 11:19:10.954452329 +0200
@@ -228,6 +228,7 @@ cpp_create_reader (enum c_lang lang, cpp
   CPP_OPTION (pfile, warn_date_time) = 0;
   CPP_OPTION (pfile, cpp_warn_bidirectional) = bidirectional_unpaired;
   CPP_OPTION (pfile, cpp_warn_invalid_utf8) = 0;
+  CPP_OPTION (pfile, cpp_warn_unicode) = 1;
   CPP_OPTION (pfile, cpp_input_charset_explicit) = 0;
 
   /* Default CPP arithmetic to something sensible for the host for the
--- libcpp/charset.cc.jj	2022-09-01 14:19:47.462235851 +0200
+++ libcpp/charset.cc	2022-09-03 12:42:41.800923600 +0200
@@ -1448,7 +1448,11 @@ _cpp_valid_ucn (cpp_reader *pfile, const
   if (str[-1] == 'u')
     {
       length = 4;
-      if (str < limit && *str == '{')
+      if (str < limit
+	  && *str == '{'
+	  && (!identifier_pos
+	      || CPP_OPTION (pfile, delimited_escape_seqs)
+	      || !CPP_OPTION (pfile, std)))
 	{
 	  str++;
 	  /* Magic value to indicate no digits seen.  */
@@ -1462,8 +1466,22 @@ _cpp_valid_ucn (cpp_reader *pfile, const
   else if (str[-1] == 'N')
     {
       length = 4;
+      if (identifier_pos
+	  && !CPP_OPTION (pfile, delimited_escape_seqs)
+	  && CPP_OPTION (pfile, std))
+	{
+	  *cp = 0;
+	  return false;
+	}
       if (str == limit || *str != '{')
-	cpp_error (pfile, CPP_DL_ERROR, "'\\N' not followed by '{'");
+	{
+	  if (identifier_pos)
+	    {
+	      *cp = 0;
+	      return false;
+	    }
+	  cpp_error (pfile, CPP_DL_ERROR, "'\\N' not followed by '{'");
+	}
       else
 	{
 	  str++;
@@ -1489,15 +1507,19 @@ _cpp_valid_ucn (cpp_reader *pfile, const
 
 	  if (str < limit && *str == '}')
 	    {
-	      if (name == str && identifier_pos)
+	      if (identifier_pos && name == str)
 		{
+		  cpp_warning (pfile, CPP_W_UNICODE,
+			       "empty named universal character escape "
+			       "sequence; treating it as separate tokens");
 		  *cp = 0;
 		  return false;
 		}
 	      if (name == str)
 		cpp_error (pfile, CPP_DL_ERROR,
 			   "empty named universal character escape sequence");
-	      else if (!CPP_OPTION (pfile, delimited_escape_seqs)
+	      else if ((!identifier_pos || strict)
+		       && !CPP_OPTION (pfile, delimited_escape_seqs)
 		       && CPP_OPTION (pfile, cpp_pedantic))
 		cpp_error (pfile, CPP_DL_PEDWARN,
 			   "named universal character escapes are only valid "
@@ -1515,27 +1537,51 @@ _cpp_valid_ucn (cpp_reader *pfile, const
 					   uname2c_tree, NULL);
 		  if (result == (cppchar_t) -1)
 		    {
-		      cpp_error (pfile, CPP_DL_ERROR,
-				 "\\N{%.*s} is not a valid universal "
-				 "character", (int) (str - name), name);
+		      bool ret = true;
+		      if (identifier_pos
+			  && (!CPP_OPTION (pfile, delimited_escape_seqs)
+			      || !strict))
+			ret = cpp_warning (pfile, CPP_W_UNICODE,
+					   "\\N{%.*s} is not a valid "
+					   "universal character; treating it "
+					   "as separate tokens",
+					   (int) (str - name), name);
+		      else
+			cpp_error (pfile, CPP_DL_ERROR,
+				   "\\N{%.*s} is not a valid universal "
+				   "character", (int) (str - name), name);
 
 		      /* Try to do a loose name lookup according to
 			 Unicode loose matching rule UAX44-LM2.  */
 		      char canon_name[uname2c_max_name_len + 1];
 		      result = _cpp_uname2c_uax44_lm2 ((const char *) name,
 						       str - name, canon_name);
-		      if (result != (cppchar_t) -1)
+		      if (result != (cppchar_t) -1 && ret)
 			cpp_error (pfile, CPP_DL_NOTE,
 				   "did you mean \\N{%s}?", canon_name);
 		      else
-			result = 0x40;
+			result = 0xC0;
+		      if (identifier_pos
+			  && (!CPP_OPTION (pfile, delimited_escape_seqs)
+			      || !strict))
+			{
+			  *cp = 0;
+			  return false;
+			}
 		    }
 		}
 	      str++;
 	      extend_char_range (char_range, loc_reader);
 	    }
 	  else if (identifier_pos)
-	    length = 1;
+	    {
+	      cpp_warning (pfile, CPP_W_UNICODE,
+			   "'\\N{' not terminated with '}' after %.*s; "
+			   "treating it as separate tokens",
+			   (int) (str - base), base);
+	      *cp = 0;
+	      return false;
+	    }
 	  else
 	    {
 	      cpp_error (pfile, CPP_DL_ERROR,
@@ -1584,12 +1630,17 @@ _cpp_valid_ucn (cpp_reader *pfile, const
       }
     while (--length);
 
-  if (delimited
-      && str < limit
-      && *str == '}'
-      && (length != 32 || !identifier_pos))
+  if (delimited && str < limit && *str == '}')
     {
-      if (length == 32)
+      if (length == 32 && identifier_pos)
+	{
+	  cpp_warning (pfile, CPP_W_UNICODE,
+		       "empty delimited escape sequence; "
+		       "treating it as separate tokens");
+	  *cp = 0;
+	  return false;
+	}
+      else if (length == 32)
 	cpp_error (pfile, CPP_DL_ERROR,
 		   "empty delimited escape sequence");
       else if (!CPP_OPTION (pfile, delimited_escape_seqs)
@@ -1607,6 +1658,11 @@ _cpp_valid_ucn (cpp_reader *pfile, const
      error message in that case.  */
   if (length && identifier_pos)
     {
+      if (delimited)
+	cpp_warning (pfile, CPP_W_UNICODE,
+		     "'\\u{' not terminated with '}' after %.*s; "
+		     "treating it as separate tokens",
+		     (int) (str - base), base);
       *cp = 0;
       return false;
     }
--- gcc/doc/invoke.texi.jj	2022-09-03 09:35:40.966991672 +0200
+++ gcc/doc/invoke.texi	2022-09-03 11:39:03.875914845 +0200
@@ -365,7 +365,7 @@ Objective-C and Objective-C++ Dialects}.
 -Winfinite-recursion @gol
 -Winit-self  -Winline  -Wno-int-conversion  -Wint-in-bool-context @gol
 -Wno-int-to-pointer-cast  -Wno-invalid-memory-model @gol
--Winvalid-pch  -Winvalid-utf8 -Wjump-misses-init  @gol
+-Winvalid-pch  -Winvalid-utf8  -Wno-unicode  -Wjump-misses-init  @gol
 -Wlarger-than=@var{byte-size}  -Wlogical-not-parentheses  -Wlogical-op  @gol
 -Wlong-long  -Wno-lto-type-mismatch -Wmain  -Wmaybe-uninitialized @gol
 -Wmemset-elt-size  -Wmemset-transposed-args @gol
@@ -9577,6 +9577,12 @@ Warn if an invalid UTF-8 character is fo
 This warning is on by default for C++23 if @option{-finput-charset=UTF-8}
 is used and turned into error with @option{-pedantic-errors}.
 
+@item -Wno-unicode
+@opindex Wunicode
+@opindex Wno-unicode
+Don't diagnose invalid forms of delimited or named escape sequences which are
+treated as separate tokens.  @option{-Wunicode} is enabled by default.
+
 @item -Wlong-long
 @opindex Wlong-long
 @opindex Wno-long-long
--- gcc/c-family/c.opt.jj	2022-09-03 09:35:40.206002393 +0200
+++ gcc/c-family/c.opt	2022-09-03 11:17:04.529201926 +0200
@@ -822,8 +822,8 @@ C ObjC C++ ObjC++ CPP(warn_invalid_pch)
 Warn about PCH files that are found but not used.
 
 Winvalid-utf8
-C objC C++ ObjC++ CPP(cpp_warn_invalid_utf8) CppReason(CPP_W_INVALID_UTF8) Var(warn_invalid_utf8) Init(0) Warning
-Warn about invalid UTF-8 characters in comments.
+C ObjC C++ ObjC++ CPP(cpp_warn_invalid_utf8) CppReason(CPP_W_INVALID_UTF8) Var(warn_invalid_utf8) Init(0) Warning
+Warn about invalid UTF-8 characters.
 
 Wjump-misses-init
 C ObjC Var(warn_jump_misses_init) Warning LangEnabledby(C ObjC,Wc++-compat)
@@ -1345,6 +1345,10 @@ Wundef
 C ObjC C++ ObjC++ CPP(warn_undef) CppReason(CPP_W_UNDEF) Var(cpp_warn_undef) Init(0) Warning
 Warn if an undefined macro is used in an #if directive.
 
+Wunicode
+C ObjC C++ ObjC++ CPP(cpp_warn_unicode) CppReason(CPP_W_UNICODE) Var(warn_unicode) Init(1) Warning
+Warn about invalid forms of delimited or named escape sequences.
+
 Wuninitialized
 C ObjC C++ ObjC++ LTO LangEnabledBy(C ObjC C++ ObjC++ LTO,Wall)
 ;
--- gcc/testsuite/c-c++-common/cpp/delimited-escape-seq-4.c.jj	2022-09-03 11:13:37.570068845 +0200
+++ gcc/testsuite/c-c++-common/cpp/delimited-escape-seq-4.c	2022-09-03 11:56:52.818054420 +0200
@@ -0,0 +1,13 @@
+/* P2290R3 - Delimited escape sequences */
+/* { dg-do compile } */
+/* { dg-require-effective-target wchar } */
+/* { dg-options "-std=gnu99 -Wno-c++-compat" { target c } } */
+/* { dg-options "-std=gnu++20" { target c++ } } */
+
+#define z(x) 0
+#define a z(
+int b = a\u{});		/* { dg-warning "empty delimited escape sequence; treating it as separate tokens" } */
+int c = a\u{);		/* { dg-warning "'\\\\u\\\{' not terminated with '\\\}' after \\\\u\\\{; treating it as separate tokens" } */
+int d = a\u{12XYZ});	/* { dg-warning "'\\\\u\\\{' not terminated with '\\\}' after \\\\u\\\{12; treating it as separate tokens" } */
+int e = a\u123);
+int f = a\U1234567);
--- gcc/testsuite/c-c++-common/cpp/delimited-escape-seq-5.c.jj	2022-09-03 11:13:37.570068845 +0200
+++ gcc/testsuite/c-c++-common/cpp/delimited-escape-seq-5.c	2022-09-03 12:01:35.618124647 +0200
@@ -0,0 +1,13 @@
+/* P2290R3 - Delimited escape sequences */
+/* { dg-do compile } */
+/* { dg-require-effective-target wchar } */
+/* { dg-options "-std=c17 -Wno-c++-compat" { target c } } */
+/* { dg-options "-std=c++23" { target c++ } } */
+
+#define z(x) 0
+#define a z(
+int b = a\u{});		/* { dg-warning "empty delimited escape sequence; treating it as separate tokens" "" { target c++23 } } */
+int c = a\u{);		/* { dg-warning "'\\\\u\\\{' not terminated with '\\\}' after \\\\u\\\{; treating it as separate tokens" "" { target c++23 } } */
+int d = a\u{12XYZ});	/* { dg-warning "'\\\\u\\\{' not terminated with '\\\}' after \\\\u\\\{12; treating it as separate tokens" "" { target c++23 } } */
+int e = a\u123);
+int f = a\U1234567);
--- gcc/testsuite/c-c++-common/cpp/delimited-escape-seq-6.c.jj	2022-09-03 11:59:36.573778876 +0200
+++ gcc/testsuite/c-c++-common/cpp/delimited-escape-seq-6.c	2022-09-03 11:59:55.808511591 +0200
@@ -0,0 +1,13 @@
+/* P2290R3 - Delimited escape sequences */
+/* { dg-do compile } */
+/* { dg-require-effective-target wchar } */
+/* { dg-options "-std=gnu99 -Wno-c++-compat -Wno-unicode" { target c } } */
+/* { dg-options "-std=gnu++20 -Wno-unicode" { target c++ } } */
+
+#define z(x) 0
+#define a z(
+int b = a\u{});		/* { dg-bogus "empty delimited escape sequence; treating it as separate tokens" } */
+int c = a\u{);		/* { dg-bogus "'\\\\u\\\{' not terminated with '\\\}' after \\\\u\\\{; treating it as separate tokens" } */
+int d = a\u{12XYZ});	/* { dg-bogus "'\\\\u\\\{' not terminated with '\\\}' after \\\\u\\\{12; treating it as separate tokens" } */
+int e = a\u123);
+int f = a\U1234567);
--- gcc/testsuite/c-c++-common/cpp/delimited-escape-seq-7.c.jj	2022-09-03 12:01:48.958939255 +0200
+++ gcc/testsuite/c-c++-common/cpp/delimited-escape-seq-7.c	2022-09-03 12:02:16.765552854 +0200
@@ -0,0 +1,13 @@
+/* P2290R3 - Delimited escape sequences */
+/* { dg-do compile } */
+/* { dg-require-effective-target wchar } */
+/* { dg-options "-std=c17 -Wno-c++-compat -Wno-unicode" { target c } } */
+/* { dg-options "-std=c++23 -Wno-unicode" { target c++ } } */
+
+#define z(x) 0
+#define a z(
+int b = a\u{});		/* { dg-bogus "empty delimited escape sequence; treating it as separate tokens" } */
+int c = a\u{);		/* { dg-bogus "'\\\\u\\\{' not terminated with '\\\}' after \\\\u\\\{; treating it as separate tokens" } */
+int d = a\u{12XYZ});	/* { dg-bogus "'\\\\u\\\{' not terminated with '\\\}' after \\\\u\\\{12; treating it as separate tokens" } */
+int e = a\u123);
+int f = a\U1234567);
--- gcc/testsuite/c-c++-common/cpp/named-universal-char-escape-5.c.jj	2022-09-03 11:13:37.570068845 +0200
+++ gcc/testsuite/c-c++-common/cpp/named-universal-char-escape-5.c	2022-09-03 12:45:18.968747909 +0200
@@ -0,0 +1,17 @@
+/* P2071R2 - Named universal character escapes */
+/* { dg-do compile } */
+/* { dg-require-effective-target wchar } */
+/* { dg-options "-std=gnu99 -Wno-c++-compat" { target c } } */
+/* { dg-options "-std=gnu++20" { target c++ } } */
+
+#define z(x) 0
+#define a z(
+int b = a\N{});				/* { dg-warning "empty named universal character escape sequence; treating it as separate tokens" } */
+int c = a\N{);				/* { dg-warning "'\\\\N\\\{' not terminated with '\\\}' after \\\\N\\\{; treating it as separate tokens" } */
+int d = a\N);
+int e = a\NARG);
+int f = a\N{abc});				/* { dg-warning "\\\\N\\\{abc\\\} is not a valid universal character; treating it as separate tokens" } */
+int g = a\N{ABC.123});				/* { dg-warning "'\\\\N\\\{' not terminated with '\\\}' after \\\\N\\\{ABC; treating it as separate tokens" } */
+int h = a\N{NON-EXISTENT CHAR});	/* { dg-warning "\\\\N\\\{NON-EXISTENT CHAR\\\} is not a valid universal character; treating it as separate tokens" } */
+int i = a\N{Latin_Small_Letter_A_With_Acute});	/* { dg-warning "\\\\N\\\{Latin_Small_Letter_A_With_Acute\\\} is not a valid universal character; treating it as separate tokens" } */
+					/* { dg-message "did you mean \\\\N\\\{LATIN SMALL LETTER A WITH ACUTE\\\}\\?" "" { target *-*-* } .-1 } */
--- gcc/testsuite/c-c++-common/cpp/named-universal-char-escape-6.c.jj	2022-09-03 11:13:37.570068845 +0200
+++ gcc/testsuite/c-c++-common/cpp/named-universal-char-escape-6.c	2022-09-03 11:44:34.558316155 +0200
@@ -0,0 +1,17 @@
+/* P2071R2 - Named universal character escapes */
+/* { dg-do compile } */
+/* { dg-require-effective-target wchar } */
+/* { dg-options "-std=c17 -Wno-c++-compat" { target c } } */
+/* { dg-options "-std=c++20" { target c++ } } */
+
+#define z(x) 0
+#define a z(
+int b = a\N{});
+int c = a\N{);
+int d = a\N);
+int e = a\NARG);
+int f = a\N{abc});
+int g = a\N{ABC.123});
+int h = a\N{NON-EXISTENT CHAR});	/* { dg-bogus "is not a valid universal character" } */
+int i = a\N{Latin_Small_Letter_A_With_Acute});
+int j = a\N{LATIN SMALL LETTER A WITH ACUTE});
--- gcc/testsuite/c-c++-common/cpp/named-universal-char-escape-7.c.jj	2022-09-03 12:18:31.296022384 +0200
+++ gcc/testsuite/c-c++-common/cpp/named-universal-char-escape-7.c	2022-09-03 12:45:57.663212248 +0200
@@ -0,0 +1,17 @@
+/* P2071R2 - Named universal character escapes */
+/* { dg-do compile } */
+/* { dg-require-effective-target wchar } */
+/* { dg-options "-std=gnu99 -Wno-c++-compat -Wno-unicode" { target c } } */
+/* { dg-options "-std=gnu++20 -Wno-unicode" { target c++ } } */
+
+#define z(x) 0
+#define a z(
+int b = a\N{});				/* { dg-bogus "empty named universal character escape sequence; treating it as separate tokens" } */
+int c = a\N{);				/* { dg-bogus "'\\\\N\\\{' not terminated with '\\\}' after \\\\N\\\{; treating it as separate tokens" } */
+int d = a\N);
+int e = a\NARG);
+int f = a\N{abc});				/* { dg-bogus "\\\\N\\\{abc\\\} is not a valid universal character; treating it as separate tokens" } */
+int g = a\N{ABC.123});				/* { dg-bogus "'\\\\N\\\{' not terminated with '\\\}' after \\\\N\\\{ABC; treating it as separate tokens" } */
+int h = a\N{NON-EXISTENT CHAR});	/* { dg-bogus "\\\\N\\\{NON-EXISTENT CHAR\\\} is not a valid universal character; treating it as separate tokens" } */
+int i = a\N{Latin_Small_Letter_A_With_Acute});	/* { dg-bogus "\\\\N\\\{Latin_Small_Letter_A_With_Acute\\\} is not a valid universal character; treating it as separate tokens" } */
+					/* { dg-bogus "did you mean \\\\N\\\{LATIN SMALL LETTER A WITH ACUTE\\\}\\?" "" { target *-*-* } .-1 } */
--- gcc/testsuite/g++.dg/cpp23/named-universal-char-escape1.C.jj	2022-09-03 11:13:37.571068831 +0200
+++ gcc/testsuite/g++.dg/cpp23/named-universal-char-escape1.C	2022-09-03 12:44:03.893787182 +0200
@@ -0,0 +1,16 @@
+// P2071R2 - Named universal character escapes
+// { dg-do compile }
+// { dg-require-effective-target wchar }
+
+#define z(x) 0
+#define a z(
+int b = a\N{});				// { dg-warning "empty named universal character escape sequence; treating it as separate tokens" "" { target c++23 } }
+int c = a\N{);				// { dg-warning "'\\\\N\\\{' not terminated with '\\\}' after \\\\N\\\{; treating it as separate tokens" "" { target c++23 } }
+int d = a\N);
+int e = a\NARG);
+int f = a\N{abc});			// { dg-warning "\\\\N\\\{abc\\\} is not a valid universal character; treating it as separate tokens" "" { target c++23 } }
+int g = a\N{ABC.123});			// { dg-warning "'\\\\N\\\{' not terminated with '\\\}' after \\\\N\\\{ABC; treating it as separate tokens" "" { target c++23 } }
+int h = a\N{NON-EXISTENT CHAR});	// { dg-error "is not a valid universal character" "" { target c++23 } }
+					// { dg-error "was not declared in this scope" "" { target c++23 } .-1 }
+int i = a\N{Latin_Small_Letter_A_With_Acute});	// { dg-warning "\\\\N\\\{Latin_Small_Letter_A_With_Acute\\\} is not a valid universal character; treating it as separate tokens" "" { target c++23 } }
+					// { dg-message "did you mean \\\\N\\\{LATIN SMALL LETTER A WITH ACUTE\\\}\\?" "" { target c++23 } .-1 }
--- gcc/testsuite/g++.dg/cpp23/named-universal-char-escape2.C.jj	2022-09-03 11:13:37.571068831 +0200
+++ gcc/testsuite/g++.dg/cpp23/named-universal-char-escape2.C	2022-09-03 12:44:31.723401937 +0200
@@ -0,0 +1,18 @@
+// P2071R2 - Named universal character escapes
+// { dg-do compile }
+// { dg-require-effective-target wchar }
+// { dg-options "" }
+
+#define z(x) 0
+#define a z(
+int b = a\N{});				// { dg-warning "empty named universal character escape sequence; treating it as separate tokens" }
+int c = a\N{);				// { dg-warning "'\\\\N\\\{' not terminated with '\\\}' after \\\\N\\\{; treating it as separate tokens" }
+int d = a\N);
+int e = a\NARG);
+int f = a\N{abc});			// { dg-warning "\\\\N\\\{abc\\\} is not a valid universal character; treating it as separate tokens" }
+int g = a\N{ABC.123});			// { dg-warning "'\\\\N\\\{' not terminated with '\\\}' after \\\\N\\\{ABC; treating it as separate tokens" }
+int h = a\N{NON-EXISTENT CHAR});	// { dg-error "is not a valid universal character" "" { target c++23 } }
+					// { dg-error "was not declared in this scope" "" { target c++23 } .-1 }
+					// { dg-warning "\\\\N\\\{NON-EXISTENT CHAR\\\} is not a valid universal character; treating it as separate tokens" "" { target c++20_down } .-2 }
+int i = a\N{Latin_Small_Letter_A_With_Acute});	// { dg-warning "\\\\N\\\{Latin_Small_Letter_A_With_Acute\\\} is not a valid universal character; treating it as separate tokens" }
+					// { dg-message "did you mean \\\\N\\\{LATIN SMALL LETTER A WITH ACUTE\\\}\\?" "" { target *-*-* } .-1 }


	Jakub