From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <jakub@redhat.com>
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124])
	by sourceware.org (Postfix) with ESMTPS id B739B3851407
	for <gcc-patches@gcc.gnu.org>; Tue, 30 Aug 2022 21:37:15 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org B739B3851407
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
	s=mimecast20190719; t=1661895435;
	h=from:from:reply-to:reply-to:subject:subject:date:date:
	 message-id:message-id:to:to:cc:cc:mime-version:mime-version:
	 content-type:content-type:in-reply-to:in-reply-to:  references:references;
	bh=dFrUOrXQpEM9dXXVtc7RTARGmWQilL8hKwkIgyhC0Fg=;
	b=KCB1gGzM7aN32u5Wi8Ouu8PK08leyLEpPQKTBnj6jbEzvfzqfXet2jyezhGawz5sPyPpVg
	CEtp9whGv1cBojqtB8AVKB1ageK/Ay8+ZK0R3Ajn4HR+DcMEoAMaZxuhnpiwRgL3zY7fO5
	RcJKjDu9ArsgdJdT0h/T1n6xop8/wxU=
Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com
 [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 us-mta-659-D07pzEh3PY-IkuHmdACypg-1; Tue, 30 Aug 2022 17:37:12 -0400
X-MC-Unique: D07pzEh3PY-IkuHmdACypg-1
Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 9E3E038041C3;
	Tue, 30 Aug 2022 21:37:11 +0000 (UTC)
Received: from tucnak.zalov.cz (unknown [10.39.192.41])
	by smtp.corp.redhat.com (Postfix) with ESMTPS id 5937E2166B26;
	Tue, 30 Aug 2022 21:37:11 +0000 (UTC)
Received: from tucnak.zalov.cz (localhost [127.0.0.1])
	by tucnak.zalov.cz (8.17.1/8.17.1) with ESMTPS id 27ULb8ts2418089
	(version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT);
	Tue, 30 Aug 2022 23:37:09 +0200
Received: (from jakub@localhost)
	by tucnak.zalov.cz (8.17.1/8.17.1/Submit) id 27ULb8Ha2418086;
	Tue, 30 Aug 2022 23:37:08 +0200
Date: Tue, 30 Aug 2022 23:37:07 +0200
From: Jakub Jelinek <jakub@redhat.com>
To: Joseph Myers <joseph@codesourcery.com>, Jason Merrill <jason@redhat.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] c++, v2: Implement C++23 P2071R2 - Named universal
 character escapes [PR106648]
Message-ID: <Yw6DA3MhofyzWnje@tucnak>
Reply-To: Jakub Jelinek <jakub@redhat.com>
References: <YwJ22kdlxJ70JcPJ@tucnak>
 <4fcd7e74-6f1c-dbec-a42c-e4e3fd13470b@redhat.com>
 <Ywc3pI1lnzq/FvOu@tucnak>
 <alpine.DEB.2.22.394.2208302055240.446383@digraph.polyomino.org.uk>
 <Yw5+nPD8O+JTx3uL@tucnak>
MIME-Version: 1.0
In-Reply-To: <Yw5+nPD8O+JTx3uL@tucnak>
X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
X-Spam-Status: No, score=-4.5 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

On Tue, Aug 30, 2022 at 11:18:20PM +0200, Jakub Jelinek via Gcc-patches wrote:
> On Tue, Aug 30, 2022 at 09:10:37PM +0000, Joseph Myers wrote:
> > I'm seeing build failures of glibc for powerpc64, as illustrated by the 
> > following C code:
> > 
> > #if 0
> > \NARG
> > #endif
> > 
> > (the actual sysdeps/powerpc/powerpc64/sysdep.h code is inside #ifdef 
> > __ASSEMBLER__).
> > 
> > This shows some problems with this feature - and with delimited escape 
> > sequences - as it affects C.  It's fine to accept it as an extension 
> > inside string and character literals, because \N or \u{...} would be 
> > invalid in the absence of the feature (i.e. the syntax for such literals 
> > fails to match, meaning that the rule about undefined behavior for a 
> > single ' or " as a pp-token applies).  But outside string and character 
> > literals, the usual lexing rules apply, the \ is a pp-token on its own and 
> > the code is valid at the preprocessing level, and with expansion of macros 
> > appearing before or after the \ (e.g. u defined as a macro in the \u{...} 
> > case) it may be valid code at the language level as well.  I don't know 
> > what older C++ versions say about this, but for C this means e.g.
> > 
> > #define z(x) 0
> > #define a z(
> > int x = a\NARG);
> > 
> > needs to be accepted as expanding to "int x = 0;", not interpreted as 
> > using the \N feature in an identifier and produce an error.
> 
> Thanks, will look at it tomorrow.

If
#define z(x) 0
#define a z(
int x = a\NARG);
is valid in C and C++ <= 20 then
#define z(x) 0
#define a z(
int x = a\N{LATIN SMALL LETTER A WITH ACUTE});
is valid too, and must likewise preprocess to int x = 0;.
That would likely mean we want to handle \N{...} in identifiers only in
C++23 mode, and to treat it as an extension only inside string and
character literals.
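
A minimal sketch of that rule in C, purely for illustration: the enum and
function names below are made up and are not libcpp's actual interfaces,
and the real change might be structured quite differently.

```c
#include <stdbool.h>

/* Hypothetical language modes; not libcpp's real enum.  */
enum lang_mode { MODE_C, MODE_CXX20_OR_OLDER, MODE_CXX23 };

/* Hypothetical description of where the lexer saw the backslash.  */
enum lex_context { CTX_IDENTIFIER, CTX_LITERAL };

/* Return true if \N{...} should be lexed as a named universal
   character escape.  In C++23 it is part of the language, so it
   applies everywhere.  In other modes it is only an extension inside
   string and character literals; outside them the backslash stays a
   separate pp-token, so existing valid code keeps preprocessing the
   way it did before.  */
static bool
named_escape_applies (enum lang_mode mode, enum lex_context ctx)
{
  if (mode == MODE_CXX23)
    return true;
  return ctx == CTX_LITERAL;
}
```

Under this assumption the int x = a\N{...}); example above would still
expand to int x = 0; in C and in C++ <= 20.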

Jason, your thoughts on that?

	Jakub