From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-508931-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 115996 invoked by alias); 12 Sep 2019 00:33:34 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Date: Thu, 12 Sep 2019 00:33:00 -0000
From: Joseph Myers <joseph@codesourcery.com>
To: Lewis Hyatt <lhyatt@gmail.com>
CC: <gcc-patches@gcc.gnu.org>
Subject: Re: Patch to support extended characters in C/C++ identifiers
In-Reply-To: <CAA_5UQ4+vtckDjUoLtURHghU3X7-6fWBcc4EYyCv7940xyZTLQ@mail.gmail.com>
Message-ID: <alpine.DEB.2.21.1909120031370.28563@digraph.polyomino.org.uk>
References: <20190812220121.GA9251@ldh.local> <alpine.DEB.2.21.1909102334390.25537@digraph.polyomino.org.uk> <CAA_5UQ4+vtckDjUoLtURHghU3X7-6fWBcc4EYyCv7940xyZTLQ@mail.gmail.com>
User-Agent: Alpine 2.21 (DEB 202 2017-01-01)
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Return-Path: joseph@codesourcery.com
X-SW-Source: 2019-09/txt/msg00822.txt.bz2

On Wed, 11 Sep 2019, Lewis Hyatt wrote:

> things that may be a little surprising. For instance, you can take a
> UTF-8 encoded file and insert a backslash line continuation in the
> middle of a multibyte sequence, and gcc will happily paste it back
> together and then interpret the resulting UTF-8. I think it's
> technically OK standardwise since the conversion from extended
> characters to the source character set is implementation-defined, but
> it's hardly a straightforward definition. It is sort of consistent
> with the treatment of undefined behavior with UCN escapes though,
> which gcc already permits to be pasted together over a line
> continuation. Anyway, should this behavior be documented as well? I

I don't think that peculiarity should be documented.  (Whereas accepting 
arbitrary bytes inside comments and strings by default is arguably 
actually a feature.)
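
For anyone following along, a rough sketch of why the pasting described 
above happens: line splicing works on bytes and comes before any UTF-8 
interpretation.  This is only an illustration, not GCC's code; the names 
splice_lines and is_utf8_two_byte are made up for the example.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from GCC: copy n bytes from in to out,
   dropping every backslash-newline pair (translation phase 2).
   out must have room for n bytes.  Returns the number of bytes written. */
size_t splice_lines(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t w = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i] == '\\' && i + 1 < n && in[i + 1] == '\n') {
            i++;        /* skip the newline as well as the backslash */
            continue;
        }
        out[w++] = in[i];
    }
    return w;
}

/* Hypothetical check: return 1 if p[0], p[1] form a well-formed
   two-byte UTF-8 sequence (lead byte 0xC2..0xDF, one continuation). */
int is_utf8_two_byte(const unsigned char *p)
{
    return p[0] >= 0xC2 && p[0] <= 0xDF && (p[1] & 0xC0) == 0x80;
}
```

Given the four bytes 0xC3, backslash, newline, 0xA9, splice_lines leaves 
0xC3 0xA9, which is the UTF-8 encoding of U+00E9.  The splicing step never 
looks at whether it is in the middle of a multibyte sequence.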

> > gcc/testsuite/g++.dg/cpp/ucnid-2-utf8.C and
> > gcc/testsuite/g++.dg/cpp/ucnid-3-utf8.C are testing double stringizing in
> > C++, where strictly the results they expect show that GCC does not conform
> > to the C++ standard requirement to convert all extended characters to UCNs
> > (because C++ does not have the special C rule making it
> > implementation-defined whether the \ of a UCN in a string literal is
> > doubled when stringizing).
> 
> Thanks, I didn't mean to ignore this point when you made it on the PR
> comments, I just wasn't sure what was the best way to handle it. Do
> you find it preferable to just add a comment, or should I rather
> change the test to look for the standard-conforming output, and make
> it an XFAIL?

My inclination would be a comment, with reference to a bug filed for this 
issue in Bugzilla.
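
To make the C rule mentioned above concrete, here is a small sketch.  It 
is not one of the tests from the patch; it uses a UCN inside a string 
literal rather than a UTF-8 identifier, and the names STR, stringized and 
ucn_backslash_was_doubled are made up for the example.  In C it is 
implementation-defined whether the result reports the backslash as doubled.

```c
#include <string.h>

#define STR(x) #x

/* Stringize a string literal that contains a UCN.  The quotes must be
   escaped; whether a backslash is also inserted before the one that
   begins \u00c1 is implementation-defined in C. */
static const char stringized[] = STR("\u00c1");

/* Return 1 if the backslash was doubled, so that the result still
   spells out the six characters \u00c1, and 0 if the result contains
   the character U+00C1 itself. */
int ucn_backslash_was_doubled(void)
{
    return strstr(stringized, "\\u00c1") != NULL;
}
```

In C either answer is conforming.  C++ has no such latitude, which is why 
the expectations in the two tests deserve a comment.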

> Finally, one general question, when I submit these last changes, is it
> better to send them as a new patch relative to what I already sent, or
> is it better to send the whole thing updated from scratch? Thanks
> again.

A complete patch that can be applied to trunk is best.

-- 
Joseph S. Myers
joseph@codesourcery.com