From mboxrd@z Thu Jan  1 00:00:00 1970
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Message-ID: <5594582F.2030604@oracle.com>
Date: Wed, 01 Jul 2015 21:14:00 -0000
From: Paolo Carlini
To: Andreas Schwab
CC: "gcc-patches@gcc.gnu.org", Jason Merrill, Tom Tromey, tom@tromey.com
Subject: Re: [C++/preprocessor Patch] PR c++/53690
References: <55944DE8.9030800@oracle.com> <87vbe3y5cp.fsf@igel.home> <55945394.5050605@oracle.com>
In-Reply-To: <55945394.5050605@oracle.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------050809030701010901000507"

This is a multi-part message in MIME format.
--------------050809030701010901000507
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit

Hi again,

On 07/01/2015 10:54 PM, Paolo Carlini wrote:
> Hi,
>
> On 07/01/2015 10:45 PM, Andreas Schwab wrote:
>> Paolo Carlini writes:
>>
>>> @@ -972,21 +972,20 @@ ucn_valid_in_identifier (cpp_reader *pfile, cppcha
>>>      or 0060 (`), nor one in the range D800 through DFFF inclusive.
>>>        *PSTR must be preceded by "\u" or "\U"; it is assumed that the
>>> -   buffer end is delimited by a non-hex digit.  Returns zero if the
>>> -   UCN has not been consumed.
>>> +   buffer end is delimited by a non-hex digit.  Returns one if the
>>> +   UCN has not been consumed, zero otherwise.
>> The name of the function would make more sense if it returned a boolean
>> with true meaning success.
> To be clear (the diff can be a little confusing) we are talking about
> _cpp_valid_ucn, the name of the function I'm patching. That said, I'm
> of course open to proposals about a better name for the patched
> function. Likewise for the values returned, which I picked for
> consistency with all the other functions in libcpp, the usual "unix"
> convention.
I stand corrected: in fact we are already using a mix of bool and int
return types in those functions. Thus I'm also testing the below
version, which simply changes the return type to bool with true meaning
success.

Thanks!
Paolo.
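
For readers following the thread, the difference between the two return conventions can be sketched roughly as below. This is only an illustration under stated assumptions, not libcpp code: codepoint_t, hex_value, parse_ucn_old and parse_ucn_new are hypothetical names, and the parsing is reduced to a fixed run of hex digits. The point is that when 0 doubles as the "not consumed" value, a genuine zero code point has to be reported as something else, which seems to be what the ChangeLog entry about \u0000 and \U00000000 refers to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for cppchar_t; the real type lives in libcpp.  */
typedef uint32_t codepoint_t;

/* Hypothetical helper: value of hex digit C, or -1 if C is not one.  */
static int
hex_value (char c)
{
  if (c >= '0' && c <= '9')
    return c - '0';
  if (c >= 'a' && c <= 'f')
    return c - 'a' + 10;
  if (c >= 'A' && c <= 'F')
    return c - 'A' + 10;
  return -1;
}

/* Value-returning convention: 0 means "not consumed", so a real zero
   cannot be returned and is remapped to 1 (mirroring the removed
   "if (result == 0) result = 1;" lines).  */
codepoint_t
parse_ucn_old (const char *digits, size_t ndigits)
{
  codepoint_t result = 0;
  for (size_t i = 0; i < ndigits; i++)
    {
      int v = hex_value (digits[i]);
      if (v < 0)
        return 0;               /* Failure sentinel.  */
      result = (result << 4) | (codepoint_t) v;
    }
  return result == 0 ? 1 : result;
}

/* bool convention: success travels in the return value and the code
   point in *CP, so zero becomes representable.  */
bool
parse_ucn_new (const char *digits, size_t ndigits, codepoint_t *cp)
{
  codepoint_t result = 0;
  for (size_t i = 0; i < ndigits; i++)
    {
      int v = hex_value (digits[i]);
      if (v < 0)
        {
          *cp = 0;
          return false;
        }
      result = (result << 4) | (codepoint_t) v;
    }
  *cp = result;
  return true;
}
```

With these definitions, parse_ucn_old ("0000", 4) yields 1, whereas parse_ucn_new ("0000", 4, &cp) returns true and stores 0 in cp. A caller that only wants to know whether the sequence was consumed can ignore cp.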
///////////////////

--------------050809030701010901000507
Content-Type: text/plain; charset=UTF-8; name="CL_53690_2"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="CL_53690_2"

/libcpp
2015-07-01  Paolo Carlini

    PR c++/53690
    * charset.c (_cpp_valid_ucn): Add cppchar_t * parameter and change
    return type to bool.  Fix encoding of \u0000 and \U00000000 in C++.
    (convert_ucn): Adjust call.
    * lex.c (forms_identifier_p): Likewise.
    * internal.h (_cpp_valid_ucn): Adjust declaration.

/gcc/testsuite
2015-07-01  Paolo Carlini

    PR c++/53690
    * g++.dg/cpp/pr53690.C: New.

--------------050809030701010901000507
Content-Type: text/plain; charset=UTF-8; name="patch_53690_2"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="patch_53690_2"

Index: gcc/testsuite/g++.dg/cpp/pr53690.C
===================================================================
--- gcc/testsuite/g++.dg/cpp/pr53690.C  (revision 0)
+++ gcc/testsuite/g++.dg/cpp/pr53690.C  (working copy)
@@ -0,0 +1,7 @@
+// PR c++/53690
+// { dg-do compile { target c++11 } }
+
+int array1[U'\U00000000' == 0 ? 1 : -1];
+int array2[U'\u0000' == 0 ? 1 : -1];
+int array3[u'\U00000000' == 0 ? 1 : -1];
+int array4[u'\u0000' == 0 ? 1 : -1];
Index: libcpp/charset.c
===================================================================
--- libcpp/charset.c  (revision 225275)
+++ libcpp/charset.c  (working copy)
@@ -972,21 +972,20 @@ ucn_valid_in_identifier (cpp_reader *pfile, cppcha
    or 0060 (`), nor one in the range D800 through DFFF inclusive.
 
    *PSTR must be preceded by "\u" or "\U"; it is assumed that the
-   buffer end is delimited by a non-hex digit.  Returns zero if the
-   UCN has not been consumed.
+   buffer end is delimited by a non-hex digit.  Returns false if the
+   UCN has not been consumed, true otherwise.
 
-   Otherwise the nonzero value of the UCN, whether valid or invalid,
-   is returned.  Diagnostics are emitted for invalid values.  PSTR
-   is updated to point one beyond the UCN, or to the syntactically
-   invalid character.
+   The value of the UCN, whether valid or invalid, is returned in *CP.
+   Diagnostics are emitted for invalid values.  PSTR is updated to point
+   one beyond the UCN, or to the syntactically invalid character.
 
    IDENTIFIER_POS is 0 when not in an identifier, 1 for the start of
    an identifier, or 2 otherwise.  */
 
-cppchar_t
+bool
 _cpp_valid_ucn (cpp_reader *pfile, const uchar **pstr,
                 const uchar *limit, int identifier_pos,
-                struct normalize_state *nst)
+                struct normalize_state *nst, cppchar_t *cp)
 {
   cppchar_t result, c;
   unsigned int length;
@@ -1030,8 +1029,11 @@ _cpp_valid_ucn (cpp_reader *pfile, const uchar **p
      multiple tokens in identifiers, so we can't give a helpful
      error message in that case.  */
   if (length && identifier_pos)
-    return 0;
-
+    {
+      *cp = 0;
+      return false;
+    }
+
   *pstr = str;
   if (length)
     {
@@ -1079,10 +1081,8 @@ _cpp_valid_ucn (cpp_reader *pfile, const uchar **p
                    (int) (str - base), base);
     }
 
-  if (result == 0)
-    result = 1;
-
-  return result;
+  *cp = result;
+  return true;
 }
 
 /* Convert an UCN, pointed to by FROM, to UTF-8 encoding, then translate
@@ -1100,7 +1100,7 @@ convert_ucn (cpp_reader *pfile, const uchar *from,
   struct normalize_state nst = INITIAL_NORMALIZE_STATE;
 
   from++;  /* Skip u/U.  */
-  ucn = _cpp_valid_ucn (pfile, &from, limit, 0, &nst);
+  _cpp_valid_ucn (pfile, &from, limit, 0, &nst, &ucn);
 
   rval = one_cppchar_to_utf8 (ucn, &bufp, &bytesleft);
   if (rval)
Index: libcpp/internal.h
===================================================================
--- libcpp/internal.h  (revision 225275)
+++ libcpp/internal.h  (working copy)
@@ -744,9 +744,10 @@ struct normalize_state
 #define NORMALIZE_STATE_UPDATE_IDNUM(st, c) \
   ((st)->previous = (c), (st)->prev_class = 0)
 
-extern cppchar_t _cpp_valid_ucn (cpp_reader *, const unsigned char **,
-                                 const unsigned char *, int,
-                                 struct normalize_state *state);
+extern bool _cpp_valid_ucn (cpp_reader *, const unsigned char **,
+                            const unsigned char *, int,
+                            struct normalize_state *state,
+                            cppchar_t *);
 extern void _cpp_destroy_iconv (cpp_reader *);
 extern unsigned char *_cpp_convert_input (cpp_reader *, const char *,
                                           unsigned char *, size_t, size_t,
Index: libcpp/lex.c
===================================================================
--- libcpp/lex.c  (revision 225275)
+++ libcpp/lex.c  (working copy)
@@ -1244,9 +1244,10 @@ forms_identifier_p (cpp_reader *pfile, int first,
            && *buffer->cur == '\\'
            && (buffer->cur[1] == 'u' || buffer->cur[1] == 'U'))
     {
+      cppchar_t s;
       buffer->cur += 2;
       if (_cpp_valid_ucn (pfile, &buffer->cur, buffer->rlimit, 1 + !first,
-                          state))
+                          state, &s))
         return true;
       buffer->cur -= 2;
     }

--------------050809030701010901000507--
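
The new testcase relies on a common compile-time check: an array whose size is 1 when the condition holds and -1 when it does not, so a wrong value makes compilation fail. A minimal sketch of the same idiom in C follows. The macro COMPILE_TIME_CHECK and the function checks_compiled are hypothetical names introduced for illustration, and the condition uses '\0' rather than the UCN forms from the patch so that the sketch stays valid C.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical macro: declares a char array type of size 1 when COND
   is true; a false COND gives size -1, which the compiler rejects.  */
#define COMPILE_TIME_CHECK(name, cond) \
  typedef char name[(cond) ? 1 : -1]

/* Both conditions are integer constant expressions, as the idiom
   requires at file scope.  */
COMPILE_TIME_CHECK (nul_char_is_zero, '\0' == 0);
COMPILE_TIME_CHECK (char_has_at_least_8_bits, CHAR_BIT >= 8);

/* Returns the size of one check type; reaching this point at run time
   already implies that every check above compiled.  */
int
checks_compiled (void)
{
  return (int) sizeof (nul_char_is_zero);
}
```

Changing either condition to something false, for example '\0' == 1, should turn the corresponding line into a compile error, which is how the dg-do compile test detects a mis-encoded \u0000.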