From: Jason Merrill
To: Jakub Jelinek, Joseph Myers, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] libcpp, v3: Named universal character escapes and delimited escape sequence tweaks
Date: Tue, 6 Sep 2022 21:32:12 -0400
Message-ID: <1a7b8ffa-4b39-0bd8-8d14-8d5d721dd1fa@redhat.com>
References: <5da578e7-9c43-99ea-15c1-aefc641a0654@redhat.com> <37250e6c-80f9-2b93-a381-c1c9b869c04d@redhat.com>

On 9/3/22 06:54, Jakub Jelinek wrote:
> On Sat, Sep 03, 2022 at 12:29:52PM +0200, Jakub Jelinek wrote:
>> On Thu, Sep 01, 2022 at 03:00:28PM -0400, Jason Merrill wrote:
>>> We might as well use the same flag name, and document it to mean what it
>>> currently means for GCC.
>>
>> Ok, following patch introduces -Wunicode (on by default).
>>
>>> It looks like this is handling \N{abc}, for which "incomplete" seems like
>>> the wrong description; it's complete, just wrong, and the diagnostic doesn't
>>> help correct it.
>>
>> And also will emit the is not a valid universal character with did you mean
>> if it matches loosely, otherwise will use the not terminated with } after
>> ... wording.
>>
>> Ok if it passes bootstrap/regtest?

OK, thanks.

> Actually, treating the !strict case like the strict case except for always
> warning instead of error if outside of literals is simpler.
>
> The following version does that.  The only difference on the testcases is in
> the
> int f = a\N{abc});
> cases where it emits different diagnostics.
>
> 2022-09-03  Jakub Jelinek
>
> libcpp/
>         * include/cpplib.h (struct cpp_options): Add cpp_warn_unicode member.
>         (enum cpp_warning_reason): Add CPP_W_UNICODE.
>         * init.cc (cpp_create_reader): Initialize cpp_warn_unicode.
>         * charset.cc (_cpp_valid_ucn): In possible identifier contexts, don't
>         handle \u{ or \N{ specially in -std=c* modes except -std=c++2{3,b}.
>         In possible identifier contexts, don't emit an error and punt
>         if \N isn't followed by {, or if \N{} surrounds some lower case
>         letters or _.  In possible identifier contexts when not C++23, don't
>         emit an error but warning about unknown character names and treat as
>         separate tokens.  When treating as separate tokens \u{ or \N{, emit
>         warnings.
> gcc/
>         * doc/invoke.texi (-Wno-unicode): Document.
> gcc/c-family/
>         * c.opt (Winvalid-utf8): Use ObjC instead of objC.  Remove
>         " in comments" from description.
>         (Wunicode): New option.
> gcc/testsuite/
>         * c-c++-common/cpp/delimited-escape-seq-4.c: New test.
>         * c-c++-common/cpp/delimited-escape-seq-5.c: New test.
>         * c-c++-common/cpp/delimited-escape-seq-6.c: New test.
>         * c-c++-common/cpp/delimited-escape-seq-7.c: New test.
>         * c-c++-common/cpp/named-universal-char-escape-5.c: New test.
>         * c-c++-common/cpp/named-universal-char-escape-6.c: New test.
>         * c-c++-common/cpp/named-universal-char-escape-7.c: New test.
>         * g++.dg/cpp23/named-universal-char-escape1.C: New test.
>         * g++.dg/cpp23/named-universal-char-escape2.C: New test.
>
> --- libcpp/include/cpplib.h.jj 2022-09-03 09:35:41.465984642 +0200
> +++ libcpp/include/cpplib.h 2022-09-03 11:30:57.250677870 +0200
> @@ -565,6 +565,10 @@ struct cpp_options
>       2 if it should be a pedwarn.  */
>    unsigned char cpp_warn_invalid_utf8;
>
> +  /* True if libcpp should warn about invalid forms of delimited or named
> +     escape sequences.  */
> +  bool cpp_warn_unicode;
> +
>    /* True if -finput-charset= option has been used explicitly.  */
>    bool cpp_input_charset_explicit;
>
> @@ -675,7 +679,8 @@ enum cpp_warning_reason {
>    CPP_W_CXX20_COMPAT,
>    CPP_W_EXPANSION_TO_DEFINED,
>    CPP_W_BIDIRECTIONAL,
> -  CPP_W_INVALID_UTF8
> +  CPP_W_INVALID_UTF8,
> +  CPP_W_UNICODE
>  };
>
>  /* Callback for header lookup for HEADER, which is the name of a
> --- libcpp/init.cc.jj 2022-09-01 09:47:23.729892618 +0200
> +++ libcpp/init.cc 2022-09-03 11:19:10.954452329 +0200
> @@ -228,6 +228,7 @@ cpp_create_reader (enum c_lang lang, cpp
>    CPP_OPTION (pfile, warn_date_time) = 0;
>    CPP_OPTION (pfile, cpp_warn_bidirectional) = bidirectional_unpaired;
>    CPP_OPTION (pfile, cpp_warn_invalid_utf8) = 0;
> +  CPP_OPTION (pfile, cpp_warn_unicode) = 1;
>    CPP_OPTION (pfile, cpp_input_charset_explicit) = 0;
>
>    /* Default CPP arithmetic to something sensible for the host for the
> --- libcpp/charset.cc.jj 2022-09-01 14:19:47.462235851 +0200
> +++ libcpp/charset.cc 2022-09-03 12:42:41.800923600 +0200
> @@ -1448,7 +1448,11 @@ _cpp_valid_ucn (cpp_reader *pfile, const
>    if (str[-1] == 'u')
>      {
>        length = 4;
> -      if (str < limit && *str == '{')
> +      if (str < limit
> +          && *str == '{'
> +          && (!identifier_pos
> +              || CPP_OPTION (pfile, delimited_escape_seqs)
> +              || !CPP_OPTION (pfile, std)))
>          {
>            str++;
>            /* Magic value to indicate no digits seen.  */
> @@ -1462,8 +1466,22 @@ _cpp_valid_ucn (cpp_reader *pfile, const
>    else if (str[-1] == 'N')
>      {
>        length = 4;
> +      if (identifier_pos
> +          && !CPP_OPTION (pfile, delimited_escape_seqs)
> +          && CPP_OPTION (pfile, std))
> +        {
> +          *cp = 0;
> +          return false;
> +        }
>        if (str == limit || *str != '{')
> -        cpp_error (pfile, CPP_DL_ERROR, "'\\N' not followed by '{'");
> +        {
> +          if (identifier_pos)
> +            {
> +              *cp = 0;
> +              return false;
> +            }
> +          cpp_error (pfile, CPP_DL_ERROR, "'\\N' not followed by '{'");
> +        }
>        else
>          {
>            str++;
> @@ -1489,15 +1507,19 @@ _cpp_valid_ucn (cpp_reader *pfile, const
>
>            if (str < limit && *str == '}')
>              {
> -              if (name == str && identifier_pos)
> +              if (identifier_pos && name == str)
>                  {
> +                  cpp_warning (pfile, CPP_W_UNICODE,
> +                               "empty named universal character escape "
> +                               "sequence; treating it as separate tokens");
>                    *cp = 0;
>                    return false;
>                  }
>                if (name == str)
>                  cpp_error (pfile, CPP_DL_ERROR,
>                             "empty named universal character escape sequence");
> -              else if (!CPP_OPTION (pfile, delimited_escape_seqs)
> +              else if ((!identifier_pos || strict)
> +                       && !CPP_OPTION (pfile, delimited_escape_seqs)
>                         && CPP_OPTION (pfile, cpp_pedantic))
>                  cpp_error (pfile, CPP_DL_PEDWARN,
>                             "named universal character escapes are only valid "
> @@ -1515,27 +1537,51 @@ _cpp_valid_ucn (cpp_reader *pfile, const
>                                                        uname2c_tree, NULL);
>                    if (result == (cppchar_t) -1)
>                      {
> -                      cpp_error (pfile, CPP_DL_ERROR,
> -                                 "\\N{%.*s} is not a valid universal "
> -                                 "character", (int) (str - name), name);
> +                      bool ret = true;
> +                      if (identifier_pos
> +                          && (!CPP_OPTION (pfile, delimited_escape_seqs)
> +                              || !strict))
> +                        ret = cpp_warning (pfile, CPP_W_UNICODE,
> +                                           "\\N{%.*s} is not a valid "
> +                                           "universal character; treating it "
> +                                           "as separate tokens",
> +                                           (int) (str - name), name);
> +                      else
> +                        cpp_error (pfile, CPP_DL_ERROR,
> +                                   "\\N{%.*s} is not a valid universal "
> +                                   "character", (int) (str - name), name);
>
>                        /* Try to do a loose name lookup according to
>                           Unicode loose matching rule UAX44-LM2.  */
>                        char canon_name[uname2c_max_name_len + 1];
>                        result = _cpp_uname2c_uax44_lm2 ((const char *) name,
>                                                         str - name, canon_name);
> -                      if (result != (cppchar_t) -1)
> +                      if (result != (cppchar_t) -1 && ret)
>                          cpp_error (pfile, CPP_DL_NOTE,
>                                     "did you mean \\N{%s}?", canon_name);
>                        else
> -                        result = 0x40;
> +                        result = 0xC0;
> +                      if (identifier_pos
> +                          && (!CPP_OPTION (pfile, delimited_escape_seqs)
> +                              || !strict))
> +                        {
> +                          *cp = 0;
> +                          return false;
> +                        }
>                      }
>                  }
>                str++;
>                extend_char_range (char_range, loc_reader);
>              }
>            else if (identifier_pos)
> -            length = 1;
> +            {
> +              cpp_warning (pfile, CPP_W_UNICODE,
> +                           "'\\N{' not terminated with '}' after %.*s; "
> +                           "treating it as separate tokens",
> +                           (int) (str - base), base);
> +              *cp = 0;
> +              return false;
> +            }
>            else
>              {
>                cpp_error (pfile, CPP_DL_ERROR,
> @@ -1584,12 +1630,17 @@ _cpp_valid_ucn (cpp_reader *pfile, const
>          }
>        while (--length);
>
> -      if (delimited
> -          && str < limit
> -          && *str == '}'
> -          && (length != 32 || !identifier_pos))
> +      if (delimited && str < limit && *str == '}')
>          {
> -          if (length == 32)
> +          if (length == 32 && identifier_pos)
> +            {
> +              cpp_warning (pfile, CPP_W_UNICODE,
> +                           "empty delimited escape sequence; "
> +                           "treating it as separate tokens");
> +              *cp = 0;
> +              return false;
> +            }
> +          else if (length == 32)
>              cpp_error (pfile, CPP_DL_ERROR,
>                         "empty delimited escape sequence");
>            else if (!CPP_OPTION (pfile, delimited_escape_seqs)
> @@ -1607,6 +1658,11 @@ _cpp_valid_ucn (cpp_reader *pfile, const
>           error message in that case.  */
>        if (length && identifier_pos)
>          {
> +          if (delimited)
> +            cpp_warning (pfile, CPP_W_UNICODE,
> +                         "'\\u{' not terminated with '}' after %.*s; "
> +                         "treating it as separate tokens",
> +                         (int) (str - base), base);
>            *cp = 0;
>            return false;
>          }
> --- gcc/doc/invoke.texi.jj 2022-09-03 09:35:40.966991672 +0200
> +++ gcc/doc/invoke.texi 2022-09-03 11:39:03.875914845 +0200
> @@ -365,7 +365,7 @@ Objective-C and Objective-C++ Dialects}.
>  -Winfinite-recursion @gol
>  -Winit-self -Winline -Wno-int-conversion -Wint-in-bool-context @gol
>  -Wno-int-to-pointer-cast -Wno-invalid-memory-model @gol
> --Winvalid-pch -Winvalid-utf8 -Wjump-misses-init @gol
> +-Winvalid-pch -Winvalid-utf8 -Wno-unicode -Wjump-misses-init @gol
>  -Wlarger-than=@var{byte-size} -Wlogical-not-parentheses -Wlogical-op @gol
>  -Wlong-long -Wno-lto-type-mismatch -Wmain -Wmaybe-uninitialized @gol
>  -Wmemset-elt-size -Wmemset-transposed-args @gol
> @@ -9577,6 +9577,12 @@ Warn if an invalid UTF-8 character is fo
>  This warning is on by default for C++23 if @option{-finput-charset=UTF-8}
>  is used and turned into error with @option{-pedantic-errors}.
>
> +@item -Wno-unicode
> +@opindex Wunicode
> +@opindex Wno-unicode
> +Don't diagnose invalid forms of delimited or named escape sequences which are
> +treated as separate tokens.  @option{Wunicode} is enabled by default.
> +
>  @item -Wlong-long
>  @opindex Wlong-long
>  @opindex Wno-long-long
> --- gcc/c-family/c.opt.jj 2022-09-03 09:35:40.206002393 +0200
> +++ gcc/c-family/c.opt 2022-09-03 11:17:04.529201926 +0200
> @@ -822,8 +822,8 @@ C ObjC C++ ObjC++ CPP(warn_invalid_pch)
>  Warn about PCH files that are found but not used.
>
>  Winvalid-utf8
> -C objC C++ ObjC++ CPP(cpp_warn_invalid_utf8) CppReason(CPP_W_INVALID_UTF8) Var(warn_invalid_utf8) Init(0) Warning
> -Warn about invalid UTF-8 characters in comments.
> +C ObjC C++ ObjC++ CPP(cpp_warn_invalid_utf8) CppReason(CPP_W_INVALID_UTF8) Var(warn_invalid_utf8) Init(0) Warning
> +Warn about invalid UTF-8 characters.
>
>  Wjump-misses-init
>  C ObjC Var(warn_jump_misses_init) Warning LangEnabledby(C ObjC,Wc++-compat)
> @@ -1345,6 +1345,10 @@ Wundef
>  C ObjC C++ ObjC++ CPP(warn_undef) CppReason(CPP_W_UNDEF) Var(cpp_warn_undef) Init(0) Warning
>  Warn if an undefined macro is used in an #if directive.
>
> +Wunicode
> +C ObjC C++ ObjC++ CPP(cpp_warn_unicode) CppReason(CPP_W_UNICODE) Var(warn_unicode) Init(1) Warning
> +Warn about invalid forms of delimited or named escape sequences.
> +
>  Wuninitialized
>  C ObjC C++ ObjC++ LTO LangEnabledBy(C ObjC C++ ObjC++ LTO,Wall)
>  ;
> --- gcc/testsuite/c-c++-common/cpp/delimited-escape-seq-4.c.jj 2022-09-03 11:13:37.570068845 +0200
> +++ gcc/testsuite/c-c++-common/cpp/delimited-escape-seq-4.c 2022-09-03 11:56:52.818054420 +0200
> @@ -0,0 +1,13 @@
> +/* P2290R3 - Delimited escape sequences */
> +/* { dg-do compile } */
> +/* { dg-require-effective-target wchar } */
> +/* { dg-options "-std=gnu99 -Wno-c++-compat" { target c } } */
> +/* { dg-options "-std=gnu++20" { target c++ } } */
> +
> +#define z(x) 0
> +#define a z(
> +int b = a\u{}); /* { dg-warning "empty delimited escape sequence; treating it as separate tokens" } */
> +int c = a\u{); /* { dg-warning "'\\\\u\\\{' not terminated with '\\\}' after \\\\u\\\{; treating it as separate tokens" } */
> +int d = a\u{12XYZ}); /* { dg-warning "'\\\\u\\\{' not terminated with '\\\}' after \\\\u\\\{12; treating it as separate tokens" } */
> +int e = a\u123);
> +int f = a\U1234567);
> --- gcc/testsuite/c-c++-common/cpp/delimited-escape-seq-5.c.jj 2022-09-03 11:13:37.570068845 +0200
> +++ gcc/testsuite/c-c++-common/cpp/delimited-escape-seq-5.c 2022-09-03 12:01:35.618124647 +0200
> @@ -0,0 +1,13 @@
> +/* P2290R3 - Delimited escape sequences */
> +/* { dg-do compile } */
> +/* { dg-require-effective-target wchar } */
> +/* { dg-options "-std=c17 -Wno-c++-compat" { target c } } */
> +/* { dg-options "-std=c++23" { target c++ } } */
> +
> +#define z(x) 0
> +#define a z(
> +int b = a\u{}); /* { dg-warning "empty delimited escape sequence; treating it as separate tokens" "" { target c++23 } } */
> +int c = a\u{); /* { dg-warning "'\\\\u\\\{' not terminated with '\\\}' after \\\\u\\\{; treating it as separate tokens" "" { target c++23 } } */
> +int d = a\u{12XYZ}); /* { dg-warning "'\\\\u\\\{' not terminated with '\\\}' after \\\\u\\\{12; treating it as separate tokens" "" { target c++23 } } */
> +int e = a\u123);
> +int f = a\U1234567);
> --- gcc/testsuite/c-c++-common/cpp/delimited-escape-seq-6.c.jj 2022-09-03 11:59:36.573778876 +0200
> +++ gcc/testsuite/c-c++-common/cpp/delimited-escape-seq-6.c 2022-09-03 11:59:55.808511591 +0200
> @@ -0,0 +1,13 @@
> +/* P2290R3 - Delimited escape sequences */
> +/* { dg-do compile } */
> +/* { dg-require-effective-target wchar } */
> +/* { dg-options "-std=gnu99 -Wno-c++-compat -Wno-unicode" { target c } } */
> +/* { dg-options "-std=gnu++20 -Wno-unicode" { target c++ } } */
> +
> +#define z(x) 0
> +#define a z(
> +int b = a\u{}); /* { dg-bogus "empty delimited escape sequence; treating it as separate tokens" } */
> +int c = a\u{); /* { dg-bogus "'\\\\u\\\{' not terminated with '\\\}' after \\\\u\\\{; treating it as separate tokens" } */
> +int d = a\u{12XYZ}); /* { dg-bogus "'\\\\u\\\{' not terminated with '\\\}' after \\\\u\\\{12; treating it as separate tokens" } */
> +int e = a\u123);
> +int f = a\U1234567);
> --- gcc/testsuite/c-c++-common/cpp/delimited-escape-seq-7.c.jj 2022-09-03 12:01:48.958939255 +0200
> +++ gcc/testsuite/c-c++-common/cpp/delimited-escape-seq-7.c 2022-09-03 12:02:16.765552854 +0200
> @@ -0,0 +1,13 @@
> +/* P2290R3 - Delimited escape sequences */
> +/* { dg-do compile } */
> +/* { dg-require-effective-target wchar } */
> +/* { dg-options "-std=c17 -Wno-c++-compat -Wno-unicode" { target c } } */
> +/* { dg-options "-std=c++23 -Wno-unicode" { target c++ } } */
> +
> +#define z(x) 0
> +#define a z(
> +int b = a\u{}); /* { dg-bogus "empty delimited escape sequence; treating it as separate tokens" } */
> +int c = a\u{); /* { dg-bogus "'\\\\u\\\{' not terminated with '\\\}' after \\\\u\\\{; treating it as separate tokens" } */
> +int d = a\u{12XYZ}); /* { dg-bogus "'\\\\u\\\{' not terminated with '\\\}' after \\\\u\\\{12; treating it as separate tokens" } */
> +int e = a\u123);
> +int f = a\U1234567);
> --- gcc/testsuite/c-c++-common/cpp/named-universal-char-escape-5.c.jj 2022-09-03 11:13:37.570068845 +0200
> +++ gcc/testsuite/c-c++-common/cpp/named-universal-char-escape-5.c 2022-09-03 12:45:18.968747909 +0200
> @@ -0,0 +1,17 @@
> +/* P2071R2 - Named universal character escapes */
> +/* { dg-do compile } */
> +/* { dg-require-effective-target wchar } */
> +/* { dg-options "-std=gnu99 -Wno-c++-compat" { target c } } */
> +/* { dg-options "-std=gnu++20" { target c++ } } */
> +
> +#define z(x) 0
> +#define a z(
> +int b = a\N{}); /* { dg-warning "empty named universal character escape sequence; treating it as separate tokens" } */
> +int c = a\N{); /* { dg-warning "'\\\\N\\\{' not terminated with '\\\}' after \\\\N\\\{; treating it as separate tokens" } */
> +int d = a\N);
> +int e = a\NARG);
> +int f = a\N{abc}); /* { dg-warning "\\\\N\\\{abc\\\} is not a valid universal character; treating it as separate tokens" } */
> +int g = a\N{ABC.123}); /* { dg-warning "'\\\\N\\\{' not terminated with '\\\}' after \\\\N\\\{ABC; treating it as separate tokens" } */
> +int h = a\N{NON-EXISTENT CHAR}); /* { dg-warning "\\\\N\\\{NON-EXISTENT CHAR\\\} is not a valid universal character; treating it as separate tokens" } */
> +int i = a\N{Latin_Small_Letter_A_With_Acute}); /* { dg-warning "\\\\N\\\{Latin_Small_Letter_A_With_Acute\\\} is not a valid universal character; treating it as separate tokens" } */
> +    /* { dg-message "did you mean \\\\N\\\{LATIN SMALL LETTER A WITH ACUTE\\\}\\?" "" { target *-*-* } .-1 } */
> --- gcc/testsuite/c-c++-common/cpp/named-universal-char-escape-6.c.jj 2022-09-03 11:13:37.570068845 +0200
> +++ gcc/testsuite/c-c++-common/cpp/named-universal-char-escape-6.c 2022-09-03 11:44:34.558316155 +0200
> @@ -0,0 +1,17 @@
> +/* P2071R2 - Named universal character escapes */
> +/* { dg-do compile } */
> +/* { dg-require-effective-target wchar } */
> +/* { dg-options "-std=c17 -Wno-c++-compat" { target c } } */
> +/* { dg-options "-std=c++20" { target c++ } } */
> +
> +#define z(x) 0
> +#define a z(
> +int b = a\N{});
> +int c = a\N{);
> +int d = a\N);
> +int e = a\NARG);
> +int f = a\N{abc});
> +int g = a\N{ABC.123});
> +int h = a\N{NON-EXISTENT CHAR}); /* { dg-bogus "is not a valid universal character" } */
> +int i = a\N{Latin_Small_Letter_A_With_Acute});
> +int j = a\N{LATIN SMALL LETTER A WITH ACUTE});
> --- gcc/testsuite/c-c++-common/cpp/named-universal-char-escape-7.c.jj 2022-09-03 12:18:31.296022384 +0200
> +++ gcc/testsuite/c-c++-common/cpp/named-universal-char-escape-7.c 2022-09-03 12:45:57.663212248 +0200
> @@ -0,0 +1,17 @@
> +/* P2071R2 - Named universal character escapes */
> +/* { dg-do compile } */
> +/* { dg-require-effective-target wchar } */
> +/* { dg-options "-std=gnu99 -Wno-c++-compat -Wno-unicode" { target c } } */
> +/* { dg-options "-std=gnu++20 -Wno-unicode" { target c++ } } */
> +
> +#define z(x) 0
> +#define a z(
> +int b = a\N{}); /* { dg-bogus "empty named universal character escape sequence; treating it as separate tokens" } */
> +int c = a\N{); /* { dg-bogus "'\\\\N\\\{' not terminated with '\\\}' after \\\\N\\\{; treating it as separate tokens" } */
> +int d = a\N);
> +int e = a\NARG);
> +int f = a\N{abc}); /* { dg-bogus "\\\\N\\\{abc\\\} is not a valid universal character; treating it as separate tokens" } */
> +int g = a\N{ABC.123}); /* { dg-bogus "'\\\\N\\\{' not terminated with '\\\}' after \\\\N\\\{ABC; treating it as separate tokens" } */
> +int h = a\N{NON-EXISTENT CHAR}); /* { dg-bogus "\\\\N\\\{NON-EXISTENT CHAR\\\} is not a valid universal character; treating it as separate tokens" } */
> +int i = a\N{Latin_Small_Letter_A_With_Acute}); /* { dg-bogus "\\\\N\\\{Latin_Small_Letter_A_With_Acute\\\} is not a valid universal character; treating it as separate tokens" } */
> +    /* { dg-bogus "did you mean \\\\N\\\{LATIN SMALL LETTER A WITH ACUTE\\\}\\?" "" { target *-*-* } .-1 } */
> --- gcc/testsuite/g++.dg/cpp23/named-universal-char-escape1.C.jj 2022-09-03 11:13:37.571068831 +0200
> +++ gcc/testsuite/g++.dg/cpp23/named-universal-char-escape1.C 2022-09-03 12:44:03.893787182 +0200
> @@ -0,0 +1,16 @@
> +// P2071R2 - Named universal character escapes
> +// { dg-do compile }
> +// { dg-require-effective-target wchar }
> +
> +#define z(x) 0
> +#define a z(
> +int b = a\N{}); // { dg-warning "empty named universal character escape sequence; treating it as separate tokens" "" { target c++23 } }
> +int c = a\N{); // { dg-warning "'\\\\N\\\{' not terminated with '\\\}' after \\\\N\\\{; treating it as separate tokens" "" { target c++23 } }
> +int d = a\N);
> +int e = a\NARG);
> +int f = a\N{abc}); // { dg-warning "\\\\N\\\{abc\\\} is not a valid universal character; treating it as separate tokens" "" { target c++23 } }
> +int g = a\N{ABC.123}); // { dg-warning "'\\\\N\\\{' not terminated with '\\\}' after \\\\N\\\{ABC; treating it as separate tokens" "" { target c++23 } }
> +int h = a\N{NON-EXISTENT CHAR}); // { dg-error "is not a valid universal character" "" { target c++23 } }
> +    // { dg-error "was not declared in this scope" "" { target c++23 } .-1 }
> +int i = a\N{Latin_Small_Letter_A_With_Acute}); // { dg-warning "\\\\N\\\{Latin_Small_Letter_A_With_Acute\\\} is not a valid universal character; treating it as separate tokens" "" { target c++23 } }
> +    // { dg-message "did you mean \\\\N\\\{LATIN SMALL LETTER A WITH ACUTE\\\}\\?" "" { target c++23 } .-1 }
> --- gcc/testsuite/g++.dg/cpp23/named-universal-char-escape2.C.jj 2022-09-03 11:13:37.571068831 +0200
> +++ gcc/testsuite/g++.dg/cpp23/named-universal-char-escape2.C 2022-09-03 12:44:31.723401937 +0200
> @@ -0,0 +1,18 @@
> +// P2071R2 - Named universal character escapes
> +// { dg-do compile }
> +// { dg-require-effective-target wchar }
> +// { dg-options "" }
> +
> +#define z(x) 0
> +#define a z(
> +int b = a\N{}); // { dg-warning "empty named universal character escape sequence; treating it as separate tokens" }
> +int c = a\N{); // { dg-warning "'\\\\N\\\{' not terminated with '\\\}' after \\\\N\\\{; treating it as separate tokens" }
> +int d = a\N);
> +int e = a\NARG);
> +int f = a\N{abc}); // { dg-warning "\\\\N\\\{abc\\\} is not a valid universal character; treating it as separate tokens" }
> +int g = a\N{ABC.123}); // { dg-warning "'\\\\N\\\{' not terminated with '\\\}' after \\\\N\\\{ABC; treating it as separate tokens" }
> +int h = a\N{NON-EXISTENT CHAR}); // { dg-error "is not a valid universal character" "" { target c++23 } }
> +    // { dg-error "was not declared in this scope" "" { target c++23 } .-1 }
> +    // { dg-warning "\\\\N\\\{NON-EXISTENT CHAR\\\} is not a valid universal character; treating it as separate tokens" "" { target c++20_down } .-2 }
> +int i = a\N{Latin_Small_Letter_A_With_Acute}); // { dg-warning "\\\\N\\\{Latin_Small_Letter_A_With_Acute\\\} is not a valid universal character; treating it as separate tokens" }
> +    // { dg-message "did you mean \\\\N\\\{LATIN SMALL LETTER A WITH ACUTE\\\}\\?" "" { target *-*-* } .-1 }
>
>
>         Jakub
>
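
For readers skimming the delimited-escape-seq tests above, the three \u{ outcomes they exercise in identifier context (empty, unterminated, well-formed) can be sketched outside libcpp.  This is a minimal standalone C illustration, not the libcpp implementation: the names esc_kind and classify_delimited_escape are hypothetical, and it ignores everything _cpp_valid_ucn additionally checks (code point range, identifier validity, language mode, -Wunicode itself).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical outcome categories, named after the diagnostics in the
   patch; they are not libcpp types.  */
enum esc_kind
{
  ESC_NOT_DELIMITED,   /* No '{' after \u: classic \uXXXX rules apply.  */
  ESC_EMPTY,           /* \u{}  */
  ESC_UNTERMINATED,    /* \u{ or \u{12XYZ}: digits not followed by '}'.  */
  ESC_VALID            /* \u{1F600}  */
};

/* Classify the text S that follows "\u".  *CONSUMED receives the number
   of characters scanned before the decision, which corresponds to the
   "after \u{12" part of the warning text in the tests.  */
static enum esc_kind
classify_delimited_escape (const char *s, size_t *consumed)
{
  assert (s != NULL && consumed != NULL);

  size_t i = 0;
  if (s[i] != '{')
    {
      *consumed = 0;
      return ESC_NOT_DELIMITED;
    }
  i++;

  /* Scan the run of hexadecimal digits inside the braces.  */
  size_t digits = 0;
  while (isxdigit ((unsigned char) s[i]))
    {
      i++;
      digits++;
    }

  /* Anything other than '}' here (including end of input) means the
     sequence is not terminated where it has to be.  */
  if (s[i] != '}')
    {
      *consumed = i;
      return ESC_UNTERMINATED;
    }

  *consumed = i + 1;
  return digits == 0 ? ESC_EMPTY : ESC_VALID;
}
```

Passing "{12XYZ}" stops after "{12", mirroring the "not terminated with '}' after \u{12" wording expected for `int d = a\u{12XYZ});`, while "123" is reported as not delimited, matching the `a\u123` line that carries no diagnostic in those tests.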