From: Jason Merrill
To: Jakub Jelinek
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] libcpp: Implement P2362R3 - Remove non-encodable wide character literals and multicharacter [PR106647]
Date: Fri, 26 Aug 2022 09:59:08 -0400
Message-ID: <7762dd3e-9afd-f72f-7137-bd430618a5cb@redhat.com>

On 8/26/22 03:35, Jakub Jelinek wrote:
> Hi!
>
> My understanding of the paper is that we just want to promote the CPP_WCHAR
> "character constant too long for its type" warning to error as it is already
> error for u8, u and U literals.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

> 2022-08-26  Jakub Jelinek
>
>     PR c++/106647
>     * charset.cc (wide_str_to_charconst): Implement P2362R3 - Remove
>     non-encodable wide character literals and multicharacter.  For
>     C++23 use CPP_DL_ERROR instead of CPP_DL_WARNING for
>     "character constant too long for its type" diagnostics on CPP_WCHAR
>     literals.
>
>     * g++.dg/cpp23/wchar-multi1.C: New test.
>     * g++.dg/cpp23/wchar-multi2.C: New test.
>
> --- libcpp/charset.cc.jj 2022-08-25 11:54:38.849924475 +0200
> +++ libcpp/charset.cc 2022-08-25 18:36:20.650415220 +0200
> @@ -2170,7 +2170,11 @@ wide_str_to_charconst (cpp_reader *pfile
>       character constant is guaranteed to overflow.  */
>    if (str.len > nbwc * 2)
>      cpp_error (pfile, (CPP_OPTION (pfile, cplusplus)
> -                       && (type == CPP_CHAR16 || type == CPP_CHAR32))
> +                       && (type == CPP_CHAR16
> +                           || type == CPP_CHAR32
> +                           /* In C++23 this is error even for L'ab'.  */
> +                           || (type == CPP_WCHAR
> +                               && CPP_OPTION (pfile, size_t_literals))))
>                         ? CPP_DL_ERROR : CPP_DL_WARNING,
>                 "character constant too long for its type");
>
> --- gcc/testsuite/g++.dg/cpp23/wchar-multi1.C.jj 2022-08-25 18:08:01.973426155 +0200
> +++ gcc/testsuite/g++.dg/cpp23/wchar-multi1.C 2022-08-25 18:51:30.476687112 +0200
> @@ -0,0 +1,42 @@
> +// P2362R3 - Remove non-encodable wide character literals and multicharacter
> +// wide character literals.
> +// { dg-do compile }
> +
> +char a = 'a';
> +int b = 'ab'; // { dg-warning "multi-character character constant" }
> +int c = '\u05D9'; // { dg-warning "multi-character character constant" }
> +#if __SIZEOF_INT__ > 2
> +int d = '\U0001F525'; // { dg-warning "multi-character character constant" "" { target int32 } }
> +#endif
> +int e = 'abcd'; // { dg-warning "multi-character character constant" }
> +wchar_t f = L'f';
> +wchar_t g = L'gh'; // { dg-error "character constant too long for its type" "" { target c++23 } }
> +                   // { dg-warning "character constant too long for its type" "" { target c++20_down } .-1 }
> +wchar_t h = L'ijkl'; // { dg-error "character constant too long for its type" "" { target c++23 } }
> +                     // { dg-warning "character constant too long for its type" "" { target c++20_down } .-1 }
> +wchar_t i = L'\U0001F525'; // { dg-error "character constant too long for its type" "" { target { c++23 && { ! 4byte_wchar_t } } } }
> +                           // { dg-warning "character constant too long for its type" "" { target { c++20_down && { ! 4byte_wchar_t } } } .-1 }
> +#ifdef __cpp_char8_t
> +typedef char8_t u8;
> +#else
> +typedef char u8;
> +#endif
> +#if __cpp_unicode_characters >= 201411
> +u8 j = u8'j';
> +u8 k = u8'kl'; // { dg-error "character constant too long for its type" "" { target c++17 } }
> +u8 l = u8'\U0001F525'; // { dg-error "character constant too long for its type" "" { target c++17 } }
> +#endif
> +#if __cpp_unicode_characters >= 200704
> +char16_t m = u'm';
> +char16_t n = u'no'; // { dg-error "character constant too long for its type" "" { target c++11 } }
> +char16_t o = u'\u05D9';
> +char16_t p = u'\U0001F525'; // { dg-error "character constant too long for its type" "" { target c++11 } }
> +char32_t q = U'm';
> +char32_t r = U'no'; // { dg-error "character constant too long for its type" "" { target c++11 } }
> +char32_t s = U'\u05D9';
> +char32_t t = U'\U0001F525';
> +#endif
> +wchar_t u = L'\u0065\u0301'; // { dg-error "character constant too long for its type" "" { target c++23 } }
> +                             // { dg-warning "character constant too long for its type" "" { target c++20_down } .-1 }
> +wchar_t v = L'é'; // { dg-error "character constant too long for its type" "" { target c++23 } }
> +                   // { dg-warning "character constant too long for its type" "" { target c++20_down } .-1 }
> --- gcc/testsuite/g++.dg/cpp23/wchar-multi2.C.jj 2022-08-25 18:51:53.744386945 +0200
> +++ gcc/testsuite/g++.dg/cpp23/wchar-multi2.C 2022-08-25 18:53:03.317489442 +0200
> @@ -0,0 +1,43 @@
> +// P2362R3 - Remove non-encodable wide character literals and multicharacter
> +// wide character literals.
> +// { dg-do compile }
> +// { dg-options "-fshort-wchar" }
> +
> +char a = 'a';
> +int b = 'ab'; // { dg-warning "multi-character character constant" }
> +int c = '\u05D9'; // { dg-warning "multi-character character constant" }
> +#if __SIZEOF_INT__ > 2
> +int d = '\U0001F525'; // { dg-warning "multi-character character constant" "" { target int32 } }
> +#endif
> +int e = 'abcd'; // { dg-warning "multi-character character constant" }
> +wchar_t f = L'f';
> +wchar_t g = L'gh'; // { dg-error "character constant too long for its type" "" { target c++23 } }
> +                   // { dg-warning "character constant too long for its type" "" { target c++20_down } .-1 }
> +wchar_t h = L'ijkl'; // { dg-error "character constant too long for its type" "" { target c++23 } }
> +                     // { dg-warning "character constant too long for its type" "" { target c++20_down } .-1 }
> +wchar_t i = L'\U0001F525'; // { dg-error "character constant too long for its type" "" { target { c++23 } } }
> +                           // { dg-warning "character constant too long for its type" "" { target { c++20_down } } .-1 }
> +#ifdef __cpp_char8_t
> +typedef char8_t u8;
> +#else
> +typedef char u8;
> +#endif
> +#if __cpp_unicode_characters >= 201411
> +u8 j = u8'j';
> +u8 k = u8'kl'; // { dg-error "character constant too long for its type" "" { target c++17 } }
> +u8 l = u8'\U0001F525'; // { dg-error "character constant too long for its type" "" { target c++17 } }
> +#endif
> +#if __cpp_unicode_characters >= 200704
> +char16_t m = u'm';
> +char16_t n = u'no'; // { dg-error "character constant too long for its type" "" { target c++11 } }
> +char16_t o = u'\u05D9';
> +char16_t p = u'\U0001F525'; // { dg-error "character constant too long for its type" "" { target c++11 } }
> +char32_t q = U'm';
> +char32_t r = U'no'; // { dg-error "character constant too long for its type" "" { target c++11 } }
> +char32_t s = U'\u05D9';
> +char32_t t = U'\U0001F525';
> +#endif
> +wchar_t u = L'\u0065\u0301'; // { dg-error "character constant too long for its type" "" { target c++23 } }
> +                             // { dg-warning "character constant too long for its type" "" { target c++20_down } .-1 }
> +wchar_t v = L'é'; // { dg-error "character constant too long for its type" "" { target c++23 } }
> +                   // { dg-warning "character constant too long for its type" "" { target c++20_down } .-1 }
>
>     Jakub
>
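
For anyone reading the charset.cc hunk in isolation, the severity choice it makes can be modelled as a small standalone function. This is only a sketch under stated assumptions: LitKind, Severity, LangOpts and too_long_severity are hypothetical names, not libcpp's API, and the cxx23 flag stands in for the CPP_OPTION (pfile, size_t_literals) test that the patch appears to use as its C++23 check.

```cpp
// Hypothetical model of the diagnostic-level decision in the patched
// wide_str_to_charconst; none of these names exist in libcpp.
enum class LitKind { Wide, Char16, Char32 };   // L'', u'', U'' literals
enum class Severity { Warning, Error };

struct LangOpts {
  bool cplusplus;  // compiling C++ rather than C
  bool cxx23;      // assumption: true wherever size_t_literals would be set
};

// Severity of "character constant too long for its type".
// u'' and U'' literals are errors in any C++ mode; L'' literals become
// errors only when the C++23 flag is set, otherwise they stay warnings.
constexpr Severity too_long_severity (LangOpts opts, LitKind kind)
{
  return (opts.cplusplus
          && (kind == LitKind::Char16
              || kind == LitKind::Char32
              || (kind == LitKind::Wide && opts.cxx23)))
         ? Severity::Error : Severity::Warning;
}
```

Under this model, L'gh' maps to Severity::Error with { cplusplus = true, cxx23 = true } and to Severity::Warning with cxx23 = false, which matches the dg-error / dg-warning pairs in the two new tests.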