From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id D5D7E385AC3E for ; Thu, 2 Dec 2021 20:33:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org D5D7E385AC3E Received: from mail-qk1-f199.google.com (mail-qk1-f199.google.com [209.85.222.199]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-199-Zq_w2A6oOY2AimOxb88ysw-1; Thu, 02 Dec 2021 15:33:01 -0500 X-MC-Unique: Zq_w2A6oOY2AimOxb88ysw-1 Received: by mail-qk1-f199.google.com with SMTP id bs14-20020a05620a470e00b0046b1e29f53cso1185372qkb.0 for ; Thu, 02 Dec 2021 12:33:01 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent:subject :content-language:to:cc:references:from:in-reply-to :content-transfer-encoding; bh=rMNYk5qxTwoo5Tngsr89lFjZ1GI7oUDr0MRzz+0Lbk8=; b=MHqbIEzLK94ptXO+h2ZXGAmBIZvsuE/ljSpMobf8zpDnmrlkF0Wp7zarWIPAKHRjBp tamRrDoDHwqLfvG/RyM9Xbpd7sm+k44aEnKapPlKKQwhw4LwqC1opfd0i9u74MH1leZt 6tCIvwZvPTlGKAEYC2Ou7B9zLGjNRDCRiYGmjyP5rm32h1g99qgVFpZEkLTdNnU6F+im CwrSmd7PhfPPE/hrxXgdDO++i0LaAUVKKuucU6JRCKWTPeSILpta+N9G3QqaerxSebaI InKGuG6LyeDb+F2T1tl7WKqP1e4QIL44AIiMo4Bo/xRPG0fJwyTMgXQCJrkTWYzw19bb H6aw== X-Gm-Message-State: AOAM533+99yhuNcv2htOdZxnSud4dyyMCprSDXTPY0xjtGehZaS8zc/l whamjV0nhi+HR2vWdu9IMbR1KyU8hJAzI8BaTdpTS/vI9NPfnCUSnM1QuON3R/7JpQc3tirM5zH Zgn3QjdF1WWftlVgqTQ== X-Received: by 2002:ac8:5a54:: with SMTP id o20mr16178359qta.480.1638477180764; Thu, 02 Dec 2021 12:33:00 -0800 (PST) X-Google-Smtp-Source: ABdhPJxLRKLixyEFtzsMFR4aJciRYAMZWpCTgEHzbAtAGAFtPg+9oblPOQFqV2njAiMNwfSzQZG5rA== X-Received: by 2002:ac8:5a54:: with SMTP id o20mr16178328qta.480.1638477180360; Thu, 02 Dec 2021 12:33:00 -0800 (PST) Received: from [192.168.1.149] (130-44-159-43.s15913.c3-0.arl-cbr1.sbo-arl.ma.cable.rcncustomer.com. [130.44.159.43]) by smtp.gmail.com with ESMTPSA id h12sm588431qkp.52.2021.12.02.12.32.58 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 02 Dec 2021 12:32:59 -0800 (PST) Message-ID: <1f03c962-edc6-40ed-0327-76dedfca832c@redhat.com> Date: Thu, 2 Dec 2021 15:32:58 -0500 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.3.2 Subject: Re: [PATCH] c++, v2: Allow indeterminate unsigned char or std::byte in bit_cast - P1272R4 To: Jakub Jelinek Cc: gcc-patches@gcc.gnu.org, Jonathan Wakely References: <20211129163142.GG2646553@tucnak> <20211130121707.GM2646553@tucnak> <20211201094600.GV2646553@tucnak> From: Jason Merrill In-Reply-To: <20211201094600.GV2646553@tucnak> X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-5.5 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, KAM_NUMSUBJECT, NICE_REPLY_A, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPAM_BODY, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=no autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Dec 2021 20:33:05 -0000 On 12/1/21 04:46, Jakub Jelinek wrote: > On Tue, Nov 30, 2021 at 03:19:19PM -0500, Jason Merrill wrote: >> On 11/30/21 07:17, Jakub Jelinek wrote: >>> On Mon, Nov 29, 2021 at 10:25:58PM -0500, Jason Merrill wrote: >>>> It's a DR. Really, it was intended to be part of C++20; at the Cologne >>>> meeting in 2019 CWG thought byteswap was going to make C++20, so this bugfix >>>> could go in as part of that paper. >>> >>> Ok, changed to be done unconditionally now. >>> >>>> Also, allowing indeterminate values that are never read was in C++20 >>>> (P1331). >>> >>> Reading P1331R2 again, I'm still puzzled. >>> Our current behavior (both before and after this patch) is that if >>> some variable is scalar and has indeterminate value or if an aggregate >>> variable has some members (possibly nested) with indeterminate values, >>> in constexpr contexts we allow copying those into other vars of the >>> same type (e.g. the testcases in the patch below test mere copying >>> of the whole structures or unsigned char result of __builtin_bit_cast), >> >> That seems to be a bug, since the copy involves an lvalue-to-rvalue >> conversion. > > Yeah. The questions are > 1) where in constexpr.c to check for this and diagnose it > 2) if we do it quite often, whether check for it won't be too expensive, > because we want to check recursively for CONSTRUCTORs with > CONSTRUCTOR_NO_CLEARING and possibly whether it has any members or > array elements that are omitted in the CONSTRUCTOR (or can we rely > on CONSTRUCTOR_NO_CLEARING being set only if it is incomplete? Then > we'd need to make sure we clear that flag if we store the last member > or array element into it) > E.g. for 1), I wonder where exactly the lvalue-to-rvalue conversions occur, > say for > struct S { int a, b, c; }; > constexpr int > foo () > { > struct S s; > s.a = 1; > s.c = 3; > return s.a; // Does lvalue-to-rvalue conversion happen here on whole s > // and then on s.a too, or just on s.a ? Only on s.a. Class copy is modeled as memberwise copy, even though we optimize it into block copy for trivially copyable classes. > } > E.g. when cxx_eval_component_reference or cxx_eval_array_reference > or cxx_eval_bit_field_ref we cxx_eval_constant_expression the base > first, so if the 1) diagnostics would be in cxx_eval_constant_expression > when it is called on an lvalue_p with !lval, we'd trigger it even when > we might not want to trigger it... I think we can work around that so that we complain about the full CONSTRUCTOR when we're doing a full copy, and otherwise only complain about the member. >>> but we reject if we actually use them in some other way (e.g. try to >>> read a member from a variable that has that member indeterminate, >>> see e.g. bit-cast14.C (f5, f6, f7), even when reading it into an >>> unsigned char variable. >> >> That's correct. >> >>> Then there is P1331R2 which makes the UB on >>> "an lvalue-to-rvalue conversion that is applied to an object with >>> indeterminate value ([basic.indet]);" >>> but isn't even the >>> unsigned char a = __builtin_bit_cast (unsigned char, u); >>> unsigned char b = a; >>> case non-constant then when __builtin_bit_cast returns indeterminate value? >> >> Good point. So it would seem to follow that if the output is going to have >> an indeterminate value, it's non-constant, we don't have to work hard in >> constexpr evaluation, and f1-4 are all non-constant. And the new bit_cast >> text is only interesting for non-constant evaluation. > > I think it isn't that simple. One thing is that std::bit_cast is described > as a black box that takes a reference argument, so whether lvalue-to-rvalue > conversion happens there is just something that isn't guaranteed (though, > the way __builtin_bit_cast is implemented which is called with the value > rather than its address there is lvalue-to-rvalue conversion on it). Eh, all functions in the library are described as black boxes. But your next argument is more persuasive: > But another more important thing is that at least in the testcases in the > patch if one kills those extra copies of __builtin_bit_cast lhs to other > vars which are non-constant according to P1331R2, I don't see anything > that would imply they are non-constant. If there is the lvalue-to-rvalue > conversion on what the std::bit_cast argument refers to, that would make > non-constant e.g. class objects with some members indeterminate (e.g. > in the above foo s is like that), but in the testcases the > __builtin_bit_cast arguments have all named members initialized, all they > have is some padding bits in between those members and that can't be > non-constant in itself, otherwise most of constant evaluation would be > non-constant. And the std::bit_cast rules say what happens in that case. Ah, good point. >>> __builtin_bit_cast returns rvalue, so no lvalue-to-rvalue conversion happens >>> in that case, so supposely >>> unsigned char a = __builtin_bit_cast (unsigned char, u); >>> is fine, but on >> >> Eh, there's clearly an lvalue-rvalue conversion involved in reading from the >> source value. > > Yes, but if u doesn't contain any indeterminate (leaf) members, then > it shouldn't be non-constant. > > So IMHO we need the patch I've posted (with the testcases slightly adjusted > not to do those extra copies afterwards for now), > and try to deal with the lvalue-to-rvalue conversion of indeterminate later, > and once done, perhaps in a copy of those testcases readd those extra copies > afterwards and verify it is rejected. Makes sense, except: > + gcc_assert (end == 1 || end == 2); This seems to assert that the value representation of a bit-field can't span more than two bytes, which is wrong for, say, struct A { int : 1; unsigned __int128 c : 128; // value representation spans 17 bytes }; Jason