Message-ID: <9d34a5da747601b0d9a3512cddfaf113726620ee.camel@tugraz.at>
Subject: Re: Missed warning (-Wuse-after-free)
From: Martin Uecker
To: "Serge E. Hallyn", Alex Colomar
Cc: GCC, Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk,
 Jₑₙₛ Gustedt, David Malcolm, Sam James, Jonathan Wakely
Date: Fri, 24 Feb 2023 09:36:45 +0100
In-Reply-To: <20230224012114.GA360078@mail.hallyn.com>
References: <8ed6d28c-69dc-fed8-5ab5-99f685f06fac@gmail.com>
 <38e7e994a81d2a18666404dbaeb556f3508a6bd6.camel@redhat.com>
 <23d3a3ff-adad-ac2e-92a6-4e19f4093143@gmail.com>
 <2148ef80dee2a034ee531d662fc8709d26159ec5.camel@tugraz.at>
 <0049730a-e28c-0e0f-8d92-695395f1ec21@gmail.com>
 <6edeb3c197c327c1c6639d322c53ec6056039a33.camel@tugraz.at>
 <20230224012114.GA360078@mail.hallyn.com>

On Thursday, 23.02.2023 at 19:21 -0600, Serge E. Hallyn wrote:
> On Fri, Feb 24, 2023 at 01:02:54AM +0100, Alex Colomar wrote:
> > Hi Martin,
> >
> > On 2/23/23 20:57, Martin Uecker wrote:
> > > On Thursday, 23.02.2023 at 20:23 +0100, Alex Colomar wrote:
> > > > Hi Martin,
> > > >
> > > > On 2/17/23 14:48, Martin Uecker wrote:
> > > > > > This new wording doesn't even allow one to use memcmp(3);
> > > > > > just reading the pointer value, however you do it, is UB.
> > > > >
> > > > > memcmp would not use the pointer value but work
> > > > > on the representation bytes and is still allowed.
> > > >
> > > > Hmm, interesting. It's rather unspecified behavior.
> > > > Still unpredictable: (memcmp(&p, &p, sizeof(p)) == 0) might evaluate
> > > > to true or false randomly; the compiler may compile out the call to
> > > > memcmp(3), since it knows it won't produce any observable behavior.
> > > >
> > > >
> > >
> > > No, I think several things get mixed up here.
> > >
> > > The representation of a pointer that becomes invalid
> > > does not change.
> > >
> > > So (0 == memcmp(&p, &p, sizeof(p))) always
> > > evaluates to true.
> > >
> > > Also in general, an unspecified value is simply unspecified
> > > but does not change anymore.
>
> Right. p is its own thing - n bytes on the stack containing some value.
> Once it comes into scope, it doesn't change on its own. And if I do
> free(p) or o = realloc(p), then the value of p itself - the n bytes on
> the stack - does not change.

Yes, but one comment about terminology: the C standard
differentiates between the representation, i.e. the bytes on
the stack, and the value. The representation is converted to
a value during lvalue conversion. For an invalid pointer
the representation is indeterminate because it no longer
points to a valid object. So it is not possible to convert
the representation to a value during lvalue conversion.
In other words, it does not make sense to speak of the value
of the pointer anymore.

> I realize C11 appears to have changed that. I fear that in doing so it
> actually risks increasing the confusion about pointers. IMO it's much
> easier to reason about
>
>     o = realloc(p, X);
>
> (and more baroque constructions) when keeping in mind that o, p, and the
> object pointed to by either one are all different things.
>

What changed in C11? As far as I know, the pointer model
did not change in C11.

> > > Reading an uninitialized value of automatic storage whose
> > > address was not taken is undefined behavior, so everything
> > > is possible afterwards.
> > >
> > > An uninitialized variable whose address was taken has a
> > > representation which can represent an unspecified value
> > > or a no-value (trap) representation. Reading the
> > > representation itself is always ok and gives consistent
> > > results. Reading the variable can be undefined behavior
> > > iff it is a trap representation, otherwise you get
> > > the unspecified value which is stored there.
> > >
> > > At least this is my reading of the C standard. Compilers
> > > are not fully conformant.
> >
> > Does all this imply that the following is well defined behavior (and shall
> > print what one would expect)?
> >
> >   free(p);
> >
> >   (void) &p;  // take the address
> >   // or maybe we should (void) memcmp(&p, &p, sizeof(p)); ?
> >
> >   printf("%p\n", p);  // we took previously its address,
> >                       // so now it has to hold consistently
> >                       // the previous value
> >
> >

No, the printf is not well defined, because the lvalue conversion
of the pointer with indeterminate representation may lead to
undefined behavior.

Martin

> > This feels weird. And a bit of a Schroedinger's pointer. I'm not entirely
> > convinced, but might be.
>
> Again, p is just an n byte variable which happens to have (one hopes)
> pointed at a previously malloc'd address.
>
> And I'd argue that pre-C11, this was not confusing, and would not have
> felt weird to you.
>
> But I am most grateful to you for having brought this to my attention.
> I may not agree with it and not like it, but it's right there in the
> spec, so time for me to adjust :)
>