public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
From: "Serge E. Hallyn" <serge@hallyn.com>
To: Martin Uecker <uecker@tugraz.at>
Cc: "Serge E. Hallyn" <serge@hallyn.com>,
	"Alex Colomar" <alx.manpages@gmail.com>, GCC <gcc@gcc.gnu.org>,
	"Iker Pedrosa" <ipedrosa@redhat.com>,
	"Florian Weimer" <fweimer@redhat.com>,
	"Paul Eggert" <eggert@cs.ucla.edu>,
	"Michael Kerrisk" <mtk.manpages@gmail.com>,
	"Jₑₙₛ Gustedt" <jens.gustedt@inria.fr>,
	"David Malcolm" <dmalcolm@redhat.com>,
	"Sam James" <sam@gentoo.org>,
	"Jonathan Wakely" <jwakely.gcc@gmail.com>
Subject: Re: Missed warning (-Wuse-after-free)
Date: Fri, 24 Feb 2023 10:01:45 -0600	[thread overview]
Message-ID: <20230224160145.GA374298@mail.hallyn.com> (raw)
In-Reply-To: <9d34a5da747601b0d9a3512cddfaf113726620ee.camel@tugraz.at>

On Fri, Feb 24, 2023 at 09:36:45AM +0100, Martin Uecker wrote:
> Am Donnerstag, dem 23.02.2023 um 19:21 -0600 schrieb Serge E. Hallyn:
> > On Fri, Feb 24, 2023 at 01:02:54AM +0100, Alex Colomar wrote:
> > > Hi Martin,
> > > 
> > > On 2/23/23 20:57, Martin Uecker wrote:
> > > > Am Donnerstag, dem 23.02.2023 um 20:23 +0100 schrieb Alex Colomar:
> > > > > Hi Martin,
> > > > > 
> > > > > On 2/17/23 14:48, Martin Uecker wrote:
> > > > > > > This new wording doesn't even allow one to use memcmp(3);
> > > > > > > just reading the pointer value, however you do it, is UB.
> > > > > > 
> > > > > > memcmp would not use the pointer value but work
> > > > > > on the representation bytes and is still allowed.
> > > > > 
> > > > > Hmm, interesting.  It's rather unspecified behavior. Still
> > > > > unpredictable: (memcmp(&p, &p, sizeof(p) == 0) might evaluate to true or
> > > > > false randomly; the compiler may compile out the call to memcmp(3),
> > > > > since it knows it won't produce any observable behavior.
> > > > > 
> > > > > <https://software.codidact.com/posts/287905>
> > > > 
> > > > No, I think several things get mixed up here.
> > > > 
> > > > The representation of a pointer that becomes invalid
> > > > does not change.
> > > > 
> > > > So (0 === memcmp(&p, &p, sizeof(p)) always
> > > > evaluates to true.
> > > > 
> > > > Also in general, an unspecified value is simply unspecified
> > > > but does not change anymore.
> > 
> > Right.  p is its own thing - n bytes on the stack containing some value.
> > Once it comes into scope, it doesn't change on its own.  And if I do
> > free(p) or o = realloc(p), then the value of p itself - the n bytes on
> > the stack - does not change.
> 
> Yes, but one comment about terminology:. The C standard
> differentiates between the representation, i.e. the bytes on
> the stack, and the value.  The representation is converted to
> a value during lvalue conversion.  For an invalid pointer
> the representation is indeterminate because it now does not
> point to a valid object anymore.  So it is not possible to
> convert the representation to a value during lvalue conversion.
> In other words, it does not make sense to speak of the value
> of the pointer anymore.

I'm sure there are, especially from an implementer's point of view,
great reasons for this.

However, as just a user, the "value" of 'void *p' should absolutely
not be tied to whatever is at that address.  I'm given a simple
linear memory space, under which sits an entirely different view
obfuscated by page tables, but that doesn't concern me.  if I say
void *p = -1, then if I print p, then I expect to see that value.

Since I'm complaining about standards I'm picking and choosing here,
but I'll still point at the printf(3) manpage :)  :

       p      The  void * pointer argument is printed in hexadecimal (as if by %#x
              or %#lx).

> > I realize C11 appears to have changed that.  I fear that in doing so it
> > actually risks increasing the confusion about pointers.  IMO it's much
> > easier to reason about
> > 
> > 	o = realloc(p, X);
> > 
> > (and more baroque constructions) when keeping in mind that o, p, and the
> > object pointed to by either one are all different things.
> > 
> 
> What did change in C11? As far as I know, the pointer model
> did not change in C11.

I haven't looked in more detail, and don't really plan to, but my
understanding is that the text of:

  The lifetime of an object is the portion of program execution during which storage is
  guaranteed to be reserved for it. An object exists, has a constant address, and retains
  its last-stored value throughout its lifetime. If an object is referred to outside of its
  lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when
  the object it points to (or just past) reaches the end of its lifetime.

(especially the last sentence) was new.

Maybe the words "value of a pointer" don't mean what I think they
mean.  But that's the phrase to which I object.  The n bytes on
the stack, p, are not changed just because something happened with
the accounting for the memory at the address represented by that
value.  If they do, then that's not 'C' any more.

> > > > Reading an uninitialized value of automatic storage whose
> > > > address was not taken is undefined behavior, so everything
> > > > is possible afterwards.
> > > > 
> > > > An uninitialized variable whose address was taken has a
> > > > representation which can represent an unspecified value
> > > > or a no-value (trap) representation. Reading the
> > > > representation itself is always ok and gives consistent
> > > > results. Reading the variable can be undefined behavior
> > > > iff it is a trap representation, otherwise you get
> > > > the unspecified value which is stored there.
> > > > 
> > > > At least this is my reading of the C standard. Compilers
> > > > are not full conformant.
> > > 
> > > Does all this imply that the following is well defined behavior (and shall
> > > print what one would expect)?
> > > 
> > >   free(p);
> > > 
> > >   (void) &p;  // take the address
> > >   // or maybe we should (void) memcmp(&p, &p, sizeof(p)); ?
> > > 
> > >   printf("%p\n", p);  // we took previously its address,
> > >                       // so now it has to hold consistently
> > >                       // the previous value
> > > 
> > > 
> 
> No, the printf is not well defined, because the lvalue conversion
> of the pointer with indeterminate representation may lead to

Sorry, my eyes glaze over at 'lvalue conversion of the pointer' - is
that referring to what's at the target address?  Because to print the
hex value of p, it doesn't matter what's at *p - even if that memory
is no longer mapped, it does not affect the %x of p.

BTW I appreciate all the insight.  I'm arguing my case, but I'm quite
certain I'll walk away finally understanding why I'm wrong.

> undefined behavior.
> 
> 
> Martin
> 
> 
> > > This feels weird.  And a bit of a Schroedinger's pointer.  I'm not entirely
> > > convinced, but might be.
> > 
> > Again, p is just an n byte variable which happens to have (one hopes)
> > pointed at a previously malloc'd address.
> > 
> > And I'd argue that pre-C11, this was not confusing, and would not have
> > felt weird to you.
> > 
> > But I am most grateful to you for having brought this to my attention.
> > I may not agree with it and not like it, but it's right there in the
> > spec, so time for me to adjust :)
> > 
> 
> 
> 
> 

  reply	other threads:[~2023-02-24 16:01 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-16 14:35 Alejandro Colomar
2023-02-16 15:15 ` David Malcolm
2023-02-17  1:04   ` Alejandro Colomar
2023-02-17  1:05     ` Alejandro Colomar
2023-02-17  1:56       ` Sam James
2023-02-17  8:12     ` Martin Uecker
2023-02-17 11:35       ` Alejandro Colomar
2023-02-17 13:34         ` Andreas Schwab
2023-02-17 13:48         ` Martin Uecker
2023-02-23 19:23           ` Alex Colomar
2023-02-23 19:57             ` Martin Uecker
2023-02-24  0:02               ` Alex Colomar
2023-02-24  1:21                 ` Serge E. Hallyn
2023-02-24  1:42                   ` Alex Colomar
2023-02-24  3:01                     ` Peter Lafreniere
2023-02-24  8:52                       ` Martin Uecker
2023-02-24  8:43                     ` Martin Uecker
2023-02-24 16:10                     ` Serge E. Hallyn
2023-02-24  8:36                   ` Martin Uecker
2023-02-24 16:01                     ` Serge E. Hallyn [this message]
2023-02-24 16:37                       ` Martin Uecker
2023-02-17  3:48   ` Siddhesh Poyarekar
2023-02-17 11:22     ` Alejandro Colomar
2023-02-17 13:38       ` Siddhesh Poyarekar
2023-02-17 14:01         ` Mark Wielaard
2023-02-17 14:06           ` Siddhesh Poyarekar
2023-02-17 21:20         ` [PATCH] Make -Wuse-after-free=3 the default one in -Wall Alejandro Colomar
2023-02-17 21:39           ` Siddhesh Poyarekar
2023-02-17 21:41             ` Siddhesh Poyarekar
2023-02-17 22:58             ` Alejandro Colomar
2023-02-17 23:03               ` Siddhesh Poyarekar
2023-02-17 11:24     ` Missed warning (-Wuse-after-free) Jonathan Wakely
2023-02-17 11:43       ` Alejandro Colomar
2023-02-17 12:04         ` Jonathan Wakely
2023-02-17 12:53       ` Siddhesh Poyarekar
2023-02-17 14:10         ` Jonathan Wakely
2023-02-17 13:44     ` David Malcolm
2023-02-17 14:01       ` Siddhesh Poyarekar
2023-02-17  8:49 ` Yann Droneaud

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230224160145.GA374298@mail.hallyn.com \
    --to=serge@hallyn.com \
    --cc=alx.manpages@gmail.com \
    --cc=dmalcolm@redhat.com \
    --cc=eggert@cs.ucla.edu \
    --cc=fweimer@redhat.com \
    --cc=gcc@gcc.gnu.org \
    --cc=ipedrosa@redhat.com \
    --cc=jens.gustedt@inria.fr \
    --cc=jwakely.gcc@gmail.com \
    --cc=mtk.manpages@gmail.com \
    --cc=sam@gentoo.org \
    --cc=uecker@tugraz.at \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).