public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
From: "Uecker, Martin" <Martin.Uecker@med.uni-goettingen.de>
To: "Peter.Sewell@cl.cam.ac.uk" <Peter.Sewell@cl.cam.ac.uk>,
	"richard.guenther@gmail.com" <richard.guenther@gmail.com>
Cc: "gcc@gcc.gnu.org" <gcc@gcc.gnu.org>,
	"law@redhat.com" <law@redhat.com>,
	"cl-c-memory-object-model@lists.cam.ac.uk"
	<cl-c-memory-object-model@lists.cam.ac.uk>
Subject: Re: C provenance semantics proposal
Date: Wed, 17 Apr 2019 11:53:00 -0000	[thread overview]
Message-ID: <1555502021.4884.1.camel@med.uni-goettingen.de> (raw)
In-Reply-To: <CAFiYyc3Ri_U5Sqsv1gm6JhsOv=DYLB6LxtSLy7smP9sr-g+LWA@mail.gmail.com>


Hi Richard,

Am Mittwoch, den 17.04.2019, 11:41 +0200 schrieb Richard Biener:
> On Wed, Apr 17, 2019 at 11:15 AM Peter Sewell <Peter.Sewell@cl.cam.ac.uk> wrote:
> > 
> > On 17/04/2019, Richard Biener <richard.guenther@gmail.com> wrote:
> > > On Fri, Apr 12, 2019 at 5:31 PM Peter Sewell <Peter.Sewell@cl.cam.ac.uk>
> > > wrote:

...
> > > So this is not what GCC implements which tracks provenance through
> > > non-pointer types to a limited extent when only copying is taking place.
> > > 
> > > Your proposal makes
> > > 
> > >  int a, b;
> > >  int *p = &a;
> > >  int *q = &b;
> > >  uintptr_t pi = (uintptr_t)p; //expose
> > >  uintptr_t qi = (uintptr_t)q; //expose
> > >  pi += 4;
> > >  if (pi == qi)
> > >    *(int *)pi = 1;
> > > 
> > > well-defined since (int *)pi now has the provenance of &b.
> > 
> > Yes.  (Just to be clear: it's not that we think the above example is
> > desirable in itself, but it's well-defined as a consequence of what
> > we do to make other common idioms, eg pointer bit manipulation,
> > well-defined.)
> > 
> > > Note GCC, when tracking provenance of non-pointer type
> > > adds like in
> > > 
> > >   int *p = &a;
> > >   uintptr_t pi = (uintptr_t)p;
> > >   pi += 4;
> > > 
> > > considers pi to have provenance "anything" (not sure if you
> > > have something like that) since we add 4 which has provenance
> > > "anything" to pi which has provenance &a.
> > 
> > We don't at present have a provenance "anything", but if the gcc
> > "anything" means that it's assumed that it might alias with anything,
> > then it looks like gcc's implementing a sound approximation to
> > the proposal here?
> 
> GCC makes the code well-defined whereas the proposal would make
> dereferencing a pointer based on pi invoke undefined behavior?

No, if there is an exposed object where pi points to, it is
defined behaviour. 

>  Since
> your proposal is based on an abstract machine there isn't anything
> like a pointer with multiple provenances (which "anything" is), just
> pointers with no provenance (pointing outside of any object), right?

This is correct. What the proposal does though is put a limit
on where pointers obtained from integers are allowed to point
to: They cannot point to non-exposed objects. I assume GCC
"anything" provenances also cannot point to all possible
objects.

> For points-to analysis we of course have to track all possible
> provenances of a pointer (and if we know it doesn't point inside
> any object we make it point to nothing).

Yes, a compiler should track what it knows (it could also track
if it knows that some pointers point to the same object, etc.)
while the abstract machine knows everything there is to know.

> Btw, GCC changed its behavior here to support optimizing matlab
> generated C code which passes pointers to arrays across functions
> by marshalling them in two float typed halves (yikes!).  GCC is able
> to properly track provenance across the decomposition / recomposition
> when doing points-to analysis ;)

Impressive ;-)  I would have thought that such encoding
happens at ABI boundaries, where you cannot track anyway.
But this seems to occur inside compiled code?

While we do not attach a provenance to integers
in our proposal, it does not necessarily imply that a compiler
is not allowed to track such information. It then depends on
how it uses it.

For example,

int z;
int x;
uintptr_t pi = (uintptr_t)&x;

// encode in two floats ;-)

// pass floats around

// decode

int* p = (int*)pi;

If the compiler can prove that the address is still
the same, it can also reattach the original provenance
under some conditions.

But there is a caveat: It can only do this is it cannot
also be  a one-after pointer for z (or some other object).
If the address of 'z' is not exposed, it may be able to
assume this.

> Btw, one thing GCC struggles is when it applies rules that clearly
> apply to pointer dereferences to pointer equality compares where
> the standard has that special casing of comparing two pointers
> where one points one after an object requiring the comparison
> to evaluate to true when the objects are adjacent.  GCC
> currently statically optimizes if (&x + 1 == &y) to false for
> this reason (but not the corresponding integer comparison).

Yes, according to the current rules (and this doesn't change in
the proposal) two points comparing equal does not imply that
they are interchangable. Making the comparison unspecified 
(as C++) would not help. We could make it undefined, which
would make all optimizations based on the assumption that
the pointer are interchangable valid. But I fear that this
would introduce a corner case that could lead to subtle
and hard-to-detect bugs.

Martin

> Richard.
> 
> > 
> > best,
> > Peter
> > 
> > 
> > > > The user-disambiguation refinement adds some complexity but supports
> > > > roundtrip casts, from pointer to integer and back, of pointers that
> > > > are one-past a storage instance.
> 
> 

  reply	other threads:[~2019-04-17 11:53 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-02  8:11 Peter Sewell
2019-04-12 14:51 ` Jeff Law
2019-04-12 15:31   ` Peter Sewell
2019-04-17  9:06     ` Richard Biener
2019-04-17  9:15       ` Peter Sewell
2019-04-17  9:41         ` Richard Biener
2019-04-17 11:53           ` Uecker, Martin [this message]
2019-04-17 12:41             ` Richard Biener
2019-04-17 12:56               ` Uecker, Martin
2019-04-17 13:35                 ` Richard Biener
2019-04-17 14:12                   ` Uecker, Martin
2019-04-17 17:31                     ` Peter Sewell
2019-04-18  9:32                     ` Richard Biener
2019-04-18  9:56                       ` Richard Biener
2019-04-18 10:48                         ` Peter Sewell
2019-04-18 11:57                         ` Uecker, Martin
2019-04-18 12:31                           ` Richard Biener
2019-04-18 13:25                             ` Uecker, Martin
2019-04-18 10:45                       ` Peter Sewell
2019-04-18 12:20                         ` Uecker, Martin
2019-04-18 12:42                           ` Richard Biener
2019-04-18 12:47                             ` Jakub Jelinek
2019-04-18 12:51                               ` Jakub Jelinek
2019-04-18 13:29                                 ` Jeff Law
2019-04-24 10:12                                   ` Richard Biener
2019-04-18 13:49                             ` Uecker, Martin
2019-04-19  8:19                             ` Jens Gustedt
2019-04-19  8:49                               ` Jakub Jelinek
2019-04-19  9:09                                 ` Jens Gustedt
2019-04-19  9:34                                   ` Jakub Jelinek
2019-04-21  8:15                                     ` Jens Gustedt
2019-04-24 10:24                                   ` Richard Biener
2019-04-24 18:43                                     ` Jeff Law
2019-04-24 19:21                                       ` Jens Gustedt
2019-04-19  9:11                                 ` Peter Sewell
2019-04-19  9:15                                   ` Jens Gustedt
2019-04-19  9:35                                     ` Peter Sewell
2019-04-19 10:35                                       ` Uecker, Martin
2019-04-19 10:01                               ` Uecker, Martin
2019-04-18 13:42                           ` Jeff Law
2019-04-18 13:54                             ` Uecker, Martin
2019-04-18 14:49                               ` Peter Sewell
2019-04-18 15:09                                 ` Uecker, Martin
2019-04-24 10:19                             ` Richard Biener
2019-04-24 18:41                               ` Jeff Law
2019-04-24 19:30                                 ` Philipp Klaus Krause
2019-04-24 19:55                                   ` Uecker, Martin
2019-04-24 19:33                                 ` Jakub Jelinek
2019-04-24 21:19                                 ` Peter Sewell
2019-04-25 12:42                                   ` Richard Biener
2019-04-25 13:03                                     ` Peter Sewell
2019-04-25 13:13                                       ` Richard Biener
2019-04-25 13:20                                         ` Peter Sewell
2019-04-29 14:31                                       ` Joseph Myers
2019-04-25 12:39                                 ` Richard Biener
2019-05-09 11:26                                   ` Ralf Jung

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1555502021.4884.1.camel@med.uni-goettingen.de \
    --to=martin.uecker@med.uni-goettingen.de \
    --cc=Peter.Sewell@cl.cam.ac.uk \
    --cc=cl-c-memory-object-model@lists.cam.ac.uk \
    --cc=gcc@gcc.gnu.org \
    --cc=law@redhat.com \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).