public inbox for gcc-patches@gcc.gnu.org
From: Richard Biener <richard.guenther@gmail.com>
To: "Uecker, Martin" <Martin.Uecker@med.uni-goettingen.de>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] doc: clarify the situation with pointer arithmetic
Date: Mon, 27 Jan 2020 15:14:00 -0000
Message-ID: <CAFiYyc0Vqfg6Q11oauaqyxF1ftOE4WyfKGkoTS4iXPWxHSwAJg@mail.gmail.com>
In-Reply-To: <1579823158.4442.3.camel@med.uni-goettingen.de>

On Fri, Jan 24, 2020 at 12:46 AM Uecker, Martin
<Martin.Uecker@med.uni-goettingen.de> wrote:
>
> On Thursday, 23.01.2020, at 14:18 +0100, Richard Biener wrote:
> > On Wed, Jan 22, 2020 at 12:40 PM Martin Sebor <msebor@gmail.com> wrote:
> > >
> > > On 1/22/20 8:32 AM, Richard Biener wrote:
> > > > On Tue, 21 Jan 2020, Alexander Monakov wrote:
> > > >
> > > > > On Tue, 21 Jan 2020, Richard Biener wrote:
> > > > >
> > > > > > Fourth.  That PNVI (I assume it's the whole pointer-provenance stuff)
> > > > > > wants to get the "best" of both which can never be done since a compiler
> > > > > > needs to have a way to be conservative - in this area it's conflicting
> > > > > > conservative treatment which is impossible.
> > > > >
> > > > > This paragraph is unclear, I don't immediately see what the conflicting goals
> > > > > are. The rest is clear enough given the previous discussions I saw.
> > > > >
> > > > > Did you mean the restriction that you cannot do arithmetic involving two
> > > > > integers based on pointers, get a value corresponding to one of them,
> > > > > cast it back and get a pointer suitable for accessing either of two
> > > > > originally pointed-to objects? I don't see that as a conflict because
> > > > > it places a restriction on users, not the compiler.
> > > >
> > > > As far as I remember the discussions, PNVI requires tracking
> > > > provenance for correctness: you may not miss or attach wrong provenance
> > > > to a pointer, and there's only "single" provenance, not "many"
> > > > (aka, may point to A and B).  I don't see how you can ever implement that.
>
> I have no idea how you came to that conclusion. PNVI is perfectly
> compatible with a naive compiler that does not track provenance at
> all as well as an abstract machine that actually carries run-time
> provenance around with each pointer and checks every operation.
> It was designed specifically to allow both cases and everything
> in between (especially compilers that track provenance at
> compile time while the programs do not track provenance at
> run time).
>
> You may be confused by the abstract formulation that indeed
> assigns a single provenance to each pointer. A compiler would
> track its *knowledge about provenance*, which would be a set
> of possible targets.

Well, the question is whether PNVI allows the compiler to put any
additional restriction on what the provenance of an integer is.  It
appears not, so any attempt to track provenance through integers
is doomed unless the cases are very simple.  I'm not sure that's desirable (*).
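
To make the concern concrete, here is a minimal sketch of the kind of
code in question.  It is only an illustration: the function name
read_via_integer and the XOR "key" are hypothetical and do not come from
the proposal.  The integer is modified and then restored between the two
casts, so the value converted back is mathematically the same integer
even though it no longer obviously derives from the original pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: the pointer is cast to an integer, the integer
   is scrambled and unscrambled, and the result is cast back. */
static int read_via_integer(int *p, uintptr_t key)
{
    uintptr_t ip = (uintptr_t)(void *)p; /* pointer -> integer */
    ip ^= key;                           /* scramble the value */
    ip ^= key;                           /* same integer value again */
    int *q = (int *)(void *)ip;          /* integer -> pointer */
    assert(q == p);                      /* the round trip yields an equal pointer */
    return *q;                           /* access through the rebuilt pointer */
}
```

A compiler that wants to attach provenance to "ip" has to see through
the two XORs; with anything less trivial in between, that quickly stops
being feasible.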

> > > The PNVI variant preferred by the object model group is referred
> > > to as "PNVI-ae-udi" which stands for "PNVI exposed-address user-
> > > disambiguation."  (The PNVI part stands for "Provenance Not Via
> > > Integers.")  This base PNVI model basically prohibits provenance
> > > tracking via integers, making it possible for programs to derive
> > > pointers to unrelated objects via casts between pointers and
> > > integers (and modifying the integer in between the casts).  This
> > > is considered a new restriction on implementations because
> > > the standard doesn't permit it (as you said upthread, all it
> > > specifies is that a pointer is equal to one obtained by casting
> > > the original to an intptr_t and back).
>
> It is not entirely clear what the standard means here (7.20.1.4).
>
> In my opinion, converting the same integer back should yield
> a valid pointer where "same" is defined in the usual sense
> (i.e. via mathematical identity and not via provenance).
>
> > > The -ae-udi variant limits this restriction on implementations
> > > to escaped pointers and provides a means for users/programs to
> > > disambiguate between pointers to adjacent objects (i.e., a past
> > > the end pointer and one to the beginning of the object stored
> > > there).  The latest proposal is in N2362, with an overview in
> > > N2378.  At the last WG14 meeting there was broad discomfort
> > > with adopting the proposal for C2X because of the absence of
> > > implementation experience and concerns raised by implementers.
> > > The guidance to the study group was to target a separate technical
> > > specification for the proposal and allow time for implementation
> > > experience.  If the feedback from implementers is positive
> > > (whatever that might mean) WG14 said it would consider adopting
> > > the model for a revision of C after C2X.
> > >
> > > Overall, the impact of the proposals as well as their goal is to
> > > constrain implementations to the (presumed) benefit of programs
> > > in terms of expressiveness.  There are numerous examples of code
> > > that's currently treated as undefined by one or more compilers
> > > (as a consequence of optimizations) that the model makes valid.
> > > I'm not aware of any optimization opportunities the proposal
> > > might open up in GCC.
> >
> > Well, PNVI limits optimization opportunities of GCC which currently
> > _does_ track provenance through integers and thus only allows
> > a very limited set(*) of "unrelated" pointers to appear here (documented
> > is that none are, implementation details differ from version to version).
> >
> > There are no optimization "opportunities" by making pointer <-> integer
> > conversions lose information.
>
> You are right: It is meant to constrain optimizations.
>
> The reasoning behind this is that currently all compilers behave
> inconsistently, and this is not terribly useful to anybody.
>
> At the same time, there does not appear to be any reasonable
> way for integers to have provenance.
> Any rules we came up with really got complicated and are
> also fundamentally at odds with the usual mathematical
> properties of integers one would naively expect.

So the original point where GCC started to track provenance through
non-pointers (PNVI should really be PNVNP since I guess tracking
provenance through floats isn't to be done either ;)) was a testcase
showing that Matlab-generated (IIRC) C code funneled pointers through
a pair of floats (obviously a single float isn't enough for 64-bit pointers ...)
and that prevented a good deal of optimization due to missed alias analysis.
I fixed that and now GCC happily tracks provenance through a pair of
floats ...
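
For illustration only, a sketch of what funneling a pointer through a
pair of floats can look like.  This is not the original testcase; the
helper names hide_pointer and recover_pointer are hypothetical, and the
sketch assumes a 32-bit float and pointers of at most 64 bits.  The bits
are moved with memcpy so that no floating-point operation ever touches
them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");
_Static_assert(sizeof(void *) <= sizeof(uint64_t), "sketch assumes <= 64-bit pointers");

/* Hypothetical helper: store the bits of p in two floats. */
static void hide_pointer(int *p, float out[2])
{
    uint64_t bits = (uint64_t)(uintptr_t)(void *)p;
    uint32_t lo = (uint32_t)bits;          /* low 32 bits */
    uint32_t hi = (uint32_t)(bits >> 32);  /* high 32 bits */
    memcpy(&out[0], &lo, sizeof lo);       /* copy the bits, not the value */
    memcpy(&out[1], &hi, sizeof hi);
}

/* Hypothetical helper: rebuild the pointer from the two floats. */
static int *recover_pointer(const float in[2])
{
    uint32_t lo, hi;
    memcpy(&lo, &in[0], sizeof lo);
    memcpy(&hi, &in[1], sizeof hi);
    uint64_t bits = ((uint64_t)hi << 32) | lo;
    return (int *)(void *)(uintptr_t)bits;
}
```

Any access through the result of recover_pointer needs alias information
that is only available if the compiler follows the pointer bits through
the two floats.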

(*) This also shows that the level of "obfuscation" needed to fool compilers
into losing provenance knowledge is hard to predict.

>
> > (*) for recent versions we allow pointers to globals to be "invented" but
> > place strict restrictions on automatic variables based on the idea that
> > you cannot have reliable absolute addressing of those (but you could
> > place objects at 0x12340 if you like via linker scripts)
>
> The idea with PNVI-ae would be that pointers could be "invented"
> only to "exposed" objects, i.e. address taken and the address
> was cast to int (or escaped compiler analysis). So pointers to
> other automatic variables cannot be invented.
>
> > Then there are of course bugs in GCC (just found PR93381) when
> > tracking provenance through integers (GCC also disregards the
> > possibility of someone actually moving pointers in very weird ways
> > which you could say is a bug).
>
> Yes, nice example. With the current standard, it is not even
> clear whether this is a bug or not.
>
>
> Martin
>
> > You also can't easily rule out aligning bitwise ANDs leaving the
> > current object if you consider overaligning, so you'd again need
> > precise tracking of pointer offsets in points-to analysis, which we don't have.
> >
> > Richard.
> >
> > >
> > > Martin
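
On the aligning bitwise ANDs quoted above, a small sketch of how the
masked address can fall outside the object it was derived from.  The
helper names align_down and masked_below_start are hypothetical; the
code only does integer arithmetic on addresses and never dereferences
the masked value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round an address down to a power-of-two boundary. */
static uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0); /* power of two only */
    return addr & ~(align - 1);
}

/* Returns nonzero if masking moved the address below the start of the
   object, i.e. the over-aligned address no longer lies within it. */
static int masked_below_start(const void *obj, uintptr_t align)
{
    uintptr_t start = (uintptr_t)obj;
    return align_down(start, align) < start;
}
```

For an object that is only 4-byte aligned, masking with a 64-byte
alignment will usually yield an address in front of the object, so
points-to analysis cannot assume the result still refers to the same
object without knowing the exact offsets.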
