public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <rguenther@suse.de>
To: Martin Sebor <msebor@gmail.com>
Cc: Jeff Law <law@redhat.com>, Gcc Patch List <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
Date: Tue, 14 Nov 2017 09:13:00 -0000	[thread overview]
Message-ID: <alpine.LSU.2.20.1711140943290.12252@zhemvz.fhfr.qr> (raw)
In-Reply-To: <90f1f522-35b0-4989-bf77-b79948054281@gmail.com>

On Mon, 13 Nov 2017, Martin Sebor wrote:

> On 11/10/2017 01:00 AM, Richard Biener wrote:
> > On Thu, 9 Nov 2017, Jeff Law wrote:
> > 
> > > On 11/02/2017 05:48 AM, Richard Biener wrote:
> > > 
> > > > 
> > > > There were elaborate transforms of ptr + CST to ptr->a.b.c[3] in the
> > > > past.  We have ripped out _most_ of them because of bad interaction
> > > > with dependence analysis and _b_o_s warnings.
> > > > 
> > > > But for example PRE might still end up propagating
> > > > 
> > > >  _1 = &ptr->a.b.c;
> > > >  _2 = (*_1)[i_3];
> > > > 
> > > > as
> > > > 
> > > >  _2 = ptr->a.b.c[i_3];
> > > > 
> > > > But it's not so much GCC building up GIMPLE expressions that would
> > > > cause false positives but "valid" C code and "invalid" C code
> > > > represented exactly the same in GCC.  Let's say
> > > I think this is a key point.
> > 
> > It's the usual issue with an optimizing compiler vs. a static analyzer.
> > We try to get rid of the little semantic details of the input languages
> > that in the end do not matter for code-generation but that makes
> > using those semantic details hard (sometimes the little details
> > are useful, like signed overflow being undefined).
> > 
> > For GIMPLE it's also often the case that we didn't really thoroughly
> > specify the semantics of the IL - like is an aggregate copy a
> > block copy (that's how we expand it to RTL) or a memberwise copy?
> > SRA treats it like the latter in some cases but memcpy folding
> > turns memcpy into aggregate assignments ... (now think about padding).
> > 
> > It's not that GCC doesn't have its set of existing issues with
> > respect to interpreting GIMPLE semantic as it seems fit in one way
> > here and in another way there.  I'm just always nervous when adding
> > new "interpretations" where I know the non-existing formal definition
> > of GIMPLE leaves things unspecified.
> > 
> > For example we _do_ use array bounds and array accesses (but _not_
> > and for now _nowhere_ if they appear in address computations!)
> > to derive niter information.  At the same time, because of this
> > exploitation, we try very very hard to never (ok, PRE above as a
> > couter-example) create an actual array access when dereferencing
> > a pointer that is constructed by taking the address of an array-ref.
> > That's why Martin added the warning to forwprop because that pass,
> > when forwarding such addresses, gets rid of the array-ref.
> > 
> > > > > Or, if that's not it, what exactly is your concern with this
> > > > > enhancement?  If it's that it's implemented in forwprop, what
> > > > > would be a better place, e.g., earlier in the optimization
> > > > > phase?  If it's something something else, I'd appreciate it
> > > > > if you could explain what.
> > > > 
> > > > For one implementing this in forwprop looks like a move in the
> > > > wrong direction.  I'd like to have separate warning passes or
> > > > at most amend warnings from optimization passes, not add new ones.
> > > I tend to agree.  That's one of the reasons why I pushed Aldy away from
> > > doing this kind of stuff within VRP.
> > > 
> > > What I envision is a pass which does a dominator walk through the
> > > blocks.  It gathers context sensitive range information as it does the
> > > walk.
> > > 
> > > As we encounter array references, we try to check them against the
> > > current range information.  We could also try to warn about certain
> > > pointer computations, though we have to be more careful with those.
> > > 
> > > Though I certainly still worry that the false positive cases which led
> > > Aldy, Andrew and myself to look at path sensitive ranges arent' resolved
> > > and will limit the utility of doing more array range checking.
> > 
> > I fear while this might be a little bit cleaner you'd still have to
> > do this very very early in the optimization pipeline (see all the
> > hard time we had with __builtin_object_size) and thus you won't catch
> > very many cases unless you start doing an IPA pass and handle propagating
> > through memory.  Which is when you arrived at a full-blown static
> > analyzer.
> 
> I have a different concern with the general idea of moving these
> kinds of warnings into passes of their own.  It would unavoidably
> result in duplicating some code from the optimization passes (at
> a minimum, the GIMPLE traversal, but likely more than that).  It
> would also require pretty tight coupling between the warning pass
> and the optimization passes as the latter can defeat the work of
> the former (as happens in this case).  But most significantly,
> the separation isn't feasible in cases where the optimization
> pass computes and maintains non-trivial internal state (like,
> say, tree-object-size or tree-ssa-strlen).

Well, as I said elsewhere the goal is to modularize the "optimizations"
enough so you can run their analysis phase together from a
"warning pass" and get access to their internal state.  Like what
Jeff is doing with VRP.  I'm currently working on some region-based
value-numbering where the iteration scheme is supposed to allow
pluggin in, say, a value-range lattice.  It's currently not
allowing path separation (simulating jump-threading) at the moment
but you have to start somewhere ...

> I think of these kinds of "basic" warnings as error checking and
> handling.  It makes no more sense to me to have a separate pass
> check for input errors than it would to have one to check the
> validity of other kinds of input within GCC (such as trees of
> the wrong kind being passed from one function to another due to
> a GCC programmer's error).  They go hand in hand.
> 
> To be sure, there are more sophisticated kinds of checkers that
> depend on an in-depth analysis of the program that's outside of
> what most existing optimization passes do.  The sprintf warning
> comes to mind as an example.  But even those can be (and the
> sprintf pass has been) used to do both: detect and diagnose
> errors _and_ optimize code.  And even these passes (or especially
> they) do or would benefit from data computed by other optimization
> passes to deliver better results (whether it be avoiding false
> positives or improving the code they generate).

Sure - no disagreement here.  But then those warnings have to
operate within the limitations of the GIMPLE IL which was designed
for an optimizing compiler with a variety of input source languages.
That is, you simply can't implement every diagnostic you'd like
to have when coming from a source level perspective.

It seems to be you're somewhat pushing the boundary here ;)

Richard.

> That said, it would make perfect sense to me to have the warning
> code live in separate files from the optimization code, and keep
> the separation of concerns and responsibilities crisp and well
> defined by their APIs.

  reply	other threads:[~2017-11-14  8:48 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-29 16:15 Martin Sebor
2017-10-30 11:55 ` Richard Biener
2017-10-30 15:21   ` Martin Sebor
2017-10-30 19:59     ` Richard Biener
2017-10-30 20:40       ` Martin Sebor
2017-10-30 21:23         ` Richard Biener
2017-10-30 21:49           ` Martin Sebor
2017-11-02 11:48             ` Richard Biener
2017-11-10  1:12               ` Jeff Law
2017-11-10  8:25                 ` Richard Biener
2017-11-14  0:04                   ` Jeff Law
2017-11-14  9:11                     ` Richard Biener
2017-11-15  1:52                       ` Jeff Law
2017-11-14  5:22                   ` Martin Sebor
2017-11-14  9:13                     ` Richard Biener [this message]
2017-11-15  1:54                     ` Jeff Law
2017-10-30 22:16     ` Jeff Law
2017-10-30 23:30       ` Martin Sebor
2017-10-31  4:32         ` Jeff Law
2017-11-01 22:21           ` Martin Sebor
2017-11-02 11:27           ` Richard Biener
2017-10-30 22:16 ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LSU.2.20.1711140943290.12252@zhemvz.fhfr.qr \
    --to=rguenther@suse.de \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=law@redhat.com \
    --cc=msebor@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).