public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <richard.guenther@gmail.com>
To: Jeff Law <law@redhat.com>
Cc: Martin Sebor <msebor@gmail.com>, gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] warn on returning alloca and VLA (PR 71924, 90549)
Date: Tue, 04 Jun 2019 11:45:00 -0000	[thread overview]
Message-ID: <CAFiYyc1-xOt9CjXiA1W9a_Q7UVzae9=tWQGb9jU9WjNt4VBHTA@mail.gmail.com> (raw)
In-Reply-To: <CAFiYyc27jvesRA9t1QJZ=j-+0Ks5=wU9BM2Qpx-vYz1JRrjD2A@mail.gmail.com>

On Mon, Jun 3, 2019 at 1:27 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Mon, Jun 3, 2019 at 11:37 AM Richard Biener
> <richard.guenther@gmail.com> wrote:
> >
> > On Fri, May 31, 2019 at 5:35 PM Jeff Law <law@redhat.com> wrote:>
> > > On 5/30/19 4:56 PM, Martin Sebor wrote:
> > > > On 5/30/19 10:15 AM, Jeff Law wrote:
> > > >> On 5/30/19 9:34 AM, Martin Sebor wrote:
> > > >>
> > > >>>>> If the alias oracle can be used to give the same results without
> > > >>>>> excessive false positives then I think it would be fine to make
> > > >>>>> use of it.  Is that something you consider a prerequisite for
> > > >>>>> this change or should I look into it as a followup?
> > > >>>> I think we should explore it a bit before making a final decision.  It
> > > >>>> may guide us for other work in this space (like detecting escaping
> > > >>>> locals).   I think a dirty prototype to see if it's even in the right
> > > >>>> ballpark would make sense.
> > > >>>
> > > >>> Okay, let me look into it.
> > > >> Sounds good.  Again, go with a quick prototype to see if it's likely
> > > >> feasible.  The tests you've added should dramatically help evaluating if
> > > >> the oracle is up to the task.
> > > >
> > > > So to expand on what I said on the phone when we spoke: the problem
> > > > I quickly ran into with the prototype is that I wasn't able to find
> > > > a way to identify pointers to alloca/VLA storage.
> > > Your analysis matches my very quick read of the aliasing code.  It may
> > > be the case that the Steensgaard patent got in the way here.
> > >
> > > >
> > > > In the the points-to solution for the pointer being returned they
> > > > both have the vars_contains_escaped_heap flag set.  That seems like
> > > > an omission that shouldn't be hard to fix, but on its own, I don't
> > > > think it would be sufficient.
> > > RIght.  In theory the result of an alloca call shouldn't alias anything
> > > in the heap -- but there were some implementations of alloca that were
> > > built on top of malloc (ugh).  That flag may be catering to that case.
> > >
> > > But in the case of a __builtin_alloca that shouldn't apply.  Hmm.  That
> > > ultimately might be a bug.
> > >
> > > >
> > > > In the IL a VLA is represented as a pointer to an array, but when
> > > > returning a pointer into a VLA (at some offset so it's an SSA_NAME),
> > > > the pointer's point-to solution doesn't include the VLA pointer or
> > > > (AFAICS) make it possible to tell even that it is a VLA.  For example
> > > > here:
> > > >
> > > >   f (int n)
> > > >   {
> > > >     int * p;
> > > >     int[0:D.1912] * a.1;
> > > >     sizetype _1;
> > > >     void * saved_stack.3_3;
> > > >     sizetype _6;
> > > >
> > > >     <bb 2> [local count: 1073741824]:
> > > >     saved_stack.3_3 = __builtin_stack_save ();
> > > >     _1 = (sizetype) n_2(D);
> > > >     _6 = _1 * 4;
> > > >     a.1_8 = __builtin_alloca_with_align (_6, 32);
> > > >     p_9 = a.1_8 + _6;
> > > >     __builtin_stack_restore (saved_stack.3_3);
> > > >     return p_9;
> > > >   }
> > > >
> > > > p_9's solution's is:
> > > >
> > > >   p_9, points-to vars: { D.1925 } (escaped, escaped heap)
> > > >
> > > > I couldn't find out how to determine that D.1925 is a VLA (or even
> > > > what it is).  It's not among the function's local variables that
> > > > FOR_EACH_LOCAL_DECL iterates over.
> > > It's possible that decl was created internally as part of the alias
> > > oracle's analysis.
> >
> > Yes.  Note that only the UID was reserved the fake decl doesn't
> > live on.
> >
> > Note that for the testcase above the "local" alloca storage escapes
> > which means you run into a catch-22 here given points-to computes
> > a conservative correct solution and  you want to detect escaping
> > locals.  Usually detecting a pointer to local storage can be done
> > by using ptr_deref_may_alias_global_p but of course in this
> > case the storage was marked global by PTA itself (and our PTA
> > is not flow-sensitive and it doesn't distinguish an escape through
> > a return stmt from an escape through a call which is relevant
> > even for local storage).
> >
> > Feature-wise the PTA solver is missing sth like a "filter"
> > we could put in front of return stmts that doesn't let
> > addresses of locals leak.  The simplest way of implementing
> > this might be to not include 'returns' in the constraints at all
> > (in non-IPA mode) and handle them by post-processing the
> > solver result.  That gets us some additional flow-sensitivity
> > and a way to filter locals.  Let me see if I can cook up this.
> >
> > That may ultimatively also help the warning code where you
> > then can use ptr_deref_may_alias_global_p.
> >
> > Sth like the attached - completely untested (the
> > is_global_var check is likely too simplistic...).  It does
> > the job on alloca for me.
> >
> > p_5, points-to NULL, points-to vars: { D.1913 }
> > _6, points-to NULL, points-to vars: { D.1913 }
> >
> > foo (int n)
> > {
> >   void * p;
> >   long unsigned int _1;
> >   void * _6;
> >
> >   <bb 2> :
> >   _1 = (long unsigned int) n_2(D);
> >   p_5 = __builtin_alloca (_1);
> >   _6 = p_5;
> >   return p_5;
>
> I am testing the following fixed and more complete patch (still
> the actual return values we process optimally is restricted).
> I'm not sure whether it will really help real-world testcases
> since the lack of flow-sensitivity in general and the lack of
> distinguishing between escapes via calls vs. escapes via
> return pessimizes things (escapes to global memory complicates
> things there).  Also in IPA mode we cannot post-process
> returns like in the hack^Wpatch, a "filter" op would need to be
> added, complicating the solver.
>
> Bootstrap / regtest running on x86_64-unknown-linux-gnu.

Needs another iteration since it miscompiles gengtype and

struct S { int *mem; };

struct S * __attribute__((noinline,noipa))
foo ()
{
  struct S *s = __builtin_malloc (sizeof (struct S));
  s->mem = __builtin_malloc (sizeof (int));
  s->mem[0] = 1;
  return s;
}

int
main()
{
  struct S *s = foo();
  if (s->mem[0] != 1)
    __builtin_abort ();
  return 0;
}

Richard.

> Richard.
>
> 2019-06-03  Richard Biener  <rguenther@suse.de>
>
>         * tree-ssa-structalias.c: Include tree-cfg.h.
>         (make_heapvar): Do not make heap vars artificial.
>         (find_func_aliases_for_builtin_call): Handle stack allocation
>         functions.
>         (find_func_aliases): Delay processing of simple enough returns
>         in non-IPA mode.
>         (set_uids_in_ptset): Adjust.
>         (find_what_var_points_to): Likewise.
>         (compute_points_to_sets): Post-process return statements,
>         amending the escaped solution.
>
>         * gcc.dg/tree-ssa/alias-37.c: New testcase.
>
> >
> > Richard.
> >
> > > See make_heapvar in tree-ssa-structalias.c
> > > >
> > > > By replacing the VLA in the test case with a call to malloc I get
> > > > this for the returned p_7:
> > > >
> > > >   p_7, points-to NULL, points-to vars: { D.1916 } (escaped, escaped heap)
> > > >
> > > > Again, I see no way to quickly tell that this pointer points into
> > > > the block returned from malloc.
> > > If there's a way to make that determination it'd have to be on the
> > > variable since the pt_solution flag bits don't carry a storage class
> > > directly.
> > >
> > > You might try to get a handle on those decls and dump them to see if
> > > there's anything easily identifiable.  But it may be easier to dig into
> > > the code that creates them.  A real quick scan of the aliasing code also
> > > shows the "variable_info" structure.  It's private to the aliasing code,
> > > but might guide you at things to look at.
> > >
> > > Regardless, I don't see an immediate path forward  using the oracle to
> > > identify objects in the stack for your patch.  WHich is unfortunate.
> > >
> > >
> > >
> > >
> > > Jeff

  reply	other threads:[~2019-06-04 11:45 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-22 21:34 Martin Sebor
2019-05-29 18:08 ` Jeff Law
2019-05-30 14:58   ` Martin Sebor
2019-05-30 15:21     ` Jeff Law
2019-05-30 15:48       ` Martin Sebor
2019-05-30 16:20         ` Jeff Law
2019-05-30 17:27           ` Jason Merrill
2019-05-31  0:26             ` Jeff Law
2019-05-30 23:35           ` Martin Sebor
2019-05-31 15:50             ` Jeff Law
2019-06-03  9:37               ` Richard Biener
2019-06-03 11:28                 ` Richard Biener
2019-06-04 11:45                   ` Richard Biener [this message]
2019-06-03 17:24                 ` Jeff Law
2019-05-31 21:19 ` Jeff Law
2019-06-03 23:24   ` Martin Sebor
2019-06-04 19:40     ` Martin Sebor
     [not found]       ` <224f8161-e370-bcbc-3ee6-5cff5835e016@redhat.com>
2019-06-19  3:19         ` Martin Sebor
2019-06-26 14:59           ` [PING] " Martin Sebor
2019-06-27  0:12           ` Jeff Law
2019-06-30 21:50             ` Martin Sebor
2019-07-02 20:59               ` Jeff Law
2019-07-11  6:45                 ` Martin Sebor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFiYyc1-xOt9CjXiA1W9a_Q7UVzae9=tWQGb9jU9WjNt4VBHTA@mail.gmail.com' \
    --to=richard.guenther@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=law@redhat.com \
    --cc=msebor@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).