public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <richard.guenther@gmail.com>
To: Martin Sebor <msebor@gmail.com>
Cc: Jeff Law <law@redhat.com>, GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PING #4] [PATCH] look harder for MEM_REF operand equality to avoid -Wstringop-truncation (PR 84561)
Date: Fri, 16 Nov 2018 08:46:00 -0000	[thread overview]
Message-ID: <CAFiYyc0W7QGk-PfrkmMAdTaunhdic5UqaG3hqc5xJueUeYOuAg@mail.gmail.com> (raw)
In-Reply-To: <ee66fcd1-c17d-dade-a05f-ea128f93a504@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 7998 bytes --]

On Fri, Nov 16, 2018 at 4:09 AM Martin Sebor <msebor@gmail.com> wrote:
>
> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01934.html
>
> Do I need to change something in this patch to make it acceptable
> or should I assume we will leave this bug in GCC unfixed?

So the issue is the next_stmt handling because for the _next_ stmt
we did not yet replace uses with lattice values.  This information
was missing all the time along (and absent from patch context).

I notice the next_stmt handling is incredibly fragile and it doesn't
even check the store you identify as thouching the same object
writes a '\0', nor does it verify the store writes to a position after
the strncpy accessed area (but eventually anywhere is OK...).

So I really wonder why there's the restriction on 1:1 equality of the
base.  That relies on proper CSE (as you saw and tried to work-around
in your patch) and more.

So what I'd do is the attached.  Apply more leeway (and less at the
same time) and restrict it to stores from zero but allow any aliasing
one.  One could build up more precision by, instead of using
ptr_may_alias_ref_p use refs_may_alias_p and build up a ao_ref
for the strncpy destination using ao_ref_from_ptr_and_size, but
I didn't bother to figure out what constraint on len the function
computed up to this point.

The patch fixes the testcase.

Richard.

> On 10/31/2018 10:35 AM, Martin Sebor wrote:
> > Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01934.html
> >
> > On 10/08/2018 03:40 PM, Martin Sebor wrote:
> >> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01934.html
> >>
> >> On 10/01/2018 03:30 PM, Martin Sebor wrote:
> >>> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01934.html
> >>>
> >>> We have discussed a number of different approaches to moving
> >>> the warning somewhere else but none is feasible in the limited
> >>> amount of time remaining in stage 1 of GCC 9.  I'd like to
> >>> avoid the false positive in GCC 9 by using the originally
> >>> submitted, simple approach and look into the suggested design
> >>> changes for GCC 10.
> >>>
> >>> On 09/21/2018 08:36 AM, Martin Sebor wrote:
> >>>> On 09/20/2018 03:06 AM, Richard Biener wrote:
> >>>>> On Wed, Sep 19, 2018 at 4:19 PM Martin Sebor <msebor@gmail.com> wrote:
> >>>>>>
> >>>>>> On 09/18/2018 10:23 PM, Jeff Law wrote:
> >>>>>>> On 9/18/18 1:46 PM, Martin Sebor wrote:
> >>>>>>>> On 09/18/2018 12:58 PM, Jeff Law wrote:
> >>>>>>>>> On 9/18/18 11:12 AM, Martin Sebor wrote:
> >>>>>>>>>
> >>>>>>>>>>> My bad.  Sigh. CCP doesn't track copies, just constants, so
> >>>>>>>>>>> there's not
> >>>>>>>>>>> going to be any data structure you can exploit.  And I don't
> >>>>>>>>>>> think
> >>>>>>>>>>> there's a value number you can use to determine the two objects
> >>>>>>>>>>> are the
> >>>>>>>>>>> same.
> >>>>>>>>>>>
> >>>>>>>>>>> Hmm, let's back up a bit, what is does the relevant part of the
> >>>>>>>>>>> IL look
> >>>>>>>>>>> like before CCP?  Is the real problem here that we have
> >>>>>>>>>>> unpropagated
> >>>>>>>>>>> copies lying around in the IL?  Hmm, more likely the IL looksl
> >>>>>>>>>>> ike:
> >>>>>>>>>>>
> >>>>>>>>>>>    _8 = &pb_3(D)->a;
> >>>>>>>>>>>    _9 = _8;
> >>>>>>>>>>>    _1 = _9;
> >>>>>>>>>>>    strncpy (MEM_REF (&pb_3(D)->a), ...);
> >>>>>>>>>>>    MEM[(struct S *)_1].a[n_7] = 0;
> >>>>>>>>>>
> >>>>>>>>>> Yes, that is what the folder sees while the strncpy call is
> >>>>>>>>>> being transformed/folded by ccp.  The MEM_REF is folded just
> >>>>>>>>>> after the strncpy call and that's when it's transformed into
> >>>>>>>>>>
> >>>>>>>>>>   MEM[(struct S *)_8].a[n_7] = 0;
> >>>>>>>>>>
> >>>>>>>>>> (The assignments to _1 and _9 don't get removed until after
> >>>>>>>>>> the dom walk finishes).
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> If we were to propagate the copies out we'd at best have:
> >>>>>>>>>>>
> >>>>>>>>>>>    _8 = &pb_3(D)->a;
> >>>>>>>>>>>    strncpy (MEM_REF (&pb_3(D)->a), ...);
> >>>>>>>>>>>    MEM[(struct S *)_8].a[n_7] = 0;
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Is that in a form you can handle?  Or would we also need to
> >>>>>>>>>>> forward
> >>>>>>>>>>> propagate the address computation into the use of _8?
> >>>>>>>>>>
> >>>>>>>>>> The above works as long as we look at the def_stmt of _8 in
> >>>>>>>>>> the MEM_REF (we currently don't).  That's also what the last
> >>>>>>>>>> iteration of the loop does.  In this case (with _8) it would
> >>>>>>>>>> be discovered in the first iteration, so the loop could be
> >>>>>>>>>> replaced by a simple if statement.
> >>>>>>>>>>
> >>>>>>>>>> But I'm not sure I understand the concern with the loop.  Is
> >>>>>>>>>> it that we are looping at all, i.e., the cost?  Or that ccp
> >>>>>>>>>> is doing something wrong or suboptimal? (Should have
> >>>>>>>>>> propagated the value of _8 earlier?)
> >>>>>>>>> I suspect it's more a concern that things like copies are
> >>>>>>>>> typically
> >>>>>>>>> propagated away.   So their existence in the IL (and consequently
> >>>>>>>>> your
> >>>>>>>>> need to handle them) raises the question "has something else
> >>>>>>>>> failed to
> >>>>>>>>> do its job earlier".
> >>>>>>>>>
> >>>>>>>>> During which of the CCP passes is this happening?  Can we pull the
> >>>>>>>>> warning out of the folder (even if that means having a distinct
> >>>>>>>>> warning
> >>>>>>>>> pass over the IL?)
> >>>>>>>>
> >>>>>>>> It happens during the third run of the pass.
> >>>>>>>>
> >>>>>>>> The only way to do what you suggest that I could think of is
> >>>>>>>> to defer the strncpy to memcpy transformation until after
> >>>>>>>> the warning pass.  That was also my earlier suggestion: defer
> >>>>>>>> both it and the warning until the tree-ssa-strlen pass (where
> >>>>>>>> the warning is implemented to begin with -- the folder calls
> >>>>>>>> into it).
> >>>>>>> If it's happening that late (CCP3) in general, then ISTM we ought
> >>>>>>> to be
> >>>>>>> able to get the warning out of the folder.  We just have to pick the
> >>>>>>> right spot.
> >>>>>>>
> >>>>>>> warn_restrict runs before fold_all_builtins, but after dom/vrp so we
> >>>>>>> should have the IL in pretty good shape.  That seems like about the
> >>>>>>> right time.
> >>>>>>>
> >>>>>>> I wonder if we could generalize warn_restrict to be a more generic
> >>>>>>> warning pass over the IL and place it right before fold_builtins.
> >>>>>>
> >>>>>> The restrict pass doesn't know about string lengths so it can't
> >>>>>> handle all the warnings about string built-ins (the strlen pass
> >>>>>> now calls into it to issue some).  The strlen pass does so it
> >>>>>> could handle most if not all of them (the folder also calls
> >>>>>> into it to issue some warnings).  It would work even better if
> >>>>>> it were also integrated with the object size pass.
> >>>>>>
> >>>>>> We're already working on merging strlen with sprintf.  It seems
> >>>>>> to me that the strlen pass would benefit not only from that but
> >>>>>> also from integrating with object size and warn-restrict.  With
> >>>>>> that, -Wstringop-overflow could be moved from builtins.c into
> >>>>>> it as well (and also benefit not only from accurate string
> >>>>>> lengths but also from the more accurate object size info).
> >>>>>>
> >>>>>> What do you think about that?
> >>>>>
> >>>>> I think integrating the various "passes" (objectsize is also
> >>>>> as much a facility as a pass) generally makes sense given
> >>>>> it might end up improving all of them and reduce code duplication.
> >>>>
> >>>> Okay.  If Jeff agrees I'll see if I can make it happen for GCC
> >>>> 10.  Until then, does either of you have any suggestions for
> >>>> a different approach in this patch or is it acceptable with
> >>>> the loop as is?
> >>>>
> >>>> Martin
> >>>>
> >>>>>
> >>>>> Richard.
> >>>>>
> >>>>>>
> >>>>>> Martin
> >>>>>>
> >>>>>> PS I don't think I could do more than merger strlen and sprintf
> >>>>>> before stage 1 ends (if even that much) so this would be a longer
> >>>>>> term goal.
> >>>>
> >>>
> >>
> >
>

[-- Attachment #2: p --]
[-- Type: application/octet-stream, Size: 3103 bytes --]

diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index 85a5de7ce05..2e92ee3abe2 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -321,6 +321,18 @@ ptr_deref_may_alias_ref_p_1 (tree ptr, ao_ref *ref)
   return true;
 }
 
+/* Return true if dereferencing PTR may alias REF.
+   The caller is responsible for applying TBAA to see if PTR
+   may access REF at all.  */
+
+bool
+ptr_deref_may_alias_ref_p (tree ptr, tree ref)
+{
+  ao_ref r;
+  ao_ref_init (&r, ref);
+  return ptr_deref_may_alias_ref_p_1 (ptr, &r);
+}
+
 /* Returns true if PTR1 and PTR2 compare unequal because of points-to.  */
 
 bool
diff --git a/gcc/tree-ssa-alias.h b/gcc/tree-ssa-alias.h
index 2ecc04d68fd..2b241257134 100644
--- a/gcc/tree-ssa-alias.h
+++ b/gcc/tree-ssa-alias.h
@@ -115,6 +115,7 @@ extern alias_set_type ao_ref_alias_set (ao_ref *);
 extern alias_set_type ao_ref_base_alias_set (ao_ref *);
 extern bool ptr_deref_may_alias_global_p (tree);
 extern bool ptr_derefs_may_alias_p (tree, tree);
+extern bool ptr_deref_may_alias_ref_p (tree, tree);
 extern bool ptrs_compare_unequal (tree, tree);
 extern bool ref_may_alias_global_p (tree);
 extern bool ref_may_alias_global_p (ao_ref *);
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index 669c315dce2..da864f3ea7d 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -1929,49 +1929,21 @@ maybe_diag_stxncpy_trunc (gimple_stmt_iterator gsi, tree src, tree cnt)
 		 edges for now and only consider successor blocks along
 		 normal edges.  */
 	      edge e = EDGE_SUCC (bb, 0);
-	      if (!(e->flags & EDGE_ABNORMAL))
+	      if (!(e->flags & EDGE_COMPLEX))
 		{
-		  gsi = gsi_start_bb (e->dest);
+		  gsi = gsi_start_nondebug_after_labels_bb (e->dest);
 		  next_stmt = gsi_stmt (gsi);
-		  if (next_stmt && is_gimple_debug (next_stmt))
-		    {
-		      gsi_next_nondebug (&gsi);
-		      next_stmt = gsi_stmt (gsi);
-		    }
 		}
 	    }
 	}
     }
-
-  if (next_stmt && is_gimple_assign (next_stmt))
+  if (next_stmt
+      && gimple_assign_single_p (next_stmt)
+      && gimple_store_p (next_stmt)
+      && integer_zerop (gimple_assign_rhs1 (next_stmt)))
     {
       tree lhs = gimple_assign_lhs (next_stmt);
-      tree_code code = TREE_CODE (lhs);
-      if (code == ARRAY_REF || code == MEM_REF)
-	lhs = TREE_OPERAND (lhs, 0);
-
-      tree func = gimple_call_fndecl (stmt);
-      if (DECL_FUNCTION_CODE (func) == BUILT_IN_STPNCPY)
-	{
-	  tree ret = gimple_call_lhs (stmt);
-	  if (ret && operand_equal_p (ret, lhs, 0))
-	    return false;
-	}
-
-      /* Determine the base address and offset of the reference,
-	 ignoring the innermost array index.  */
-      if (TREE_CODE (ref) == ARRAY_REF)
-	ref = TREE_OPERAND (ref, 0);
-
-      poly_int64 dstoff;
-      tree dstbase = get_addr_base_and_unit_offset (ref, &dstoff);
-
-      poly_int64 lhsoff;
-      tree lhsbase = get_addr_base_and_unit_offset (lhs, &lhsoff);
-      if (lhsbase
-	  && dstbase
-	  && known_eq (dstoff, lhsoff)
-	  && operand_equal_p (dstbase, lhsbase, 0))
+      if (ptr_deref_may_alias_ref_p (dst, lhs))
 	return false;
     }
 

  reply	other threads:[~2018-11-16  8:46 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-30  0:12 Martin Sebor
2018-08-30  8:35 ` Richard Biener
2018-08-30 16:54   ` Martin Sebor
2018-08-30 17:22     ` Richard Biener
2018-08-30 17:39       ` Martin Sebor
2018-08-31 10:07         ` Richard Biener
2018-09-12 18:03           ` Martin Sebor
2018-09-14 21:35             ` Jeff Law
2018-09-14 23:44               ` Martin Sebor
2018-09-17 23:13                 ` Jeff Law
2018-09-18 17:38                   ` Martin Sebor
2018-09-18 19:24                     ` Jeff Law
2018-09-18 20:01                       ` Martin Sebor
2018-09-19  5:40                         ` Jeff Law
2018-09-19 14:31                           ` Martin Sebor
2018-09-20  9:21                             ` Richard Biener
2018-09-21 14:50                               ` Martin Sebor
2018-10-01 21:46                                 ` [PING] " Martin Sebor
2018-10-08 22:03                                   ` [PING #2] " Martin Sebor
2018-10-31 17:11                                     ` [PING #3] " Martin Sebor
2018-11-16  3:09                                       ` [PING #4] " Martin Sebor
2018-11-16  8:46                                         ` Richard Biener [this message]
2018-11-19 14:55                                           ` Jeff Law
2018-11-19 16:27                                           ` Martin Sebor
2018-11-20  9:23                                             ` Richard Biener
2018-10-04  3:08                             ` Jeff Law
2018-09-19 13:51                       ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFiYyc0W7QGk-PfrkmMAdTaunhdic5UqaG3hqc5xJueUeYOuAg@mail.gmail.com \
    --to=richard.guenther@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=law@redhat.com \
    --cc=msebor@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).