public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: [PATCH] tree-optimization/111519 - strlen optimization skips clobbering store
       [not found] <20231010104908.839613857C41@sourceware.org>
@ 2023-10-10 11:44 ` Jakub Jelinek
  2023-10-10 11:59   ` Richard Biener
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Jelinek @ 2023-10-10 11:44 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On Tue, Oct 10, 2023 at 10:49:04AM +0000, Richard Biener wrote:
> The following fixes a mistake in count_nonzero_bytes which happily
> skips over stores clobbering the memory we load a value we store
> from and then performs analysis on the memory state before the
> intermediate store.
> 
> The patch implements the most simple fix - guarantee that there are
> no intervening stores by tracking the original active virtual operand
> and comparing that to the one of a load we attempt to analyze.
> 
> This simple approach breaks two subtests of gcc.dg/Wstringop-overflow-69.c
> which I chose to XFAIL.

This function is a total mess, but I think punting right after the
gimple_assign_single_p test is too big hammer.
There are various cases and some of them are fine even when vuse is
different, others aren't.
The function at that point tries to handle CONSTRUCTOR, MEM_REF, or decls.

I don't see why the CONSTRUCTOR case couldn't be fine regardless of the
vuse.  Though, am not really sure when a CONSTRUCTOR would appear, the
lhs would need to be an SSA_NAME, so wouldn't for vectors that be a
VECTOR_CST instead, etc.?  Ah, perhaps a vector with some non-constant
element in it.
The MEM_REF case supposedly only if we can guarantee nothing could overwrite
it (so MEM_REF of TREE_STATIC TREE_READONLY could be fine, STRING_CST too,
anything else is wrong - count_nonzero_bytes_addr uses the
get_stridx/get_strinfo APIs, which for something that can be changed
always reflects only the state at the current statement.  So, e.g. the
get_stridx (exp, stmt) > 0 case in count_nonzero_bytes_addr is when
the caller must definitely punt if vuse is different.
Then for decls, again, CONST_DECLs, DECL_IN_CONSTANT_POOLs are certainly
fine.  Other decls for which ctor_for_folding returns non-error_mark_node
should be fine as well, I think ctor_useable_for_folding_p is supposed
to verify that.  STRING_CSTs should be fine as well.

So maybe pass the vuse down to count_nonzero_bytes_addr and return false
in the idx > 0 case in there if gimple_vuse (stmt) != vuse?

	Jakub


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] tree-optimization/111519 - strlen optimization skips clobbering store
  2023-10-10 11:44 ` [PATCH] tree-optimization/111519 - strlen optimization skips clobbering store Jakub Jelinek
@ 2023-10-10 11:59   ` Richard Biener
  2023-10-10 13:26     ` Jakub Jelinek
  0 siblings, 1 reply; 5+ messages in thread
From: Richard Biener @ 2023-10-10 11:59 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On Tue, 10 Oct 2023, Jakub Jelinek wrote:

> On Tue, Oct 10, 2023 at 10:49:04AM +0000, Richard Biener wrote:
> > The following fixes a mistake in count_nonzero_bytes which happily
> > skips over stores clobbering the memory we load a value we store
> > from and then performs analysis on the memory state before the
> > intermediate store.
> > 
> > The patch implements the most simple fix - guarantee that there are
> > no intervening stores by tracking the original active virtual operand
> > and comparing that to the one of a load we attempt to analyze.
> > 
> > This simple approach breaks two subtests of gcc.dg/Wstringop-overflow-69.c
> > which I chose to XFAIL.
> 
> This function is a total mess, but I think punting right after the
> gimple_assign_single_p test is too big hammer.
> There are various cases and some of them are fine even when vuse is
> different, others aren't.
> The function at that point tries to handle CONSTRUCTOR, MEM_REF, or decls.
> 
> I don't see why the CONSTRUCTOR case couldn't be fine regardless of the
> vuse.  Though, am not really sure when a CONSTRUCTOR would appear, the
> lhs would need to be an SSA_NAME, so wouldn't for vectors that be a
> VECTOR_CST instead, etc.?  Ah, perhaps a vector with some non-constant
> element in it.

Yeah, but what should that be interpreted to in terms of object-size?!

I think the only real case we'd see here is the MEM_REF RHS given
we know we have a register type value.  I'll note the function
totally misses handled_component_p (), it only seems to handle
*p and 'decl'.

> The MEM_REF case supposedly only if we can guarantee nothing could overwrite
> it (so MEM_REF of TREE_STATIC TREE_READONLY could be fine, STRING_CST too,
> anything else is wrong - count_nonzero_bytes_addr uses the
> get_stridx/get_strinfo APIs, which for something that can be changed
> always reflects only the state at the current statement.  So, e.g. the
> get_stridx (exp, stmt) > 0 case in count_nonzero_bytes_addr is when
> the caller must definitely punt if vuse is different.
> Then for decls, again, CONST_DECLs, DECL_IN_CONSTANT_POOLs are certainly
> fine.  Other decls for which ctor_for_folding returns non-error_mark_node
> should be fine as well, I think ctor_useable_for_folding_p is supposed
> to verify that.  STRING_CSTs should be fine as well.
> 
> So maybe pass the vuse down to count_nonzero_bytes_addr and return false
> in the idx > 0 case in there if gimple_vuse (stmt) != vuse?

I don't know enough of the pass to do better, can you take it from here?
One of the points is that we need to know the memory context (aka vuse)
of both the original store and the load we are interpreting.  For
_addr I wasn't sure how we arrive here.  As you said, this is a bit
of spaghetti and I don't want to untangle this any further.

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] tree-optimization/111519 - strlen optimization skips clobbering store
  2023-10-10 11:59   ` Richard Biener
@ 2023-10-10 13:26     ` Jakub Jelinek
  2023-10-10 13:41       ` Richard Biener
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Jelinek @ 2023-10-10 13:26 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On Tue, Oct 10, 2023 at 11:59:28AM +0000, Richard Biener wrote:
> > I don't see why the CONSTRUCTOR case couldn't be fine regardless of the
> > vuse.  Though, am not really sure when a CONSTRUCTOR would appear, the
> > lhs would need to be an SSA_NAME, so wouldn't for vectors that be a
> > VECTOR_CST instead, etc.?  Ah, perhaps a vector with some non-constant
> > element in it.
> 
> Yeah, but what should that be interpreted to in terms of object-size?!

The function in question doesn't compute object sizes, just minimum/maximum
number of non-zero bytes in some rhs of a store and whether everything
stored is 0s, or everything non-zeros, or some non-zeros followed by zero.

> I think the only real case we'd see here is the MEM_REF RHS given
> we know we have a register type value.  I'll note the function
> totally misses handled_component_p (), it only seems to handle
> *p and 'decl'.

Yeah, maybe we could handle even that at some point.
Though perhaps better first to rewrite it completely, because the recursive
calls where in some cases it means one thing and in another case something
completely different are just bad design (or lack thereof).

> > So maybe pass the vuse down to count_nonzero_bytes_addr and return false
> > in the idx > 0 case in there if gimple_vuse (stmt) != vuse?
> 
> I don't know enough of the pass to do better, can you take it from here?
> One of the points is that we need to know the memory context (aka vuse)
> of both the original store and the load we are interpreting.  For
> _addr I wasn't sure how we arrive here.  As you said, this is a bit
> of spaghetti and I don't want to untangle this any further.

I meant something like below, without getting rid of the -Wshadow stuff
in there my initial attempt didn't work.  This passes the new testcase
as well as the testcase you've been touching, but haven't tested it beyond
that yet.
In theory we could even handle some cases with gimple_vuse (stmt) != vuse,
because we save a copy of the strinfo state at the end of basic blocks and
only throw that away after we process all dominator children.  But we'd need
to figure out at which bb to look and temporarily switch the vectors.

2023-10-10  Richard Biener  <rguenther@suse.de>
	    Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/111519
	* tree-ssa-strlen.cc (strlen_pass::count_nonzero_bytes): Add vuse
	argument and pass it through to recursive calls and
	count_nonzero_bytes_addr calls.  Don't shadow the stmt argument, but
	change stmt for gimple_assign_single_p statements for which we don't
	immediately punt.
	(strlen_pass::count_nonzero_bytes_addr): Add vuse argument and pass
	it through to recursive calls and count_nonzero_bytes calls.  Don't
	use get_strinfo if gimple_vuse (stmt) is different from vuse.  Don't
	shadow the stmt argument.

	* gcc.dg/torture/pr111519.c: New testcase.

--- gcc/tree-ssa-strlen.cc.jj	2023-08-30 11:21:38.539521966 +0200
+++ gcc/tree-ssa-strlen.cc	2023-10-10 15:05:44.731871218 +0200
@@ -281,14 +281,14 @@ public:
 			    gimple *stmt,
 			    unsigned lenrange[3], bool *nulterm,
 			    bool *allnul, bool *allnonnul);
-  bool count_nonzero_bytes (tree exp,
+  bool count_nonzero_bytes (tree exp, tree vuse,
 			    gimple *stmt,
 			    unsigned HOST_WIDE_INT offset,
 			    unsigned HOST_WIDE_INT nbytes,
 			    unsigned lenrange[3], bool *nulterm,
 			    bool *allnul, bool *allnonnul,
 			    ssa_name_limit_t &snlim);
-  bool count_nonzero_bytes_addr (tree exp,
+  bool count_nonzero_bytes_addr (tree exp, tree vuse,
 				 gimple *stmt,
 				 unsigned HOST_WIDE_INT offset,
 				 unsigned HOST_WIDE_INT nbytes,
@@ -4531,8 +4531,8 @@ nonzero_bytes_for_type (tree type, unsig
 }
 
 /* Recursively determine the minimum and maximum number of leading nonzero
-   bytes in the representation of EXP and set LENRANGE[0] and LENRANGE[1]
-   to each.
+   bytes in the representation of EXP at memory state VUSE and set
+   LENRANGE[0] and LENRANGE[1] to each.
    Sets LENRANGE[2] to the total size of the access (which may be less
    than LENRANGE[1] when what's being referenced by EXP is a pointer
    rather than an array).
@@ -4546,7 +4546,7 @@ nonzero_bytes_for_type (tree type, unsig
    Returns true on success and false otherwise.  */
 
 bool
-strlen_pass::count_nonzero_bytes (tree exp, gimple *stmt,
+strlen_pass::count_nonzero_bytes (tree exp, tree vuse, gimple *stmt,
 				  unsigned HOST_WIDE_INT offset,
 				  unsigned HOST_WIDE_INT nbytes,
 				  unsigned lenrange[3], bool *nulterm,
@@ -4566,22 +4566,23 @@ strlen_pass::count_nonzero_bytes (tree e
 	     exact value is not known) recurse once to set the range
 	     for an arbitrary constant.  */
 	  exp = build_int_cst (type, 1);
-	  return count_nonzero_bytes (exp, stmt,
+	  return count_nonzero_bytes (exp, vuse, stmt,
 				      offset, 1, lenrange,
 				      nulterm, allnul, allnonnul, snlim);
 	}
 
-      gimple *stmt = SSA_NAME_DEF_STMT (exp);
-      if (gimple_assign_single_p (stmt))
+      gimple *g = SSA_NAME_DEF_STMT (exp);
+      if (gimple_assign_single_p (g))
 	{
-	  exp = gimple_assign_rhs1 (stmt);
+	  exp = gimple_assign_rhs1 (g);
 	  if (!DECL_P (exp)
 	      && TREE_CODE (exp) != CONSTRUCTOR
 	      && TREE_CODE (exp) != MEM_REF)
 	    return false;
 	  /* Handle DECLs, CONSTRUCTOR and MEM_REF below.  */
+	  stmt = g;
 	}
-      else if (gimple_code (stmt) == GIMPLE_PHI)
+      else if (gimple_code (g) == GIMPLE_PHI)
 	{
 	  /* Avoid processing an SSA_NAME that has already been visited
 	     or if an SSA_NAME limit has been reached.  Indicate success
@@ -4590,11 +4591,11 @@ strlen_pass::count_nonzero_bytes (tree e
 	    return res > 0;
 
 	  /* Determine the minimum and maximum from the PHI arguments.  */
-	  unsigned int n = gimple_phi_num_args (stmt);
+	  unsigned int n = gimple_phi_num_args (g);
 	  for (unsigned i = 0; i != n; i++)
 	    {
-	      tree def = gimple_phi_arg_def (stmt, i);
-	      if (!count_nonzero_bytes (def, stmt,
+	      tree def = gimple_phi_arg_def (g, i);
+	      if (!count_nonzero_bytes (def, vuse, g,
 					offset, nbytes, lenrange, nulterm,
 					allnul, allnonnul, snlim))
 		return false;
@@ -4652,7 +4653,7 @@ strlen_pass::count_nonzero_bytes (tree e
 	return false;
 
       /* Handle MEM_REF = SSA_NAME types of assignments.  */
-      return count_nonzero_bytes_addr (arg, stmt,
+      return count_nonzero_bytes_addr (arg, vuse, stmt,
 				       offset, nbytes, lenrange, nulterm,
 				       allnul, allnonnul, snlim);
     }
@@ -4765,7 +4766,7 @@ strlen_pass::count_nonzero_bytes (tree e
    bytes that are pointed to by EXP, which should be a pointer.  */
 
 bool
-strlen_pass::count_nonzero_bytes_addr (tree exp, gimple *stmt,
+strlen_pass::count_nonzero_bytes_addr (tree exp, tree vuse, gimple *stmt,
 				       unsigned HOST_WIDE_INT offset,
 				       unsigned HOST_WIDE_INT nbytes,
 				       unsigned lenrange[3], bool *nulterm,
@@ -4775,6 +4776,14 @@ strlen_pass::count_nonzero_bytes_addr (t
   int idx = get_stridx (exp, stmt);
   if (idx > 0)
     {
+      /* get_strinfo reflects string lengths before the current statement,
+	 where the current statement is the outermost count_nonzero_bytes
+	 stmt.  If there are any stores in between stmt and that
+	 current statement, the string length information might describe
+	 something significantly different.  */
+      if (gimple_vuse (stmt) != vuse)
+	return false;
+
       strinfo *si = get_strinfo (idx);
       if (!si)
 	return false;
@@ -4835,14 +4844,14 @@ strlen_pass::count_nonzero_bytes_addr (t
     }
 
   if (TREE_CODE (exp) == ADDR_EXPR)
-    return count_nonzero_bytes (TREE_OPERAND (exp, 0), stmt,
+    return count_nonzero_bytes (TREE_OPERAND (exp, 0), vuse, stmt,
 				offset, nbytes,
 				lenrange, nulterm, allnul, allnonnul, snlim);
 
   if (TREE_CODE (exp) == SSA_NAME)
     {
-      gimple *stmt = SSA_NAME_DEF_STMT (exp);
-      if (gimple_code (stmt) == GIMPLE_PHI)
+      gimple *g = SSA_NAME_DEF_STMT (exp);
+      if (gimple_code (g) == GIMPLE_PHI)
 	{
 	  /* Avoid processing an SSA_NAME that has already been visited
 	     or if an SSA_NAME limit has been reached.  Indicate success
@@ -4851,11 +4860,11 @@ strlen_pass::count_nonzero_bytes_addr (t
 	    return res > 0;
 
 	  /* Determine the minimum and maximum from the PHI arguments.  */
-	  unsigned int n = gimple_phi_num_args (stmt);
+	  unsigned int n = gimple_phi_num_args (g);
 	  for (unsigned i = 0; i != n; i++)
 	    {
-	      tree def = gimple_phi_arg_def (stmt, i);
-	      if (!count_nonzero_bytes_addr (def, stmt,
+	      tree def = gimple_phi_arg_def (g, i);
+	      if (!count_nonzero_bytes_addr (def, vuse, g,
 					     offset, nbytes, lenrange,
 					     nulterm, allnul, allnonnul,
 					     snlim))
@@ -4903,7 +4912,7 @@ strlen_pass::count_nonzero_bytes (tree e
 
   ssa_name_limit_t snlim;
   tree expr = expr_or_type;
-  return count_nonzero_bytes (expr, stmt,
+  return count_nonzero_bytes (expr, gimple_vuse (stmt), stmt,
 			      0, 0, lenrange, nulterm, allnul, allnonnul,
 			      snlim);
 }
--- gcc/testsuite/gcc.dg/torture/pr111519.c.jj	2023-10-10 14:36:22.195889648 +0200
+++ gcc/testsuite/gcc.dg/torture/pr111519.c	2023-10-10 14:36:22.195889648 +0200
@@ -0,0 +1,48 @@
+/* PR tree-optimization/111519 */
+/* { dg-do run } */
+
+int a, o;
+char b, f, i;
+long c;
+static signed char d;
+static char g;
+unsigned *h;
+signed char *e = &f;
+static signed char **j = &e;
+static long k[2];
+unsigned **l = &h;
+short m;
+volatile int z;
+
+__attribute__((noipa)) void
+foo (char *p)
+{
+  (void) p;
+}
+
+int
+main ()
+{
+  int p = z;
+  signed char *n = &d;
+  *n = 0;
+  while (c)
+    for (; i; i--)
+      ;
+  for (g = 0; g <= 1; g++)
+    {
+      *n = **j;
+      k[g] = 0 != &m;
+      *e = l && k[0];
+    }
+  if (p)
+    foo (&b);
+  for (; o < 4; o++)
+    {
+      a = d;
+      if (p)
+	foo (&b);
+    }
+  if (a != 1)
+    __builtin_abort ();
+}


	Jakub


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] tree-optimization/111519 - strlen optimization skips clobbering store
  2023-10-10 13:26     ` Jakub Jelinek
@ 2023-10-10 13:41       ` Richard Biener
  0 siblings, 0 replies; 5+ messages in thread
From: Richard Biener @ 2023-10-10 13:41 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On Tue, 10 Oct 2023, Jakub Jelinek wrote:

> On Tue, Oct 10, 2023 at 11:59:28AM +0000, Richard Biener wrote:
> > > I don't see why the CONSTRUCTOR case couldn't be fine regardless of the
> > > vuse.  Though, am not really sure when a CONSTRUCTOR would appear, the
> > > lhs would need to be an SSA_NAME, so wouldn't for vectors that be a
> > > VECTOR_CST instead, etc.?  Ah, perhaps a vector with some non-constant
> > > element in it.
> > 
> > Yeah, but what should that be interpreted to in terms of object-size?!
> 
> The function in question doesn't compute object sizes, just minimum/maximum
> number of non-zero bytes in some rhs of a store and whether everything
> stored is 0s, or everything non-zeros, or some non-zeros followed by zero.
> 
> > I think the only real case we'd see here is the MEM_REF RHS given
> > we know we have a register type value.  I'll note the function
> > totally misses handled_component_p (), it only seems to handle
> > *p and 'decl'.
> 
> Yeah, maybe we could handle even that at some point.
> Though perhaps better first to rewrite it completely, because the recursive
> calls where in some cases it means one thing and in another case something
> completely different are just bad design (or lack thereof).

Yeah ... (true for many similar pieces in pointer-query and other
diagnostic passes...)

> > > So maybe pass the vuse down to count_nonzero_bytes_addr and return false
> > > in the idx > 0 case in there if gimple_vuse (stmt) != vuse?
> > 
> > I don't know enough of the pass to do better, can you take it from here?
> > One of the points is that we need to know the memory context (aka vuse)
> > of both the original store and the load we are interpreting.  For
> > _addr I wasn't sure how we arrive here.  As you said, this is a bit
> > of spaghetti and I don't want to untangle this any further.
> 
> I meant something like below, without getting rid of the -Wshadow stuff
> in there my initial attempt didn't work.  This passes the new testcase
> as well as the testcase you've been touching, but haven't tested it beyond
> that yet.

Works for me if it turns out working.

> In theory we could even handle some cases with gimple_vuse (stmt) != vuse,
> because we save a copy of the strinfo state at the end of basic blocks and
> only throw that away after we process all dominator children.  But we'd need
> to figure out at which bb to look and temporarily switch the vectors.

As we need sth for the branch(es) I think we should do that as followup
at most.

Thanks,
Richard.

> 2023-10-10  Richard Biener  <rguenther@suse.de>
> 	    Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/111519
> 	* tree-ssa-strlen.cc (strlen_pass::count_nonzero_bytes): Add vuse
> 	argument and pass it through to recursive calls and
> 	count_nonzero_bytes_addr calls.  Don't shadow the stmt argument, but
> 	change stmt for gimple_assign_single_p statements for which we don't
> 	immediately punt.
> 	(strlen_pass::count_nonzero_bytes_addr): Add vuse argument and pass
> 	it through to recursive calls and count_nonzero_bytes calls.  Don't
> 	use get_strinfo if gimple_vuse (stmt) is different from vuse.  Don't
> 	shadow the stmt argument.
> 
> 	* gcc.dg/torture/pr111519.c: New testcase.
> 
> --- gcc/tree-ssa-strlen.cc.jj	2023-08-30 11:21:38.539521966 +0200
> +++ gcc/tree-ssa-strlen.cc	2023-10-10 15:05:44.731871218 +0200
> @@ -281,14 +281,14 @@ public:
>  			    gimple *stmt,
>  			    unsigned lenrange[3], bool *nulterm,
>  			    bool *allnul, bool *allnonnul);
> -  bool count_nonzero_bytes (tree exp,
> +  bool count_nonzero_bytes (tree exp, tree vuse,
>  			    gimple *stmt,
>  			    unsigned HOST_WIDE_INT offset,
>  			    unsigned HOST_WIDE_INT nbytes,
>  			    unsigned lenrange[3], bool *nulterm,
>  			    bool *allnul, bool *allnonnul,
>  			    ssa_name_limit_t &snlim);
> -  bool count_nonzero_bytes_addr (tree exp,
> +  bool count_nonzero_bytes_addr (tree exp, tree vuse,
>  				 gimple *stmt,
>  				 unsigned HOST_WIDE_INT offset,
>  				 unsigned HOST_WIDE_INT nbytes,
> @@ -4531,8 +4531,8 @@ nonzero_bytes_for_type (tree type, unsig
>  }
>  
>  /* Recursively determine the minimum and maximum number of leading nonzero
> -   bytes in the representation of EXP and set LENRANGE[0] and LENRANGE[1]
> -   to each.
> +   bytes in the representation of EXP at memory state VUSE and set
> +   LENRANGE[0] and LENRANGE[1] to each.
>     Sets LENRANGE[2] to the total size of the access (which may be less
>     than LENRANGE[1] when what's being referenced by EXP is a pointer
>     rather than an array).
> @@ -4546,7 +4546,7 @@ nonzero_bytes_for_type (tree type, unsig
>     Returns true on success and false otherwise.  */
>  
>  bool
> -strlen_pass::count_nonzero_bytes (tree exp, gimple *stmt,
> +strlen_pass::count_nonzero_bytes (tree exp, tree vuse, gimple *stmt,
>  				  unsigned HOST_WIDE_INT offset,
>  				  unsigned HOST_WIDE_INT nbytes,
>  				  unsigned lenrange[3], bool *nulterm,
> @@ -4566,22 +4566,23 @@ strlen_pass::count_nonzero_bytes (tree e
>  	     exact value is not known) recurse once to set the range
>  	     for an arbitrary constant.  */
>  	  exp = build_int_cst (type, 1);
> -	  return count_nonzero_bytes (exp, stmt,
> +	  return count_nonzero_bytes (exp, vuse, stmt,
>  				      offset, 1, lenrange,
>  				      nulterm, allnul, allnonnul, snlim);
>  	}
>  
> -      gimple *stmt = SSA_NAME_DEF_STMT (exp);
> -      if (gimple_assign_single_p (stmt))
> +      gimple *g = SSA_NAME_DEF_STMT (exp);
> +      if (gimple_assign_single_p (g))
>  	{
> -	  exp = gimple_assign_rhs1 (stmt);
> +	  exp = gimple_assign_rhs1 (g);
>  	  if (!DECL_P (exp)
>  	      && TREE_CODE (exp) != CONSTRUCTOR
>  	      && TREE_CODE (exp) != MEM_REF)
>  	    return false;
>  	  /* Handle DECLs, CONSTRUCTOR and MEM_REF below.  */
> +	  stmt = g;
>  	}
> -      else if (gimple_code (stmt) == GIMPLE_PHI)
> +      else if (gimple_code (g) == GIMPLE_PHI)
>  	{
>  	  /* Avoid processing an SSA_NAME that has already been visited
>  	     or if an SSA_NAME limit has been reached.  Indicate success
> @@ -4590,11 +4591,11 @@ strlen_pass::count_nonzero_bytes (tree e
>  	    return res > 0;
>  
>  	  /* Determine the minimum and maximum from the PHI arguments.  */
> -	  unsigned int n = gimple_phi_num_args (stmt);
> +	  unsigned int n = gimple_phi_num_args (g);
>  	  for (unsigned i = 0; i != n; i++)
>  	    {
> -	      tree def = gimple_phi_arg_def (stmt, i);
> -	      if (!count_nonzero_bytes (def, stmt,
> +	      tree def = gimple_phi_arg_def (g, i);
> +	      if (!count_nonzero_bytes (def, vuse, g,
>  					offset, nbytes, lenrange, nulterm,
>  					allnul, allnonnul, snlim))
>  		return false;
> @@ -4652,7 +4653,7 @@ strlen_pass::count_nonzero_bytes (tree e
>  	return false;
>  
>        /* Handle MEM_REF = SSA_NAME types of assignments.  */
> -      return count_nonzero_bytes_addr (arg, stmt,
> +      return count_nonzero_bytes_addr (arg, vuse, stmt,
>  				       offset, nbytes, lenrange, nulterm,
>  				       allnul, allnonnul, snlim);
>      }
> @@ -4765,7 +4766,7 @@ strlen_pass::count_nonzero_bytes (tree e
>     bytes that are pointed to by EXP, which should be a pointer.  */
>  
>  bool
> -strlen_pass::count_nonzero_bytes_addr (tree exp, gimple *stmt,
> +strlen_pass::count_nonzero_bytes_addr (tree exp, tree vuse, gimple *stmt,
>  				       unsigned HOST_WIDE_INT offset,
>  				       unsigned HOST_WIDE_INT nbytes,
>  				       unsigned lenrange[3], bool *nulterm,
> @@ -4775,6 +4776,14 @@ strlen_pass::count_nonzero_bytes_addr (t
>    int idx = get_stridx (exp, stmt);
>    if (idx > 0)
>      {
> +      /* get_strinfo reflects string lengths before the current statement,
> +	 where the current statement is the outermost count_nonzero_bytes
> +	 stmt.  If there are any stores in between stmt and that
> +	 current statement, the string length information might describe
> +	 something significantly different.  */
> +      if (gimple_vuse (stmt) != vuse)
> +	return false;
> +
>        strinfo *si = get_strinfo (idx);
>        if (!si)
>  	return false;
> @@ -4835,14 +4844,14 @@ strlen_pass::count_nonzero_bytes_addr (t
>      }
>  
>    if (TREE_CODE (exp) == ADDR_EXPR)
> -    return count_nonzero_bytes (TREE_OPERAND (exp, 0), stmt,
> +    return count_nonzero_bytes (TREE_OPERAND (exp, 0), vuse, stmt,
>  				offset, nbytes,
>  				lenrange, nulterm, allnul, allnonnul, snlim);
>  
>    if (TREE_CODE (exp) == SSA_NAME)
>      {
> -      gimple *stmt = SSA_NAME_DEF_STMT (exp);
> -      if (gimple_code (stmt) == GIMPLE_PHI)
> +      gimple *g = SSA_NAME_DEF_STMT (exp);
> +      if (gimple_code (g) == GIMPLE_PHI)
>  	{
>  	  /* Avoid processing an SSA_NAME that has already been visited
>  	     or if an SSA_NAME limit has been reached.  Indicate success
> @@ -4851,11 +4860,11 @@ strlen_pass::count_nonzero_bytes_addr (t
>  	    return res > 0;
>  
>  	  /* Determine the minimum and maximum from the PHI arguments.  */
> -	  unsigned int n = gimple_phi_num_args (stmt);
> +	  unsigned int n = gimple_phi_num_args (g);
>  	  for (unsigned i = 0; i != n; i++)
>  	    {
> -	      tree def = gimple_phi_arg_def (stmt, i);
> -	      if (!count_nonzero_bytes_addr (def, stmt,
> +	      tree def = gimple_phi_arg_def (g, i);
> +	      if (!count_nonzero_bytes_addr (def, vuse, g,
>  					     offset, nbytes, lenrange,
>  					     nulterm, allnul, allnonnul,
>  					     snlim))
> @@ -4903,7 +4912,7 @@ strlen_pass::count_nonzero_bytes (tree e
>  
>    ssa_name_limit_t snlim;
>    tree expr = expr_or_type;
> -  return count_nonzero_bytes (expr, stmt,
> +  return count_nonzero_bytes (expr, gimple_vuse (stmt), stmt,
>  			      0, 0, lenrange, nulterm, allnul, allnonnul,
>  			      snlim);
>  }
> --- gcc/testsuite/gcc.dg/torture/pr111519.c.jj	2023-10-10 14:36:22.195889648 +0200
> +++ gcc/testsuite/gcc.dg/torture/pr111519.c	2023-10-10 14:36:22.195889648 +0200
> @@ -0,0 +1,48 @@
> +/* PR tree-optimization/111519 */
> +/* { dg-do run } */
> +
> +int a, o;
> +char b, f, i;
> +long c;
> +static signed char d;
> +static char g;
> +unsigned *h;
> +signed char *e = &f;
> +static signed char **j = &e;
> +static long k[2];
> +unsigned **l = &h;
> +short m;
> +volatile int z;
> +
> +__attribute__((noipa)) void
> +foo (char *p)
> +{
> +  (void) p;
> +}
> +
> +int
> +main ()
> +{
> +  int p = z;
> +  signed char *n = &d;
> +  *n = 0;
> +  while (c)
> +    for (; i; i--)
> +      ;
> +  for (g = 0; g <= 1; g++)
> +    {
> +      *n = **j;
> +      k[g] = 0 != &m;
> +      *e = l && k[0];
> +    }
> +  if (p)
> +    foo (&b);
> +  for (; o < 4; o++)
> +    {
> +      a = d;
> +      if (p)
> +	foo (&b);
> +    }
> +  if (a != 1)
> +    __builtin_abort ();
> +}
> 
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] tree-optimization/111519 - strlen optimization skips clobbering store
@ 2023-10-10 10:49 Richard Biener
  0 siblings, 0 replies; 5+ messages in thread
From: Richard Biener @ 2023-10-10 10:49 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek

The following fixes a mistake in count_nonzero_bytes which happily
skips over stores clobbering the memory we load a value we store
from and then performs analysis on the memory state before the
intermediate store.

The patch implements the most simple fix - guarantee that there are
no intervening stores by tracking the original active virtual operand
and comparing that to the one of a load we attempt to analyze.

This simple approach breaks two subtests of gcc.dg/Wstringop-overflow-69.c
which I chose to XFAIL.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

	PR tree-optimization/111519
	* tree-ssa-strlen.cc (strlen_pass::count_nonzero_bytes):
	Add virtual operand argument and pass it through.  Compare
	the memory state we try to analyze to the memory state we
	are going to use the result oon.
	(strlen_pass::count_nonzero_bytes_addr): Add virtual
	operand argument and pass it through.

	* gcc.dg/torture/pr111519.c: New testcase.
	* gcc.dg/Wstringop-overflow-69.c: XFAIL two subtests.
---
 gcc/testsuite/gcc.dg/Wstringop-overflow-69.c |  4 +-
 gcc/testsuite/gcc.dg/torture/pr111519.c      | 47 ++++++++++++++++++++
 gcc/tree-ssa-strlen.cc                       | 27 ++++++-----
 3 files changed, 64 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr111519.c

diff --git a/gcc/testsuite/gcc.dg/Wstringop-overflow-69.c b/gcc/testsuite/gcc.dg/Wstringop-overflow-69.c
index be361fe620d..3c17fe13d8e 100644
--- a/gcc/testsuite/gcc.dg/Wstringop-overflow-69.c
+++ b/gcc/testsuite/gcc.dg/Wstringop-overflow-69.c
@@ -57,9 +57,9 @@ void warn_vec_decl (void)
 {
   *(VC2*)a1 = c2;       // { dg-warning "writing 2 bytes into a region of size 1" }
   *(VC4*)a2 = c4;       // { dg-warning "writing 4 bytes into a region of size 2" }
-  *(VC4*)a3 = c4;       // { dg-warning "writing 4 bytes into a region of size 3" }
+  *(VC4*)a3 = c4;       // { dg-warning "writing 4 bytes into a region of size 3" "pr111519" { xfail *-*-* } }
   *(VC8*)a4 = c8;       // { dg-warning "writing 8 bytes into a region of size 4" }
-  *(VC8*)a7 = c8;       // { dg-warning "writing 8 bytes into a region of size 7" }
+  *(VC8*)a7 = c8;       // { dg-warning "writing 8 bytes into a region of size 7" "pr111519" { xfail *-*-* } }
   *(VC16*)a15 = c16;    // { dg-warning "writing 16 bytes into a region of size 15" }
 }
 
diff --git a/gcc/testsuite/gcc.dg/torture/pr111519.c b/gcc/testsuite/gcc.dg/torture/pr111519.c
new file mode 100644
index 00000000000..095bb1cd13b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr111519.c
@@ -0,0 +1,47 @@
+/* { dg-do run } */
+
+int a, o;
+char b, f, i;
+long c;
+static signed char d;
+static char g;
+unsigned *h;
+signed char *e = &f;
+static signed char **j = &e;
+static long k[2];
+unsigned **l = &h;
+short m;
+volatile int z;
+
+__attribute__((noipa)) void
+foo (char *p)
+{
+  (void) p;
+}
+
+int
+main ()
+{
+  int p = z;
+  signed char *n = &d;
+  *n = 0;
+  while (c)
+    for (; i; i--)
+      ;
+  for (g = 0; g <= 1; g++)
+    {
+      *n = **j;
+      k[g] = 0 != &m;
+      *e = l && k[0];
+    }
+  if (p)
+    foo (&b);
+  for (; o < 4; o++)
+    {
+      a = d;
+      if (p)
+	foo (&b);
+    }
+  if (a != 1)
+    __builtin_abort ();
+}
diff --git a/gcc/tree-ssa-strlen.cc b/gcc/tree-ssa-strlen.cc
index 8b7ef919826..0ff3f2e308a 100644
--- a/gcc/tree-ssa-strlen.cc
+++ b/gcc/tree-ssa-strlen.cc
@@ -281,14 +281,14 @@ public:
 			    gimple *stmt,
 			    unsigned lenrange[3], bool *nulterm,
 			    bool *allnul, bool *allnonnul);
-  bool count_nonzero_bytes (tree exp,
+  bool count_nonzero_bytes (tree exp, tree vuse,
 			    gimple *stmt,
 			    unsigned HOST_WIDE_INT offset,
 			    unsigned HOST_WIDE_INT nbytes,
 			    unsigned lenrange[3], bool *nulterm,
 			    bool *allnul, bool *allnonnul,
 			    ssa_name_limit_t &snlim);
-  bool count_nonzero_bytes_addr (tree exp,
+  bool count_nonzero_bytes_addr (tree exp, tree vuse,
 				 gimple *stmt,
 				 unsigned HOST_WIDE_INT offset,
 				 unsigned HOST_WIDE_INT nbytes,
@@ -4531,8 +4531,8 @@ nonzero_bytes_for_type (tree type, unsigned lenrange[3],
 }
 
 /* Recursively determine the minimum and maximum number of leading nonzero
-   bytes in the representation of EXP and set LENRANGE[0] and LENRANGE[1]
-   to each.
+   bytes in the representation of EXP at memory state VUSE and set
+   LENRANGE[0] and LENRANGE[1] to each.
    Sets LENRANGE[2] to the total size of the access (which may be less
    than LENRANGE[1] when what's being referenced by EXP is a pointer
    rather than an array).
@@ -4546,7 +4546,7 @@ nonzero_bytes_for_type (tree type, unsigned lenrange[3],
    Returns true on success and false otherwise.  */
 
 bool
-strlen_pass::count_nonzero_bytes (tree exp, gimple *stmt,
+strlen_pass::count_nonzero_bytes (tree exp, tree vuse, gimple *stmt,
 				  unsigned HOST_WIDE_INT offset,
 				  unsigned HOST_WIDE_INT nbytes,
 				  unsigned lenrange[3], bool *nulterm,
@@ -4566,7 +4566,7 @@ strlen_pass::count_nonzero_bytes (tree exp, gimple *stmt,
 	     exact value is not known) recurse once to set the range
 	     for an arbitrary constant.  */
 	  exp = build_int_cst (type, 1);
-	  return count_nonzero_bytes (exp, stmt,
+	  return count_nonzero_bytes (exp, vuse, stmt,
 				      offset, 1, lenrange,
 				      nulterm, allnul, allnonnul, snlim);
 	}
@@ -4574,6 +4574,9 @@ strlen_pass::count_nonzero_bytes (tree exp, gimple *stmt,
       gimple *stmt = SSA_NAME_DEF_STMT (exp);
       if (gimple_assign_single_p (stmt))
 	{
+	  /* Do not look across other stores.  */
+	  if (gimple_vuse (stmt) != vuse)
+	    return false;
 	  exp = gimple_assign_rhs1 (stmt);
 	  if (!DECL_P (exp)
 	      && TREE_CODE (exp) != CONSTRUCTOR
@@ -4594,7 +4597,7 @@ strlen_pass::count_nonzero_bytes (tree exp, gimple *stmt,
 	  for (unsigned i = 0; i != n; i++)
 	    {
 	      tree def = gimple_phi_arg_def (stmt, i);
-	      if (!count_nonzero_bytes (def, stmt,
+	      if (!count_nonzero_bytes (def, vuse, stmt,
 					offset, nbytes, lenrange, nulterm,
 					allnul, allnonnul, snlim))
 		return false;
@@ -4652,7 +4655,7 @@ strlen_pass::count_nonzero_bytes (tree exp, gimple *stmt,
 	return false;
 
       /* Handle MEM_REF = SSA_NAME types of assignments.  */
-      return count_nonzero_bytes_addr (arg, stmt,
+      return count_nonzero_bytes_addr (arg, vuse, stmt,
 				       offset, nbytes, lenrange, nulterm,
 				       allnul, allnonnul, snlim);
     }
@@ -4765,7 +4768,7 @@ strlen_pass::count_nonzero_bytes (tree exp, gimple *stmt,
    bytes that are pointed to by EXP, which should be a pointer.  */
 
 bool
-strlen_pass::count_nonzero_bytes_addr (tree exp, gimple *stmt,
+strlen_pass::count_nonzero_bytes_addr (tree exp, tree vuse, gimple *stmt,
 				       unsigned HOST_WIDE_INT offset,
 				       unsigned HOST_WIDE_INT nbytes,
 				       unsigned lenrange[3], bool *nulterm,
@@ -4835,7 +4838,7 @@ strlen_pass::count_nonzero_bytes_addr (tree exp, gimple *stmt,
     }
 
   if (TREE_CODE (exp) == ADDR_EXPR)
-    return count_nonzero_bytes (TREE_OPERAND (exp, 0), stmt,
+    return count_nonzero_bytes (TREE_OPERAND (exp, 0), vuse, stmt,
 				offset, nbytes,
 				lenrange, nulterm, allnul, allnonnul, snlim);
 
@@ -4855,7 +4858,7 @@ strlen_pass::count_nonzero_bytes_addr (tree exp, gimple *stmt,
 	  for (unsigned i = 0; i != n; i++)
 	    {
 	      tree def = gimple_phi_arg_def (stmt, i);
-	      if (!count_nonzero_bytes_addr (def, stmt,
+	      if (!count_nonzero_bytes_addr (def, NULL_TREE, stmt,
 					     offset, nbytes, lenrange,
 					     nulterm, allnul, allnonnul,
 					     snlim))
@@ -4903,7 +4906,7 @@ strlen_pass::count_nonzero_bytes (tree expr_or_type, gimple *stmt,
 
   ssa_name_limit_t snlim;
   tree expr = expr_or_type;
-  return count_nonzero_bytes (expr, stmt,
+  return count_nonzero_bytes (expr, gimple_vuse (stmt), stmt,
 			      0, 0, lenrange, nulterm, allnul, allnonnul,
 			      snlim);
 }
-- 
2.35.3

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-10-10 13:41 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20231010104908.839613857C41@sourceware.org>
2023-10-10 11:44 ` [PATCH] tree-optimization/111519 - strlen optimization skips clobbering store Jakub Jelinek
2023-10-10 11:59   ` Richard Biener
2023-10-10 13:26     ` Jakub Jelinek
2023-10-10 13:41       ` Richard Biener
2023-10-10 10:49 Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).