public inbox for gcc-patches@gcc.gnu.org
* folding (vec_)cond_expr in a binary operation
@ 2013-07-06 19:42 Marc Glisse
  2013-08-30  9:38 ` Richard Biener
  0 siblings, 1 reply; 6+ messages in thread
From: Marc Glisse @ 2013-07-06 19:42 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3449 bytes --]

Hello,

The first attached patch does not bootstrap and has at least two main 
issues.  The second patch does pass bootstrap+testsuite, but I liked 
the first one more...

First, the fold-const bit causes an assertion failure (building libjava) 
in combine_cond_expr_cond, which calls:

   t = fold_binary_loc (gimple_location (stmt), code, type, op0, op1);

and then checks:

   /* Require that we got a boolean type out if we put one in.  */
   gcc_assert (TREE_CODE (TREE_TYPE (t)) == TREE_CODE (type));

which makes sense... Note that the two relevant types are:

(gdb) call debug_tree((tree)0x7ffff5e78930)
  <integer_type 0x7ffff5e78930 jboolean asm_written public unsigned type_3 QI
     size <integer_cst 0x7ffff64dcfa0 type <integer_type 0x7ffff64f60a8 bitsizetype> constant 8>
     unit size <integer_cst 0x7ffff64dcfc0 type <integer_type 0x7ffff64f6000 sizetype> constant 1>
     align 8 symtab -169408800 alias set -1 canonical type 0x7ffff6635c78 precision 1 min <integer_cst 0x7ffff66321e0 0> max <integer_cst 0x7ffff6632200 1>
     pointer_to_this <pointer_type 0x7ffff5a1e540> chain <integer_type 0x7ffff6635dc8>>
(gdb) call debug_tree(type)
  <boolean_type 0x7ffff64f69d8 bool sizes-gimplified asm_written public unsigned QI
     size <integer_cst 0x7ffff64dcfa0 type <integer_type 0x7ffff64f60a8 bitsizetype> constant 8>
     unit size <integer_cst 0x7ffff64dcfc0 type <integer_type 0x7ffff64f6000 sizetype> constant 1>
     align 8 symtab -170348640 alias set 19 canonical type 0x7ffff64f69d8 precision 1 min <integer_cst 0x7ffff64f92a0 0> max <integer_cst 0x7ffff64f92e0 1>
     pointer_to_this <pointer_type 0x7ffff5912dc8>>

I need to relax the conditions on the vec?-1:0 optimization because with 
vectors we often end up with several types that are equivalent. Can you 
think of a good way to do it? In the second patch I limit the 
transformation to vectors and hope it doesn't cause as much trouble 
there...



Second, because of the way the forwprop transformation is done, it can 
be necessary to run the forwprop pass several times in a row, which is 
not nice.  It takes:

stmt_cond
stmt_binop

and produces:

stmt_folded
stmt_cond2

But one of the arguments of stmt_folded could be an ssa_name defined by 
a cond_expr, which could now be propagated into stmt_folded (there may 
be other newly possible transformations).  However, that cond_expr has 
already been visited and won't be visited again.  The combine part of 
the pass uses a PLF to revisit statements, but the forwprop part doesn't 
have any such mechanism.

The workarounds I can think of are:
- manually check if some of the arguments of stmt_folded are cond_expr and 
recursively call the function on them;
- move the transformation to the combine part of the pass;
- set up some system to revisit all earlier statements?

I went with the second option in the second patch, but this limitation of 
forward propagation seems strange to me.


(s/combine_binop_with_condition/forward_propagate_condition/ for the first 
patch)

2013-07-06  Marc Glisse  <marc.glisse@inria.fr>

 	PR tree-optimization/57755
gcc/
 	* tree-ssa-forwprop.c (combine_binop_with_condition): New function.
 	(ssa_forward_propagate_and_combine): Call it.
 	* fold-const.c (fold_ternary_loc): In gimple form, don't check for
 	type equality.

gcc/testsuite/
 	* g++.dg/tree-ssa/pr57755.C: New testcase.

(this message does not supersede 
http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01624.html )

-- 
Marc Glisse

[-- Attachment #2: Type: TEXT/PLAIN, Size: 7373 bytes --]

Index: fold-const.c
===================================================================
--- fold-const.c	(revision 200736)
+++ fold-const.c	(working copy)
@@ -14124,21 +14124,23 @@ fold_ternary_loc (location_t loc, enum t
 
       /* Convert A ? 1 : 0 to simply A.  */
       if ((code == VEC_COND_EXPR ? integer_all_onesp (op1)
 				 : (integer_onep (op1)
 				    && !VECTOR_TYPE_P (type)))
 	  && integer_zerop (op2)
 	  /* If we try to convert OP0 to our type, the
 	     call to fold will try to move the conversion inside
 	     a COND, which will recurse.  In that case, the COND_EXPR
 	     is probably the best choice, so leave it alone.  */
-	  && type == TREE_TYPE (arg0))
+	  && (type == TREE_TYPE (arg0)
+	      || (in_gimple_form
+		  && useless_type_conversion_p (type, TREE_TYPE (arg0)))))
 	return pedantic_non_lvalue_loc (loc, arg0);
 
       /* Convert A ? 0 : 1 to !A.  This prefers the use of NOT_EXPR
 	 over COND_EXPR in cases such as floating point comparisons.  */
       if (integer_zerop (op1)
 	  && (code == VEC_COND_EXPR ? integer_all_onesp (op2)
 				    : (integer_onep (op2)
 				       && !VECTOR_TYPE_P (type)))
 	  && truth_value_p (TREE_CODE (arg0)))
 	return pedantic_non_lvalue_loc (loc,
Index: tree-ssa-forwprop.c
===================================================================
--- tree-ssa-forwprop.c	(revision 200736)
+++ tree-ssa-forwprop.c	(working copy)
@@ -1135,20 +1135,135 @@ forward_propagate_comparison (gimple_stm
 
   /* Remove defining statements.  */
   return remove_prop_source_from_use (name);
 
 bailout:
   gsi_next (defgsi);
   return false;
 }
 
 
+/* Forward propagate the condition defined in *DEFGSI to uses in
+   binary operations.
+   Returns true if stmt is now unused.  Advance DEFGSI to the next
+   statement.  */
+
+static bool
+forward_propagate_condition (gimple_stmt_iterator *defgsi)
+{
+  gimple stmt = gsi_stmt (*defgsi);
+  tree name = gimple_assign_lhs (stmt);
+  enum tree_code defcode = gimple_assign_rhs_code (stmt);
+  gimple_stmt_iterator gsi;
+  enum tree_code code;
+  tree lhs, type;
+  gimple use_stmt;
+
+  /* Don't propagate ssa names that occur in abnormal phis.  */
+  if ((TREE_CODE (gimple_assign_rhs1 (stmt)) == SSA_NAME
+       && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_assign_rhs1 (stmt)))
+      || (TREE_CODE (gimple_assign_rhs2 (stmt)) == SSA_NAME
+        && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_assign_rhs2 (stmt)))
+      || (TREE_CODE (gimple_assign_rhs3 (stmt)) == SSA_NAME
+        && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_assign_rhs3 (stmt))))
+    goto bailout;
+
+  /* Do not un-cse, but propagate through copies.  */
+  use_stmt = get_prop_dest_stmt (name, &name);
+  if (!use_stmt
+      || !is_gimple_assign (use_stmt)
+      || gimple_has_side_effects (stmt)
+      || gimple_has_side_effects (use_stmt))
+    goto bailout;
+
+  gsi = gsi_for_stmt (use_stmt);
+  code = gimple_assign_rhs_code (use_stmt);
+  lhs = gimple_assign_lhs (use_stmt);
+  type = TREE_TYPE (lhs);
+
+  if (TREE_CODE_CLASS (code) == tcc_binary
+      || TREE_CODE_CLASS (code) == tcc_comparison)
+    {
+      bool cond_is_left = false;
+      bool trueok = false;
+      bool falseok = false;
+      if (name == gimple_assign_rhs1 (use_stmt))
+	cond_is_left = true;
+      else
+	gcc_assert (name == gimple_assign_rhs2 (use_stmt));
+      tree true_op, false_op;
+      if (cond_is_left)
+	{
+	  true_op = fold_build2_loc (gimple_location (stmt), code, type,
+				     gimple_assign_rhs2 (stmt),
+				     gimple_assign_rhs2 (use_stmt));
+	  false_op = fold_build2_loc (gimple_location (stmt), code, type,
+				      gimple_assign_rhs3 (stmt),
+				      gimple_assign_rhs2 (use_stmt));
+	}
+      else
+	{
+	  true_op = fold_build2_loc (gimple_location (stmt), code, type,
+				     gimple_assign_rhs1 (use_stmt),
+				     gimple_assign_rhs2 (stmt));
+	  false_op = fold_build2_loc (gimple_location (stmt), code, type,
+				      gimple_assign_rhs1 (use_stmt),
+				      gimple_assign_rhs3 (stmt));
+	}
+
+      if (is_gimple_val (true_op))
+	trueok = true;
+      if (is_gimple_val (false_op))
+	falseok = true;
+      /* At least one of them has to simplify, or it isn't worth it.  */
+      if (!trueok && !falseok)
+	goto bailout;
+      if (!trueok)
+	{
+	  if (!valid_gimple_rhs_p (true_op))
+	    goto bailout;
+	  gimple g = gimple_build_assign (make_ssa_name (type, NULL), true_op);
+	  true_op = gimple_assign_lhs (g);
+	  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
+	}
+      if (!falseok)
+	{
+	  if (!valid_gimple_rhs_p (false_op))
+	    goto bailout;
+	  gimple g = gimple_build_assign (make_ssa_name (type, NULL), false_op);
+	  false_op = gimple_assign_lhs (g);
+	  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
+	}
+
+      gimple_assign_set_rhs_with_ops_1 (&gsi, defcode,
+					gimple_assign_rhs1 (stmt),
+					true_op, false_op);
+    }
+  else
+    goto bailout;
+
+  fold_stmt (&gsi);
+  update_stmt (gsi_stmt (gsi));
+
+  /* When we remove stmt now the iterator defgsi goes off its current
+     sequence, hence advance it now.  */
+  gsi_next (defgsi);
+
+  /* Remove defining statements.  */
+  return remove_prop_source_from_use (name);
+
+bailout:
+  gsi_next (defgsi);
+  return false;
+}
+
+
 /* GSI_P points to a statement which performs a narrowing integral
    conversion.
 
    Look for cases like:
 
      t = x & c;
      y = (T) t;
 
    Turn them into:
 
@@ -3382,20 +3497,25 @@ ssa_forward_propagate_and_combine (void)
 		    gsi_next (&gsi);
 		}
 	      else
 		gsi_next (&gsi);
 	    }
 	  else if (TREE_CODE_CLASS (code) == tcc_comparison)
 	    {
 	      if (forward_propagate_comparison (&gsi))
 	        cfg_changed = true;
 	    }
+	  else if (code == COND_EXPR || code == VEC_COND_EXPR)
+	    {
+	      if (forward_propagate_condition (&gsi))
+	        cfg_changed = true;
+	    }
 	  else
 	    gsi_next (&gsi);
 	}
 
       /* Combine stmts with the stmts defining their operands.
 	 Note we update GSI within the loop as necessary.  */
       for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
 	{
 	  gimple stmt = gsi_stmt (gsi);
 	  bool changed = false;
Index: testsuite/g++.dg/tree-ssa/pr57755.c
===================================================================
--- testsuite/g++.dg/tree-ssa/pr57755.c	(revision 0)
+++ testsuite/g++.dg/tree-ssa/pr57755.c	(revision 0)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-forwprop1" } */
+typedef int vec __attribute__((vector_size(4*sizeof(int))));
+
+vec f(vec a,vec b){
+  vec z=a!=a;
+  vec o=z+1;
+  vec c=(a>3)?o:z;
+  vec d=(b>2)?o:z;
+  vec e=c&d;
+  return e!=0;
+}
+
+vec g(vec a,vec b){
+  vec z=a!=a;
+  vec c=(a<=42)?b:z;
+  return b&c;
+}
+
+/* { dg-final { scan-tree-dump-times "VEC_COND_EXPR" 2 "forwprop1" } } */
+/* { dg-final { scan-tree-dump-not "!=" "forwprop1" } } */
+/* { dg-final { scan-tree-dump-not "&" "forwprop1" } } */
+/* { dg-final { cleanup-tree-dump "forwprop1" } } */

Property changes on: testsuite/g++.dg/tree-ssa/pr57755.c
___________________________________________________________________
Added: svn:eol-style
   + native
Added: svn:keywords
   + Author Date Id Revision URL


[-- Attachment #3: Type: TEXT/PLAIN, Size: 7813 bytes --]

Index: testsuite/g++.dg/tree-ssa/pr57755.C
===================================================================
--- testsuite/g++.dg/tree-ssa/pr57755.C	(revision 0)
+++ testsuite/g++.dg/tree-ssa/pr57755.C	(revision 0)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-forwprop1" } */
+typedef int vec __attribute__((vector_size(4*sizeof(int))));
+
+vec f(vec a,vec b){
+  vec z=a!=a;
+  vec o=z+1;
+  vec c=(a>3)?o:z;
+  vec d=(b>2)?o:z;
+  vec e=c&d;
+  return e!=0;
+}
+
+vec g(vec a,vec b){
+  vec z=a!=a;
+  vec c=(a<=42)?b:z;
+  return b&c;
+}
+
+/* { dg-final { scan-tree-dump-times "VEC_COND_EXPR" 2 "forwprop1" } } */
+/* { dg-final { scan-tree-dump-not "!=" "forwprop1" } } */
+/* { dg-final { scan-tree-dump-not "&" "forwprop1" } } */
+/* { dg-final { cleanup-tree-dump "forwprop1" } } */

Property changes on: testsuite/g++.dg/tree-ssa/pr57755.C
___________________________________________________________________
Added: svn:keywords
   + Author Date Id Revision URL
Added: svn:eol-style
   + native

Index: fold-const.c
===================================================================
--- fold-const.c	(revision 200736)
+++ fold-const.c	(working copy)
@@ -14124,21 +14124,23 @@ fold_ternary_loc (location_t loc, enum t
 
       /* Convert A ? 1 : 0 to simply A.  */
       if ((code == VEC_COND_EXPR ? integer_all_onesp (op1)
 				 : (integer_onep (op1)
 				    && !VECTOR_TYPE_P (type)))
 	  && integer_zerop (op2)
 	  /* If we try to convert OP0 to our type, the
 	     call to fold will try to move the conversion inside
 	     a COND, which will recurse.  In that case, the COND_EXPR
 	     is probably the best choice, so leave it alone.  */
-	  && type == TREE_TYPE (arg0))
+	  && (type == TREE_TYPE (arg0)
+	      || (in_gimple_form && code == VEC_COND_EXPR
+		  && useless_type_conversion_p (type, TREE_TYPE (arg0)))))
 	return pedantic_non_lvalue_loc (loc, arg0);
 
       /* Convert A ? 0 : 1 to !A.  This prefers the use of NOT_EXPR
 	 over COND_EXPR in cases such as floating point comparisons.  */
       if (integer_zerop (op1)
 	  && (code == VEC_COND_EXPR ? integer_all_onesp (op2)
 				    : (integer_onep (op2)
 				       && !VECTOR_TYPE_P (type)))
 	  && truth_value_p (TREE_CODE (arg0)))
 	return pedantic_non_lvalue_loc (loc,
Index: tree-ssa-forwprop.c
===================================================================
--- tree-ssa-forwprop.c	(revision 200736)
+++ tree-ssa-forwprop.c	(working copy)
@@ -1135,20 +1135,131 @@ forward_propagate_comparison (gimple_stm
 
   /* Remove defining statements.  */
   return remove_prop_source_from_use (name);
 
 bailout:
   gsi_next (defgsi);
   return false;
 }
 
 
+/* Combine the binary operation defined in *GSI with one of its arguments
+   that is a condition.
+   Returns 1 if there were any changes made, 2 if cfg-cleanup needs to
+   run.  Else it returns 0.  */
+
+static int
+combine_binop_with_condition (gimple_stmt_iterator *gsi)
+{
+  gimple stmt = gsi_stmt (*gsi);
+  tree type = TREE_TYPE (gimple_assign_lhs (stmt));
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+  tree rhs2 = gimple_assign_rhs2 (stmt);
+  gimple def_stmt;
+  enum tree_code defcode;
+  bool trueok = false;
+  bool falseok = false;
+  tree true_op, false_op;
+  location_t loc = gimple_location (stmt);
+
+  if (TREE_CODE (rhs1) != SSA_NAME)
+    goto second_op;
+
+  def_stmt = get_prop_source_stmt (rhs1, true, NULL);
+  if (!def_stmt || !can_propagate_from (def_stmt))
+    goto second_op;
+
+  defcode = gimple_assign_rhs_code (def_stmt);
+  if (defcode != COND_EXPR && defcode != VEC_COND_EXPR)
+    goto second_op;
+
+  true_op = fold_build2_loc (loc, code, type, gimple_assign_rhs2 (def_stmt),
+			     gimple_assign_rhs2 (stmt));
+  false_op = fold_build2_loc (loc, code, type, gimple_assign_rhs3 (def_stmt),
+			      gimple_assign_rhs2 (stmt));
+
+  if (is_gimple_val (true_op))
+    trueok = true;
+  if (is_gimple_val (false_op))
+    falseok = true;
+  /* At least one of them has to simplify, or it isn't worth it.  */
+  if (!trueok && !falseok)
+    goto second_op;
+  if (!trueok)
+    {
+      if (!valid_gimple_rhs_p (true_op))
+	goto second_op;
+      gimple g = gimple_build_assign (make_ssa_name (type, NULL), true_op);
+      true_op = gimple_assign_lhs (g);
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+    }
+  if (!falseok)
+    {
+      if (!valid_gimple_rhs_p (false_op))
+	goto second_op;
+      gimple g = gimple_build_assign (make_ssa_name (type, NULL), false_op);
+      false_op = gimple_assign_lhs (g);
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+    }
+  goto finish;
+
+second_op:
+  if (TREE_CODE (rhs2) != SSA_NAME)
+    return 0;
+
+  def_stmt = get_prop_source_stmt (rhs2, true, NULL);
+  if (!def_stmt || !can_propagate_from (def_stmt))
+    return 0;
+
+  defcode = gimple_assign_rhs_code (def_stmt);
+  if (defcode != COND_EXPR && defcode != VEC_COND_EXPR)
+    return 0;
+
+  true_op = fold_build2_loc (loc, code, type, gimple_assign_rhs1 (stmt),
+			     gimple_assign_rhs2 (def_stmt));
+  false_op = fold_build2_loc (loc, code, type, gimple_assign_rhs1 (stmt),
+			      gimple_assign_rhs3 (def_stmt));
+
+  trueok = is_gimple_val (true_op);
+  falseok = is_gimple_val (false_op);
+  /* At least one of them has to simplify, or it isn't worth it.  */
+  if (!trueok && !falseok)
+    return 0;
+  if (!trueok)
+    {
+      if (!valid_gimple_rhs_p (true_op))
+	return 0;
+      gimple g = gimple_build_assign (make_ssa_name (type, NULL), true_op);
+      true_op = gimple_assign_lhs (g);
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+    }
+  if (!falseok)
+    {
+      if (!valid_gimple_rhs_p (false_op))
+	return 0;
+      gimple g = gimple_build_assign (make_ssa_name (type, NULL), false_op);
+      false_op = gimple_assign_lhs (g);
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+    }
+finish:
+  gimple_assign_set_rhs_with_ops_1 (gsi, defcode, gimple_assign_rhs1 (def_stmt),
+				    true_op, false_op);
+
+  fold_stmt (gsi);
+  update_stmt (gsi_stmt (*gsi));
+
+  /* Remove defining statements.  */
+  return remove_prop_source_from_use (gimple_assign_lhs (def_stmt)) + 1;
+}
+
+
 /* GSI_P points to a statement which performs a narrowing integral
    conversion.
 
    Look for cases like:
 
      t = x & c;
      y = (T) t;
 
    Turn them into:
 
@@ -3403,21 +3514,35 @@ ssa_forward_propagate_and_combine (void)
 	  /* Mark stmt as potentially needing revisiting.  */
 	  gimple_set_plf (stmt, GF_PLF_1, false);
 
 	  switch (gimple_code (stmt))
 	    {
 	    case GIMPLE_ASSIGN:
 	      {
 		tree rhs1 = gimple_assign_rhs1 (stmt);
 		enum tree_code code = gimple_assign_rhs_code (stmt);
 
-		if ((code == BIT_NOT_EXPR
+		/* Something (associate_plusminus?) doesn't set "changed"
+		   properly, so we can't put this at the end with
+		   if (!changed && ...).  */
+		if (TREE_CODE_CLASS (code) == tcc_binary
+		    || TREE_CODE_CLASS (code) == tcc_comparison)
+		  {
+		    int did_something;
+		    did_something = combine_binop_with_condition (&gsi);
+		    if (did_something == 2)
+		      cfg_changed = true;
+		    changed = did_something != 0;
+		  }
+		if (changed)
+		  ;
+		else if ((code == BIT_NOT_EXPR
 		     || code == NEGATE_EXPR)
 		    && TREE_CODE (rhs1) == SSA_NAME)
 		  changed = simplify_not_neg_expr (&gsi);
 		else if (code == COND_EXPR
 			 || code == VEC_COND_EXPR)
 		  {
 		    /* In this case the entire COND_EXPR is in rhs1. */
 		    if (forward_propagate_into_cond (&gsi)
 			|| combine_cond_exprs (&gsi))
 		      {


* Re: folding (vec_)cond_expr in a binary operation
  2013-07-06 19:42 folding (vec_)cond_expr in a binary operation Marc Glisse
@ 2013-08-30  9:38 ` Richard Biener
  2013-09-02  9:47   ` Marc Glisse
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Biener @ 2013-08-30  9:38 UTC (permalink / raw)
  To: Marc Glisse; +Cc: GCC Patches

On Sat, Jul 6, 2013 at 9:42 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> Hello,
>
> the first attached patch does not bootstrap and has at least 2 main issues.
> The second patch does pass bootstrap+testsuite, but I liked the first
> more...
>
> First, the fold-const bit causes an assertion failure (building libjava) in
> combine_cond_expr_cond, which calls:
>
>   t = fold_binary_loc (gimple_location (stmt), code, type, op0, op1);
>
> and then checks:
>
>   /* Require that we got a boolean type out if we put one in.  */
>   gcc_assert (TREE_CODE (TREE_TYPE (t)) == TREE_CODE (type));
>
> which makes sense... Note that the 2 relevant types are:

well, the check is probably old and can be relaxed to also allow all
compatible types.

> (gdb) call debug_tree((tree)0x7ffff5e78930)
>  <integer_type 0x7ffff5e78930 jboolean asm_written public unsigned type_3 QI
>     size <integer_cst 0x7ffff64dcfa0 type <integer_type 0x7ffff64f60a8
> bitsizetype> constant 8>
>     unit size <integer_cst 0x7ffff64dcfc0 type <integer_type 0x7ffff64f6000
> sizetype> constant 1>
>     align 8 symtab -169408800 alias set -1 canonical type 0x7ffff6635c78
> precision 1 min <integer_cst 0x7ffff66321e0 0> max <integer_cst
> 0x7ffff6632200 1>
>     pointer_to_this <pointer_type 0x7ffff5a1e540> chain <integer_type
> 0x7ffff6635dc8>>
> (gdb) call debug_tree(type)
>  <boolean_type 0x7ffff64f69d8 bool sizes-gimplified asm_written public
> unsigned QI
>     size <integer_cst 0x7ffff64dcfa0 type <integer_type 0x7ffff64f60a8
> bitsizetype> constant 8>
>     unit size <integer_cst 0x7ffff64dcfc0 type <integer_type 0x7ffff64f6000
> sizetype> constant 1>
>     align 8 symtab -170348640 alias set 19 canonical type 0x7ffff64f69d8
> precision 1 min <integer_cst 0x7ffff64f92a0 0> max <integer_cst
> 0x7ffff64f92e0 1>
>     pointer_to_this <pointer_type 0x7ffff5912dc8>>
>
> I need to relax the conditions on the vec?-1:0 optimization because with
> vectors we often end up with several types that are equivalent. Can you
> think of a good way to do it? In the second patch I limit the transformation
> to vectors and hope it doesn't cause as much trouble there...
>
>
>
> Second, the way the forwprop transformation is done, it can be necessary to
> run the forwprop pass several times in a row, which is not nice. It takes:
>
> stmt_cond
> stmt_binop
>
> and produces:
>
> stmt_folded
> stmt_cond2
>
> But one of the arguments of stmt_folded could be an ssa_name defined by a
> cond_expr, which could now be propagated into stmt_folded (there may be
> other new possible transformations). However, that cond_expr has already
> been visited and won't be again. The combine part of the pass uses a PLF to
> revisit statements, but the forwprop part doesn't have any specific
> mechanism.

forwprop is designed to re-process stmts it has folded to catch cascading
effects.  Now, as it has both a forward and a backward run, you don't catch
2nd-order effects without iterating them.  On my TODO list is to only do one
kind, either forward or backward propagation.

> The workarounds I can think of are:
> - manually check if some of the arguments of stmt_folded are cond_expr and
> recursively call the function on them;
> - move the transformation to the combine part of the pass;

this is it.

> - set up some system to revisit all earlier statements?
>
> I went with the second option in the second patch, but this limitation of
> forward propagation seems strange to me.
>
>
> (s/combine_binop_with_condition/forward_propagate_condition/ for the first
> patch)

Btw, as for the patch I don't like that you basically feed everything into
fold.  Yes, I know we do that for conditions because that's quite important
and nobody has re-written the condition folding as gimple pattern matcher.
I doubt that this transform is important enough to warrant another exception ;)

Richard.

> 2013-07-06  Marc Glisse  <marc.glisse@inria.fr>
>
>         PR tree-optimization/57755
> gcc/
>         * tree-ssa-forwprop.c (combine_binop_with_condition): New function.
>         (ssa_forward_propagate_and_combine): Call it.
>         * fold-const.c (fold_ternary_loc): In gimple form, don't check for
>         type equality.
>
> gcc/testsuite/
>         * g++.dg/tree-ssa/pr57755.C: New testcase.
>
> (this message does not supersede
> http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01624.html )
>
> --
> Marc Glisse
> Index: fold-const.c
> ===================================================================
> --- fold-const.c        (revision 200736)
> +++ fold-const.c        (working copy)
> @@ -14124,21 +14124,23 @@ fold_ternary_loc (location_t loc, enum t
>
>        /* Convert A ? 1 : 0 to simply A.  */
>        if ((code == VEC_COND_EXPR ? integer_all_onesp (op1)
>                                  : (integer_onep (op1)
>                                     && !VECTOR_TYPE_P (type)))
>           && integer_zerop (op2)
>           /* If we try to convert OP0 to our type, the
>              call to fold will try to move the conversion inside
>              a COND, which will recurse.  In that case, the COND_EXPR
>              is probably the best choice, so leave it alone.  */
> -         && type == TREE_TYPE (arg0))
> +         && (type == TREE_TYPE (arg0)
> +             || (in_gimple_form
> +                 && useless_type_conversion_p (type, TREE_TYPE (arg0)))))
>         return pedantic_non_lvalue_loc (loc, arg0);
>
>        /* Convert A ? 0 : 1 to !A.  This prefers the use of NOT_EXPR
>          over COND_EXPR in cases such as floating point comparisons.  */
>        if (integer_zerop (op1)
>           && (code == VEC_COND_EXPR ? integer_all_onesp (op2)
>                                     : (integer_onep (op2)
>                                        && !VECTOR_TYPE_P (type)))
>           && truth_value_p (TREE_CODE (arg0)))
>         return pedantic_non_lvalue_loc (loc,
> Index: tree-ssa-forwprop.c
> ===================================================================
> --- tree-ssa-forwprop.c (revision 200736)
> +++ tree-ssa-forwprop.c (working copy)
> @@ -1135,20 +1135,135 @@ forward_propagate_comparison (gimple_stm
>
>    /* Remove defining statements.  */
>    return remove_prop_source_from_use (name);
>
>  bailout:
>    gsi_next (defgsi);
>    return false;
>  }
>
>
> +/* Forward propagate the condition defined in *DEFGSI to uses in
> +   binary operations.
> +   Returns true if stmt is now unused.  Advance DEFGSI to the next
> +   statement.  */
> +
> +static bool
> +forward_propagate_condition (gimple_stmt_iterator *defgsi)
> +{
> +  gimple stmt = gsi_stmt (*defgsi);
> +  tree name = gimple_assign_lhs (stmt);
> +  enum tree_code defcode = gimple_assign_rhs_code (stmt);
> +  gimple_stmt_iterator gsi;
> +  enum tree_code code;
> +  tree lhs, type;
> +  gimple use_stmt;
> +
> +  /* Don't propagate ssa names that occur in abnormal phis.  */
> +  if ((TREE_CODE (gimple_assign_rhs1 (stmt)) == SSA_NAME
> +       && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_assign_rhs1 (stmt)))
> +      || (TREE_CODE (gimple_assign_rhs2 (stmt)) == SSA_NAME
> +        && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_assign_rhs2 (stmt)))
> +      || (TREE_CODE (gimple_assign_rhs3 (stmt)) == SSA_NAME
> +        && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_assign_rhs3 (stmt))))
> +    goto bailout;
> +
> +  /* Do not un-cse, but propagate through copies.  */
> +  use_stmt = get_prop_dest_stmt (name, &name);
> +  if (!use_stmt
> +      || !is_gimple_assign (use_stmt)
> +      || gimple_has_side_effects (stmt)
> +      || gimple_has_side_effects (use_stmt))
> +    goto bailout;
> +
> +  gsi = gsi_for_stmt (use_stmt);
> +  code = gimple_assign_rhs_code (use_stmt);
> +  lhs = gimple_assign_lhs (use_stmt);
> +  type = TREE_TYPE (lhs);
> +
> +  if (TREE_CODE_CLASS (code) == tcc_binary
> +      || TREE_CODE_CLASS (code) == tcc_comparison)
> +    {
> +      bool cond_is_left = false;
> +      bool trueok = false;
> +      bool falseok = false;
> +      if (name == gimple_assign_rhs1 (use_stmt))
> +       cond_is_left = true;
> +      else
> +       gcc_assert (name == gimple_assign_rhs2 (use_stmt));
> +      tree true_op, false_op;
> +      if (cond_is_left)
> +       {
> +         true_op = fold_build2_loc (gimple_location (stmt), code, type,
> +                                    gimple_assign_rhs2 (stmt),
> +                                    gimple_assign_rhs2 (use_stmt));
> +         false_op = fold_build2_loc (gimple_location (stmt), code, type,
> +                                     gimple_assign_rhs3 (stmt),
> +                                     gimple_assign_rhs2 (use_stmt));
> +       }
> +      else
> +       {
> +         true_op = fold_build2_loc (gimple_location (stmt), code, type,
> +                                    gimple_assign_rhs1 (use_stmt),
> +                                    gimple_assign_rhs2 (stmt));
> +         false_op = fold_build2_loc (gimple_location (stmt), code, type,
> +                                     gimple_assign_rhs1 (use_stmt),
> +                                     gimple_assign_rhs3 (stmt));
> +       }
> +
> +      if (is_gimple_val (true_op))
> +       trueok = true;
> +      if (is_gimple_val (false_op))
> +       falseok = true;
> +      /* At least one of them has to simplify, or it isn't worth it.  */
> +      if (!trueok && !falseok)
> +       goto bailout;
> +      if (!trueok)
> +       {
> +         if (!valid_gimple_rhs_p (true_op))
> +           goto bailout;
> +         gimple g = gimple_build_assign (make_ssa_name (type, NULL),
> true_op);
> +         true_op = gimple_assign_lhs (g);
> +         gsi_insert_before (&gsi, g, GSI_SAME_STMT);
> +       }
> +      if (!falseok)
> +       {
> +         if (!valid_gimple_rhs_p (false_op))
> +           goto bailout;
> +         gimple g = gimple_build_assign (make_ssa_name (type, NULL),
> false_op);
> +         false_op = gimple_assign_lhs (g);
> +         gsi_insert_before (&gsi, g, GSI_SAME_STMT);
> +       }
> +
> +      gimple_assign_set_rhs_with_ops_1 (&gsi, defcode,
> +                                       gimple_assign_rhs1 (stmt),
> +                                       true_op, false_op);
> +    }
> +  else
> +    goto bailout;
> +
> +  fold_stmt (&gsi);
> +  update_stmt (gsi_stmt (gsi));
> +
> +  /* When we remove stmt now the iterator defgsi goes off its current
> +     sequence, hence advance it now.  */
> +  gsi_next (defgsi);
> +
> +  /* Remove defining statements.  */
> +  return remove_prop_source_from_use (name);
> +
> +bailout:
> +  gsi_next (defgsi);
> +  return false;
> +}
> +
> +
>  /* GSI_P points to a statement which performs a narrowing integral
>     conversion.
>
>     Look for cases like:
>
>       t = x & c;
>       y = (T) t;
>
>     Turn them into:
>
> @@ -3382,20 +3497,25 @@ ssa_forward_propagate_and_combine (void)
>                     gsi_next (&gsi);
>                 }
>               else
>                 gsi_next (&gsi);
>             }
>           else if (TREE_CODE_CLASS (code) == tcc_comparison)
>             {
>               if (forward_propagate_comparison (&gsi))
>                 cfg_changed = true;
>             }
> +         else if (code == COND_EXPR || code == VEC_COND_EXPR)
> +           {
> +             if (forward_propagate_condition (&gsi))
> +               cfg_changed = true;
> +           }
>           else
>             gsi_next (&gsi);
>         }
>
>        /* Combine stmts with the stmts defining their operands.
>          Note we update GSI within the loop as necessary.  */
>        for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
>         {
>           gimple stmt = gsi_stmt (gsi);
>           bool changed = false;
> Index: testsuite/g++.dg/tree-ssa/pr57755.c
> ===================================================================
> --- testsuite/g++.dg/tree-ssa/pr57755.c (revision 0)
> +++ testsuite/g++.dg/tree-ssa/pr57755.c (revision 0)
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-forwprop1" } */
> +typedef int vec __attribute__((vector_size(4*sizeof(int))));
> +
> +vec f(vec a,vec b){
> +  vec z=a!=a;
> +  vec o=z+1;
> +  vec c=(a>3)?o:z;
> +  vec d=(b>2)?o:z;
> +  vec e=c&d;
> +  return e!=0;
> +}
> +
> +vec g(vec a,vec b){
> +  vec z=a!=a;
> +  vec c=(a<=42)?b:z;
> +  return b&c;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "VEC_COND_EXPR" 2 "forwprop1" } } */
> +/* { dg-final { scan-tree-dump-not "!=" "forwprop1" } } */
> +/* { dg-final { scan-tree-dump-not "&" "forwprop1" } } */
> +/* { dg-final { cleanup-tree-dump "forwprop1" } } */
>
> Property changes on: testsuite/g++.dg/tree-ssa/pr57755.c
> ___________________________________________________________________
> Added: svn:eol-style
>    + native
> Added: svn:keywords
>    + Author Date Id Revision URL
>
>
> Index: testsuite/g++.dg/tree-ssa/pr57755.C
> ===================================================================
> --- testsuite/g++.dg/tree-ssa/pr57755.C (revision 0)
> +++ testsuite/g++.dg/tree-ssa/pr57755.C (revision 0)
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-forwprop1" } */
> +typedef int vec __attribute__((vector_size(4*sizeof(int))));
> +
> +vec f(vec a,vec b){
> +  vec z=a!=a;
> +  vec o=z+1;
> +  vec c=(a>3)?o:z;
> +  vec d=(b>2)?o:z;
> +  vec e=c&d;
> +  return e!=0;
> +}
> +
> +vec g(vec a,vec b){
> +  vec z=a!=a;
> +  vec c=(a<=42)?b:z;
> +  return b&c;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "VEC_COND_EXPR" 2 "forwprop1" } } */
> +/* { dg-final { scan-tree-dump-not "!=" "forwprop1" } } */
> +/* { dg-final { scan-tree-dump-not "&" "forwprop1" } } */
> +/* { dg-final { cleanup-tree-dump "forwprop1" } } */
>
> Property changes on: testsuite/g++.dg/tree-ssa/pr57755.C
> ___________________________________________________________________
> Added: svn:keywords
>    + Author Date Id Revision URL
> Added: svn:eol-style
>    + native
>
> Index: fold-const.c
> ===================================================================
> --- fold-const.c        (revision 200736)
> +++ fold-const.c        (working copy)
> @@ -14124,21 +14124,23 @@ fold_ternary_loc (location_t loc, enum t
>
>        /* Convert A ? 1 : 0 to simply A.  */
>        if ((code == VEC_COND_EXPR ? integer_all_onesp (op1)
>                                  : (integer_onep (op1)
>                                     && !VECTOR_TYPE_P (type)))
>           && integer_zerop (op2)
>           /* If we try to convert OP0 to our type, the
>              call to fold will try to move the conversion inside
>              a COND, which will recurse.  In that case, the COND_EXPR
>              is probably the best choice, so leave it alone.  */
> -         && type == TREE_TYPE (arg0))
> +         && (type == TREE_TYPE (arg0)
> +             || (in_gimple_form && code == VEC_COND_EXPR
> +                 && useless_type_conversion_p (type, TREE_TYPE (arg0)))))
>         return pedantic_non_lvalue_loc (loc, arg0);
>
>        /* Convert A ? 0 : 1 to !A.  This prefers the use of NOT_EXPR
>          over COND_EXPR in cases such as floating point comparisons.  */
>        if (integer_zerop (op1)
>           && (code == VEC_COND_EXPR ? integer_all_onesp (op2)
>                                     : (integer_onep (op2)
>                                        && !VECTOR_TYPE_P (type)))
>           && truth_value_p (TREE_CODE (arg0)))
>         return pedantic_non_lvalue_loc (loc,
> Index: tree-ssa-forwprop.c
> ===================================================================
> --- tree-ssa-forwprop.c (revision 200736)
> +++ tree-ssa-forwprop.c (working copy)
> @@ -1135,20 +1135,131 @@ forward_propagate_comparison (gimple_stm
>
>    /* Remove defining statements.  */
>    return remove_prop_source_from_use (name);
>
>  bailout:
>    gsi_next (defgsi);
>    return false;
>  }
>
>
> +/* Combine the binary operation defined in *GSI with one of its arguments
> +   that is a condition.
> +   Returns 1 if there were any changes made, 2 if cfg-cleanup needs to
> +   run.  Else it returns 0.  */
> +
> +static int
> +combine_binop_with_condition (gimple_stmt_iterator *gsi)
> +{
> +  gimple stmt = gsi_stmt (*gsi);
> +  tree type = TREE_TYPE (gimple_assign_lhs (stmt));
> +  enum tree_code code = gimple_assign_rhs_code (stmt);
> +  tree rhs1 = gimple_assign_rhs1 (stmt);
> +  tree rhs2 = gimple_assign_rhs2 (stmt);
> +  gimple def_stmt;
> +  enum tree_code defcode;
> +  bool trueok = false;
> +  bool falseok = false;
> +  tree true_op, false_op;
> +  location_t loc = gimple_location (stmt);
> +
> +  if (TREE_CODE (rhs1) != SSA_NAME)
> +    goto second_op;
> +
> +  def_stmt = get_prop_source_stmt (rhs1, true, NULL);
> +  if (!def_stmt || !can_propagate_from (def_stmt))
> +    goto second_op;
> +
> +  defcode = gimple_assign_rhs_code (def_stmt);
> +  if (defcode != COND_EXPR && defcode != VEC_COND_EXPR)
> +    goto second_op;
> +
> +  true_op = fold_build2_loc (loc, code, type, gimple_assign_rhs2 (def_stmt),
> +                            gimple_assign_rhs2 (stmt));
> +  false_op = fold_build2_loc (loc, code, type, gimple_assign_rhs3 (def_stmt),
> +                             gimple_assign_rhs2 (stmt));
> +
> +  if (is_gimple_val (true_op))
> +    trueok = true;
> +  if (is_gimple_val (false_op))
> +    falseok = true;
> +  /* At least one of them has to simplify, or it isn't worth it.  */
> +  if (!trueok && !falseok)
> +    goto second_op;
> +  if (!trueok)
> +    {
> +      if (!valid_gimple_rhs_p (true_op))
> +       goto second_op;
> +      gimple g = gimple_build_assign (make_ssa_name (type, NULL), true_op);
> +      true_op = gimple_assign_lhs (g);
> +      gsi_insert_before (gsi, g, GSI_SAME_STMT);
> +    }
> +  if (!falseok)
> +    {
> +      if (!valid_gimple_rhs_p (false_op))
> +       goto second_op;
> +      gimple g = gimple_build_assign (make_ssa_name (type, NULL), false_op);
> +      false_op = gimple_assign_lhs (g);
> +      gsi_insert_before (gsi, g, GSI_SAME_STMT);
> +    }
> +  goto finish;
> +
> +second_op:
> +  if (TREE_CODE (rhs2) != SSA_NAME)
> +    return 0;
> +
> +  def_stmt = get_prop_source_stmt (rhs2, true, NULL);
> +  if (!def_stmt || !can_propagate_from (def_stmt))
> +    return 0;
> +
> +  defcode = gimple_assign_rhs_code (def_stmt);
> +  if (defcode != COND_EXPR && defcode != VEC_COND_EXPR)
> +    return 0;
> +
> +  true_op = fold_build2_loc (loc, code, type, gimple_assign_rhs1 (stmt),
> +                            gimple_assign_rhs2 (def_stmt));
> +  false_op = fold_build2_loc (loc, code, type, gimple_assign_rhs1 (stmt),
> +                             gimple_assign_rhs3 (def_stmt));
> +
> +  trueok = is_gimple_val (true_op);
> +  falseok = is_gimple_val (false_op);
> +  /* At least one of them has to simplify, or it isn't worth it.  */
> +  if (!trueok && !falseok)
> +    return 0;
> +  if (!trueok)
> +    {
> +      if (!valid_gimple_rhs_p (true_op))
> +       return 0;
> +      gimple g = gimple_build_assign (make_ssa_name (type, NULL), true_op);
> +      true_op = gimple_assign_lhs (g);
> +      gsi_insert_before (gsi, g, GSI_SAME_STMT);
> +    }
> +  if (!falseok)
> +    {
> +      if (!valid_gimple_rhs_p (false_op))
> +       return 0;
> +      gimple g = gimple_build_assign (make_ssa_name (type, NULL), false_op);
> +      false_op = gimple_assign_lhs (g);
> +      gsi_insert_before (gsi, g, GSI_SAME_STMT);
> +    }
> +finish:
> +  gimple_assign_set_rhs_with_ops_1 (gsi, defcode, gimple_assign_rhs1 (def_stmt),
> +                                   true_op, false_op);
> +
> +  fold_stmt (gsi);
> +  update_stmt (gsi_stmt (*gsi));
> +
> +  /* Remove defining statements.  */
> +  return remove_prop_source_from_use (gimple_assign_lhs (def_stmt)) + 1;
> +}
> +
> +
>  /* GSI_P points to a statement which performs a narrowing integral
>     conversion.
>
>     Look for cases like:
>
>       t = x & c;
>       y = (T) t;
>
>     Turn them into:
>
> @@ -3403,21 +3514,35 @@ ssa_forward_propagate_and_combine (void)
>           /* Mark stmt as potentially needing revisiting.  */
>           gimple_set_plf (stmt, GF_PLF_1, false);
>
>           switch (gimple_code (stmt))
>             {
>             case GIMPLE_ASSIGN:
>               {
>                 tree rhs1 = gimple_assign_rhs1 (stmt);
>                 enum tree_code code = gimple_assign_rhs_code (stmt);
>
> -               if ((code == BIT_NOT_EXPR
> +               /* Something (associate_plusminus?) doesn't set "changed"
> +                  properly, so we can't put this at the end with
> +                  if (!changed && ...).  */
> +               if (TREE_CODE_CLASS (code) == tcc_binary
> +                   || TREE_CODE_CLASS (code) == tcc_comparison)
> +                 {
> +                   int did_something;
> +                   did_something = combine_binop_with_condition (&gsi);
> +                   if (did_something == 2)
> +                     cfg_changed = true;
> +                   changed = did_something != 0;
> +                 }
> +               if (changed)
> +                 ;
> +               else if ((code == BIT_NOT_EXPR
>                      || code == NEGATE_EXPR)
>                     && TREE_CODE (rhs1) == SSA_NAME)
>                   changed = simplify_not_neg_expr (&gsi);
>                 else if (code == COND_EXPR
>                          || code == VEC_COND_EXPR)
>                   {
>                     /* In this case the entire COND_EXPR is in rhs1. */
>                     if (forward_propagate_into_cond (&gsi)
>                         || combine_cond_exprs (&gsi))
>                       {
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: folding (vec_)cond_expr in a binary operation
  2013-08-30  9:38 ` Richard Biener
@ 2013-09-02  9:47   ` Marc Glisse
  2013-09-03 10:41     ` Richard Biener
  0 siblings, 1 reply; 6+ messages in thread
From: Marc Glisse @ 2013-09-02  9:47 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

On Fri, 30 Aug 2013, Richard Biener wrote:

> On Sat, Jul 6, 2013 at 9:42 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> First, the fold-const bit causes an assertion failure (building libjava) in
>> combine_cond_expr_cond, which calls:
>>
>>   t = fold_binary_loc (gimple_location (stmt), code, type, op0, op1);
>>
>> and then checks:
>>
>>   /* Require that we got a boolean type out if we put one in.  */
>>   gcc_assert (TREE_CODE (TREE_TYPE (t)) == TREE_CODE (type));
>>
>> which makes sense... Note that the 2 relevant types are:
>
> well, the check is probably old and can be relaxed to also allow all
> compatible types.

Ok. Maybe it could even be removed then; we shouldn't check, in every 
function that calls fold_binary_loc, that it returns something of the type 
that was asked for.

>> Second, the way the forwprop transformation is done, it can be necessary to
>> run the forwprop pass several times in a row, which is not nice. It takes:
>>
>> stmt_cond
>> stmt_binop
>>
>> and produces:
>>
>> stmt_folded
>> stmt_cond2
>>
>> But one of the arguments of stmt_folded could be an ssa_name defined by a
>> cond_expr, which could now be propagated into stmt_folded (there may be
>> other new possible transformations). However, that cond_expr has already
>> been visited and won't be again. The combine part of the pass uses a PLF to
>> revisit statements, but the forwprop part doesn't have any specific
>> mechanism.
>
> forwprop is designed to re-process stmts it has folded to catch cascading
> effects.  Now, as it has both a forward and a backward run, you don't catch
> 2nd-order effects by iterating them.  On my TODO is to only do one kind,
> either forward or backward propagation.

My impression is that even internally in the forward part, the
reprocessing doesn't really work, but that'll disappear anyway when the
forward part disappears.

> Btw, as for the patch I don't like that you basically feed everything into
> fold.  Yes, I know we do that for conditions because that's quite important
> and nobody has re-written the condition folding as gimple pattern matcher.
> I doubt that this transform is important enough to warrant another 
> exception ;)

I am not sure what you want here. When I notice the pattern (a?b:c) op d, 
I need to check whether b op d and c op d are likely to simplify before 
transforming it to a?(b op d):(c op d). And currently I can't see any way 
to do that other than feeding (b op d) to fold. Even if/when we get a 
gimple folder, we will still need to call it in about the same way.

-- 
Marc Glisse


* Re: folding (vec_)cond_expr in a binary operation
  2013-09-02  9:47   ` Marc Glisse
@ 2013-09-03 10:41     ` Richard Biener
  2013-09-03 13:38       ` Marc Glisse
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Biener @ 2013-09-03 10:41 UTC (permalink / raw)
  To: Marc Glisse; +Cc: GCC Patches

On Mon, Sep 2, 2013 at 11:46 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Fri, 30 Aug 2013, Richard Biener wrote:
>
>> On Sat, Jul 6, 2013 at 9:42 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>
>>> First, the fold-const bit causes an assertion failure (building libjava)
>>> in
>>> combine_cond_expr_cond, which calls:
>>>
>>>   t = fold_binary_loc (gimple_location (stmt), code, type, op0, op1);
>>>
>>> and then checks:
>>>
>>>   /* Require that we got a boolean type out if we put one in.  */
>>>   gcc_assert (TREE_CODE (TREE_TYPE (t)) == TREE_CODE (type));
>>>
>>> which makes sense... Note that the 2 relevant types are:
>>
>>
>> well, the check is probably old and can be relaxed to also allow all
>> compatible types.
>
>
> Ok. Maybe it could even be removed then; we shouldn't check, in every
> function, that calls fold_binary_loc, that it returns something of the type
> that was asked for.
>
>
>>> Second, the way the forwprop transformation is done, it can be necessary
>>> to
>>> run the forwprop pass several times in a row, which is not nice. It
>>> takes:
>>>
>>> stmt_cond
>>> stmt_binop
>>>
>>> and produces:
>>>
>>> stmt_folded
>>> stmt_cond2
>>>
>>> But one of the arguments of stmt_folded could be an ssa_name defined by a
>>> cond_expr, which could now be propagated into stmt_folded (there may be
>>> other new possible transformations). However, that cond_expr has already
>>> been visited and won't be again. The combine part of the pass uses a PLF
>>> to
>>> revisit statements, but the forwprop part doesn't have any specific
>>> mechanism.
>>
>>
>> forwprop is designed to re-process stmts it has folded to catch cascading
>> effects.  Now, as it has both a forward and a backward run, you don't catch
>> 2nd-order effects by iterating them.  On my TODO is to only do one kind,
>> either forward or backward propagation.
>
>
> My impression is that even internally in the forward part, the
> reprocessing doesn't really work, but that'll disappear anyway when the
> forward part disappears.
>
>
>> Btw, as for the patch I don't like that you basically feed everything into
>> fold.  Yes, I know we do that for conditions because that's quite
>> important
>> and nobody has re-written the condition folding as gimple pattern matcher.
>> I doubt that this transform is important enough to warrant another
>> exception ;)
>
>
> I am not sure what you want here. When I notice the pattern (a?b:c) op d, I
> need to check whether b op d and c op d are likely to simplify before
> transforming it to a?(b op d):(c op d). And currently I can't see any way to
> do that other than feeding (b op d) to fold. Even if/when we get a gimple
> folder, we will still need to call it in about the same way.

With a gimple folder you'd avoid building trees.  You are testing for
a quite complex pattern here - (a?b:c) & (d?e:f).  It seems to me that,
for example,

+  vec c=(a>3)?o:z;
+  vec d=(b>2)?o:z;
+  vec e=c&d;

would be better suited in the combine phase (you are interested in
combining the comparisons).  That is, look for a [&|^] b where
a and b are [VEC_]COND_EXPRs with the same values.  Heh,
and it's not obvious to me from looking for a minute what this
should be simplified to ... (so the code and the testcase need some
comments on what you are trying to catch ...)

Richard.



> --
> Marc Glisse


* Re: folding (vec_)cond_expr in a binary operation
  2013-09-03 10:41     ` Richard Biener
@ 2013-09-03 13:38       ` Marc Glisse
  2013-09-11 11:38         ` Marc Glisse
  0 siblings, 1 reply; 6+ messages in thread
From: Marc Glisse @ 2013-09-03 13:38 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2856 bytes --]

(attaching the latest version. I only added comments since regtesting it)

On Tue, 3 Sep 2013, Richard Biener wrote:

>>> Btw, as for the patch I don't like that you basically feed everything into
>>> fold.  Yes, I know we do that for conditions because that's quite
>>> important
>>> and nobody has re-written the condition folding as gimple pattern matcher.
>>> I doubt that this transform is important enough to warrant another
>>> exception ;)
>>
>>
>> I am not sure what you want here. When I notice the pattern (a?b:c) op d, I
>> need to check whether b op d and c op d are likely to simplify before
>> transforming it to a?(b op d):(c op d). And currently I can't see any way to
>> do that other than feeding (b op d) to fold. Even if/when we get a gimple
>> folder, we will still need to call it in about the same way.
>
> With a gimple folder you'd avoid building trees.

Ah, so the problem is the cost of building those 2 trees? It will indeed
be better when we can avoid it, but I really don't expect the cost to be
that much...

> You are testing for
> a quite complex pattern here - (a?b:c) & (d?e:f).

But it is really handled in several steps. IIRC:
(a?1:0)&x becomes a?(x&1):0.
Since x is d?1:0, x&1 becomes d?1:0.
a?(d?1:0):0 is not (yet?) simplified further.

Now if we compare to 0: a?(d?1:0):0 != 0 simplifies to a?(d?1:0)!=0:0
then a?(d?-1:0):0 and finally a?d:0.

Each step can also trigger individually.

> It seems to me that,
> for example,
>
> +  vec c=(a>3)?o:z;
> +  vec d=(b>2)?o:z;
> +  vec e=c&d;
>
> would be better suited in the combine phase (you are interested in
> combining the comparisons).  That is, look for a [&|^] b where
> a and b are [VEC_]COND_EXPRs with the same values.

Hmm, I am already looking for a binary op which has at least one operand
which is a [VEC_]COND_EXPR, in the combine (=backward) part of
tree-ssa-forwprop.  Isn't that almost what you are suggesting?

> Heh, and it's not obvious to me from looking for a minute what this 
> should be simplified to ...

(a?x:y)&(b?x:y) doesn't really simplify in general.

> (so the code and the testcase need some
> comments on what you are trying to catch ...)

a<b vectorizes to (a<b)?1:0. If we compute & and | of conditions, we end
up with & and | of vec_cond_expr, and it seems preferable to clean that
up, so ((a<b)?1:0)&((c<d)?1:0) would become ((a<b)&(c<d))?1:0. I don't
quite produce (a<b)&(c<d) but (a<b)?(c<d):0 instead, I guess the last
step (replacing vec_cond_expr with and_expr) would be easy to do at
expansion time if it isn't already. It could also be done earlier, but
we want to be careful since fold_binary_op_with_conditional_arg had
related infinite loops in the past.

This transformation is basically a gimple version of
fold_binary_op_with_conditional_arg, so it applies more widely than just
this vector example.

-- 
Marc Glisse

[-- Attachment #2: Type: TEXT/PLAIN, Size: 8961 bytes --]

Index: tree-ssa-forwprop.c
===================================================================
--- tree-ssa-forwprop.c	(revision 202185)
+++ tree-ssa-forwprop.c	(working copy)
@@ -363,23 +363,20 @@ combine_cond_expr_cond (gimple stmt, enu
   gcc_assert (TREE_CODE_CLASS (code) == tcc_comparison);
 
   fold_defer_overflow_warnings ();
   t = fold_binary_loc (gimple_location (stmt), code, type, op0, op1);
   if (!t)
     {
       fold_undefer_overflow_warnings (false, NULL, 0);
       return NULL_TREE;
     }
 
-  /* Require that we got a boolean type out if we put one in.  */
-  gcc_assert (TREE_CODE (TREE_TYPE (t)) == TREE_CODE (type));
-
   /* Canonicalize the combined condition for use in a COND_EXPR.  */
   t = canonicalize_cond_expr_cond (t);
 
   /* Bail out if we required an invariant but didn't get one.  */
   if (!t || (invariant_only && !is_gimple_min_invariant (t)))
     {
       fold_undefer_overflow_warnings (false, NULL, 0);
       return NULL_TREE;
     }
 
@@ -1135,20 +1132,131 @@ forward_propagate_comparison (gimple_stm
 
   /* Remove defining statements.  */
   return remove_prop_source_from_use (name);
 
 bailout:
   gsi_next (defgsi);
   return false;
 }
 
 
+/* Combine the binary operation defined in *GSI with one of its arguments
+   that is a condition, i.e. (A ? B : C) op D becomes A ? (B op D) : (C op D).
+   Returns 1 if there were any changes made, 2 if cfg-cleanup needs to
+   run.  Else it returns 0.  */
+
+static int
+combine_binop_with_condition (gimple_stmt_iterator *gsi)
+{
+  gimple stmt = gsi_stmt (*gsi);
+  tree type = TREE_TYPE (gimple_assign_lhs (stmt));
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+  tree rhs2 = gimple_assign_rhs2 (stmt);
+  gimple def_stmt;
+  enum tree_code defcode;
+  bool trueok = false;
+  bool falseok = false;
+  tree true_op, false_op;
+  location_t loc = gimple_location (stmt);
+
+  if (TREE_CODE (rhs1) != SSA_NAME)
+    goto second_op;
+
+  def_stmt = get_prop_source_stmt (rhs1, true, NULL);
+  if (!def_stmt || !can_propagate_from (def_stmt))
+    goto second_op;
+
+  defcode = gimple_assign_rhs_code (def_stmt);
+  if (defcode != COND_EXPR && defcode != VEC_COND_EXPR)
+    goto second_op;
+
+  true_op = fold_build2_loc (loc, code, type, gimple_assign_rhs2 (def_stmt),
+			     gimple_assign_rhs2 (stmt));
+  false_op = fold_build2_loc (loc, code, type, gimple_assign_rhs3 (def_stmt),
+			      gimple_assign_rhs2 (stmt));
+
+  if (is_gimple_val (true_op))
+    trueok = true;
+  if (is_gimple_val (false_op))
+    falseok = true;
+  /* At least one of them has to simplify, or it isn't worth it.  */
+  if (!trueok && !falseok)
+    goto second_op;
+  if (!trueok)
+    {
+      if (!valid_gimple_rhs_p (true_op))
+	goto second_op;
+      gimple g = gimple_build_assign (make_ssa_name (type, NULL), true_op);
+      true_op = gimple_assign_lhs (g);
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+    }
+  if (!falseok)
+    {
+      if (!valid_gimple_rhs_p (false_op))
+	goto second_op;
+      gimple g = gimple_build_assign (make_ssa_name (type, NULL), false_op);
+      false_op = gimple_assign_lhs (g);
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+    }
+  goto finish;
+
+second_op:
+  if (TREE_CODE (rhs2) != SSA_NAME)
+    return 0;
+
+  def_stmt = get_prop_source_stmt (rhs2, true, NULL);
+  if (!def_stmt || !can_propagate_from (def_stmt))
+    return 0;
+
+  defcode = gimple_assign_rhs_code (def_stmt);
+  if (defcode != COND_EXPR && defcode != VEC_COND_EXPR)
+    return 0;
+
+  true_op = fold_build2_loc (loc, code, type, gimple_assign_rhs1 (stmt),
+			     gimple_assign_rhs2 (def_stmt));
+  false_op = fold_build2_loc (loc, code, type, gimple_assign_rhs1 (stmt),
+			      gimple_assign_rhs3 (def_stmt));
+
+  trueok = is_gimple_val (true_op);
+  falseok = is_gimple_val (false_op);
+  /* At least one of them has to simplify, or it isn't worth it.  */
+  if (!trueok && !falseok)
+    return 0;
+  if (!trueok)
+    {
+      if (!valid_gimple_rhs_p (true_op))
+	return 0;
+      gimple g = gimple_build_assign (make_ssa_name (type, NULL), true_op);
+      true_op = gimple_assign_lhs (g);
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+    }
+  if (!falseok)
+    {
+      if (!valid_gimple_rhs_p (false_op))
+	return 0;
+      gimple g = gimple_build_assign (make_ssa_name (type, NULL), false_op);
+      false_op = gimple_assign_lhs (g);
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+    }
+finish:
+  gimple_assign_set_rhs_with_ops_1 (gsi, defcode, gimple_assign_rhs1 (def_stmt),
+				    true_op, false_op);
+
+  fold_stmt (gsi);
+  update_stmt (gsi_stmt (*gsi));
+
+  /* Remove defining statements.  */
+  return remove_prop_source_from_use (gimple_assign_lhs (def_stmt)) + 1;
+}
+
+
 /* GSI_P points to a statement which performs a narrowing integral
    conversion.
 
    Look for cases like:
 
      t = x & c;
      y = (T) t;
 
    Turn them into:
 
@@ -3403,21 +3511,35 @@ ssa_forward_propagate_and_combine (void)
 	  /* Mark stmt as potentially needing revisiting.  */
 	  gimple_set_plf (stmt, GF_PLF_1, false);
 
 	  switch (gimple_code (stmt))
 	    {
 	    case GIMPLE_ASSIGN:
 	      {
 		tree rhs1 = gimple_assign_rhs1 (stmt);
 		enum tree_code code = gimple_assign_rhs_code (stmt);
 
-		if ((code == BIT_NOT_EXPR
+		/* Something (associate_plusminus?) doesn't set "changed"
+		   properly, so we can't put this at the end with
+		   if (!changed && ...).  */
+		if (TREE_CODE_CLASS (code) == tcc_binary
+		    || TREE_CODE_CLASS (code) == tcc_comparison)
+		  {
+		    int did_something;
+		    did_something = combine_binop_with_condition (&gsi);
+		    if (did_something == 2)
+		      cfg_changed = true;
+		    changed = did_something != 0;
+		  }
+		if (changed)
+		  ;
+		else if ((code == BIT_NOT_EXPR
 		     || code == NEGATE_EXPR)
 		    && TREE_CODE (rhs1) == SSA_NAME)
 		  changed = simplify_not_neg_expr (&gsi);
 		else if (code == COND_EXPR
 			 || code == VEC_COND_EXPR)
 		  {
 		    /* In this case the entire COND_EXPR is in rhs1. */
 		    if (forward_propagate_into_cond (&gsi)
 			|| combine_cond_exprs (&gsi))
 		      {
Index: fold-const.c
===================================================================
--- fold-const.c	(revision 202185)
+++ fold-const.c	(working copy)
@@ -14122,21 +14122,23 @@ fold_ternary_loc (location_t loc, enum t
 
       /* Convert A ? 1 : 0 to simply A.  */
       if ((code == VEC_COND_EXPR ? integer_all_onesp (op1)
 				 : (integer_onep (op1)
 				    && !VECTOR_TYPE_P (type)))
 	  && integer_zerop (op2)
 	  /* If we try to convert OP0 to our type, the
 	     call to fold will try to move the conversion inside
 	     a COND, which will recurse.  In that case, the COND_EXPR
 	     is probably the best choice, so leave it alone.  */
-	  && type == TREE_TYPE (arg0))
+	  && (type == TREE_TYPE (arg0)
+	      || (in_gimple_form
+		  && useless_type_conversion_p (type, TREE_TYPE (arg0)))))
 	return pedantic_non_lvalue_loc (loc, arg0);
 
       /* Convert A ? 0 : 1 to !A.  This prefers the use of NOT_EXPR
 	 over COND_EXPR in cases such as floating point comparisons.  */
       if (integer_zerop (op1)
 	  && (code == VEC_COND_EXPR ? integer_all_onesp (op2)
 				    : (integer_onep (op2)
 				       && !VECTOR_TYPE_P (type)))
 	  && truth_value_p (TREE_CODE (arg0)))
 	return pedantic_non_lvalue_loc (loc,
Index: testsuite/g++.dg/tree-ssa/pr57755.C
===================================================================
--- testsuite/g++.dg/tree-ssa/pr57755.C	(revision 0)
+++ testsuite/g++.dg/tree-ssa/pr57755.C	(revision 0)
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-forwprop1" } */
+typedef int vec __attribute__((vector_size(4*sizeof(int))));
+
+vec f(vec a,vec b){
+  vec z=a!=a; /* zero */
+  vec o=z+1; /* one */
+  vec c=(a>3)?o:z; /* scalar equivalent: c = a>3 */
+  vec d=(b>2)?o:z; /* scalar equivalent: d = b>2 */
+  vec e=c&d; /* simplify to (a>3)?((b>2)?1:0):0 */
+  return e!=0; /* simplify to (a>3)?((b>2)?-1:0):0 then (a>3)?(b>2):0 */
+}
+
+vec g(vec a,vec b){
+  vec z=a!=a; /* zero */
+  vec c=(a<=42)?b:z; /* equivalent to (a<=42)&b */
+  return b&c; /* notice that &b is useless */
+  /* return (a<=42)?b:0 */
+}
+
+/* { dg-final { scan-tree-dump-times "VEC_COND_EXPR" 2 "forwprop1" } } */
+/* { dg-final { scan-tree-dump-not "!=" "forwprop1" } } */
+/* { dg-final { scan-tree-dump-not "&" "forwprop1" } } */
+/* { dg-final { cleanup-tree-dump "forwprop1" } } */

Property changes on: testsuite/g++.dg/tree-ssa/pr57755.C
___________________________________________________________________
Added: svn:keywords
   + Author Date Id Revision URL
Added: svn:eol-style
   + native



* Re: folding (vec_)cond_expr in a binary operation
  2013-09-03 13:38       ` Marc Glisse
@ 2013-09-11 11:38         ` Marc Glisse
  0 siblings, 0 replies; 6+ messages in thread
From: Marc Glisse @ 2013-09-11 11:38 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

Any other comments on this patch?

http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00129.html

On Tue, 3 Sep 2013, Marc Glisse wrote:

> (attaching the latest version. I only added comments since regtesting it)
>
> On Tue, 3 Sep 2013, Richard Biener wrote:
>
>>>> Btw, as for the patch I don't like that you basically feed everything 
>>>> into
>>>> fold.  Yes, I know we do that for conditions because that's quite
>>>> important
>>>> and nobody has re-written the condition folding as gimple pattern 
>>>> matcher.
>>>> I doubt that this transform is important enough to warrant another
>>>> exception ;)
>>> 
>>> 
>>> I am not sure what you want here. When I notice the pattern (a?b:c) op d, 
>>> I
>>> need to check whether b op d and c op d are likely to simplify before
>>> transforming it to a?(b op d):(c op d). And currently I can't see any way 
>>> to
>>> do that other than feeding (b op d) to fold. Even if/when we get a gimple
>>> folder, we will still need to call it in about the same way.
>> 
>> With a gimple folder you'd avoid building trees.
>
> Ah, so the problem is the cost of building those 2 trees? It will indeed
> be better when we can avoid it, but I really don't expect the cost to be
> that much...
>
>> You are testing for
>> a quite complex pattern here - (a?b:c) & (d?e:f).
>
> But it is really handled in several steps. IIRC:
> (a?1:0)&x becomes a?(x&1):0.
> Since x is d?1:0, x&1 becomes d?1:0.
> a?(d?1:0):0 is not (yet?) simplified further.
>
> Now if we compare to 0: a?(d?1:0):0 != 0 simplifies to a?(d?1:0)!=0:0
> then a?(d?-1:0):0 and finally a?d:0.
>
> Each step can also trigger individually.
>
>> It seems to me that,
>> for example,
>> 
>> +  vec c=(a>3)?o:z;
>> +  vec d=(b>2)?o:z;
>> +  vec e=c&d;
>> 
>> would be better suited in the combine phase (you are interested in
>> combining the comparisons).  That is, look for a [&|^] b where
>> a and b are [VEC_]COND_EXPRs with the same values.
>
> Hmm, I am already looking for a binary op which has at least one operand
> which is a [VEC_]COND_EXPR, in the combine (=backward) part of
> tree-ssa-forwprop.  Isn't that almost what you are suggesting?
>
>> Heh, and it's not obvious to me from looking for a minute what this should 
>> be simplified to ...
>
> (a?x:y)&(b?x:y) doesn't really simplify in general.
>
>> (so the code and the testcase need some
>> comments on what you are trying to catch ...)
>
> a<b vectorizes to (a<b)?1:0. If we compute & and | of conditions, we end
> up with & and | of vec_cond_expr, and it seems preferable to clean that
> up, so ((a<b)?1:0)&((c<d)?1:0) would become ((a<b)&(c<d))?1:0. I don't
> quite produce (a<b)&(c<d) but (a<b)?(c<d):0 instead, I guess the last
> step (replacing vec_cond_expr with and_expr) would be easy to do at
> expansion time if it isn't already. It could also be done earlier, but
> we want to be careful since fold_binary_op_with_conditional_arg had
> related infinite loops in the past.
>
> This transformation is basically a gimple version of
> fold_binary_op_with_conditional_arg, so it applies more widely than just
> this vector example.

-- 
Marc Glisse


end of thread, other threads:[~2013-09-11 10:36 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-06 19:42 folding (vec_)cond_expr in a binary operation Marc Glisse
2013-08-30  9:38 ` Richard Biener
2013-09-02  9:47   ` Marc Glisse
2013-09-03 10:41     ` Richard Biener
2013-09-03 13:38       ` Marc Glisse
2013-09-11 11:38         ` Marc Glisse

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).