[PATCH 1/2] Add PRED_FORTRAN_LOOP_PREHEADER to DO loops with step bigger than +-1.

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH 1/2] Add PRED_FORTRAN_LOOP_PREHEADER to DO loops with step bigger than +-1.
  2016-07-07  9:47 [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step marxin
  2016-07-07  9:47 ` [PATCH 2/2] Optimize fortran " marxin
@ 2016-07-07  9:47 ` marxin
  2016-07-07 12:27   ` Tobias Burnus
  2016-07-07 14:00 ` [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step Richard Biener
  2016-07-08 14:26 ` [PATCH] Fix Fortran DO loop fallback Martin Liška
  3 siblings, 1 reply; 26+ messages in thread
From: marxin @ 2016-07-07  9:47 UTC (permalink / raw)
  To: gcc-patches; +Cc: hubicka, dominiq, williamclodius

gcc/fortran/ChangeLog:

2016-07-01  Martin Liska  <mliska@suse.cz>

	* trans-stmt.c (gfc_trans_do): Add expect builtin for DO
	loops with step bigger than +-1.

gcc/testsuite/ChangeLog:

2016-07-01  Martin Liska  <mliska@suse.cz>

	* gfortran.dg/predict-1.f90: Ammend the test.
	* gfortran.dg/predict-2.f90: Likewise.
---
 gcc/fortran/trans-stmt.c                | 6 ++++--
 gcc/testsuite/gfortran.dg/predict-1.f90 | 9 +++++++--
 gcc/testsuite/gfortran.dg/predict-2.f90 | 6 +++---
 3 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c
index 84bf749..389fa5e 100644
--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -2109,7 +2109,8 @@ gfc_trans_do (gfc_code * code, tree exit_cond)
       pos = build2 (COMPOUND_EXPR, void_type_node,
 		    fold_build2 (MODIFY_EXPR, void_type_node,
 				 countm1, tmp2),
-		    build3_loc (loc, COND_EXPR, void_type_node, tmp,
+		    build3_loc (loc, COND_EXPR, void_type_node,
+				gfc_unlikely (tmp, PRED_FORTRAN_LOOP_PREHEADER),
 				build1_loc (loc, GOTO_EXPR, void_type_node,
 					    exit_label), NULL_TREE));
 
@@ -2123,7 +2124,8 @@ gfc_trans_do (gfc_code * code, tree exit_cond)
       neg = build2 (COMPOUND_EXPR, void_type_node,
 		    fold_build2 (MODIFY_EXPR, void_type_node,
 				 countm1, tmp2),
-		    build3_loc (loc, COND_EXPR, void_type_node, tmp,
+		    build3_loc (loc, COND_EXPR, void_type_node,
+				gfc_unlikely (tmp, PRED_FORTRAN_LOOP_PREHEADER),
 				build1_loc (loc, GOTO_EXPR, void_type_node,
 					    exit_label), NULL_TREE));
 
diff --git a/gcc/testsuite/gfortran.dg/predict-1.f90 b/gcc/testsuite/gfortran.dg/predict-1.f90
index 81f0436..a3feea9 100644
--- a/gcc/testsuite/gfortran.dg/predict-1.f90
+++ b/gcc/testsuite/gfortran.dg/predict-1.f90
@@ -4,9 +4,14 @@
 subroutine test(block, array)
 integer :: i, block(9), array(2)
 
-do i = array(1), array(2)
+do i = array(1), array(2), 2
     block(i) = i
 end do
+
+do i = array(1), array(2), -2
+    block(i) = block(i) + i
+end do
+
 end subroutine test
 
-! { dg-final { scan-tree-dump-times "Fortran loop preheader heuristics of edge\[^:\]*: 99.0%" 1 "profile_estimate" } }
+! { dg-final { scan-tree-dump-times "Fortran loop preheader heuristics of edge\[^:\]*: 1.0%" 2 "profile_estimate" } }
diff --git a/gcc/testsuite/gfortran.dg/predict-2.f90 b/gcc/testsuite/gfortran.dg/predict-2.f90
index 4ae5c3a..11a9ec5 100644
--- a/gcc/testsuite/gfortran.dg/predict-2.f90
+++ b/gcc/testsuite/gfortran.dg/predict-2.f90
@@ -4,12 +4,12 @@
 subroutine test(block, array)
 integer :: i,j, block(9), array(2)
 
-do i = array(1), array(2)
-    do j = array(1), array(2)
+do i = array(1), array(2), 2
+    do j = array(1), array(2), 3
        block(i) = j
     end do
 end do
 end subroutine test
 
 ! { dg-final { scan-tree-dump-times "Fortran loop preheader heuristics of edge" 2 "profile_estimate" } }
-! { dg-final { scan-tree-dump-times "loop gueard" 0 "profile_estimate" } }
+! { dg-final { scan-tree-dump-times "loop guard" 0 "profile_estimate" } }
-- 
2.8.4


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 2/2] Optimize fortran loops with +-1 step.
  2016-07-07  9:47 [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step marxin
@ 2016-07-07  9:47 ` marxin
  2016-07-07 12:13   ` Tobias Burnus
  2016-07-10  7:37   ` Andreas Schwab
  2016-07-07  9:47 ` [PATCH 1/2] Add PRED_FORTRAN_LOOP_PREHEADER to DO loops with step bigger than +-1 marxin
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 26+ messages in thread
From: marxin @ 2016-07-07  9:47 UTC (permalink / raw)
  To: gcc-patches; +Cc: hubicka, dominiq, williamclodius

gcc/testsuite/ChangeLog:

2016-07-01  Martin Liska  <mliska@suse.cz>

	* gfortran.dg/do_1.f90: Remove a corner case that triggers
	an undefined behavior.
	* gfortran.dg/do_3.F90: Likewise.
	* gfortran.dg/do_check_11.f90: New test.
	* gfortran.dg/do_check_12.f90: New test.
	* gfortran.dg/do_corner_warn.f90: New test.

gcc/fortran/ChangeLog:

2016-07-01  Martin Liska  <mliska@suse.cz>

	* lang.opt (Wundefined-do-loop): New option.
        * resolve.c (gfc_resolve_iterator): Warn for Wundefined-do-loop.
	(gfc_trans_simple_do): Generate a c-style loop.
	(gfc_trans_do): Fix GNU coding style.
---
 gcc/fortran/lang.opt                         |   4 +
 gcc/fortran/resolve.c                        |  23 ++++++
 gcc/fortran/trans-stmt.c                     | 117 ++++++++++++++-------------
 gcc/testsuite/gfortran.dg/do_1.f90           |   6 --
 gcc/testsuite/gfortran.dg/do_3.F90           |   2 -
 gcc/testsuite/gfortran.dg/do_check_11.f90    |  12 +++
 gcc/testsuite/gfortran.dg/do_check_12.f90    |  12 +++
 gcc/testsuite/gfortran.dg/do_corner_warn.f90 |  22 +++++
 gcc/testsuite/gfortran.dg/ldist-1.f90        |   2 +-
 gcc/testsuite/gfortran.dg/pr48636.f90        |   2 +-
 10 files changed, 136 insertions(+), 66 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/do_check_11.f90
 create mode 100644 gcc/testsuite/gfortran.dg/do_check_12.f90
 create mode 100644 gcc/testsuite/gfortran.dg/do_corner_warn.f90

diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
index bdf5fa5..8f8b299 100644
--- a/gcc/fortran/lang.opt
+++ b/gcc/fortran/lang.opt
@@ -309,6 +309,10 @@ Wtabs
 Fortran Warning Var(warn_tabs) LangEnabledBy(Fortran,Wall || Wpedantic)
 Permit nonconforming uses of the tab character.
 
+Wundefined-do-loop
+Fortran Warning Var(warn_undefined_do_loop) LangEnabledBy(Fortran,Wall)
+Warn about an invalid DO loop.
+
 Wunderflow
 Fortran Warning Var(warn_underflow) Init(1)
 Warn about underflow of numerical constant expressions.
diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 4378313..1fc540a 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -6546,6 +6546,29 @@ gfc_resolve_iterator (gfc_iterator *iter, bool real_ok, bool own_scope)
 		     &iter->step->where);
     }
 
+  if (iter->end->expr_type == EXPR_CONSTANT
+      && iter->end->ts.type == BT_INTEGER
+      && iter->step->expr_type == EXPR_CONSTANT
+      && iter->step->ts.type == BT_INTEGER
+      && (mpz_cmp_si (iter->step->value.integer, -1L) == 0
+	  || mpz_cmp_si (iter->step->value.integer, 1L) == 0))
+    {
+      bool is_step_positive = mpz_cmp_ui (iter->step->value.integer, 1) == 0;
+      int k = gfc_validate_kind (BT_INTEGER, iter->end->ts.kind, false);
+
+      if (is_step_positive
+	  && mpz_cmp (iter->end->value.integer, gfc_integer_kinds[k].huge) == 0)
+	gfc_warning (OPT_Wundefined_do_loop,
+		     "DO loop at %L is undefined as it overflows",
+		     &iter->step->where);
+      else if (!is_step_positive
+	       && mpz_cmp (iter->end->value.integer,
+			   gfc_integer_kinds[k].min_int) == 0)
+	gfc_warning (OPT_Wundefined_do_loop,
+		     "DO loop at %L is undefined as it underflows",
+		     &iter->step->where);
+    }
+
   return true;
 }
 
diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c
index 389fa5e..d6fb620 100644
--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -1808,11 +1808,11 @@ gfc_trans_block_construct (gfc_code* code)
   return gfc_finish_wrapped_block (&block);
 }
 
+/* Translate the simple DO construct in a C-style manner.
+   This is where the loop variable has integer type and step +-1.
+   Following code will generate infinite loop in case where TO is INT_MAX
+   (for +1 step) or INT_MIN (for -1 step)
 
-/* Translate the simple DO construct.  This is where the loop variable has
-   integer type and step +-1.  We can't use this in the general case
-   because integer overflow and floating point errors could give incorrect
-   results.
    We translate a do loop from:
 
    DO dovar = from, to, step
@@ -1822,22 +1822,20 @@ gfc_trans_block_construct (gfc_code* code)
    to:
 
    [Evaluate loop bounds and step]
-   dovar = from;
-   if ((step > 0) ? (dovar <= to) : (dovar => to))
-    {
-      for (;;)
-        {
-	  body;
-   cycle_label:
-	  cond = (dovar == to);
-	  dovar += step;
-	  if (cond) goto end_label;
-	}
+    dovar = from;
+    for (;;)
+      {
+	if (dovar > to)
+	  goto end_label;
+	body;
+	cycle_label:
+	dovar += step;
       }
-   end_label:
+    end_label:
 
-   This helps the optimizers by avoiding the extra induction variable
-   used in the general case.  */
+   This helps the optimizers by avoiding the extra pre-header condition and
+   we save a register as we just compare the updated IV (not a value in
+   previous step).  */
 
 static tree
 gfc_trans_simple_do (gfc_code * code, stmtblock_t *pblock, tree dovar,
@@ -1851,14 +1849,14 @@ gfc_trans_simple_do (gfc_code * code, stmtblock_t *pblock, tree dovar,
   tree cycle_label;
   tree exit_label;
   location_t loc;
-
   type = TREE_TYPE (dovar);
+  bool is_step_positive = tree_int_cst_sgn (step) > 0;
 
   loc = code->ext.iterator->start->where.lb->location;
 
   /* Initialize the DO variable: dovar = from.  */
   gfc_add_modify_loc (loc, pblock, dovar,
-		      fold_convert (TREE_TYPE(dovar), from));
+		      fold_convert (TREE_TYPE (dovar), from));
 
   /* Save value for do-tinkering checking.  */
   if (gfc_option.rtcheck & GFC_RTCHECK_DO)
@@ -1871,13 +1869,53 @@ gfc_trans_simple_do (gfc_code * code, stmtblock_t *pblock, tree dovar,
   cycle_label = gfc_build_label_decl (NULL_TREE);
   exit_label = gfc_build_label_decl (NULL_TREE);
 
-  /* Put the labels where they can be found later. See gfc_trans_do().  */
+  /* Put the labels where they can be found later.  See gfc_trans_do().  */
   code->cycle_label = cycle_label;
   code->exit_label = exit_label;
 
   /* Loop body.  */
   gfc_start_block (&body);
 
+  /* Exit the loop if there is an I/O result condition or error.  */
+  if (exit_cond)
+    {
+      tmp = build1_v (GOTO_EXPR, exit_label);
+      tmp = fold_build3_loc (loc, COND_EXPR, void_type_node,
+			     exit_cond, tmp,
+			     build_empty_stmt (loc));
+      gfc_add_expr_to_block (&body, tmp);
+    }
+
+  /* Evaluate the loop condition.  */
+  if (is_step_positive)
+    cond = fold_build2_loc (loc, GT_EXPR, boolean_type_node, dovar,
+			    fold_convert (type, to));
+  else
+    cond = fold_build2_loc (loc, LT_EXPR, boolean_type_node, dovar,
+			    fold_convert (type, to));
+
+  cond = gfc_evaluate_now_loc (loc, cond, &body);
+
+  /* The loop exit.  */
+  tmp = fold_build1_loc (loc, GOTO_EXPR, void_type_node, exit_label);
+  TREE_USED (exit_label) = 1;
+  tmp = fold_build3_loc (loc, COND_EXPR, void_type_node,
+			 cond, tmp, build_empty_stmt (loc));
+  gfc_add_expr_to_block (&body, tmp);
+
+  /* Check whether the induction variable is equal to INT_MAX
+     (respectively to INT_MIN).  */
+  if (gfc_option.rtcheck & GFC_RTCHECK_DO)
+    {
+      tree boundary = is_step_positive ? TYPE_MAX_VALUE (type)
+	: TYPE_MIN_VALUE (type);
+
+      tmp = fold_build2_loc (loc, EQ_EXPR, boolean_type_node,
+			     dovar, boundary);
+      gfc_trans_runtime_check (true, false, tmp, &body, &code->loc,
+			       "Loop iterates infinitely");
+    }
+
   /* Main loop body.  */
   tmp = gfc_trans_code_cond (code->block->next, exit_cond);
   gfc_add_expr_to_block (&body, tmp);
@@ -1898,21 +1936,6 @@ gfc_trans_simple_do (gfc_code * code, stmtblock_t *pblock, tree dovar,
 			       "Loop variable has been modified");
     }
 
-  /* Exit the loop if there is an I/O result condition or error.  */
-  if (exit_cond)
-    {
-      tmp = build1_v (GOTO_EXPR, exit_label);
-      tmp = fold_build3_loc (loc, COND_EXPR, void_type_node,
-			     exit_cond, tmp,
-			     build_empty_stmt (loc));
-      gfc_add_expr_to_block (&body, tmp);
-    }
-
-  /* Evaluate the loop condition.  */
-  cond = fold_build2_loc (loc, EQ_EXPR, boolean_type_node, dovar,
-			  to);
-  cond = gfc_evaluate_now_loc (loc, cond, &body);
-
   /* Increment the loop variable.  */
   tmp = fold_build2_loc (loc, PLUS_EXPR, type, dovar, step);
   gfc_add_modify_loc (loc, &body, dovar, tmp);
@@ -1920,28 +1943,10 @@ gfc_trans_simple_do (gfc_code * code, stmtblock_t *pblock, tree dovar,
   if (gfc_option.rtcheck & GFC_RTCHECK_DO)
     gfc_add_modify_loc (loc, &body, saved_dovar, dovar);
 
-  /* The loop exit.  */
-  tmp = fold_build1_loc (loc, GOTO_EXPR, void_type_node, exit_label);
-  TREE_USED (exit_label) = 1;
-  tmp = fold_build3_loc (loc, COND_EXPR, void_type_node,
-			 cond, tmp, build_empty_stmt (loc));
-  gfc_add_expr_to_block (&body, tmp);
-
   /* Finish the loop body.  */
   tmp = gfc_finish_block (&body);
   tmp = fold_build1_loc (loc, LOOP_EXPR, void_type_node, tmp);
 
-  /* Only execute the loop if the number of iterations is positive.  */
-  if (tree_int_cst_sgn (step) > 0)
-    cond = fold_build2_loc (loc, LE_EXPR, boolean_type_node, dovar,
-			    to);
-  else
-    cond = fold_build2_loc (loc, GE_EXPR, boolean_type_node, dovar,
-			    to);
-
-  tmp = fold_build3_loc (loc, COND_EXPR, void_type_node,
-			 gfc_likely (cond, PRED_FORTRAN_LOOP_PREHEADER), tmp,
-			 build_empty_stmt (loc));
   gfc_add_expr_to_block (pblock, tmp);
 
   /* Add the exit label.  */
@@ -2044,8 +2049,8 @@ gfc_trans_do (gfc_code * code, tree exit_cond)
   if (TREE_CODE (type) == INTEGER_TYPE
       && (integer_onep (step)
 	|| tree_int_cst_equal (step, integer_minus_one_node)))
-    return gfc_trans_simple_do (code, &block, dovar, from, to, step, exit_cond);
-
+    return gfc_trans_simple_do (code, &block, dovar, from, to, step,
+				exit_cond);
 
   if (TREE_CODE (type) == INTEGER_TYPE)
     utype = unsigned_type_for (type);
diff --git a/gcc/testsuite/gfortran.dg/do_1.f90 b/gcc/testsuite/gfortran.dg/do_1.f90
index b041279..b1db8c6 100644
--- a/gcc/testsuite/gfortran.dg/do_1.f90
+++ b/gcc/testsuite/gfortran.dg/do_1.f90
@@ -5,12 +5,6 @@ program do_1
   implicit none
   integer i, j
 
-  ! limit=HUGE(i), step 1
-  j = 0
-  do i = HUGE(i) - 10, HUGE(i), 1
-    j = j + 1
-  end do
-  if (j .ne. 11) call abort
   ! limit=HUGE(i), step > 1
   j = 0
   do i = HUGE(i) - 10, HUGE(i), 2
diff --git a/gcc/testsuite/gfortran.dg/do_3.F90 b/gcc/testsuite/gfortran.dg/do_3.F90
index eb4751d..0f2c315 100644
--- a/gcc/testsuite/gfortran.dg/do_3.F90
+++ b/gcc/testsuite/gfortran.dg/do_3.F90
@@ -48,11 +48,9 @@ program test
   TEST_LOOP(i, 17, 0, -4, 5, test_i, -3)
   TEST_LOOP(i, 17, 0, -5, 4, test_i, -3)
 
-  TEST_LOOP(i1, -huge(i1)-1_1, huge(i1), 1_1, int(huge(i1))*2+2, test_i1, huge(i1)+1_1)
   TEST_LOOP(i1, -huge(i1)-1_1, huge(i1), 2_1, int(huge(i1))+1, test_i1, huge(i1)+1_1)
   TEST_LOOP(i1, -huge(i1)-1_1, huge(i1), huge(i1), 3, test_i1, 2_1*huge(i1)-1_1)
 
-  TEST_LOOP(i1, huge(i1), -huge(i1)-1_1, -1_1, int(huge(i1))*2+2, test_i1, -huge(i1)-2_1)
   TEST_LOOP(i1, huge(i1), -huge(i1)-1_1, -2_1, int(huge(i1))+1, test_i1, -huge(i1)-2_1)
   TEST_LOOP(i1, huge(i1), -huge(i1)-1_1, -huge(i1), 3, test_i1, -2_1*huge(i1))
   TEST_LOOP(i1, huge(i1), -huge(i1)-1_1, -huge(i1)-1_1, 2, test_i1, -huge(i1)-2_1)
diff --git a/gcc/testsuite/gfortran.dg/do_check_11.f90 b/gcc/testsuite/gfortran.dg/do_check_11.f90
new file mode 100644
index 0000000..87850cf
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/do_check_11.f90
@@ -0,0 +1,12 @@
+! { dg-do run }
+! { dg-options "-fcheck=do" }
+! { dg-shouldfail "DO check" }
+!
+program test
+  implicit none
+  integer(1) :: i
+  do i = HUGE(i)-10, HUGE(i)
+    print *, i
+  end do
+end program test
+! { dg-output "Fortran runtime error: Loop iterates infinitely" }
diff --git a/gcc/testsuite/gfortran.dg/do_check_12.f90 b/gcc/testsuite/gfortran.dg/do_check_12.f90
new file mode 100644
index 0000000..71edace
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/do_check_12.f90
@@ -0,0 +1,12 @@
+! { dg-do run }
+! { dg-options "-fcheck=do" }
+! { dg-shouldfail "DO check" }
+!
+program test
+  implicit none
+  integer(1) :: i
+  do i = -HUGE(i)+10, -HUGE(i)-1, -1
+    print *, i
+  end do
+end program test
+! { dg-output "Fortran runtime error: Loop iterates infinitely" }
diff --git a/gcc/testsuite/gfortran.dg/do_corner_warn.f90 b/gcc/testsuite/gfortran.dg/do_corner_warn.f90
new file mode 100644
index 0000000..07484d3
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/do_corner_warn.f90
@@ -0,0 +1,22 @@
+! { dg-options "-Wundefined-do-loop" }
+! Program to check corner cases for DO statements.
+
+program do_1
+  implicit none
+  integer i, j
+
+  ! limit=HUGE(i), step 1
+  j = 0
+  do i = HUGE(i) - 10, HUGE(i), 1 ! { dg-warning "is undefined as it overflows" }
+    j = j + 1
+  end do
+  if (j .ne. 11) call abort
+
+  ! limit=-HUGE(i)-1, step -1
+  j = 0
+  do i = -HUGE(i) + 10 - 1, -HUGE(i) - 1, -1 ! { dg-warning "is undefined as it underflows" }
+    j = j + 1
+  end do
+  if (j .ne. 11) call abort
+
+end program
diff --git a/gcc/testsuite/gfortran.dg/ldist-1.f90 b/gcc/testsuite/gfortran.dg/ldist-1.f90
index ea3990d..2030328 100644
--- a/gcc/testsuite/gfortran.dg/ldist-1.f90
+++ b/gcc/testsuite/gfortran.dg/ldist-1.f90
@@ -32,4 +32,4 @@ end Subroutine PADEC
 ! There are 5 legal partitions in this code.  Based on the data
 ! locality heuristic, this loop should not be split.
 
-! { dg-final { scan-tree-dump-not "distributed: split to" "ldist" } }
+! { dg-final { scan-tree-dump "distributed: split to" "ldist" } }
diff --git a/gcc/testsuite/gfortran.dg/pr48636.f90 b/gcc/testsuite/gfortran.dg/pr48636.f90
index 94826fa..926d8f3 100644
--- a/gcc/testsuite/gfortran.dg/pr48636.f90
+++ b/gcc/testsuite/gfortran.dg/pr48636.f90
@@ -34,5 +34,5 @@ program main
 end program main
 
 ! { dg-final { scan-ipa-dump "bar\[^\\n\]*inline copy in MAIN" "inline" } }
-! { dg-final { scan-ipa-dump-times "phi predicate:" 5 "inline" } }
+! { dg-final { scan-ipa-dump-times "phi predicate:" 3 "inline" } }
 ! { dg-final { scan-ipa-dump "inline hints: loop_iterations" "inline" } }
-- 
2.8.4

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step
@ 2016-07-07  9:47 marxin
  2016-07-07  9:47 ` [PATCH 2/2] Optimize fortran " marxin
                   ` (3 more replies)
  0 siblings, 4 replies; 26+ messages in thread
From: marxin @ 2016-07-07  9:47 UTC (permalink / raw)
  To: gcc-patches; +Cc: hubicka, dominiq, williamclodius

Hello.

As discussed in [1], I would like to change code emission from:

    D.3428 = (*array)[0];
    D.3429 = (*array)[1];
    i = D.3428;
    if (i <= D.3429)
      {
        while (1)
          {
            {
              logical(kind=4) D.3432;

              (*block)[(integer(kind=8)) i + -1] = (*block)[(integer(kind=8)) i + -1] + 10;
              L.1:;
              D.3432 = i == D.3429;
              i = i + 1;
              if (D.3432) goto L.2;
            }
          }
      }
    L.2:;

to:

    D.3428 = (*array)[0];
    D.3429 = (*array)[1];
    i = D.3428;
    while (1)
      {
        {
          logical(kind=4) D.3432;

          D.3432 = i > D.3429;
          if (D.3432) goto L.2;
          (*block)[(integer(kind=8)) i + -1] = (*block)[(integer(kind=8)) i + -1] + 10;
          L.1:;
          i = i + 1;
        }
      }
    L.2:;

Following changes quite significantly improves exchange_2 benchmark (part of CPUv6),
where it runs 6% faster. The patchset consists of 2 patches:

a)  Add PRED_FORTRAN_LOOP_PREHEADER to DO loops with step bigger than +-1.

  1) I converted predict-[12].f90 tests to use a different step than 1
  2) I noticed that a generic DO loop code emission misses expect PRED_FORTRAN_LOOP_PREHEADER, thus I added that.

b)  Optimize fortran loops with +-1 step.

  1) We generate the c-style loop.
  2) New warning Wundefined-do-loop is added.
  3) Couple of tests which hit the undefined behavior are removed.
  4) New tests that cover the undefined behavior are introduced.

The patchset survives regression tests and bootstraps on x86_64-linux-gnu and
I've been running CPU2006 benchmarks to hit a possible speed-up/regression.

Martin

[1] https://gcc.gnu.org/ml/fortran/2016-06/msg00122.html

 gcc/fortran/lang.opt                         |   4 +
 gcc/fortran/resolve.c                        |  23 +++++
 gcc/fortran/trans-stmt.c                     | 123 ++++++++++++++-------------
 gcc/testsuite/gfortran.dg/do_1.f90           |   6 --
 gcc/testsuite/gfortran.dg/do_3.F90           |   2 -
 gcc/testsuite/gfortran.dg/do_check_11.f90    |  12 +++
 gcc/testsuite/gfortran.dg/do_check_12.f90    |  12 +++
 gcc/testsuite/gfortran.dg/do_corner_warn.f90 |  22 +++++
 gcc/testsuite/gfortran.dg/ldist-1.f90        |   2 +-
 gcc/testsuite/gfortran.dg/pr48636.f90        |   2 +-
 gcc/testsuite/gfortran.dg/predict-1.f90      |   9 +-
 gcc/testsuite/gfortran.dg/predict-2.f90      |   6 +-
 12 files changed, 150 insertions(+), 73 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/do_check_11.f90
 create mode 100644 gcc/testsuite/gfortran.dg/do_check_12.f90
 create mode 100644 gcc/testsuite/gfortran.dg/do_corner_warn.f90

-- 
2.8.4

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] Optimize fortran loops with +-1 step.
  2016-07-07  9:47 ` [PATCH 2/2] Optimize fortran " marxin
@ 2016-07-07 12:13   ` Tobias Burnus
  2016-07-07 15:54     ` Tobias Burnus
  2016-07-10  7:37   ` Andreas Schwab
  1 sibling, 1 reply; 26+ messages in thread
From: Tobias Burnus @ 2016-07-07 12:13 UTC (permalink / raw)
  To: Marxin, gcc-patches, fortran; +Cc: hubicka, dominiq, williamclodius

marxin wrote:
> gcc/fortran/ChangeLog:
> 
> 2016-07-01  Martin Liska  <mliska@suse.cz>
>	* lang.opt (Wundefined-do-loop): New option.
>        * resolve.c (gfc_resolve_iterator): Warn for Wundefined-do-loop.
>	(gfc_trans_simple_do): Generate a c-style loop.
>	(gfc_trans_do): Fix GNU coding style.

Can you also document the new warning in gcc/fortran/invoke.texi?

Otherwise, this patch looks good to me. Thanks for working on it.

Cheers,

Tobias

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2] Add PRED_FORTRAN_LOOP_PREHEADER to DO loops with step bigger than +-1.
  2016-07-07  9:47 ` [PATCH 1/2] Add PRED_FORTRAN_LOOP_PREHEADER to DO loops with step bigger than +-1 marxin
@ 2016-07-07 12:27   ` Tobias Burnus
  0 siblings, 0 replies; 26+ messages in thread
From: Tobias Burnus @ 2016-07-07 12:27 UTC (permalink / raw)
  To: marxin, fortran, gcc-patches; +Cc: hubicka, dominiq, williamclodius

marxin wrote:
> gcc/fortran/ChangeLog:
> 2016-07-01  Martin Liska  <mliska@suse.cz>
>
> 	* trans-stmt.c (gfc_trans_do): Add expect builtin for DO
> 	loops with step bigger than +-1.
>
> gcc/testsuite/ChangeLog:
> 2016-07-01  Martin Liska  <mliska@suse.cz>
>
> 	* gfortran.dg/predict-1.f90: Ammend the test.
> 	* gfortran.dg/predict-2.f90: Likewise.

This patch also looks good to me. Thanks again.

Cheers,

Tobias

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step
  2016-07-07  9:47 [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step marxin
  2016-07-07  9:47 ` [PATCH 2/2] Optimize fortran " marxin
  2016-07-07  9:47 ` [PATCH 1/2] Add PRED_FORTRAN_LOOP_PREHEADER to DO loops with step bigger than +-1 marxin
@ 2016-07-07 14:00 ` Richard Biener
  2016-07-07 14:40   ` Jan Hubicka
  2016-07-08 14:26 ` [PATCH] Fix Fortran DO loop fallback Martin Liška
  3 siblings, 1 reply; 26+ messages in thread
From: Richard Biener @ 2016-07-07 14:00 UTC (permalink / raw)
  To: marxin; +Cc: GCC Patches, Jan Hubicka, Dominique Dhumieres, williamclodius

On Thu, Jul 7, 2016 at 11:32 AM, marxin <mliska@suse.cz> wrote:
> Hello.
>
> As discussed in [1], I would like to change code emission from:
>
>     D.3428 = (*array)[0];
>     D.3429 = (*array)[1];
>     i = D.3428;
>     if (i <= D.3429)
>       {
>         while (1)
>           {
>             {
>               logical(kind=4) D.3432;
>
>               (*block)[(integer(kind=8)) i + -1] = (*block)[(integer(kind=8)) i + -1] + 10;
>               L.1:;
>               D.3432 = i == D.3429;
>               i = i + 1;
>               if (D.3432) goto L.2;
>             }
>           }
>       }
>     L.2:;
>
> to:
>
>     D.3428 = (*array)[0];
>     D.3429 = (*array)[1];
>     i = D.3428;
>     while (1)
>       {
>         {
>           logical(kind=4) D.3432;
>
>           D.3432 = i > D.3429;
>           if (D.3432) goto L.2;
>           (*block)[(integer(kind=8)) i + -1] = (*block)[(integer(kind=8)) i + -1] + 10;
>           L.1:;
>           i = i + 1;
>         }
>       }
>     L.2:;
>
> Following changes quite significantly improves exchange_2 benchmark (part of CPUv6),
> where it runs 6% faster. The patchset consists of 2 patches:
>
> a)  Add PRED_FORTRAN_LOOP_PREHEADER to DO loops with step bigger than +-1.
>
>   1) I converted predict-[12].f90 tests to use a different step than 1
>   2) I noticed that a generic DO loop code emission misses expect PRED_FORTRAN_LOOP_PREHEADER, thus I added that.
>
> b)  Optimize fortran loops with +-1 step.
>
>   1) We generate the c-style loop.
>   2) New warning Wundefined-do-loop is added.
>   3) Couple of tests which hit the undefined behavior are removed.
>   4) New tests that cover the undefined behavior are introduced.

Why is the behavior only undefined for step 1 if the last iteration IV
increment overflows?
Doesn't this apply to all step values?

Richard.

> The patchset survives regression tests and bootstraps on x86_64-linux-gnu and
> I've been running CPU2006 benchmarks to hit a possible speed-up/regression.
>
> Martin
>
> [1] https://gcc.gnu.org/ml/fortran/2016-06/msg00122.html
>
>  gcc/fortran/lang.opt                         |   4 +
>  gcc/fortran/resolve.c                        |  23 +++++
>  gcc/fortran/trans-stmt.c                     | 123 ++++++++++++++-------------
>  gcc/testsuite/gfortran.dg/do_1.f90           |   6 --
>  gcc/testsuite/gfortran.dg/do_3.F90           |   2 -
>  gcc/testsuite/gfortran.dg/do_check_11.f90    |  12 +++
>  gcc/testsuite/gfortran.dg/do_check_12.f90    |  12 +++
>  gcc/testsuite/gfortran.dg/do_corner_warn.f90 |  22 +++++
>  gcc/testsuite/gfortran.dg/ldist-1.f90        |   2 +-
>  gcc/testsuite/gfortran.dg/pr48636.f90        |   2 +-
>  gcc/testsuite/gfortran.dg/predict-1.f90      |   9 +-
>  gcc/testsuite/gfortran.dg/predict-2.f90      |   6 +-
>  12 files changed, 150 insertions(+), 73 deletions(-)
>  create mode 100644 gcc/testsuite/gfortran.dg/do_check_11.f90
>  create mode 100644 gcc/testsuite/gfortran.dg/do_check_12.f90
>  create mode 100644 gcc/testsuite/gfortran.dg/do_corner_warn.f90
>
> --
> 2.8.4
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step
  2016-07-07 14:00 ` [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step Richard Biener
@ 2016-07-07 14:40   ` Jan Hubicka
  2016-07-08  8:34     ` Martin Liška
  0 siblings, 1 reply; 26+ messages in thread
From: Jan Hubicka @ 2016-07-07 14:40 UTC (permalink / raw)
  To: Richard Biener
  Cc: marxin, GCC Patches, Jan Hubicka, Dominique Dhumieres, williamclodius

> 
> Why is the behavior only undefined for step 1 if the last iteration IV
> increment overflows?
> Doesn't this apply to all step values?

This is what Fortran standard says:

  The iteration count is established and is the value of the expression (m2-m1+m3)/m3 unless that value is negative,
  in which case the iteration count is 0.

My reading of this is that the do statement is undefined whenever the expression above is undefined
(m1 is lower bound, m2 is upper bound, m3 is step) and because I think the evaulation order of
m2-m1+m3 is not fixed, I think the statement is not defined whethever (m2-m1), (m1+m3) or (m2-m1)+m3
overflows or underflows as signed integer.

For example it is not valid to iterate from -10 to INT_MAX with step 1.
Honza

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] Optimize fortran loops with +-1 step.
  2016-07-07 12:13   ` Tobias Burnus
@ 2016-07-07 15:54     ` Tobias Burnus
  2016-07-08  8:39       ` Martin Liška
  0 siblings, 1 reply; 26+ messages in thread
From: Tobias Burnus @ 2016-07-07 15:54 UTC (permalink / raw)
  To: Marxin, gcc-patches, fortran; +Cc: hubicka, dominiq, williamclodius

On Thu, Jul 07, 2016 at 02:13:12PM +0200, Tobias Burnus wrote:
> marxin wrote:
> > gcc/fortran/ChangeLog:
> > 
> > 2016-07-01  Martin Liska  <mliska@suse.cz>
> >	* lang.opt (Wundefined-do-loop): New option.
> >        * resolve.c (gfc_resolve_iterator): Warn for Wundefined-do-loop.
> >	(gfc_trans_simple_do): Generate a c-style loop.
> >	(gfc_trans_do): Fix GNU coding style.
> 
> Can you also document the new warning in gcc/fortran/invoke.texi?
> 
> Otherwise, this patch looks good to me. Thanks for working on it.


If I look at the commit to .opt,

  +Wundefined-do-loop
  +Fortran Warning Var(warn_undefined_do_loop) LangEnabledBy(Fortran,Wall)

and to .texi,

  +@item -Wundefined-do-loop
  [...]
  +Warn if a DO loop with step either 1 or -1 yields an underflow or an overflow
  +during iteration of an induction variable of the loop.  Enabled by default.

I think the "Enabled by default" is misleading as it is only enabled by -Wall,
i.e. I had expected something like: "This warning is enabled by @option{-Wall}."

Cheers,

Tobias

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step
  2016-07-07 14:40   ` Jan Hubicka
@ 2016-07-08  8:34     ` Martin Liška
  2016-07-08  8:40       ` Jakub Jelinek
  0 siblings, 1 reply; 26+ messages in thread
From: Martin Liška @ 2016-07-08  8:34 UTC (permalink / raw)
  To: Jan Hubicka, Richard Biener
  Cc: GCC Patches, Dominique Dhumieres, williamclodius

[-- Attachment #1: Type: text/plain, Size: 1419 bytes --]

On 07/07/2016 04:40 PM, Jan Hubicka wrote:
>>
>> Why is the behavior only undefined for step 1 if the last iteration IV
>> increment overflows?
>> Doesn't this apply to all step values?
> 
> This is what Fortran standard says:
> 
>   The iteration count is established and is the value of the expression (m2-m1+m3)/m3 unless that value is negative,
>   in which case the iteration count is 0.
> 
> My reading of this is that the do statement is undefined whenever the expression above is undefined
> (m1 is lower bound, m2 is upper bound, m3 is step) and because I think the evaulation order of
> m2-m1+m3 is not fixed, I think the statement is not defined whethever (m2-m1), (m1+m3) or (m2-m1)+m3
> overflows or underflows as signed integer.
> 
> For example it is not valid to iterate from -10 to INT_MAX with step 1.
> Honza
> 

Hi.

I'm attaching a candidate patch that emits the warnings. Problem with current implementation of loop generation
(w/ step different than 1) is that it utilizes unsigned type, thus the calculation of iteration count works
even though the expression overflows:

  do i = array(1), array(2), 17 
      block(i) = block(i) + 10
  end do

is transformed to:

    D.3428 = (*array)[0];
    D.3429 = (*array)[1];
    i = D.3428;
    countm1.0 = ((unsigned int) D.3429 - (unsigned int) D.3428) / 17;, if (D.3429 < D.3428)
      {
        goto L.2;
      };

Thoughts about the patch?
Martin

[-- Attachment #2: 0001-Enhance-warning-message-for-DO-loops-with-step-1.patch --]
[-- Type: text/x-patch, Size: 4057 bytes --]

From 744e824a58f5861c4376e32c9e3169f4e52e2e00 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Fri, 8 Jul 2016 10:16:29 +0200
Subject: [PATCH] Enhance warning message for DO loops with |step| != 1.

---
 gcc/fortran/trans-stmt.c                        | 44 ++++++++++++++++++
 gcc/testsuite/gfortran.dg/do_undefined_warn.f90 | 61 +++++++++++++++++++++++++
 2 files changed, 105 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/do_undefined_warn.f90

diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c
index 6e4e2a7..1c1cd18 100644
--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -2014,6 +2014,7 @@ gfc_trans_do (gfc_code * code, tree exit_cond)
   gfc_start_block (&block);
 
   loc = code->ext.iterator->start->where.lb->location;
+  locus *location = &code->ext.iterator->start->where;
 
   /* Evaluate all the expressions in the iterator.  */
   gfc_init_se (&se, NULL);
@@ -2097,6 +2098,49 @@ gfc_trans_do (gfc_code * code, tree exit_cond)
     {
       tree pos, neg, tou, fromu, stepu, tmp2;
 
+      if (TREE_CODE (from) == INTEGER_CST
+	  && TREE_CODE (to) == INTEGER_CST
+	  && TREE_CODE (step) == INTEGER_CST)
+	{
+	  wide_int t = to;
+	  wide_int f = from;
+	  wide_int s = step;
+	  bool r1_overflow, r2_overflow, r3_overflow;
+	  bool cmp1, cmp2, cmp3;
+	  bool has_warning = false;
+
+	  wide_int r1 = wi::sub (t, f, SIGNED, &r1_overflow);
+	  cmp1 = wi::les_p (f, t);
+	  wi::add (f, s, SIGNED, &r2_overflow);
+	  cmp2 = !wi::neg_p (f);
+	  wi::add (r1, s, SIGNED, &r3_overflow);
+	  cmp3 = wi::les_p (s, r1);
+
+	  if (r1_overflow)
+	    {
+	      gfc_warning (OPT_Wundefined_do_loop,
+			   "DO loop at %L is undefined as the expression "
+			   "TO - FROM %s",
+			   location, cmp1 ? "overflows" : "underflows");
+	      has_warning = true;
+	    }
+
+	  if (r2_overflow && !has_warning)
+	    {
+	      gfc_warning (OPT_Wundefined_do_loop,
+			   "DO loop at %L is undefined as the expression "
+			   "FROM + STEP %s", location,
+			   cmp2 ? "overflows" : "underflows");
+	      has_warning = true;
+	    }
+
+	  if (r3_overflow && !has_warning)
+	    gfc_warning (OPT_Wundefined_do_loop,
+			 "DO loop at %L is undefined as the expression "
+			 "TO - FROM + STEP %s", location,
+			 cmp3 ? "overflows" : "underflows");
+	}
+
       /* The distance from FROM to TO cannot always be represented in a signed
          type, thus use unsigned arithmetic, also to avoid any undefined
 	 overflow issues.  */
diff --git a/gcc/testsuite/gfortran.dg/do_undefined_warn.f90 b/gcc/testsuite/gfortran.dg/do_undefined_warn.f90
new file mode 100644
index 0000000..3c20e00
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/do_undefined_warn.f90
@@ -0,0 +1,61 @@
+! { dg-options "-Wundefined-do-loop" }
+! Program to check corner cases for DO statements.
+
+function test2(array, s, block)
+integer(1) :: i, block(9), array(2)
+integer(2) :: j
+integer (8) :: s
+s = 1
+
+do i = -HUGE(i)-1, HUGE(i), 2 ! { dg-warning "is undefined as the expression TO - FROM overflows" }
+  s = s + 1
+end do
+
+do i = HUGE(i) - 10, HUGE(i), 12 ! { dg-warning "is undefined as the expression FROM \\+ STEP overflows" }
+  s = s + 1
+end do
+
+do i = 10, HUGE(i) - 10, 100 ! { dg-warning "is undefined as the expression TO - FROM \\+ STEP overflows" }
+ s = s + 1
+end do
+
+do i = 2, -HUGE(i)-1, 2 ! { dg-warning "is undefined as the expression TO - FROM underflows" }
+  s = s + 1
+end do
+
+do i = -HUGE(i)+11, 0, -13 ! { dg-warning "is undefined as the expression FROM \\+ STEP underflows" }
+  s = s + 1
+end do
+
+do i = -5, -HUGE(i) + 10, -22 ! { dg-warning "is undefined as the expression TO - FROM \\+ STEP underflows" }
+ s = s + 1
+end do
+
+! wider type
+
+do j = -HUGE(i)-1, HUGE(i), 2
+  s = s + 1
+end do
+
+do j = HUGE(i) - 10, HUGE(i), 12
+  s = s + 1
+end do
+
+do j = 10, HUGE(i) - 10, 100
+ s = s + 1
+end do
+
+do j = 2, -HUGE(i)-1, 2
+  s = s + 1
+end do
+
+do j = -HUGE(i)+11, 0, -13
+  s = s + 1
+end do
+
+do j = -5, -HUGE(i) + 10, -22
+ s = s + 1
+end do
+
+
+end function test2
-- 
2.8.4


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] Optimize fortran loops with +-1 step.
  2016-07-07 15:54     ` Tobias Burnus
@ 2016-07-08  8:39       ` Martin Liška
  0 siblings, 0 replies; 26+ messages in thread
From: Martin Liška @ 2016-07-08  8:39 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1208 bytes --]

On 07/07/2016 05:53 PM, Tobias Burnus wrote:
> On Thu, Jul 07, 2016 at 02:13:12PM +0200, Tobias Burnus wrote:
>> marxin wrote:
>>> gcc/fortran/ChangeLog:
>>>
>>> 2016-07-01  Martin Liska  <mliska@suse.cz>
>>> 	* lang.opt (Wundefined-do-loop): New option.
>>>        * resolve.c (gfc_resolve_iterator): Warn for Wundefined-do-loop.
>>> 	(gfc_trans_simple_do): Generate a c-style loop.
>>> 	(gfc_trans_do): Fix GNU coding style.
>>
>> Can you also document the new warning in gcc/fortran/invoke.texi?
>>
>> Otherwise, this patch looks good to me. Thanks for working on it.
> 
> 
> If I look at the commit to .opt,
> 
>   +Wundefined-do-loop
>   +Fortran Warning Var(warn_undefined_do_loop) LangEnabledBy(Fortran,Wall)
> 
> and to .texi,
> 
>   +@item -Wundefined-do-loop
>   [...]
>   +Warn if a DO loop with step either 1 or -1 yields an underflow or an overflow
>   +during iteration of an induction variable of the loop.  Enabled by default.
> 
> I think the "Enabled by default" is misleading as it is only enabled by -Wall,
> i.e. I had expected something like: "This warning is enabled by @option{-Wall}."
> 
> Cheers,
> 
> Tobias
> 

Thanks for the review, I'm going to install following patch.

Martin

[-- Attachment #2: 0001-Enhance-documentation-of-Wundefined-do-loop.patch --]
[-- Type: text/x-patch, Size: 1049 bytes --]

From 5bfd6b5672a99952dac5aeb6c9a86a36affb0065 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Fri, 8 Jul 2016 10:36:44 +0200
Subject: [PATCH] Enhance documentation of Wundefined-do-loop

gcc/fortran/ChangeLog:

2016-07-08  Martin Liska  <mliska@suse.cz>

	* invoke.texi (Wundefined-do-loop): Enhance documentation.
---
 gcc/fortran/invoke.texi | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/fortran/invoke.texi b/gcc/fortran/invoke.texi
index c0be1ab..87baf15 100644
--- a/gcc/fortran/invoke.texi
+++ b/gcc/fortran/invoke.texi
@@ -929,7 +929,8 @@ is active for @option{-pedantic}, @option{-std=f95}, @option{-std=f2003},
 @opindex @code{Wundefined-do-loop}
 @cindex warnings, undefined do loop
 Warn if a DO loop with step either 1 or -1 yields an underflow or an overflow
-during iteration of an induction variable of the loop.  Enabled by default.
+during iteration of an induction variable of the loop.
+This option is implied by @option{-Wall}.
 
 @item -Wunderflow
 @opindex @code{Wunderflow}
-- 
2.8.4


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step
  2016-07-08  8:34     ` Martin Liška
@ 2016-07-08  8:40       ` Jakub Jelinek
  2016-07-08  9:03         ` Jan Hubicka
  0 siblings, 1 reply; 26+ messages in thread
From: Jakub Jelinek @ 2016-07-08  8:40 UTC (permalink / raw)
  To: Martin Liška
  Cc: Jan Hubicka, Richard Biener, GCC Patches, Dominique Dhumieres,
	williamclodius

On Fri, Jul 08, 2016 at 10:33:45AM +0200, Martin LiÅ¡ka wrote:
> On 07/07/2016 04:40 PM, Jan Hubicka wrote:
> >>
> >> Why is the behavior only undefined for step 1 if the last iteration IV
> >> increment overflows?
> >> Doesn't this apply to all step values?
> > 
> > This is what Fortran standard says:
> > 
> >   The iteration count is established and is the value of the expression (m2-m1+m3)/m3 unless that value is negative,
> >   in which case the iteration count is 0.
> > 
> > My reading of this is that the do statement is undefined whenever the expression above is undefined
> > (m1 is lower bound, m2 is upper bound, m3 is step) and because I think the evaulation order of
> > m2-m1+m3 is not fixed, I think the statement is not defined whethever (m2-m1), (m1+m3) or (m2-m1)+m3

m1+m3?  Did you mean m3-m1 or -m1+m3 instead?

	Jakub

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step
  2016-07-08  8:40       ` Jakub Jelinek
@ 2016-07-08  9:03         ` Jan Hubicka
  2016-07-08  9:05           ` Jakub Jelinek
  0 siblings, 1 reply; 26+ messages in thread
From: Jan Hubicka @ 2016-07-08  9:03 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Martin Liška, Jan Hubicka, Richard Biener, GCC Patches,
	Dominique Dhumieres, williamclodius

> On Fri, Jul 08, 2016 at 10:33:45AM +0200, Martin LiÂška wrote:
> > On 07/07/2016 04:40 PM, Jan Hubicka wrote:
> > >>
> > >> Why is the behavior only undefined for step 1 if the last iteration IV
> > >> increment overflows?
> > >> Doesn't this apply to all step values?
> > > 
> > > This is what Fortran standard says:
> > > 
> > >   The iteration count is established and is the value of the expression (m2-m1+m3)/m3 unless that value is negative,
> > >   in which case the iteration count is 0.
> > > 
> > > My reading of this is that the do statement is undefined whenever the expression above is undefined
> > > (m1 is lower bound, m2 is upper bound, m3 is step) and because I think the evaulation order of
> > > m2-m1+m3 is not fixed, I think the statement is not defined whethever (m2-m1), (m1+m3) or (m2-m1)+m3
> 
> m1+m3?  Did you mean m3-m1 or -m1+m3 instead?

Ah yes, -m1+m3.  But I am by no means language expert - this was meant as a heads up to Fortran
people :)

Honza
> 
> 	Jakub

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step
  2016-07-08  9:03         ` Jan Hubicka
@ 2016-07-08  9:05           ` Jakub Jelinek
  2016-07-08  9:13             ` FX
  0 siblings, 1 reply; 26+ messages in thread
From: Jakub Jelinek @ 2016-07-08  9:05 UTC (permalink / raw)
  To: Jan Hubicka, fortran
  Cc: Martin Liška, Richard Biener, GCC Patches,
	Dominique Dhumieres, williamclodius

On Fri, Jul 08, 2016 at 11:03:35AM +0200, Jan Hubicka wrote:
> > On Fri, Jul 08, 2016 at 10:33:45AM +0200, Martin LiÅ¡ka wrote:
> > > On 07/07/2016 04:40 PM, Jan Hubicka wrote:
> > > >>
> > > >> Why is the behavior only undefined for step 1 if the last iteration IV
> > > >> increment overflows?
> > > >> Doesn't this apply to all step values?
> > > > 
> > > > This is what Fortran standard says:
> > > > 
> > > >   The iteration count is established and is the value of the expression (m2-m1+m3)/m3 unless that value is negative,
> > > >   in which case the iteration count is 0.
> > > > 
> > > > My reading of this is that the do statement is undefined whenever the expression above is undefined
> > > > (m1 is lower bound, m2 is upper bound, m3 is step) and because I think the evaulation order of
> > > > m2-m1+m3 is not fixed, I think the statement is not defined whethever (m2-m1), (m1+m3) or (m2-m1)+m3
> > 
> > m1+m3?  Did you mean m3-m1 or -m1+m3 instead?
> 
> Ah yes, -m1+m3.  But I am by no means language expert - this was meant as a heads up to Fortran
> people :)

Heads up to Fortran people should be sent to fortran@gcc.gnu.org ;)

	Jakub

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step
  2016-07-08  9:05           ` Jakub Jelinek
@ 2016-07-08  9:13             ` FX
  0 siblings, 0 replies; 26+ messages in thread
From: FX @ 2016-07-08  9:13 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jan Hubicka, fortran, Martin Liška, Richard Biener,
	GCC Patches, Dominique Dhumieres, williamclodius

>>>>> This is what Fortran standard says:
>>>>> 
>>>>>  The iteration count is established and is the value of the expression (m2-m1+m3)/m3 unless that value is negative,
>>>>>  in which case the iteration count is 0.
>>>>> 
>>>>> My reading of this is that the do statement is undefined whenever the expression above is undefined
>>>>> (m1 is lower bound, m2 is upper bound, m3 is step) and because I think the evaulation order of
>>>>> m2-m1+m3 is not fixed, I think the statement is not defined whethever (m2-m1), (m1+m3) or (m2-m1)+m3

In the Fortran standard, (m2-m1+m3)/m3 is a mathematical expression, not a “construct”. So it cannot be “undefined”.
If you have explicit cases where you are asking “is this valid or invalid” please post them here (fortran@) and we will tell you.

FX

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH] Fix Fortran DO loop fallback
  2016-07-07  9:47 [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step marxin
                   ` (2 preceding siblings ...)
  2016-07-07 14:00 ` [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step Richard Biener
@ 2016-07-08 14:26 ` Martin Liška
  2016-07-11 14:44   ` Jeff Law
  3 siblings, 1 reply; 26+ messages in thread
From: Martin Liška @ 2016-07-08 14:26 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 180 bytes --]

Hello

Following patch fixes fallout caused by the patch set:
https://gcc.gnu.org/ml/gcc-regression/2016-07/msg00097.html

Ready after it finished regression tests?
Thanks,
Martin

[-- Attachment #2: 0001-Fix-Fortran-DO-loop-fallback.patch --]
[-- Type: text/x-patch, Size: 2181 bytes --]

From c5dd7ad62f795cce560c7f1bb8767b7ed9298d8a Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Fri, 8 Jul 2016 15:51:54 +0200
Subject: [PATCH] Fix Fortran DO loop fallback

gcc/testsuite/ChangeLog:

2016-07-08  Martin Liska  <mliska@suse.cz>

	* gfortran.dg/ldist-1.f90: Update expected dump scan.
	* gfortran.dg/pr42108.f90: Likewise.
	* gfortran.dg/vect/pr62283.f: Likewise.
---
 gcc/testsuite/gfortran.dg/ldist-1.f90    | 2 +-
 gcc/testsuite/gfortran.dg/pr42108.f90    | 2 --
 gcc/testsuite/gfortran.dg/vect/pr62283.f | 2 +-
 3 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gfortran.dg/ldist-1.f90 b/gcc/testsuite/gfortran.dg/ldist-1.f90
index 2030328..071a651 100644
--- a/gcc/testsuite/gfortran.dg/ldist-1.f90
+++ b/gcc/testsuite/gfortran.dg/ldist-1.f90
@@ -32,4 +32,4 @@ end Subroutine PADEC
 ! There are 5 legal partitions in this code.  Based on the data
 ! locality heuristic, this loop should not be split.
 
-! { dg-final { scan-tree-dump "distributed: split to" "ldist" } }
+! { dg-final { scan-tree-dump-times "distributed: split to" 0 "ldist" } }
diff --git a/gcc/testsuite/gfortran.dg/pr42108.f90 b/gcc/testsuite/gfortran.dg/pr42108.f90
index eb93604..a913aa4 100644
--- a/gcc/testsuite/gfortran.dg/pr42108.f90
+++ b/gcc/testsuite/gfortran.dg/pr42108.f90
@@ -21,7 +21,5 @@ subroutine  eval(foo1,foo2,foo3,foo4,x,n,nnd)
   end do
 end subroutine eval
 
-! We should have hoisted the division
-! { dg-final { scan-tree-dump "in all uses of countm1\[^\n\]* / " "pre" } }
 ! There should be only one load from n left
 ! { dg-final { scan-tree-dump-times "\\*n_" 1 "fre1" } }
diff --git a/gcc/testsuite/gfortran.dg/vect/pr62283.f b/gcc/testsuite/gfortran.dg/vect/pr62283.f
index 7df3d99..2933f51 100644
--- a/gcc/testsuite/gfortran.dg/vect/pr62283.f
+++ b/gcc/testsuite/gfortran.dg/vect/pr62283.f
@@ -13,4 +13,4 @@ C { dg-additional-options "-fvect-cost-model=dynamic" }
       beta=3.141593
       y=y+beta*x
       end
-C { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target { vect_hw_misalign } } } }
+C { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { target { vect_hw_misalign } } } }
-- 
2.8.4


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] Optimize fortran loops with +-1 step.
  2016-07-07  9:47 ` [PATCH 2/2] Optimize fortran " marxin
  2016-07-07 12:13   ` Tobias Burnus
@ 2016-07-10  7:37   ` Andreas Schwab
  2016-07-12 10:08     ` Richard Biener
  1 sibling, 1 reply; 26+ messages in thread
From: Andreas Schwab @ 2016-07-10  7:37 UTC (permalink / raw)
  To: marxin; +Cc: gcc-patches, hubicka

marxin <mliska@suse.cz> writes:

> diff --git a/gcc/testsuite/gfortran.dg/ldist-1.f90 b/gcc/testsuite/gfortran.dg/ldist-1.f90
> index ea3990d..2030328 100644
> --- a/gcc/testsuite/gfortran.dg/ldist-1.f90
> +++ b/gcc/testsuite/gfortran.dg/ldist-1.f90
> @@ -32,4 +32,4 @@ end Subroutine PADEC
>  ! There are 5 legal partitions in this code.  Based on the data
>  ! locality heuristic, this loop should not be split.
>  
> -! { dg-final { scan-tree-dump-not "distributed: split to" "ldist" } }
> +! { dg-final { scan-tree-dump "distributed: split to" "ldist" } }

FAIL: gfortran.dg/ldist-1.f90   -O   scan-tree-dump ldist "distributed: split to"

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] Fix Fortran DO loop fallback
  2016-07-08 14:26 ` [PATCH] Fix Fortran DO loop fallback Martin Liška
@ 2016-07-11 14:44   ` Jeff Law
  2016-07-11 15:24     ` Mike Stump
  2016-07-12 10:15     ` Richard Biener
  0 siblings, 2 replies; 26+ messages in thread
From: Jeff Law @ 2016-07-11 14:44 UTC (permalink / raw)
  To: Martin Liška, gcc-patches

On 07/08/2016 08:26 AM, Martin LiÂška wrote:
> Hello
>
> Following patch fixes fallout caused by the patch set:
> https://gcc.gnu.org/ml/gcc-regression/2016-07/msg00097.html
>
> Ready after it finished regression tests?
> Thanks,
> Martin
>
>
> 0001-Fix-Fortran-DO-loop-fallback.patch
>
>
> From c5dd7ad62f795cce560c7f1bb8767b7ed9298d8a Mon Sep 17 00:00:00 2001
> From: marxin <mliska@suse.cz>
> Date: Fri, 8 Jul 2016 15:51:54 +0200
> Subject: [PATCH] Fix Fortran DO loop fallback
>
> gcc/testsuite/ChangeLog:
>
> 2016-07-08  Martin Liska  <mliska@suse.cz>
>
> 	* gfortran.dg/ldist-1.f90: Update expected dump scan.
> 	* gfortran.dg/pr42108.f90: Likewise.
> 	* gfortran.dg/vect/pr62283.f: Likewise.
Shouldn't ldist-1.f90 be scan-tree-dump-not?  It seems like you change 
it from that just last week, but it wasn't mentioned in the ChangeLog.

OK with that change.

jeff


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] Fix Fortran DO loop fallback
  2016-07-11 14:44   ` Jeff Law
@ 2016-07-11 15:24     ` Mike Stump
  2016-07-12 10:15     ` Richard Biener
  1 sibling, 0 replies; 26+ messages in thread
From: Mike Stump @ 2016-07-11 15:24 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Patches, Fortran List, Jeff Law


> On Jul 11, 2016, at 7:44 AM, Jeff Law <law@redhat.com> wrote:
> 
> On 07/08/2016 08:26 AM, Martin Liška wrote:
>> Hello
>> 
>> Following patch fixes fallout caused by the patch set:
>> https://gcc.gnu.org/ml/gcc-regression/2016-07/msg00097.html
>> 
>> Ready after it finished regression tests?
>> Thanks,
>> Martin
>> 
>> 
>> 0001-Fix-Fortran-DO-loop-fallback.patch
>> 
>> 
>> From c5dd7ad62f795cce560c7f1bb8767b7ed9298d8a Mon Sep 17 00:00:00 2001
>> From: marxin <mliska@suse.cz>
>> Date: Fri, 8 Jul 2016 15:51:54 +0200
>> Subject: [PATCH] Fix Fortran DO loop fallback
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> 2016-07-08  Martin Liska  <mliska@suse.cz>
>> 
>> 	* gfortran.dg/ldist-1.f90: Update expected dump scan.
>> 	* gfortran.dg/pr42108.f90: Likewise.
>> 	* gfortran.dg/vect/pr62283.f: Likewise.
> Shouldn't ldist-1.f90 be scan-tree-dump-not?  It seems like you change it from that just last week, but it wasn't mentioned in the ChangeLog.
> 
> OK with that change.

A gentle reminder, fortran patches should include the fortran list...

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] Optimize fortran loops with +-1 step.
  2016-07-10  7:37   ` Andreas Schwab
@ 2016-07-12 10:08     ` Richard Biener
  2016-07-12 12:09       ` Martin Liška
  0 siblings, 1 reply; 26+ messages in thread
From: Richard Biener @ 2016-07-12 10:08 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: marxin, GCC Patches, Jan Hubicka

On Sun, Jul 10, 2016 at 9:37 AM, Andreas Schwab <schwab@linux-m68k.org> wrote:
> marxin <mliska@suse.cz> writes:
>
>> diff --git a/gcc/testsuite/gfortran.dg/ldist-1.f90 b/gcc/testsuite/gfortran.dg/ldist-1.f90
>> index ea3990d..2030328 100644
>> --- a/gcc/testsuite/gfortran.dg/ldist-1.f90
>> +++ b/gcc/testsuite/gfortran.dg/ldist-1.f90
>> @@ -32,4 +32,4 @@ end Subroutine PADEC
>>  ! There are 5 legal partitions in this code.  Based on the data
>>  ! locality heuristic, this loop should not be split.
>>
>> -! { dg-final { scan-tree-dump-not "distributed: split to" "ldist" } }
>> +! { dg-final { scan-tree-dump "distributed: split to" "ldist" } }
>
> FAIL: gfortran.dg/ldist-1.f90   -O   scan-tree-dump ldist "distributed: split to"

That change is certainly spurious (not in ChangeLog) and bogus given the
comment before the scan.

Richard.

> Andreas.
>
> --
> Andreas Schwab, schwab@linux-m68k.org
> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
> "And now for something completely different."

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] Fix Fortran DO loop fallback
  2016-07-11 14:44   ` Jeff Law
  2016-07-11 15:24     ` Mike Stump
@ 2016-07-12 10:15     ` Richard Biener
  2016-07-12 12:31       ` Martin Liška
  1 sibling, 1 reply; 26+ messages in thread
From: Richard Biener @ 2016-07-12 10:15 UTC (permalink / raw)
  To: Jeff Law; +Cc: Martin Liška, GCC Patches

On Mon, Jul 11, 2016 at 4:44 PM, Jeff Law <law@redhat.com> wrote:
> On 07/08/2016 08:26 AM, Martin Liška wrote:
>>
>> Hello
>>
>> Following patch fixes fallout caused by the patch set:
>> https://gcc.gnu.org/ml/gcc-regression/2016-07/msg00097.html
>>
>> Ready after it finished regression tests?
>> Thanks,
>> Martin
>>
>>
>> 0001-Fix-Fortran-DO-loop-fallback.patch
>>
>>
>> From c5dd7ad62f795cce560c7f1bb8767b7ed9298d8a Mon Sep 17 00:00:00 2001
>> From: marxin <mliska@suse.cz>
>> Date: Fri, 8 Jul 2016 15:51:54 +0200
>> Subject: [PATCH] Fix Fortran DO loop fallback
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2016-07-08  Martin Liska  <mliska@suse.cz>
>>
>>         * gfortran.dg/ldist-1.f90: Update expected dump scan.
>>         * gfortran.dg/pr42108.f90: Likewise.
>>         * gfortran.dg/vect/pr62283.f: Likewise.
>
> Shouldn't ldist-1.f90 be scan-tree-dump-not?  It seems like you change it
> from that just last week, but it wasn't mentioned in the ChangeLog.

gfortran.dg/pr42108.f90 also looks pointless now?  I suppose there is nothing
to hoist after the change?  Do we now have an option to revert back to old
behavior?  If so it would be better to use that flag here.

diff --git a/gcc/testsuite/gfortran.dg/vect/pr62283.f
b/gcc/testsuite/gfortran.dg/vect/pr62283.f
index 7df3d99..2933f51 100644
--- a/gcc/testsuite/gfortran.dg/vect/pr62283.f
+++ b/gcc/testsuite/gfortran.dg/vect/pr62283.f
@@ -13,4 +13,4 @@ C { dg-additional-options "-fvect-cost-model=dynamic" }
       beta=3.141593
       y=y+beta*x
       end
-C { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" {
target { vect_hw_misalign } } } }
+C { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { target {
vect_hw_misalign } } } }

so why do we no longer vectorize 1 loops in two functions?  The
testcase works for me
unchanged.

Richard.

> OK with that change.
>
> jeff
>
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] Optimize fortran loops with +-1 step.
  2016-07-12 10:08     ` Richard Biener
@ 2016-07-12 12:09       ` Martin Liška
  0 siblings, 0 replies; 26+ messages in thread
From: Martin Liška @ 2016-07-12 12:09 UTC (permalink / raw)
  To: Richard Biener, Andreas Schwab; +Cc: GCC Patches, Jan Hubicka

On 07/12/2016 12:07 PM, Richard Biener wrote:
> That change is certainly spurious (not in ChangeLog) and bogus given the
> comment before the scan.
> 
> Richard.

Hi.

I'll revert the change in ldist-1.f90 which I accidentally committed a week ago,
it's going to be part of second version of the patch

M.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] Fix Fortran DO loop fallback
  2016-07-12 10:15     ` Richard Biener
@ 2016-07-12 12:31       ` Martin Liška
  2016-07-12 13:15         ` Richard Biener
  0 siblings, 1 reply; 26+ messages in thread
From: Martin Liška @ 2016-07-12 12:31 UTC (permalink / raw)
  To: Richard Biener, Jeff Law; +Cc: GCC Patches, fortran

On 07/12/2016 12:14 PM, Richard Biener wrote:
> On Mon, Jul 11, 2016 at 4:44 PM, Jeff Law <law@redhat.com> wrote:
>> On 07/08/2016 08:26 AM, Martin LiÅ¡ka wrote:
>>>
>>> Hello
>>>
>>> Following patch fixes fallout caused by the patch set:
>>> https://gcc.gnu.org/ml/gcc-regression/2016-07/msg00097.html
>>>
>>> Ready after it finished regression tests?
>>> Thanks,
>>> Martin
>>>
>>>
>>> 0001-Fix-Fortran-DO-loop-fallback.patch
>>>
>>>
>>> From c5dd7ad62f795cce560c7f1bb8767b7ed9298d8a Mon Sep 17 00:00:00 2001
>>> From: marxin <mliska@suse.cz>
>>> Date: Fri, 8 Jul 2016 15:51:54 +0200
>>> Subject: [PATCH] Fix Fortran DO loop fallback
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2016-07-08  Martin Liska  <mliska@suse.cz>
>>>
>>>         * gfortran.dg/ldist-1.f90: Update expected dump scan.
>>>         * gfortran.dg/pr42108.f90: Likewise.
>>>         * gfortran.dg/vect/pr62283.f: Likewise.
>>
>> Shouldn't ldist-1.f90 be scan-tree-dump-not?  It seems like you change it
>> from that just last week, but it wasn't mentioned in the ChangeLog.
> 
> gfortran.dg/pr42108.f90 also looks pointless now?  I suppose there is nothing
> to hoist after the change?  Do we now have an option to revert back to old
> behavior?  If so it would be better to use that flag here.

No, there's no option. So, as the test-case now looks pointless, should I mark it
with xfail now?

> 
> diff --git a/gcc/testsuite/gfortran.dg/vect/pr62283.f
> b/gcc/testsuite/gfortran.dg/vect/pr62283.f
> index 7df3d99..2933f51 100644
> --- a/gcc/testsuite/gfortran.dg/vect/pr62283.f
> +++ b/gcc/testsuite/gfortran.dg/vect/pr62283.f
> @@ -13,4 +13,4 @@ C { dg-additional-options "-fvect-cost-model=dynamic" }
>        beta=3.141593
>        y=y+beta*x
>        end
> -C { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" {
> target { vect_hw_misalign } } } }
> +C { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { target {
> vect_hw_misalign } } } }
> 
> so why do we no longer vectorize 1 loops in two functions?  The
> testcase works for me
> unchanged.

Yeah, it works on -m64, however as we're able to merge the functions with -m32 (-fipa-icf),
then I recommend to disable ICF for the test-case.

Reason why the pair of functions on x86_64 is that:

test3 (real(kind=4)[4] * restrict x, real(kind=4)[4] * restrict y)
{
  <bb 2>:

  <bb 3>:
  # S.0_6 = PHI <1(2), S.0_12(4)>
  if (S.0_6 > 4)
    goto <bb 5>;
  else
    goto <bb 4>;
...

test2 (real(kind=4)[4] * restrict x, real(kind=4)[4] * restrict y)
{
  integer(kind=4) i;

  <bb 2>:

  <bb 3>:
  # i_6 = PHI <1(2), i_12(4)>
...

On x86_64 types of 'S.0_6' and 'i' are not compatible. As I've just read tree dump files,
  # S.0_6 = PHI <1(2), S.0_12(4)>
  if (S.0_6 > 4)

is optimized out by VRP, which runs after IPA ICF.

I'll send patch right after we'll agree on pr42108.f90.

Thanks,
Martin

> 
> Richard.
> 
>> OK with that change.
>>
>> jeff
>>
>>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] Fix Fortran DO loop fallback
  2016-07-12 12:31       ` Martin Liška
@ 2016-07-12 13:15         ` Richard Biener
  2016-07-12 14:54           ` Martin Liška
  0 siblings, 1 reply; 26+ messages in thread
From: Richard Biener @ 2016-07-12 13:15 UTC (permalink / raw)
  To: Martin Liška; +Cc: Jeff Law, GCC Patches, fortran

On Tue, Jul 12, 2016 at 2:31 PM, Martin Liška <mliska@suse.cz> wrote:
> On 07/12/2016 12:14 PM, Richard Biener wrote:
>> On Mon, Jul 11, 2016 at 4:44 PM, Jeff Law <law@redhat.com> wrote:
>>> On 07/08/2016 08:26 AM, Martin Liška wrote:
>>>>
>>>> Hello
>>>>
>>>> Following patch fixes fallout caused by the patch set:
>>>> https://gcc.gnu.org/ml/gcc-regression/2016-07/msg00097.html
>>>>
>>>> Ready after it finished regression tests?
>>>> Thanks,
>>>> Martin
>>>>
>>>>
>>>> 0001-Fix-Fortran-DO-loop-fallback.patch
>>>>
>>>>
>>>> From c5dd7ad62f795cce560c7f1bb8767b7ed9298d8a Mon Sep 17 00:00:00 2001
>>>> From: marxin <mliska@suse.cz>
>>>> Date: Fri, 8 Jul 2016 15:51:54 +0200
>>>> Subject: [PATCH] Fix Fortran DO loop fallback
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>> 2016-07-08  Martin Liska  <mliska@suse.cz>
>>>>
>>>>         * gfortran.dg/ldist-1.f90: Update expected dump scan.
>>>>         * gfortran.dg/pr42108.f90: Likewise.
>>>>         * gfortran.dg/vect/pr62283.f: Likewise.
>>>
>>> Shouldn't ldist-1.f90 be scan-tree-dump-not?  It seems like you change it
>>> from that just last week, but it wasn't mentioned in the ChangeLog.
>>
>> gfortran.dg/pr42108.f90 also looks pointless now?  I suppose there is nothing
>> to hoist after the change?  Do we now have an option to revert back to old
>> behavior?  If so it would be better to use that flag here.
>
> No, there's no option. So, as the test-case now looks pointless, should I mark it
> with xfail now?

The scan for 1 *n_ after FRE looks still useful.  Btw, the testcase
doesn't fail for me,
we _do_ hoist the division in PRE, just not with -m32 anymore.  Can
you confirm this?


>>
>> diff --git a/gcc/testsuite/gfortran.dg/vect/pr62283.f
>> b/gcc/testsuite/gfortran.dg/vect/pr62283.f
>> index 7df3d99..2933f51 100644
>> --- a/gcc/testsuite/gfortran.dg/vect/pr62283.f
>> +++ b/gcc/testsuite/gfortran.dg/vect/pr62283.f
>> @@ -13,4 +13,4 @@ C { dg-additional-options "-fvect-cost-model=dynamic" }
>>        beta=3.141593
>>        y=y+beta*x
>>        end
>> -C { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" {
>> target { vect_hw_misalign } } } }
>> +C { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { target {
>> vect_hw_misalign } } } }
>>
>> so why do we no longer vectorize 1 loops in two functions?  The
>> testcase works for me
>> unchanged.
>
> Yeah, it works on -m64, however as we're able to merge the functions with -m32 (-fipa-icf),
> then I recommend to disable ICF for the test-case.
>
> Reason why the pair of functions on x86_64 is that:
>
> test3 (real(kind=4)[4] * restrict x, real(kind=4)[4] * restrict y)
> {
>   <bb 2>:
>
>   <bb 3>:
>   # S.0_6 = PHI <1(2), S.0_12(4)>
>   if (S.0_6 > 4)
>     goto <bb 5>;
>   else
>     goto <bb 4>;
> ...
>
> test2 (real(kind=4)[4] * restrict x, real(kind=4)[4] * restrict y)
> {
>   integer(kind=4) i;
>
>   <bb 2>:
>
>   <bb 3>:
>   # i_6 = PHI <1(2), i_12(4)>
> ...
>
> On x86_64 types of 'S.0_6' and 'i' are not compatible. As I've just read tree dump files,
>   # S.0_6 = PHI <1(2), S.0_12(4)>
>   if (S.0_6 > 4)
>
> is optimized out by VRP, which runs after IPA ICF.
>
> I'll send patch right after we'll agree on pr42108.f90.
>
> Thanks,
> Martin
>
>>
>> Richard.
>>
>>> OK with that change.
>>>
>>> jeff
>>>
>>>
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] Fix Fortran DO loop fallback
  2016-07-12 13:15         ` Richard Biener
@ 2016-07-12 14:54           ` Martin Liška
  2016-07-13 10:32             ` Richard Biener
  0 siblings, 1 reply; 26+ messages in thread
From: Martin Liška @ 2016-07-12 14:54 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, GCC Patches, fortran

[-- Attachment #1: Type: text/plain, Size: 339 bytes --]

On 07/12/2016 03:15 PM, Richard Biener wrote:
> The scan for 1 *n_ after FRE looks still useful.  Btw, the testcase
> doesn't fail for me,
> we _do_ hoist the division in PRE, just not with -m32 anymore.  Can
> you confirm this?

Yeah, it works for both -m32 and -m64.
This is final version of the patch, may I install it?

Thanks,
Martin

[-- Attachment #2: 0001-Fix-Fortran-DO-loop-fallback-v2.patch --]
[-- Type: text/x-patch, Size: 2149 bytes --]

From e23c13dcc2f5e1967b0941c7a627241e3b6759e8 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Fri, 8 Jul 2016 15:51:54 +0200
Subject: [PATCH] Fix Fortran DO loop fallback

gcc/testsuite/ChangeLog:

2016-07-08  Martin Liska  <mliska@suse.cz>

	* gfortran.dg/ldist-1.f90: Revert change introduces in r238114.
	* gfortran.dg/pr42108.f90: Add -fno-ipa-icf to additional
	options.
	* gfortran.dg/vect/pr62283.f: Update expected dump scan.
---
 gcc/testsuite/gfortran.dg/ldist-1.f90    | 2 +-
 gcc/testsuite/gfortran.dg/pr42108.f90    | 2 --
 gcc/testsuite/gfortran.dg/vect/pr62283.f | 2 +-
 3 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gfortran.dg/ldist-1.f90 b/gcc/testsuite/gfortran.dg/ldist-1.f90
index 2030328..ea3990d 100644
--- a/gcc/testsuite/gfortran.dg/ldist-1.f90
+++ b/gcc/testsuite/gfortran.dg/ldist-1.f90
@@ -32,4 +32,4 @@ end Subroutine PADEC
 ! There are 5 legal partitions in this code.  Based on the data
 ! locality heuristic, this loop should not be split.
 
-! { dg-final { scan-tree-dump "distributed: split to" "ldist" } }
+! { dg-final { scan-tree-dump-not "distributed: split to" "ldist" } }
diff --git a/gcc/testsuite/gfortran.dg/pr42108.f90 b/gcc/testsuite/gfortran.dg/pr42108.f90
index eb93604..a913aa4 100644
--- a/gcc/testsuite/gfortran.dg/pr42108.f90
+++ b/gcc/testsuite/gfortran.dg/pr42108.f90
@@ -21,7 +21,5 @@ subroutine  eval(foo1,foo2,foo3,foo4,x,n,nnd)
   end do
 end subroutine eval
 
-! We should have hoisted the division
-! { dg-final { scan-tree-dump "in all uses of countm1\[^\n\]* / " "pre" } }
 ! There should be only one load from n left
 ! { dg-final { scan-tree-dump-times "\\*n_" 1 "fre1" } }
diff --git a/gcc/testsuite/gfortran.dg/vect/pr62283.f b/gcc/testsuite/gfortran.dg/vect/pr62283.f
index 7df3d99..4d8cba1 100644
--- a/gcc/testsuite/gfortran.dg/vect/pr62283.f
+++ b/gcc/testsuite/gfortran.dg/vect/pr62283.f
@@ -1,5 +1,5 @@
 C { dg-do compile }
-C { dg-additional-options "-fvect-cost-model=dynamic" }
+C { dg-additional-options "-fvect-cost-model=dynamic -fno-ipa-icf" }
       subroutine test2(x,y)
       real x(4),y(4)
       beta=3.141593
-- 
2.8.4


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] Fix Fortran DO loop fallback
  2016-07-12 14:54           ` Martin Liška
@ 2016-07-13 10:32             ` Richard Biener
  2016-07-13 14:04               ` Martin Liška
  0 siblings, 1 reply; 26+ messages in thread
From: Richard Biener @ 2016-07-13 10:32 UTC (permalink / raw)
  To: Martin Liška; +Cc: Jeff Law, GCC Patches, fortran

On Tue, Jul 12, 2016 at 4:54 PM, Martin Liška <mliska@suse.cz> wrote:
> On 07/12/2016 03:15 PM, Richard Biener wrote:
>> The scan for 1 *n_ after FRE looks still useful.  Btw, the testcase
>> doesn't fail for me,
>> we _do_ hoist the division in PRE, just not with -m32 anymore.  Can
>> you confirm this?
>
> Yeah, it works for both -m32 and -m64.
> This is final version of the patch, may I install it?

Changelog has a swapped entry.

Ok with fixing that.
Richard.

> Thanks,
> Martin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] Fix Fortran DO loop fallback
  2016-07-13 10:32             ` Richard Biener
@ 2016-07-13 14:04               ` Martin Liška
  0 siblings, 0 replies; 26+ messages in thread
From: Martin Liška @ 2016-07-13 14:04 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, GCC Patches, fortran

On 07/13/2016 12:32 PM, Richard Biener wrote:
> Changelog has a swapped entry.
> 
> Ok with fixing that.
> Richard.

Fixed and installed as r238300.
M.

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2016-07-13 14:04 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-07  9:47 [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step marxin
2016-07-07  9:47 ` [PATCH 2/2] Optimize fortran " marxin
2016-07-07 12:13   ` Tobias Burnus
2016-07-07 15:54     ` Tobias Burnus
2016-07-08  8:39       ` Martin Liška
2016-07-10  7:37   ` Andreas Schwab
2016-07-12 10:08     ` Richard Biener
2016-07-12 12:09       ` Martin Liška
2016-07-07  9:47 ` [PATCH 1/2] Add PRED_FORTRAN_LOOP_PREHEADER to DO loops with step bigger than +-1 marxin
2016-07-07 12:27   ` Tobias Burnus
2016-07-07 14:00 ` [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step Richard Biener
2016-07-07 14:40   ` Jan Hubicka
2016-07-08  8:34     ` Martin Liška
2016-07-08  8:40       ` Jakub Jelinek
2016-07-08  9:03         ` Jan Hubicka
2016-07-08  9:05           ` Jakub Jelinek
2016-07-08  9:13             ` FX
2016-07-08 14:26 ` [PATCH] Fix Fortran DO loop fallback Martin Liška
2016-07-11 14:44   ` Jeff Law
2016-07-11 15:24     ` Mike Stump
2016-07-12 10:15     ` Richard Biener
2016-07-12 12:31       ` Martin Liška
2016-07-12 13:15         ` Richard Biener
2016-07-12 14:54           ` Martin Liška
2016-07-13 10:32             ` Richard Biener
2016-07-13 14:04               ` Martin Liška

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).