From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <rguenth@sourceware.org>
Received: by sourceware.org (Postfix, from userid 1666)
	id 494303857835; Wed, 11 Jan 2023 11:59:49 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 494303857835
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1673438389;
	bh=HBBT6ADVdfil62TIx4/jJiy4tJQsBwwMlmfGnPFA19Q=;
	h=From:To:Subject:Date:From;
	b=MrgzYq1r5a7LDsq1nbZsWtOMf7svrZv9EmEfmHCbLNDXWo0ojJ7R2S+2dY+I9xlfm
	 BfF89lmYBGBWeoxaChX59TqqJ0Az6Votcv5403geREpmtJhumcG/xEA8cx8jvq7YTr
	 9LDaWcSnimTFdFFbyq2sCUe/uXQHYOT7Q4nOwzQY=
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
From: Richard Biener <rguenth@gcc.gnu.org>
To: gcc-cvs@gcc.gnu.org
Subject: [gcc r13-5103] tree-optimization/108352 - FSM threads creating
 irreducible loops
X-Act-Checkin: gcc
X-Git-Author: Richard Biener <rguenther@suse.de>
X-Git-Refname: refs/heads/master
X-Git-Oldrev: 445a48a226ffd530b37bcdc13b6bdca94ba2e122
X-Git-Newrev: 7c9f20fcfdc2d8453df88ceb7e693debfcd678c0
Message-Id: <20230111115949.494303857835@sourceware.org>
Date: Wed, 11 Jan 2023 11:59:49 +0000 (GMT)
List-Id: <gcc-cvs.sourceware.org>

https://gcc.gnu.org/g:7c9f20fcfdc2d8453df88ceb7e693debfcd678c0

commit r13-5103-g7c9f20fcfdc2d8453df88ceb7e693debfcd678c0
Author: Richard Biener <rguenther@suse.de>
Date:   Wed Jan 11 12:07:16 2023 +0100

    tree-optimization/108352 - FSM threads creating irreducible loops
    
    The following relaxes a heuristic that prevents creating irreducible
    loops from FSM threads not covering multi-way branches.  Instead of
    allowing threads that adhere to
    
          && (n_insns * (unsigned) param_fsm_scale_path_stmts
              > (m_path.length () *
                 (unsigned) param_fsm_scale_path_blocks))
    
    with reasoning "We also consider it worth creating an irreducible inner loop if
    the number of copied statement is low relative to the length of the path --
    in that case there's little the traditional loop optimizer would have done
    anyway, so an irreducible loop is not so bad." that I cannot make much
    sense of the following patch changes that to only allow those after
    loop optimization and when they are (scaled) short:
    
          && (!(cfun->curr_properties & PROP_loop_opts_done)
              || (m_n_insns * param_fsm_scale_path_stmts
                  >= param_max_jump_thread_duplication_stmts)))
    
    This allows us to get rid of --param fsm-scale-path-blocks which
    previous to the bisected revision allowed an enlarged path covering
    the original allowance (but we do not consider that enlarged path
    now because enlarging it doesn't add any information).
    
            PR tree-optimization/108352
            * tree-ssa-threadbackward.cc
            (back_threader_profitability::profitable_path_p): Adjust
            heuristic that allows non-multi-way branch threads creating
            irreducible loops.
            * doc/invoke.texi (--param fsm-scale-path-blocks): Remove.
            (--param fsm-scale-path-stmts): Adjust.
            * params.opt (--param=fsm-scale-path-blocks=): Remove.
            (-param=fsm-scale-path-stmts=): Adjust description.
    
            * gcc.dg/tree-ssa/ssa-thread-21.c: New testcase.
            * gcc.dg/tree-ssa/vrp46.c: Remove --param fsm-scale-path-blocks=1.

Diff:
---
 gcc/doc/invoke.texi                           |  7 ++-----
 gcc/params.opt                                |  6 +-----
 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-21.c | 26 ++++++++++++++++++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/vrp46.c         |  2 +-
 gcc/tree-ssa-threadbackward.cc                | 18 +++++++-----------
 5 files changed, 37 insertions(+), 22 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 80d942917bd..701c228bd0a 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15981,16 +15981,13 @@ Max. size of loc list for which reverse ops should be added.
 
 @item fsm-scale-path-stmts
 Scale factor to apply to the number of statements in a threading path
-when comparing to the number of (scaled) blocks.
+crossing a loop backedge when comparing to
+@option{--param=max-jump-thread-duplication-stmts}.
 
 @item uninit-control-dep-attempts
 Maximum number of nested calls to search for control dependencies
 during uninitialized variable analysis.
 
-@item fsm-scale-path-blocks
-Scale factor to apply to the number of blocks in a threading path
-when comparing to the number of (scaled) statements.
-
 @item sched-autopref-queue-depth
 Hardware autoprefetcher scheduler model control flag.
 Number of lookahead cycles the model looks into; at '
diff --git a/gcc/params.opt b/gcc/params.opt
index e178dec1600..929131254d2 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -134,13 +134,9 @@ Maximum number of basic blocks before EVRP uses a sparse cache.
 Common Joined UInteger Var(param_evrp_switch_limit) Init(50) Optimization Param
 Maximum number of outgoing edges in a switch before EVRP will not process it.
 
--param=fsm-scale-path-blocks=
-Common Joined UInteger Var(param_fsm_scale_path_blocks) Init(3) IntegerRange(1, 10) Param Optimization
-Scale factor to apply to the number of blocks in a threading path when comparing to the number of (scaled) statements.
-
 -param=fsm-scale-path-stmts=
 Common Joined UInteger Var(param_fsm_scale_path_stmts) Init(2) IntegerRange(1, 10) Param Optimization
-Scale factor to apply to the number of statements in a threading path when comparing to the number of (scaled) blocks.
+Scale factor to apply to the number of statements in a threading path crossing a loop backedge when comparing to max-jump-thread-duplication-stmts.
 
 -param=gcse-after-reload-critical-fraction=
 Common Joined UInteger Var(param_gcse_after_reload_critical_fraction) Init(10) Param Optimization
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-21.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-21.c
new file mode 100644
index 00000000000..16537ccfb61
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-21.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-thread2-stats -fdump-tree-optimized" } */
+
+long a;
+int b;
+void bar64_(void);
+void foo();
+int main() {
+  char c = 0;
+  unsigned d = 10;
+  int e = 2;
+  for (; d; d--) {
+    bar64_();
+    b = d;
+    e && (c = (e = 0) != 4) > 1;
+  }
+  if (c < 1)
+    foo();
+  a = b;
+}
+
+/* We need to perform a non-multi-way branch FSM thread creating an
+   irreducible loop in thread2 to allow followup threading to
+   remove the call to foo ().  */
+/* { dg-final { scan-tree-dump "Jumps threaded: 1" "thread2" } } */
+/* { dg-final { scan-tree-dump-not "foo" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp46.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp46.c
index ebdc2e3eedb..6d3f75209d1 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp46.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp46.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1 --param fsm-scale-path-blocks=1" } */
+/* { dg-options "-O2 -fdump-tree-vrp1" } */
 
 int func_81 (int);
 int func_98 (int);
diff --git a/gcc/tree-ssa-threadbackward.cc b/gcc/tree-ssa-threadbackward.cc
index 8a64535d195..fcbb95b08be 100644
--- a/gcc/tree-ssa-threadbackward.cc
+++ b/gcc/tree-ssa-threadbackward.cc
@@ -868,22 +868,18 @@ back_threader_profitability::profitable_path_p (const vec<basic_block> &m_path,
      a multiway branch, in which case we have deemed it worth losing
      other loop optimizations later.
 
-     We also consider it worth creating an irreducible inner loop if
-     the number of copied statement is low relative to the length of
-     the path -- in that case there's little the traditional loop
-     optimizer would have done anyway, so an irreducible loop is not
-     so bad.  */
+     We also consider it worth creating an irreducible inner loop after
+     loop optimizations if the number of copied statement is low.  */
   if (!m_threaded_multiway_branch
       && *creates_irreducible_loop
-      && (m_n_insns * (unsigned) param_fsm_scale_path_stmts
-	  > (m_path.length () *
-	     (unsigned) param_fsm_scale_path_blocks)))
-
+      && (!(cfun->curr_properties & PROP_loop_opts_done)
+	  || (m_n_insns * param_fsm_scale_path_stmts
+	      >= param_max_jump_thread_duplication_stmts)))
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
 	fprintf (dump_file,
-		 "  FAIL: Would create irreducible loop without threading "
-		 "multiway branch.\n");
+		 "  FAIL: Would create irreducible loop early without "
+		 "threading multiway branch.\n");
       /* We compute creates_irreducible_loop only late.  */
       return false;
     }