From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-491766-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 21744 invoked by alias); 6 Dec 2018 06:30:54 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 21734 invoked by uid 89); 6 Dec 2018 06:30:53 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-10.9 required=5.0 tests=BAYES_00,GIT_PATCH_2,GIT_PATCH_3,KAM_LAZY_DOMAIN_SECURITY,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=20181130, sk:generic, successive, 2018-11-30
X-HELO: mx1.redhat.com
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 06 Dec 2018 06:30:51 +0000
Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11])	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))	(No client certificate requested)	by mx1.redhat.com (Postfix) with ESMTPS id EC7E47B020;	Thu,  6 Dec 2018 06:30:49 +0000 (UTC)
Received: from tucnak.zalov.cz (ovpn-117-214.ams2.redhat.com [10.36.117.214])	by smtp.corp.redhat.com (Postfix) with ESMTPS id 7BC1D6012C;	Thu,  6 Dec 2018 06:30:49 +0000 (UTC)
Received: from tucnak.zalov.cz (localhost [127.0.0.1])	by tucnak.zalov.cz (8.15.2/8.15.2) with ESMTP id wB66Ukjq008964;	Thu, 6 Dec 2018 07:30:47 +0100
Received: (from jakub@localhost)	by tucnak.zalov.cz (8.15.2/8.15.2/Submit) id wB66Uk8v008963;	Thu, 6 Dec 2018 07:30:46 +0100
Date: Thu, 06 Dec 2018 06:30:00 -0000
From: Jakub Jelinek <jakub@redhat.com>
To: Richard Biener <rguenther@suse.de>
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] Don't optimize successive divs if there is also a mod with the same last arg (PR tree-optimization/85726)
Message-ID: <20181206063046.GM12380@tucnak>
Reply-To: Jakub Jelinek <jakub@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.9.2 (2017-12-15)
X-IsSubscribed: yes
X-SW-Source: 2018-12/txt/msg00340.txt.bz2

Hi!

This is my proposal for fixing this PR, just a heuristics when optimizing
successive divides might not be a good idea (first testcase), and includes
Marc's testcases which showed cases where optimizing successive divides is a
good idea even if the inner divide is not a single use.

Unfortunately match.pd doesn't allow to capture the outermost expression, so
I can't narrow it even more by checking if the outer divide is in the same
bb as the modulo.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-12-06  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/85726
	* generic-match-head.c (optimize_successive_divisions_p): New function.
	* gimple-match-head.c (optimize_successive_divisions_p): Likewise.
	* match.pd: Don't combine successive divisions if they aren't exact
	and optimize_successive_divisions_p is false.

	* gcc.dg/tree-ssa/pr85726-1.c: New test.
	* gcc.dg/tree-ssa/pr85726-2.c: New test.
	* gcc.dg/tree-ssa/pr85726-3.c: New test.
	* gcc.dg/tree-ssa/pr85726-4.c: New test.

--- gcc/generic-match-head.c.jj	2018-03-28 21:14:28.124743854 +0200
+++ gcc/generic-match-head.c	2018-12-05 16:07:43.710801347 +0100
@@ -77,3 +77,12 @@ canonicalize_math_after_vectorization_p
 {
   return false;
 }
+
+/* Return true if successive divisions can be optimized.
+   Defer to GIMPLE opts.  */
+
+static inline bool
+optimize_successive_divisions_p (tree, tree)
+{
+  return false;
+}
--- gcc/gimple-match-head.c.jj	2018-10-10 10:50:55.812109572 +0200
+++ gcc/gimple-match-head.c	2018-12-05 16:39:50.358122589 +0100
@@ -1163,3 +1163,27 @@ optimize_pow_to_exp (tree arg0, tree arg
     return false;
   return true;
 }
+
+/* Return true if a division INNER_DIV / DIVISOR where INNER_DIV
+   is another division can be optimized.  Don't optimize if INNER_DIV
+   is used in a TRUNC_MOD_EXPR with DIVISOR as second operand.  */
+
+static bool
+optimize_successive_divisions_p (tree divisor, tree inner_div)
+{
+  if (!gimple_in_ssa_p (cfun) || TREE_CODE (inner_div) != SSA_NAME)
+    return false;
+
+  imm_use_iterator imm_iter;
+  use_operand_p use_p;
+  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, inner_div)
+    {
+      gimple *use_stmt = USE_STMT (use_p);
+      if (!is_gimple_assign (use_stmt)
+	  || gimple_assign_rhs_code (use_stmt) != TRUNC_MOD_EXPR
+	  || !operand_equal_p (gimple_assign_rhs2 (use_stmt), divisor, 0))
+	continue;
+      return false;
+    }
+  return true;
+}
--- gcc/match.pd.jj	2018-11-30 21:36:22.273762329 +0100
+++ gcc/match.pd	2018-12-05 16:47:27.798596450 +0100
@@ -312,17 +312,19 @@ (define_operator_list COND_TERNARY
    and floor_div is trickier and combining round_div even more so.  */
 (for div (trunc_div exact_div)
  (simplify
-  (div (div @0 INTEGER_CST@1) INTEGER_CST@2)
+  (div (div@3 @0 INTEGER_CST@1) INTEGER_CST@2)
   (with {
     wi::overflow_type overflow;
     wide_int mul = wi::mul (wi::to_wide (@1), wi::to_wide (@2),
 			    TYPE_SIGN (type), &overflow);
    }
-   (if (!overflow)
-    (div @0 { wide_int_to_tree (type, mul); })
-    (if (TYPE_UNSIGNED (type)
-	 || mul != wi::min_value (TYPE_PRECISION (type), SIGNED))
-     { build_zero_cst (type); })))))
+   (if (div == EXACT_DIV_EXPR
+	|| optimize_successive_divisions_p (@2, @3))
+    (if (!overflow)
+     (div @0 { wide_int_to_tree (type, mul); })
+     (if (TYPE_UNSIGNED (type)
+	  || mul != wi::min_value (TYPE_PRECISION (type), SIGNED))
+      { build_zero_cst (type); }))))))
 
 /* Combine successive multiplications.  Similar to above, but handling
    overflow is different.  */
--- gcc/testsuite/gcc.dg/tree-ssa/pr85726-1.c.jj	2018-12-05 16:55:24.852680964 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/pr85726-1.c	2018-12-05 16:50:14.489853926 +0100
@@ -0,0 +1,19 @@
+/* PR tree-optimization/85726 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump " / 16;" "optimized" } } */
+/* { dg-final { scan-tree-dump " / 3;" "optimized" } } */
+/* { dg-final { scan-tree-dump " % 3;" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " / 48;" "optimized" } } */
+
+int ww, vv;
+
+int
+foo (int y)
+{
+  int z = y / 16;
+  int w = z / 3;
+  int v = z % 3;
+  ww = w;
+  return v;
+}
--- gcc/testsuite/gcc.dg/tree-ssa/pr85726-2.c.jj	2018-12-05 16:55:27.620634707 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/pr85726-2.c	2018-12-05 16:51:25.636678886 +0100
@@ -0,0 +1,15 @@
+/* PR tree-optimization/85726 */
+/* { dg-do compile { target int32 } } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump " / 3145728;" "optimized" } } */
+/* { dg-final { scan-tree-dump "y = 0;" "optimized" } } */
+
+int x, y;
+
+void
+foo (int n)
+{
+  int c = 3 << 20;
+  x = n / c;
+  y = x / c;
+}
--- gcc/testsuite/gcc.dg/tree-ssa/pr85726-3.c.jj	2018-12-05 16:55:30.948579089 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/pr85726-3.c	2018-12-05 16:52:57.350146115 +0100
@@ -0,0 +1,15 @@
+/* PR tree-optimization/85726 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-times " / 3;" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times " / 15;" 1 "optimized" } } */
+
+int x, y, z;
+
+void
+foo (int n)
+{
+  x = n / 3;
+  y = x / 5;
+  z = n / 15;
+}
--- gcc/testsuite/gcc.dg/tree-ssa/pr85726-4.c.jj	2018-12-05 16:55:34.588518255 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/pr85726-4.c	2018-12-05 16:55:04.789016282 +0100
@@ -0,0 +1,15 @@
+/* PR tree-optimization/85726 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-times " / 4;" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times " / 16;" 1 "optimized" } } */
+
+int x, y, z;
+
+void
+foo (int n)
+{
+  x = n / 4;
+  y = x / 4;
+  z = y * 16 | 15;
+}

	Jakub