From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 40533 invoked by alias); 6 Dec 2018 09:01:14 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 40144 invoked by uid 89); 6 Dec 2018 09:00:44 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-11.9 required=5.0 tests=BAYES_00,GIT_PATCH_2,GIT_PATCH_3,SPF_PASS,TIME_LIMIT_EXCEEDED autolearn=unavailable version=3.3.2 spammy= X-HELO: mx1.suse.de Received: from mx2.suse.de (HELO mx1.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 06 Dec 2018 09:00:23 +0000 Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 0DD71AFA1; Thu, 6 Dec 2018 09:00:09 +0000 (UTC) Date: Thu, 06 Dec 2018 09:01:00 -0000 From: Richard Biener To: Jakub Jelinek cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] Don't optimize successive divs if there is also a mod with the same last arg (PR tree-optimization/85726) In-Reply-To: <20181206063046.GM12380@tucnak> Message-ID: References: <20181206063046.GM12380@tucnak> User-Agent: Alpine 2.20 (LSU 67 2015-01-07) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-SW-Source: 2018-12/txt/msg00346.txt.bz2 On Thu, 6 Dec 2018, Jakub Jelinek wrote: > Hi! > > This is my proposal for fixing this PR, just a heuristics when optimizing > successive divides might not be a good idea (first testcase), and includes > Marc's testcases which showed cases where optimizing successive divides is a > good idea even if the inner divide is not a single use. > > Unfortunately match.pd doesn't allow to capture the outermost expression, so > I can't narrow it even more by checking if the outer divide is in the same > bb as the modulo. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? + if (!gimple_in_ssa_p (cfun) || TREE_CODE (inner_div) != SSA_NAME) + return false; I think the latter is always true. The code handles the case of missing immediate uses by performing the folding which is probably OK but a general "issue" of using stmt operands or immediate uses in GIMPLE folding patterns. I think the patch is OK but I also hope we do not add too many of these heuristics... Richard. > 2018-12-06 Jakub Jelinek > > PR tree-optimization/85726 > * generic-match-head.c (optimize_successive_divisions_p): New function. > * gimple-match-head.c (optimize_successive_divisions_p): Likewise. > * match.pd: Don't combine successive divisions if they aren't exact > and optimize_successive_divisions_p is false. > > * gcc.dg/tree-ssa/pr85726-1.c: New test. > * gcc.dg/tree-ssa/pr85726-2.c: New test. > * gcc.dg/tree-ssa/pr85726-3.c: New test. > * gcc.dg/tree-ssa/pr85726-4.c: New test. > > --- gcc/generic-match-head.c.jj 2018-03-28 21:14:28.124743854 +0200 > +++ gcc/generic-match-head.c 2018-12-05 16:07:43.710801347 +0100 > @@ -77,3 +77,12 @@ canonicalize_math_after_vectorization_p > { > return false; > } > + > +/* Return true if successive divisions can be optimized. > + Defer to GIMPLE opts. */ > + > +static inline bool > +optimize_successive_divisions_p (tree, tree) > +{ > + return false; > +} > --- gcc/gimple-match-head.c.jj 2018-10-10 10:50:55.812109572 +0200 > +++ gcc/gimple-match-head.c 2018-12-05 16:39:50.358122589 +0100 > @@ -1163,3 +1163,27 @@ optimize_pow_to_exp (tree arg0, tree arg > return false; > return true; > } > + > +/* Return true if a division INNER_DIV / DIVISOR where INNER_DIV > + is another division can be optimized. Don't optimize if INNER_DIV > + is used in a TRUNC_MOD_EXPR with DIVISOR as second operand. */ > + > +static bool > +optimize_successive_divisions_p (tree divisor, tree inner_div) > +{ > + if (!gimple_in_ssa_p (cfun) || TREE_CODE (inner_div) != SSA_NAME) > + return false; > + > + imm_use_iterator imm_iter; > + use_operand_p use_p; > + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, inner_div) > + { > + gimple *use_stmt = USE_STMT (use_p); > + if (!is_gimple_assign (use_stmt) > + || gimple_assign_rhs_code (use_stmt) != TRUNC_MOD_EXPR > + || !operand_equal_p (gimple_assign_rhs2 (use_stmt), divisor, 0)) > + continue; > + return false; > + } > + return true; > +} > --- gcc/match.pd.jj 2018-11-30 21:36:22.273762329 +0100 > +++ gcc/match.pd 2018-12-05 16:47:27.798596450 +0100 > @@ -312,17 +312,19 @@ (define_operator_list COND_TERNARY > and floor_div is trickier and combining round_div even more so. */ > (for div (trunc_div exact_div) > (simplify > - (div (div @0 INTEGER_CST@1) INTEGER_CST@2) > + (div (div@3 @0 INTEGER_CST@1) INTEGER_CST@2) > (with { > wi::overflow_type overflow; > wide_int mul = wi::mul (wi::to_wide (@1), wi::to_wide (@2), > TYPE_SIGN (type), &overflow); > } > - (if (!overflow) > - (div @0 { wide_int_to_tree (type, mul); }) > - (if (TYPE_UNSIGNED (type) > - || mul != wi::min_value (TYPE_PRECISION (type), SIGNED)) > - { build_zero_cst (type); }))))) > + (if (div == EXACT_DIV_EXPR > + || optimize_successive_divisions_p (@2, @3)) > + (if (!overflow) > + (div @0 { wide_int_to_tree (type, mul); }) > + (if (TYPE_UNSIGNED (type) > + || mul != wi::min_value (TYPE_PRECISION (type), SIGNED)) > + { build_zero_cst (type); })))))) > > /* Combine successive multiplications. Similar to above, but handling > overflow is different. */ > --- gcc/testsuite/gcc.dg/tree-ssa/pr85726-1.c.jj 2018-12-05 16:55:24.852680964 +0100 > +++ gcc/testsuite/gcc.dg/tree-ssa/pr85726-1.c 2018-12-05 16:50:14.489853926 +0100 > @@ -0,0 +1,19 @@ > +/* PR tree-optimization/85726 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > +/* { dg-final { scan-tree-dump " / 16;" "optimized" } } */ > +/* { dg-final { scan-tree-dump " / 3;" "optimized" } } */ > +/* { dg-final { scan-tree-dump " % 3;" "optimized" } } */ > +/* { dg-final { scan-tree-dump-not " / 48;" "optimized" } } */ > + > +int ww, vv; > + > +int > +foo (int y) > +{ > + int z = y / 16; > + int w = z / 3; > + int v = z % 3; > + ww = w; > + return v; > +} > --- gcc/testsuite/gcc.dg/tree-ssa/pr85726-2.c.jj 2018-12-05 16:55:27.620634707 +0100 > +++ gcc/testsuite/gcc.dg/tree-ssa/pr85726-2.c 2018-12-05 16:51:25.636678886 +0100 > @@ -0,0 +1,15 @@ > +/* PR tree-optimization/85726 */ > +/* { dg-do compile { target int32 } } */ > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > +/* { dg-final { scan-tree-dump " / 3145728;" "optimized" } } */ > +/* { dg-final { scan-tree-dump "y = 0;" "optimized" } } */ > + > +int x, y; > + > +void > +foo (int n) > +{ > + int c = 3 << 20; > + x = n / c; > + y = x / c; > +} > --- gcc/testsuite/gcc.dg/tree-ssa/pr85726-3.c.jj 2018-12-05 16:55:30.948579089 +0100 > +++ gcc/testsuite/gcc.dg/tree-ssa/pr85726-3.c 2018-12-05 16:52:57.350146115 +0100 > @@ -0,0 +1,15 @@ > +/* PR tree-optimization/85726 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > +/* { dg-final { scan-tree-dump-times " / 3;" 1 "optimized" } } */ > +/* { dg-final { scan-tree-dump-times " / 15;" 1 "optimized" } } */ > + > +int x, y, z; > + > +void > +foo (int n) > +{ > + x = n / 3; > + y = x / 5; > + z = n / 15; > +} > --- gcc/testsuite/gcc.dg/tree-ssa/pr85726-4.c.jj 2018-12-05 16:55:34.588518255 +0100 > +++ gcc/testsuite/gcc.dg/tree-ssa/pr85726-4.c 2018-12-05 16:55:04.789016282 +0100 > @@ -0,0 +1,15 @@ > +/* PR tree-optimization/85726 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > +/* { dg-final { scan-tree-dump-times " / 4;" 1 "optimized" } } */ > +/* { dg-final { scan-tree-dump-times " / 16;" 1 "optimized" } } */ > + > +int x, y, z; > + > +void > +foo (int n) > +{ > + x = n / 4; > + y = x / 4; > + z = y * 16 | 15; > +} > > Jakub > > -- Richard Biener SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)