From: Alexandre Oliva
To: gcc-patches@gcc.gnu.org
Subject: [PR81611] improve auto-inc
Date: Wed, 24 Jan 2018 05:28:00 -0000

These two patches fix PR81611.
The first patch improves forwprop so that we avoid introducing conflicting
SSA names by forward-propagating the iv increment, which may cause both the
incremented and the original value to be live at the same time, even when
the iv is copied between the PHI node and the increment.  We already
handled the case in which there aren't any such copies.

Alas, this is not enough to fix the problem on avr, even though it fixes
it on e.g. ppc.  The reason is that avr rejects var+offset addresses,
which prevents the memory access in a post-increment code sequence from
being adjusted to an address form that auto-inc-dec can recognize.

The second patch adjusts auto-inc-dec to recognize a code sequence in
which the original, unincremented pseudo is used in an address after it
is incremented into another pseudo, and turns that into a post-increment
address, leaving the copy for subsequent passes to eliminate.

Regstrapped on x86_64-linux-gnu, i686-linux-gnu, ppc64-linux-gnu and
aarch64-linux-gnu.  Ok to install?

I'd appreciate suggestions on how to turn the submitted testcase into a
regression test; I suppose an avr-specific test that requires the
auto-inc transformation is a possibility, but that feels a bit too
limited, doesn't it?  Thoughts?

Thanks in advance,


[PR81611] accept copies in simple_iv_increment_p

If there are copies between the GIMPLE_PHI at the head of the loop body
and the increment that reaches it (presumably through a back edge),
still regard the increment as a simple_iv_increment, so that we won't
consider the value in the back edge eligible for forwprop.  Forwpropping
it would risk making the PHI node and the incremented value live within
the loop, conflicting with each other, and would require the PHI node to
be preserved for propagated uses after the loop.

for gcc/ChangeLog

	PR tree-optimization/81611
	* tree-ssa-dom.c (simple_iv_increment_p): Skip intervening
	copies.
---
 gcc/tree-ssa-dom.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index 2b371667253a..3c0ff9458342 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -1276,8 +1276,11 @@ record_equality (tree x, tree y, class const_and_copies *const_and_copies)
 /* Returns true when STMT is a simple iv increment.  It detects the
    following situation:
 
-   i_1 = phi (..., i_2)
-   i_2 = i_1 +/- ...  */
+   i_1 = phi (..., i_k)
+   [...]
+   i_j = i_{j-1}  for each j : 2 <= j <= k-1
+   [...]
+   i_k = i_{k-1} +/- ...  */
 
 bool
 simple_iv_increment_p (gimple *stmt)
@@ -1305,8 +1308,18 @@ simple_iv_increment_p (gimple *stmt)
     return false;
 
   phi = SSA_NAME_DEF_STMT (preinc);
-  if (gimple_code (phi) != GIMPLE_PHI)
-    return false;
+  while (gimple_code (phi) != GIMPLE_PHI)
+    {
+      /* Follow trivial copies, but not the DEF used in a back edge,
+	 so that we don't prevent coalescing.  */
+      if (gimple_code (phi) != GIMPLE_ASSIGN
+	  || gimple_assign_lhs (phi) != preinc
+	  || !gimple_assign_ssa_name_copy_p (phi))
+	return false;
+
+      preinc = gimple_assign_rhs1 (phi);
+      phi = SSA_NAME_DEF_STMT (preinc);
+    }
 
   for (i = 0; i < gimple_phi_num_args (phi); i++)
     if (gimple_phi_arg_def (phi, i) == lhs)


[PR81611] turn inc-and-use-of-dead-orig into auto-inc

When the addressing modes available on the machine don't allow offsets
in addresses, odds are that post-increments will be represented in
trees and RTL as:

  y <= x + 1
  ... *(x) ...
  x <= y

so deal with this form so as to create auto-inc addresses that we'd
otherwise miss.

for gcc/ChangeLog

	PR rtl-optimization/81611
	* auto-inc-dec.c (attempt_change): Move dead note from mem_insn
	if it's the next use of regno.
	(find_address): Accept an address use of the reg holding the
	non-incremented value.
	(merge_in_block): Attempt to use a mem insn that is the next use
	of the original regno.
---
 gcc/auto-inc-dec.c | 46 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 44 insertions(+), 2 deletions(-)

diff --git a/gcc/auto-inc-dec.c b/gcc/auto-inc-dec.c
index d02fa9d081c7..4ffbcf56a456 100644
--- a/gcc/auto-inc-dec.c
+++ b/gcc/auto-inc-dec.c
@@ -508,7 +508,11 @@ attempt_change (rtx new_addr, rtx inc_reg)
	 before the memory reference.  */
       gcc_assert (mov_insn);
       emit_insn_before (mov_insn, inc_insn.insn);
-      move_dead_notes (mov_insn, inc_insn.insn, inc_insn.reg0);
+      regno = REGNO (inc_insn.reg0);
+      if (reg_next_use[regno] == mem_insn.insn)
+	move_dead_notes (mov_insn, mem_insn.insn, inc_insn.reg0);
+      else
+	move_dead_notes (mov_insn, inc_insn.insn, inc_insn.reg0);
 
       regno = REGNO (inc_insn.reg_res);
       reg_next_def[regno] = mov_insn;
@@ -842,7 +846,7 @@ find_address (rtx *address_of_x)
 
   if (code == MEM && rtx_equal_p (XEXP (x, 0), inc_insn.reg_res))
     {
-      /* Match with *reg0.  */
+      /* Match with *reg_res.  */
       mem_insn.mem_loc = address_of_x;
       mem_insn.reg0 = inc_insn.reg_res;
       mem_insn.reg1_is_const = true;
@@ -850,6 +854,17 @@
       mem_insn.reg1 = GEN_INT (0);
       return -1;
     }
+  if (code == MEM && inc_insn.reg1_is_const && inc_insn.reg0
+      && rtx_equal_p (XEXP (x, 0), inc_insn.reg0))
+    {
+      /* Match with *reg0 AKA *(reg_res - reg1_val).  */
+      mem_insn.mem_loc = address_of_x;
+      mem_insn.reg0 = inc_insn.reg_res;
+      mem_insn.reg1_is_const = true;
+      mem_insn.reg1_val = -inc_insn.reg1_val;
+      mem_insn.reg1 = GEN_INT (mem_insn.reg1_val);
+      return -1;
+    }
   if (code == MEM && GET_CODE (XEXP (x, 0)) == PLUS
       && rtx_equal_p (XEXP (XEXP (x, 0), 0), inc_insn.reg_res))
     {
@@ -1370,6 +1385,33 @@ merge_in_block (int max_reg, basic_block bb)
		      insn_is_add_or_inc = false;
		    }
		}
+
+	      if (ok && insn_is_add_or_inc
+		  && inc_insn.reg0 != inc_insn.reg_res)
+		{
+		  regno = REGNO (inc_insn.reg0);
+		  int luid = DF_INSN_LUID (mem_insn.insn);
+		  mem_insn.insn = get_next_ref (regno, bb, reg_next_use);
+
+		  /* Try a mem use of reg0 before that of reg_res
+		     too.  If there's no further use of reg_res,
+		     there's no point in trying an auto-inc, and
+		     if the use of reg0 is after that of reg_res,
+		     it will be too late for the auto-inc to
+		     compute reg_res's correct value.  */
+		  if (mem_insn.insn
+		      && luid > DF_INSN_LUID (mem_insn.insn)
+		      && find_address (&PATTERN (mem_insn.insn)) == -1)
+		    {
+		      if (dump_file)
+			dump_mem_insn (dump_file);
+		      if (try_merge ())
+			{
+			  success_in_block++;
+			  insn_is_add_or_inc = false;
+			}
+		    }
+		}
 	    }
 	}
     }

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil   GNU Toolchain Engineer