From: Jiong Wang <jiong.wang@arm.com>
To: Jeff Law <law@redhat.com>
Cc: Steven Bosscher <stevenb.gcc@gmail.com>,
"gcc-patches\@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
Kenneth Zadeck <zadeck@naturalbridge.com>
Subject: Re: [PATCH] PR 62173, re-shuffle insns for RTL loop invariant hoisting
Date: Tue, 21 Apr 2015 14:43:00 -0000 [thread overview]
Message-ID: <n99vbgp35kl.fsf@arm.com> (raw)
In-Reply-To: <CAAfDdZ1G_0A0k0RRF2XO_5W6xN13BoA_18+s-Z68P1YTS32mMA@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 3634 bytes --]
Jiong Wang writes:
> 2015-04-14 18:24 GMT+01:00 Jeff Law <law@redhat.com>:
>> On 04/14/2015 10:48 AM, Steven Bosscher wrote:
>>>>
>>>> So I think this stage2/3 binary difference is acceptable?
>>>
>>>
>>> No, they should be identical. If there's a difference, then there's a
>>> bug - which, it seems, you've already found, too.
>>
>> RIght. And so the natural question is how to fix.
>>
>> At first glance it would seem like having this new code ignore dependencies
>> rising from debug insns would work.
>>
>> Which then begs the question, what happens to the debug insn -- it's
>> certainly not going to be correct anymore if the transformation is made.
>
> Exactly.
>
> The debug_insn 2776 in my example is to record the base address of a
> local array. the new code is doing correctly here by not shuffling the
> operands of insn 2556 and 2557 as there is additional reference of
> reg:1473 from debug insn, although the code will still execute correctly
> if we do the transformation.
>
> my understanding to fix this:
>
> * delete the out-of-date mismatch debug_insn? as there is no guarantee
> to generate accurate debug info under -O2.
>
> IMO, this debug_insn may affect "DW_AT_location" field for variable
> descrption of "classes" in .debug_info section, but it's omitted in
> the final output already.
>
> <3><38a4d>: Abbrev Number: 137 (DW_TAG_variable)
> <38a4f> DW_AT_name : (indirect string, offset: 0x18db): classes
> <38a53> DW_AT_decl_file : 1
> <38a54> DW_AT_decl_line : 548
> <38a56> DW_AT_type : <0x38cb4>
>
> * update the debug_insn? if the following change is OK with dwarf standard
>
> from
>
> insn0: reg0 = fp + reg1
> debug_insn: var_loc = reg0 + const_off
> insn1: reg2 = reg0 + const_off
>
> to
>
> insn0: reg0 = fp + const_off
> debug_insn: var_loc = reg0 + reg1
> insn1: reg2 = reg0 + reg1
>
> Thanks,
>
And attachment is the new patch which will update debug_insn as
described in the second solution above.
Now the stage2/3 binary differences on AArch64 gone away. Bootstrap OK.
On AArch64, this patch give 600+ new rtl loop invariants found across
spec2k6 float. +4.5% perf improvement on 436.cactusADM because four new
invariants found in the critical function "regex_compile".
The similar improvements may be achieved on other RISC backends like
powerpc/mips I guess.
One thing to mention, for AArch64, one minor glitch in
aarch64_legitimize_address needs to be fixed to let this patch take
effect, I will send out that patch later as it's a seperate issue.
Powerpc/Mips don't have this glitch in LEGITIMIZE_ADDRESS hook, so
should be OK, and I verified the base address of local array in the
testcase given by Seb on pr62173 do hoisted on ppc64 now. I think
pr62173 is fixed on those 64bit arch by this patch.
Thoughts?
Thanks.
2015-04-21 Jiong Wang <jiong.wang@arm.com>
gcc/
* loop-invariant.c (find_defs): Enable DF_DU_CHAIN build.
(vfp_const_iv): New hash table.
(expensive_addr_check_p): New boolean.
(init_inv_motion_data): Initialize new variables.>
(free_inv_motion_data): Release hash table.
(create_new_invariant): Set cheap_address to false for iv in
vfp_const_iv table.
(find_invariant_insn): Skip dependencies check for iv in vfp_const_iv
table.
(use_for_single_du): New function.
(reshuffle_insn_with_vfp): Likewise.
(find_invariants_bb): Call reshuffle_insn_with_vfp.
gcc/testsuite/
* gcc.dg/pr62173.c: New testcase.
--
Regards,
Jiong
[-- Attachment #2: fix-dwarf.patch --]
[-- Type: text/x-diff, Size: 7390 bytes --]
diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index f79b497..f70dfb0 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -203,6 +203,8 @@ typedef struct invariant *invariant_p;
/* The invariants. */
static vec<invariant_p> invariants;
+static hash_table <pointer_hash <rtx_insn> > *vfp_const_iv;
+static bool need_expensive_addr_check_p;
/* Check the size of the invariant table and realloc if necessary. */
@@ -695,7 +697,7 @@ find_defs (struct loop *loop)
df_remove_problem (df_chain);
df_process_deferred_rescans ();
- df_chain_add_problem (DF_UD_CHAIN);
+ df_chain_add_problem (DF_UD_CHAIN + DF_DU_CHAIN);
df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
df_analyze_loop (loop);
check_invariant_table_size ();
@@ -742,6 +744,9 @@ create_new_invariant (struct def *def, rtx_insn *insn, bitmap depends_on,
See http://gcc.gnu.org/ml/gcc-patches/2009-10/msg01210.html . */
inv->cheap_address = address_cost (SET_SRC (set), word_mode,
ADDR_SPACE_GENERIC, speed) < 3;
+
+ if (need_expensive_addr_check_p && vfp_const_iv->find (insn))
+ inv->cheap_address = false;
}
else
{
@@ -952,7 +957,8 @@ find_invariant_insn (rtx_insn *insn, bool always_reached, bool always_executed)
return;
depends_on = BITMAP_ALLOC (NULL);
- if (!check_dependencies (insn, depends_on))
+ if (!vfp_const_iv->find (insn)
+ && !check_dependencies (insn, depends_on))
{
BITMAP_FREE (depends_on);
return;
@@ -1007,6 +1013,180 @@ find_invariants_insn (rtx_insn *insn, bool always_reached, bool always_executed)
record_uses (insn);
}
+/* Find the single use of def of insn. At the same time,
+ make sure def of insn is the single def which reach the use.
+ NOTE: debug_insn can affect DF, var_location debug insn may reference
+ the same rtl expr as variable location, such debug_insn should not
+ affect the insn shuffling while we do need to update the loc of the
+ debug_insn after insn shuffling. So here we will record such debug_insn. */
+
+static rtx_insn *
+use_for_single_du (rtx_insn *insn, rtx_insn **debug_insn, rtx *debug_expr_loc)
+{
+ struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
+ df_ref def;
+
+ FOR_EACH_INSN_INFO_DEF (def, insn_info)
+ {
+ struct df_link *chain = DF_REF_CHAIN (def);
+
+ if (!chain)
+ return NULL;
+
+ /* For multi use, only allow the second use be debug_insn. */
+ if (chain->next
+ && (chain->next->next
+ || NONDEBUG_INSN_P (DF_REF_INSN (chain->next->ref))))
+ return NULL;
+
+ if (chain->next)
+ {
+ rtx_insn *insn = DF_REF_INSN (chain->next->ref);
+ rtx debug_pat = PATTERN (insn);
+ if (GET_CODE (debug_pat) != VAR_LOCATION)
+ return NULL;
+ else
+ {
+ *debug_insn = insn;
+ *debug_expr_loc = PAT_VAR_LOCATION_LOC (debug_pat);
+ }
+ }
+
+ df_ref use = chain->ref;
+ chain = DF_REF_CHAIN (use);
+
+ if (!chain
+ || chain->next)
+ return NULL;
+
+ return DF_REF_INSN (use);
+ }
+
+ return NULL;
+}
+
+/* This function also try to transform
+
+ RA <- fixed_reg + RC
+ RD <- MEM (RA + const_offset)
+
+ into:
+
+ RA <- fixed_reg + const_offset
+ RD <- MEM (RA + RC) <- pos0
+
+ If use of RA in the second insn is the single use, and the define of
+ RA in the first insn is the single def reach the second insn.
+
+ After this change, the first instruction is loop invariant. */
+
+static void
+reshuffle_insn_with_vfp (rtx_insn *insn)
+{
+ rtx set = single_set (insn);
+
+ if (!set
+ || GET_CODE (SET_SRC (set)) != PLUS)
+ return;
+
+ rtx dest = SET_DEST (set);
+ rtx op0 = XEXP (SET_SRC (set), 0);
+ rtx op1 = XEXP (SET_SRC (set), 1);
+ rtx non_vfp_op = op1;
+
+ if (op1 == frame_pointer_rtx
+ || op1 == stack_pointer_rtx
+ || op1 == virtual_stack_vars_rtx)
+ {
+ non_vfp_op = op0;
+ std::swap (op0, op1);
+ }
+
+ if (GET_CODE (op1) == SIGN_EXTEND
+ || GET_CODE (op1) == ZERO_EXTEND)
+ op1 = XEXP (op1, 0);
+
+ if (!(REG_P (dest) && REG_P (op0) && REG_P (op1)
+ && (op0 == frame_pointer_rtx
+ || op0 == stack_pointer_rtx
+ || op0 == virtual_stack_vars_rtx)))
+ return;
+
+ rtx_insn *use_insn, *debug_insn = NULL;
+ rtx debug_expr_loc;
+
+ if (!(use_insn = use_for_single_du (insn, &debug_insn, &debug_expr_loc)))
+ return;
+
+ rtx u_set = single_set (use_insn);
+
+ if (!(u_set && (REG_P (SET_DEST (u_set)) || MEM_P (SET_DEST (u_set)))))
+ return;
+
+ rtx mem_addr;
+
+ if (REG_P (SET_DEST (u_set)))
+ mem_addr = SET_SRC (u_set);
+ else
+ mem_addr = SET_DEST (u_set);
+
+ if (debug_insn && !rtx_equal_p (debug_expr_loc, mem_addr))
+ return;
+
+ if (GET_CODE (mem_addr) == ZERO_EXTEND
+ || GET_CODE (mem_addr) == SIGN_EXTEND
+ || GET_CODE (mem_addr) == TRUNCATE)
+ mem_addr = XEXP (mem_addr, 0);
+
+ if (!MEM_P (mem_addr)
+ || GET_CODE (XEXP (mem_addr, 0)) != PLUS)
+ return;
+
+ rtx *mem_plus_loc = &XEXP (mem_addr, 0);
+ rtx u_op0 = XEXP (XEXP (mem_addr, 0), 0);
+ rtx u_op1 = XEXP (XEXP (mem_addr, 0), 1);
+
+ if (!(REG_P (u_op0) && CONST_INT_P (u_op1)
+ && REGNO (dest) == REGNO (u_op0)))
+ return;
+
+ rtx new_src = plus_constant (GET_MODE (op0), op0, INTVAL (u_op1));
+ validate_change (insn, &SET_SRC (set), new_src, 1);
+ new_src = gen_rtx_PLUS (GET_MODE (u_op0), u_op0, non_vfp_op);
+ validate_change (use_insn, mem_plus_loc, new_src, 1);
+ if (debug_insn)
+ validate_change (debug_insn, &XEXP(debug_expr_loc, 0), copy_rtx (new_src),
+ 1);
+
+ if (apply_change_group ())
+ {
+ rtx_insn **slot = vfp_const_iv->find_slot (insn, INSERT);
+
+ if (*slot)
+ gcc_unreachable ();
+ else
+ *slot = insn;
+
+ need_expensive_addr_check_p = true;
+
+ /* We should update REG_DEAD info also. */
+ rtx note;
+ if ((note = find_reg_note (insn, REG_DEAD, non_vfp_op)))
+ {
+ remove_note (insn, note);
+ add_shallow_copy_of_reg_note (use_insn, note);
+ }
+
+ if (dump_file)
+ fprintf (dump_file,
+ "\nRe-associate insn %d and %d for later"
+ " RTL loop invariant hoisting.\n",
+ INSN_UID (insn), INSN_UID (use_insn));
+ }
+
+ return;
+}
+
/* Finds invariants in basic block BB. ALWAYS_REACHED is true if the
basic block is always executed. ALWAYS_EXECUTED is true if the basic
block is always executed, unless the program ends due to a function
@@ -1022,6 +1197,8 @@ find_invariants_bb (basic_block bb, bool always_reached, bool always_executed)
if (!NONDEBUG_INSN_P (insn))
continue;
+ reshuffle_insn_with_vfp (insn);
+
find_invariants_insn (insn, always_reached, always_executed);
if (always_reached
@@ -1647,6 +1824,9 @@ move_invariants (struct loop *loop)
static void
init_inv_motion_data (void)
{
+ need_expensive_addr_check_p = false;
+ vfp_const_iv = new hash_table <pointer_hash <rtx_insn> > (4);
+
actual_stamp = 1;
invariants.create (100);
@@ -1682,6 +1862,8 @@ free_inv_motion_data (void)
free (inv);
}
invariants.release ();
+
+ delete vfp_const_iv;
}
/* Move the invariants out of the LOOP. */
next prev parent reply other threads:[~2015-04-21 14:43 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-12-04 11:00 Jiong Wang
2014-12-04 11:07 ` Richard Biener
2014-12-04 11:07 ` Richard Biener
2014-12-04 19:32 ` Jiong Wang
2014-12-15 15:29 ` Jiong Wang
2014-12-15 15:36 ` Jiong Wang
2014-12-17 16:19 ` Richard Biener
2014-12-18 17:08 ` Jiong Wang
2014-12-18 21:16 ` Jiong Wang
2014-12-18 22:19 ` Segher Boessenkool
2014-12-19 4:06 ` Bin.Cheng
2014-12-19 10:29 ` Jiong Wang
2014-12-19 11:45 ` Richard Biener
2014-12-19 15:31 ` Kenneth Zadeck
2015-02-11 11:20 ` Jiong Wang
2015-02-11 14:22 ` Kenneth Zadeck
2015-02-11 18:18 ` Jiong Wang
2015-04-14 15:06 ` Jiong Wang
2015-04-14 16:49 ` Steven Bosscher
2015-04-14 17:24 ` Jeff Law
2015-04-14 21:49 ` Jiong Wang
2015-04-21 14:43 ` Jiong Wang [this message]
2015-04-24 1:55 ` Jeff Law
2015-04-24 17:05 ` Jiong Wang
2015-05-14 20:04 ` Jeff Law
2015-05-14 22:07 ` Jiong Wang
2015-05-14 22:24 ` Jeff Law
2015-05-21 21:51 ` Jiong Wang
2015-05-27 16:11 ` Jeff Law
2015-09-02 13:49 ` Jiong Wang
2015-09-02 20:52 ` Jeff Law
2015-04-28 12:16 ` Jiong Wang
2015-04-28 14:00 ` Matthew Fortune
2015-04-28 14:31 ` Jiong Wang
2015-04-19 16:20 ` Jiong Wang
2014-12-19 12:09 ` Eric Botcazou
2014-12-19 15:21 ` Segher Boessenkool
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=n99vbgp35kl.fsf@arm.com \
--to=jiong.wang@arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=law@redhat.com \
--cc=stevenb.gcc@gmail.com \
--cc=zadeck@naturalbridge.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).