public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jiong Wang <jiong.wang@arm.com>
To: Jeff Law <law@redhat.com>
Cc: Steven Bosscher <stevenb.gcc@gmail.com>,
	"gcc-patches\@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	Kenneth Zadeck <zadeck@naturalbridge.com>
Subject: Re: [PATCH] PR 62173, re-shuffle insns for RTL loop invariant hoisting
Date: Tue, 21 Apr 2015 14:43:00 -0000	[thread overview]
Message-ID: <n99vbgp35kl.fsf@arm.com> (raw)
In-Reply-To: <CAAfDdZ1G_0A0k0RRF2XO_5W6xN13BoA_18+s-Z68P1YTS32mMA@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3634 bytes --]


Jiong Wang writes:

> 2015-04-14 18:24 GMT+01:00 Jeff Law <law@redhat.com>:
>> On 04/14/2015 10:48 AM, Steven Bosscher wrote:
>>>>
>>>> So I think this stage2/3 binary difference is acceptable?
>>>
>>>
>>> No, they should be identical. If there's a difference, then there's a
>>> bug - which, it seems, you've already found, too.
>>
>> RIght.  And so the natural question is how to fix.
>>
>> At first glance it would seem like having this new code ignore dependencies
>> rising from debug insns would work.
>>
>> Which then begs the question, what happens to the debug insn -- it's
>> certainly not going to be correct anymore if the transformation is made.
>
> Exactly.
>
> The debug_insn 2776 in my example is to record the base address of a
> local array. the new code is doing correctly here by not shuffling the
> operands of insn 2556 and 2557 as there is additional reference of
> reg:1473 from debug insn, although the code will still execute correctly
> if we do the transformation.
>
> my understanding to fix this:
>
>   * delete the out-of-date mismatch debug_insn? as there is no guarantee
>     to generate accurate debug info under -O2.
>
>     IMO, this debug_insn may affect "DW_AT_location" field for variable
>     descrption of "classes" in .debug_info section, but it's omitted in
>     the final output already.
>
>     <3><38a4d>: Abbrev Number: 137 (DW_TAG_variable)
>     <38a4f>   DW_AT_name : (indirect string, offset: 0x18db): classes
>     <38a53>   DW_AT_decl_file   : 1
>     <38a54>   DW_AT_decl_line   : 548
>     <38a56>   DW_AT_type        : <0x38cb4>
>
>   * update the debug_insn? if the following change is OK with dwarf standard
>
>    from
>
>      insn0: reg0 = fp + reg1
>      debug_insn: var_loc = reg0 + const_off
>      insn1: reg2 = reg0 + const_off
>
>    to
>
>      insn0: reg0 = fp + const_off
>      debug_insn: var_loc = reg0 + reg1
>      insn1: reg2 = reg0 + reg1
>
> Thanks,
>

And attachment is the new patch which will update debug_insn as
described in the second solution above.

Now the stage2/3 binary differences on AArch64 gone away. Bootstrap OK.

On AArch64, this patch give 600+ new rtl loop invariants found across
spec2k6 float. +4.5% perf improvement on 436.cactusADM because four new
invariants found in the critical function "regex_compile".

The similar improvements may be achieved on other RISC backends like
powerpc/mips I guess.

One thing to mention, for AArch64, one minor glitch in
aarch64_legitimize_address needs to be fixed to let this patch take
effect, I will send out that patch later as it's a seperate issue.
Powerpc/Mips don't have this glitch in LEGITIMIZE_ADDRESS hook, so
should be OK, and I verified the base address of local array in the
testcase given by Seb on pr62173 do hoisted on ppc64 now. I think
pr62173 is fixed on those 64bit arch by this patch.

Thoughts?

Thanks.

2015-04-21  Jiong Wang  <jiong.wang@arm.com>

gcc/
  * loop-invariant.c (find_defs): Enable DF_DU_CHAIN build.
  (vfp_const_iv): New hash table.
  (expensive_addr_check_p): New boolean.
  (init_inv_motion_data): Initialize new variables.>
  (free_inv_motion_data): Release hash table.
  (create_new_invariant): Set cheap_address to false for iv in
  vfp_const_iv table.
  (find_invariant_insn): Skip dependencies check for iv in vfp_const_iv
  table.
  (use_for_single_du): New function.
  (reshuffle_insn_with_vfp): Likewise.
  (find_invariants_bb): Call reshuffle_insn_with_vfp.

gcc/testsuite/
   * gcc.dg/pr62173.c: New testcase.

-- 
Regards,
Jiong


[-- Attachment #2: fix-dwarf.patch --]
[-- Type: text/x-diff, Size: 7390 bytes --]

diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index f79b497..f70dfb0 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -203,6 +203,8 @@ typedef struct invariant *invariant_p;
 /* The invariants.  */
 
 static vec<invariant_p> invariants;
+static hash_table <pointer_hash <rtx_insn> > *vfp_const_iv;
+static bool need_expensive_addr_check_p;
 
 /* Check the size of the invariant table and realloc if necessary.  */
 
@@ -695,7 +697,7 @@ find_defs (struct loop *loop)
 
   df_remove_problem (df_chain);
   df_process_deferred_rescans ();
-  df_chain_add_problem (DF_UD_CHAIN);
+  df_chain_add_problem (DF_UD_CHAIN + DF_DU_CHAIN);
   df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
   df_analyze_loop (loop);
   check_invariant_table_size ();
@@ -742,6 +744,9 @@ create_new_invariant (struct def *def, rtx_insn *insn, bitmap depends_on,
 	 See http://gcc.gnu.org/ml/gcc-patches/2009-10/msg01210.html .  */
       inv->cheap_address = address_cost (SET_SRC (set), word_mode,
 					 ADDR_SPACE_GENERIC, speed) < 3;
+
+      if (need_expensive_addr_check_p && vfp_const_iv->find (insn))
+	inv->cheap_address = false;
     }
   else
     {
@@ -952,7 +957,8 @@ find_invariant_insn (rtx_insn *insn, bool always_reached, bool always_executed)
     return;
 
   depends_on = BITMAP_ALLOC (NULL);
-  if (!check_dependencies (insn, depends_on))
+  if (!vfp_const_iv->find (insn)
+      && !check_dependencies (insn, depends_on))
     {
       BITMAP_FREE (depends_on);
       return;
@@ -1007,6 +1013,180 @@ find_invariants_insn (rtx_insn *insn, bool always_reached, bool always_executed)
   record_uses (insn);
 }
 
+/*  Find the single use of def of insn. At the same time,
+    make sure def of insn is the single def which reach the use.
+    NOTE: debug_insn can affect DF, var_location debug insn may reference
+    the same rtl expr as variable location, such debug_insn should not
+    affect the insn shuffling while we do need to update the loc of the
+    debug_insn after insn shuffling. So here we will record such debug_insn.  */
+
+static rtx_insn *
+use_for_single_du (rtx_insn *insn, rtx_insn **debug_insn, rtx *debug_expr_loc)
+{
+  struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
+  df_ref def;
+
+  FOR_EACH_INSN_INFO_DEF (def, insn_info)
+    {
+      struct df_link *chain = DF_REF_CHAIN (def);
+
+      if (!chain)
+	return NULL;
+
+      /* For multi use, only allow the second use be debug_insn.  */
+      if (chain->next
+	  && (chain->next->next
+	      || NONDEBUG_INSN_P (DF_REF_INSN (chain->next->ref))))
+	return NULL;
+
+      if (chain->next)
+	{
+	  rtx_insn *insn = DF_REF_INSN (chain->next->ref);
+	  rtx debug_pat = PATTERN (insn);
+	  if (GET_CODE (debug_pat) != VAR_LOCATION)
+	    return NULL;
+	  else
+	    {
+	      *debug_insn = insn;
+	      *debug_expr_loc = PAT_VAR_LOCATION_LOC (debug_pat);
+	    }
+	}
+
+      df_ref use = chain->ref;
+      chain = DF_REF_CHAIN (use);
+
+      if (!chain
+	  || chain->next)
+	return NULL;
+
+      return DF_REF_INSN (use);
+    }
+
+  return NULL;
+}
+
+/* This function also try to transform
+
+     RA <- fixed_reg + RC
+     RD <- MEM (RA + const_offset)
+
+   into:
+
+     RA <- fixed_reg + const_offset
+     RD <- MEM (RA + RC) <- pos0
+
+  If use of RA in the second insn is the single use, and the define of
+  RA in the first insn is the single def reach the second insn.
+
+  After this change, the first instruction is loop invariant.  */
+
+static void
+reshuffle_insn_with_vfp (rtx_insn *insn)
+{
+  rtx set = single_set (insn);
+
+  if (!set
+      || GET_CODE (SET_SRC (set)) != PLUS)
+    return;
+
+  rtx dest = SET_DEST (set);
+  rtx op0 = XEXP (SET_SRC (set), 0);
+  rtx op1 = XEXP (SET_SRC (set), 1);
+  rtx non_vfp_op = op1;
+
+  if (op1 == frame_pointer_rtx
+      || op1 == stack_pointer_rtx
+      || op1 == virtual_stack_vars_rtx)
+    {
+      non_vfp_op = op0;
+      std::swap (op0, op1);
+    }
+
+  if (GET_CODE (op1) == SIGN_EXTEND
+      || GET_CODE (op1) == ZERO_EXTEND)
+    op1 = XEXP (op1, 0);
+
+  if (!(REG_P (dest) && REG_P (op0) && REG_P (op1)
+	&& (op0 == frame_pointer_rtx
+	    || op0 == stack_pointer_rtx
+	    || op0 == virtual_stack_vars_rtx)))
+    return;
+
+  rtx_insn *use_insn, *debug_insn = NULL;
+  rtx debug_expr_loc;
+
+  if (!(use_insn = use_for_single_du (insn, &debug_insn, &debug_expr_loc)))
+    return;
+
+  rtx u_set = single_set (use_insn);
+
+  if (!(u_set && (REG_P (SET_DEST (u_set)) || MEM_P (SET_DEST (u_set)))))
+    return;
+
+  rtx mem_addr;
+
+  if (REG_P (SET_DEST (u_set)))
+    mem_addr = SET_SRC (u_set);
+  else
+    mem_addr = SET_DEST (u_set);
+
+  if (debug_insn && !rtx_equal_p (debug_expr_loc, mem_addr))
+    return;
+
+  if (GET_CODE (mem_addr) == ZERO_EXTEND
+      || GET_CODE (mem_addr) == SIGN_EXTEND
+      || GET_CODE (mem_addr) == TRUNCATE)
+    mem_addr = XEXP (mem_addr, 0);
+
+  if (!MEM_P (mem_addr)
+      || GET_CODE (XEXP (mem_addr, 0)) != PLUS)
+    return;
+
+  rtx *mem_plus_loc = &XEXP (mem_addr, 0);
+  rtx u_op0 = XEXP (XEXP (mem_addr, 0), 0);
+  rtx u_op1 = XEXP (XEXP (mem_addr, 0), 1);
+
+  if (!(REG_P (u_op0) && CONST_INT_P (u_op1)
+	&& REGNO (dest) == REGNO (u_op0)))
+    return;
+
+  rtx new_src = plus_constant (GET_MODE (op0), op0, INTVAL (u_op1));
+  validate_change (insn, &SET_SRC (set), new_src, 1);
+  new_src = gen_rtx_PLUS (GET_MODE (u_op0), u_op0, non_vfp_op);
+  validate_change (use_insn, mem_plus_loc, new_src, 1);
+  if (debug_insn)
+    validate_change (debug_insn, &XEXP(debug_expr_loc, 0), copy_rtx (new_src),
+		     1);
+
+  if (apply_change_group ())
+    {
+      rtx_insn **slot = vfp_const_iv->find_slot (insn, INSERT);
+
+      if (*slot)
+	gcc_unreachable ();
+      else
+	*slot = insn;
+
+      need_expensive_addr_check_p = true;
+
+      /* We should update REG_DEAD info also.  */
+      rtx note;
+      if ((note = find_reg_note (insn, REG_DEAD, non_vfp_op)))
+	{
+	  remove_note (insn, note);
+	  add_shallow_copy_of_reg_note (use_insn, note);
+	}
+
+      if (dump_file)
+	fprintf (dump_file,
+		 "\nRe-associate insn %d and %d for later"
+		 " RTL loop invariant hoisting.\n",
+		 INSN_UID (insn), INSN_UID (use_insn));
+    }
+
+  return;
+}
+
 /* Finds invariants in basic block BB.  ALWAYS_REACHED is true if the
    basic block is always executed.  ALWAYS_EXECUTED is true if the basic
    block is always executed, unless the program ends due to a function
@@ -1022,6 +1197,8 @@ find_invariants_bb (basic_block bb, bool always_reached, bool always_executed)
       if (!NONDEBUG_INSN_P (insn))
 	continue;
 
+      reshuffle_insn_with_vfp (insn);
+
       find_invariants_insn (insn, always_reached, always_executed);
 
       if (always_reached
@@ -1647,6 +1824,9 @@ move_invariants (struct loop *loop)
 static void
 init_inv_motion_data (void)
 {
+  need_expensive_addr_check_p = false;
+  vfp_const_iv = new hash_table <pointer_hash <rtx_insn> > (4);
+
   actual_stamp = 1;
 
   invariants.create (100);
@@ -1682,6 +1862,8 @@ free_inv_motion_data (void)
       free (inv);
     }
   invariants.release ();
+
+  delete vfp_const_iv;
 }
 
 /* Move the invariants out of the LOOP.  */

  reply	other threads:[~2015-04-21 14:43 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-04 11:00 Jiong Wang
2014-12-04 11:07 ` Richard Biener
2014-12-04 11:07   ` Richard Biener
2014-12-04 19:32     ` Jiong Wang
2014-12-15 15:29       ` Jiong Wang
2014-12-15 15:36         ` Jiong Wang
2014-12-17 16:19           ` Richard Biener
2014-12-18 17:08             ` Jiong Wang
2014-12-18 21:16               ` Jiong Wang
2014-12-18 22:19               ` Segher Boessenkool
2014-12-19  4:06                 ` Bin.Cheng
2014-12-19 10:29                   ` Jiong Wang
2014-12-19 11:45                     ` Richard Biener
2014-12-19 15:31                       ` Kenneth Zadeck
2015-02-11 11:20                         ` Jiong Wang
2015-02-11 14:22                           ` Kenneth Zadeck
2015-02-11 18:18                             ` Jiong Wang
2015-04-14 15:06                               ` Jiong Wang
2015-04-14 16:49                                 ` Steven Bosscher
2015-04-14 17:24                                   ` Jeff Law
2015-04-14 21:49                                     ` Jiong Wang
2015-04-21 14:43                                       ` Jiong Wang [this message]
2015-04-24  1:55                                         ` Jeff Law
2015-04-24 17:05                                           ` Jiong Wang
2015-05-14 20:04                                             ` Jeff Law
2015-05-14 22:07                                               ` Jiong Wang
2015-05-14 22:24                                                 ` Jeff Law
2015-05-21 21:51                                                   ` Jiong Wang
2015-05-27 16:11                                                     ` Jeff Law
2015-09-02 13:49                                                       ` Jiong Wang
2015-09-02 20:52                                                         ` Jeff Law
2015-04-28 12:16                                         ` Jiong Wang
2015-04-28 14:00                                           ` Matthew Fortune
2015-04-28 14:31                                             ` Jiong Wang
2015-04-19 16:20                               ` Jiong Wang
2014-12-19 12:09                     ` Eric Botcazou
2014-12-19 15:21                   ` Segher Boessenkool

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=n99vbgp35kl.fsf@arm.com \
    --to=jiong.wang@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=law@redhat.com \
    --cc=stevenb.gcc@gmail.com \
    --cc=zadeck@naturalbridge.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).