public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH AArch64]Force register scaling out of mem ref and comment why
@ 2016-02-04 10:12 Bin Cheng
  2016-02-04 10:15 ` James Greenhalgh
  0 siblings, 1 reply; 2+ messages in thread
From: Bin Cheng @ 2016-02-04 10:12 UTC (permalink / raw)
  To: gcc-patches; +Cc: nd, James Greenhalgh, Marcus Shawcroft, Richard Earnshaw

[-- Attachment #1: Type: text/plain, Size: 1417 bytes --]

Hi,
There is a performance regression caused by my previous change to aarch64_legitimize_address, in which I forced constant offset out of memory ref if the address expr is in the form of "reg1 + reg2 << scale + const".  The intention is to reveal as many loop-invariant opportunities as possible, while depend on GIMPLE optimizers picking up CSE opportunities of "reg << scale" among different memory references.

Though the assumption still holds, gimple optimizers are not powerful enough to pick up CSE opportunities of register scaling expressions at current time.  Here comes a workaround: this patch forces register scaling expression out of memory ref, so that RTL CSE pass can handle common register scaling expressions issue, of course, at a cost of possibly missed loop invariants.

James and I collected perf data, fortunately this change can improve performance for several cases in various benchmarks, while doesn't cause big regression.  It also recovers big regression we observed before for the previous change.

I also added comment explaining why the workaround is necessary.  I also files PR69653 as an example showing tree optimizer should be improved.

Bootstrap and test on AArch64, is it OK?

Thanks,
bin


2016-02-04  Bin Cheng  <bin.cheng@arm.com>

	* config/aarch64/aarch64.c (aarch64_legitimize_address): Force
	register scaling out of memory reference and comment why.

[-- Attachment #2: force-reg_scaling-out-of-mem_ref-20160201.txt --]
[-- Type: text/plain, Size: 1807 bytes --]

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 03bc1b9..24a7d5b 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4926,13 +4926,18 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  */, machine_mode mode)
 	       Rt = Ra + Rc;
 	       addr = Rt + Rb<<SCALE.
 
-	     Here we split CONST out of memory referece because:
+	     TODO: We really should split CONST out of memory referece
+	     because:
 	       a) We depend on GIMPLE optimizers to pick up common sub
 		  expression involving the scaling operation.
 	       b) The index Rb is likely a loop iv, it's better to split
 		  the CONST so that computation of new base Rt is a loop
 		  invariant and can be moved out of loop.  This is more
-		  important when the original base Ra is sfp related.  */
+		  important when the original base Ra is sfp related.
+
+	     Unfortunately, GIMPLE optimizers (e.g., SLSR) can not handle
+	     this kind of CSE opportunity at the time of this change, we
+	     have to force register scaling expr out of memory ref now.  */
 	  else if (REG_P (op0) || REG_P (op1))
 	    {
 	      machine_mode addr_mode = GET_MODE (x);
@@ -4942,14 +4947,13 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  */, machine_mode mode)
 	      if (REG_P (op1))
 		std::swap (op0, op1);
 
-	      rtx addr = gen_rtx_PLUS (addr_mode, op1, base);
+	      rtx addr = plus_constant (addr_mode, base, offset);
 
 	      if (aarch64_legitimate_address_hook_p (mode, addr, false))
 		{
-		  base = force_operand (plus_constant (addr_mode,
-						       op0, offset),
+		  base = force_operand (gen_rtx_PLUS (addr_mode, op1, op0),
 					NULL_RTX);
-		  return gen_rtx_PLUS (addr_mode, op1, base);
+		  return plus_constant (addr_mode, base, offset);
 		}
 	    }
 	}

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [PATCH AArch64]Force register scaling out of mem ref and comment why
  2016-02-04 10:12 [PATCH AArch64]Force register scaling out of mem ref and comment why Bin Cheng
@ 2016-02-04 10:15 ` James Greenhalgh
  0 siblings, 0 replies; 2+ messages in thread
From: James Greenhalgh @ 2016-02-04 10:15 UTC (permalink / raw)
  To: Bin Cheng; +Cc: gcc-patches, nd, Marcus Shawcroft, Richard Earnshaw

On Thu, Feb 04, 2016 at 10:11:53AM +0000, Bin Cheng wrote:
> Hi,
> There is a performance regression caused by my previous change to
> aarch64_legitimize_address, in which I forced constant offset out of memory
> ref if the address expr is in the form of "reg1 + reg2 << scale + const".
> The intention is to reveal as many loop-invariant opportunities as possible,
> while depend on GIMPLE optimizers picking up CSE opportunities of "reg <<
> scale" among different memory references.
> 
> Though the assumption still holds, gimple optimizers are not powerful enough
> to pick up CSE opportunities of register scaling expressions at current time.
> Here comes a workaround: this patch forces register scaling expression out of
> memory ref, so that RTL CSE pass can handle common register scaling
> expressions issue, of course, at a cost of possibly missed loop invariants.
> 
> James and I collected perf data, fortunately this change can improve
> performance for several cases in various benchmarks, while doesn't cause big
> regression.  It also recovers big regression we observed before for the
> previous change.
> 
> I also added comment explaining why the workaround is necessary.  I also
> files PR69653 as an example showing tree optimizer should be improved.
> 
> Bootstrap and test on AArch64, is it OK?

OK.

Thanks,
James

> 
> Thanks,
> bin
> 
> 
> 2016-02-04  Bin Cheng  <bin.cheng@arm.com>
> 
> 	* config/aarch64/aarch64.c (aarch64_legitimize_address): Force
> 	register scaling out of memory reference and comment why.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2016-02-04 10:15 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-04 10:12 [PATCH AArch64]Force register scaling out of mem ref and comment why Bin Cheng
2016-02-04 10:15 ` James Greenhalgh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).