public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH, ARM] Keep constants in register when expanding
@ 2014-08-05  9:31 Zhenqiang Chen
  2014-08-08 15:22 ` Ramana Radhakrishnan
  2014-08-12  9:24 ` Richard Earnshaw
  0 siblings, 2 replies; 7+ messages in thread
From: Zhenqiang Chen @ 2014-08-05  9:31 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1898 bytes --]

Hi,

For some large constants, ARM will split them during expanding, which
makes impossible to hoist them out the loop or shared by different
references (refer the test case in the patch).

The patch keeps some constants in registers. If the constant can not
be optimized, the cprop and combine passes can optimize them as what
we do in current expand pass with

define_insn_and_split "*arm_subsi3_insn"
define_insn_and_split "*arm_andsi3_insn"
define_insn_and_split "*iorsi3_insn"
define_insn_and_split "*arm_xorsi3"

The patch does not modify addsi3 since the define_insn_and_split
"*arm_addsi3" is only valid when (reload_completed ||
!arm_eliminable_register (operands[1])). The cprop and combine passes
can not optimize the large constant if we put it in register, which
will lead to regression.

For logic operators, the patch skips changes for constants:

INTVAL (operands[2]) < 0 && const_ok_for_arm (-INTVAL (operands[2])

since expand pass always uses "sign-extend" to get the value
(trunc_int_for_mode called from immed_wide_int_const) for rtl, and
logs show most negative values are UNSIGNED when they are TREE node.
And combine pass is smart enough to recover the negative value to
positive value.

Bootstrap and no make check regression on Chrome book.
For coremark, dhrystone and eembcv1, no any code size and performance
change on Cortex-M4.
No any file in CSiBE has code size change for Cortex-A15 and Cortex-M4.
No Spec2000 performance regression on Chrome book and dumped assemble
codes only show very few difference.

OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2014-08-05  Zhenqiang Chen  <zhenqiang.chen@linaro.org>

        * config/arm/arm.md (subsi3, andsi3, iorsi3, xorsi3): Keep some
        large constants in register other than split them.

testsuite/ChangeLog:
2014-08-05  Zhenqiang Chen  <zhenqiang.chen@linaro.org>

        * gcc.target/arm/maskdata.c: New test.

[-- Attachment #2: skip-split.patch --]
[-- Type: text/x-patch, Size: 3704 bytes --]

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index bd8ea8f..c8b3001 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1162,9 +1162,16 @@
     {
       if (TARGET_32BIT)
         {
-          arm_split_constant (MINUS, SImode, NULL_RTX,
-	                      INTVAL (operands[1]), operands[0],
-	  		      operands[2], optimize && can_create_pseudo_p ());
+	  if (!const_ok_for_arm (INTVAL (operands[1]))
+	      && can_create_pseudo_p ())
+	    {
+	      operands[1] = force_reg (SImode, operands[1]);
+	      emit_insn (gen_subsi3 (operands[0], operands[1], operands[2]));
+	    }
+	  else
+	    arm_split_constant (MINUS, SImode, NULL_RTX,
+				INTVAL (operands[1]), operands[0], operands[2],
+				optimize && can_create_pseudo_p ());
           DONE;
 	}
       else /* TARGET_THUMB1 */
@@ -2077,6 +2084,17 @@
 	      emit_insn (gen_thumb2_zero_extendqisi2_v6 (operands[0],
 							 operands[1]));
 	    }
+	  else if (!(const_ok_for_arm (INTVAL (operands[2]))
+		     || const_ok_for_arm (~INTVAL (operands[2]))
+			/* zero_extendhi instruction is efficient enough.  */
+		     || INTVAL (operands[2]) == 0xffff
+		     || (INTVAL (operands[2]) < 0
+			 && const_ok_for_arm (-INTVAL (operands[2]))))
+		   && can_create_pseudo_p ())
+	    {
+	      operands[2] = force_reg (SImode, operands[2]);
+	      emit_insn (gen_andsi3 (operands[0], operands[1], operands[2]));
+	    }
 	  else
 	    arm_split_constant (AND, SImode, NULL_RTX,
 				INTVAL (operands[2]), operands[0],
@@ -2882,9 +2900,20 @@
     {
       if (TARGET_32BIT)
         {
-          arm_split_constant (IOR, SImode, NULL_RTX,
-	                      INTVAL (operands[2]), operands[0], operands[1],
-			      optimize && can_create_pseudo_p ());
+	  if (!(const_ok_for_arm (INTVAL (operands[2]))
+		|| (TARGET_THUMB2
+		    && const_ok_for_arm (~INTVAL (operands[2])))
+		|| (INTVAL (operands[2]) < 0
+		    && const_ok_for_arm (-INTVAL (operands[2]))))
+	      && can_create_pseudo_p ())
+	    {
+	      operands[2] = force_reg (SImode, operands[2]);
+	      emit_insn (gen_iorsi3 (operands[0], operands[1], operands[2]));
+	    }
+	  else
+	    arm_split_constant (IOR, SImode, NULL_RTX,
+				INTVAL (operands[2]), operands[0], operands[1],
+				optimize && can_create_pseudo_p ());
           DONE;
 	}
       else /* TARGET_THUMB1 */
@@ -3052,9 +3081,18 @@
     {
       if (TARGET_32BIT)
         {
-          arm_split_constant (XOR, SImode, NULL_RTX,
-	                      INTVAL (operands[2]), operands[0], operands[1],
-			      optimize && can_create_pseudo_p ());
+	  if (!(const_ok_for_arm (INTVAL (operands[2]))
+		|| (INTVAL (operands[2]) < 0
+		     && const_ok_for_arm (-INTVAL (operands[2]))))
+	      && can_create_pseudo_p ())
+	    {
+	      operands[2] = force_reg (SImode, operands[2]);
+	      emit_insn (gen_xorsi3 (operands[0], operands[1], operands[2]));
+	    }
+	  else
+	    arm_split_constant (XOR, SImode, NULL_RTX,
+				INTVAL (operands[2]), operands[0], operands[1],
+				optimize && can_create_pseudo_p ());
           DONE;
 	}
       else /* TARGET_THUMB1 */

diff --git a/gcc/testsuite/gcc.target/arm/maskdata.c b/gcc/testsuite/gcc.target/arm/maskdata.c
new file mode 100644
index 0000000..b4231a4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/maskdata.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options " -O2 -fno-gcse " }  */
+/* { dg-require-effective-target arm_thumb2_ok } */
+
+#define MASK 0xfe00ff
+void maskdata (int * data, int len)
+{
+   int i = len;
+   for (; i > 0; i -= 2)
+    {
+      data[i] &= MASK;
+      data[i + 1] &= MASK;
+    }
+}
+/* { dg-final { scan-assembler "254" } } */
+/* { dg-final { scan-assembler "255" } } */

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2014-08-13  7:46 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-05  9:31 [PATCH, ARM] Keep constants in register when expanding Zhenqiang Chen
2014-08-08 15:22 ` Ramana Radhakrishnan
2014-08-11  2:35   ` Zhenqiang Chen
2014-08-11 11:15     ` Ramana Radhakrishnan
2014-08-12  5:37       ` Zhenqiang Chen
2014-08-12  9:24 ` Richard Earnshaw
2014-08-13  7:46   ` Zhenqiang Chen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).