public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [patch][ARM] Don't generate redundant zero_extend before smlabb
@ 2010-12-09 16:19 Andrew Stubbs
  2010-12-15 10:15 ` Richard Sandiford
  2010-12-16 20:05 ` Richard Earnshaw
  0 siblings, 2 replies; 5+ messages in thread
From: Andrew Stubbs @ 2010-12-09 16:19 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1047 bytes --]

The attached patch adjusts the ARM machine description to prevent GCC 
emitting redundant zero-extends before 16-bit->32-bit multiply and 
accumulate operations.

The problem is that maddhisi4 pattern has the operands of plus swapped, 
with respect to the (de facto?) canonical form always returned by 
combine_simplify_rtx. This means that recog does not match the pattern, 
and the zero-extends are not optimized away.

The patch simply swaps the order of the operands. maddhidi4 appears to 
have a similar problem, so I've corrected it there also.

Test case:

int footrunc (int x, int a, int b)
{
      return x + (short) a * (short) b;
}

Before, compiled with "-O2":

          mov     ip, r1, asl #16
          mov     r3, r2, asl #16
          mov     r1, ip, lsr #16
          mov     r2, r3, lsr #16
          smlabb  r0, r2, r1, r0
          bx      lr

(On armv7a/thumb2 the code uses uxth, but the problem is the same.)

After:
          smlabb  r0, r1, r2, r0
          bx      lr

OK for commit, after stage 1 opens again?

Andrew


[-- Attachment #2: arm-madd-combine.patch --]
[-- Type: text/x-patch, Size: 1565 bytes --]

2010-12-09  Andrew Stubbs  <ams@codesourcery.com>

	gcc/
	* config/arm/arm.md (maddhisi4, *maddhidi4): Use the canonical
	operand order for plus.

---
 src/gcc-mainline/gcc/config/arm/arm.md |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/gcc-mainline/gcc/config/arm/arm.md b/src/gcc-mainline/gcc/config/arm/arm.md
index 8bc9926..c816126 100644
--- a/src/gcc-mainline/gcc/config/arm/arm.md
+++ b/src/gcc-mainline/gcc/config/arm/arm.md
@@ -1800,11 +1800,11 @@
 
 (define_insn "maddhisi4"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
-	(plus:SI (match_operand:SI 3 "s_register_operand" "r")
-		 (mult:SI (sign_extend:SI
+	(plus:SI (mult:SI (sign_extend:SI
 			   (match_operand:HI 1 "s_register_operand" "%r"))
 			  (sign_extend:SI
-			   (match_operand:HI 2 "s_register_operand" "r")))))]
+			   (match_operand:HI 2 "s_register_operand" "r")))
+		 (match_operand:SI 3 "s_register_operand" "r")))]
   "TARGET_DSP_MULTIPLY"
   "smlabb%?\\t%0, %1, %2, %3"
   [(set_attr "insn" "smlaxy")
@@ -1814,11 +1814,11 @@
 (define_insn "*maddhidi4"
   [(set (match_operand:DI 0 "s_register_operand" "=r")
 	(plus:DI
-	  (match_operand:DI 3 "s_register_operand" "0")
 	  (mult:DI (sign_extend:DI
 	 	    (match_operand:HI 1 "s_register_operand" "%r"))
 		   (sign_extend:DI
-		    (match_operand:HI 2 "s_register_operand" "r")))))]
+		    (match_operand:HI 2 "s_register_operand" "r")))
+	  (match_operand:DI 3 "s_register_operand" "0")))]
   "TARGET_DSP_MULTIPLY"
   "smlalbb%?\\t%Q0, %R0, %1, %2"
   [(set_attr "insn" "smlalxy")

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch][ARM] Don't generate redundant zero_extend before smlabb
  2010-12-09 16:19 [patch][ARM] Don't generate redundant zero_extend before smlabb Andrew Stubbs
@ 2010-12-15 10:15 ` Richard Sandiford
  2010-12-15 10:18   ` Richard Sandiford
  2010-12-16 20:05 ` Richard Earnshaw
  1 sibling, 1 reply; 5+ messages in thread
From: Richard Sandiford @ 2010-12-15 10:15 UTC (permalink / raw)
  To: Andrew Stubbs; +Cc: gcc-patches

Andrew Stubbs <ams@codesourcery.com> writes:
> The attached patch adjusts the ARM machine description to prevent GCC 
> emitting redundant zero-extends before 16-bit->32-bit multiply and 
> accumulate operations.
>
> The problem is that maddhisi4 pattern has the operands of plus swapped, 
> with respect to the (de facto?) canonical form always returned by 
> combine_simplify_rtx. This means that recog does not match the pattern, 
> and the zero-extends are not optimized away.
>
> The patch simply swaps the order of the operands. maddhidi4 appears to 
> have a similar problem, so I've corrected it there also.
>
> Test case:
>
> int footrunc (int x, int a, int b)
> {
>       return x + (short) a * (short) b;
> }
>
> Before, compiled with "-O2":
>
>           mov     ip, r1, asl #16
>           mov     r3, r2, asl #16
>           mov     r1, ip, lsr #16
>           mov     r2, r3, lsr #16
>           smlabb  r0, r2, r1, r0
>           bx      lr
>
> (On armv7a/thumb2 the code uses uxth, but the problem is the same.)
>
> After:
>           smlabb  r0, r1, r2, r0
>           bx      lr
>
> OK for commit, after stage 1 opens again?

Could this go in during stage 3?  Like you say, the current maddhisi4
expander (which is known to optabs) generates "noncanonical" rtx.
That seems like a bug.

Richard

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch][ARM] Don't generate redundant zero_extend before smlabb
  2010-12-15 10:15 ` Richard Sandiford
@ 2010-12-15 10:18   ` Richard Sandiford
  0 siblings, 0 replies; 5+ messages in thread
From: Richard Sandiford @ 2010-12-15 10:18 UTC (permalink / raw)
  To: Andrew Stubbs; +Cc: gcc-patches

Richard Sandiford <richard.sandiford@linaro.org> writes:
> Could this go in during stage 3?  Like you say, the current maddhisi4
> expander (which is known to optabs) generates "noncanonical" rtx.
> That seems like a bug.

Sorry to follow up on myself, but to be more specific: as Paolo said
recently, "noncanonical rtx" in this type of situation really means
"invalid rtx", so having optabs generate what is effectively invalid
rtx seems like a correctness bug.

Richard

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch][ARM] Don't generate redundant zero_extend before smlabb
  2010-12-09 16:19 [patch][ARM] Don't generate redundant zero_extend before smlabb Andrew Stubbs
  2010-12-15 10:15 ` Richard Sandiford
@ 2010-12-16 20:05 ` Richard Earnshaw
  2010-12-17 17:02   ` Andrew Stubbs
  1 sibling, 1 reply; 5+ messages in thread
From: Richard Earnshaw @ 2010-12-16 20:05 UTC (permalink / raw)
  To: Andrew Stubbs; +Cc: gcc-patches


On Thu, 2010-12-09 at 16:01 +0000, Andrew Stubbs wrote:
> The attached patch adjusts the ARM machine description to prevent GCC 
> emitting redundant zero-extends before 16-bit->32-bit multiply and 
> accumulate operations.
> 
> The problem is that maddhisi4 pattern has the operands of plus swapped, 
> with respect to the (de facto?) canonical form always returned by 
> combine_simplify_rtx. This means that recog does not match the pattern, 
> and the zero-extends are not optimized away.
> 
> The patch simply swaps the order of the operands. maddhidi4 appears to 
> have a similar problem, so I've corrected it there also.
> 
> Test case:
> 
> int footrunc (int x, int a, int b)
> {
>       return x + (short) a * (short) b;
> }
> 
> Before, compiled with "-O2":
> 
>           mov     ip, r1, asl #16
>           mov     r3, r2, asl #16
>           mov     r1, ip, lsr #16
>           mov     r2, r3, lsr #16
>           smlabb  r0, r2, r1, r0
>           bx      lr
> 
> (On armv7a/thumb2 the code uses uxth, but the problem is the same.)
> 
> After:
>           smlabb  r0, r1, r2, r0
>           bx      lr
> 
> OK for commit, after stage 1 opens again?
> 
> Andrew

This is ok to go in now.  While you're modifying these patterns, the %
in the register constraint does not do anything useful (and actually
makes the compiler slightly slower) since the constraints and predicates
are otherwise identical, so it might as well be removed.

R.



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch][ARM] Don't generate redundant zero_extend before smlabb
  2010-12-16 20:05 ` Richard Earnshaw
@ 2010-12-17 17:02   ` Andrew Stubbs
  0 siblings, 0 replies; 5+ messages in thread
From: Andrew Stubbs @ 2010-12-17 17:02 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 403 bytes --]

On 16/12/10 18:26, Richard Earnshaw wrote:
> This is ok to go in now.  While you're modifying these patterns, the %
> in the register constraint does not do anything useful (and actually
> makes the compiler slightly slower) since the constraints and predicates
> are otherwise identical, so it might as well be removed.
>

Thanks Richard.

I've made the adjustment, and committed the attached.

Andrew

[-- Attachment #2: arm-madd-combine.patch --]
[-- Type: text/x-patch, Size: 1714 bytes --]

2010-12-17  Andrew Stubbs  <ams@codesourcery.com>

	gcc/
	* config/arm/arm.md (maddhisi4, *maddhidi4): Use the canonical
	operand order for plus.
	Drop redundant % from constraints.

---
 src/gcc-mainline/gcc/config/arm/arm.md |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/gcc-mainline/gcc/config/arm/arm.md b/src/gcc-mainline/gcc/config/arm/arm.md
index 20431d3..dd7555b 100644
--- a/src/gcc-mainline/gcc/config/arm/arm.md
+++ b/src/gcc-mainline/gcc/config/arm/arm.md
@@ -1793,11 +1793,11 @@
 
 (define_insn "maddhisi4"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
-	(plus:SI (match_operand:SI 3 "s_register_operand" "r")
-		 (mult:SI (sign_extend:SI
-			   (match_operand:HI 1 "s_register_operand" "%r"))
+	(plus:SI (mult:SI (sign_extend:SI
+			   (match_operand:HI 1 "s_register_operand" "r"))
 			  (sign_extend:SI
-			   (match_operand:HI 2 "s_register_operand" "r")))))]
+			   (match_operand:HI 2 "s_register_operand" "r")))
+		 (match_operand:SI 3 "s_register_operand" "r")))]
   "TARGET_DSP_MULTIPLY"
   "smlabb%?\\t%0, %1, %2, %3"
   [(set_attr "insn" "smlaxy")
@@ -1807,11 +1807,11 @@
 (define_insn "*maddhidi4"
   [(set (match_operand:DI 0 "s_register_operand" "=r")
 	(plus:DI
-	  (match_operand:DI 3 "s_register_operand" "0")
 	  (mult:DI (sign_extend:DI
-	 	    (match_operand:HI 1 "s_register_operand" "%r"))
+	 	    (match_operand:HI 1 "s_register_operand" "r"))
 		   (sign_extend:DI
-		    (match_operand:HI 2 "s_register_operand" "r")))))]
+		    (match_operand:HI 2 "s_register_operand" "r")))
+	  (match_operand:DI 3 "s_register_operand" "0")))]
   "TARGET_DSP_MULTIPLY"
   "smlalbb%?\\t%Q0, %R0, %1, %2"
   [(set_attr "insn" "smlalxy")

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2010-12-17 16:31 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-12-09 16:19 [patch][ARM] Don't generate redundant zero_extend before smlabb Andrew Stubbs
2010-12-15 10:15 ` Richard Sandiford
2010-12-15 10:18   ` Richard Sandiford
2010-12-16 20:05 ` Richard Earnshaw
2010-12-17 17:02   ` Andrew Stubbs

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).