public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH][Aarch64] Relational compare zero not merged into subtract
@ 2017-06-01 23:54 Michael Collison
  2017-07-10 17:12 ` James Greenhalgh
  2017-07-11  9:20 ` Richard Earnshaw (lists)
  0 siblings, 2 replies; 5+ messages in thread
From: Michael Collison @ 2017-06-01 23:54 UTC (permalink / raw)
  To: gcc-patches; +Cc: nd

[-- Attachment #1: Type: text/plain, Size: 774 bytes --]

This patch improves code generation for relational compares against zero that are not merged into a subtract instruction. This patch improves the >= and < cases.

An example of the '<' case:

int lt (int x, int y)
{
  if ((x - y) < 0)
    return 10;

  return 0;
}

Trunk generates:

lt:
	sub	w1, w0, w1
	mov	w0, 10
	cmp	w1, 0
	csel	w0, w0, wzr, lt
	ret

With the patch we can eliminate the redundant subtract and now generate:

lt:
	cmp	w0, w1
	mov	w0, 10
	csel	w0, w0, wzr, mi
	ret

Bootstrapped and tested on aarch64-linux-gnu. Okay for trunk?

2017-06-01  Michael Collison  <michael.collison@arm.com>

	* config/aarch64/aarch64-simd.md(aarch64_sub<mode>_compare0):
	New pattern.
	* testsuite/gcc.target/aarch64/cmp-2.c: New testcase.

[-- Attachment #2: pr7261.patch.patch --]
[-- Type: application/octet-stream, Size: 1192 bytes --]

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 51368e2..b90c728 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -2011,6 +2011,17 @@
   [(set_attr "type" "alus_sreg,alus_imm,alus_imm")]
 )
 
+(define_insn "aarch64_sub<mode>_compare0"
+  [(set (reg:CC_NZ CC_REGNUM)
+	(compare:CC_NZ
+	 (minus:GPI (match_operand:GPI 0 "register_operand" "r")
+		   (match_operand:GPI 1 "aarch64_plus_operand" "r"))
+	 (const_int 0)))]
+  ""
+  "cmp\\t%<w>0, %<w>1"
+  [(set_attr "type" "alus_sreg")]
+)
+
 (define_insn "*compare_neg<mode>"
   [(set (reg:CC_Z CC_REGNUM)
 	(compare:CC_Z
diff --git a/gcc/testsuite/gcc.target/aarch64/cmp-2.c b/gcc/testsuite/gcc.target/aarch64/cmp-2.c
new file mode 100644
index 0000000..1201664
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cmp-2.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int lt (int x, int y)
+{
+  if ((x - y) < 0)
+    return 10;
+
+  return 0;
+}
+
+int ge (int x, int y)
+{
+  if ((x - y) >= 0)
+    return 10;
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "csel\t" 2 } } */
+/* { dg-final { scan-assembler-not "sub\t" } } */
-- 
1.9.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH][Aarch64] Relational compare zero not merged into subtract
  2017-06-01 23:54 [PATCH][Aarch64] Relational compare zero not merged into subtract Michael Collison
@ 2017-07-10 17:12 ` James Greenhalgh
  2017-07-11  7:26   ` Michael Collison
  2017-07-11  9:20 ` Richard Earnshaw (lists)
  1 sibling, 1 reply; 5+ messages in thread
From: James Greenhalgh @ 2017-07-10 17:12 UTC (permalink / raw)
  To: Michael Collison; +Cc: gcc-patches, nd

On Thu, Jun 01, 2017 at 11:54:33PM +0000, Michael Collison wrote:
> This patch improves code generation for relational compares against zero that
> are not merged into a subtract instruction. This patch improves the >= and <
> cases.
> 
> An example of the '<' case:
> 
> int lt (int x, int y)
> {
>   if ((x - y) < 0)
>     return 10;
> 
>   return 0;
> }
> 
> Trunk generates:
> 
> lt:
> 	sub	w1, w0, w1
> 	mov	w0, 10
> 	cmp	w1, 0
> 	csel	w0, w0, wzr, lt
> 	ret
> 
> With the patch we can eliminate the redundant subtract and now generate:
> 
> lt:
> 	cmp	w0, w1
> 	mov	w0, 10
> 	csel	w0, w0, wzr, mi
> 	ret

I'm not up to speed on the way we use CC register modes in the AArch64
Backend. On the one hand looking at patterns like *sub<mode>3_compare0, this
patch looks correct. Those too generate subs instructions, and only set the
CC_NZ CC register. But on the other hand cmp<mode> sets the full CC
register. As cmp is an alias for subs, should they not set the same
CC register mode? Certainly the instruction sets more than just N and Z.

As I say, I don't understand this area, and I'm not sure if my objection
is reasonable. Hopefully someone who knows CC modes better could help me
understand why this is correct.

Thanks,
James

> 
> Bootstrapped and tested on aarch64-linux-gnu. Okay for trunk?
> 
> 2017-06-01  Michael Collison  <michael.collison@arm.com>
> 
> 	* config/aarch64/aarch64-simd.md(aarch64_sub<mode>_compare0):
> 	New pattern.
> 	* testsuite/gcc.target/aarch64/cmp-2.c: New testcase.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: [PATCH][Aarch64] Relational compare zero not merged into subtract
  2017-07-10 17:12 ` James Greenhalgh
@ 2017-07-11  7:26   ` Michael Collison
  0 siblings, 0 replies; 5+ messages in thread
From: Michael Collison @ 2017-07-11  7:26 UTC (permalink / raw)
  To: James Greenhalgh; +Cc: gcc-patches, nd

James,

The subtract instruction only reliably sets the N and Z flags. We convey this information in aarch64_seelct_cc_mode.

Regards,

Michael Collison

-----Original Message-----
From: James Greenhalgh [mailto:james.greenhalgh@arm.com] 
Sent: Monday, July 10, 2017 10:12 AM
To: Michael Collison <Michael.Collison@arm.com>
Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>
Subject: Re: [PATCH][Aarch64] Relational compare zero not merged into subtract

On Thu, Jun 01, 2017 at 11:54:33PM +0000, Michael Collison wrote:
> This patch improves code generation for relational compares against 
> zero that are not merged into a subtract instruction. This patch 
> improves the >= and < cases.
> 
> An example of the '<' case:
> 
> int lt (int x, int y)
> {
>   if ((x - y) < 0)
>     return 10;
> 
>   return 0;
> }
> 
> Trunk generates:
> 
> lt:
> 	sub	w1, w0, w1
> 	mov	w0, 10
> 	cmp	w1, 0
> 	csel	w0, w0, wzr, lt
> 	ret
> 
> With the patch we can eliminate the redundant subtract and now generate:
> 
> lt:
> 	cmp	w0, w1
> 	mov	w0, 10
> 	csel	w0, w0, wzr, mi
> 	ret

I'm not up to speed on the way we use CC register modes in the AArch64 Backend. On the one hand looking at patterns like *sub<mode>3_compare0, this patch looks correct. Those too generate subs instructions, and only set the CC_NZ CC register. But on the other hand cmp<mode> sets the full CC register. As cmp is an alias for subs, should they not set the same CC register mode? Certainly the instruction sets more than just N and Z.

As I say, I don't understand this area, and I'm not sure if my objection is reasonable. Hopefully someone who knows CC modes better could help me understand why this is correct.

Thanks,
James

> 
> Bootstrapped and tested on aarch64-linux-gnu. Okay for trunk?
> 
> 2017-06-01  Michael Collison  <michael.collison@arm.com>
> 
> 	* config/aarch64/aarch64-simd.md(aarch64_sub<mode>_compare0):
> 	New pattern.
> 	* testsuite/gcc.target/aarch64/cmp-2.c: New testcase.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH][Aarch64] Relational compare zero not merged into subtract
  2017-06-01 23:54 [PATCH][Aarch64] Relational compare zero not merged into subtract Michael Collison
  2017-07-10 17:12 ` James Greenhalgh
@ 2017-07-11  9:20 ` Richard Earnshaw (lists)
  1 sibling, 0 replies; 5+ messages in thread
From: Richard Earnshaw (lists) @ 2017-07-11  9:20 UTC (permalink / raw)
  To: Michael Collison, gcc-patches; +Cc: nd

On 02/06/17 00:54, Michael Collison wrote:
> This patch improves code generation for relational compares against zero that are not merged into a subtract instruction. This patch improves the >= and < cases.
> 
> An example of the '<' case:
> 
> int lt (int x, int y)
> {
>   if ((x - y) < 0)
>     return 10;
> 
>   return 0;
> }
> 
> Trunk generates:
> 
> lt:
> 	sub	w1, w0, w1
> 	mov	w0, 10
> 	cmp	w1, 0
> 	csel	w0, w0, wzr, lt
> 	ret
> 
> With the patch we can eliminate the redundant subtract and now generate:
> 
> lt:
> 	cmp	w0, w1
> 	mov	w0, 10
> 	csel	w0, w0, wzr, mi
> 	ret
> 
> Bootstrapped and tested on aarch64-linux-gnu. Okay for trunk?
> 
> 2017-06-01  Michael Collison  <michael.collison@arm.com>
> 
> 	* config/aarch64/aarch64-simd.md(aarch64_sub<mode>_compare0):
> 	New pattern.
> 	* testsuite/gcc.target/aarch64/cmp-2.c: New testcase.
> 
> 

OK.

R.

> pr7261.patch.patch
> 
> 
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 51368e2..b90c728 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -2011,6 +2011,17 @@
>    [(set_attr "type" "alus_sreg,alus_imm,alus_imm")]
>  )
>  
> +(define_insn "aarch64_sub<mode>_compare0"
> +  [(set (reg:CC_NZ CC_REGNUM)
> +	(compare:CC_NZ
> +	 (minus:GPI (match_operand:GPI 0 "register_operand" "r")
> +		   (match_operand:GPI 1 "aarch64_plus_operand" "r"))
> +	 (const_int 0)))]
> +  ""
> +  "cmp\\t%<w>0, %<w>1"
> +  [(set_attr "type" "alus_sreg")]
> +)
> +
>  (define_insn "*compare_neg<mode>"
>    [(set (reg:CC_Z CC_REGNUM)
>  	(compare:CC_Z
> diff --git a/gcc/testsuite/gcc.target/aarch64/cmp-2.c b/gcc/testsuite/gcc.target/aarch64/cmp-2.c
> new file mode 100644
> index 0000000..1201664
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/cmp-2.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +int lt (int x, int y)
> +{
> +  if ((x - y) < 0)
> +    return 10;
> +
> +  return 0;
> +}
> +
> +int ge (int x, int y)
> +{
> +  if ((x - y) >= 0)
> +    return 10;
> +
> +  return 0;
> +}
> +
> +/* { dg-final { scan-assembler-times "csel\t" 2 } } */
> +/* { dg-final { scan-assembler-not "sub\t" } } */
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: [PATCH][Aarch64] Relational compare zero not merged into subtract
@ 2017-07-11 10:21 Wilco Dijkstra
  0 siblings, 0 replies; 5+ messages in thread
From: Wilco Dijkstra @ 2017-07-11 10:21 UTC (permalink / raw)
  To: Michael Collison, James Greenhalgh; +Cc: GCC Patches, nd

Michael Collison wrote:
>
> The subtract instruction only reliably sets the N and Z flags. We convey this information in
> aarch64_seelct_cc_mode.

The SUBS and CMP set the N and Z flags identically - although they also set C and V, they
are different if there is overflow.  CC_NZmode is used after merging a compare with zero into
an ALU instruction - generally N and Z are valid. This means that LT and GE condition codes
must be translated into MI and PL (which happens in aarch64_get_condition_code_1).

At a higher level like match.pd you could transform (x - y) < 0 into x < y if there is no signed
overflow, but this isn't safe in RTL given source types are not available.

Wilco

> Trunk generates:
> 
> lt:
> 	sub	w1, w0, w1
> 	mov	w0, 10
> 	cmp	w1, 0
> 	csel	w0, w0, wzr, lt
> 	ret
> 
> With the patch we can eliminate the redundant subtract and now generate:
> 
> lt:
> 	cmp	w0, w1
> 	mov	w0, 10
> 	csel	w0, w0, wzr, mi
> 	ret

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-07-11 10:21 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-01 23:54 [PATCH][Aarch64] Relational compare zero not merged into subtract Michael Collison
2017-07-10 17:12 ` James Greenhalgh
2017-07-11  7:26   ` Michael Collison
2017-07-11  9:20 ` Richard Earnshaw (lists)
2017-07-11 10:21 Wilco Dijkstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).