public inbox for gcc-patches@gcc.gnu.org
* [PATCH V2] MATCH: Optimize COND_ADD_LEN reduction pattern
@ 2023-09-26  9:55 Juzhe-Zhong
  2023-09-26 11:34 ` Richard Biener
  0 siblings, 1 reply; 4+ messages in thread
From: Juzhe-Zhong @ 2023-09-26  9:55 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.sandiford, rguenther, Juzhe-Zhong

The current COND_ADD reduction pattern can't optimize floating-point vectors.
As Richard suggested: https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631336.html
allow the COND_ADD reduction pattern to optimize floating-point vectors as well.

Bootstrap and regression testing are running.

OK for trunk if tests pass?

gcc/ChangeLog:

	* match.pd: Optimize COND_ADD reduction pattern.

---
 gcc/match.pd | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index 3ce90c3333b..790d956fe69 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -8863,8 +8863,11 @@ and,
 
    c = mask1 && mask2 ? d + b : d.  */
 (simplify
-  (IFN_COND_ADD @0 @1 (vec_cond @2 @3 integer_zerop) @1)
-   (IFN_COND_ADD (bit_and @0 @2) @1 @3 @1))
+  (IFN_COND_ADD @0 @1 (vec_cond @2 @3 zerop@4) @1)
+   (if (ANY_INTEGRAL_TYPE_P (type)
+	|| (FLOAT_TYPE_P (type)
+	    && fold_real_zero_addition_p (type, NULL_TREE, @4, 0)))
+   (IFN_COND_ADD (bit_and @0 @2) @1 @3 @1)))
 
 /* Detect simplication for a conditional length reduction where
 
-- 
2.36.3



* Re: [PATCH V2] MATCH: Optimize COND_ADD_LEN reduction pattern
  2023-09-26  9:55 [PATCH V2] MATCH: Optimize COND_ADD_LEN reduction pattern Juzhe-Zhong
@ 2023-09-26 11:34 ` Richard Biener
  2023-09-26 12:19   ` Li, Pan2
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Biener @ 2023-09-26 11:34 UTC (permalink / raw)
  To: Juzhe-Zhong; +Cc: gcc-patches, richard.sandiford

On Tue, 26 Sep 2023, Juzhe-Zhong wrote:

> The current COND_ADD reduction pattern can't optimize floating-point vectors.
> As Richard suggested: https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631336.html
> allow the COND_ADD reduction pattern to optimize floating-point vectors as well.
> 
> Bootstrap and regression testing are running.
> 
> OK for trunk if tests pass?

OK.

> gcc/ChangeLog:
> 
> 	* match.pd: Optimize COND_ADD reduction pattern.
> 
> ---
>  gcc/match.pd | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 3ce90c3333b..790d956fe69 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -8863,8 +8863,11 @@ and,
>  
>     c = mask1 && mask2 ? d + b : d.  */
>  (simplify
> -  (IFN_COND_ADD @0 @1 (vec_cond @2 @3 integer_zerop) @1)
> -   (IFN_COND_ADD (bit_and @0 @2) @1 @3 @1))
> +  (IFN_COND_ADD @0 @1 (vec_cond @2 @3 zerop@4) @1)
> +   (if (ANY_INTEGRAL_TYPE_P (type)
> +	|| (FLOAT_TYPE_P (type)
> +	    && fold_real_zero_addition_p (type, NULL_TREE, @4, 0)))
> +   (IFN_COND_ADD (bit_and @0 @2) @1 @3 @1)))
>  
>  /* Detect simplication for a conditional length reduction where
>  
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


* RE: [PATCH V2] MATCH: Optimize COND_ADD_LEN reduction pattern
  2023-09-26 11:34 ` Richard Biener
@ 2023-09-26 12:19   ` Li, Pan2
  0 siblings, 0 replies; 4+ messages in thread
From: Li, Pan2 @ 2023-09-26 12:19 UTC (permalink / raw)
  To: Richard Biener, Juzhe-Zhong; +Cc: gcc-patches, richard.sandiford

Committed, as it passed x86 bootstrap and regression tests. Thanks, Richard.

Pan



* [PATCH V2] MATCH: Optimize COND_ADD_LEN reduction pattern
@ 2023-09-26  8:20 Juzhe-Zhong
  0 siblings, 0 replies; 4+ messages in thread
From: Juzhe-Zhong @ 2023-09-26  8:20 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.sandiford, rguenther, Juzhe-Zhong

This patch leverages this commit: https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=62b505a4d5fc89
to optimize the COND_LEN_ADD reduction pattern.

It simplifies VEC_COND_EXPR + COND_LEN_ADD -> COND_LEN_ADD.

Consider the following case:

#include <stdint.h>

void
pr11594 (uint64_t *restrict a, uint64_t *restrict b, int loop_size)
{
  uint64_t result = 0;

  for (int i = 0; i < loop_size; i++)
    {
      if (b[i] <= a[i])
	{
	  result += a[i];
	}
    }

  a[0] = result;
}

Before this patch:
        vsetvli a7,zero,e64,m1,ta,ma
        vmv.v.i v2,0
        vmv1r.v v3,v2                    --- redundant
.L3:
        vsetvli a5,a2,e64,m1,ta,ma
        vle64.v v1,0(a3)
        vle64.v v0,0(a1)
        slli    a6,a5,3
        vsetvli a7,zero,e64,m1,ta,ma
        sub     a2,a2,a5
        vmsleu.vv       v0,v0,v1
        add     a1,a1,a6
        vmerge.vvm      v1,v3,v1,v0     ---- redundant.
        add     a3,a3,a6
        vsetvli zero,a5,e64,m1,tu,ma
        vadd.vv v2,v2,v1
        bne     a2,zero,.L3
        li      a5,0
        vsetvli a4,zero,e64,m1,ta,ma
        vmv.s.x v1,a5
        vredsum.vs      v2,v2,v1
        vmv.x.s a5,v2
        sd      a5,0(a0)
        ret

After this patch:

	vsetvli	a6,zero,e64,m1,ta,ma
	vmv.v.i	v1,0
.L3:
	vsetvli	a5,a2,e64,m1,ta,ma
	vle64.v	v2,0(a4)
	vle64.v	v0,0(a1)
	slli	a3,a5,3
	vsetvli	a6,zero,e64,m1,ta,ma
	sub	a2,a2,a5
	vmsleu.vv	v0,v0,v2
	add	a1,a1,a3
	vsetvli	zero,a5,e64,m1,tu,mu
	add	a4,a4,a3
	vadd.vv	v1,v1,v2,v0.t
	bne	a2,zero,.L3
	li	a5,0
	vsetivli	zero,1,e64,m1,ta,ma
	vmv.s.x	v2,a5
	vsetvli	a5,zero,e64,m1,ta,ma
	vredsum.vs	v1,v1,v2
	vmv.x.s	a5,v1
	sd	a5,0(a0)
	ret

Bootstrap and regression testing are running.

OK for trunk when testing passes?

	PR tree-optimization/111594
	PR tree-optimization/110660

gcc/ChangeLog:

	* match.pd: Optimize COND_LEN_ADD reduction.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/cond/cond_reduc-1.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/pr111594.c: New test.

---
 gcc/match.pd                                  | 14 +++++++++
 .../riscv/rvv/autovec/cond/cond_reduc-1.c     | 29 +++++++++++++++++++
 .../riscv/rvv/autovec/cond/pr111594.c         | 22 ++++++++++++++
 3 files changed, 65 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_reduc-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/pr111594.c

diff --git a/gcc/match.pd b/gcc/match.pd
index a17778fbaa6..5061c19e086 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -8866,6 +8866,20 @@ and,
   (IFN_COND_ADD @0 @1 (vec_cond @2 @3 integer_zerop) @1)
    (IFN_COND_ADD (bit_and @0 @2) @1 @3 @1))
 
+/* Detect simplication for a conditional length reduction where
+
+   a = mask ? b : 0
+   c = i < len + bias ? d + a : d
+
+   is turned into
+
+   c = mask && i < len + bias ? d + b : d.  */
+(simplify
+  (IFN_COND_LEN_ADD integer_truep @0 (vec_cond @1 @2 zerop@5) @0 @3 @4)
+   (if (ANY_INTEGRAL_TYPE_P (type)
+	|| fold_real_zero_addition_p (type, NULL_TREE, @5, 0))
+    (IFN_COND_LEN_ADD @1 @0 @2 @0 @3 @4)))
+
 /* For pointers @0 and @2 and nonnegative constant offset @1, look for
    expressions like:
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_reduc-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_reduc-1.c
new file mode 100644
index 00000000000..db6f9d1ec6c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_reduc-1.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d -fno-vect-cost-model -ffast-math -fdump-tree-optimized" } */
+
+#include <stdint-gcc.h>
+
+#define COND_REDUCTION(TYPE)                                                   \
+  TYPE foo##TYPE (TYPE *restrict a, TYPE *restrict b, int loop_size)           \
+  {                                                                            \
+    TYPE result = 0;                                                           \
+    for (int i = 0; i < loop_size; i++)                                        \
+      if (b[i] <= a[i])                                                        \
+	result += a[i];                                                        \
+    return result;                                                             \
+  }
+
+COND_REDUCTION (int8_t)
+COND_REDUCTION (int16_t)
+COND_REDUCTION (int32_t)
+COND_REDUCTION (int64_t)
+COND_REDUCTION (uint8_t)
+COND_REDUCTION (uint16_t)
+COND_REDUCTION (uint32_t)
+COND_REDUCTION (uint64_t)
+COND_REDUCTION (_Float16)
+COND_REDUCTION (float)
+COND_REDUCTION (double)
+
+/* { dg-final { scan-tree-dump-not "VCOND_MASK" "optimized" } } */
+/* { dg-final { scan-tree-dump-times "COND_LEN_ADD" 11 "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/pr111594.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/pr111594.c
new file mode 100644
index 00000000000..6d81b26fbd0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/pr111594.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv -mabi=lp64d -fno-vect-cost-model -ffast-math" } */
+
+#include <stdint-gcc.h>
+
+void
+pr11594 (uint64_t *restrict a, uint64_t *restrict b, int loop_size)
+{
+  uint64_t result = 0;
+
+  for (int i = 0; i < loop_size; i++)
+    {
+      if (b[i] <= a[i])
+	{
+	  result += a[i];
+	}
+    }
+
+  a[0] = result;
+}
+
+/* { dg-final { scan-assembler-not {vmerge} } } */
-- 
2.36.3


